Hi Radek, I have some thoughts to share that might be useful. First, recognize that you get different output from infinite thinning than from infinite skeletonization. (bwmorph(bw, 'skeleton', Inf) vs bwmorph(bw, 'thin', Inf) ), and that the latter might give you a better starting point for your training. (You might not even have spurs to worry about.)
Second, regardless of which thinning approach you start with, if you calculate a bwdistgeodesic transform on your thinned bw image--using a mask that is true at all of the endpoints and false elsewhere--the longest constrained path will be the one that contains the maximum value in transformed image. You can reconstruct that spur-less path by tracing along that path, keeping only largest neighbors. (Spurs will necessarily have smaller distance values.)
I have attached a bit of code that will recreate that path from the original binary image. It runs (including the thinning) on the screen capture of your image of a "one" in about 2 msec.