Hi Han-Wen, I was actually thinking about this situation upside-down from how you're seeing it; details below.
On Mon, Jul 29, 2024 at 10:30 AM Han-Wen Nienhuys <hanw...@gmail.com> wrote:
> On Mon, Jul 29, 2024 at 8:56 AM Luca Fascione <l.fasci...@gmail.com> wrote:
> > [shifts are] going to be some random, non-integer quantity, right?
>
> Yes, but since the comparison works on pixel images, you can't see the
> non-integer part of the shift.

> > Also, the rasterization that gets performed, is it anti-aliased?
>
> Usually it is; we could turn it off and possibly make the images
> larger, but IIRC that slows things down.

Actually my thought here is that if you have antialiasing on and you translate the image by a small amount in x and/or y, you alter _all_ the non-white pixels: this is because the renderer will account for the coverage of each object wrt each pixel slightly differently, and change all the shades of gray it generates because of this.

If you render a square that is 1 pixel on a side, aligned with the raster, you get one black pixel. But if you translate it half a pixel right and down, you get 4 pixels, each 0.25 gray (after unapplying gamma, assuming you're using "box 1 1" filtering).

If you're trying to realign one image to the other in this scenario, you can see it'll be quite annoying to do this and recover one image from the other (in my example, going from the first image to the second would work, but going from the second to the first won't; the heart of the problem is that you'd be double-convolving with the antialiasing filter, instead of deconvolving).

> > It would seem that though shifts and changes in the lengths of the
> > staves are "common", small and relatively benign problems, rotations and
> > scales (magnifications) should be considered major disasters, right?
>
> Rotations do not generally happen. Virtually all the positioning is
> rectilinear, and scaling is also not common. What happens is that objects
> end up in different locations, and sometimes variable objects (slurs,
> beams) have different sizes.
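To make the half-pixel example concrete, here is an illustrative sketch (my own toy model, not LilyPond code): with "box 1 1" filtering, each pixel's gray value is just the area of overlap between the shape and that pixel, so a half-pixel shift turns one fully covered pixel into four quarter-covered ones.

```python
# Toy box-filter rasterizer: a pixel's value is the area of overlap
# between the square and that pixel (1.0 = fully covered/black).

def coverage(sq_x, sq_y, px, py, size=1.0):
    """Overlap area between a size x size square with lower-left corner
    (sq_x, sq_y) and the unit pixel with lower-left corner (px, py)."""
    ox = max(0.0, min(sq_x + size, px + 1.0) - max(sq_x, px))
    oy = max(0.0, min(sq_y + size, py + 1.0) - max(sq_y, py))
    return ox * oy

def rasterize(sq_x, sq_y, width, height):
    # One row per scanline; each entry is that pixel's coverage.
    return [[coverage(sq_x, sq_y, x, y) for x in range(width)]
            for y in range(height)]

print(rasterize(0.0, 0.0, 2, 2))  # aligned: one pixel at 1.0, rest 0.0
print(rasterize(0.5, 0.5, 2, 2))  # shifted half a pixel: four pixels at 0.25
```

Note that the shifted image has no pixel you could simply move back: every value changed, which is why undoing the shift amounts to deconvolution rather than translation.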
Yes, I meant the case where this would happen as a result of new defects introduced in the code, not as requests from the source. In other words: if lilypond emits everything rotated 0.2 degrees clockwise, some major coding disaster has certainly happened, and tests should fail loudly. On the contrary, if half the staves in a sheet move up by 0.35 pixels, this is probably because the bbox of a glyph is now a touch tighter, or the placement of an articulation mark is slightly different, all of which could in general be considered relatively benign. Not necessarily desirable, but certainly not a complete "all is on fire" situation either.

Very much agreed with the suggestion brought up elsewhere that tests should be small. I must admit I'm not familiar with how specifically lilypond is tested, but the ideal situation is:
- each test runs quickly
- each test exercises a very small set of features (ideally one or two)
- the verification checks only specific aspects of the output (a test that renders articulations should not also check that the console output reports the right version of lilypond, for example)
- this is useful because in many cases it lets you use fairly coarse accept/reject thresholds, in that the checking part should have wildly different outcomes for "good" vs "bad"

This will give you hundreds of tests. Running hundreds of tests takes a long time, which is usually not something most people look forward to dealing with.
So once you have the above, you add hierarchies to it so you can deploy a branch-and-bound strategy:
- Make bigger tests that check several things at once (these are probably approximately the tests you have now, I suspect)
- Run these with much tighter acceptance criteria (if they pass, you can be very sure it's all good)
- When these "supertests" fail, run the inner tests they cover, and produce a report containing their outcomes
- When these "supertests" pass, skip the inner tests: this is where you get major time savings

But I digress.

--
Luca Fascione
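P.S. The branch-and-bound scheme above could be sketched roughly like this; the two threshold values and the test callables are illustrative assumptions, not anything from the actual test suite:

```python
# Run a coarse "supertest" with a tight threshold; only when it fails,
# fall back to its inner, focused tests and report their outcomes.

def run_hierarchy(supertest, inner_tests):
    """supertest: () -> error score; inner_tests: list of (name, fn)
    where fn: () -> error score. Returns (passed, failure_report)."""
    TIGHT = 0.001   # supertest must match its reference very closely
    LOOSE = 0.05    # inner tests use coarser, per-feature thresholds
    if supertest() < TIGHT:
        return True, []          # inner tests skipped: the time savings
    report = [(name, fn()) for name, fn in inner_tests]
    failures = [(name, err) for name, err in report if err >= LOOSE]
    return len(failures) == 0, failures

# Example: a passing supertest skips everything under it.
passed, failures = run_hierarchy(lambda: 0.0005, [("slurs", lambda: 0.2)])
print(passed, failures)  # True [] : the inner test never ran
```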