On Thu, Oct 24, 2019 at 7:10 PM Andres Freund <and...@anarazel.de> wrote: > I really don't think it's ok to have as many abbrev abort related paths > without any coverage - the relevant code isn't that trivial. And > something like amcheck really doesn't strike me as sufficient. For one, > it doesn't provide any coverage either. For another, plenty sorts don't > end up in a form that amcheck sees.
I agree. > Tests aren't just there to verify that the current behaviour isn't > broken, they're also there to allow to develop with some confidence. And > I don't think tuplesort as is really allows that, and e.g. abbreviated > keys made that substantially worse. That happens, but I think it'd be > good if you could help improving the situation. I would like to improve this. I am mostly just pointing out that there has been resistance to this historically. I am in favor of mechanically increasing test coverage of tuplesort.c along the lines you describe. I'm just a bit concerned that Tom or others may see it differently. > E.g. > SELECT * FROM (SELECT ('00000000-0000-0000-0000-'||to_char(g.i, > '000000000000FM'))::uuid uuid FROM generate_series(15000, 0, -1) g(i)) d > ORDER BY uuid > reliably triggers abbreviated keys, and it looks to me like that should > be portable. With a few tweaks it'd be fairly easy to use that to > provide some OK coverage for most the abbrev key cases. I agree. > Yes, that's kind of my point? Either that patch reduced coverage, or it > created dead code. Neither is good. I agree. > > mergeruns() doesn't use abbreviated keys -- this code disables their > > use in the standard way. > > Well, then reformulate the point that we should have coverage for > mergeruns() when initially abbreviated keys were set up. That doesn't seem essentially, but I'm okay with it. > > I had to convince Tom to get the coverage of external sorts we have > > now. Apparently there are buildfarm animals that are very sensitive to > > that cost, that could have substantially increased test runtimes were > > we to do more. Perhaps this could be revisited. > > Hm. I'm a bit confused. Isn't all that's required to set a tiny amount > of work_mem? Then it's easy to trigger many passes without a lot of IO? Yes, but Tom felt that this might not be good enough when this was discussed in 2016. However, I seem to recall that he was pleasantly surprised at how small the overhead turned out to be. It's hard for me to test how much overhead this will have on a machine with horribly slow I/O. Though I just bought a new Raspberry Pi, and could test on that when I get back home from my trip to Europe -- it uses an SD card, which is pretty slow. > I'm not saying that tuplesort has gotten worse or anything. Just that > there's been too much development without adding tests. I agree. -- Peter Geoghegan