I'd like to thank you all for getting this analyzed, especially Tom! Your rigor is pretty impressive; it seems it would otherwise be impossible to maintain a DBMS. In the end, I now know a lot more about Postgres internals, and that this idiosyncrasy (from a user's perspective) could happen again. I guess this is the first time I have actually encountered an unexpected worst-case scenario like this... It seems it is up to me now to be a bit more creative with query optimization, and in the end it will turn out to require an architectural change... Since the only thing to achieve is in fact to obtain the last id (currently still with the constraint that it has to happen in an isolated subquery), I wonder whether this requirement (obtaining the last id) is worth a special technique/instrumentation/strategy (lacking a good word here), given that this data has a full logical ordering (in this case even a total one) and that the use case is quite common, I guess.
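For reference, the two standard formulations I know of for getting the last id per group are sketched below. This is only an illustration; the message(id, box_id) schema is hypothetical, not the actual one from this thread, and it assumes an index on (box_id, id):

  -- Hypothetical schema: message(id bigint primary key, box_id bigint),
  -- with an index on (box_id, id).

  -- Variant 1: plain aggregate per group.
  SELECT box_id, max(id) AS last_id
  FROM message
  GROUP BY box_id;

  -- Variant 2: the "isolated subquery" form; with the (box_id, id)
  -- index, each subquery is a quick backward index descent that stops
  -- at the first (i.e. largest) id for that box.
  SELECT b.id AS box_id,
         (SELECT m.id
            FROM message m
           WHERE m.box_id = b.id
           ORDER BY m.id DESC
           LIMIT 1) AS last_id
  FROM box b;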
Some ideas from an earlier post:

panam wrote:
> ...
> This also made me wonder how the internal plan is carried out. Is the
> engine able to leverage the fact that a part/range of the rows ["/index
> entries"] is totally or partially ordered on disk, e.g. using some kind
> of binary search or even "nearest neighbor" search in that section
> (i.e. a special "micro-plan" or algorithm)? Or is the speed-up "just"
> because related data is usually "nearby" and most of the standard
> algorithms work best with clustered data?
> If the first is not the case, would that be a potential point for
> improvement? Maybe it would even be more efficient if there were some
> sort of constraint that guarantees "ordered row" sections on disk,
> i.e. preventing the addition of a row that has an index value in
> between two row values of an already ordered/clustered section. In the
> simplest case, it would start with the "first" row and end with the
> "last" row (at the time of doing the equivalent of CLUSTER). So there
> would be a small list saying that rows with id x through id y are
> guaranteed to be ordered on disk (by id, for example), now and for all
> time.

Maybe I am completely off the mark, but what's your conclusion? Too much effort for small scenarios? Nothing that should be handled at the DB level? An attempt to battle the laws of thermodynamics with a small technical dodge?

Thanks again
panam
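P.S.: Regarding the "ordered sections on disk" idea above, the closest existing mechanism I'm aware of is CLUSTER together with the correlation statistic the planner already keeps. A minimal sketch (table and index names hypothetical), just to make the idea concrete:

  -- One-shot physical reordering by an index; note that PostgreSQL does
  -- NOT maintain this order for rows inserted or updated afterwards.
  CLUSTER message USING message_pkey;
  ANALYZE message;

  -- The planner tracks how well physical order matches logical order;
  -- a correlation close to 1.0 makes index range scans on id cheap.
  SELECT attname, correlation
    FROM pg_stats
   WHERE tablename = 'message' AND attname = 'id';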