On Thu, Feb 21, 2019 at 7:27 PM David Rowley <david.row...@2ndquadrant.com> wrote:
> As for the ArrayList patch that I'd been working on, I was a bit
> blocked on it due to Tom's comment in [1], after having quickly looked
> over your patch I see there's no solution for that complaint.
>
> (I'd been trying to think of a solution to this as a sort of idle
> background task and so far I'd only come up with a sort of List
> indexing system, where we could build additional data structures atop
> of the list and have functions like list_nth() check for such an index
> and use it if available. I don't particularly like the idea, but it
> was the only way I could think of to work around Tom's complaint. I
> felt there was just too much list API code that relies on the list
> being a linked list. e.g checking for empty lists with list == NIL.
> lnext() etc.)
>
> So in short, I'm in favour of not just having braindead linked lists
> all over the place to store things. I will rejoice the day we move
> away from that, but also Tom's comment kinda stuck with me. He's
> often right too. Likely backpatching pain that Tom talks about would
> be much less if we used C++, but... we don't.
Right, in this patch-set I wasn't really focused on how to replace the existing lists in a back-patch friendly way (a very legitimate concern); I was focused on how to get the best possible machine code for basic data structures and algorithms that are used all over the place, by inlining as many known-at-compile-time parameters as possible. And on using the right data structures in the first place, by making them easily available; I guess that's the bit where your ideas and mine overlapped.

I'm also interested in skipping gratuitous allocation and pointer chasing, even in cases where e.g. linked lists might not be "wrong" according to the access pattern, just because it's cache-destructive and a waste of cycles. As Alexander Stepanov said: "Use vectors whenever you can. If you cannot use vectors, redesign your solution so that you can use vectors." I think there is also something interesting about keeping objects inside their containing object, instead of immediately using pointers to somewhere else (obviously tricky with variable-sized data, but that's what the small vector optimisation idea is about; whether it's worth it after the overhead of detecting the case, I don't know; std::string implementors seem to think so).

I was primarily interested in new code, though I'm pretty sure there are places all over the tree where micro-optimisations are possible through specialisation (think partitioning, sorting, searching, uniquifying in cases where the types are fixed, or where they vary at runtime but there are some common cases you might want to specialise for).

For the cases you're interested in, maybe piecemeal replication of the planner lists that are measurably very hot is the only practical way to do it, along with evidence that it's really worth the future backpatching pain in each case. Or maybe there is some way to create a set of macros that map to List operations in back branches, as you were hinting at, to keep client code identical... I dunno.

--
Thomas Munro
https://enterprisedb.com
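PS: To make the "keep objects inside their containing object" idea a bit more concrete, here is a rough sketch of what an int vector with inline storage might look like in C. It's purely illustrative: none of these names (intvec, intvec_push, INTVEC_INLINE_CAPACITY, ...) come from either patch-set, it uses plain malloc/realloc rather than anything palloc-aware, and allocation failure handling is omitted.

#include <stdlib.h>
#include <string.h>

#define INTVEC_INLINE_CAPACITY 4

typedef struct intvec
{
    int         len;
    int         capacity;
    int        *data;           /* points at inline_data or at a heap block */
    int         inline_data[INTVEC_INLINE_CAPACITY];
} intvec;

/*
 * Note: because "data" may point into the struct itself, an intvec can't be
 * copied around by simple assignment; it has to stay where it was initialised.
 */
static void
intvec_init(intvec *v)
{
    v->len = 0;
    v->capacity = INTVEC_INLINE_CAPACITY;
    v->data = v->inline_data;   /* small vectors need no allocation at all */
}

static void
intvec_push(intvec *v, int value)
{
    if (v->len == v->capacity)
    {
        int         newcap = v->capacity * 2;

        if (v->data == v->inline_data)
        {
            /* first spill to the heap: copy the inline elements out */
            v->data = malloc(newcap * sizeof(int));
            memcpy(v->data, v->inline_data, v->len * sizeof(int));
        }
        else
            v->data = realloc(v->data, newcap * sizeof(int));
        v->capacity = newcap;
    }
    v->data[v->len++] = value;
}

static void
intvec_destroy(intvec *v)
{
    if (v->data != v->inline_data)
        free(v->data);
}

For up to INTVEC_INLINE_CAPACITY elements there is no separate allocation and no pointer chasing at all; the spill-to-heap branch is the "overhead of detecting the case" mentioned above.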