Hackers, Our List implementation internally uses linked lists which are certainly good for some things, but pretty bad at other things. Linked lists are pretty bad when you want to fetch the Nth element, as it means looping ever each previous element until you get to the Nth. They're also pretty bad for a modern CPU due to their memory access pattern.
We could get around a few of these problems if Lists were implemented internally as arrays, however, arrays are pretty bad if we want to delete an item out the middle as we'd have to shuffle everything over one spot. Linked lists are much better here... at least they are when you already have the cell you want to remove... so we can't expect that we can just rewrite List to use arrays instead. I've attached a first cut implementation of "AList". I was originally going to call it "ArrayList", but my fingers were getting tired, so I change it. This first cut version replaces nothing in the code base to use ALists. This email (and due to the timing of it) is more of a marker that this is being worked on. I know in particular that Andres has complained at least once about List usages in the executor, which I agree is not great. Looking at the usage of list_nth() is quite scary! My plans for this include: * Make targetlists from createplan onwards use ALists. Every time I look at build_path_tlist() I gawk at the number of palloc calls that are going on inside those lappend() calls. We already know how many items will be in the list, so with AList we could just alist_premake(list_length(path->pathtarget->exprs)) and we'd never have to palloc() another element for that list again. We do the same again in various mutator functions in setrefs.c too! * list_nth usage in adjust_appendrel_attrs_mutator() to speed up translation between parent and child Vars * And also, umm, <mumble> simple_rte_array and simple_rel_array</mumble>. Well, we'd still need somewhere to store all the RelOptInfos, but the idea is that it would be rather nice if we could not expand the entire inheritance hierarchy of a relation, and it would be rather nice if we could just add the partitions that we need, rather than add them all and setting dummy paths for the ones we're not interested in. Anyway, please don't debate the usages of the new type here. As for all the above plans, I admit to not having a full handle on them yet. I wrote the Array List implementation so I could get a better idea on how possible each of those would be, so, for now, to prevent any duplicate work, here it is... (Including in Andres because I know he's mentioned his dislike for List in the executor) Comments on the design are welcome, but I was too late to the commitfest, so there are other priorities. However, if you have a strong opinion, feel free to voice it. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
0001-Basic-implementation-of-array-lists-AList.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers