[HACKERS] linked list rewrite

Neil Conway Tue, 23 Mar 2004 02:09:11 -0800

I'd like to wrap up the linked list rewrite, and get the code into CVS reasonably soon. If you're unfamiliar with the motivations for redesigning the linked list code, check the archives for previous discussions, such as:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg02204.html http://www.mail-archive.com/[EMAIL PROTECTED]/msg01320.html

The basic idea is to ditch the Lisp-style linked list representation (in which a "list" is merely a pointer to the head node), and adopt a new List struct like so:

typedef struct List
{
        NodeTag          type;  /* T_List, T_IntList, or T_OidList */
        int                      length;
        ListCell        *head;
        ListCell        *tail;
} List;

This allows us to implement most of the important list operations (length, prepend, append and concatenation) with constant time complexity, and speed up some other operations (such as equality).

The known remaining issues that need to be addressed are:

(1) API naming

In the latest prototype implementation, I've adopted a completely new naming convention for the functions in the list API. The new API looks like:

List *list_append(List *list, void *datum);
List *list_prepend(List *list, void *datum);
List *list_conc(List *list1, List *list2);
bool list_member(List *list, void *datum);
List *list_remove(List *list, void *datum);
etc.

Function variants which operate on integer lists (type = T_IntList) have an "_int" suffix, whereas OID list functions use an "_oid" suffix. By default, list operations use structural equality (i.e. equal()) -- if a list function uses pointer equality to determine if two list elements are the same, it has the "_simple" suffix. And so on.

In contrast, the existing List API is a lot more inconsistent (IMHO). I elaborated on some of the deficiencies of the existing API naming scheme here:

http://archives.postgresql.org/pgsql-patches/2004-01/msg00188.php

Tom objected to changing the names:

     http://archives.postgresql.org/pgsql-patches/2004-01/msg00186.php
     http://archives.postgresql.org/pgsql-patches/2004-01/msg00191.php

(I won't summarize Tom's cogent points here so as not to put words in his mouth; however, please read those posts.)

I think changing the list API names is a good idea for a few reasons. First and foremost, it improves the readability of the code and the consistency of the list API. Second, one objection to changing the API names is that it requires more changes to the rest of the tree. While that's indisputable, I think we'll realistically need to visit the vast majority of the List API call sites in order to implement the list rewrite anyway, so in the long run I don't think changing the API names will be that much extra work. Third, it has the benefit of cleanly and totally breaking existing code that uses the List API, so that we don't miss any List API call sites whose semantics have subtly changed (e.g. a switch of a particular function from structural equality to pointer equality).

So, would people prefer that we keep the old API names, or adopt the new naming conventions?

(2) Remaining implementation work

The code has drifted a fair bit underneath the latest list rewrite patch (which was incomplete anyway). It's unfortunately been a little while since I've looked at the patch, but I don't recall there being any showstopping problems that needed to be resolved --- it's just a matter of wrapping up remaining loose ends. I'm going to get started on this on today.

(3) Apply the work to CVS, and update the rest of the tree for the new API

The amount of integration work that needs to be done partially depends on the resolution to #1, but either way the list rewrite will require a lot of (relatively minor) changes scattered throughout the tree. What is the best way to land these changes in HEAD with minimal disturbance to other developers?

One possibility is to create a private CVS branch, implement any necessary changes throughout the source tree and test all those changes, and then merge the branch into HEAD once it is ready. This would still break outstanding patches, as Tom has pointed out, but it might reduce the disturbance to other developers. It doesn't really matter to me whether this is done or not (I can do my own revision control locally if there's a need for it), so speak up if you think this would be worth doing.

Another possibility is to create some wrapper macros / functions that would insulate the changes in the list API, so that we could make the changes incrementally. Not sure how viable this is.

That's about it. Does anyone know of anything else that will need to be tackled before we can merge the list rewrite?

Cheers,

Neil


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

[HACKERS] linked list rewrite

Reply via email to