PEP 372 -- Adding an ordered directory to collections
Abstract This PEP proposes an ordered dictionary as a new data structure for the ``collections`` module, called "odict" in this PEP for short. The proposed API incorporates the experiences gained from working with similar implementations that exist in various real-world applications and other programming languages. Rationale = In current Python versions, the widely used built-in dict type does not specify an order for the key/value pairs stored. This makes it hard to use dictionaries as data storage for some specific use cases. Some dynamic programming languages like PHP and Ruby 1.9 guarantee a certain order on iteration. In those languages, and existing Python ordered-dict implementations, the ordering of items is defined by the time of insertion of the key. New keys are appended at the end, keys that are overwritten and not moved. The following example shows the behavior for simple assignments: >>> d = odict() >>> d['parrot'] = 'dead' >>> d['penguin'] = 'exploded' >>> d.items() [('parrot', 'dead'), ('penguin', 'exploded')] That the ordering is preserved makes an odict useful for a couple of situations: - XML/HTML processing libraries currently drop the ordering of attributes, use a list instead of a dict which makes filtering cumbersome, or implement their own ordered dictionary. This affects ElementTree, html5lib, Genshi and many more libraries. - There are many ordererd dict implementations in various libraries and applications, most of them subtly incompatible with each other. Furthermore, subclassing dict is a non-trivial task and many implementations don't override all the methods properly which can lead to unexpected results. Additionally, many ordered dicts are implemented in an inefficient way, making many operations more complex then they have to be. - PEP 3115 allows metaclasses to change the mapping object used for the class body. An ordered dict could be used to create ordered member declarations similar to C structs. This could be useful, for example, for future ``ctypes`` releases as well as ORMs that define database tables as classes, like the one the Django framework ships. Django currently uses an ugly hack to restore the ordering of members in database models. - Code ported from other programming languages such as PHP often depends on a ordered dict. Having an implementation of an ordering-preserving dictionary in the standard library could ease the transition and improve the compatibility of different libraries. Ordered Dict API The ordered dict API would be mostly compatible with dict and existing ordered dicts. (Note: this PEP refers to the Python 2.x dictionary API; the transfer to the 3.x API is trivial.) The constructor and ``update()`` both accept iterables of tuples as well as mappings like a dict does. The ordering however is preserved for the first case: >>> d = odict([('a', 'b'), ('c', 'd')]) >>> d.update({'foo': 'bar'}) >>> d collections.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar')]) If ordered dicts are updated from regular dicts, the ordering of new keys is of course undefined again unless ``sort()`` is called. All iteration methods as well as ``keys()``, ``values()`` and ``items()`` return the values ordered by the the time the key-value pair was inserted: >>> d['spam'] = 'eggs' >>> d.keys() ['a', 'c', 'foo', 'spam'] >>> d.values() ['b', 'd', 'bar', 'eggs'] >>> d.items() [('a', 'b'), ('c', 'd'), ('foo', 'bar'), ('spam', 'eggs')] New methods not available on dict: ``odict.byindex(index)`` Index-based lookup is supported by ``byindex()`` which returns the key/value pair for an index, that is, the "position" of a key in the ordered dict. 0 is the first key/value pair, -1 the last. >>> d.byindex(2) ('foo', 'bar') ``odict.sort(cmp=None, key=None, reverse=False)`` Sorts the odict in place by cmp or key. This works exactly like ``list.sort()``, but the comparison functions are passed a key/value tuple, not only the value. >>> d = odict([(42, 1), (1, 4), (23, 7)]) >>> d.sort() >>> d collections.odict([(1, 4), (23, 7), (42, 1)]) ``odict.reverse()`` Reverses the odict in place. ``odict.__reverse__()`` Supports reverse iteration by key. Questions and Answers = What happens if an existing key is reassigned? The key is not moved but assigned a new value in place. This is consistent with existing implementations and allows subclasses to change the behavior easily:: class movingcollections.odict): def __setitem__(self, key, value): self.pop(key, None) odict.__setitem__(self, key, value) What happens if keys appear multiple times in the list passed to the constructor? The same as for regular dicts: The latter item overrides the former. This
Re: PEP 372 -- Adding an ordered directory to collections
Martin v. Löwis v.loewis.de> writes: > > > I think I have lost the thread here, sorry. So I explain again what I > > mean. I think for this data structure it's important to keep all the > > normal dict operations at the same speed. If you use a C > > implementation vaguely similar to my pure python recipe you can > > perform the del in O(1) too, because pairs are joined in (double) > > linked list. But such data structure is O(n) to find the n-th item > > inserted into the sequence. > > Right. So byindex(n) would be O(n) then, right? If so, what's the > purpose of having that method in the first place? What's the purpose of having list.insert? > The PEP doesn't give a rationale, but just proposes that the method > be there. My guess is that it includes it for performance reasons. > However, I think the PEP (author) is misguided in assuming that > making byindex() a method of odict, you get better performance than > directly doing .items()[n] - which, as you say, you won't. Without byindex the only way to cherry pick an item is either doing something like i = od.iteritems() for idx in xrange(offset): value = idx.next() return value Or return od.items()[offset] One creates tons of unnecessary method calls, the other creates a full blown list object just to throw it away later. Both less than optimal solutions that can be implemented in a more efficient way on the C layer where one only has to iterate over the linked list offset times and return the item. And iteration for that linked list is most likely something like "for (n = 0; n != offset; ++n) iter = iter->next". Regards, Armin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 372 -- Adding an ordered directory to collections
Small Status update of the changes incorporated in the PEP: - The PEP Title was fixed. Of course it's a dictionary not a directory :-) - A questions and answers section was added that covers some of the questions raised here and on the original thread on python-devel. - Comparison behavor was documented - a note on Python 3 support was added (about keys-views) Regards, Armin -- http://mail.python.org/mailman/listinfo/python-list
Re: New python.org site
I don't like the new python.org But i have I (in my mind) nice idea. Dedicate python.org to the language developers and python interna. And create a nice small page on go-python.org dedicated to the users. It should *only* feature a documentation with a comment box on the bottom of each side, a download section and links to important python pages as well as a nice designed moinmoin wiki. Regards, Armin -- http://mail.python.org/mailman/listinfo/python-list
Try Python!
Hiho, One week ago I came across the nice `Try Ruby!`_ demonstration which features an ajax based ruby console and a 20 minutes ruby tutorial. I really liked that application and so I started to port that to python. Since I got a bit confused by the very complex javascript code I wrote a webconsole from scratch. The result is a very basic python console which behaves like the CLI one, except that it can't handle `raw_input` or any other method call trying to access `sys.stdin`. At the moment the application is multithreaded and evaluated expressions in a dict holding the sessions variables of the client connections. Because of the behaviour the application breaks down easily and isn't secure. This happens because I haven't finished it yet. Additionally sessions don't have a timeout so you have to restart the server if it's eating to much RAM. If someone is interested in putting up that application on a public server I can tell the application to spawn from inside XEN hosts and to use forking instead of the multithreaded approach currently used. The application is licensed under the GNU GPL, the sourcecode is available via svn from:: http://trac.pocoo.org/repos/trypy Since it requires Paste, PasteDeploy and the current colubrid checkout, here the installation for copy/pasteing: - easy_install Paste - easy_install PasteDeploy - svn co http://trac.pocoo.org/repos/trypy - cd trypy - svn co http://trac.pocoo.org/repos/colubrid/trunk/colubrid - python trypy.py The last command starts the application. And here a screenshot of a running session: http://trac.pocoo.org/wiki/TryPy Regards, Armin -- http://mail.python.org/mailman/listinfo/python-list
Re: Try Python!
BartlebyScrivener wrote: > Armin, > > Mike Meyer already took a crack at this, and his starts right up just > by clicking on the link. > > http://www.mired.org/home/mwm/try_python/ Hm. Looks not that useful since you can't create any functions and you can remove the prompt :-) > Yours looks prettier, but I don't think novices are going to be able to > figure out how to start it. They don't have to figure out if someone would install that on a public host. But therefore the application has to run inside of a jail or a XEN since python doesn't have a secure sandbox. Regards, Armin -- http://mail.python.org/mailman/listinfo/python-list
Re: New Python website, new documentation ?
I paste my idea (original posted in #pythonpaste a few days ago): Chairos: what I really miss is a "go-python.org" page :) which would be what? showing up python gui application, python wsgi applications, a python documentation with a comment section, tutorials and a wiki and everything compact and python.org for the companies and core developers Something really basic. Only small tutorials, a documentation with comments and a list of python applications. With a clean design and only few links per page. Something like the rubyonrails page for example. Regards, Armin -- http://mail.python.org/mailman/listinfo/python-list