Hi,

I'm looking for a data type to store numerically labelled hierarchical data, 
such as section.subsection.paragraph numbers (e.g. '1.5.3', '1.10.2', '6.30').

The closest data type that I have found is ltree.  However, the collation order 
is inappropriate: it would put '1.10.2' before '1.5.3', since it performs a 
naïve memcmp() at each level.[1]

One way to get the desired sort order would be to use the semver extension.  
However, that's not really appropriate, as I don't want to store version 
numbers, and my data do not fit server's mandatory X.Y.Z three-level scheme.

Of course, I could define a hierarchy-of-integers data type and implement my 
own comparison functions.  I'm reluctant to cause a proliferation of data 
types, though, as ltree is semantically the type I want.  I'm just unhappy with 
its sort order.

Therefore, I would like to suggest that ltree be modified to use a smart 
comparator that recognizes numbers within strings and sorts them in a 
human-friendly way.  Apple[2] and recent versions of Windows[3] handle 
filenames this way.  One sample implementation of such a comparator is 
natsort.[4]

The performance impact of the enhanced comparator would probably be negligible, 
compared to I/O bottlenecks.  A bigger issue would be backwards compatibility, 
especially for ltrees with existing btree indexes.

Feedback?  Suggestions?

Derek


[1]: 
http://doxygen.postgresql.org/ltree__op_8c.html#a635600ad7aad78addf3c14a6e2d67fed

[2]: 
https://developer.apple.com/LIBRARY/IOS/#documentation/FileManagement/Conceptual/FileSystemProgrammingGUide/FileSystemDetails/FileSystemDetails.html

[3]: 
http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html

[4]: http://sourcefrog.net/projects/natsort/


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to