
We have a research database (GeoDjango 1.5.1 on Postgres 9.2/PostGIS 2.0) 
including one model for words in any human language, where these words are 
entered in locally legible scripts (thus Sanskrit or Newari terms are in 
Devanagari, Persian in Perso-Arabic, Mandarin in Traditional Chinese, and 
Latin or English in Roman script). The problem is that the only script that 
sorts correctly is Roman. This makes sense given the limits of 
LC_COLLATE in Postgres: collations are always locale-specific, such as 
en_GB.UTF-8, and the only ‘global’ collations are C and POSIX, which sort by 
byte value and not by the Unicode Collation Algorithm. So far as I know there is no 
implementation of the Unicode Collation Algorithm within Postgres.

There is a pyuca module available on PyPI, but I'm not a good enough coder 
to see how to wire it into the Django ORM to enable true Unicode sorts. Has 
anyone tackled this problem before?
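
One way the wiring might look, as a rough sketch rather than a tested solution: fetch the rows through the ORM as usual and do the sort in Python, using pyuca's Collator.sort_key as the key. The helper names uca_key and sort_names and the Row stand-in for model instances are invented for illustration. The fallback branch is only there so the snippet runs without pyuca installed; it is not the UCA.

```python
import unicodedata
from collections import namedtuple

try:
    # pyuca is a pure-Python implementation of the Unicode Collation Algorithm.
    from pyuca import Collator
    _collator = Collator()

    def uca_key(text):
        """Return a UCA sort key for a unicode string."""
        return _collator.sort_key(text)
except ImportError:
    # Assumption: fallback so the sketch still runs without pyuca.
    # This only normalises and lower-cases; it is NOT the UCA.
    def uca_key(text):
        return unicodedata.normalize("NFKD", text).lower()


def sort_names(names):
    """Order objects by language, then by the collation key of nomen."""
    return sorted(names, key=lambda n: (n.language, uca_key(n.nomen)))


# Hypothetical stand-in for model instances, so no database is needed here.
Row = namedtuple("Row", ["nomen", "language"])

rows = [
    Row(u"banana", u"English"),
    Row(u"Zebra", u"English"),
    Row(u"apple", u"English"),
    Row(u"aqua", u"Latin"),
]
# A byte-order sort would put "Zebra" before "apple"; a collation key does not.
ordered = [r.nomen for r in sort_names(rows)]
```

With the model below, sort_names(Name.objects.all()) should presumably behave the same way, at the cost of loading every row into memory and bypassing the ORDER BY that Meta.ordering generates.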

The relevant bit of the model reads:

from django.db import models  # or django.contrib.gis.db if geometry fields are used
import uuidfield  # assumed: the django-uuidfield package

class Name(models.Model):
    name_id = models.AutoField(primary_key=True)
    uuid = uuidfield.UUIDField(auto=True)
    nomen = models.CharField(
        max_length=200,
        help_text="Please enter the name in an accurate script.")
    language = models.CharField(
        max_length=40, default="Latin",
        help_text="Please use standard language names or codes as "
                  "defined in the ISO 639 standard.")

    class Meta:
        # This ordering is applied by the database, so it follows LC_COLLATE.
        ordering = ('language', 'nomen',)
        verbose_name = 'lexical item'
        verbose_name_plural = 'lexical items'

    def __unicode__(self):
        return self.nomen + u" (" + self.language + u")"

Many thanks,

- - -- --- ----- -------- -------------
Will Tuladhar-Douglas
University of Aberdeen
