Hello everybody,

I recently found a problem with sorting German 'Umlaute'. I hope the encoding 
of this mail works ;-) :

Postgres puts Umlaute (i.e., ÄäÖöÜü) at the very end of the alphabet, and 
this is not the way it should be. I didn't check the special character 
'ß', but it's probably similar.

The canonical sort order for Umlaute is to treat them as two characters, like 
this:
ä -> ae
ö -> oe
ü -> ue
ß -> ss
(and the same for upper case 'ÄÖÜ'; 'ß' does not have an upper case)
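
To make the rule above concrete, here is a minimal Python sketch of a sort key 
that expands each Umlaut into two characters. This is only an illustration, 
not Postgres code; the names EXPANSION and expansion_key are made up for this 
example, and it assumes the strings use precomposed characters (as in latin1).

```python
# Hypothetical illustration of the two-character expansion rule.
# EXPANSION and expansion_key are made-up names, not part of any database API.
EXPANSION = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}

def expansion_key(word):
    """Return a sort key with every Umlaut and 'ß' expanded to two letters."""
    expanded = "".join(EXPANSION.get(ch, ch) for ch in word)
    # casefold() so that upper and lower case sort together
    return expanded.casefold()

words = ["Müller", "Mzyk", "Mueller", "Muster", "Maß"]

# Plain code-point order: 'Müller' ends up after 'Mzyk'
print(sorted(words))

# Expansion order: 'Müller' sorts together with 'Mueller'
print(sorted(words, key=expansion_key))
```

With the plain sort, 'Müller' lands at the very end, which is the behaviour 
described above. With the key, 'Müller' and 'Mueller' compare as equal and keep 
their input order.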

Well, I guess this might be difficult to implement and might have quite an 
impact on performance. The solution I know from other databases consists of 
inserting ä after a, ö after o, ü after u and ß after s. Afaik this is 
generally accepted.
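
As a rough sketch of this simpler scheme (again plain Python for illustration 
only; BASE and after_base_key are made-up names, and I am assuming that 
"after a" means ä is its own letter between a and b):

```python
# Hypothetical illustration: each Umlaut sorts directly after its base letter.
# BASE and after_base_key are made-up names for this sketch.
BASE = {"ä": "a", "ö": "o", "ü": "u", "ß": "s"}

def after_base_key(word):
    """Return a list of (letter, rank) pairs; rank 1 places ä after a, etc."""
    key = []
    for ch in word.lower():
        if ch in BASE:
            key.append((BASE[ch], 1))   # Umlaut or ß: just after the base letter
        else:
            key.append((ch, 0))         # ordinary character
    return key

words = ["Zahl", "Äpfel", "Azur", "Apfel", "Bär", "Bar", "Bau"]
print(sorted(words, key=after_base_key))
```

Here 'Äpfel' comes after all words starting with a plain 'A' but before 'Bar', 
and 'Bär' comes after 'Bau' but before 'Zahl'.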

upper() does not handle Umlaute correctly either. It leaves äöü unchanged 
instead of converting them to upper case (ÄÖÜ).

All this happens with a database created with encoding = 'latin1'. If there 
are better results with a different encoding (I haven't tried it yet), I'd 
suggest adding some information about this to the documentation.

Thanks for your work,

N.Erichsen

-- 
HSH Soft-und Hardware Vertriebs GmbH
Rudolf-Diesel-Straße 2 - 16321 Lindenberg
Tel. (030) 94004 - 509  Fax (030) 94004 - 400
