Peter Landgren <peter.tal...@telia.com> added the comment:
The È... comes from French surnames, and our French developer wants to group all
versions of E together. The É... can be found in French surnames in Sweden as
well as in Germany.
GRAMPS is a genealogy program used in about 20 languages, so there is no
preferred language.
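
As a rough illustration of grouping all versions of E together, here is a
minimal Python 3 sketch that decomposes each character with
unicodedata.normalize and drops the combining marks. The function name
fold_accents and the sample surnames are made up for this example; this is not
the GRAMPS code. Note that it would also fold Å, Ä and Ö to A, A and O, which is
not what Swedish sorting wants.

```python
import unicodedata


def fold_accents(text):
    """Return text with combining marks removed, so 'É' becomes 'E'."""
    # NFD splits a precomposed letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Illustrative surnames only (assumed data, not taken from GRAMPS).
names = ["Éluard", "Egger", "Èbre", "Ekman"]
print(sorted(names, key=fold_accents))
```
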
I know. However, Swedish telephone books and dictionaries are sorted the same:
A,B,C... X,Y,Z,Å,Ä,Ö.
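
A minimal sketch of that ordering in Python 3, assuming a hand-written alphabet
table in which each letter gets its own rank. The names SWEDISH_ALPHABET and
swedish_key and the sample surnames are only illustrative.

```python
import unicodedata

# Swedish alphabet order: A..Z followed by Å, Ä, Ö.
SWEDISH_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ"
RANK = {letter: index for index, letter in enumerate(SWEDISH_ALPHABET)}


def swedish_key(name):
    """Return a sort key that ranks letters by their place in SWEDISH_ALPHABET."""
    # NFC makes sure Å, Ä and Ö arrive as single precomposed characters.
    letters = unicodedata.normalize("NFC", name).upper()
    # Characters outside the table sort after every known letter, by code point.
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in letters]


# Illustrative surnames only (assumed data).
surnames = ["Öberg", "Zetterberg", "Ågren", "Andersson", "Äng"]
print(sorted(surnames, key=swedish_key))
```
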
True. I agree.
GRAMPS runs in the locale of the user, but must be able to handle information
coming from many other languages/countries. That's why it's hard to be
universal.
We can have them in names. See above.
I think we have found a solution that can handle most cases.
We treat surnames beginning with Å, Ä or Ö specially. I don't think that there
are many surnames outside the Nordic countries that start with any of these
three letters.
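
One way such a rule could look, as a Python 3 sketch only: keep Å, Ä and Ö as
their own index letters and fold every other accented initial to its base
letter. The helper name surname_group is hypothetical, and this is an
assumption about the approach, not the actual GRAMPS implementation.

```python
import unicodedata

# Initials that keep their own group instead of being folded (assumption).
NORDIC_INITIALS = {"Å", "Ä", "Ö"}


def surname_group(surname):
    """Return the index letter a surname is filed under (hypothetical helper)."""
    # NFC first, so a decomposed 'A' + ring is seen as the single letter 'Å'.
    first = unicodedata.normalize("NFC", surname)[:1].upper()
    if first in NORDIC_INITIALS:
        return first
    # Any other initial is reduced to its base letter, e.g. 'É' becomes 'E'.
    decomposed = unicodedata.normalize("NFD", first)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Illustrative surnames only (assumed data).
for name in ["Åkesson", "Éluard", "Öberg", "Ekman"]:
    print(name, "->", surname_group(name))
```
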
Many thanks!
/Peter
Added file: http://bugs.python.org/file13034/unnamed
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5200>
_______________________________________
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
"http://www.w3.org/TR/REC-html40/strict.dtd">
<html><head><meta name="qrichtext" content="1" /><style type="text/css">
p, li { white-space: pre-wrap; }
</style></head><body style=" font-family:'Sans Serif'; font-size:10pt;
font-weight:400; font-style:normal;">
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
I don't quite understand why you want to place È, É, Ê, Ë all along</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
with E, yet Å,Ä,Ö after Z. Because that's what the Swedish alphabet</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
says? </p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">The
È... comes from French surnames and our French developer wants to group all
versions of E together. The É... can be found in French surnames in Sweden as
well as in Germany.</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">GRAMPS
is a genealogy program used in about 20 languages, so there is
no preferred language.</p>
<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;"></p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
Please understand that collation varies across languages. For example</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
in German, we also have Ä, but it does *not* come after Z. Instead,</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
there are two ways to collate Ä (telephone book vs. dictionary):</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
1. Ä sorts exactly like A</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
2. Ä sorts as if it was transcribed as Ae</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">I
know. However, Swedish telephone books and dictionaries are sorted the same:</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">A,B,C... X,Y,Z,Å,Ä,Ö.</p>
<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;"></p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
So there is no one true collation of Ä, but you have to take into</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
account what language rules you want to follow.</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">True.
I agree. </p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">GRAMPS runs in the locale of the user, but must be able to
handle information coming from many other languages/countries. That's why it's
hard to be universal.</p>
<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;"></p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
If you want to implement Swedish rules, why then do you also want</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
to support È, É, Ê, Ë? Do you have these letters in Swedish at all?</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">We
can have them in names. See above.</p>
<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;"></p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
If you want to use obscure collation rules, you might have to</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
implement the collation algorithm yourself. For example, assign</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
each letter a unique number (different from the Unicode ordinal),</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
and then sort by these numbers.</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">></p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
Take a look at ICU, which already includes collation algorithms</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">>
for many locales.</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">I
think we have found a solution that can handle most cases.</p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px; -qt-user-state:0;">We
treat surnames beginning with Å, Ä or Ö specially. I don't think that there are
many surnames outside the Nordic countries that start with any of these three
letters.</p>
<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;"></p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">Many thanks!</p>
<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;"></p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px;
margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;">/Peter</p>
<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px;
margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;
-qt-user-state:0;"></p></body></html>
_______________________________________________
Python-bugs-list mailing list