(In reply to Jwtiyar Nariman from comment #62)
> > Other characters not in this test file are sorted according to the defaults
> > from
> >
> >     copy "iso14651_t1"
>
> Sorting is good now, but adding these
>
> > reorder-after <S0631> % ر
> > <S0695> % ڕ
> >
> > reorder-after <S0646> % ن
> > <S0648> % و
> > <S06C6> % ۆ
>
> iam not understanding because for example this " <S0695> % ڕ " how you
> order it?

copy "iso14651_t1" contains copy "iso14651_t1_common" plus some modifications which affect only Chinese and Japanese. So we look into the iso14651_t1_common file to see what the default sort order is. We find for example:

    ...
    <S0631> % ARABIC LETTER REH
    <S0632> % ARABIC LETTER ZAIN
    <S0691> % ARABIC LETTER RREH
    <S0692> % ARABIC LETTER REH WITH SMALL V
    <S0693> % ARABIC LETTER REH WITH RING
    <S0694> % ARABIC LETTER REH WITH DOT BELOW
    <S0695> % ARABIC LETTER REH WITH SMALL V BELOW
    <S0696> % ARABIC LETTER REH WITH DOT BELOW AND DOT ABOVE
    ...

Looking at this, you see that by default

    ڕ U+0695 ARABIC LETTER REH WITH SMALL V BELOW

is sorted right after

    ڔ U+0694 ARABIC LETTER REH WITH DOT BELOW

That is not what you want for Kurdish. For Kurdish, you want

    ڕ U+0695 ARABIC LETTER REH WITH SMALL V BELOW

to be sorted right after

    ر U+0631 ARABIC LETTER REH

This is achieved by the rule:

    reorder-after <S0631> % ر
    <S0695> % ڕ

This removes U+0695 from its default position in the sort order and inserts it again right after U+0631.

    reorder-after <S0646> % ن
    <S0648> % و
    <S06C6> % ۆ

does a similar thing to change the sorting of U+0648 and U+06C6: both are moved to follow U+0646, in that order.

To find out which of these rules I needed, I first created the ckb_IQ.UTF-8.in test file and wrote the Kurdish characters into that file in the order you wanted. Then I ran a test sort using a ckb_IQ locale which had *only*

    LC_COLLATE
    copy "iso14651_t1"
    END LC_COLLATE

and *nothing* else. The test sort showed that only U+0695, U+0648, and U+06C6 were sorted incorrectly. All other characters from your list of Kurdish characters were already sorted correctly. So I only needed to add rules to fix the sort order for these 3 characters.

You can see the same by just reading iso14651_t1_common and finding out which of the Kurdish characters are already in the correct order in that file and which are not. You have to do nothing for the characters which are already in the correct order.

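To make the effect of reorder-after on the excerpt quoted above more concrete, here is a small Python sketch. It is only a toy model and not how glibc implements collation: a plain list stands in for the sequence of collating symbols, the multiple weight levels of real collation are ignored, and the helper names reorder_after and sort_key are made up for this illustration.

```python
# Toy model of "reorder-after": a Python list stands in for the sequence of
# collating symbols. Real glibc collation has several weight levels; this
# sketch only models the position of each symbol in the sequence.

def reorder_after(order, anchor, symbols):
    """Return a new order with `symbols` moved to sit right after `anchor`."""
    moved = set(symbols)
    # Remove the symbols from their default positions ...
    remaining = [s for s in order if s not in moved]
    # ... and insert them again, in the given order, right after the anchor.
    pos = remaining.index(anchor) + 1
    return remaining[:pos] + list(symbols) + remaining[pos:]


def sort_key(order):
    """Build a key function that ranks a character by its symbol's position."""
    rank = {sym: i for i, sym in enumerate(order)}
    return lambda ch: rank["S%04X" % ord(ch)]


# Excerpt of the default order quoted from iso14651_t1_common.
default_order = ["S0631", "S0632", "S0691", "S0692",
                 "S0693", "S0694", "S0695", "S0696"]

# Equivalent of:  reorder-after <S0631>
#                 <S0695>
kurdish_order = reorder_after(default_order, "S0631", ["S0695"])

letters = ["\u0632", "\u0695", "\u0631"]  # ZAIN, REH WITH SMALL V BELOW, REH

default_sorted = sorted(letters, key=sort_key(default_order))
kurdish_sorted = sorted(letters, key=sort_key(kurdish_order))

print([hex(ord(c)) for c in default_sorted])  # U+0695 comes last
print([hex(ord(c)) for c in kurdish_sorted])  # U+0695 comes right after U+0631
```

With the default order, U+0695 sorts after U+0632; after the simulated reorder-after it sorts directly after U+0631, which is the order wanted for Kurdish.
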
For the characters which are in the wrong position in iso14651_t1_common, you add rules like

    reorder-after <... collating-symbol after which to reorder ...>
    <... the collating-symbol which should be reordered ...>

I found writing the test file and checking which characters are sorted wrongly by default easier than staring at iso14651_t1_common.

And it is a good idea to have the test file anyway, to make sure that the Kurdish sort order always stays correct when something is changed in glibc. If we have the test file, we will notice when some change causes a problem.

--
https://bugs.launchpad.net/bugs/1388808

Title:
  Request for new language packages for Kurdish Sorani (ckb)