Re: [sword-devel] imp2ld and alphabetization

2007-10-29 Thread DM Smith
Chris Little wrote: > DM Smith wrote: > >> On Oct 29, 2007, at 12:49 AM, Chris Little wrote: >> >>> It's possible to have multiple keys share a single entry. So >>> pointed and >>> an unpointed keys can point to the same entry. We've done this >>> experimentally with dictionaries in the p

Re: [sword-devel] imp2ld and alphabetization

2007-10-29 Thread Chris Little
DM Smith wrote: > On Oct 29, 2007, at 12:49 AM, Chris Little wrote: >> It's possible to have multiple keys share a single entry. So >> pointed and >> an unpointed keys can point to the same entry. We've done this >> experimentally with dictionaries in the past to permit lookup by a >> Strong's

Re: [sword-devel] imp2ld and alphabetization

2007-10-29 Thread DM Smith
I have entered this into our "bugs" database: http://www.crosswire.org/bugs/browse/API-91 Serving Him Together, DM On Oct 29, 2007, at 1:00 AM, Troy A. Griffitts wrote: > Yes, everyone is correct that the .next() method on a Lexicon/ > Dictionary > module will show the next value in t

Re: [sword-devel] imp2ld and alphabetization

2007-10-29 Thread DM Smith
On Oct 29, 2007, at 12:49 AM, Chris Little wrote: > DM Smith wrote: >> I'm not sure if I am reading the Sword code correctly, but it appears >> that it is sorting at a byte level and not a character level. That >> isn't by code points. > > I'm pretty sure you're right about what Sword is actually

Re: [sword-devel] imp2ld and alphabetization

2007-10-29 Thread Frank
DM Smith wrote: > I'm not sure if I am reading the Sword code correctly, but it appears > that it is sorting at a byte level and not a character level. That > isn't by code points. > > I think that we discussed this a little bit ago and concluded that > some work needs to be done in the engin

Re: [sword-devel] imp2ld and alphabetization

2007-10-28 Thread Troy A. Griffitts
Yes, everyone is correct that the .next() method on a Lexicon/Dictionary module will show the next value in the index-- not necessarily the next value alphabetized in any humanly useful order. The purpose for the index is fast lookups. We have a few issues to solve here and DM and others have g

Re: [sword-devel] imp2ld and alphabetization

2007-10-28 Thread Chris Little
DM Smith wrote: > I'm not sure if I am reading the Sword code correctly, but it appears > that it is sorting at a byte level and not a character level. That > isn't by code points. I'm pretty sure you're right about what Sword is actually doing, but I believe it's also codepoint order, just b
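On the byte-level versus code point question, a minimal sketch may help. It assumes the keys are stored as UTF-8, which the excerpt above does not state, and the function name `same_order` is made up for illustration. UTF-8 is designed so that comparing encoded bytes gives the same order as comparing code points, so the two sorts below should agree.

```python
def same_order(words):
    """Return True if code point order and UTF-8 byte order agree."""
    # Python compares str values code point by code point.
    by_codepoint = sorted(words)
    # Sorting on the encoded bytes mimics a byte-level comparison
    # (assumption: the keys are UTF-8 encoded).
    by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
    return by_codepoint == by_utf8_bytes


sample = ["zebra", "đá", "da", "ăn", "an", "\U00010348"]
print(same_order(sample))
```

This only shows that the two orderings coincide. It says nothing about whether either one matches the alphabetical order a reader of a given language expects.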

Re: [sword-devel] imp2ld and alphabetization

2007-10-28 Thread DM Smith
I'm not sure if I am reading the Sword code correctly, but it appears that it is sorting at a byte level and not a character level. That isn't by code points. I think that we discussed this a little bit ago and concluded that some work needs to be done in the engine. Here is my thought on th

Re: [sword-devel] imp2ld and alphabetization

2007-10-28 Thread Frank
peter wrote: > Is this really only a Vietnamese problem, but will not all latinate > scripts with extra signs have exactly the same problem? > > Or actually all scripts which are treated as derived scripts - Farsi, > Urdu and Malay from Arabic, Tajik, Uzbek, Azeri from Russian etc - the > code poi

Re: [sword-devel] imp2ld and alphabetization

2007-10-28 Thread peter
Is this really only a Vietnamese problem, but will not all latinate scripts with extra signs have exactly the same problem? Or actually all scripts which are treated as derived scripts - Farsi, Urdu and Malay from Arabic, Tajik, Uzbek, Azeri from Russian etc - the code points are initially for th

Re: [sword-devel] imp2ld and alphabetization

2007-10-28 Thread Daniel Owens
Chris, I imagine that with most languages, sorting according to Unicode codepoint order works, but for Vietnamese it doesn't, probably because the majority of letters are standard Latin characters, but then some are less usual ("đ" being a good example). This is probably very low on the prio
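To make the "đ" case concrete, here is a toy sketch and not anything Sword does. It compares plain code point order with a hand-written alphabet order. The names `VI_ALPHABET` and `vi_key` are hypothetical, and the key ignores tone marks entirely, so it is far from a full Vietnamese collation.

```python
import unicodedata

# Base letters of the Vietnamese alphabet, in dictionary order.
# Simplification: tone-marked vowels are not listed and sort last.
VI_ALPHABET = unicodedata.normalize("NFC", "aăâbcdđeêghiklmnoôơpqrstuưvxy")
RANK = {ch: i for i, ch in enumerate(VI_ALPHABET)}


def vi_key(word):
    """Hypothetical sort key: rank each letter by alphabet position."""
    word = unicodedata.normalize("NFC", word.lower())
    # Letters outside the alphabet fall back to code point order,
    # placed after all known letters.
    return [RANK.get(ch, len(VI_ALPHABET) + ord(ch)) for ch in word]


words = ["em", "đi", "dê"]
print(sorted(words))              # code point order puts "đi" after "em"
print(sorted(words, key=vi_key))  # alphabet order puts "đi" between "dê" and "em"
```

In code point order "đ" (U+0111) lands after every ASCII letter, which is why words starting with it end up at the tail of the list.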

Re: [sword-devel] imp2ld and alphabetization

2007-10-28 Thread Chris Little
Daniel, The order of keys in an LD module is according to the codepoint order in Unicode. The keys are kept in this order to permit binary searching. There is currently no way to perform localized collation. The platform and locale shouldn't play a role in this. If they do, it's a bu
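The reason a fixed key order matters for lookup can be sketched roughly as follows. This is an illustration only and not Sword's index format: `build_index` and `lookup` are hypothetical names, and the keys live in a plain Python list.

```python
from bisect import bisect_left


def build_index(keys):
    """Return the keys de-duplicated and sorted in code point order."""
    return sorted(set(keys))


def lookup(index, key):
    """Binary-search the sorted index; return the position or None."""
    i = bisect_left(index, key)
    if i < len(index) and index[i] == key:
        return i
    return None


index = build_index(["do", "đa", "da", "dz"])
print(index)                # "đa" sorts last in code point order
print(lookup(index, "đa"))
print(lookup(index, "dx"))  # missing key
```

Binary search only works if the comparison used at lookup time is the same one used to sort the index. A locale-aware display order would therefore need either a second index or a sort applied after lookup.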

Re: [sword-devel] imp2ld and alphabetization

2007-10-28 Thread Frank
Daniel Owens wrote: > I am working on creating dictionary modules based on the Free Vietnamese > Dictionary Project. The Vietnamese-English dictionary is working, but > some words are not in alphabetical order, and I am trying to find out > how to maintain the original alphabetization. > > I not

[sword-devel] imp2ld and alphabetization

2007-10-28 Thread Daniel Owens
I am working on creating dictionary modules based on the Free Vietnamese Dictionary Project. The Vietnamese-English dictionary is working, but some words are not in alphabetical order, and I am trying to find out how to maintain the original alphabetization. I noticed this when all of the words