In article <[EMAIL PROTECTED]>
[EMAIL PROTECTED] writes:

>> > > This is a showstopper, as we have 2 old and 2 new Chinese lists in the
>> > > archive. :/ It would be very bastardish to leave those lists with Glimpse.
>> > > Then again, I'm not sure Glimpse works fine with those lists, being so old,
>> > > made in rather i18n-deprived times... Can someone confirm that searching
>> > > -chinese-* lists produces correct output, please? (Anthony?)
>> > 
>> > I've talked to the upstream people and while they are keen, they have no
>> > idea how to implement it.  The main problem is none of their programmers
>> > live in a place with dual-byte character sets.
>> 
>> Maybe they can get help from the people who developed namazu... just a hint.

Did you call me? :-)

Hmm... I don't know about Chinese. I think it is hard to determine
word boundaries in Chinese, so a word segmentation tool is needed to
process it (like kakasi or chasen for Japanese). I looked through the
output of "apt-cache search chinese", but there seems to be no such
tool...

There is another solution: the "letter indexing approach", which
indexes individual characters (or short runs of characters) instead of
words. However, that approach is more difficult to implement than the
"word indexing approach". It would be hard to implement in Glimpse.
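
To illustrate what I mean by letter indexing, here is a rough sketch in
Python. It is not how namazu or Glimpse work; the function names
(char_ngrams, build_index, search) and the sample documents are made up
for illustration, and I assume overlapping 2-character sequences as the
index unit. No word segmentation is needed, but the index only yields
candidates, so each hit is confirmed with a plain substring check.

```python
from collections import defaultdict


def char_ngrams(text, n=2):
    """Return the overlapping n-character sequences of text, whitespace removed."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def build_index(docs, n=2):
    """Map every character n-gram to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in char_ngrams(text, n):
            index[gram].add(doc_id)
    return index


def search(index, docs, query, n=2):
    """Return ids of documents that contain query as a contiguous string."""
    needle = "".join(query.split())
    if not needle:
        return set()
    grams = char_ngrams(needle, n)
    if grams:
        # Only documents containing every n-gram of the query can match.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # Query shorter than n characters: the index cannot help, scan everything.
        candidates = set(docs)
    # Sharing all n-grams does not guarantee adjacency, so verify each candidate.
    return {d for d in candidates if needle in "".join(docs[d].split())}


# Hypothetical sample documents, for illustration only.
docs = {
    1: "我喜欢自由软件",
    2: "中文搜索很难",
    3: "Debian 中文邮件列表",
}
index = build_index(docs)
print(search(index, docs, "中文"))
print(search(index, docs, "自由软件"))
```

The cost is a much larger index than with word indexing, since every
position in the text produces an entry, plus the extra verification
pass over the candidates.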
-- 
NOKUBI Takatsugu
E-mail: [EMAIL PROTECTED]
        [EMAIL PROTECTED] (Debian-JP)
