Re: [sword-devel] Chinese lucene problem

2012-10-07 Thread DM Smith
Because it is module.createSearchFramework, it has access to the conf and could vector to the right analyzer. It would be a very small change to the code, but with big impact.

-- DM

On Oct 7, 2012, at 7:40 PM, Karl Kleinpaste wrote:
> DM Smith writes:
>> For JSword, we use the language code
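
The idea of vectoring from the conf to an analyzer can be sketched roughly as below. This is only an illustration in Python, not SWORD or JSword code: the function names `parse_conf_lang` and `pick_analyzer`, the mapping table, and the analyzer names used as plain strings are all assumptions made for the example. It assumes the module conf carries a `Lang=` entry and falls back to the standard analyzer when the language is unknown.

```python
# Hypothetical sketch: choose an analyzer name from a module's conf language.
# None of these names come from SWORD or JSword; they are illustrative only.

DEFAULT_ANALYZER = "StandardAnalyzer"

# Illustrative mapping from primary language code to an analyzer name.
ANALYZER_BY_LANG = {
    "zh": "CJKAnalyzer",
    "ja": "CJKAnalyzer",
    "ko": "CJKAnalyzer",
}


def parse_conf_lang(conf_text):
    """Return the value of the Lang= entry in a conf file's text, or None."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "lang":
            return value.strip()
    return None


def pick_analyzer(lang):
    """Map a language code (e.g. 'zh' or 'zh-Hant') to an analyzer name."""
    if not lang:
        return DEFAULT_ANALYZER
    # Use only the primary subtag, so 'zh-Hant' and 'zh_CN' both become 'zh'.
    primary = lang.replace("_", "-").split("-")[0].lower()
    return ANALYZER_BY_LANG.get(primary, DEFAULT_ANALYZER)
```

A caller would read the conf once when building the index and pass the result through, for example `pick_analyzer(parse_conf_lang(conf_text))`, so the generic entry point needs no extra parameter.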

Re: [sword-devel] Chinese lucene problem

2012-10-07 Thread Karl Kleinpaste
DM Smith writes:
> For JSword, we use the language code as supplied in the conf to vector
> into the selection of the best analyzer.

OK, well, considering that the regular Sword interface to this is particularly generic, i.e. module.createSearchFramework(...), providing no way to pick a desired a

Re: [sword-devel] Chinese lucene problem

2012-10-07 Thread DM Smith
SWORD uses an English analyzer (StandardAnalyzer) that works well for Latin-1 languages and for languages that bear some passing similarity to English (e.g. spaces between words, phonetic spelling, ...), but it does not do well with others. The Lucene project has a few Chinese analyzers. Basica
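
To illustrate why an analyzer that relies on spaces between words struggles with Chinese, here is a small Python sketch. It is not Lucene or CLucene code: `whitespace_tokens` and `cjk_bigrams` are hypothetical helpers, and the bigram function is a simplified stand-in for the overlapping-bigram approach that CJK-oriented analyzers commonly take. Real analyzers also handle punctuation and mixed scripts, which this sketch ignores.

```python
# Hypothetical sketch comparing space-based tokenizing with overlapping bigrams.

def whitespace_tokens(text):
    """Split on whitespace, as a space-delimited language would expect."""
    return text.split()


def cjk_bigrams(text):
    """Return overlapping two-character tokens, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return chars
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


verse = "起初神创造天地"

# With no spaces in the text, the whole phrase becomes a single token,
# so a search for a word inside it finds nothing.
print(whitespace_tokens(verse))

# Overlapping bigrams yield searchable two-character units such as 创造.
print(cjk_bigrams(verse))
```

With the whitespace approach, a query for 创造 cannot match because the only indexed token is the entire phrase; with bigrams, 创造 is one of the indexed tokens.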

[sword-devel] Chinese lucene problem

2012-10-07 Thread Karl Kleinpaste
We've got a bug report in Xiphos saying that Chinese modules can't be searched well with CLucene indices.
https://sourceforge.net/p/gnomesword/bugs/488/
I know nothing at all about Chinese, and can't address this. Can anyone supply some info?