You can have multiple languages in the same index. Just make sure that
your language identification process is consistent.
You might still get some false positives. For example, if a German root
has the same letters as a French root but means something different,
then it might still
On 20 Apr 2006, at 13:34, Daniel Cortes wrote:
How do you obtain the most used words of an index?
Use the terms() and termDocs() from IndexReader. Or if available,
use the term frequency vectors.
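
A rough sketch of the final step, assuming the per-term counts have
already been collected into a map (for example while walking terms()
and counting with termDocs()). The class name TopTerms and the method
topN are made up for illustration, and no Lucene classes are used here;
the map is only a stand-in for whatever counts you gathered. It keeps a
bounded min-heap so that only n candidates are held at any time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper: picks the n most frequent terms from a
// term -> frequency map (a stand-in for counts read from the index).
class TopTerms {

    /** Returns up to n terms, most frequent first. Tie order is unspecified. */
    static List<String> topN(Map<String, Integer> freqs, int n) {
        // Min-heap on frequency: the head is always the weakest candidate.
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(
            Comparator.comparingInt((Map.Entry<String, Integer> e) -> e.getValue()));

        for (Map.Entry<String, Integer> e : freqs.entrySet()) {
            heap.offer(e);
            if (heap.size() > n) {
                heap.poll(); // drop the least frequent term seen so far
            }
        }

        // The heap yields ascending frequency, so reverse at the end.
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> freqs = new HashMap<>();
        freqs.put("lucene", 42);
        freqs.put("index", 17);
        freqs.put("query", 9);
        freqs.put("stem", 3);
        System.out.println(topN(freqs, 2)); // prints [lucene, index]
    }
}
```

Unlike a sorted set keyed on frequency alone, the heap does not discard
words that happen to share the same count.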
Hi all,
I'm working on the search API of a multi-language CMS, and I'm using the
latest Lucene release. I'm using the SnowballAnalyzer in order to have
stemmers for various languages. I know that I must use the same analyzer
for indexing and searching, in order to obtain correct hits, but can
Hi everybody,
I have a simple question for you. How do you obtain the most used
words of an index?
In my case I want to obtain the 10 most used words in a group. I thought
of using a TreeSet with all words and their hit frequencies (with the
restriction of GROUPID).
Does anyone have any
By default the maximum BooleanClause count is 1024. Your query may be
producing more than 1024 clauses. One workaround is to raise the
limit above 1024. You can do that by invoking
the static method
BooleanQuery.setMaxClauseCount(2048);
supriya
Flávio Marim