Hi,
you cannot change the behavior of predefined analyzers! But since Lucene 5
there is no need to write your own subclass to define a custom analyzer. Just
use CustomAnalyzer and define via its fluent builder API what your analysis
should look like (see the example in the Javadocs):
https://lucene.apache.
Dear Users,
I need to develop my language specific analyzer that:
1) does not remove punctuation
2) lowercases and stems each term in the text.
I have tried some of the pre-implemented language analyzers (e.g. the German and
Italian analyzers), but they remove punctuation. I'm not sure, but
probably
Returning "false" for "out of order" saves 1 sec per 1M records; in the end it
saves 500 sec, or ~10 minutes!
Thank you!
> On 14 Nov 2015, at 15:54, Uwe Schindler wrote:
>
> For performance reasons, I would also return "false" for "out of order"
> documents. This allows to access stored fiel
Thank you!
Will follow your suggestion.
> On 14 Nov 2015, at 15:54, Uwe Schindler wrote:
>
> For performance reasons, I would also return "false" for "out of order"
> documents. This allows to access stored fields in a more effective way
> (otherwise it seeks too much). For this type o
For performance reasons, I would also return "false" for "out of order"
documents. This allows to access stored fields in a more effective way
(otherwise it seeks too much). For this type of collector the IO cost is higher
than the small computing performance increase caused by out of order docu
Thank you very much!
> On 14 Nov 2015, at 15:49, Uwe Schindler wrote:
>
> Hi,
>
> This code is buggy! The collect() call of the collector does not get a
> document ID relative to the top-level IndexSearcher, it only gets a document
> id relative to the reader reported in setNextReader
Hi,
This code is buggy! The collect() call of the collector does not get a document
ID relative to the top-level IndexSearcher, it only gets a document id relative
to the reader reported in setNextReader (which is an atomic reader responsible
for a single Lucene index segment).
In setNextReader
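
To illustrate the point about segment-relative document IDs, here is a minimal sketch in plain Java that does not use Lucene at all. The names Segment, GlobalIdCollector and collectAll are hypothetical stand-ins invented for this illustration. The sketch only assumes that each segment has a starting offset (in the Lucene collector API of that era, the context passed to setNextReader exposes it as docBase) and that collect() receives an ID relative to that segment, so the top-level ID is docBase plus the segment-relative ID.

```java
import java.util.ArrayList;
import java.util.List;

public class DocBaseSketch {

    /** Hypothetical stand-in for one index segment: its offset and its size. */
    static final class Segment {
        final int docBase; // ID of the segment's first document in the whole index
        final int maxDoc;  // number of documents in this segment

        Segment(int docBase, int maxDoc) {
            this.docBase = docBase;
            this.maxDoc = maxDoc;
        }
    }

    /** Hypothetical collector that rebases segment-relative IDs to top-level IDs. */
    static final class GlobalIdCollector {
        private int docBase;
        final List<Integer> globalIds = new ArrayList<>();

        /** Called once per segment, before any collect() call for that segment. */
        void setNextReader(Segment segment) {
            docBase = segment.docBase;
        }

        /** Receives an ID relative to the current segment, not to the whole index. */
        void collect(int segmentDocId) {
            globalIds.add(docBase + segmentDocId);
        }
    }

    /** Simulates a search that matches every document in every segment. */
    static List<Integer> collectAll(int[] segmentSizes) {
        GlobalIdCollector collector = new GlobalIdCollector();
        int docBase = 0;
        for (int size : segmentSizes) {
            Segment segment = new Segment(docBase, size);
            collector.setNextReader(segment);
            for (int doc = 0; doc < segment.maxDoc; doc++) {
                collector.collect(doc);
            }
            docBase += size;
        }
        return collector.globalIds;
    }

    public static void main(String[] args) {
        // Two segments holding 3 and 2 documents.
        System.out.println(collectAll(new int[] {3, 2}));
    }
}
```

Without the docBase addition, the second segment would report IDs 0 and 1 again, and they would be mistaken for documents in the first segment.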
Hi, Uwe.
Thanks for your advice.
After implementing your suggestion, our calculation time dropped from ~20 days
to 3.5 hours.
/**
 * DocumentFound - callback function for each document
 */
public void iterate(SearchOptions options, final DocumentFound found, final
Set loadFields) throws Exc
Thank you all,
I will further fix and investigate!
On Nov 14, 2015 10:00, "Uwe Schindler" wrote:
> I agree. On Linux it is impossible that MMapDirectory is the reason! Only
> on Windows you cannot delete still open/mapped files.
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
>
I agree. On Linux it is impossible that MMapDirectory is the reason! Only on
Windows you cannot delete still open/mapped files.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -----Original Message-----
> From: Michael McCandless [mailto: