Please call forceMerge only once, not every time (only to clean up your index)!
If you are doing a reindex already, just fix your close logic as discussed
before.
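Spelled out, the suggestion looks roughly like this (a sketch against the Lucene 4.x IndexWriter API; it assumes the Lucene jars on the classpath and is not runnable standalone):

```java
// One-time cleanup of the accumulated segments, then close normally:
writer.forceMerge(1);   // expensive; run once to collapse the index, not on every close
writer.close();         // the default close() waits for running merges to finish
```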
Scott Smith schrieb:
>Unfortunately, this is a production system which I can't touch (though
>I was able to get a full reindex scheduled for tomorrow morning).
Hi Robert,
On Mar 15, 2013, at 11:29 AM, Robert Muir wrote:
> 2013/2/28 Steve Rowe :
>> EnglishAnalyzer has used PorterStemmer instead of the English Snowball
>> stemmer since it was created in 2010 as part of LUCENE-2055[2]. I think
>> this is an oversight: EnglishAnalyzer should incorporate the best English
>> stemmer we've got, and Martin Porter says the
Hi lukai, thanks for the reply. Do you mean WAND is a way to resolve this
issue? By "no native support", do you mean there is no built-in (or
ready-to-use, external open-source) module in Lucene that implements WAND? If
so, the performance will really be bad.
regards,
Lin
On Sat, Mar 16, 2013 a
Unfortunately, this is a production system which I can't touch (though I was
able to get a full reindex scheduled for tomorrow morning).
Are you suggesting that I do:
writer.forceMerge(1);
writer.close();
instead of just doing the close()?
-Original Message-
From: Simon Willnauer [ma
To answer your first question: "good guess" :-). Yes, this is running on
windows. Sorry, I should have mentioned this.
Your second point was very interesting. My assumption was that the IndexReader
would get closed when the garbage collector realized that these objects were no
longer being us
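The assumption above, that the garbage collector will close stale IndexReaders, is the usual trap: GC-driven cleanup runs at some unspecified later time, if ever, and until then the reader's open file handles keep IndexWriter from deleting old segment files on Windows. A minimal stdlib illustration of the deterministic alternative (the `Handle` class is a hypothetical stand-in for an IndexReader-like resource, not a Lucene API):

```java
import java.io.Closeable;

/**
 * Try-with-resources releases a resource the moment the block exits,
 * whereas relying on the GC gives no timing guarantee at all.
 */
public class ExplicitClose {

    /** Stand-in for an IndexReader-like object holding OS file handles. */
    public static class Handle implements Closeable {
        public boolean open = true;
        @Override public void close() { open = false; }
    }

    /** Uses a Handle and returns it after deterministic cleanup. */
    public static Handle useAndRelease() {
        Handle h = new Handle();
        try (Handle res = h) {
            // ... search against the resource here ...
        }
        // By this point close() has definitely run; no GC involved.
        return h;
    }
}
```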
OK, your configuration seems fine. I would have the following idea:
- Are you using windows? If yes, then IndexWriter cannot remove unused files
when they are still in use (e.g. hold by an open IndexReader)
- When you get a new IndexReader after changes to the index, do you close the
old ones? If
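The second point corresponds to Lucene's reopen idiom (a sketch against the Lucene 4.x API, not runnable standalone; `oldReader` is assumed to be the previously opened DirectoryReader):

```java
// Open a fresh reader only when the index actually changed,
// and close the old one so its segment files can be deleted.
DirectoryReader newReader = DirectoryReader.openIfChanged(oldReader);
if (newReader != null) {     // null means the index is unchanged
    oldReader.close();       // without this, Windows keeps the old
                             // segment files locked and undeletable
    oldReader = newReader;
}
```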
On Sat, Mar 16, 2013 at 12:02 AM, Scott Smith wrote:
> " Do you always close IndexWriter after adding few documents and when
> closing, disable "wait for merge"? In that case, all merges are interrupted
> and the merge policy never has a chance to merge at all (because you are
> opening and closing IndexWriter all the time with cancelling all merges)?"
" Do you always close IndexWriter after adding few documents and when closing,
disable "wait for merge"? In that case, all merges are interrupted and the
merge policy never has a chance to merge at all (because you are opening and
closing IndexWriter all the time with cancelling all merges)?"
Here's the code for the writer:

    IndexWriterConfig iwc = new IndexWriterConfig(Constants.LUCENE_VERSION,
        _analyzer);
    LogByteSizeMergePolicy lbsm = new LogByteSizeMergePolicy();
    lbsm.setUseCompoundFile(true);
    iwc.setMergePolicy(lbsm);
    Directory fsDir = FSDire
A little more data: of the 3330 files in the index, 2173 are CFS files and
average 120 kB. Another 1116 files are .del's and average about 4 kB. The
remaining .prx, .frq, etc. consist of 41 files and total only 101 MB. The
largest files are 3 .prx files which total less than 60 MB and 2 .frq of a
Hi,
with standard configuration, this cannot happen. What merge policy do you use?
This looks to me like a misconfigured merge policy or using the NoMergePolicy.
With 3,000 segments, it will be slow, the question is, why do you get those?
Another thing could be: Do you always close IndexWriter
Can you tell us a little more about how you use Lucene: how do you
index, do you use NRT or do you open an IndexReader for every request,
do you maybe use a custom merge policy or something like this, any
special IndexWriter settings?
On Fri, Mar 15, 2013 at 11:15 PM, Scott Smith wrote:
> We have a
Great to read this, there is hope!
And Luke definitely deserves to be a Lucene module.
Wouter
> If anyone is able to donate some effort, a nice future scenario could be
> that Luke comes fully up to date with every Lucene release:
> https://issues.apache.org/jira/browse/LUCENE-2562
>
> - Mark
>
>
We have a system that is using lucene and the searches are very slow. The
number of documents is fairly small (less than 30,000) and each document is
typically only 2 to 10 kilo-characters. Yet, searches are taking 15-16 seconds.
One of the things I noticed was that the index directory has sev
If anyone is able to donate some effort, a nice future scenario could be that
Luke comes fully up to date with every Lucene release:
https://issues.apache.org/jira/browse/LUCENE-2562
- Mark
On Mar 15, 2013, at 5:58 AM, Eric Charles wrote:
> For the record, I happily use Luke (with Lucene 4.1)
I have implemented WAND with Solr/Lucene. So far there is no performance
issue. There is no native support for this functionality; you need to
implement it yourself.
On Fri, Mar 15, 2013 at 10:09 AM, Lin Ma wrote:
> Hello guys,
>
> Supposing I have one million documents, and each document has hundreds of
> features.
On Mar 15, 2013, at 11:25 AM, "Uwe Schindler" wrote:
> Hi,
>
> The API did not really change.
The API definitely did change, as before you would override the now-final
tokenStream method. But you are correct that this was not the root of the
problem.
> The bug is in your test:
> If you wou
Hi,
The API did not really change. The bug is in your test:
If you would carefully read the javadocs of the TokenStream interface, you
would notice that your consumer does not follow the correct workflow:
http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/analysis/TokenStream.html
In sh
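The workflow the javadocs describe looks like this as a consumer (a sketch against the Lucene 4.x analysis API; requires the lucene-core and lucene-analyzers jars, so not runnable standalone):

```java
// Correct TokenStream consumer workflow per the 4.x javadocs:
TokenStream ts = analyzer.tokenStream("field", "some text");
CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
try {
    ts.reset();                  // mandatory before the first incrementToken()
    while (ts.incrementToken()) {
        System.out.println(term.toString());
    }
    ts.end();                    // records the final offset state
} finally {
    ts.close();                  // releases resources, allows stream reuse
}
```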
Hi everyone,
I am trying to port forward to 4.2 some Lucene 3.2-era code that uses the
ASCIIFoldingFilter.
The token stream handling has changed significantly since then, and I cannot
figure out what I am doing wrong.
It seems that I should extend AnalyzerWrapper so that I can intercept the
To
Hello guys,
Supposing I have one million documents, and each document has hundreds of
features. For a given query, it also has hundreds of features. I want to
fetch the most relevant top 1000 documents by the dot product of query and
document features (query/document features are in the same feat
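Since Lucene 4.x ships no WAND operator, the brute-force fallback the thread alludes to is scoring every candidate and keeping the best k in a bounded min-heap, which costs O(n log k) rather than a full sort. All names below are illustrative, not Lucene APIs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Brute-force top-k retrieval by dot product over dense feature vectors. */
public class TopKDotProduct {

    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    /** Returns the ids of the k highest-scoring documents, best first. */
    public static List<Integer> topK(double[][] docs, double[] query, int k) {
        // Min-heap ordered by score: the root is always the weakest of the
        // current top-k, so it is the one to evict when a better doc arrives.
        PriorityQueue<double[]> heap =
            new PriorityQueue<>((x, y) -> Double.compare(x[1], y[1]));
        for (int id = 0; id < docs.length; id++) {
            double score = dot(docs[id], query);
            if (heap.size() < k) {
                heap.offer(new double[] { id, score });
            } else if (score > heap.peek()[1]) {
                heap.poll();
                heap.offer(new double[] { id, score });
            }
        }
        List<Integer> ids = new ArrayList<>();
        while (!heap.isEmpty()) ids.add(0, (int) heap.poll()[0]); // best first
        return ids;
    }
}
```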
You have to reindex.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: ash nix [mailto:nixd...@gmail.com]
> Sent: Friday, March 15, 2013 4:57 PM
> To: java-user@lucene.apache.org
> Subject: re-indexing a f
Hi,
I have a time stamp field which I should have indexed as a DoubleField for
NumericRangeQuery queries/filters to work.
I got it indexed as a DoubleDocValuesField.
Is it possible to reindex this field?
I don't want to create a new index as it will take a lot of time.
Pointer to some document or blog on reindexi
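As the reply in this thread says, the answer is to reindex: the on-disk field encoding cannot be changed in place. When rebuilding, the timestamp can be added both ways so range filtering and docvalues lookups both work (a sketch against the Lucene 4.x document API; `tsValue`, `doc`, and `writer` are assumed context, and the field name is illustrative):

```java
Document doc = new Document();
// Indexed numeric field: enables NumericRangeQuery / NumericRangeFilter
doc.add(new DoubleField("timestamp", tsValue, Field.Store.NO));
// DocValues field: enables sorting and per-document value lookup
doc.add(new DoubleDocValuesField("timestamp", tsValue));
writer.addDocument(doc);
```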
2013/2/28 Steve Rowe :
> EnglishAnalyzer has used PorterStemmer instead of the English Snowball
> stemmer since it was created in 2010 as part of LUCENE-2055[2]. I think this
> is an oversight: EnglishAnalyzer should incorporate the best English stemmer
> we've got, and Martin Porter says the
Awesome Steve, I'll try that and let you know. Thank you all for answers.
On Fri, Mar 15, 2013 at 12:24 AM, Steve Rowe wrote:
> Hi Bratislav,
>
> LUCENE-4517 sounds like what you want: <
> https://issues.apache.org/jira/browse/LUCENE-4517>: "Suggesters: allow to
> pass a user-defined predicate/f
For the record, I happily use Luke (with Lucene 4.1) compiled from
https://github.com/sonarme/luke. It is also mavenized (shipped with a
pom.xml).
Thx, Eric
On 14/03/2013 09:10, dizh wrote:
OK, tomorrow I will put it somewhere such as GitHub or googlecode.
But, I really don't look into