"Antony Bowesman" <[EMAIL PROTECTED]> wrote:
The writer method does not return the number of deleted documents. Is there a
technical reason why this is not done?
I am planning to see about converting my batch deletions using IndexReader to
IndexWriter, but I'm currently using the return value to record stats.
Does the following give th…
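(Side note for anyone following along: the asymmetry being discussed looks
roughly like this in 2.x-era code. The index path and the "id" field are
placeholders, and the closing comment is my reading of why the writer can't
return a count, not an answer given in this thread.)

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;

public class DeleteCountDemo {
    public static void main(String[] args) throws Exception {
        // IndexReader.deleteDocuments(Term) reports how many documents
        // matched the term and were marked deleted.
        IndexReader reader = IndexReader.open("/path/to/index");
        int deleted = reader.deleteDocuments(new Term("id", "12345"));
        System.out.println("deleted " + deleted + " docs");
        reader.close();

        // The writer's delete-by-term method is void: the writer only
        // buffers the term and applies the deletes later (possibly before
        // the matching docs are even flushed), so the count isn't known
        // at call time -- a plausible technical reason, though only a guess.
    }
}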
Suman Ghosh wrote:
The search functionality must be available during the index build. Since a
relatively small number of documents are being affected during the build
process (and we also plan to perform the build during a period we know,
from the last two years of site-access data, to be relatively quiet), we
hope tha…
Michael McCandless wrote:
This looks correct to me. It's good you are doing the deletes
"in bulk" up front for each batch of documents. So I guess you
hit the error (and the 5000 segment files) while processing batches
of 200 docs (because you then optimize at the end)?
Do you search this index while it's building, or only…
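(For context, the delete-then-add batch pattern Mike is endorsing, sketched
from the thread's description rather than the actual application code; the
"id" field, the pre-generics style, and StandardAnalyzer are assumptions:)

import java.util.List;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class BatchUpdate {
    // Deletes "in bulk" up front, then adds the batch; optimize() runs
    // once at the very end of the whole build, not per batch.
    static void updateBatch(String indexDir, List /*of Document*/ batch,
                            StandardAnalyzer analyzer) throws Exception {
        // Pass 1: delete the old copies of the whole batch with one reader.
        IndexReader reader = IndexReader.open(indexDir);
        for (int i = 0; i < batch.size(); i++) {
            Document doc = (Document) batch.get(i);
            reader.deleteDocuments(new Term("id", doc.get("id")));
        }
        reader.close();

        // Pass 2: add the new copies with one writer.
        IndexWriter writer = new IndexWriter(indexDir, analyzer, false);
        for (int i = 0; i < batch.size(); i++)
            writer.addDocument((Document) batch.get(i));
        writer.close();
    }
}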
Yonik Seeley wrote:
Actually, in previous versions of Lucene, it *was* possible to get way
too many first-level segments because of the wonky logic when the
IndexWriter was closed. That has been fixed in the trunk with the new
merge policy, and you will never see more than mergeFactor first-level
segments.
Suman Ghosh wrote:
Mike,
Below is the pseudo-code of the application. A few implementation
points to understand it:
- We have a home-grown threadpool class that allows us to index
multiple documents in parallel (a rough sketch follows). We usually
submit 200 jobs to the pool (usually 2-3 worker threads for the pool). O…
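(The home-grown pool itself isn't shown in the thread, but the shape
described -- roughly 200 jobs per batch, 2-3 workers -- maps directly onto
java.util.concurrent. A minimal sketch under those assumptions, with the
actual per-document indexing work elided:)

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class IndexingPool {
    public static void main(String[] args) throws Exception {
        // 2-3 worker threads, ~200 jobs per batch, as described above.
        ExecutorService pool = Executors.newFixedThreadPool(3);
        for (int i = 0; i < 200; i++) {
            final int jobId = i;
            pool.submit(new Runnable() {
                public void run() {
                    // Extract text (e.g. from a large PDF) and index one
                    // document; those details live in the poster's code.
                    System.out.println("indexed job " + jobId);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}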
Suman Ghosh wrote:
Mike,
I've not tried it yet, but I think the problem can be reproduced.
However, it'll take a few hours to reach that threshold since my code
also needs to extract text from some very large PDF documents to store
in the index.
I'll post the pseudo-code tomorrow. Maybe that'll help poi…
Yonik,
Thanks for the pointer. I'll try the nightly build once the change is committed.
Suman
On 11/27/06, Suman Ghosh <[EMAIL PROTECTED]> wrote:
Here are the values:
mergeFactor=10
maxMergeDocs=10
minMergeDocs=100
And I see your point. At the time of the crash, I have over 5000
segments. I'll try a more conservative number and try to rebuild the
index. Although I don't see how thos…
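(On a 2.0-era IndexWriter these knobs map onto setters -- minMergeDocs was
renamed setMaxBufferedDocs -- and, if I'm reading them right, maxMergeDocs=10
is the likely culprit: it caps every merged segment at 10 documents, so the
segment count grows roughly linearly with the doc count, which would be
consistent with the 5000+ segments seen at crash time. A sketch of the
configuration above; the path is a placeholder:)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class WriterTuning {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter("/path/to/index",
                new StandardAnalyzer(), true);
        writer.setMergeFactor(10);      // merge segments 10 at a time
        // Caps any merged segment at 10 docs: no segment ever grows past
        // that, so e.g. a 50,000-doc index ends up with ~5000 segments.
        // A much larger value (the default is Integer.MAX_VALUE) avoids this.
        writer.setMaxMergeDocs(10);
        writer.setMaxBufferedDocs(100); // formerly minMergeDocs
        writer.close();
    }
}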
On 11/27/06, Suman Ghosh <[EMAIL PROTECTED]> wrote:
The last line [at
org.apache.lucene.index.MultiTermDocs.next(MultiReader.java:349)]
repeats another 1010 times before the program crashes.
I understand that without the actual index or the documents, it's
nearly impossible to narrow down the cause…
Suman Ghosh wrote:
…with a StackOverflowError while
calling the indexReader.deleteDocuments(new Term(…)) method (even for a
document that was indexed earlier). Here is the partial stack trace:
Exception in thread "main" java.lang.StackOverflowError
at java.lang.ref.Reference.<init>(Reference.java…
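(Each repeated MultiTermDocs.next frame steps past one exhausted segment, so
the recursion depth tracks the segment count, and with 5000+ segments the
stack runs out. A hedged mitigation sketch -- my suggestion, not something
proposed in the thread; the path and analyzer are placeholders:)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class CollapseSegments {
    public static void main(String[] args) throws Exception {
        // Merge the thousands of segments down before bulk-deleting, so
        // the recursive MultiTermDocs.next() never has to step across
        // thousands of readers.
        IndexWriter writer = new IndexWriter("/path/to/index",
                new StandardAnalyzer(), false);
        writer.optimize();  // merges everything into a single segment
        writer.close();
        // Alternatively, a larger thread stack (e.g. java -Xss4m ...)
        // would only postpone the overflow, not remove the cause.
    }
}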
On 10/16/06, EDMOND KEMOKAI <[EMAIL PROTECTED]> wrote:
Can somebody please clarify the intended behaviour of
IndexReader.deleteDocuments()?
It deletes documents containing the term. The API docs are correct,
the demo docs are incorrect if they say otherwise.
-Yonik
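(A self-contained illustration of the behaviour Yonik describes; the
RAMDirectory and the "id" field are mine, purely for demonstration:)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.RAMDirectory;

public class DeleteByTermDemo {
    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        for (int i = 0; i < 3; i++) {
            Document doc = new Document();
            doc.add(new Field("id", "doc" + i,
                    Field.Store.YES, Field.Index.UN_TOKENIZED));
            writer.addDocument(doc);
        }
        writer.close();

        // Deletes only the document(s) CONTAINING the term, per the javadoc.
        IndexReader reader = IndexReader.open(dir);
        int n = reader.deleteDocuments(new Term("id", "doc1"));
        System.out.println(n + " deleted, " + reader.numDocs() + " remain");
        // prints: 1 deleted, 2 remain
        reader.close();
    }
}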
Can somebody please clarify the intended behaviour of
IndexReader.deleteDocuments()? Between the various documentations and
implementations it seems this function is broken. The API doc says it should
delete docs containing the provided term, but instead it deletes all
documents not containing the…
The javadoc is right. :)
Otis
----- Original Message -----
From: EDMOND KEMOKAI <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Sunday, October 15, 2006 12:49:21 AM
Subject: IndexReader.deleteDocuments
EDMOND KEMOKAI wrote:
Hi guys,
I am a newbie so excuse me if this is a repost. From the javadoc it seems
IndexReader.deleteDocuments deletes only documents that have the provided
term, but from the implementation examples I have seen and from the
behaviour of my own app, deleteDocuments(term) deletes documents that
don't have the provided term.