I'm unable to reproduce this.
Jason, have you tried on other computers (to rule out e.g. bad RAM/IO)?
Mike
On Wed, Mar 25, 2009 at 6:39 PM, Jason Rutherglen wrote:
> LuceneError when executed should reproduce the failure. The
> contrib/benchmark libraries are required. MultiThreadDocAdd is a
> multithreaded indexing utility class.
Another thing to try is limiting the max number of merge threads
ConcurrentMergeScheduler (CMS) will run at once. It defaults to 3 now.
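For illustration, a minimal sketch of capping the merge threads against the
2.4-era IndexWriter API; the MergeThrottle class name and the writer argument
are assumptions, not code from this thread:

import org.apache.lucene.index.ConcurrentMergeScheduler;
import org.apache.lucene.index.IndexWriter;

class MergeThrottle {
  // Cap ConcurrentMergeScheduler at a single background merge thread (default is 3).
  static void limitMergeThreads(IndexWriter writer) {
    ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();
    cms.setMaxThreadCount(1);
    writer.setMergeScheduler(cms);
  }
}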
Mike
On Thu, Mar 26, 2009 at 2:08 PM, Jason Rutherglen wrote:
> I used the NoMergePolicy to build the index, as I noticed the indexing is
> faster, meaning the system simply creates large multi-megabyte segments in
> the RAM buffer, flushes them out, and doesn't worry about merging, which
> causes massive disk thrashing. [...]
I used the NoMergePolicy to build the index, as I noticed the indexing is
faster, meaning the system simply creates large multi-megabyte segments in
the RAM buffer, flushes them out, and doesn't worry about merging, which
causes massive disk thrashing. I am pondering some benchmarks to find the
optimal settings.
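As a rough sketch of the setup described above (not the actual code from this
thread; NoMergePolicy is the singleton that later Lucene releases ship, while
in the 2.4-era code under discussion a custom MergePolicy returning no merges
plays the same role):

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.NoMergePolicy;

class BulkBuild {
  // Disable background merging during the bulk build so large segments are
  // flushed straight from the RAM buffer and never rewritten by merges.
  static void configure(IndexWriter writer) {
    writer.setMergePolicy(NoMergePolicy.NO_COMPOUND_FILES);
    writer.setRAMBufferSizeMB(256.0);  // bigger RAM buffer => fewer, larger flushed segments
  }
}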
LuceneError when executed should reproduce the failure. The
contrib/benchmark libraries are required. MultiThreadDocAdd is a
multithreaded indexing utility class.
On Wed, Mar 25, 2009 at 1:06 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
> Each document is being created in a single thread, and the fields of the
> document are not being updated elsewhere. [...]
Each document is being created in a single thread, and the fields of the
document are not being updated elsewhere. I haven't posted the full code
yet as it needs to be cleaned up. Thanks, Mike!
On Tue, Mar 24, 2009 at 2:43 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
> It looks like you are reusing a Field (the f.setValue(...) calls); are
> you sure you're not changing a Document/Field while another thread is
> adding it to the index? [...]
It looks like you are reusing a Field (the f.setValue(...) calls); are
you sure you're not changing a Document/Field while another thread is
adding it to the index?
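For illustration, a minimal sketch of the safe pattern being asked about,
using the 2.4-era Field.setValue() reuse API; the class name, field name and
ThreadLocal layout are assumptions, not code from this thread:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

class PerThreadDocs {
  // Each indexing thread gets its own Document/Field pair, so setValue() reuse
  // never races with another thread's in-flight addDocument() call.
  static final ThreadLocal<Document> DOC = new ThreadLocal<Document>() {
    protected Document initialValue() {
      Document doc = new Document();
      doc.add(new Field("body", "", Field.Store.NO, Field.Index.ANALYZED));
      return doc;
    }
  };

  static void index(IndexWriter writer, String text) throws Exception {
    Document doc = DOC.get();
    ((Field) doc.getFieldable("body")).setValue(text);  // reuse is safe within one thread
    writer.addDocument(doc);
  }
}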
If you can post the full code, then I can try to run it on my
wikipedia dump locally.
Mike
Jason Rutherglen wrote:
> Mike,
>
> It only happens when at least 1 million documents are indexed in a
> multithreaded fashion. [...]
Mike,
It only happens when at least 1 million documents are indexed in a
multithreaded fashion. Maybe I should post the code? I will try indexing
without the payload field; I assume it won't fail, since I indexed
wikipedia before with no issues.
Thanks!
Jason
On Tue, Mar 24, 2009 at 12:25 PM
Using StandardAnalyzer. It's probably the payload field?
This is the code that creates the payload field:
private static class SinglePayloadTokenStream extends TokenStream {
  private Token token = new Token(UID_TERM.text(), 0, 0);
  private byte[] buffer = new byte[4];
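A hedged sketch of how such a single-token payload stream could look in full
against the 2.4-era TokenStream/Payload API; UID_TERM, setUID and the
enclosing class are assumptions, not necessarily the code being discussed:

import java.io.IOException;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.index.Payload;
import org.apache.lucene.index.Term;

class UidPayloadField {
  // Assumed constant: one shared term for every document's UID field.
  static final Term UID_TERM = new Term("_UID", "_UID");

  // Emits exactly one token whose payload carries a 4-byte UID.
  static class SinglePayloadTokenStream extends TokenStream {
    private Token token = new Token(UID_TERM.text(), 0, 0);
    private byte[] buffer = new byte[4];
    private boolean returnToken = false;

    void setUID(int uid) {
      buffer[0] = (byte) uid;
      buffer[1] = (byte) (uid >> 8);
      buffer[2] = (byte) (uid >> 16);
      buffer[3] = (byte) (uid >> 24);
      token.setPayload(new Payload(buffer));
      returnToken = true;               // arm the stream for the next addDocument()
    }

    public Token next() throws IOException {
      if (returnToken) {
        returnToken = false;
        return token;                   // one token per document, then the stream is exhausted
      }
      return null;
    }
  }
}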
I was just able to index all of wikipedia, using StandardAnalyzer,
with assertions enabled, without hitting that exception. Which
analyzer are you using (besides your payload field)?
Mike
Michael McCandless wrote:
> H.
>
> Jason, is this easily/compactly repeated? E.g., try to index the N docs
> before that one. [...]
H.
Jason, is this easily/compactly repeated? E.g., try to index the N docs
before that one.
If you remove the SinglePayloadTokenStream field, does the exception
still happen?
Mike
Jason Rutherglen wrote:
> This happens while indexing using
> contrib/org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker. [...]
This happens while indexing using
contrib/org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker. The
assertion error is from TermsHashPerField.comparePostings(RawPostingList p1,
RawPostingList p2). A Payload representing a UID is added to each document.
Only 1-2 out of 1 million documents indexed generate the assertion error.
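For context on the per-document UID payload described above, this is roughly
how such a payload-only field might be attached in 2.4-era Lucene; it assumes
the UidPayloadField sketch above, and the method name and field wiring are
illustrative, not the poster's code:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

class UidIndexing {
  // One stream per indexing thread; reset it with the current document's UID
  // right before each addDocument() call.
  static void addWithUid(IndexWriter writer, Document doc,
                         UidPayloadField.SinglePayloadTokenStream uidStream,
                         int docUid) throws Exception {
    uidStream.setUID(docUid);                                         // 4-byte payload for this doc
    doc.add(new Field(UidPayloadField.UID_TERM.field(), uidStream));  // Field(String, TokenStream): indexed, payload-only
    writer.addDocument(doc);
  }
}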