Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-29 Thread Michael McCandless
I'm unable to reproduce this. Jason, have you tried on other computers (to rule out, e.g., bad RAM/IO)? Mike On Wed, Mar 25, 2009 at 6:39 PM, Jason Rutherglen wrote: > LuceneError when executed should reproduce the failure. The > contrib/benchmark libraries are required. MultiThreadDocAdd is a > m

Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-26 Thread Michael McCandless
Another thing to try is limiting the max number of merge threads that CMS (ConcurrentMergeScheduler) will run at once. It defaults to 3 now. Mike On Thu, Mar 26, 2009 at 2:08 PM, Jason Rutherglen wrote: > I used the NoMergePolicy to build the index as I noticed the indexing is > faster, meaning the system simply creates large multi-megabyte

Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-26 Thread Jason Rutherglen
I used the NoMergePolicy to build the index as I noticed the indexing is faster, meaning the system simply creates large multi-megabyte segments in the RAM buffer, flushes them out, and doesn't worry about merging, which causes massive disk thrashing. I am pondering some benchmarks to find the optima

Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-25 Thread Jason Rutherglen
LuceneError when executed should reproduce the failure. The contrib/benchmark libraries are required. MultiThreadDocAdd is a multithreaded indexing utility class. On Wed, Mar 25, 2009 at 1:06 PM, Jason Rutherglen < jason.rutherg...@gmail.com> wrote: > Each document is being created in a single

Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-25 Thread Jason Rutherglen
Each document is being created in a single thread, and the fields of the document are not being updated elsewhere. I haven't posted the full code yet as it needs to be cleaned up. Thanks Mike! On Tue, Mar 24, 2009 at 2:43 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > It looks like y

Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-24 Thread Michael McCandless
It looks like you are reusing a Field (the f.setValue(...) calls); are you sure you're not changing a Document/Field while another thread is adding it to the index? If you can post the full code, then I can try to run it on my wikipedia dump locally. Mike Jason Rutherglen wrote: > Mike, > > It
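
A minimal sketch of the concern raised above, assuming modern Java: if a mutable field object is reused, each indexing thread should own its own instance so that no thread changes a value while another thread is still reading it. `MutableField` and `index` are hypothetical stand-ins for Lucene's `Field` and the add-document call; this is not the code from the thread, which was never posted in full.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadFieldReuse {
    // Hypothetical stand-in for a reusable, mutable field object.
    static final class MutableField {
        private String value;
        void setValue(String v) { value = v; }
        String value() { return value; }
    }

    // One reusable field per thread, so reuse never crosses thread boundaries.
    private static final ThreadLocal<MutableField> FIELD =
            ThreadLocal.withInitial(MutableField::new);

    // Hypothetical stand-in for "set the value, then add the document":
    // it returns what the consumer would read from the field.
    public static String index(String text) {
        MutableField f = FIELD.get();
        f.setValue(text);
        return f.value();
    }

    // Runs several threads and reports whether every thread always read back
    // the value it had just set.
    public static boolean runAll(int threads, int docsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int id = t;
                results.add(pool.submit(() -> {
                    for (int i = 0; i < docsPerThread; i++) {
                        String expected = id + ":" + i;
                        if (!expected.equals(index(expected))) {
                            return false;
                        }
                    }
                    return true;
                }));
            }
            boolean ok = true;
            for (Future<Boolean> r : results) {
                ok &= r.get();
            }
            return ok;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("all values matched: " + runAll(4, 1000));
    }
}
```

Sharing a single `MutableField` across all threads instead would allow one thread's `setValue` to land between another thread's write and read, which is the kind of interference being asked about.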

Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-24 Thread Jason Rutherglen
Mike, It only happens when at least 1 million documents are indexed in a multithreaded fashion. Maybe I should post the code? I will try indexing without the payload field; I assume it won't fail because I indexed Wikipedia before with no issues. Thanks! Jason On Tue, Mar 24, 2009 at 12:25 PM

Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-24 Thread Jason Rutherglen
Using StandardAnalyzer. It's probably the payload field? This is the code that creates the payload field: private static class SinglePayloadTokenStream extends TokenStream { private Token token = new Token(UID_TERM.text(), 0, 0); private byte[] buffer = new byte[4];
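
The snippet above is cut off before it shows how the 4-byte buffer is filled. As an illustration only, assuming the UID is an `int` stored big-endian (the original message does not confirm this), the encoding step could look like the sketch below. `UidPayloadCodec` is a hypothetical name, and no Lucene classes are used.

```java
import java.nio.ByteBuffer;

public class UidPayloadCodec {
    // Encodes a UID as 4 big-endian bytes (ByteBuffer's default byte order).
    // A fresh array is returned on every call, so no buffer is shared
    // between documents or threads.
    public static byte[] encode(int uid) {
        return ByteBuffer.allocate(4).putInt(uid).array();
    }

    // Decodes a 4-byte payload back into the UID.
    public static int decode(byte[] payload) {
        if (payload == null || payload.length != 4) {
            throw new IllegalArgumentException("expected a 4-byte payload");
        }
        return ByteBuffer.wrap(payload).getInt();
    }

    public static void main(String[] args) {
        byte[] bytes = encode(123456);
        System.out.println(decode(bytes));
    }
}
```

Allocating per call costs a small array for each document; reusing one buffer is cheaper but is only safe if a single thread touches it at a time.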

Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-24 Thread Michael McCandless
I was just able to index all of wikipedia, using StandardAnalyzer, with assertions enabled, without hitting that exception. Which analyzer are you using (besides your payload field)? Mike Michael McCandless wrote: > H. > > Jason is this easily/compactly repeated?  EG, try to index the N doc

Re: Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-24 Thread Michael McCandless
H. Jason is this easily/compactly repeated? EG, try to index the N docs before that one. If you remove the SinglePayloadTokenStream field, does the exception still happen? Mike Jason Rutherglen wrote: > While indexing using > contrib/org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker

Assertion Error in TermsHashPerField.comparePostings - Lucene 2.4

2009-03-24 Thread Jason Rutherglen
While indexing using contrib/org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker, the assertion error is from TermsHashPerField.comparePostings(RawPostingList p1, RawPostingList p2). A Payload representing a UID is added to the document. Only 1-2 out of 1 million documents indexed generate th