Re: Help me with this error on indexing

2009-11-20 Thread Fabrício Raphael
This happened only on Windows; on Ubuntu it doesn't happen. I fixed the problem by moving the commit to the end, after all documents have been added. On Fri, Nov 20, 2009 at 9:14 PM, Erick Erickson wrote: > What operating system are you running on? This sounds like Windows behavi

RE: ConcurrentMergeScheduler, Exception and transaction

2009-11-20 Thread Teruhiko Kurosaka
Jason, Even if there is no harm to the index, I think the error needs to be reported back to the caller. I am assuming a system error like a filesystem error (filesystem full etc.) that prevents Lucene from writing new docs. If this happens, the caller needs to be notified of the error so t

Re: best way to iterate through all docs from a query

2009-11-20 Thread Erick Erickson
Unless I'm way off base, that's what the ScoreDoc[] array is all about. Best, Erick On Fri, Nov 20, 2009 at 3:21 PM, it99 wrote: > > Thanks for the info! > I was comparing from the Hits the nth number in the set number to > the documentId so that's why they are different. Is there any way to

Re: Help me with this error on indexing

2009-11-20 Thread Erick Erickson
What operating system are you running on? This sounds like Windows behavior when some other process is holding the file open. Erick 2009/11/20 Fabrício Raphael > Hi, > > I am evaluating several search algorithms, and I iterate on each. In each > iteration I delete the index directory, index

Re: ConcurrentMergeScheduler, Exception and transaction

2009-11-20 Thread Jason Rutherglen
Teruhiko, The index remains consistent even when a background merge fails, meaning commit truly represents a valid index after it's called. You can share merge schedulers, though in practice it's not going to improve anything. Jason 2009/11/20 Teruhiko Kurosaka : > I was experimenting with how Lucene

ConcurrentMergeScheduler, Exception and transaction

2009-11-20 Thread Teruhiko Kurosaka
I was experimenting with how Lucene handles 2-phase commit. Then I noticed I am not catching all Exceptions from Lucene. And I think this is because Lucene's default MergeScheduler is ConcurrentMergeScheduler, which spawns threads to do its job, and Exceptions thrown in child threads are never reported to
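
The general Java behavior described in this post can be illustrated without Lucene. The following is a minimal sketch in plain Java, not Lucene's API: the class name `BackgroundFailureSketch`, the method `runAndCapture`, and the thread name are hypothetical and exist only for illustration. It shows that an exception thrown in a child thread does not propagate to the thread that started it, and that one way for the caller to observe it is a `Thread.UncaughtExceptionHandler`.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; this is not how ConcurrentMergeScheduler is implemented.
public class BackgroundFailureSketch {

    // Runs the work on a child thread and returns whatever it threw, or null if it finished normally.
    public static Throwable runAndCapture(Runnable work) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(work, "hypothetical-merge-thread");
        // Without a handler, the exception would only reach the default handler
        // (typically a stack trace on stderr), never the calling thread.
        worker.setUncaughtExceptionHandler((thread, error) -> failure.set(error));
        worker.start();
        try {
            worker.join(); // wait until the child thread has terminated
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return failure.get();
    }

    public static void main(String[] args) {
        Throwable seen = runAndCapture(() -> {
            throw new IllegalStateException("simulated filesystem-full error");
        });
        System.out.println("Caller observed: " + seen);
    }
}
```

Whether and how a given Lucene version surfaces merge-thread failures is a separate question for the thread itself; the sketch only demonstrates the underlying threading behavior.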

NearSpansUnordered payloads

2009-11-20 Thread Jason Rutherglen
I'm interested in getting the payload information from the matching span, however it's unclear from the javadocs why NearSpansUnordered is different than NearSpansOrdered in this regard. NearSpansUnordered returns payloads in a hash set that's computed on each method call by iterating over the SpanCe

Re: best way to iterate through all docs from a query

2009-11-20 Thread it99
Thanks for the info! I was comparing from the Hits the nth number in the set number to the documentId so that's why they are different. Is there any way to get the 'nth number in set' if you have the docId without using the Hits object? Or is that a Hits only thing? Erick Erickson wrote: > > T

Help me with this error on indexing

2009-11-20 Thread Fabrício Raphael
Hi, I am evaluating several search algorithms, and I iterate on each. In each iteration I delete the index directory, index the docs and run the evaluation on the algorithm. At the end of the iteration I close the indexReader. Then, in the second iteration the following error occurs in the doc 115

Re: best way to iterate through all docs from a query

2009-11-20 Thread Erick Erickson
The doc IDs should be consistent *unless* you did anything to the index, things you might not think would change anything. For instance, any kind of commit (assuming you'd ever deleted a document, say), etc. So if you haven't changed your index at all, your doc IDs won't change. But as I said, som

Re: best way to iterate through all docs from a query

2009-11-20 Thread it99
Thanks, that helped a lot with the speed!! I am getting the same search results but with different docIds. Is this expected and OK? Are they just arbitrary numbers? I changed from Hits hits = mSearcher.search(query, filter); to the following: TopDocCollector collector = new

[ANN] Luke 0.9.9.1 release

2009-11-20 Thread Andrzej Bialecki
Hi all, I'm happy to announce a new release of Luke - the Lucene Index Toolbox. Please note that Luke development has moved to Google Code, and the downloads are available from the following page: http://code.google.com/p/luke/ In addition, all other parts of the Google Code infrastr

how to score in lucene

2009-11-20 Thread Wilson Wu
Hi, I have a problem with scoring a document in Lucene. I know there are some factors such as docNum, boost, idf, docFreq, lengthNorm and so on. I also know how to compute docNum, docFreq and idf, but I really have no idea how to compute the lengthNorm. Thanks. ---
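
For the lengthNorm question, here is a minimal sketch in plain Java, assuming the default similarity is in use. Lucene's `DefaultSimilarity` computes the length norm of a field as 1 / sqrt(numTerms), where numTerms is the number of terms in that field of the document. The class name `LengthNormSketch` is hypothetical, and the code only reproduces the formula; it does not call Lucene.

```java
// Hypothetical stand-alone illustration of the default length-norm formula.
public class LengthNormSketch {

    // Same formula as DefaultSimilarity.lengthNorm: shorter fields get a larger norm.
    public static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        int[] fieldLengths = {1, 4, 16, 100};
        for (int numTerms : fieldLengths) {
            System.out.println(numTerms + " terms -> lengthNorm " + lengthNorm(numTerms));
        }
    }
}
```

At index time this value is combined with the document and field boosts and stored in a single byte per field, so the norm seen at search time is a lossy approximation of the computed value.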