Re: Upgrading Lucene 2.0.0 TermQuery to 4.2 QueryParser

2013-04-03 Thread Lewis John Mcgibbney
Hi Uwe, This is dynamite and makes much more sense now when I look back at the numerous occasions where I've been upgrading this. Plenty of work tomorrow! Your feedback is great, thank you Lewis On Wednesday, April 3, 2013, Uwe Schindler wrote: > Hi, > > You can use TermQuery and BooleanQuery in Lu

Re: Necessary to close() IndexSearcher in 4.X?

2013-04-03 Thread Lewis John Mcgibbney
Thanks for the feedback, Uwe. I'll not be looking at this again until tomorrow, so at least this gives me time to think it through. On Wednesday, April 3, 2013, Uwe Schindler wrote: > Hi, > > In Lucene before 4.0 there was a close method in IndexSearcher, because you were able to create IndexSearcher us

RE: Upgrading Lucene 2.0.0 TermQuery to 4.2 QueryParser

2013-04-03 Thread Uwe Schindler
Hi, You can use TermQuery and BooleanQuery in Lucene 4.x in exactly the same way as in 2.0. No need to use QueryParser (and it's not a good idea to use QP for non-analyzed fields like product IDs). TermQuery is the way to go. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www

RE: Document scoring order?

2013-04-03 Thread Uwe Schindler
Hi Otis, they are generally processed in docId order. The special case "out-of-order" processing is only used for BooleanScorer1, in which the document IDs can be reported to the Collector out-of-order (because BooleanScorer scores documents in buckets). If you don’t allow out-of-order scoring,

RE: Necessary to close() IndexSearcher in 4.X?

2013-04-03 Thread Uwe Schindler
Hi, In Lucene before 4.0 there was a close method in IndexSearcher, because you were able to create IndexSearcher using Directory, which internally opened an IndexReader. This IndexReader had to be closed, so there was a need for IndexSearcher.close(). In 3.x this was constructor (taking Direc

Necessary to close() IndexSearcher in 4.X?

2013-04-03 Thread Lewis John Mcgibbney
Hi, I am encountering many situations where searcher.close() is present in finally blocks such as } finally { if (searcher != null) { try { searcher.close(); } catch (Exception ignore) { } searc

[ANNOUNCE] Apache Lucene 4.2.1 released

2013-04-03 Thread Mark Miller
April 2013, Apache Lucene™ 4.2.1 available The Lucene PMC is pleased to announce the release of Apache Lucene 4.2.1. Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-te

Document scoring order?

2013-04-03 Thread Otis Gospodnetic
Hi, When Lucene scores matching documents, what is the order in which documents are processed/scored and can that be changed? I'm guessing it scores matches in whichever order they are stored in the index/on disk, which means by increasing docIDs? I do see some out of order scoring is possible..

Upgrading Lucene 2.0.0 TermQuery to 4.2 QueryParser

2013-04-03 Thread Lewis John Mcgibbney
Hi, I'm currently embarking upon a non-trivial upgrade of some legacy 2.0.0 code and encounter the following IndexSearcher searcher = null; try { searcher = new IndexSearcher(indexFilePath); Term productIdTerm = new Term("product_id", productId);

Under the hood of SpanQueries

2013-04-03 Thread Igor Shalyminov
Hi all! I have a ~20GB index of documents that have words with several attributes associated with them, e.g.: WORD: word_1 word_2 ... word_n POS:pos1_1:pos1_2:pos1:3 pos2 ... pos_n_1:pos_n_2 LEMMA: lemma1_1:lemma1:2:lemma1_3 lemma2 lemma_n_1:lemma_n_2 Field tokens separated by ':' are ambig

Re: How to use concurrency efficiently

2013-04-03 Thread Paul Bell
All, Sorry, but I inadvertently put my post re MultiFieldQueryParser in the wrong thread (wrong subject via cut and paste). Igor, thank you for the reply. I will look into what you suggest. -Paul On Wed, Apr 3, 2013 at 6:58 AM, Igor Shalyminov wrote: > I personally use SpanNearQuery (span posit

Re: CheckIndex tool

2013-04-03 Thread Michael McCandless
On Wed, Apr 3, 2013 at 10:31 AM, wrote: > Hello, > We have very old indexes (i.e. created with Lucene 2.1.0) we would like to > run the CheckIndex tool from the 2.9.4 Lucene jar file, since it is not > available in 2.1.0. Is it safe to assume that if we are not running with the > -fix option t

CheckIndex tool

2013-04-03 Thread ikoelliker
Hello, We have very old indexes (i.e. created with Lucene 2.1.0) we would like to run the CheckIndex tool from the 2.9.4 Lucene jar file, since it is not available in 2.1.0. Is it safe to assume that if we are not running with the -fix option that the indexes being checked aren't altered but the

Re: How to use concurrency efficiently

2013-04-03 Thread Igor Shalyminov
I personally use SpanNearQuery (span positions are always needed), and for different fields I use FieldMaskingSpanQuery class. I just choose one field name and then mask each SpanTermQuery's real field name with this field via wrapper. Maybe it can help. -- Igor 03.04.2013, 06:59, "Paul" : > H

Re: When should I commit IndexWriter and TaxonomyWriter if I use NRT readers?

2013-04-03 Thread Shai Erera
It's the same decision that you need to make regarding IndexWriter. You should commit when you want the data to be persistent. This can happen on a timer-basis (e.g. every 10 minutes), or following some application logic, e.g. finished crawling a website or indexing a chunk of documents. NRT suppo