RE: Lucene 3.6.2 deleteDocument(docNum) and undeleteAll

2013-05-23 Thread ikoelliker
What about undeleteAll? Is there an equivalent on the IndexWriter side? -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Thursday, May 23, 2013 4:01 PM To: java-user@lucene.apache.org Subject: Re: Lucene 3.6.2 deleteDocument(docNum) and undeleteAll Use delete by que

Re: Lucene 3.6.2 deleteDocument(docNum) and undeleteAll

2013-05-23 Thread Uwe Schindler
Use delete by query in IndexWriter. No need to use IndexSearcher. ikoelli...@axsone.com wrote: >Hello, >We have code running with Lucene 2.9.4 that does the following: > > >1. Check that number of documents to be deleted, found with a >particular query, matches the expected number we pas

Lucene 3.6.2 deleteDocument(docNum) and undeleteAll

2013-05-23 Thread ikoelliker
Hello, We have code running with Lucene 2.9.4 that does the following: 1. Check that number of documents to be deleted, found with a particular query, matches the expected number we pass in 2. For each ScoreDoc in the ScoreDoc[] returned from the search we call deleteDocument(score

Re: Getting position increments directly from the the index

2013-05-23 Thread Jack Krupansky
If you add a special "end of document term" then some of these calculations might be easier. And, give that special term a payload of the sentence count. While you're at it, insert "end of sentence" terms that could have a payload of the sentence number. -- Jack Krupansky -Original Mes

Re: Getting position increments directly from the the index

2013-05-23 Thread Michael McCandless
On Thu, May 23, 2013 at 9:54 AM, Igor Shalyminov wrote: > But, just to clarify, is there a way to get, let's say, a vector of position > increments directly from the index, without re-parsing document contents? Term vectors (as Jack suggested) are one option, but they are very heavy (slows down

Re: Blåbærsyltetøy v.s. Räksmörgås

2013-05-23 Thread Karl Wettin
On 22 May 2013 at 20:29, Petite Abeille wrote: > > On May 22, 2013, at 7:08 PM, Karl Wettin wrote: > >>> * Use a filter after ASCIIFoldingFilter that discriminate all use of ae, >>> oe, oo, and other combination of double vowels, just keeping the first one. >> >> I ended up with that solution.
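The double-vowel idea quoted above can be sketched in plain Java, outside of any Lucene TokenFilter. This is only an illustration under stated assumptions: the class name `DoubleVowelCollapser`, the method `collapse`, and the exact pair list (aa, ae, oe, oo) are hypothetical and not taken from the thread, and the input is assumed to be already lowercased and folded to ASCII.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: collapses selected two-vowel
// combinations to their first vowel so that different transliterations
// of the same word end up as the same token text.
final class DoubleVowelCollapser {

    // Assumed pair list; the thread only names "ae, oe, oo" and
    // "other combination of double vowels".
    private static final Set<String> PAIRS =
            new HashSet<>(Arrays.asList("aa", "ae", "oe", "oo"));

    static String collapse(String folded) {
        StringBuilder out = new StringBuilder(folded.length());
        int i = 0;
        while (i < folded.length()) {
            out.append(folded.charAt(i));
            // If this character and the next one form a listed pair,
            // keep only the first character and skip the second.
            if (i + 1 < folded.length()
                    && PAIRS.contains(folded.substring(i, i + 2))) {
                i += 2;
            } else {
                i += 1;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapse("blaabaersyltetoey"));
        System.out.println(collapse("blabaersyltetoy"));
        System.out.println(collapse("raeksmoergaas"));
    }
}
```

With this pair list, both spellings of the first word reduce to the same string, which is the point of the approach. The trade-off is that unrelated words containing these vowel pairs are shortened as well.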

Re: Getting position increments directly from the the index

2013-05-23 Thread Jack Krupansky
Take a look at the Term Vectors Component: http://wiki.apache.org/solr/TermVectorComponent -- Jack Krupansky -Original Message- From: Igor Shalyminov Sent: Thursday, May 23, 2013 9:54 AM To: java-user@lucene.apache.org Subject: Re: Getting position increments directly from the the inde

Re: Getting position increments directly from the the index

2013-05-23 Thread Igor Shalyminov
Thanks, Mike and Jack! Those are really good options. But, just to clarify, is there a way to get, let's say, a vector of position increments directly from the index, without re-parsing document contents? -- Best Regards, Igor 23.05.2013, 16:13, "Jack Krupansky" : > It might be nice to inquire

Re: Getting position increments directly from the the index

2013-05-23 Thread Jack Krupansky
It might be nice to inquire as to the largest position for a field in a document. Is that information kept anywhere? Not that I know of, although I suppose it can be calculated at runtime by running through all the terms of the field. Then he could just divide by 1000. -- Jack Krupansky -O
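The arithmetic suggested above can be sketched in plain Java without any Lucene calls. Everything here is a toy model under assumptions not confirmed by the thread: the first token of each new sentence is assumed to get a position increment of 1000, the per-term position lists stand in for whatever the index would return, and the names `SentenceBoundaryEstimate`, `maxPosition` and `estimateBoundaries` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: find the largest position among all terms of one
// field in one document, then divide by the assumed sentence gap.
final class SentenceBoundaryEstimate {

    // Assumed position increment at each sentence boundary.
    static final int SENTENCE_GAP = 1000;

    // Runs through the positions of every term and returns the largest,
    // or -1 if there are no positions at all.
    static int maxPosition(Map<String, int[]> positionsByTerm) {
        int max = -1;
        for (int[] positions : positionsByTerm.values()) {
            for (int p : positions) {
                max = Math.max(max, p);
            }
        }
        return max;
    }

    // Integer division of the largest position by the gap.
    static int estimateBoundaries(Map<String, int[]> positionsByTerm) {
        int max = maxPosition(positionsByTerm);
        return max < 0 ? 0 : max / SENTENCE_GAP;
    }

    public static void main(String[] args) {
        // Toy document with three sentences: "the cat sat", "the dog", "ran".
        Map<String, int[]> doc = new LinkedHashMap<>();
        doc.put("the", new int[] {0, 1002});
        doc.put("cat", new int[] {1});
        doc.put("sat", new int[] {2});
        doc.put("dog", new int[] {1003});
        doc.put("ran", new int[] {2003});
        System.out.println(maxPosition(doc));        // largest position
        System.out.println(estimateBoundaries(doc)); // boundary count
    }
}
```

In this toy model the quotient counts sentence boundaries, so the sentence count is one more than that. The estimate drifts upward once the ordinary one-step increments in a document add up to about 1000, so it only holds for fairly short documents or a larger gap.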

Re: Getting position increments directly from the the index

2013-05-23 Thread Michael McCandless
Do you actually index the sentence boundary as a token? If so, you could just get the totalTermFreq of that token? Mike McCandless http://blog.mikemccandless.com On Wed, May 22, 2013 at 10:11 AM, Igor Shalyminov wrote: > Hello! > > I'm storing sentence bounds in the index as position increme