Re: create a Filter/DocIdSet from a number of documents

2014-03-12 Thread Michael McCandless
Have a look at https://issues.apache.org/jira/browse/LUCENE-5489 ... it's a patch to add a query "rescoring" API to do what you're describing, I think. Mike McCandless http://blog.mikemccandless.com On Wed, Mar 12, 2014 at 1:41 PM, Christian Reuschling wrote:

create a Filter/DocIdSet from a number of documents

2014-03-12 Thread Christian Reuschling
I have a small set of document numbers as a query result, collected with a non-scoring collector. Now I want to run fast successive queries restricted to this set of document numbers, as part of a customized Similarity implementation (modifie
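
The idea in the question, restricting later queries to a previously collected set of document numbers, can be sketched with a plain bit set. This is a minimal, library-free illustration and not Lucene's API: the class `DocScope` and its methods are hypothetical names, `maxDoc` is an assumed upper bound on document numbers, and `java.util.BitSet` stands in for whatever bit set a real `Filter` would return.

```java
import java.util.BitSet;

public class DocScope {
    // Build a bit set with one bit set per collected document number.
    // maxDoc is an assumed exclusive upper bound on document numbers.
    public static BitSet fromDocIds(int[] docIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int id : docIds) {
            if (id < 0 || id >= maxDoc) {
                throw new IllegalArgumentException("doc id out of range: " + id);
            }
            bits.set(id);
        }
        return bits;
    }

    // Keep only those hits of a later query that fall inside the scope.
    // Neither input is modified; the result is a fresh bit set.
    public static BitSet restrict(BitSet scope, BitSet hits) {
        BitSet result = (BitSet) hits.clone();
        result.and(scope);
        return result;
    }

    public static void main(String[] args) {
        BitSet scope = fromDocIds(new int[] {3, 7, 42}, 100);
        BitSet hits = fromDocIds(new int[] {1, 7, 42, 99}, 100);
        System.out.println(restrict(scope, hits)); // prints {7, 42}
    }
}
```

Building the bit set once and intersecting it with each later result keeps the per-query cost proportional to the bit set size, which is why a bit set is a natural fit for a small, reusable document scope.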

Re: Indexing useful N-grams (phrases & entities) and adding payloads

2014-03-12 Thread Manuel Le Normand
SynonymFilter makes sense. The planned payloads are indeed not needed. I guess a better solution would be to turn the boost into an attribute at query time, which the queryParser would then consume in order to boost these n-gram terms. Thanks for the hints. Manuel On Wed, Mar 12, 2014 at 12

Attention: Lucene 4.8 and Solr 4.8 will require minimum Java 7

2014-03-12 Thread Uwe Schindler
Hi, the Apache Lucene/Solr committers voted by a large majority to require Java 7 for the next minor release of Apache Lucene and Apache Solr (version 4.8)! Oracle's support for Java 6 ended more than a year ago, and Java 8 is coming out in a few days. The next release

Re: Indexing useful N-grams (phrases & entities) and adding payloads

2014-03-12 Thread Michael McCandless
You could also use SynonymFilter? Why does the boost need to be encoded in the index (in a payload) vs at query time when you create the TermQuery for that term? Does the boost vary depending on the surrounding context / document? Mike McCandless http://blog.mikemccandless.com On Wed, Mar 12,

Indexing useful N-grams (phrases & entities) and adding payloads

2014-03-12 Thread Manuel Le Normand
Hi, I posted this question on the Solr mailing list, but it has more to do with Lucene. I have a performance problem and a scoring problem with phrase queries: 1. Performance - phrase queries involving frequent terms are very slow because large position posting lists have to be read. 2. Scoring - I wan
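
As a rough illustration of the thread's subject, word n-grams can be produced from a token stream so that a frequent phrase becomes a single indexed term and no position lists need to be read for it. The sketch below is library-free and hypothetical: `WordNGrams` and `shingles` are made-up names, tokens are assumed to be already split, and choosing which n-grams count as "useful" is not shown. Lucene's own `ShingleFilter` serves a similar purpose inside an analysis chain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WordNGrams {
    // Return every run of n consecutive tokens, joined by a single space.
    // An input shorter than n yields an empty list.
    public static List<String> shingles(String[] tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= tokens.length; i++) {
            out.add(String.join(" ", Arrays.copyOfRange(tokens, i, i + n)));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] tokens = {"the", "new", "york", "times"};
        System.out.println(shingles(tokens, 2)); // prints [the new, new york, york times]
    }
}
```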