Re: Does Lucene compress postings (or posting lists) in its inverted index?

2010-10-17 Thread Paul Libbrecht
Mahmoud, the fields of a Lucene document can be compressed on disk when they are stored. I think that answers your question. paul On 17 Oct. 2010, at 09:16, Mahmoud Abdelkader wrote: > Hello, > > We're currently evaluating utilizing Lucene to index a large English corpus > and we are optimizing for

RE: MultiFieldQueryParser

2010-10-17 Thread Lev Bronshtein
> > Why don't you use the parse method without the flags? > > public static Query parse(Version matchVersion, String[] queries, > String[] fields, > Analyzer analyzer) throws ParseException > Thank you for the suggestion, Simon. However, the point is that I want to apply one query as opposed to

Re: proposed change to CharTokenizer

2010-10-17 Thread Michael Sokolov
OK - no responses to this, but in case you were curious...the patch I suggested won't work - so please don't install it :) In the end I was able to get the behavior I wanted by fiddling with offsets in my CharFilter, but it requires detecting token boundaries in the CharFilter stage, which se

Re: Copying Payload from one Token to the next

2010-10-17 Thread Christoph Hermann
On Sunday, 17 October 2010, 19:35:33, Ahmet Arslan wrote: Hello, > > How can I copy the Payload from the current token to the > > following token in a > > TokenFilter? > org.apache.solr.analysis.BufferedTokenStream.java (that can peek n tokens > ahead in the buffered input stream, without mod

Re: Copying Payload from one Token to the next

2010-10-17 Thread Ahmet Arslan
org.apache.solr.analysis.BufferedTokenStream.java (that can peek n tokens ahead in the buffered input stream, without modifying the stream) and CommonGramsFilter.java may help. --- On Sat, 10/16/10, Christoph Hermann wrote: > From: Christoph Hermann > Subject: Copying Payload from one Token

Re: MultiFieldQueryParser

2010-10-17 Thread Simon Willnauer
On Thu, Oct 14, 2010 at 3:04 AM, Lev Bronshtein wrote: > > Hi Group, > > I have an issue when using MultiFieldQueryParser: I would like to use one > query against a number of fields, however I get a > java.lang.IllegalArgumentException: queries.length != fields.length > > Looked at the javadoc, an

Re: Does Lucene compress postings (or posting lists) in its inverted index?

2010-10-17 Thread Simon Willnauer
Hi Mahmoud, On Sun, Oct 17, 2010 at 9:16 AM, Mahmoud Abdelkader wrote: > Hello, > > We're currently evaluating utilizing Lucene to index a large English corpus > and we are optimizing for space. We're basically concerned that the > size of the postings lists will become extremely large. Does

Does Lucene compress postings (or posting lists) in its inverted index?

2010-10-17 Thread Mahmoud Abdelkader
Hello, We're currently evaluating utilizing Lucene to index a large English corpus and we are optimizing for space. We're basically concerned that the size of the postings lists will become extremely large. Does Lucene provide some kind of compression for the generated posting lists within th
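
For readers following the postings question: Lucene's index file format documentation describes the document numbers in a postings list as stored as gaps (deltas) between consecutive documents, each written as a variable-length integer (VInt). The sketch below only illustrates that general technique and is not Lucene's own code; the class name PostingsSketch and the methods encode and decode are hypothetical names chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Illustrative sketch only: delta (gap) encoding plus variable-byte integers.
// PostingsSketch, encode and decode are hypothetical names, not Lucene APIs.
public class PostingsSketch {

    // Encodes an ascending list of document ids as variable-byte gaps.
    static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docId : sortedDocIds) {
            int gap = docId - previous;   // small when ids are close together
            previous = docId;
            // Emit 7 bits per byte, low-order bits first; the high bit
            // marks that more bytes follow for this value.
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80);
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    // Reverses encode(); count is the number of document ids to read.
    static int[] decode(byte[] bytes, int count) {
        int[] docIds = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int gap = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += gap;              // rebuild the absolute id from the gap
            docIds[i] = previous;
        }
        return docIds;
    }

    public static void main(String[] args) {
        int[] docIds = {3, 7, 300, 301, 100000};
        byte[] packed = encode(docIds);
        System.out.println(packed.length + " bytes packed, "
                + (docIds.length * 4) + " bytes as plain 32-bit ints");
        System.out.println("round trip ok: "
                + Arrays.equals(docIds, decode(packed, docIds.length)));
    }
}
```

Frequent terms occur in many documents, so their gaps are small and most fit in a single byte, which is why this kind of encoding tends to help most on the longest lists.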