date:20131008

Re: Exploiting a whole lot of memory

2013-10-08 Thread Benson Margulies

Oh, drat, I left out an 's'. I got it now.

On Tue, Oct 8, 2013 at 7:40 PM, Benson Margulies wrote:
> Mike, where do I find DirectPostingFormat?
>
> On Tue, Oct 8, 2013 at 5:50 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
>> DirectPostingsFormat?
>>
>> It stores all terms + po

Re: Exploiting a whole lot of memory

2013-10-08 Thread Benson Margulies

Mike, where do I find DirectPostingFormat?

On Tue, Oct 8, 2013 at 5:50 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
> DirectPostingsFormat?
>
> It stores all terms + postings as simple java arrays, uncompressed.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Tue, O

Re: Exploiting a whole lot of memory

2013-10-08 Thread Michael McCandless

DirectPostingsFormat?

It stores all terms + postings as simple java arrays, uncompressed.

Mike McCandless
http://blog.mikemccandless.com

On Tue, Oct 8, 2013 at 5:45 PM, Benson Margulies wrote:
> Consider a Lucene index consisting of 10m documents with a total disk
> footprint of 3G. Consider
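The trade-off behind "simple java arrays, uncompressed" can be illustrated in plain Java. This is only a rough sketch, not Lucene code: the class and method names (PostingsSketch, encodeDeltaVInt, decodeDeltaVInt) are hypothetical, and the delta plus variable-length-int encoding was chosen purely for illustration, with no claim that it matches any Lucene postings format. The point is that a compressed postings list takes fewer bytes but has to be decoded on every read, while a direct int[] costs more memory and can be read as-is.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical illustration only; none of these names are Lucene APIs.
final class PostingsSketch {

    // Encode sorted doc IDs as deltas, each written as a variable-length int
    // (7 data bits per byte, high bit set when more bytes follow).
    static byte[] encodeDeltaVInt(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docId : sortedDocIds) {
            int delta = docId - previous;
            previous = docId;
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
        }
        return out.toByteArray();
    }

    // Decode the compressed form back into a plain int[] of doc IDs.
    static int[] decodeDeltaVInt(byte[] encoded, int count) {
        int[] docIds = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int delta = 0;
            int shift = 0;
            byte b;
            do {
                b = encoded[pos++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            docIds[i] = previous;
        }
        return docIds;
    }

    public static void main(String[] args) {
        // The "direct" representation: doc IDs held in a plain array.
        int[] direct = {3, 7, 300, 70000};
        byte[] compressed = encodeDeltaVInt(direct);
        int[] decoded = decodeDeltaVInt(compressed, direct.length);
        System.out.println("direct bytes:     " + direct.length * Integer.BYTES);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("round trip ok:    " + Arrays.equals(direct, decoded));
    }
}
```

With a read-only index and plenty of spare heap, as in the original question, spending the extra memory on the direct form removes the decode loop from every query.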

Exploiting a whole lot of memory

2013-10-08 Thread Benson Margulies

Consider a Lucene index consisting of 10m documents with a total disk footprint of 3G. Consider an application that treats this index as read-only, and runs very complex queries over it. Queries with many terms, some of them 'fuzzy' and 'should' terms and a dismax. And, finally, consider doing all

Re: Analyzer classes versus the constituent components

2013-10-08 Thread Michael Sokolov

There are some Analyzer methods you might want to override (initReader for inserting a CharFilter, stuff about gaps), but if you don't need that, it seems to be mostly about packaging neatly, as you say.

-Mike

On 10/8/13 10:30 AM, Benson Margulies wrote:
> Is there some advice around about when

Re: Equivalent LatLongDistanceFilter in Lucene 4.4 API

2013-10-08 Thread David Smiley (@MITRE.org)

Hi James,

The spatial module in v4 is completely different from the one in v3. It would be good for you to review the new API rather than looking for a 1-1 equivalent to a class that existed in v3. Take a look at the top level javadocs for the spatial module, and in particular look at SpatialExa

Analyzer classes versus the constituent components

2013-10-08 Thread Benson Margulies

Is there some advice around about when it's appropriate to create an Analyzer class, as opposed to just Tokenizer and TokenFilter classes? The advantage of the constituent elements is that they allow the consuming application to add more filters. The only disadvantage I see is that the following i

Re: Lucene 4.4.0 mergeSegments OutOfMemoryError

2013-10-08 Thread Michael McCandless

When you open this index for searching, how much heap do you give it? In general, you should give IndexWriter the same heap size, since during merge it will need to open N readers at once, and if you have RAM resident doc values fields, those need enough heap space. Also, the default DocValuesForm

Re: optimal way to access many TermVectors

2013-10-08 Thread Adrien Grand

Hi,

On Mon, Oct 7, 2013 at 9:31 PM, Rose, Stuart J wrote:
> Is there an optimal way to access many document TermVectors (in the same
> chunk) consecutively when using the LZ4 termvector compression?
>
> I'm curious to know whether all TermVectors in a single compressed chunk are
> decompressed


9 matches
