Pagination with MultiSearcher

2009-03-14 Thread Amin Mohammed-Coleman
Hi, I'm trying to implement pagination for my search project. I've been googling for a solution, so far with no luck. I've seen implementations of HitCollector which look promising, however my search method would have to change completely. For example I'm currently using the following:

RE: AW: Re: Speeding up RangeQueries?

2009-03-14 Thread Uwe Schindler
Hi Niels,
> meanwhile I got Trie working in indexing and querying. I haven't tried yet with the large document collection but with my small test setup it works well.
>
> Does Trie also work with ranges from negative to positive numbers?
This is no problem, you can choose between long, int, d
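On the question of ranges that run from negative to positive numbers: one common way to make signed values sort correctly as index terms is to flip the sign bit before encoding them as text. The sketch below illustrates that idea only. It is not code taken from the Trie package, and the class and method names are hypothetical.

```java
// Illustration only: not taken from the Lucene Trie package.
// The class and method names are made up for this sketch.
class SortableLongDemo {

    // Flips the sign bit so that the unsigned bit pattern orders like the
    // signed value, then renders it as 16 zero-padded hex digits. The
    // resulting strings sort lexicographically in numeric order.
    static String toSortableHex(long value) {
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        long[] samples = {-500L, -1L, 0L, 1L, 500L};
        for (long v : samples) {
            System.out.println(v + " -> " + toSortableHex(v));
        }
    }
}
```

With an encoding like this, a range from a negative lower bound to a positive upper bound is an ordinary contiguous range of terms.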

Re: AW: Re: Speeding up RangeQueries?

2009-03-14 Thread Niels Ott
Hi Uwe, meanwhile I got Trie working in indexing and querying. I haven't tried yet with the large document collection but with my small test setup it works well. Does Trie also work with ranges from negative to positive numbers? Thank you very much for your support. Best Niels Uwe Schi

underscore a word separator in StandardAnalyzer?

2009-03-14 Thread Paul Libbrecht
Hello fellows of Lucene, I just discovered that the _ character is a word separator in the StandardAnalyzer. Can it be? It broke our usage of a field that stores a comma-separated list of "uri-fragments" which, of course, contain _: the standard-analyzer splits these as separate term whic

RE: AW: Re: Speeding up RangeQueries?

2009-03-14 Thread Uwe Schindler
Hello Niels, Nice to hear. The Trie package will be included in Lucene 2.9. It may move directly to lucene-core and change its API, or it may stay in Contrib-Queries, but it will be released soon. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eM

Re: AW: Re: Speeding up RangeQueries?

2009-03-14 Thread Niels Ott
Hello Uwe, thank you for clarifying things. I just checked the latest SVN revision of Lucene and apparently everything still works as it should with my system. Now I have to check if Trie does the job for me. I hope that Lucene 3 will include this fancy Trie package. Best, Niels Uwe Sch

RE: Lucene vs. LingPipe vs. GATE

2009-03-14 Thread syedfa
Thanks very much, I really appreciate it! Fayyaz
Uwe Schindler wrote:
> From what I can see on the JavaDocs of LingPipe, it is more for text analyzing not searching. You could for example use the tokenizer package to tokenize your text/query before indexing/searching with these tokenizer

AW: Re: Speeding up RangeQueries?

2009-03-14 Thread Uwe Schindler
Hello Niels, You cannot use the trie package with the current stable Lucene release. To compile, you must also apply LUCENE-1478 to the core. Another option is to check out trie and remove the SortField and static FieldCache parsers from TrieUtils. I am the developer of trie and I use it with trunk lucene o

Re: Speeding up RangeQueries?

2009-03-14 Thread Yonik Seeley
On Sat, Mar 14, 2009 at 11:37 AM, Niels Ott wrote:
> As far as I understand this is only available from the unreleased development version, right? How safe is this version for use?
>
> Is it possible to use only the org.apache.lucene.search.trie package from there together with the old and sta

Re: Speeding up RangeQueries?

2009-03-14 Thread Niels Ott
Hi Paul, Paul Elschot wrote: Performance normally mostly depends on the number of terms indexed within the queried range. To limit the number of terms used during a range search, have a look here for more info on the new TrieRangeQuery: http://wiki.apache.org/lucene-java/SearchNumericalFields

Re: Speeding up RangeQueries?

2009-03-14 Thread Paul Elschot
On Saturday 14 March 2009 13:38:16 Niels Ott wrote:
> Hi all,
>
> I'm working on my prototype system and it turns out that RangeQueries are quite slow. In a first test I have about 80,000 documents in my index and I combine two range queries with a normal text query using the BooleanQuery

Re: Speeding up RangeQueries?

2009-03-14 Thread Yonik Seeley
On Sat, Mar 14, 2009 at 8:38 AM, Niels Ott wrote:
> For now, I'm interested in a possibility to speed up range queries. Does the performance of a range query depend on the length of contents in the field in question?
Usually the biggest factor is the number of terms in the range. The second
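The point about the number of terms in the range can be illustrated with a toy model. In the sketch below a sorted set of strings stands in for a field's term dictionary, and the cost of a range is approximated by how many distinct terms fall inside it. This is an assumption-laden illustration, not Lucene code: the class name, the method name, and the date values are all hypothetical.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model only: a sorted set of strings stands in for a field's
// term dictionary. This is not Lucene code.
class RangeTermCount {

    // Counts the distinct indexed terms inside [lower, upper], inclusive.
    static int termsInRange(NavigableSet<String> terms, String lower, String upper) {
        return terms.subSet(lower, true, upper, true).size();
    }

    public static void main(String[] args) {
        // Hypothetical date field indexed at day precision (yyyyMMdd).
        NavigableSet<String> byDay = new TreeSet<>(Arrays.asList(
                "20090101", "20090115", "20090214", "20090301", "20090314"));
        // The same dates indexed at month precision (yyyyMM).
        NavigableSet<String> byMonth = new TreeSet<>(Arrays.asList(
                "200901", "200902", "200903"));

        // The same three-month range touches fewer terms at coarser precision.
        System.out.println(termsInRange(byDay, "20090101", "20090331"));
        System.out.println(termsInRange(byMonth, "200901", "200903"));
    }
}
```

In this model, indexing a value at a coarser precision shrinks the set of terms a range has to visit, which is one reason the choice of precision at indexing time matters for range performance.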

Re: A model for predicting indexing memory costs?

2009-03-14 Thread Florian Weimer
* mark harwood:
> Thanks, I have a heap dump now from a run with reduced JVM memory (in order to speed up a failure point) and am working through it offline with VisualVm.
> This test induced a proper OOM as opposed to one of those "timed out waiting for GC" type OOMs so may be misleading.

Speeding up RangeQueries?

2009-03-14 Thread Niels Ott
Hi all, I'm working on my prototype system and it turns out that RangeQueries are quite slow. In a first test I have about 80,000 documents in my index and I combine two range queries with a normal text query using the BooleanQuery. In the long run I will need to enhance my index at indexing

RE: Lucene vs. LingPipe vs. GATE

2009-03-14 Thread Uwe Schindler
From what I can see on the JavaDocs of LingPipe, it is more for text analyzing not searching. You could for example use the tokenizer package to tokenize your text/query before indexing/searching with these tokenizers instead of Lucene's. This could be done using a wrapper that transforms a LingPi

Lucene vs. LingPipe vs. GATE

2009-03-14 Thread syedfa
Dear fellow Java/Lucene developers: I have been working with Lucene for the past year or so in developing search applications, and just recently discovered another API for Java called LingPipe. I have never used LingPipe, and would like to know what the difference is between the two, and if not,