query: order of search

2010-03-31 Thread suman.holani
Hello, Query I: Does the order of parameters in a query play a role in searching? Example: a doc has fields rollno (pk), name, marks. Query: marks=90&rollno=2&name=abc Query: rollno=2&name=abc&marks=90 Which query will be processed more efficiently? Does it work by searching the doc field by field, so that it will look for doc havi

Re: Lucene Challenge - sum, count, avg, etc.

2010-03-31 Thread Ken Krugler
Hi Mike, I'm sure there are better options, but one thing you could do is pre-compute totals for different date resolutions. Depending on the number of unique affiliate IDs, this might work. E.g. pre-calculate sums by day & by week (and maybe by month) for each affiliate id, and then turn
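
The pre-computation idea in the snippet above can be sketched roughly as follows. This is only an illustration in plain Java, not Lucene code and not necessarily what the poster had in mind: the class name `AffiliateRollup`, its methods, the use of amounts in cents, and Monday-based weeks are all assumptions. How the pre-computed totals are combined for an arbitrary date range is also an assumption, since the original message is cut off at that point.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: per-affiliate sale totals kept at day and week resolution.
class AffiliateRollup {
    // Keys look like "affiliateId|bucketStartDate"; these maps are illustrative only.
    private final Map<String, Long> byDay = new HashMap<>();
    private final Map<String, Long> byWeek = new HashMap<>();

    private static String key(String affiliateId, LocalDate bucketStart) {
        return affiliateId + "|" + bucketStart;
    }

    // Assumption: weeks start on Monday.
    private static LocalDate weekStart(LocalDate d) {
        return d.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
    }

    // Called once per sale at indexing time; updates both resolutions.
    void addSale(String affiliateId, LocalDate date, long amountCents) {
        byDay.merge(key(affiliateId, date), amountCents, Long::sum);
        byWeek.merge(key(affiliateId, weekStart(date)), amountCents, Long::sum);
    }

    // Sums an inclusive date range, using a weekly total whenever a whole
    // Monday-to-Sunday week fits inside the range and daily totals otherwise.
    long sum(String affiliateId, LocalDate from, LocalDate to) {
        long total = 0;
        LocalDate d = from;
        while (!d.isAfter(to)) {
            LocalDate ws = weekStart(d);
            LocalDate we = ws.plusDays(6);
            if (d.equals(ws) && !we.isAfter(to)) {
                total += byWeek.getOrDefault(key(affiliateId, ws), 0L);
                d = we.plusDays(1);
            } else {
                total += byDay.getOrDefault(key(affiliateId, d), 0L);
                d = d.plusDays(1);
            }
        }
        return total;
    }
}
```

With totals stored this way, a range query reads at most a handful of daily buckets at each edge plus one bucket per full week in between, instead of touching every sale record.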

Re: Lucene Challenge - sum, count, avg, etc.

2010-03-31 Thread prasenjit mukherjee
I too am trying to achieve something. I am thinking of storing the integer values in payloads and then using spanquery classes to compute the respective SUMs -Prasen On Thu, Apr 1, 2010 at 6:47 AM, Michel Nadeau wrote: > Hi, > > We're currently in the process of switching many of our screens f

Lucene Challenge - sum, count, avg, etc.

2010-03-31 Thread Michel Nadeau
Hi, We're currently in the process of switching many of our screens from MySQL to Lucene because MySQL simply dies: we have too much data and it's taking too long to generate the stats we need. So here's one MySQL query that we use to find out our Top 10 Affiliates : SELECT SUM(sale_amo

Re: best way to interest two queries?

2010-03-31 Thread Erick Erickson
I'm not quite sure what you mean by "passed this, nothing matches that query anymore". But one approach to your intersection process would be to fire two queries. Use the first query to create a filter for the second. See QueryWrapperFilter in the javadocs... HTH Erick On Wed, Mar 31, 2010 at 2:0

best way to interest two queries?

2010-03-31 Thread Paul Libbrecht
Hello list, I've been wandering around but I see no solution yet: I would like to intersect two query results: going through the list of one query and indicating which ones actually match the other query or, even better, indicating that "passed this, nothing matches that query anymore".

Re: Designing a multilingual index

2010-03-31 Thread Paul Libbrecht
David, I'm doing exactly that. And I think there's one crucial advantage aside: multilingual queries: if your user requests "segment" you have no way to know which language he is searching for; erm, well, you have the user-language(s) (through the browser Accept-Language header for example)

Designing a multilingual index

2010-03-31 Thread David Vergnaud
Hi everyone! I'm about to build a search engine that will handle documents in several languages (4 for now but the number will increase in the near future). In order to index them properly and offer the best user experience, I'm automatically recognizing the language of each document in order t

Re: Is it safe to use reopen on IndexReader

2010-03-31 Thread Michael McCandless
It's perfectly safe to .reopen a reader when other threads are using that reader (eg, for searching, or for anything else). The reopen call doesn't affect the original reader in any way. You should close the old reader when you're done (if indeed a new reader was returned by .reopen), but if mult

Re: InstantiatedIndex performance

2010-03-31 Thread Karl Wettin
On 31 Mar 2010 at 10:21, Michael Stoppelman wrote: I was wondering why the InstantiatedIndex gets very slow as the number of documents increases in the index. I've been looking at the source and have only found comments saying "it's slow" when the index is big but not why. Do folks just run

fastest way to gather simple terms that match documents?

2010-03-31 Thread Jason Eacott
Hi all, After I've run a query I need to know which terms matched each result document (i.e. doc term frequency > 0). The only way I know to do this is by calling explain on each document, which the documentation claims to be almost the equivalent of a new query for each call, so I'm keen to avoid th

Is it safe to use reopen on IndexReader

2010-03-31 Thread Jason Tesser
Is it safe to use reopen on IndexReader if there are other threads that have readers out, or do I need to use a ref counter to make sure all readers are checked in? Secondly, right now we also check this when we reopen: IndexReader ir = indexSearcher.getIndexReader(); indexSearcher

InstantiatedIndex performance

2010-03-31 Thread Michael Stoppelman
Hi all, I was wondering why the InstantiatedIndex gets very slow as the number of documents increases in the index. I've been looking at the source and have only found comments saying "it's slow" when the index is big but not why. Do folks just run out of memory or something deeper? Thanks for th

[ANN] Eclipse GIT plugin beta version released

2010-03-31 Thread Thomas Koch
GIT is one of the most popular distributed version control systems. In the hope that more Java developers may want to explore the world of easy branching, merging and patch management, I'd like to inform you that a beta version of the upcoming Eclipse GIT plugin is available: http://www.infoq.