Re: [ot] a reverse lucene

2008-11-23 Thread markharw00d
If you index the queries consider also that they can potentially be indexed in an optimised form. For example, take a phrase query for "Alonso Smith". You need only index one of these terms - an incoming document must contain both terms to be considered a match. If you chose to index this quer
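The optimisation described above can be sketched roughly as follows. This is plain Python rather than Lucene, and the names `PhraseQueryIndex`, `tokenize` and `contains_phrase` are hypothetical helpers invented for illustration. The message is cut off before it says which term to pick, so indexing the first term is an assumption; any single term gives correct results as long as every candidate is verified against the full phrase.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real analyzer."""
    return text.lower().split()


def contains_phrase(doc_tokens, phrase):
    """True if the phrase tokens occur consecutively in the document."""
    n = len(phrase)
    return any(doc_tokens[i:i + n] == phrase
               for i in range(len(doc_tokens) - n + 1))


class PhraseQueryIndex:
    """Hypothetical helper: files each phrase query under ONE of its terms."""

    def __init__(self):
        self.by_term = defaultdict(list)  # term -> [(query_id, phrase_tokens)]

    def add(self, query_id, phrase):
        tokens = tokenize(phrase)
        if not tokens:
            return
        # Assumption: index the first term only. A document that lacks this
        # term cannot contain the phrase, so no match is ever missed.
        self.by_term[tokens[0]].append((query_id, tokens))

    def match(self, document):
        doc_tokens = tokenize(document)
        matched = set()
        for term in set(doc_tokens):
            # Candidate queries are those filed under a term of the document;
            # each candidate is then verified against the full phrase.
            for query_id, phrase in self.by_term.get(term, ()):
                if contains_phrase(doc_tokens, phrase):
                    matched.add(query_id)
        return matched


index = PhraseQueryIndex()
index.add("q1", "Alonso Smith")
index.add("q2", "Smith Alonso")
print(sorted(index.match("Today Alonso Smith won the race")))  # ['q1']
```

Filing a query under a single term keeps the query index small; the cost is that each candidate must be re-checked in full against the incoming document.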

Re: [ot] a reverse lucene

2008-11-23 Thread Ian Holsman
Thanks for all the suggestions, guys. This is great! Andrzej Bialecki wrote: Ian Holsman wrote: Hi. Apologies for the off-topic question. I was wondering if anyone knew of an open source solution (or a pointer to the algorithms) that does the reverse of lucene. By that I mean store a whole lot

Re: [ot] a reverse lucene

2008-11-23 Thread Andrzej Bialecki
Ian Holsman wrote: Hi. Apologies for the off-topic question. I was wondering if anyone knew of an open source solution (or a pointer to the algorithms) that does the reverse of lucene. By that I mean store a whole lot of queries, and run them against a document to see which queries match it. (wi

Re: [ot] a reverse lucene

2008-11-23 Thread David Sheldon
On Sun, Nov 23, 2008 at 02:57:28PM +1100, Ian Holsman wrote: > I can see the case for this would be a news-article and several people > writing queries to get alerted if it matched a certain condition. I haven't tried this, but if you have lots of queries and few documents then consider using luc

Re: [ot] a reverse lucene

2008-11-23 Thread Grant Ingersoll
The "formal" name for this stuff is "document filtering" or just "filtering". You can start by looking at TREC, which had a filtering task for a number of years: http://trec.nist.gov/tracks.html At any rate, one approach is to store your queries as Lucene documents, albeit short one

Re: [ot] a reverse lucene

2008-11-23 Thread jm
I am using MemoryIndex in a similar scenario. I do not have as many queries though, fewer than 100, but several 'articles' arrive per second. Works nicely. On Sun, Nov 23, 2008 at 10:00 AM, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > On Nov 22, 2008, at 10:57 PM, Ian Holsman wrote: >> >> Hi. apologie
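For readers unfamiliar with this setup, the idea of indexing one incoming document in memory and running every stored query against it can be illustrated with a small sketch. This is plain Python and does not use Lucene's MemoryIndex API; the function names are hypothetical, and treating each query as an AND of its terms is an assumption made to keep the example short.

```python
def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real analyzer."""
    return text.lower().split()


def build_single_doc_index(text):
    """Map each term to the positions where it occurs in one document."""
    positions = {}
    for pos, term in enumerate(tokenize(text)):
        positions.setdefault(term, []).append(pos)
    return positions


def matches_all_terms(index, query):
    """Assumption: a query matches when every one of its terms is present."""
    terms = tokenize(query)
    return bool(terms) and all(term in index for term in terms)


def matching_queries(document, queries):
    """Index the document once, then test each stored query against it."""
    index = build_single_doc_index(document)
    return [qid for qid, query in queries.items()
            if matches_all_terms(index, query)]


queries = {
    "alert-lucene": "lucene release",
    "alert-solr": "solr release",
}
print(matching_queries("Apache Lucene announces a new release", queries))
# ['alert-lucene']
```

The work per document grows linearly with the number of stored queries, which fits a setup with a modest query count and a steady stream of documents.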

Re: [ot] a reverse lucene

2008-11-23 Thread Ian Holsman
Thanks, Erik. I'll start looking at that. Regards, Ian Erik Hatcher wrote: On Nov 22, 2008, at 10:57 PM, Ian Holsman wrote: Hi. Apologies for the off-topic question. Not off-topic at all! I was wondering if anyone knew of an open source solution (or a pointer to the algorithms) that does the r

Re: [ot] a reverse lucene

2008-11-23 Thread Erik Hatcher
On Nov 22, 2008, at 10:57 PM, Ian Holsman wrote: Hi. Apologies for the off-topic question. Not off-topic at all! I was wondering if anyone knew of an open source solution (or a pointer to the algorithms) that does the reverse of lucene. By that I mean store a whole lot of queries, and run the

Re: [ot] a reverse lucene

2008-11-23 Thread Cool The Breezer
AIL PROTECTED]> wrote: > From: Ian Holsman <[EMAIL PROTECTED]> > Subject: Re: [ot] a reverse lucene > To: java-user@lucene.apache.org > Date: Sunday, November 23, 2008, 2:35 AM > Anshum wrote: > > Hi Ian, > > I guess that could be achieved if you write code to &g

Re: [ot] a reverse lucene

2008-11-22 Thread Ian Holsman
Anshum wrote: Hi Ian, I guess that could be achieved if you write code to read the queries and query for each document (using lucene). Assuming that I got the question right! :) yes.. that is one way, but probably not the most efficient one. think of something like http://www.google.com/al

Re: [ot] a reverse lucene

2008-11-22 Thread Anshum
Hi Ian, I guess that could be achieved if you write code to read the queries and query for each document (using lucene). Assuming that I got the question right! :) -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distin

[ot] a reverse lucene

2008-11-22 Thread Ian Holsman
Hi. Apologies for the off-topic question. I was wondering if anyone knew of an open source solution (or a pointer to the algorithms) that does the reverse of lucene. By that I mean store a whole lot of queries, and run them against a document to see which queries match it. (with a score etc) I