Re: MoreLikeThis over a subset of documents

2008-04-23 Thread Karl Wettin
Jonathan Ariel wrote: Yes, it will be too much to do in real time, but it is a good idea though. I don't know if a vector of term frequencies is stored with the document. Because I could search on the index to get the subset of documents and then take the term frequencies from there. In that case

Re: MoreLikeThis over a subset of documents

2008-04-23 Thread Jonathan Ariel
Yes, it will be too much to do in real time, but it is a good idea though. I don't know if a vector of term frequencies is stored with the document. Because I could search on the index to get the subset of documents and then take the term frequencies from there. In that case I could change MoreLike
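The idea in this message (select the subset first, then take term frequencies only from those documents) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Lucene's MoreLikeThis: the class `SubsetKeywords`, its methods, the regex tokenizer, and the tf * idf scoring are all hypothetical stand-ins chosen for the example, and document ids are simply list positions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: ranks terms found in a subset of documents.
final class SubsetKeywords {

    // Lower-cases and splits on anything that is not a letter or digit.
    // A stand-in for a real analyzer (assumption).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Scores each term as (frequency inside the subset) * (idf over the whole corpus)
    // and returns the highest-scoring terms. Ties are broken alphabetically.
    static List<String> topTerms(List<String> corpus, Set<Integer> subsetIds, int maxTerms) {
        Map<String, Integer> subsetTf = new HashMap<>();
        Map<String, Integer> corpusDf = new HashMap<>();
        for (int id = 0; id < corpus.size(); id++) {
            List<String> tokens = tokenize(corpus.get(id));
            // Document frequency counts each term once per document.
            for (String term : new HashSet<>(tokens)) {
                corpusDf.merge(term, 1, Integer::sum);
            }
            // Term frequency is accumulated only for documents in the subset.
            if (subsetIds.contains(id)) {
                for (String term : tokens) {
                    subsetTf.merge(term, 1, Integer::sum);
                }
            }
        }
        int n = corpus.size();
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, Integer> e : subsetTf.entrySet()) {
            double idf = Math.log((double) n / corpusDf.get(e.getKey())) + 1.0;
            score.put(e.getKey(), e.getValue() * idf);
        }
        List<String> terms = new ArrayList<>(score.keySet());
        terms.sort(Comparator.comparingDouble((String t) -> -score.get(t))
                .thenComparing(Comparator.naturalOrder()));
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }
}
```

In a real index the subset would come from a query or filter (for example on category), and the per-document frequencies would come from stored term vectors if they are available, which is exactly the open question in the message above.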

Re: MoreLikeThis over a subset of documents

2008-04-23 Thread Karl Wettin
Jonathan Ariel wrote: Smart idea, but it won't help me. I have almost 50 categories and eventually I would like to "filter" not just on category but maybe also on language, etc. Karl: what do you mean by measure the distance between the term vectors and cluster them in real time? I mean exactly
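As a rough illustration of "measure the distance between the term vectors", the sketch below computes a cosine distance between two term-frequency maps. The thread excerpt does not say which distance measure was meant, so cosine distance is an assumption, and `TermVectorDistance` with its methods is a hypothetical helper, not a Lucene class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: cosine distance between term-frequency vectors.
final class TermVectorDistance {

    // Builds a term -> frequency map with a simple regex tokenizer (assumption).
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tf.merge(t, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Returns 1 - cosine similarity: 0.0 for vectors pointing the same way,
    // 1.0 for vectors that share no terms (or when either vector is empty).
    static double cosineDistance(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += (double) e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += (double) v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 1.0;
        }
        // Clamp to guard against tiny negative values from floating-point rounding.
        return Math.max(0.0, 1.0 - dot / Math.sqrt(normA * normB));
    }
}
```

Documents whose pairwise distance falls under some threshold could then be grouped into a cluster. Doing this for every pair is quadratic in the number of documents, which is consistent with the concern elsewhere in the thread that it is too much to do in real time.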

Re: MoreLikeThis over a subset of documents

2008-04-22 Thread Jonathan Ariel
> in category A, only add the text to the catA field. Now do MoreLikeThis on catA. This assumes you know the categories at index time, of course. Redundant but workable.
> -Glen
> 2008/4/22 Jonathan Ariel <[EMAIL PROTECTED]>:
> > Is there any way to execute a M

Re: MoreLikeThis over a subset of documents

2008-04-22 Thread Glen Newton
This assumes you know the categories at index time, of course. Redundant but workable.
-Glen
2008/4/22 Jonathan Ariel <[EMAIL PROTECTED]>:
> Is there any way to execute a MoreLikeThis over a subset of documents? I need to retrieve a set of interesting keywords from a subset of docu

Re: MoreLikeThis over a subset of documents

2008-04-22 Thread Jonathan Ariel
I could have up to 2 million documents and growing.
On Tue, Apr 22, 2008 at 7:29 PM, Karl Wettin <[EMAIL PROTECTED]> wrote:
> Jonathan Ariel wrote:
> > Is there any way to execute a MoreLikeThis over a subset of documents? I need to retrieve a set of interesting keyw

Re: MoreLikeThis over a subset of documents

2008-04-22 Thread Karl Wettin
Jonathan Ariel wrote: Is there any way to execute a MoreLikeThis over a subset of documents? I need to retrieve a set of interesting keywords from a subset of documents and not the entire index (imagine that my index has documents categorized as A, B and C and I just want to work with those

Re: MoreLikeThis over a subset of documents

2008-04-22 Thread Jonathan Ariel
> (bq);
> -glen
> 2008/4/22 Jonathan Ariel <[EMAIL PROTECTED]>:
> > Is there any way to execute a MoreLikeThis over a subset of documents? I need to retrieve a set of interesting keywords from a subset of documents and not the entire index (imagine

Re: MoreLikeThis over a subset of documents

2008-04-22 Thread Glen Newton
(bq);
-glen
2008/4/22 Jonathan Ariel <[EMAIL PROTECTED]>:
> Is there any way to execute a MoreLikeThis over a subset of documents? I need to retrieve a set of interesting keywords from a subset of documents and not the entire index (imagine that my index has documents cate

MoreLikeThis over a subset of documents

2008-04-22 Thread Jonathan Ariel
Is there any way to execute a MoreLikeThis over a subset of documents? I need to retrieve a set of interesting keywords from a subset of documents and not the entire index (imagine that my index has documents categorized as A, B and C and I just want to work with those categorized as A). Right now