RE: grouping results by fields

2006-01-31 Thread mark harwood
> When using the TermEnum method won't the terms be
> analyzed

Typically this doesn't matter because "group fields" tend to be things other than free-text eg
* Articles totalled by Year/Month
* Products totalled by category code
* Emails totalled by sender

If a group field's values aren't a st

RE: grouping results by fields

2006-01-31 Thread Mike Streeton
EMAIL PROTECTED] On Behalf Of Chris Hostetter
Sent: 30 January 2006 22:12
To: java-user@lucene.apache.org
Subject: RE: grouping results by fields

: currently , i am iterating through about 200-300 of the top docs and
: creating the groups (so, as of now, the groups are partial) , my
: response time HAS

RE: grouping results by fields

2006-01-30 Thread zzzzz shalev
hey chris, i was using the hits.doc method while iterating... you've given me some hope!! i will look into the FieldCache

Chris Hostetter <[EMAIL PROTECTED]> wrote:
: currently , i am iterating through about 200-300 of the top docs and
: creating the groups (so, as of now, the groups
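A rough sketch of why a per-document value array helps compared with loading each hit's stored document: the grouping loop becomes a plain array lookup per doc id. The class and method names here are hypothetical, and the `String[]` is only a stand-in for whatever array a field cache would hand back (one value per doc id); no Lucene calls are made.

```java
import java.util.Map;
import java.util.TreeMap;

class FieldCacheStyleCounter {

    // valueByDoc[docId] holds the group field's value for that document
    // (assumption: loaded once up front, e.g. from a field cache).
    // topDocIds are the doc ids of the hits we want to group.
    public static Map<String, Integer> countTopDocs(int[] topDocIds, String[] valueByDoc) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int docId : topDocIds) {
            String value = valueByDoc[docId];   // array lookup, no stored-document load
            if (value != null) {                // docs without a value are skipped
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up data: six docs with a "category" value each (one missing).
        String[] categoryByDoc = {"books", "music", "books", null, "film", "books"};
        int[] topDocIds = {0, 2, 3, 4, 5};
        System.out.println(countTopDocs(topDocIds, categoryByDoc));
    }
}
```

The cost per hit is one array read and one map update, so grouping a few hundred top docs this way should be cheap next to the query itself; the memory cost is one array entry per document in the index for each grouped field.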

RE: grouping results by fields

2006-01-30 Thread Chris Hostetter
: currently , i am iterating through about 200-300 of the top docs and
: creating the groups (so, as of now, the groups are partial) , my
: response time HAS to be at most 500-600 milli (query + groupings) or my
: company will probably go with a commercial search engine such as FAST or
: somethin

RE: grouping results by fields

2006-01-30 Thread zzzzz shalev
verage size of your results, the average size of your index, and the average number of terms in the fields you want to group by.

: Date: Mon, 30 Jan 2006 17:45:10 + (GMT)
: From: mark harwood
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: RE: grouping resu

RE: grouping results by fields

2006-01-30 Thread Chris Hostetter
the fields you want to group by.

: Date: Mon, 30 Jan 2006 17:45:10 + (GMT)
: From: mark harwood <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: RE: grouping results by fields
:
: > A simple solution if you only have 20,00

RE: grouping results by fields

2006-01-30 Thread mark harwood
> A simple solution if you only have 20,000 docs is
> just to iterate
> through the hits and count them up against each
> color etc,

The one thing to avoid is reader.document() calls in such a tight loop. This is always a killer. The best way I've found is to create one bitset for all the matchin
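The message is cut off here, so the following is only a guess at the general shape of a bitset approach, not the poster's actual code: keep one bitset of doc ids matching the query, one bitset of doc ids per group value, and count each group by intersecting the two. Everything below is plain `java.util.BitSet`; the class name, method names and sample data are hypothetical, and how the per-value bitsets get built from the index is assumed rather than shown.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

class BitSetGroupCounter {

    // matches: doc ids that satisfied the query.
    // docsByValue: for each group value, the doc ids carrying that value
    // (assumption: built once and cached, independent of any query).
    public static Map<String, Integer> countGroups(BitSet matches, Map<String, BitSet> docsByValue) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : docsByValue.entrySet()) {
            BitSet overlap = (BitSet) entry.getValue().clone(); // leave the cached set untouched
            overlap.and(matches);                               // docs that match AND carry the value
            counts.put(entry.getKey(), overlap.cardinality());
        }
        return counts;
    }

    // Small helper to build a BitSet from a list of doc ids.
    public static BitSet bits(int... docIds) {
        BitSet set = new BitSet();
        for (int docId : docIds) {
            set.set(docId);
        }
        return set;
    }

    public static void main(String[] args) {
        // Made-up data: four matching docs, three colour values.
        BitSet matches = bits(0, 2, 3, 5);
        Map<String, BitSet> docsByColor = new LinkedHashMap<>();
        docsByColor.put("red", bits(0, 1, 2));
        docsByColor.put("blue", bits(3, 4));
        docsByColor.put("green", bits(5));
        System.out.println(countGroups(matches, docsByColor));
    }
}
```

No stored documents are loaded at any point, which is the point of the advice above; the work per group value is a bitwise AND plus a bit count over the index size, so the cost grows with the number of distinct values rather than with the number of hits.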

RE: grouping results by fields

2006-01-30 Thread Mike Streeton
less memory.

Mike
www.ardentia.com the home of NetSearch

-Original Message-
From: z shalev [mailto:[EMAIL PROTECTED]
Sent: 30 January 2006 17:16
To: java-user@lucene.apache.org
Subject: Re: grouping results by fields

hey Jim, thanks a lot for the quick reply! much appreciated

Re: grouping results by fields

2006-01-30 Thread zzzzz shalev
hey Jim, thanks a lot for the quick reply! much appreciated. i will look a little closer into what is done in C|Net, seems more cost efficient than what i'm currently doing ;) however i am not sure how scalable the solution is if, for example, i received 20,000 results and i ha

Re: grouping results by fields

2006-01-29 Thread Jim Powers
We're doing something very similar. Recently C|Net started using Lucene and there is a blog entry about how they implemented a "category" scheme that basically does what you want. http://www.nabble.com/Announcement%3A-Lucene-powering-CNET.com-Product-Category-Listings-t266441.html#a748420 The

grouping results by fields

2006-01-29 Thread zzzzz shalev
hey, i have a bit of a complex problem. i need to group results received in a result set, for example: my result set returns 10,000 results, there are about 10 fields in each result document, and i need to group the most frequent values appearing in each field. if 1 of m