Re: Clustering with Lucene?

2011-04-26 Thread Dawid Weiss
They may not be dictionary terms, but there is a limited number of term entries and they seem regular. Your inquiries indicate you need a faceting feature (or even an sql-like set of queries backed up by a fast index...), probably with some pruning. Clustering is an unsupervised process that attempts to

Re: Clustering with Lucene?

2011-04-26 Thread vivek sar
Thanks Dawid. I was trying to give some example, but this is not exactly our text. Our fields include things like "user name", "IP Address", "Application Name", "Port 3", "Byte Count" - all network related stuff. So, if user searches on certain IP address then we would need to group the result by u
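A rough sketch of the kind of grouping described in this message, written in plain Python rather than against Lucene: filter records by one field (an IP address), then count the hits per value of another field, keeping only the largest groups. The record layout, the field names such as `ip_address` and `user_name`, and the function name `facet_counts` are all assumptions made for illustration. None of this is Lucene or Carrot2 API.

```python
from collections import Counter


def facet_counts(records, query_field, query_value, group_field, top_n=10):
    """Count records matching query_field == query_value per value of
    group_field, returning only the top_n largest groups (the pruning step)."""
    # Select the "search hits": records whose query field equals the query value.
    hits = (r for r in records if r.get(query_field) == query_value)
    # Tally hits by the grouping field, skipping records that lack it.
    counts = Counter(r[group_field] for r in hits if group_field in r)
    # most_common sorts by descending count and truncates to top_n.
    return counts.most_common(top_n)


# Hypothetical network records, loosely modelled on the fields named in the thread.
records = [
    {"ip_address": "10.0.0.1", "user_name": "alice", "application_name": "ssh", "byte_count": 120},
    {"ip_address": "10.0.0.1", "user_name": "alice", "application_name": "http", "byte_count": 900},
    {"ip_address": "10.0.0.1", "user_name": "bob", "application_name": "http", "byte_count": 300},
    {"ip_address": "10.0.0.2", "user_name": "carol", "application_name": "dns", "byte_count": 40},
]

result = facet_counts(records, "ip_address", "10.0.0.1", "user_name", top_n=2)
print(result)
```

This linear scan only illustrates the idea. With an index of the size discussed in the thread, the counting would have to be done by the search engine's own faceting support rather than in application code.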

Re: Clustering with Lucene?

2011-04-26 Thread Dawid Weiss
> 1) We index around 20 fields, of that we want to have grouping option for five of them. For ex., user can search on name of the city and we should have option to group by products available in that city (and vice-versa).
Are these fields strictly defined or free text? Because if they are

Re: Clustering with Lucene?

2011-04-26 Thread vivek sar
Thanks Dawid for the reply. Here is what we are trying to do, 1) We index around 20 fields, of that we want to have grouping option for five of them. For ex., user can search on name of the city and we should have option to group by products available in that city (and vice-versa). 2) We also need

Re: Clustering with Lucene?

2011-04-26 Thread Dawid Weiss
Can you shed some more light on what you're trying to achieve (what is the purpose of clustering -- are clusters to be utilized for front-end user interface, further data mining analysis, etc.)? With the sizes you report Carrot2 won't work for you, I'm afraid, but Mahout may. Still, there's plenty

Re: Clustering with Lucene

2005-10-17 Thread Stanislaw Osinski
Hi Joe, I'm one of the Carrot2 developers and I have good news for you :) The example of using Carrot2 with Lucene is in the Carrot2 repository on SourceForge.net (http://sourceforge.net/projects/carrot2). Please check out the "carrot2" module (http://cvs.sourceforge.net/viewcvs.py/carrot2/carrot2/)