First, thanks for your reply. I was wondering about adding some extra clustering functionalities to Lucene. I wrote a clustering engine, based on hac/ahc and k-means algorithms based on Lucene search results. That work is based on a customized solution, and so I decided to write some general code . Right now I'm looking at this class com/mwroblewski/carrot/filter/ahcfilter/AHCFilter from carrot2 and found it to be very similar to my work;-) My aim is to provide some basic clustering functionalities to lucene search results. Carrot2 offers a lot of functionalities, like many inputs, I'm just trying to offer a simpler (much simpler!) clustering opportunity for lucene users. Hope I can get some good advices from you! ciao, Lorenzo
On 6/8/05, Dawid Weiss <[EMAIL PROTECTED]> wrote: > > > You should state your requirements clearly: > > 1. What data you want to cluster? (whole index/ search results) > 2. What is the role of the extension? How is it going to be used? > (front-end clusters, query refinement, etc) > 3. Do you need the implementation or an API for clustering in the > source code? (I'd personally stick to the API; there are many products > out there that perform clustering. Carrot2 is no exception -- there is > an excellent (in my humble opinion :) open source clustering algorithm > Lingo, but there is also a commercial component that is much faster and > more customizable. You can start off with an open source clusterer then > and switch to a commercial product if you want higher scalability or > different functionality. I implemented such an API in Nutch -- take a > look in its source code for hints). > > Dawid > > Lorenzo wrote: > > I see some noise about clustering and lucene, but I'm still waiting for > > someone that will help me creating a clustering extension. > > I know both carrot2 and weka (the first can be integrated with Lucene, > the > > latter may be - Falko can you tell me?) but would like to write > something > > that could be included in the sandbox (or similar) with an > implementation > > that we'll find the better for a general purpose environment. Maybe > carrot2 > > or other will be the best one (I really hope, I'm a lazy coder;-) ) and > so > > we will simply ask David to extend his code, but first want to make some > > tests. > > bye > > Lorenzo > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >