Zilverline Search Engine version 1.4.0 released

2005-06-08 Thread Zilverline info
All, I've just released Zilverline version 1.4.0. This version indexes in the background, detects ISBN numbers, and has (some) IMAP support. This version is fully web-based: all settings, collections, and preferences can be set via the web interface. The source will be made available as well very s

Re: final modifier on IndexReader class

2005-06-08 Thread Daniel Naber
On Wednesday 08 June 2005 23:35, John Wang wrote: >    Why is there a final modifier on IndexReader.finalize? This has already been removed in the development version, no need for a bug report. Regards Daniel -- http://www.danielnaber.de -

final modifier on IndexReader class

2005-06-08 Thread John Wang
Hi: Why is there a final modifier on IndexReader.finalize? IndexReader is an abstract class and is therefore meant to be derived from. The problem here is that I am not able to provide a finalize method for my subclass, since the finalize method is declared final. I will create a bug. Thanks -John ---
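The constraint John describes is ordinary Java behaviour: a method declared final in an abstract base class cannot be overridden by a subclass. A minimal sketch of that situation follows. It is not Lucene code; BaseReader, MyReader, release() and the onRelease() hook are hypothetical names invented for illustration, and the hook is just one possible workaround, not something the thread proposes.

```java
import java.util.ArrayList;
import java.util.List;

public class FinalMethodSketch {
    /** Hypothetical stand-in for an abstract base class (not Lucene's IndexReader). */
    static abstract class BaseReader {
        final List<String> log = new ArrayList<>();

        // Declared final: a subclass that tries to override release()
        // is rejected by the compiler.
        protected final void release() {
            log.add("base released");
            onRelease(); // overridable hook, an assumption of this sketch
        }

        // Subclasses put their own cleanup here instead.
        protected void onRelease() { }
    }

    static class MyReader extends BaseReader {
        // protected void release() { }  // would not compile: release() is final

        @Override
        protected void onRelease() {
            log.add("subclass released");
        }
    }

    /** Runs the sketch and returns the recorded call order. */
    static List<String> demo() {
        MyReader reader = new MyReader();
        reader.release();
        return reader.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Removing the final modifier, as Daniel Naber's reply says was done in the development version, lets the subclass override the method directly, so no hook is needed.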

Re: Lucene search clusters

2005-06-08 Thread Dawid Weiss
Your application works very well, congrats! May I ask how the input is looking? How are the terms selected, how do you model phrases? Do you handle titles different from the short summaries? Only search results (snippets and titles) are used. I ask, because the descriptions of your clusters

Re: Cannot search on plain numbers

2005-06-08 Thread Daniel Naber
On Tuesday 07 June 2005 22:48, Peter T. Brown wrote: > Thank You. I've re-read the FAQ and I think I've got a better > understanding of how I am confused. Presently I am using this > arrangement to get my analyzer: > >     public static class DefaultAnalyzer extends Analyzer { >         public Tok

Re: Indexing from multiple applications to a central index.

2005-06-08 Thread Daniel Naber
On Tuesday 07 June 2005 19:36, Doug Hughes wrote: > I am thinking that I can make all of the applications index into their > own index, not the central shared index.  Their own index might be a > FSDirectory or a RAMDirectory.  When done indexing, the applications' > indexes would be merged with t

Re: Doing a Join across indexes [was Documents returned by Scorer]

2005-06-08 Thread Paul Elschot
On Wednesday 08 June 2005 01:30, Matt Quail wrote: > > On 08/06/2005, at 1:33 AM, Paul Elschot wrote: > > > On Tuesday 07 June 2005 11:42, Matt Quail wrote: > > > >> I've been playing around with a custom Query, and I've just realized > >> that my Scorer is likely to return the same document more

Re: Lucene search clusters

2005-06-08 Thread Daniel Stephan
Your application works very well, congrats! May I ask how the input is looking? How are the terms selected, how do you model phrases? Do you handle titles different from the short summaries? What I am doing is: I remove stopwords, stem terms using Snowball's default English stemmer, and then alread

Re: Lucene search clusters

2005-06-08 Thread Dawid Weiss
right, shit in - shit out :-). True. But in most cases clustering of search results can yield sensible clusters. Try, for example: http://demo.carrot-search.com/carrot2-remote-controller/newsearch.do?query=chips&processingChain=carrot2.process.lingo-cluster-odp&resultsRequested=200 We in f

Re: Lucene search clusters

2005-06-08 Thread Daniel Stephan
My experience is also limited and stems mostly from having read some papers with promising results. I went from the k-Means to the EM, because I was hoping that it would be able to model more complex relationships of my data. After all, EM is using multivariate gaussians, so its results should mirr

Re: Adding document with FileReader and deletions.

2005-06-08 Thread Chris D
On 6/7/05, Chris D <[EMAIL PROTECTED]> wrote: > Hi list, > > I've been trying to use Lucene to index documents that change > occasionally with fields that change frequently. When I add the > contents of the file they are removed when I try to delete and re-add > the document. I am using somethi

Re: Lucene search clusters

2005-06-08 Thread Dawid Weiss
Lorenzo... Did you take a look at the mail I posted before? There was a ready-to-use clustering for Lucene there. It _is_ simple. I don't know what you mean by "much simpler" -- much simpler to use? You really don't have to know all of Carrot2 code to use it. You build, or fetch a precompiled

Re: Lucene search clusters

2005-06-08 Thread Lorenzo
First, thanks for your reply. I was wondering about adding some extra clustering functionalities to Lucene. I wrote a clustering engine, based on hac/ahc and k-means algorithms based on Lucene search results. That work is based on a customized solution, and so I decided to write some general cod

Re: Lucene search clusters

2005-06-08 Thread Dawid Weiss
You should state your requirements clearly: 1. What data do you want to cluster? (whole index / search results) 2. What is the role of the extension? How is it going to be used? (front-end clusters, query refinement, etc) 3. Do you need the implementation or an API for clustering in the source cod

Re: Lucene search clusters

2005-06-08 Thread Lorenzo
I see some noise about clustering and Lucene, but I'm still waiting for someone who will help me create a clustering extension. I know both Carrot2 and Weka (the first can be integrated with Lucene, the latter may be - Falko can you tell me?) but would like to write something that could be in

Re: Lucene search clusters

2005-06-08 Thread Falko Guderian
You can add the WEKA packages http://www.cs.waikato.ac.nz/ml/weka/ . It has an EM clusterer. -Falko Some people just replied, but I forgot the most important thing... I'm thinking of this project as part of Google's Summer of Code program, so I'm looking for other students. I've sent an e

Re: Lucene search clusters

2005-06-08 Thread Lorenzo
Daniel, could you explain to me why you are using EM clustering? Is there any best field or case for that technique? I don't have any EM experience and would like to know something about that (just studying some papers...) Thanks, Lorenzo

URL-based access

2005-06-08 Thread LABATTE Jacques
Is there a way to allow Lucene to function using URL-based access? Reader and writer requests are on the first machine; documents to index and indexes are on a second machine. Jacques.

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-08 Thread Paul Elschot
On Wednesday 08 June 2005 01:18, Kevin Burton wrote: > Paul Elschot wrote: > > >For a large number of indexes, it may be necessary to do this over > >multiple indexes by first getting the doc numbers for all indexes, > >then sorting these per index, then retrieving them > >from all indexes, and re
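The first two steps Paul describes (get the doc numbers for all indexes, then sort them per index) can be sketched in plain Java. This is only an illustration, not Lucene code: the Hit class, the integer index ids and groupAndSort() are hypothetical names, and the retrieval step is left out because the message is cut off in this listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DocNumberGrouping {
    /** Hypothetical pair: which index a hit came from and its doc number there. */
    static final class Hit {
        final int indexId;
        final int docNum;

        Hit(int indexId, int docNum) {
            this.indexId = indexId;
            this.docNum = docNum;
        }
    }

    /** Groups doc numbers by index, then sorts each group in ascending order. */
    static Map<Integer, List<Integer>> groupAndSort(List<Hit> hits) {
        Map<Integer, List<Integer>> perIndex = new TreeMap<>();
        for (Hit hit : hits) {
            perIndex.computeIfAbsent(hit.indexId, k -> new ArrayList<>()).add(hit.docNum);
        }
        for (List<Integer> docNums : perIndex.values()) {
            Collections.sort(docNums); // ascending doc numbers within one index
        }
        return perIndex;
    }

    /** Small made-up input: four hits spread over two indexes. */
    static Map<Integer, List<Integer>> demo() {
        List<Hit> hits = Arrays.asList(
                new Hit(2, 40), new Hit(1, 7), new Hit(2, 3), new Hit(1, 1));
        return groupAndSort(hits);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {1=[1, 7], 2=[3, 40]}
    }
}
```

Each index can then be visited once, reading its documents in doc-number order rather than jumping between indexes for every hit.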