Seconded - I was a Stellent customer - the libraries work great but are
expensive. To see Stellent in action, get a copy of the free X1 desktop
search or the X1 server (Lucene-based).
Another alternative is KeyView from Verity - now Autonomy.
-Original Message-
From: mark harwood [mailto:[
Unfortunately the term search at the site is down - it returns a 500
Internal Server Error.
-Original Message-
From: Dave Kor [mailto:[EMAIL PROTECTED]
Sent: Sunday, September 03, 2006 9:22 PM
To: java-user@lucene.apache.org
Subject: Re: word frequency list?
There is the Berkeley Web Term Frequ
(excuse the semi-appropriate forum to make this comment in - but it is very
brief and may actually help improve the final Lucene-based app)
You may also like to import popularity data from Amazon using their open
APIs and mix the relevancy between your own popularity score and theirs.
Dejan (affi
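A minimal sketch of how two popularity signals could be mixed, assuming both are first normalized to the range 0..1 and combined with a simple linear weight. The class name, the method names and the weight are hypothetical; nothing here calls Lucene or the Amazon APIs, and fetching the external figure is left out entirely.

```java
// Hypothetical helper for illustration only; not part of Lucene or any Amazon API.
final class BlendedPopularity {

    /** Min-max normalization into [0, 1]; returns 0 when the range is empty. */
    static double normalize(double value, double min, double max) {
        if (max <= min) {
            return 0.0;
        }
        double scaled = (value - min) / (max - min);
        return Math.max(0.0, Math.min(1.0, scaled));
    }

    /**
     * Linear blend of two already-normalized scores.
     * ownWeight = 1.0 uses only your own score, 0.0 only the external one.
     */
    static double blend(double ownScore, double externalScore, double ownWeight) {
        if (ownWeight < 0.0 || ownWeight > 1.0) {
            throw new IllegalArgumentException("ownWeight must be in [0, 1]");
        }
        return ownWeight * ownScore + (1.0 - ownWeight) * externalScore;
    }

    public static void main(String[] args) {
        // Assumed raw figures: 40 local clicks out of a max of 200,
        // and an external popularity figure of 5000 on a 0..10000 scale.
        double own = normalize(40, 0, 200);
        double external = normalize(5000, 0, 10000);
        System.out.println(blend(own, external, 0.7));
    }
}
```

The blended value could then be applied as a boost at index time or query time; which of the two fits better depends on how often the popularity figures change.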
Indeed - you bring up interesting questions. You may want to take a look at
Nutch first, however - I am not sure whether they have implemented any of the
Google-like ranking you mention.
However - collaborative relevance enhancement, based on user feedback, would
be a nice Web-2.0-ish feature to bake into th
The approach I find best is to create both Email documents, each containing
a list of (and links to) all its attachments, and individual Attachment
documents.
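As a rough illustration of that layout, here is a sketch that turns one email into one Email document plus one Attachment document per attachment. Documents are modelled as plain field maps rather than Lucene Document objects, and the field names (type, id, parentId, attachmentIds) and the ID scheme are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: field names and the "<emailId>/att<n>" ID scheme are assumptions.
final class EmailDocs {

    /** Returns the Email document first, followed by one document per attachment. */
    static List<Map<String, String>> toDocuments(String emailId, String body,
                                                 List<String> attachmentNames) {
        List<String> attachmentIds = new ArrayList<>();
        for (int i = 0; i < attachmentNames.size(); i++) {
            attachmentIds.add(emailId + "/att" + i);
        }

        List<Map<String, String>> docs = new ArrayList<>();

        // The Email document carries links (IDs) to all of its attachments.
        Map<String, String> email = new LinkedHashMap<>();
        email.put("type", "email");
        email.put("id", emailId);
        email.put("body", body);
        email.put("attachmentIds", String.join(" ", attachmentIds));
        docs.add(email);

        // Each Attachment document points back to its parent email.
        for (int i = 0; i < attachmentNames.size(); i++) {
            Map<String, String> attachment = new LinkedHashMap<>();
            attachment.put("type", "attachment");
            attachment.put("id", attachmentIds.get(i));
            attachment.put("parentId", emailId);
            attachment.put("name", attachmentNames.get(i));
            docs.add(attachment);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("report.pdf");
        names.add("data.xls");
        for (Map<String, String> doc : toDocuments("msg-1", "See attached.", names)) {
            System.out.println(doc);
        }
    }
}
```

With the parent link stored on each attachment, a hit on an attachment can be traced back to its email, and a hit on an email can list its attachments.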
It gets a little tricky when you have a forwarded email, containing an
original Email that contains a tar.gz attachment, which contai
The important detail here is what you mean by "single server"?
A high-end server will work just fine - you want 4GB+ of RAM and the fastest
disk/IO you can get; CPU speed is far less important. A nice Linux software
RAID and 5+ 15K SCSI disks will get you superb performance, at a reasonable
price.
Yes - parallelizing works great - we built a shared-nothing, JavaSpaces-based
system at X1 and on an 11-way cluster were able to index 350 office documents
per second - this included the binary-to-text conversion, using Stellent INSO
libraries. The trick is to create separate indexes and, if you do no
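A small sketch of the separate-indexes idea, assuming a thread pool stands in for the cluster and a term-to-document-ID map stands in for a real index. Each worker builds its own partition and shares no state with the others. Querying every partition and merging the hits is one assumed way to use the result; it is not necessarily what the system described above did.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: in-memory maps stand in for per-worker index directories.
final class PartitionedIndexing {

    /** Worker w indexes documents w, w + workers, w + 2 * workers, ... into its own map. */
    static List<Map<String, Set<Integer>>> buildPartitions(List<String> docs, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Map<String, Set<Integer>>>> futures = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int worker = w;
                Callable<Map<String, Set<Integer>>> task = () -> {
                    Map<String, Set<Integer>> index = new HashMap<>();
                    for (int id = worker; id < docs.size(); id += workers) {
                        for (String term : docs.get(id).toLowerCase().split("\\s+")) {
                            if (!term.isEmpty()) {
                                index.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
                            }
                        }
                    }
                    return index;
                };
                futures.add(pool.submit(task));
            }
            List<Map<String, Set<Integer>>> partitions = new ArrayList<>();
            for (Future<Map<String, Set<Integer>>> future : futures) {
                partitions.add(future.get());
            }
            return partitions;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("indexing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Assumed query strategy: ask every partition and union the document IDs. */
    static Set<Integer> search(List<Map<String, Set<Integer>>> partitions, String term) {
        Set<Integer> hits = new TreeSet<>();
        for (Map<String, Set<Integer>> partition : partitions) {
            hits.addAll(partition.getOrDefault(term.toLowerCase(), Collections.emptySet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        docs.add("lucene index");
        docs.add("fast disk");
        docs.add("lucene search");
        System.out.println(search(buildPartitions(docs, 2), "lucene"));
    }
}
```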
Michael -
Please take a look at our MakeTime UI here: http://www.maketime.com
It is in fact Lucene on the back end - albeit very hard to tell :)
Dejan
-Original Message-
From: Michael Prichard [mailto:[EMAIL PROTECTED]
Sent: Monday, July 17, 2006 8:00 PM
To: java-user@lucene.apache.org
Here is a use case I am trying to address.
I have two separate indexes, which contain sets of the same document
pool/corpus.
The two indexes have a different set of indexed fields.
One of the indexed fields is an external DocumentID.
I would like to perform searches, like a relational join, expre
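One way to approximate a join across two indexes is to run one query per index, collect the external DocumentID of each hit, and intersect the two ID sets in application code. The sketch below shows only that last step; the class and method names are hypothetical, the ID lists are assumed to come from two separate searches, and inner-join (AND) semantics are an assumption.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: the inputs are DocumentIDs already collected from two searches.
final class DocumentIdJoin {

    /** Inner join: keeps IDs found in both result sets, in the order of the first. */
    static Set<String> join(Collection<String> idsFromIndexA, Collection<String> idsFromIndexB) {
        Set<String> joined = new LinkedHashSet<>(idsFromIndexA);
        joined.retainAll(new HashSet<>(idsFromIndexB));
        return joined;
    }

    public static void main(String[] args) {
        List<String> fromA = new ArrayList<>();
        fromA.add("d1");
        fromA.add("d2");
        fromA.add("d3");
        List<String> fromB = new ArrayList<>();
        fromB.add("d3");
        fromB.add("d1");
        fromB.add("d9");
        System.out.println(join(fromA, fromB));
    }
}
```

This keeps both indexes untouched, but it materializes every matching ID from each side, so it may only be practical when at least one of the two result sets is small.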