Sure, the only danger is that you have to make sure both processes store
their lock files in the same directory (by default they are in your home
directory, I believe) unless you use a different locking mechanism.
There are supposed to be problems when accessing indices over network
shares, but I use
Sorry to contradict you Yonik, but I'm pretty sure the commit lock is
*not* locked during a merge, only while the "segments" file is being
updated.
The merge process takes a set of 'old' segment files, writes new segment
files and 'registers' them in the "segments" file when they are ready to
be
Hi,
I am trying to read the Lucene source code, but so far TermScorer.java is
the only place I can find where the tf/idf measure is actually implemented.
I guess the QueryParser class converts each word into a TermQuery first, and
then queries such as BooleanQuery are built from those.
The source code o
You can get the term frequency matrix first. Then, select the most frequent
terms.
An earlier message on this list explained how to build the term frequency matrix.
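A minimal sketch of that idea in plain Java, without any Lucene calls: count how often each term occurs across a set of documents, then pick the most frequent ones. The class TopTerms and the method mostFrequent are hypothetical names used only for illustration, and the simple lowercase-and-split tokenization is an assumption standing in for whatever analyzer you actually index with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: works on plain strings only.
final class TopTerms {

    // Returns the k most frequent terms across all documents, most frequent first.
    // Ties are broken alphabetically so the result is deterministic.
    static List<String> mostFrequent(List<String> docs, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : docs) {
            // Assumption: lowercase and split on anything that is not a letter or digit.
            for (String term : doc.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
                if (!term.isEmpty()) {
                    counts.merge(term, 1, Integer::sum);
                }
            }
        }
        List<String> terms = new ArrayList<>(counts.keySet());
        terms.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return terms.subList(0, Math.min(k, terms.size()));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("lucene index search", "search the index", "index");
        System.out.println(mostFrequent(docs, 2)); // prints [index, search]
    }
}
```

With a real index you would take the counts from the index's term statistics instead of re-tokenizing the text, but the selection step would look much the same.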
regards
jiang xing
On 2/6/06, Pranay Jain <[EMAIL PROTECTED]> wrote:
>
> I have earlier used lucene and I must say it has performed bug free for
> the
>
Hi, what scale is this website? Millions of posts, or fewer?
Wouldn't it be easier to use a Bayesian algorithm to scan each new post
before it is posted, to detect whether it is acceptable or not? Just a quick
idea off the top of my head.
_gk
- Original Message -
From: "Jeff Thorne" <[EMAIL PRO
I have earlier used lucene and I must say it has performed bug free for the
limited use I deployed it for. I now want to use Lucene for something
more. Once the documents are indexed, I want to know which word occurs most
often across the whole document set. Does lucene already provide s
You can generate a token stream for a block of text without having to index
it. Take a look at the highlighter code, it does this very thing.
On 2/5/06, Jeff Thorne <[EMAIL PROTECTED]> wrote:
>
> I am trying to figure out whether or not Lucene is an appropriate solution
> for a problem that our
Jeff Thorne wrote:
I am trying to figure out whether or not Lucene is an appropriate solution
for a problem that our site faces.
I would like to analyze each user's post for various words and expressions
before publishing their post to the DB. I am reading through the Lucene in
Action book and
I am trying to figure out whether or not Lucene is an appropriate solution
for a problem that our site faces. Our site
allows users to post their opinions on various topics. Due to various
government legislation around the world our management would like us to
scan each user's post against various
I have two applications, one which will be generating all the indexes and the
second one which will be reading those indexes. I cannot keep them in the same
application, because one will run all the time in batches via some sort of
scheduler to generate the indexes and the application which wil
Hi,
you have to write your own Similarity object and set it on your searcher.
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
Cheers,
Klaus
-Original Message-
From: xing jiang [mailto:[EMAIL PROTECTED]
Sent: Sunday, 5 February 2006 04:27
To
Hi All,
I'm currently using the DefaultSimilarity with the BooleanQuery add
function to append clauses. The problem I face is this: given a query
, where = a term
it returns a document that has just ONE term in it say and
nothing else. Surprisingly, the hits score for this
I recommend you take a look at your indexes with Luke and see what
actually is indexed.
Erik
On Feb 4, 2006, at 11:54 PM, Xin Herbert Wu wrote:
Hi,
I have two libraries A and B indexed from database tables where A
has about
10 fields and B has about 30 fields (with about a couple
13 matches