involves multiple
terms and/or multiple fields, right?
/Jong
On Mon, Apr 23, 2012 at 11:58 AM, Earl Hood wrote:
> On Mon, Apr 23, 2012 at 10:31 AM, Jong Kim wrote:
>
> > Is there any good way to solve this design problem? Obviously, an
> > alternative design would be to split
When I update a document in Lucene (i.e., re-indexing), I have to delete
the existing document, and create a new one. My understanding is that this
assigns a new doc ID for the newly created document. If that is the case,
is it true that the system can rather quickly run out of doc ID space
(which
Hi,
According to Lucene in Action (Second Edition), section 2.11.2,
"Accessing an index over a remote file system", there are
issues related to accessing a Lucene index across remote file systems,
including NFS.
I'm particularly interested in NFS compatibility, and wondering if t
012 at 3:21 AM, Vitaly Funstein
> wrote:
> >> How tolerant is your project of decreased search and indexing
> performance?
> >> You could probably write a simple test that compares search and write
> >> speeds of local and NFS-mounted indexes and make the decision
ght?
>
> Paul
>
>
> On 2 Oct 2012, at 14:01, Jong Kim wrote:
>
> > Thank you all for reply.
> >
> > So it sounds like it is a known fact that the performance would suffer
> > rather significantly when the index files are accessed over NFS. But how
> >
rather than corruption). I've seen
> fairly large infrastructures being based on NFS and corruption is something
> I've never heard about.
> >
> > Note: no concurrent access to a lucene index, right?
> >
> > Paul
> >
> >
> > > On 2 Oct.
random access to files and this has no reason to be
> > unreliable unless bad things such as network drops happen (in which case
> you'd
> > get direct failures or timeouts rather than corruption). I've seen
> fairly large
> > infrastructures being based on NFS and
a replication. You end up repeating indexing once per
> replica. You also may have to move the indices around as you
> add/remove/restart nodes. We are moving to this architecture with a new
> product, so I am just now starting to understand the trade-offs.
>
> Hope that helps.
>
> My 2 cents,
> Tommaso
>
> 2012/10/2 Jong Kim
>
> > The setup is I have a home-grown server process that has exclusive access
> > to the index files. All reads and writes are done through this server. No
> > other process is reading the same index files whether
Hi,
I'm looking for a stemmer that is capable of returning all morphological
variants of a query term (to be used for high-recall search). For example,
given a query term of 'cares', I would like to be able to generate 'cares',
'care', 'cared', and 'caring'.
I looked at the Porter stemmer,
imagine the stem is
"car". Suddenly the word "cars" shares the same "car" stem and you have a
false positive.
Jong: I _think_ what you need is a "reverse lemmatizer".
Otis
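
A minimal sketch of the "reverse lemmatizer" idea mentioned above, assuming a small hand-written lexicon (the LEXICON table and the function names are hypothetical, for illustration only, and are not part of Lucene). It maps a query term back to its lemma and then returns every known form of that lemma, so 'cares' expands to the 'care' family without picking up 'cars':

```python
# Hypothetical toy lexicon: lemma -> known inflected forms.
# A real system would load this from a morphological dictionary.
LEXICON = {
    "care": ["care", "cares", "cared", "caring"],
    "car": ["car", "cars"],
}


def build_form_index(lexicon):
    """Map every inflected form back to its lemma."""
    index = {}
    for lemma, forms in lexicon.items():
        for form in forms:
            index[form] = lemma
    return index


def expand(term, lexicon=LEXICON):
    """Return all known variants of a term, or the term itself if unknown."""
    lemma = build_form_index(lexicon).get(term.lower())
    if lemma is None:
        return [term]
    return list(lexicon[lemma])


variants = expand("cares")
```

The expanded variants could then be OR-ed together in a query. Because the lookup goes through the lemma rather than a truncated stem, 'care' and 'car' stay in separate families.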
----- Original Message ----
From: Bill Taylor <[EMAIL PROTECTED]>
To: java
Hi,
Does anyone know of a written document that describes in some detail
how Lucene's ranking/scoring algorithm works? I'm safely assuming that
a single consistent algorithm is being used to compute the scores of
each matching document (with or without explicit boost factors in the
query) and r
Hi,
The MoreLikeThis class in Lucene's contrib/queries project performs noise
word filtering based on the case-sensitive comparison of the terms against
the user-supplied stopwords set.
I need this comparison to be case-insensitive, but I don't see any way of
achieving it by extending this cla
e.
I don't imagine there should be a need to change the MoreLikeThis source.
Cheers
Mark
----- Original Message ----
From: Jong Kim <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Sunday, 8 July, 2007 10:12:08 PM
Subject: Stop-words comparison in MoreLikeThis class in Lu
supply stop words in a
case-insensitive fashion?
----- Original Message ----
From: Jong Kim <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, 9 July, 2007 3:00:05 PM
Subject: RE: Stop-words comparison in MoreLikeThis class in Lucene's
contrib/queries project
My applicat
this applies to your app you could run MoreLikeThis on the
lower-cased version of the field in the index.
Cheers
Mark
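
A rough sketch of the general idea behind the suggestion above: make the stop-word check case-insensitive by lower-casing both the stop-word set and each term before comparing them. This is plain Python for illustration, not the MoreLikeThis API, and the function names are hypothetical:

```python
def normalize_stop_words(stop_words):
    """Lower-case the user-supplied stop words once, up front."""
    return {word.lower() for word in stop_words}


def filter_terms(terms, stop_words):
    """Drop terms whose lower-cased form is a stop word; keep original case."""
    lowered = normalize_stop_words(stop_words)
    return [term for term in terms if term.lower() not in lowered]


kept = filter_terms(["The", "Lucene", "AND", "index"], {"the", "and"})
```

The surviving terms keep their original case, so only the comparison is case-insensitive.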
----- Original Message ----
From: Jong Kim <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, 9 July, 2007 3:55:03 PM
Subject: RE: Stop-words compariso
Mark,
I understand your point.
However, we do not maintain a separate field for the lower-case version of
the words.
Instead we index them twice at the same position within the same field,
which allows us to provide case-exact match for search queries containing
upper case characters, but case-i