A couple of references to "Lucene 1.2" in the last few months got me
thinking, and I realized that 1.4.3 is the oldest release available in
the Lucene dist archive. Older releases might still be in the Jakarta
dist archive. Sure enough...
http://archive.apache.org/dist/jakarta/lucene/source/
http:/
Lucene,
Is there JSON support in Lucene? JSON is leaner than XML and would be
preferred. Digester works well for indexing XML, but something along
the same lines for JSON would be even sweeter.
Best, Thom
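Lucene itself has no JSON support; it only sees Documents and Fields, so any
JSON parser can feed it. A minimal sketch, assuming the org.json parser,
2.4-era Field flags, and a flat record of string values (JsonDocBuilder is a
made-up name):

import java.util.Iterator;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.json.JSONObject;

public class JsonDocBuilder {
    // Turn a flat JSON object of string values into a Lucene Document,
    // one stored, analyzed Field per key.
    public static Document fromJson(String json) throws Exception {
        JSONObject obj = new JSONObject(json);
        Document doc = new Document();
        for (Iterator keys = obj.keys(); keys.hasNext();) {
            String key = (String) keys.next();
            doc.add(new Field(key, obj.getString(key),
                    Field.Store.YES, Field.Index.ANALYZED));
        }
        return doc;
    }
}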
This is likely one of the many subtleties of the Porter stemmer. Dr.
Porter has chosen a particular way of doing things, but it isn't
necessarily right for everyone. You really have to measure the net
benefit across all your searches, not just one. If you can't live with
this behavior, you may need a different stemmer or your own filter.
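For what it's worth, the Porter stemmer ships in core as PorterStemFilter,
so it's easy to index the same text with and without stemming and compare
result quality. A minimal sketch (the class name is mine):

import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseTokenizer;
import org.apache.lucene.analysis.PorterStemFilter;
import org.apache.lucene.analysis.TokenStream;

// An analyzer that lowercases and then applies the Porter stemmer.
public class PorterAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        return new PorterStemFilter(new LowerCaseTokenizer(reader));
    }
}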
Christian,
I do not have an answer for you (I hope some of the gurus on this list
can provide an appropriate one).
However, I would ask that you share your findings and experience on this list.
We are facing a similar situation and would appreciate it if you shared
what you learn.
Regards
AS
Hello,
I was wondering if any work had been done out there on an analyzer for
URL strings. I'm looking for something which will match on any of the words
in the domain or path of the URL. I am considering using a PatternAnalyzer,
but I wanted to ask this group to see if this was something which already exists.
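In the meantime, one low-tech alternative to PatternAnalyzer is a
CharTokenizer subclass that keeps only letters and digits, which splits a URL
at '/', '.', ':' and the rest. A minimal sketch using only the core 2.x API
(UrlAnalyzer is a made-up name):

import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharTokenizer;
import org.apache.lucene.analysis.TokenStream;

// Splits on anything that is not a letter or digit and lowercases, so
// http://example.com/somedir/some.html yields
// [http, example, com, somedir, some, html].
public class UrlAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        return new CharTokenizer(reader) {
            protected boolean isTokenChar(char c) {
                return Character.isLetterOrDigit(c);
            }
            protected char normalize(char c) {
                return Character.toLowerCase(c);
            }
        };
    }
}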
A freeware, open-source Windows PC and Web-based application:
http://BahaiResearch.com
It allows people speaking 14 languages to investigate the religious texts of
other religions. The goal is to foster better understanding between peoples
of many religions and many languages; a many-to-many relationship.
Thanks Mike for the links. That certainly helps us plan the
dependencies better.
Michael McCandless wrote:
Well... there are a couple threads on java-dev discussing this "now":
http://www.nabble.com/2.9-3.0-plan---Java-1.5-td20972994.html
http://www.nabble.com/2.9,-3.0-and-deprecation-
You're right, there's not much benefit now... there will be more
benefit when flexible indexing is available. Still, you could set up
an analysis chain where the producer puts something "new" onto each
token, and somewhere downstream you pick that up and do something
interesting with it.
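For instance, a producer filter could attach a payload to every token, and a
downstream consumer or payload-aware scorer could pick it up later. A rough
sketch of the producer side, assuming the pre-2.9 Token API (MarkerFilter is
a made-up name, not the flexible-indexing machinery itself):

import java.io.IOException;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.index.Payload;

// Marks every token with a one-byte payload that something downstream
// (e.g. a payload-aware scorer) can read back at search time.
public class MarkerFilter extends TokenFilter {
    private final byte marker;

    public MarkerFilter(TokenStream input, byte marker) {
        super(input);
        this.marker = marker;
    }

    public Token next(Token reusableToken) throws IOException {
        Token t = input.next(reusableToken);
        if (t != null) {
            t.setPayload(new Payload(new byte[] { marker }));
        }
        return t;
    }
}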
Right, I was debating throwing that in myself. It's great stuff, but I
wasn't sure how much of a feature benefit it brings now. My
understanding is that its main benefit is along the flexible indexing
path and in using multiple consumers, e.g. it's more setup for the
goodness yet to come.
Well, I'm reasonably sure you could make this work, although it'll
take some effort.
The 3,000,000 records/day should be pretty easy.
As for parsing the URLs: if none of the supplied tokenizers do exactly what
you want, you can always make your own. Or you can pre-process the input
if that's easier.
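For example, something as simple as replacing URL separators with spaces
before the text ever reaches a stock analyzer would do. A crude sketch
(the helper name is mine):

// Crude pre-processing: turn URL separators into whitespace so that a
// standard whitespace-based analyzer sees the individual words.
public class UrlPreprocessor {
    public static String flatten(String url) {
        return url.replaceAll("[/:.?&=#]+", " ");
    }
}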
The new extensible TokenStream API (based on AttributeSource) is also
in 2.9.
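With that API, a consumer asks the stream for just the attributes it needs
rather than pulling whole Tokens. A minimal sketch of walking the terms,
assuming the 2.9-style TermAttribute (the helper name is mine):

import java.io.IOException;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;

public class TermPrinter {
    // Walk a TokenStream with the AttributeSource-based API.
    public static void printTerms(TokenStream stream) throws IOException {
        TermAttribute term = (TermAttribute) stream.addAttribute(TermAttribute.class);
        while (stream.incrementToken()) {
            System.out.println(term.term());
        }
    }
}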
Mike
Mark Miller wrote:
Well, look at the issues and see for yourself :)
It's a subjective call, I think. Here's my take:
There are not going to be too many sweeping changes in the next
release. There are tons of smaller changes queued up, though.
Hi *,
I am searching for a fulltext index capable of the following requirements:
- index 3,000,000 new records every day, each with a validity of N days
(e.g. 90 days expiration) == 34.7/s
- one record is e.g. a URL and can be up to 2 KB big:
http://example.com/somedir/some.html
- Lucene should use "/" as a token delimiter
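One way to get the N-day expiration with the 2.4-era API is to store each
record's day in an untokenized field and delete one day Term per expired day.
A rough sketch; the "day" field name, the yyyyMMdd format, and the window
parameter are all assumptions, not anything Lucene mandates:

import java.text.SimpleDateFormat;
import java.util.Calendar;

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class Expirer {
    // Assumes every document was indexed with an untokenized "day" field
    // holding its creation date as yyyyMMdd.
    public static void expire(IndexWriter writer, int validityDays, int window)
            throws Exception {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
        Calendar cal = Calendar.getInstance();
        cal.add(Calendar.DAY_OF_YEAR, -validityDays);
        // Delete one day-Term per expired day, going 'window' days back.
        for (int i = 0; i < window; i++) {
            writer.deleteDocuments(new Term("day", fmt.format(cal.getTime())));
            cal.add(Calendar.DAY_OF_YEAR, -1);
        }
    }
}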
How did you delete the documents? E.g., by docID using IndexReader, or by
Term or Query using IndexWriter?
And when you said your previous index had 14488449 docs, was that from
numDocs() or maxDoc()?
Mike
1world1love wrote:
Ganesh - yahoo wrote:
Optimize will remove the deletes and rearrange the docIDs.
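For reference, the two counts differ by exactly the deleted-but-not-yet-merged-away
documents; a tiny sketch:

import org.apache.lucene.index.IndexReader;

public class DocCounts {
    // numDocs() excludes deleted documents; maxDoc() is one greater than
    // the largest docID and still counts deletions until merges reclaim them.
    public static void print(IndexReader reader) {
        System.out.println("numDocs = " + reader.numDocs());
        System.out.println("maxDoc  = " + reader.maxDoc());
    }
}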
Hi,
I'm using the SnowballAnalyzer for my stemming.
Search words: love, loved, loveliness, loveless, lovely, and loving.
In my index I have the word love. The behavior during searching is that it
can't correctly stem the two words loveliness and loveless to love. And the odd
thing is love
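A quick way to see exactly what the stemmer emits is to run the analyzer by
hand and print every token. A minimal sketch using the contrib
SnowballAnalyzer and the pre-2.9 TokenStream API (StemCheck is a made-up name):

import java.io.StringReader;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.snowball.SnowballAnalyzer;

public class StemCheck {
    public static void main(String[] args) throws Exception {
        SnowballAnalyzer analyzer = new SnowballAnalyzer("English");
        String text = "love loved loveliness loveless lovely loving";
        TokenStream ts = analyzer.tokenStream("body", new StringReader(text));
        // Print each stemmed token so the analyzer's output is visible.
        for (Token t = ts.next(new Token()); t != null; t = ts.next(t)) {
            System.out.println(t.term());
        }
    }
}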
Hi Chris,
I was just thinking that when the 1st query of q2 is run, it will have its
result.
Then the 2nd query of q2 will run and have its own result, BUT it is now
filtered so that no data already returned by the 1st query is returned again.
The results of the 1st and 2nd queries are then appended.
Does the 2nd query really behave this way?
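If the goal is just "run the 2nd query but never return anything the 1st
already returned", one stock-Lucene way is a BooleanQuery with a MUST_NOT
clause. A minimal sketch (names are placeholders):

import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;

public class QueryChain {
    // Matches everything 'second' matches, minus everything 'first' matches,
    // so appending the two result lists never yields duplicates.
    public static Query secondMinusFirst(Query first, Query second) {
        BooleanQuery bq = new BooleanQuery();
        bq.add(second, BooleanClause.Occur.MUST);
        bq.add(first, BooleanClause.Occur.MUST_NOT);
        return bq;
    }
}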
Thanks Mike, I'm still on 2.3.1, so will upgrade soon.
Antony
Michael McCandless wrote:
This was an attempt on addIndexesNoOptimize's part to "respect" the
maxMergeDocs (which prevents large segments from being merged) you had
set on IndexWriter.
However, the check was too pedantic, and was since fixed.
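For anyone following along, maxMergeDocs is a plain setter on IndexWriter.
A minimal sketch with the 2.4-era API (the path handling and the limit value
are illustrative only):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class WriterSetup {
    public static IndexWriter open(String path) throws Exception {
        IndexWriter writer = new IndexWriter(FSDirectory.getDirectory(path),
                new StandardAnalyzer(), true, IndexWriter.MaxFieldLength.UNLIMITED);
        // Segments with more than this many docs are never merged further.
        writer.setMaxMergeDocs(1000000);
        return writer;
    }
}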