Hi Lucene experts,
I have a program that uses Lucene to index the content of my objects
(Documents, Comments, etc.).
After indexing a lot of documents (sometimes 5000, sometimes 12000; it is not
always at the same point) I get this error:
#
# An unexpected error has been detected by HotSpot Virtual Machine:
#
# SIGSEGV (0xb) at pc=
Hello,
I have modified IndexFiles.java to read the document numbers from within the
TREC files, which are being read correctly; however, the index fails to
create the .cfs file, and thus the search query does not return the correct
document number.
Any suggestions on how this can be sorted?
chee
There used to be such a beast, and to get at it you'll need to
resurrect it from the jakarta-lucene-sandbox CVS repository attic.
We (well, I, but no one objected) chose not to bring it over as it
was not a best-practice recommended way to work with Lucene search
results. It had its own
The "Lucene Sandbox" is also known as the "Lucene contrib directory" which
as of 1.9 is included in the core distribution (with each contrib module
in its own jar);
however, there does not appear to be anything named "SearchBean" in
contrib at the moment.
: Date: Fri, 7 Apr 2006 15:38:50 -0500
Can someone tell me where I can find the source code for SearchBean (Lucene
Sandbox)?
Thanks,
--Rajesh
First off: you should double-check the correctness of your customized
Similarity class. I'm pretty sure it's resulting in a different set of
matches than the DefaultSimilarity, because your tf function returns 0f
regardless of whether there is a match. (When I said "every function
returns 0 or 1" I
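To see why a tf() that always returns 0f changes the result set, here is a toy illustration (not Lucene's actual scoring code, and the class and method names are invented for the example): in a tf*idf-style product, a zero tf factor zeroes out every document's score, regardless of the other factors.

```java
// Toy model of a tf*idf-style score product, for illustration only.
// If tf() always returns 0f, every score collapses to 0, so ranking --
// and anything that filters on score -- behaves differently.
public class TfZeroDemo {
    static float score(float tf, float idf, float boost) {
        return tf * idf * boost; // simplified product, not Lucene's formula
    }

    public static void main(String[] args) {
        float idf = 2.5f, boost = 1.0f;
        System.out.println(score(3f, idf, boost)); // normal tf: prints 7.5
        System.out.println(score(0f, idf, boost)); // tf always 0: prints 0.0
    }
}
```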
I just wrote some simple code to test this.
For my test I ran 3 queries:
- A 3-term boolean query
- A single-term query with over 5000 hits
- A single-term query with 0 hits
For each query I ran 4 tests of 10,000 searches:
1) using hits.length to get the counts and the standard si
Lucene does not provide this out of the box. You will have to write a
program to do it and feed the results to Lucene.
If I remember right, these files are in XML, so you can probably use SAX
or a pull parser.
I think a number of TREC participants, in the past, have used Lucene, so
you may
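A minimal sketch of that splitting step, assuming the common TREC layout where each file concatenates documents wrapped in &lt;DOC&gt;...&lt;/DOC&gt; with a &lt;DOCNO&gt; tag inside. Since TREC data is often not well-formed XML, this version scans for the markers directly rather than using SAX; the class name is illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Splits one TREC file's text into per-document (DOCNO, body) pairs,
// assuming <DOC>...</DOC> blocks each containing a <DOCNO> element.
public class TrecSplitter {
    private static final Pattern DOC =
        Pattern.compile("<DOC>(.*?)</DOC>", Pattern.DOTALL);
    private static final Pattern DOCNO =
        Pattern.compile("<DOCNO>\\s*(.*?)\\s*</DOCNO>", Pattern.DOTALL);

    public static List<String[]> split(String trecFileText) {
        List<String[]> docs = new ArrayList<>();
        Matcher m = DOC.matcher(trecFileText);
        while (m.find()) {
            String body = m.group(1);
            Matcher n = DOCNO.matcher(body);
            String docno = n.find() ? n.group(1) : "";
            docs.add(new String[] { docno, body });
        }
        return docs;
    }
}
```

Each returned pair would then be indexed as its own Lucene Document, with the DOCNO stored so searches can report the TREC document number.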
So I'm trying to do silly stuff, just to poke a bit at wildcard queries. So
sue me... But I ran across this
And yes, I know that creating a wildcard query is dangerous and downright
silly when you don't have a wildcard in the term, but this still seems like
a case that should, say, default to a sim
Hi,
Can anyone suggest how to split files using Lucene?
I am trying to index the TREC collection using lucene-1.4.3.
I want Lucene to read the multiple files within a single TREC file and create
an index accordingly.
cheers,
trupti mulajkar
MSc Advanced Computer Science
-
OK, I know I'm asking you to write my code for me (or at least point me to
an example), but I'm at my wits' end, so please rescue me.
This is a reprise of TooManyClauses. We have a large amount of text, and a
requirement to do a wildcard query. Of course, it's way too big to use
Wildcard or t
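For background on why TooManyClauses appears here: a WildcardQuery rewrites into one BooleanQuery clause per matching term in the index's term dictionary, and on a large corpus that quickly exceeds the clause limit (which can be raised via BooleanQuery.setMaxClauseCount, at a memory cost). The snippet below is a toy stdlib-only model of that expansion, not Lucene's code; all names in it are invented for the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Toy model of wildcard-query rewriting: the pattern is expanded against
// the term dictionary, one clause per matching term -- which is why broad
// patterns on big indexes blow past the boolean clause limit.
public class WildcardExpansion {
    // Translate a Lucene-style wildcard (* = any run, ? = any char) to regex.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') sb.append(".*");
            else if (c == '?') sb.append('.');
            else sb.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(sb.toString());
    }

    // Counts how many dictionary terms the pattern would expand into.
    static long countMatches(String wildcard, List<String> termDictionary) {
        Pattern p = toRegex(wildcard);
        return termDictionary.stream()
                             .filter(t -> p.matcher(t).matches())
                             .count();
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("test", "text", "team", "toast");
        System.out.println(countMatches("te*", terms)); // prints 3
    }
}
```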
Thanks Chris
I just realized the "contents" field in the index is not the "contents" of
the original document.
Miki
Original Message Follows
From: Chris Hostetter <[EMAIL PROTECTED]>
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: Re: doc.get("contents")
Date:
Hi all,
Sorry for the noise, it was my own fault. After a look at the sources, I saw
that I had misinterpreted the MaxBufferedDocs parameter.
IndexWriter.maybeMergeSegments() seems to always merge everything if it is set
so high. For my iterative updates of the index, it seems that the standard
setting
Yes, this might be a way, but in my case it would not work.
The problem is that I have to return an excerpt (snippet) and the words to
be highlighted as two separate strings. So now I use the Highlighter and
getBestFragment to extract the excerpt, then I remove the inserted HTML tags
and return the
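That tag-stripping step can be done with plain string processing. A minimal sketch, assuming the highlighter's default formatter wraps matched terms in &lt;B&gt;...&lt;/B&gt; (as Lucene's SimpleHTMLFormatter does by default); the class name is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Post-processes a highlighter fragment: collects the terms wrapped in
// <B>...</B>, then strips those tags, yielding the plain excerpt and the
// highlight words as two separate results.
public class FragmentSplitter {
    private static final Pattern TAGGED = Pattern.compile("<B>(.*?)</B>");

    public static List<String> highlightedWords(String fragment) {
        List<String> words = new ArrayList<>();
        Matcher m = TAGGED.matcher(fragment);
        while (m.find()) words.add(m.group(1));
        return words;
    }

    public static String plainExcerpt(String fragment) {
        return TAGGED.matcher(fragment).replaceAll("$1");
    }
}
```

For a fragment like "a &lt;B&gt;quick&lt;/B&gt; brown &lt;B&gt;fox&lt;/B&gt;", plainExcerpt returns "a quick brown fox" and highlightedWords returns [quick, fox].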