Hi Otis,
Thanks for such a quick reply. I tried using finally, but it didn't help.
I guess if I explain the integration of Lucene with my app in a little detail,
then you can probably help me better.
I allow users to upload documents, which are then indexed, and search on
them. Now I am getting thi
Hi,
I have used Lucene in my application and am just indexing and searching on
some documents. The code that indexes the documents was working fine till
yesterday and suddenly stopped working.
I get an error when I am trying to close the index writer. The code is as
follows:
.
Who knows what else the app is doing.
However, I can quickly suggest that you add a finally block and close your
writer in there if writer != null.
Otis
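A minimal sketch of the finally-block suggestion above. DemoWriter is a hypothetical stand-in invented for illustration, not Lucene's IndexWriter, and the failure on a null document is only there to exercise the error path:

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-in for an index writer (assumption, not a Lucene class).
class DemoWriter implements Closeable {
    boolean closed = false;

    void addDocument(String text) throws IOException {
        if (text == null) {
            throw new IOException("cannot index a null document");
        }
    }

    @Override
    public void close() {
        closed = true;
    }
}

class SafeIndexing {
    // Indexes one document and closes the writer in finally, so the close
    // happens whether or not addDocument throws. The writer is returned only
    // so the caller can inspect its state afterwards.
    static DemoWriter indexOne(String text) {
        DemoWriter writer = null;
        try {
            writer = new DemoWriter();
            writer.addDocument(text);
        } catch (IOException e) {
            System.err.println("indexing failed: " + e.getMessage());
        } finally {
            if (writer != null) {
                writer.close();
            }
        }
        return writer;
    }
}
```

With this shape, both SafeIndexing.indexOne("some text") and SafeIndexing.indexOne(null) leave the writer closed.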
- Original Message
From: Shivani Sawhney <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, February 15, 2006 11:31:
Here are a few bits:
http://www.lucenebook.com/search?query=indexing+numbers
The Wiki and the FAQ also have some information about indexing numbers/dates.
Basically, you want them small (ints, faster sorting, if you need sorting), and
you don't want them too fine, if you'll be expanding them into
Hi,
I am in the process of deciding specs for a crawling machine and a
searching machine (two machines), which will support merging/indexing
and searching operations on a single Lucene index that may scale to
about several million pages (at which it would be about 2-10 GB,
assuming linear growth w
Hi,
What is the best way to index numeric decimal fields, like experience, when
I want to use a range search on this field?
Thanks in advance.
Regards,
Shivani
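One possible approach, sketched under assumptions: term ranges in Lucene of this era compare terms as strings, so a decimal value can be scaled to an integer and zero-padded to a fixed width before indexing. The class name, the one-decimal-place scale and the width of 5 are all illustrative choices, and negative values are not handled:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;

class PaddedNumber {
    // Encodes a non-negative decimal as a fixed-width digit string whose
    // lexicographic order matches the numeric order of the inputs.
    static String encode(String value, int scale, int width) {
        BigDecimal scaled = new BigDecimal(value)
                .setScale(scale, RoundingMode.HALF_UP)
                .movePointRight(scale);
        long units = scaled.longValueExact();
        if (units < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        String padded = String.format(Locale.ROOT, "%0" + width + "d", units);
        if (padded.length() > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " digits");
        }
        return padded;
    }
}
```

For example, PaddedNumber.encode("2.5", 1, 5) gives "00025" and PaddedNumber.encode("10", 1, 5) gives "00100", so 2.5 sorts before 10, whereas the raw strings "2.5" and "10" would sort the other way round.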
In the example code, take a look at the SearchServlet.java code and the
performFeedback and getTopTerms() methods, which demonstrate the use of
the term vectors. It is fairly well commented. You don't need maven,
JSP or JUnit for this. On the indexing side, look at the TVHTMLDocument
for how
Omar Didi wrote:
I have tried to use the isCurrent() method of IndexReader to figure out
if an index is merging, but since I have to do this every time I need
to add a document, the performance got so slow.
Here is what I am doing: I create 4 indexes and I am running with 4
threads. I do a round r
Hi
Thanks for replying.
I read your ppt. It is good. But the code or the basic relevance feedback is
not explained there. Actually I am not familiar with JSP, JUnit, Maven, etc.
I guess it will take me a lot of time to actually discover how things work
in demo program because I have to learn all
Leon,
An index is typically a directory on disk with files (commonly called "index
files") in it.
Each index can have 1 or more segments.
Each segment comprises several index files.
If you are using the compound index format, then the situation is a bit
different (fewer index files).
Otis
P.
Hi Chris,
Thanks. When I said segment I meant index file.
So if we have 10 separate index files, are you saying we should have one
IndexSearcher for the index collectively, or one per index file?
Thanks
Leon
- Original Message -
From: "Chris Hostetter" <[EMAIL PROTECTED]>
To:
Sent
: We may have many different segments of our index, and it seems below we are
: using one
: IndexSearcher per segment. Could this explain why we run out of memory when
: using more than 2/3 segments?
: Anyone else have any comments on the below?
terminology is a big issue here .. when you use the
Try using a different reader to delete the documents.
Hits can re-execute a query, and if the searcher you are using is
sharing the reader you are deleting with, it's like changing a list
you are iterating over (fewer hits will be found the next time the
query is executed).
-Yonik
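The list analogy above can be sketched in plain Java. This is only an illustration of the idea, not Lucene code: the snapshot list plays the role of the searcher's own reader, the live list plays the role of the reader used for deletes, and the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotDelete {
    // Removes every entry containing the marker from 'live' while iterating
    // over a separate snapshot, so the removals cannot disturb the iteration.
    static int deleteMatching(List<String> live, String marker) {
        List<String> snapshot = new ArrayList<>(live);
        int deleted = 0;
        for (String doc : snapshot) {
            if (doc.contains(marker) && live.remove(doc)) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

Iterating over the live list directly and removing from it inside the loop is the plain-Java counterpart of deleting through the reader the searcher is sharing.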
On 2/15/06, Dan
Hi Lucene users, I have a strange error and I don't know what to do.
My logs say this:
java.lang.ArrayIndexOutOfBoundsException: 100 >= 100
at java.util.Vector.elementAt(Vector.java:431)
at org.apache.lucene.search.Hits.hitDoc(Hits.java:127)
at org.apache.lucene.search.Hits.doc(Hits
I have tried to use the isCurrent() method of IndexReader to figure out if an
index is merging, but since I have to do this every time I need to add a
document, the performance got so slow.
Here is what I am doing: I create 4 indexes and I am running with 4 threads. I
do a round robin on the ind
Hi All,
My system requires traversing Hits (search result) and extracting some
data from it. If the result set is very large my system becomes very
slow.
Is there a way to increase performance? Is there a way I can limit the
number of most relevant documents returned?
Best regards,
Urvashi
Chandramohan wrote:
perform such a cull again, you might make several
distinct indexes (one per
day, per week, per whatever) during that reindexing
so the next time will be
much easier.
How would you search and consolidate the results
across multiple indexes? Hits from each index will
have
Looking into the memory problems further I read
"Every time you open an IndexSearcher/IndexReader resources are used which
take up memory. for an application pointed at a static index, you only
ever need one IndexReader/IndexSearcher that can be shared among multiple
threads issuing queries. if
> perform such a cull again, you might make several
> distinct indexes (one per
> day, per week, per whatever) during that reindexing
> so the next time will be
> much easier.
How would you search and consolidate the results
across multiple indexes? Hits from each index will
have independent sc
> From the user's point of view I think it will make sense to
> build a phrase query only when the quotes are found in the search string.
You make an interesting point Sergiu. Your proposal would increase
the expressive power of the QueryParser by allowing the construction
of either phrase querie
URL is http://www.cnlp.org/apachecon2005/
Koji Sekiguchi wrote:
Please check Grant Ingersoll's presentation at ApacheCon 2005.
He put out great demo programs for the relevance feedback using Lucene.
Thank you,
Koji
-Original Message-
From: varun sood [mailto:[EMAIL PROTECTED]
Sent
Chris Hostetter wrote:
: Exactly this is my question, why the QueryParser creates a Phrase query
: when he gets several tokens from analyzer
: and not a BooleanQuery?
Because if it did that, there would be no way to write phrase queries :)
I'm not very sure about this ...
QueryParser only
You might also want to look at the LucQE project
(http://sourceforge.net/projects/lucene-qe/), which implements a couple
of automated relevance feedback methods including Rocchio's formula.
On 2/15/06, Koji Sekiguchi <[EMAIL PROTECTED]> wrote:
> Please check Grant Ingersoll's presentation at A
Hi Greg,
Thanks. We are actually running against 4 segments of 4 GB, so about 20
million docs. We can't merge the segments, as there seem to be problems with
our Linux box having files over about 4 GB. Not sure why that is.
If I was to upgrade to 8gb of ram does it seem likely this will dou