Check the Nutch or Solr projects, both of which are subprojects of Lucene. Feel
free to drop me a line if you run into difficulties.
-Original Message-
From: "John Evans" <[EMAIL PROTECTED]>
Date: Mon, 28 Jul 2008 18:53:08
To:
Subject: Using lucene as
Dear Karsten:
Sorry for the multiple posts, but I have made some progress. I think in
order to search multiple fields, I should be using the
MultiFieldQueryParser class and simply pass a String array containing
the fields I wish to search over. My follow-up question to you is this:
How do
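For reference, a minimal sketch of what John describes, against the Lucene 2.x API of this thread. The field names and the query string are placeholders, not anything from the original index:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.search.Query;

public class MultiFieldSearchSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder field names; substitute the fields your index actually stores.
        String[] fields = {"title", "body", "url"};
        MultiFieldQueryParser parser =
            new MultiFieldQueryParser(fields, new StandardAnalyzer());
        // The parser expands the input across every listed field.
        Query query = parser.parse("lucene");
        System.out.println(query);
    }
}
```

The resulting Query can be handed to an IndexSearcher exactly like a single-field query.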
Use IndexWriter.setRAMBufferSizeMB(double mb) and you won't have to
sacrifice anything. It defaults to 16.0 MB, so depending on the size of your
index you may want to make it larger. Do some testing at various values to
see where the sweet spot is.
John G.
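A sketch of that setting against the Lucene 2.3-era API; the 64 MB value is an arbitrary example, and RAMDirectory is used only to keep the snippet self-contained (a real index would use an FSDirectory on disk):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;

public class RamBufferSketch {
    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        // true = create a new index
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        // Default is 16.0 MB; 64 MB is an example value -- benchmark your
        // own indexing runs to find the sweet spot John mentions.
        writer.setRAMBufferSizeMB(64.0);
        writer.close();
    }
}
```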
-Original Message-
From: Dragon
Hi All,
I have successfully used Lucene in the "traditional" way to provide
full-text search for various websites. Now I am tasked with developing a
data-store to back a web crawler. The crawler can be configured to retrieve
arbitrary fields from arbitrary pages, so the result is that each docum
Hi,
I've indexed Book Title, Author Name, Contents, and some other fields.
Previously I offered the option to search for a string in any of those fields,
and I displayed results by reading the "Title", "Author Name", and "Contents"
fields from the documents in the resulting hits.
Now I want to display "Title" & "Author Name" list w
I'd like to shorten the time it takes to optimize my index and am willing to
sacrifice search and indexing performance. Which parameters (e.g. merge
factor) should I change? Thank you.
Hi Karsten:
I have another follow-up question for you. Once I create the index the way
you suggested, how would I modify my code to search it?
At present, I have the following code:
Analyzer analyser = new StandardAnalyzer();
Query parser=new QueryParser("LINES", analyser).pa
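The snippet above is cut off and, as written, assigns a QueryParser to a Query variable, which would not compile. One plausible completion, with the query string purely an example:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class LinesQuerySketch {
    public static void main(String[] args) throws Exception {
        Analyzer analyser = new StandardAnalyzer();
        // Parser and parsed Query are distinct objects in the 2.x API.
        QueryParser parser = new QueryParser("LINES", analyser);
        Query query = parser.parse("friends romans countrymen"); // example input
        System.out.println(query);
    }
}
```

The Query would then be passed to an IndexSearcher over the index built the way Karsten suggested.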
BTW, we use Lucene.NET, not Java, currently, so the version is 1.9. Unfortunately
we don't have "setAllowDocsOutOfOrder" but do have "useScorer14", which is
almost the same thing for some queries. I did not see much improvement, and for
other queries it was slower. We are stuck on 1.9 due to some stabil
Thanks Karsten for your reply. I will implement your solution tonight,
however I did have a quick follow up question. I understand how you are
implementing the solution for the "SCENE-COMMENTARY" tag, however because at
present I am working with the "LINES" tag, shouldn't I continue using that
i
The description here sounds exactly like what we were seeing before
LUCENE-669 was fixed -- from his writeup it doesn't look like he
tested with Lucene 2.2 to see if the problem went away. I think it
very well may.
That said, as a precaution, maybe we should no longer call close() on
o
Perhaps one thing to try is a partial optimize
(IndexWriter.optimize(int maxNumSegments)). It makes optimize faster,
but searches may run slower than against a fully optimized index.
E.g., optimize(5) will reduce the index to <= 5 segments.
Mike
Stu Hood wrote:
Also, keep in mind that optimization is a very
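A minimal sketch of Mike's suggestion against the Lucene 2.3 API; the document, field name, and segment count are placeholders:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;

public class PartialOptimizeSketch {
    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);

        Document doc = new Document();
        doc.add(new Field("body", "some example text",
                          Field.Store.NO, Field.Index.TOKENIZED));
        writer.addDocument(doc);

        // Merge down to at most 5 segments instead of a full optimize to 1.
        writer.optimize(5);
        writer.close();
    }
}
```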
Yes you can, and that should be fast.
Another thing to try is an SSD -- look at the "Lucene performance
issues" thread on java-user.
Mike
On Jul 27, 2008, at 11:54 PM, 王建新 wrote:
Thanks a lot.
I have an idea: can I use Lucene on a 64-bit VM?
In that case, I could load all the index files t
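What loading the index into memory might look like in the 2.x API; the on-disk path is a placeholder, and the large heap a 64-bit JVM allows is what makes this feasible for big indexes:

```java
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

public class InMemorySearchSketch {
    public static void main(String[] args) throws Exception {
        // "/path/to/index" is a placeholder for an existing on-disk index.
        // RAMDirectory(Directory) copies every index file into the Java heap,
        // so run with a suitably large -Xmx on a 64-bit JVM.
        RAMDirectory ram =
            new RAMDirectory(FSDirectory.getDirectory("/path/to/index"));
        IndexSearcher searcher = new IndexSearcher(ram);
        // ... run queries against searcher ...
        searcher.close();
    }
}
```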
Ahh gotchya, OK.
Mike
Ajay Garg wrote:
Thanks Mike.
Yes, I know 2.3.2 doesn't have commit(). That's why I asked whether
commit = close + new IndexWriter, because then I can write a commit() method
encapsulating close() + new IndexWriter.
Thanks a ton for the prompt replies..
Ajay Gar
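A sketch of the wrapper Ajay describes, against the 2.3.x API (which has no IndexWriter.commit()); the class name and structure here are illustrative, not anything from the thread:

```java
import java.io.IOException;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;

// "Committing" on 2.3.x means closing the writer (which flushes buffered
// documents and makes them visible to new readers) and reopening it.
public class CommittingWriter {
    private final Directory dir;
    private final Analyzer analyzer;
    private IndexWriter writer;

    public CommittingWriter(Directory dir, Analyzer analyzer) throws IOException {
        this.dir = dir;
        this.analyzer = analyzer;
        this.writer = new IndexWriter(dir, analyzer, true); // create new index
    }

    public IndexWriter writer() {
        return writer;
    }

    public void commit() throws IOException {
        writer.close();                                 // flush + publish changes
        writer = new IndexWriter(dir, analyzer, false); // reopen in append mode
    }
}
```

Note that close/reopen is heavier than a true commit, so it is worth batching documents between calls.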
Hi Fayyaz,
again, this is about the SAX handler, not about Lucene.
My understanding of what you want:
1. one lucene document for each SPEECH-Element (already implemented)
2. one lucene document for each SCENE-COMMENTARY-Element (not implemented
yet).
correct?
If yes, you can write
i
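One way Karsten's point 2 might be sketched: a SAX handler that emits one Lucene document per SCENE-COMMENTARY element, mirroring what is already done for SPEECH. The element and field names come from the thread; everything else is an assumption:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SceneCommentaryHandler extends DefaultHandler {
    private final IndexWriter writer;
    private final StringBuilder buf = new StringBuilder();
    private boolean inCommentary = false;

    public SceneCommentaryHandler(IndexWriter writer) {
        this.writer = writer;
    }

    public void startElement(String uri, String local, String qName, Attributes atts) {
        if ("SCENE-COMMENTARY".equals(qName)) {
            buf.setLength(0);      // start collecting this element's text
            inCommentary = true;
        }
    }

    public void characters(char[] ch, int start, int length) {
        if (inCommentary) buf.append(ch, start, length);
    }

    public void endElement(String uri, String local, String qName) {
        if ("SCENE-COMMENTARY".equals(qName)) {
            try {
                // One Lucene document per SCENE-COMMENTARY element.
                Document doc = new Document();
                doc.add(new Field("SCENE-COMMENTARY", buf.toString(),
                                  Field.Store.YES, Field.Index.TOKENIZED));
                writer.addDocument(doc);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
            inCommentary = false;
        }
    }
}
```

The existing SPEECH handling would sit alongside this in the same handler, each element type producing its own documents.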
On Sun, 2008-07-27 at 21:38 +0100, Mazhar Lateef wrote:
> * email searching
> o We are creating very large indexes for emails we are
> processing; the size is up to 150+ GB for indexes only (not
> including data content), this we thought would improve
> search
Not an answer to your question. But have you tried IBM's OmniFind Personal
Email Search? An excerpt from their site:
Simple keyword or text search is not always effective for quickly finding
what you need. IBM(R) has gone beyond keywords by inventing a fast and
accurate semantic search system for