Hi Tushar,
If you refer to the Javadocs for IndexReader, you'll come across the
following line:
"For efficiency, in this API documents are often referred to via
document numbers, non-negative integers which each name a unique
document in the index. These document numbers are ephemeral--they may
ch
> one can improve search performance by using a RAMDirectory created
from an underlying FSDirectory using one of the parameterised
constructors. Is this correct?
Absolutely
> Will a FSDirectory not automatically load the index into memory
provided enough RAM is available?
Not all index files are
Hi Shakti,
> I am using Searching is taking a lot of time.
What do you mean by a lot of time? How much time is it taking?
There are a lot of factors that affect the search speed.
> The size of the folder where I am keeping the index files is 160 MB
containing 3277 documents.
That's not too much.
Liaqat,
What exactly are you looking for? Are you sure you want to build the
source of lucene and then use it? Alternatively you could simply use the
lucene jar file (ie. already built for you) and start playing around
with it. This jar file is bundled in the archive that you might have
downloaded.
Check out "http://www.ibm.com/developerworks/web/library/j-lucene/".
Though this page does not list a comparison between SAX and Digester, it
convinced me enough to use Digester
Regards,
kapilChhabra
-Original Message-
From: syedfa [mailto:[EMAIL PROTECTED]
Sent: Monday, November 19,
Hi,
Check out the following lines in the documentation of IndexReader:
* "An IndexReader can be opened on a directory for which an IndexWriter
is opened already, but it cannot be used to delete documents from the
index then. "
* "Once a document is deleted it will not appear in TermDocs or
Term
Hey!
Search for the topic "Aggregating Category Hits" in the list. You'll get
a few approaches that you may use to implement "groupby".
Regards,
kapilChhabra
-Original Message-
From: Haroldo Nascimento [mailto:[EMAIL PROTECTED]
Sent: Monday, November 19, 2007 3:02 AM
To: java-user@lucene
If it's only about the search, you could have "section" as just another field in
your index. You could simply search on "work" as well as "section".
Otherwise, if you are looking at aggregating category hits, then look at
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200605.mbox/[EMAI
about the load balancer yet. That is what we are trying
to find out. I was thinking of using LVS, but I am not sure how easy it is to
use. I also found the Persistence feature to be quite useful. Is there
any better solution for load balancing?
Chhabra, Kapil wrote:
>
> Ah! There are so many way
Ah! There are so many ways to do this as there are so many questions
unanswered in your mail.
What kind of load balancer are you going to install?
Will you be replicating the complete lucene index on both the servers?
Do you plan to use the MultiSearcher/ParallelMultiSearcher?
Do these servers sha
Hi,
Yes, there are ways and workarounds to remove duplicates based on one
field. But you should not need them if you don't index duplicates in
the first place. Just put a call to "delete" from the index right before you
add the document to it.
Best Regards,
Kapil Chhabra
-Original Message-
Fr
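The delete-before-add pattern described above can be sketched in a few lines. This is only a toy model and not the Lucene API: ToyIndex, deleteByKey and addOrReplace are hypothetical names made up for illustration, and a plain list stands in for the index. In a real index the delete would be done on whatever unique field you store per document.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an index (hypothetical, NOT the Lucene API).
class ToyIndex {
    // Each document is a {key, body} pair; the key plays the role of a unique-id field.
    private final List<String[]> docs = new ArrayList<>();

    // Remove every document whose key matches, similar in spirit to deleting by a term.
    void deleteByKey(String key) {
        docs.removeIf(d -> d[0].equals(key));
    }

    // Delete first, then add, so at most one document per key survives.
    void addOrReplace(String key, String body) {
        deleteByKey(key);
        docs.add(new String[] {key, body});
    }

    int size() {
        return docs.size();
    }

    // Returns the body stored for the key, or null if the key is not indexed.
    String bodyOf(String key) {
        for (String[] d : docs) {
            if (d[0].equals(key)) {
                return d[1];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addOrReplace("doc-1", "first version");
        index.addOrReplace("doc-2", "another document");
        index.addOrReplace("doc-1", "second version"); // replaces, does not duplicate
        System.out.println(index.size());          // prints 2
        System.out.println(index.bodyOf("doc-1")); // prints second version
    }
}
```

Because the delete is a no-op when the key is not present yet, the same call works for both the first insert and every later update.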
Hey Albert,
Just to remind you that fields in Lucene are per document and not
per index. This means that you can have documents in an index which have
different fields altogether.
So, in effect, you can add all your document types to your existing index.
And guess what, you don't need to change an
Try going through:
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
Regards,
kapilChhabra
-Original Message-
From: SK R [mailto:[EMAIL PROTECTED]
Sent: Monday, August 06, 2007 5:09 PM
To: java-user@lucene.apache.org
Subject: speedup indexing
Hi,
I have indexed 5 fields and s
What is the structure of your index?
If you haven't already, then add a new field to your index that stores the
contractId. For all other fields, set the "store" flag to false while
indexing.
You can now safely retrieve the value of this contractId field based on
your search results.
Regards,
kapil
You just have to make sure that what you are searching is indexed (and
esp. in the same format/case).
Use Luke (http://www.getopt.org/luke/) to browse through your index.
This might give you an insight into what you have indexed and what you are
searching for.
Regards,
kapilChhabra
-Original Me
Is that not also true for any RDBMS table that does not have a Primary
Key?
If this is a problem that you are facing, then it can be solved by
introducing one unique identifier as a field in your index which would
act as a Primary Key for your index.
Using an untokenized field might not be a good
I don't think there is any way out other than re-indexing in
all-lowercase or all-uppercase (through an Analyzer or externally), and then
searching in the same case as you used while indexing.
Even if you find a way by which you can run case insensitive searches, I
am sure it'll add to the co
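To illustrate the "same case at index time and at search time" idea, here is a minimal sketch that does the normalisation externally, using only the JDK. It is not Lucene code: CaseFoldingIndex and its methods are hypothetical names, and a HashMap stands in for the index. The only point is that one normalize() function is applied on both sides.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy inverted index (hypothetical, NOT the Lucene API) that lower-cases every term.
class CaseFoldingIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Single place where the case is decided; used for indexing and for searching.
    static String normalize(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    // Split on whitespace and record the document under each normalised token.
    void add(int docId, String text) {
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(normalize(token), k -> new HashSet<>()).add(docId);
        }
    }

    // Normalise the query term the same way before the lookup.
    Set<Integer> search(String term) {
        Set<Integer> hits = postings.get(normalize(term));
        return hits == null ? new HashSet<Integer>() : hits;
    }

    public static void main(String[] args) {
        CaseFoldingIndex index = new CaseFoldingIndex();
        index.add(1, "Lucene Index");
        index.add(2, "lucene search");
        System.out.println(index.search("LUCENE").size()); // prints 2
        System.out.println(index.search("Index"));         // prints [1]
    }
}
```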
http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/search/Hits.html#id(int)
public final int id(int n)
throws IOException
Returns the id for the nth document in this set. Note that ids may
change when the index changes, so you cannot rely on the id to be
stable.
kapilCh
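A rough way to picture why the id is not stable: think of the document number as a position that can shift once deleted documents are removed, while a key you store yourself stays the same. The sketch below is plain JDK code and not the Lucene API; EphemeralIds, positionOf and the "contract-N" keys are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (hypothetical, NOT the Lucene API): the list position plays the
// role of the document number, the string plays the role of a stored unique key.
class EphemeralIds {
    // Look up the current position of a document by its stable key (-1 if absent).
    static int positionOf(List<String> docs, String key) {
        return docs.indexOf(key);
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>(
                Arrays.asList("contract-1", "contract-2", "contract-3"));

        int before = positionOf(docs, "contract-3");
        docs.remove("contract-1"); // deleting a document shifts the later positions
        int after = positionOf(docs, "contract-3");

        System.out.println(before); // prints 2
        System.out.println(after);  // prints 1
    }
}
```

A position remembered before the change would now point at a different slot, which is why looking documents up by your own key field is the safer habit.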