Hi All,
I had a query regarding usage of lucene.
I have done the indexing for the files kept in root folder ->
subfolder-> Subfolder structure.
When I make the search with particular word it returns me the list of
matching files across the folder structure right from root to the last
subfolde
Hi
I would like to join java-user mailing list.
I had a query regarding usage of lucene.
I have done the indexing for the files kept in root folder -> subfolder
-> subfolder structure.
When I make the search with particular word it returns me the list of
matching files across the f
Hi
Thanks for the suggestions. This would require us to change the index
and right now we literally have millions of documents stored in current
index format. I'll bear it in mind, but I am not entirely sure how I
would go about implementing the change at this point.
Much appreciated
Jamie
1. redefine the archivedate field as YYmmDD format,
2. add another field using timestamp for sort use.
3. use RangeFilter to get result and then sort by timestamp.
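The three steps above could look roughly like this in plain Java. This is only a sketch, not Lucene code: `ArchiveDateSketch`, `dayKey` and `inRangeSortedByTimestamp` are made-up names, the `yyMMdd` pattern is one reading of "YYmmDD" (in `SimpleDateFormat`, lowercase `mm` means minutes), UTC is an assumption, and the string comparison only stands in for what a RangeFilter would do against the index.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Lucene.
final class ArchiveDateSketch {
    // Assumed reading of "YYmmDD": two-digit year, month, day.
    static final String DAY_PATTERN = "yyMMdd";

    // Step 1: day-resolution key that would be stored in the archivedate field.
    static String dayKey(long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat(DAY_PATTERN, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: dates are kept in UTC
        return fmt.format(new Date(epochMillis));
    }

    // Steps 2 and 3: keep the full timestamp separately, select by day range,
    // then sort the selected entries by timestamp.
    static List<Long> inRangeSortedByTimestamp(List<Long> timestamps,
                                               String fromDay, String toDay) {
        List<Long> kept = new ArrayList<Long>();
        for (Long ts : timestamps) {
            String key = dayKey(ts);
            // Stand-in for the range filter: lexicographic compare on the day key.
            if (key.compareTo(fromDay) >= 0 && key.compareTo(toDay) <= 0) {
                kept.add(ts);
            }
        }
        Collections.sort(kept);
        return kept;
    }
}
```

The point of the split is that the range only has to match a handful of day-level terms, while the ordering still uses the precise timestamp.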
2008/2/27, Jamie <[EMAIL PROTECTED]>:
>
> Hi Michael & Others
>
> Ok. I've gathered some more statistics from a different machine for
Hi John,
I am getting Summary value null in results.jsp page and I need "snippet"
or "fragment" to be highlighted.
I have gone through the related Lucene FAQs, but it is still not clear. I would
appreciate it if you could help me find the list of Java files to be modified.
Thanks in advance.
Ravinder
-Original
Ravinder,
If you want something from an index it has to be IN the index. So, store a
summary field in each document and make sure that field is part of the
query.
John G.
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 27, 2008 7:58 PM
To:
Soren,
Your documents are being boosted. Because document boost values immediately
go through some calculations before they are stored in the index, Luke
will always show 1.0 as the boost value. There has been some talk in the
recent past that this should be removed from Luke since it is actuall
Am I missing something? Isn't this exactly what Lucene does?
Put in a value when you create your Document, get it back out when it comes
back from a search, right?
Want a text summary? Put it in to the document...
I just started playing with Lucene so maybe I'm missing something, but these
ques
Hi All,
Is there a way to get a text summary of an indexed document to display
along with the search result?
Please let me know the technical changes.
Thanks,
Ravinder
Ah, you didn't mention term vectors. What do you need them for?
Perhaps a bit more background could help here.
-Grant
On Feb 27, 2008, at 1:31 PM, fangz wrote:
I implemented HitCollector as you suggested. It improved the initial run
significantly. However it only showed slight improveme
Hi List,
I have a situation similar to indexing a mailing list, with each mail
indexed as a Doc. Mails from a same thread share a same thread ID, which is
indexed in a separate field. Now I want to search through all the mails
using some keywords, and list all the unique thread IDs which I can pas
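One possible way to get the unique thread IDs, sketched in plain Java. It assumes the thread ID of every matching mail has already been read out in rank order (for example while collecting hits); `UniqueThreadIds` and `inRankOrder` are hypothetical names, not Lucene API.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
final class UniqueThreadIds {
    // Input: one thread ID per matching mail, best hit first.
    // Output: each thread ID once, in order of its best-ranked mail.
    static List<String> inRankOrder(List<String> threadIdPerHit) {
        // LinkedHashSet drops repeats but keeps first-insertion order.
        Set<String> seen = new LinkedHashSet<String>(threadIdPerHit);
        return new ArrayList<String>(seen);
    }
}
```

A plain HashSet would also deduplicate, but it would lose the ranking, which is usually what you want to keep when listing threads.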
I work with Lucene 2.0. I boost some documents:
Document doc = new Document();
// adding fields
doc.setBoost(2.0f);
indexwriter.addDocument(doc);
If I look at my index with Luke (0.6), the boost value of all documents
is still 1.0.
How can I boost documents?
Thanks. Sören
Simon Wistow wrote:
On Wed, Feb 27, 2008 at 09:38:55AM -0500, Michael McCandless said:
When you previously saw corruption was it due to an OS or machine
crash (or power cord got pulled)? If so, you were likely hitting
LUCENE-1044, which is fixed on the trunk version of Lucene (to be 2.4
at s
You need to make sure your storage does not lie in response to an fsync
command. If it does (most commercial stuff does), you cannot guarantee no
corruption. Search Google for "your harddrive lies to you" or something.
It shouldn't be that hard to take the patch from the issue and apply it
to a
On Wed, Feb 27, 2008 at 09:38:55AM -0500, Michael McCandless said:
>
> When you previously saw corruption was it due to an OS or machine
> crash (or power cord got pulled)? If so, you were likely hitting
> LUCENE-1044, which is fixed on the trunk version of Lucene (to be 2.4
> at some point) but
I implemented HitCollector as you suggested. It improved the initial run
significantly. However it only showed slight improvement in the subsequent
runs. I don't know how to implement FieldSelector in my situation. My code
looks like this:
public void collect( int doc, float score ) {
TermFr
Hey everybody,
As the subject says, I have a little problem with analyzing the
explain() output.
I know that the fieldnorm value consists of "documentboost, fieldboost
and lengthNorm".
Is it possible to retrieve the individual values? I know that they are multiplied
while indexing but
can t
The first question is always "how much memory are you giving
your JVM?".
A 256M index is pretty small, I wouldn't be surprised if your JVM is using
some very small default
Best
Erick
On Wed, Feb 27, 2008 at 6:23 AM, GURUPRASAD MS <[EMAIL PROTECTED]>
wrote:
> Lucene Index contains 2.1 Milli
To reinforce Grant's comment, lazy loading improved one situation for me
on the order of 10X. I wrote it up and it's somewhere in the Wiki. Your
results will vary, and unless you have a LOT of stored fields I wouldn't
necessarily expect a similar speedup, but it's sure worth looking at.
And don't
When you previously saw corruption was it due to an OS or machine
crash (or power cord got pulled)? If so, you were likely hitting
LUCENE-1044, which is fixed on the trunk version of Lucene (to be 2.4
at some point) but is not fixed in 2.3.
If that is what you were hitting, then unfortunately n
Lucene Index contains 2.1 Million records (indexed from 2.1 million records
from sqlserver DB).
Lucene Index file Size 256MB
Lucene version: 2.3
Searching works fine when we sort the results on a single field. However, if
the search results are sorted on more than one field we get Out of Memory
exc
I currently have a setup that indexes into RAM and then periodically
merges that into a disk-based index.
Searches are done from the disk based index and deletes are handled by
keeping a list of deleted documents, filtering out search results and
applying the deletes to the index at merge tim
I'm wondering if your date field's precision may be a little too
much? What I mean is that you are going all the way down to
seconds. Whenever you do a range query you are essentially spawning
a BooleanQuery with a representation of that range. Do you really
need to be that precise? I u
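To make the precision point concrete, here is a small plain-Java sketch that counts how many distinct terms a set of dates produces at a given resolution. Fewer distinct terms in the range means fewer clauses for the range to expand into. `DatePrecisionSketch` and `distinctTerms` are hypothetical names, and the integer bucketing is only a stand-in for however the date field is actually encoded.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
final class DatePrecisionSketch {
    // Counts distinct terms when timestamps (seconds since the epoch) are
    // truncated to buckets of bucketSeconds, e.g. 1 = seconds, 86400 = days.
    static int distinctTerms(long[] epochSeconds, long bucketSeconds) {
        Set<Long> terms = new HashSet<Long>();
        for (long s : epochSeconds) {
            terms.add(s / bucketSeconds); // integer division drops the finer precision
        }
        return terms.size();
    }
}
```

With millions of documents stamped to the second, nearly every document can carry its own term, while day resolution caps the count at one term per calendar day.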
You could also look at the FieldSelector when getting the Document.
Such that you only load the one field you need
-Grant
On Feb 26, 2008, at 10:13 PM, Mark Miller wrote:
The Lucene prime directive: don't iterate through all of Hits! It's
horribly inefficient. You must use a HitCollector. Ev
h t a écrit :
I guess you can implement createBitSet() more efficiently by using
Filer, but not BooleanQuery
Hi,
thanks for the advice, but did you mean Filter or Filer? And even if I
should use a Filter, I don't really understand how to replace the
Boolean query :(
The boolean query is already very
25 matches