Hi John,
In the case of deletions, it is just a delayed delete. In other words, the doc
is just marked as deleted in the deletables file, leaving a void in the
numbering of docs. The actual shifting of document ids happens only when you
optimize the index. In that case the deletables file is used to ph
Thanks Mike for your valuable time.
Regards,
Aditi
On Thu, Jul 10, 2008 at 5:36 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:
>
> Yes you must delete the entire document and then re-index a new one, to
> update a single Field.
>
> There is some work underway, or at least a Jira issue opened
Guys (and Gals),
A question on index deletions: what exactly happens to the Lucene document
numbers in an index when a document is deleted? Let's say I have a 5 doc
index.
Document # Doc
0 doc1
1
Chris,
-Original Message-
From: Chris Bamford [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 10, 2008 9:15 AM
To: java-user@lucene.apache.org
Subject: Re: newbie question (for John Griffin) - fixed
Hi John,
Please ignore my earlier questions on this subject, as I have got to the
bottom
Chris,
The code you refer to in the blog is 5 years old! Some of the code is no
longer valid with the newer Lucene jars. I wouldn't use it to test anything.
My suspicion is that your index itself is suspect. Let's see the code you
use to build the index with a small data set that will show what
Well, according to him, using the reader to access the index every time a
document is found, in order to retrieve certain values, is inefficient. Meaning
if there are 500k documents, the index will be accessed 500k times. It might
affect the performance of the search.
So I am instructed to retrieve all the neces
On Jul 10, 2008, at 1:42 AM, blazingwolf7 wrote:
Well, I am trying to extract the URL and contentLength from the ".fdt" file.
I am planning to use both of these values in a filter to remove certain links
from the displayed search results. The problem is, I am told not to
use the IndexR
Thanks. I think I will follow the advice. But just for the sake of curiosity,
can what I suggested be done?
Yonik Seeley wrote:
>
> On Thu, Jul 10, 2008 at 1:42 AM, blazingwolf7 <[EMAIL PROTECTED]>
> wrote:
>> Well, I am trying to extract the URL and contentLength from the ".fdt"
>> file.
>> I a
Hi,
I have individual indexes for Audio, Image and PDF files. We build common
meta fields for them. When I search for a string, I want the search by default
to return mixed search results from these 3 different indexes based on relevancy.
But I also want to know the hit count for each individual in
I need to perform a query for a term that may or may not have values,
and I need to check for the conditions where either no terms are
indexed OR any and ALL indexed terms match a wildcard.
For example, say the following values were indexed as terms in the
field "myfield" in the three docum
: I have a remote MultiSearcher, bound with
: Naming.bind("rmi://"+IP+":"+PORT+"/"+NAME, RemoteSearchable)
: , but MultiSearcher doesn't have getIndexReader().
: How do I get an IndexReader?
It's not possible to get a remote IndexReader ... that's the main
distinction between the Searchable interf
: But how does the built-in STRING sort work with null values then? And how do
: I make a SortComparator that works?
Built in string sorting uses FieldCache.DEFAULT.getStringIndex() ... any
doc without a value ends up without an assignment in StringIndex.order[],
so it gets the default value o
I may take a crack at this. Any more thoughts you may have on the
implementation are welcome, but I don't want to distract you too much.
Thanks,
Peter
On Thu, Jul 10, 2008 at 1:30 PM, Grant Ingersoll <[EMAIL PROTECTED]>
wrote:
> Makes sense. It was always my intent to implement things like
> P
Makes sense. It was always my intent to implement things like
PayloadNearQuery, see http://wiki.apache.org/lucene-java/Payload_Planning
I think it would make sense to develop these and I would be happy to
help shepherd a patch through, but am not in a position to generate
said patch at thi
On Jul 9, 2008, at 10:14 PM, Chris Hostetter wrote:
I'm going to guess you have a doc where that field doesn't have a value.
Ordinarily that's fine, but maybe SortComparator doesn't handle that case
very well.
But how does the built-in STRING sort work with null values then? And
how do I
On Thu, Jul 10, 2008 at 11:13 AM, Beard, Brian <[EMAIL PROTECTED]> wrote:
> Question: If autoCommit is false, does this apply to optimization also,
> so that during an hour-long optimization that gets killed in the middle,
> will the index be left in the initial state before optimization
> s
Hi John,
Please ignore my earlier questions on this subject, as I have got to the
bottom of it.
I was not passing each word in the phrase as a separate Term to the
query; instead I was passing the whole string (doh!).
Thanks.
- Chris
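The fix described above (one Term per word, rather than one Term holding the whole string) can be sketched roughly as follows. This is only an illustration: the class name `PhraseTerms`, the field name "body", and the whitespace-plus-lowercase splitting are assumptions standing in for whatever analyzer the index was actually built with. The Lucene calls appear only in a comment so that the block runs without the Lucene jar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: breaks a phrase into the words that would each become a Term. */
public class PhraseTerms {

    /**
     * Splits on whitespace and lowercases each word.
     * Assumption: this mimics a simple analyzer; a real index may tokenize differently.
     */
    public static List<String> split(String phrase) {
        List<String> words = new ArrayList<>();
        for (String w : phrase.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // With Lucene 2.x on the classpath, each word would be added separately:
        //   PhraseQuery query = new PhraseQuery();
        //   for (String w : PhraseTerms.split(phrase)) {
        //       query.add(new Term("body", w));   // "body" is a made-up field name
        //   }
        // Passing the whole string as ONE Term would only match a single
        // indexed token equal to the entire phrase, which normally never exists.
        System.out.println(PhraseTerms.split("Fred Flintstone"));
    }
}
```

The words must go through the same processing that was applied at index time, otherwise the terms will not line up with what is stored in the index.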
Chris Bamford wrote:
Hi John,
Further to my question be
Currently the default setting is being used with our setup, so
autoCommit is true. I'll set this to false to see if it improves.
Question: If autoCommit is false, does this apply to optimization also,
so that during an hour long optimization that gets killed in the middle,
will the index be in the
On Thu, Jul 10, 2008 at 1:42 AM, blazingwolf7 <[EMAIL PROTECTED]> wrote:
> Well, I am trying to extract the URL and contentLength from the ".fdt" file.
> I am planning to use both of these values in a filter to remove certain
> links from the displayed search results. The problem is, I am told not
Why does SubversionUpdate require shutting down the IndexSearcher?
What goes wrong?
You might want to switch instead to rsync.
A Lucene index is fundamentally write once, so, syncing changes over
should simply be copying over new files and removing now-deleted
files. You won't be able
Suppose I create a SpanNearQuery phrase with the terms "long range missiles"
and some slop factor. Each term is actually a BoostingTermQuery. Currently,
the score computed by SpanNearQuery.SpanScorer is based on the sloppy
frequency of the terms and their weights (this is fine). But even though
eac
I'm not fully following what you want. Can you explain a bit more?
Thanks,
Grant
On Jul 9, 2008, at 2:55 PM, Peter Keegan wrote:
If a SpanQuery is constructed from one or more BoostingTermQuery(s), the
payloads on the terms are never processed by the SpanScorer. It seems to me
that you wou
Hi.
We are currently using Lucene 2.3.2 in a Tomcat webapp. We have an action
configured that performs reindexing on our staging server. However, our live
server cannot reindex since it does not have the necessary DTD files to
process the XML.
To update the index on the live server we perform a subvers
Yes, the term frequency vector is exactly what I needed. Thanks!
-James
Ajay Lakhani wrote:
>
> Hi James,
>
> Try this:
>
> Searcher searcher = new IndexSearcher(dir);
> QueryParser parser = new QueryParser("content", new
> StandardAnalyzer());
> Query query = parser.parse(queryS
Yes you must delete the entire document and then re-index a new one,
to update a single Field.
There is some work underway, or at least a Jira issue opened, towards
improving this situation, here:
https://issues.apache.org/jira/browse/LUCENE-1231
But it will be some time before that'
Hi John,
Further to my question below, I did some back-to-basics investigation of
PhraseQueries and found that even basic ones fail for me...
I found the attached code on the Internet (see
http://affy.blogspot.com/2003/04/codebit-examples-for-all-of-lucenes.html)
and this fails too... Can you
Hi,
I want to modify a field in the current index. Can it be done?
From what I have heard, we cannot update the index. It has to be
reindexed by deleting and then indexing again.
Thanks,
Aditi
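As a rough illustration of the delete-then-re-add behaviour discussed in this thread, here is a toy, in-memory model. It is not Lucene's implementation: the class `ToyIndex` and all of its methods are made up for the sketch. It only mirrors the behaviour the replies describe, namely that a delete just marks the document, an update is a delete followed by an add, and document numbers shift only when the index is optimized.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of delayed deletes. Hypothetical; NOT how Lucene stores an index. */
public class ToyIndex {
    private final List<String> docs = new ArrayList<>();      // list position = document number
    private final List<Boolean> deleted = new ArrayList<>();  // true = marked as deleted

    /** Appends a document and returns its document number. */
    public int add(String doc) {
        docs.add(doc);
        deleted.add(false);
        return docs.size() - 1;
    }

    /** Marks a document as deleted; the numbering of other documents is unchanged. */
    public void delete(int docNum) {
        deleted.set(docNum, true);
    }

    /** "Updates" a document by deleting it and adding the new version at the end. */
    public int update(int docNum, String newDoc) {
        delete(docNum);
        return add(newDoc);
    }

    /** Returns the documents that are not marked as deleted, in number order. */
    public List<String> liveDocs() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                live.add(docs.get(i));
            }
        }
        return live;
    }

    /** Drops the marked documents, so the remaining ones are renumbered from 0. */
    public void optimize() {
        List<String> live = liveDocs();
        docs.clear();
        deleted.clear();
        for (String d : live) {
            add(d);
        }
    }

    /** Number of slots, including those marked as deleted. */
    public int maxDoc() {
        return docs.size();
    }

    /** Number of documents not marked as deleted. */
    public int numDocs() {
        return liveDocs().size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        for (int i = 1; i <= 5; i++) {
            index.add("doc" + i);
        }
        index.delete(1);                       // leaves a void at number 1
        System.out.println(index.maxDoc() + " slots, " + index.numDocs() + " live");
        index.optimize();                      // numbers shift only now
        System.out.println(index.maxDoc() + " slots, " + index.numDocs() + " live");
    }
}
```

In real code, Lucene 2.x offers IndexWriter.updateDocument(Term, Document), which performs the delete and the add in one call; the whole document is still replaced, not a single Field.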
Hi John,
Just continuing from an earlier question where I asked you how to handle
strings like "from:fred flintston*" (sorry I have lost the original email).
You advised me to write my own BooleanQuery and add to it Prefix- /
Term- / Phrase- Querys as appropriate. I have done so, but am having
Hi
Is it possible to highlight more than one term with the highlighter, but
with a different style for each term?
1st term with SimpleHTMLFormatter("", "");
2nd term with SimpleHTMLFormatter("", "");
..
n-th term with SimpleHTMLFormatter("", "");
or for the following code
SimpleHTMLFormatter
Hi James,
Try this:
Searcher searcher = new IndexSearcher(dir);
QueryParser parser = new QueryParser("content", new StandardAnalyzer());
Query query = parser.parse(queryString);
HashSet queryTerms = new HashSet();
query.extractTerms(queryTerms);
Hits hits = searcher.sear