Thanks to everyone who commented. Clearly, I have a lot to think about,
but thanks for the help.
Scott
-----Original Message-----
From: Rob Staveley (Tom) [mailto:[EMAIL PROTECTED]
Sent: Friday, July 07, 2006 2:53 PM
To: java-user@lucene.apache.org
Subject: RE: Managing a large archival (and co
> dan2000 <[EMAIL PROTECTED]> wrote on 07/07/2006 15:03:35:
> But if you remove it and add it again, you'll need to re-index it,
> won't you? When you do re-index, you'll have to close the reader, which
> would pause the search. Is there any better way of doing it?
IMHO yes and no -
There's no need
But if you remove it and add it again, you'll need to re-index it,
won't you? When you do re-index, you'll have to close the reader, which
would pause the search. Is there any better way of doing it?
--
View this message in context:
http://www.nabble.com/modify-existing-non-indexed-field-tf1905726.
We only display 10 hits at a time, so we don't need to iterate through all
the hits.
It feels like there should be a way to pull a document out of one index and
stick it into another, bringing all the unstored fields along with it.
On Friday 07 July 2006 12:52, Erick Erickson wrote:
> Did you
Aha, OK that makes sense. Likewise James Pine's explanation. Thanks both of
you.
-----Original Message-----
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: 07 July 2006 20:40
To: java-user@lucene.apache.org
Subject: RE: Managing a large archival (and constantly changing) database
: How ca
Did you use a Hits object to assemble your results? And is that what you're
measuring when you say it's slow? In other words, were you measuring the
time it took to execute the statement
Hits hits = searcher.search(query, new Sort("fullname"));
or the time it took to iterate over the Hits object
: How can that be so? When the segments file is re-written it will surely
: clobber the copy rather than creating a new INODE, because it has the same
: name... wouldn't it?
if you take a look at SegmentInfos.java you'll see that an existing
segments file is never modified. a new segments file i
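The inode question above can be tried outside Lucene. The following is only a minimal sketch using plain java.nio, under the assumption that the writer follows a write-then-rename pattern: it hard-links a file into a snapshot directory (the effect of 'cp -l'), writes the new contents under a temporary name, and renames that over the live name. The class name, the file contents, and the temporary name "segments.new" are illustrative choices, not taken from SegmentInfos.java.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical demo class; it does not use or reproduce any Lucene code.
final class SegmentsSnapshotDemo {

    /**
     * Replaces a hard-linked file by rename and returns
     * { contents of the live file, contents of the snapshot link }.
     */
    static String[] replaceAndRead() {
        try {
            Path dir = Files.createTempDirectory("index");
            Path snapshotDir = Files.createDirectory(dir.resolve("snapshot"));

            // The "live" file that searchers would open.
            Path live = dir.resolve("segments");
            Files.write(live, "generation 1".getBytes(StandardCharsets.UTF_8));

            // Same effect as 'cp -l': a second directory entry for the same inode.
            Path snapshot = snapshotDir.resolve("segments");
            Files.createLink(snapshot, live);

            // Never modify the existing file: write a new one, then rename it
            // over the old name. The old inode stays reachable via the link.
            Path pending = dir.resolve("segments.new");
            Files.write(pending, "generation 2".getBytes(StandardCharsets.UTF_8));
            Files.move(pending, live, StandardCopyOption.ATOMIC_MOVE);

            return new String[] {
                new String(Files.readAllBytes(live), StandardCharsets.UTF_8),
                new String(Files.readAllBytes(snapshot), StandardCharsets.UTF_8)
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] contents = replaceAndRead();
        System.out.println("live:     " + contents[0]);
        System.out.println("snapshot: " + contents[1]);
    }
}
```

Because the rename only changes which inode the name points to, the hard-linked copy keeps the old contents. Overwriting the file in place would instead change both names at once, which is the clobbering the question worries about. Hard links need a filesystem that supports them.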
Thank you so much. I apologize for my ignorance.
Mark
On 7/7/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: > But ParseException extends IOException, so I don't see a problem there.
: I wish my compiler agreed with you:) Which it seems to do until you
: rebuild the files with javacc. I saw
: > But ParseException extends IOException, so I don't see a problem there.
: I wish my compiler agreed with you:) Which it seems to do until you
: rebuild the files with javacc. I saw at least two other posts about this
: problem on the web with no answer given...
: This guy also found the same
Daniel Naber wrote:
On Freitag 07 Juli 2006 16:20, Mark Miller wrote:
the javacc generated StandardTokenizer next() method is declared to
throw a ParseException
final public org.apache.lucene.analysis.Token next() throws
ParseException, IOException {
unfortunately, org.apache.lucene.anal
On Freitag 07 Juli 2006 16:20, Mark Miller wrote:
> the javacc generated StandardTokenizer next() method is declared to
> throw a ParseException
>
> final public org.apache.lucene.analysis.Token next() throws
> ParseException, IOException {
>
> unfortunately, org.apache.lucene.analysis.Token nex
Heh, you said it better than I. I was just about to reply with the
witty "Nutch is Lucene, isn't it?"
Jeff
-----Original Message-----
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Friday, July 07, 2006 10:28 AM
To: java-user@lucene.apache.org
Subject: Re: Nutch- Better than Lucene?
:
There was a thread discussing the performance differences just recently...
http://www.nabble.com/forum/Search.jtp?forum=44&local=y&query=MultiReader+MultiSearcher
: Date: Fri, 7 Jul 2006 16:34:08 +0100
: From: Mike Streeton <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java
: Subject: Nutch- Better than Lucene?
: > http://wiki.apache.org/nutch/Nutch_-_The_Java_Search_Engine
Asking if Nutch is better than Lucene is like asking if a Truck is better
than a Combustion Engine -- you can't compare them. A truck is a vehicle
that does stuff, and it gets its power from a
Thanks Jo. You may want to look for Andi Vajda's email with performance
numbers, too. I think he did send them out when he first contributed
DbDirectory, and I don't recall the numbers being this bad.
Otis
----- Original Message -----
From: Johannes Christen <[EMAIL PROTECTED]>
To: java-user@l
Yes, you can do something like that, but of course you have to delete the old
Document and add the new, modified one to the index, too. I do something like
that on one of the Simpy.com indices and it works nicely.
Otis
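As a rough illustration of the delete-then-add pattern described above, here is a toy in-memory model in plain Java. It is only a sketch: ToyIndex and all of its methods are hypothetical names invented for this example, not Lucene API, and the "id" key field is an assumption (the approach needs some field that identifies the document to replace).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of an append-only index; none of these names are Lucene API.
final class ToyIndex {
    // The slot number plays the role of an internal document number.
    private final List<Map<String, String>> docs = new ArrayList<>();
    private final List<Boolean> deleted = new ArrayList<>();

    /** Convenience builder: doc("id", "42", "address", "1 Old Road"). */
    static Map<String, String> doc(String... keyValues) {
        Map<String, String> fields = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            fields.put(keyValues[i], keyValues[i + 1]);
        }
        return fields;
    }

    /** True if slot i is live and its field equals the given value. */
    private boolean matches(int i, String field, String value) {
        return !deleted.get(i) && value != null && value.equals(docs.get(i).get(field));
    }

    /** Appends a copy of the document and returns its slot number. */
    int add(Map<String, String> fields) {
        docs.add(new HashMap<>(fields));
        deleted.add(Boolean.FALSE);
        return docs.size() - 1;
    }

    /** Marks every live document whose field equals value as deleted. */
    int delete(String field, String value) {
        int count = 0;
        for (int i = 0; i < docs.size(); i++) {
            if (matches(i, field, value)) {
                deleted.set(i, Boolean.TRUE);
                count++;
            }
        }
        return count;
    }

    /** "Update": delete the old version by key, then add the modified one. */
    int update(String keyField, Map<String, String> replacement) {
        delete(keyField, replacement.get(keyField));
        return add(replacement);
    }

    /** Returns copies of the live matches; editing a copy never changes the index. */
    List<Map<String, String>> find(String field, String value) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (matches(i, field, value)) {
                hits.add(new HashMap<>(docs.get(i)));
            }
        }
        return hits;
    }
}
```

In this model an update never edits a slot in place. The old slot is flagged as deleted and the replacement lands in a fresh slot at the end, so the document's position changes. Setting a value on a document returned by find() has no effect on what is stored.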
----- Original Message -----
From: dan2000 <[EMAIL PROTECTED]>
To: java-user
> When you say you keep your documents ordered alphabetically, it's confusing
> to me. Are you saying that you pre-sort all your documents then insert them
> one after another so that automatically-generated internal Lucene ID maps
> exactly to the alphabetical ordering? That is, for any document I
--- "Rob Staveley (Tom)" <[EMAIL PROTECTED]> wrote:
> Doug says:
>
> > 1. On the index master, periodically checkpoint
> the index. Every minute or
> so the IndexWriter is closed and a 'cp -lr index
> index.DATE' command is
> executed from Java, where DATE is the current date
> and time. This
> e
What performs best across multiple indexes:
Each index with an IndexReader with an IndexSearcher on top and the
searchers linked with a ParallelMultiSearcher
Or
Each index with an IndexReader linked with a MultiReader and an
IndexSearcher on top
Many Thanks
Mike
www.ardentia
When you say you keep your documents ordered alphabetically, it's confusing
to me. Are you saying that you pre-sort all your documents then insert them
one after another so that automatically-generated internal Lucene ID maps
exactly to the alphabetical ordering? That is, for any document IDs D1 a
Hi,
Can somebody explain the lengthNorm, queryNorm and coord in Lucene?
Is lengthNorm (term freq)/(total number of terms), or (term freq)/(max term
freq), or something else? Is queryNorm (term squared
weight)/(sumOfSquaredWeights)? Why do we still need queryNorm when it will not
affect the score for
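For reference, the three factors asked about can be written out in a few lines. This is a minimal sketch of the formulas as I understand Lucene's DefaultSimilarity to define them, reimplemented in plain Java rather than calling Lucene; the class name ScoringFactors is made up for the example, and it ignores boosts and the lossy one-byte encoding Lucene applies when it stores norms in the index.

```java
// Hypothetical helper class; it mirrors the default formulas but is not Lucene code.
final class ScoringFactors {

    /** Field length normalization: shorter fields score higher. numTerms = terms in the field. */
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    /** Query normalization: the same constant is applied to every hit of one query. */
    static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    /** Coordination factor: fraction of the query's terms that the document matched. */
    static double coord(int overlap, int maxOverlap) {
        return (double) overlap / maxOverlap;
    }

    public static void main(String[] args) {
        System.out.println("lengthNorm(100 terms)  = " + lengthNorm(100));
        System.out.println("queryNorm(sumSq = 4.0) = " + queryNorm(4.0));
        System.out.println("coord(2 of 3 terms)    = " + coord(2, 3));
    }
}
```

So lengthNorm depends only on the number of terms in the field, not on the frequency of any one term. Since queryNorm is identical for every document matched by a given query, it cannot change their ranking; its purpose, as far as I can tell, is to make scores from different queries more comparable.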
All,
I sent this the other day, but didn't get any responses. I'm hoping that it
was just missed, so I'm trying again.
There has to be a better way to insert a document into an index than
reindexing everything.
--Jason
On Wednesday 05 July 2006 5:06 pm, Jason Calabrese wrote:
> All,
>
>
I have added support for sent/para prox search by modifying the notspan
query. In doing so I have changed the standard analyzer javacc .jj file.
Here is my problem:
the javacc generated StandardTokenizer next() method is declared to throw a
ParseException
final public org.apache.lucene.analysis
I don't think you've done anything to the index. This code is really
equivalent to something like
Field field = hits.doc(i).getField("address");
field.set("11 Diana Street");
You've changed the value of the field instance, but that is essentially a
local variable (even though not explicit in you
Is it possible to modify a field that is stored but not indexed? For example,
if I have a field like this:
new Field("address", address, Field.Store.YES, Field.Index.NO)
and I want to modify it like this:
hits.doc(i).getField("address").set("11 Diana Street");
Is it possible?
--
View this message in co
Hi,
On Friday 07 July 2006 12:23 mark harwood wrote:
> Out of interest, why are you using a RAMDirectory here? An IndexWriter uses
> one internally of size IndexWriter.setMaxBufferedDocs so you get the
> benefits of buffering automatically when writing to a File-based directory.
Really? I read the
The answer is because addIndexes() currently always does an optimize
post-merge. If I recall correctly optimize() will create a complete copy of the
existing index during the optimize process then delete the old one so this
shouldn't be done too often.
Out of interest, why are you using a RAMDi
Hi,
I use the following code to index about 1 million documents into an empty index:
=
private static void do_searchindex(Connection target) throws
SQLException,IOException {
int i=1164;
PostIndexer.createIndexDir(); //Creates Index-Director
I have written a paper about Topic Detection and Tracking, where I also
explain the TF-IDF scheme. If you like, I can send you the paper.
Aleksander
On Fri, 07 Jul 2006 04:46:52 +0200, Rajiv Roopan <[EMAIL PROTECTED]>
wrote:
Hello,
I was recently looking through the Lucene in Action book
I should probably direct this to Doug Cutting, but following that thread I
come to Doug's post at
http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg12709.html .
Doug says:
> 1. On the index master, periodically checkpoint the index. Every minute or
so the IndexWriter is closed and a
> http://wiki.apache.org/nutch/Nutch_-_The_Java_Search_Engine