Simon,
In this example I've set the DEFAULT_MAX_THREAD_STATES of
DocumentsWriterPerThreadPool to 1. I've debugged the code and made sure
that ThreadAffinityDocumentsWriterThreadPool has the value set to 1 (as I
was trying to make it behave similarly to Lucene 3.4, using a single thread).
I'm ind
I'm running some performance tests on bulk indexing with Lucene 4.0 and I'm
seeing weird results. I've read
http://www.gossamer-threads.com/lists/lucene/java-dev/127190?do=post_view_threaded#127190
but I'm still having doubts.
I'm building a 1 GB index containing 1 million docs. When building the
I've read in another thread
(http://lucene.472066.n3.nabble.com/Indexing-slower-in-trunk-td3059836.html#a3062991)
/Since Lucene 2.9, Lucene works on a per segment basis when searching. Since
Lucene 3.1 it can even parallelize on multiple segments. If you optimize
your index you only have one segm
Hey there,
I have a question about the behaviour of IndexReader.reopen.
I have a Tomcat server holding a Lucene index through an IndexSearcher. If I
move the index.folder to index.folder.old and another index, let's say
index.folder.2, to index.folder and then reopen the readers, something weird
happens if
Thanks, that clarifies things. As far as I understand, if I have to end up
optimizing the index right after merging it, it doesn't matter whether I use
the Lucene 3.x addIndexes or addIndexesNoOptimize, since the total time for
both steps will be the same either way. Am I right?
Thanks a lot Shai, a couple of questions:
>> In Lucene 3x there is a new addIndexes which accepts Directory… that
>> simply registers the new indexes in the index, without running merges.
>> That makes addIndexes very fast.
With the Lucene 3.x addIndexes which accepts Directory, if after the mer
I am running some tests on index merging and have a performance question.
I am doing the merge in a simple way, something like:
FSDirectory indexes[] = new FSDirectory[indexList.size()];
for (int i = 0; i < indexList.size(); i++) {
indexes[i] = FSDirectory.open(new File(indexList
elevant changes for 3.x...
>
> I'm pretty sure that your supposition 2 is the right one.
>
> HTH
> Erick
>
> On Tue, Mar 16, 2010 at 2:58 PM, Marc Sturlese
> wrote:
>
>>
>> I would like to know how Lucene deals with the score on multiValued
>> fi
I would like to know how Lucene deals with the score on multiValued fields.
I am wondering whether:
1) a score is computed per field and the maximum among them wins
or
2) all terms of all fields (from the multivalued field) influence each other
to compute the score
Let's say I have a document with a m
Thanks Hoss for the useful info.
According to the coord(q,d) definition, it is calculated at the document
level. It says:
is a score factor based on how many of the query terms are found in the
specified document
If I am just searching for a term, "ipod" in this case, how would coord be
computed? Would i
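The quoted coord(q,d) definition can be illustrated with a small sketch. It assumes the formula documented for Lucene's DefaultSimilarity, overlap / maxOverlap, and uses a hypothetical class name (CoordSketch) that is not part of Lucene. Under that formula, a query with a single term that matches gives 1/1 = 1, so the factor would leave the score unchanged:

```java
// Illustrative sketch only: mirrors the coord(q,d) formula documented for
// Lucene's DefaultSimilarity (overlap / maxOverlap). CoordSketch is a
// hypothetical name, not a Lucene class.
final class CoordSketch {

    // overlap:    number of query terms found in the document
    // maxOverlap: total number of terms in the query
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        // Single-term query ("ipod") whose term is found: 1 of 1 terms.
        System.out.println(coord(1, 1));
        // Three-term query where only two of the terms are found.
        System.out.println(coord(2, 3));
    }
}
```

The factor only becomes interesting for multi-term queries, where documents matching more of the terms are rewarded.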
Hey there,
If I want to search for, let's say, "ipod" in three different fields (device,
sound, technology),
would using a DisjunctionMaxQuery with tie breaker = 1 be the same as
using a MultiFieldQueryParser with OR to build the boolean queries?
As far as I understood from the API documentation
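To make the comparison in this question concrete, here is a rough arithmetic sketch, not Lucene code. It assumes the documented DisjunctionMaxQuery behaviour (maximum sub-score plus the tie breaker times the sum of the other sub-scores) and the classic BooleanQuery behaviour (sum of matching sub-scores multiplied by coord), and it ignores query normalization. The class and method names are hypothetical:

```java
// Arithmetic sketch under stated assumptions; DisMaxSketch is hypothetical
// and does not call Lucene. A sub-score of 0 means "field did not match".
final class DisMaxSketch {

    // DisjunctionMaxQuery-style score: max + tieBreaker * (sum of the rest).
    static float disMax(float[] subScores, float tieBreaker) {
        float max = 0f;
        float sum = 0f;
        for (float s : subScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreaker * (sum - max);
    }

    // Boolean OR-style score: sum of matching sub-scores times
    // coord = matchedClauses / totalClauses.
    static float booleanOr(float[] subScores) {
        float sum = 0f;
        int matched = 0;
        for (float s : subScores) {
            if (s > 0f) {
                sum += s;
                matched++;
            }
        }
        return sum * matched / (float) subScores.length;
    }

    public static void main(String[] args) {
        // "ipod" matches in device and sound but not in technology.
        float[] partial = {0.5f, 0.25f, 0f};
        System.out.println(disMax(partial, 1f));
        System.out.println(booleanOr(partial));
    }
}
```

With tie breaker = 1 the DisMax-style score reduces to the plain sum of the sub-scores, so under these assumptions the two approaches agree only when every field matches; when some fields do not match, coord lowers the boolean score.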
I have FastVectorHighlighter working with a query like:
title:Ipod OR title:IPad
but it's not working (0 snippets are returned) with:
title:Ipod OR content:IPad
Could this be because the query used to create the FieldQuery must
have just one field?
If that's not the case I may be missing
I am running some tests with optimize and adding segments, and I am wondering
whether what I am doing can cause document inconsistency.
I have 2 folders with one index each. One has a non-optimized index1 with 1
million docs and a mergeFactor=10. The other one, index2, has the same index
op
Hey there,
Until now, using Lucene 2.4, I was always optimizing my index using
compound file after updating it. I was doing that because otherwise I
noticed a big performance loss in search responses.
Now in Lucene 2.9 there are per-segment readers and I have read something
about it performs b
Hey there, I am iterating over a DocSet and for every id I need to get the
value of a field which is analyzed with KeywordAnalyzer and is not stored.
I have noticed two ways of doing it using FieldCache. Can someone please
explain the pros and cons of using one or the other?
Using StringIndex:
/solr/search/MissingStringLastComparatorSource.html
From there I try to link to org.apache.lucene.search.FieldComparatorSource
but get a 404 error.
Any idea how can I get access to that documentation?
Thanks in advance!
Michael McCandless-2 wrote:
>
> On Fri, Jun 12, 2009 at 6:09 PM,
Hey there,
I have noticed I am experiencing a sort of memory leak with a
CustomComparatorSource (which implements SortComparatorSource).
I have a HashMap declared as a class variable in CustomComparatorSource:
final HashMap docs_to_modify
This HashMap contains ids of documents and priorities use
if used) are discarded (have no effect).
>
> Mike
>
> On Mon, Apr 6, 2009 at 4:01 PM, Marc Sturlese
> wrote:
>>
>> Hey there,
>> Does the function doc.setBoost(x.y) accept negative values or values less
>> than 1? I mean... it compiles and doesn't give e
Hey there,
Does the function doc.setBoost(x.y) accept negative values or values less
than 1? I mean... it compiles and doesn't give errors, but the behaviour is
not exactly what I was expecting.
In my use case I have the field title... I want to give very very low
relevance to the documents witch t
> processed in bulk when the deletes are flushed. So at the time of that
> call, IndexWriter does not know how many documents were affected by
> the delete.
>
> But why do you need to check this in the first place? EG searching
> will never return to you a deleted do
Hey there,
I would like to know how to check whether a document has been deleted if I am
using an IndexWriter and the functions deleteDocument or updateDocument.
I have seen that deleteDocument from IndexReader returns an integer, but in
the IndexWriter's case it's a void.
Any advice?
Thanks in advance
>
>
> On Nov 30, 2008, at 11:11 AM, Marc Sturlese wrote:
>
>>
>> Hey there,
>> I have a simple question about boosting fields,
>> I have a lucene indexer app that indexes data from a db. At indexing
>> time I
>> give different boost to the fields de
Hey there,
I have a simple question about boosting fields,
I have a Lucene indexer app that indexes data from a db. At indexing time I
give a different boost to each field depending on whether it is title or
content. Would it be the same to set the boost at search time instead of
at indexing time? I
Hey there, I have posted about this problem before but I think I didn't
explain myself very well.
I'll try to explain my problem in context:
I get ids from a database and I look for the documents in an index that
correspond to each id. There is just one match for every id. Once I have the
doc
Hey there,
I am having some memory trouble with my Lucene app. I need to get the info
for, and delete, about 1000 docs each time I run the app. I get the IDs of
the documents to delete from a database and for every single ID I get the data
from the indexed doc using an index searcher and topdocs (sea