2013/3/17 Michael McCandless :
> Hi Michael (directly CC'd this time...),
>
> Maybe you're not subscribed to the list? Your first email got some
> responses, eg:
>
> http://lucene.markmail.org/thread/lrv7miivzmjm3ml5
Indeed, he's not; I didn't auto-subscribe him when putting his message
throu
2013/2/28 ash nix :
> Hi,
>
> Can anyone please send me a document on the Lucene 4 index format?
> I want to know the internals of the index.
It is part of the Lucene documentation.
http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/codecs/lucene41/package-summary.html#package_description
--dho
> --
>
2012/12/17 dokondr :
> java-user-subscribe
Sorry, I let this message through forgetting that the allow / accept
addresses just send the message and don't actually subscribe the user.
If you would like to subscribe to the list, please send an email to
java-user-subscr...@lucene.apache.org.
--dho
It seems worth mentioning, in partial response to this thread's topics, that
(almost) regardless of index strategy, Lucene performance hinges on the number
of documents matched per query, not the total number of docs in the index.
There are other mitigating factors (disk type, RAM size, etc.), but worst case
performance
analy
In my experience, reopen will find all changes on an index, whether it was
modified by the same process or not. If you're replicating over a network,
you might need some barrier / lock around the reopen call to make sure the
replicated index is complete. Obviously with something as fickle as a
netw
One way to do this is to create an Analyzer and Tokenizer that are
used on both index and search side. In the tokenStream method, you
return a new normalizing tokenizer; in the Tokenizer, you override the
normalize method to ignore apostrophes.
--dho
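
To illustrate the idea above without depending on a particular Lucene
version, here is a minimal plain-Java sketch. It is not the Lucene
Analyzer/Tokenizer API: the class ApostropheNormalizer and its methods are
hypothetical names, whitespace splitting stands in for a real tokenizer, and
treating the typographic apostrophe (U+2019) like the ASCII one is my own
assumption. The point is only that the same normalization has to run on both
the index side and the search side so the resulting terms match.

```java
import java.util.ArrayList;
import java.util.List;

public class ApostropheNormalizer {
    // Hypothetical helper (not a Lucene API): drops apostrophes from a token.
    static String normalize(String token) {
        StringBuilder sb = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            // Assumption: treat ASCII and typographic apostrophes alike.
            if (c == '\'' || c == '\u2019') {
                continue;
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Splits on whitespace and normalizes each token; use the same method
    // for indexed text and for query text.
    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String raw : text.split("\\s+")) {
            String t = normalize(raw);
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Index side and query side produce the same terms.
        System.out.println(tokenize("O'Dell's index"));
        System.out.println(tokenize("ODells index"));
    }
}
```

In a real setup the normalize step would live inside the Tokenizer that the
shared Analyzer returns, as described above.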
2011/9/12 SBS :
> In our situation we need it
2011/8/30 Joe MA :
> When searching a single collection, no problem. But if I want to search the
> two collections at the same time, I need to know which collection the hit
> came from so I can retrieve the base_path from the database. These
> base_paths can be different. As mentioned, this w
2011/8/29 Uwe Schindler :
> Why do you need to know the subreader? If you want to get the document's
> stored fields, use the MultiReader.
>
> If you really want to know the subreader, use this:
> http://lucene.apache.org/java/3_3_0/api/core/org/apache/lucene/util/ReaderUtil.html#subReader(int,
>
2011/8/29 Joseph MarkAnthony :
> Greetings,
> In the past (Lucene version 2.x) I successfully used
> MultiSearcher.subsearcher() to identify the searchable within a MultiSearcher
> to which a hit belonged.
>
> In moving to Lucene 3.3, MultiSearcher is now deprecated, and I am trying to
> crea
For what it's worth, I've seen this happen too (using the stock Lucene
3.3 Java APIs), but it requires me to index many millions of
documents, and doesn't start being a really big problem until the
indexes get to be closer to 250GB in size. When they reach around 1TB,
it will take around an hour fo
and
perhaps my hackish solution will work for you (if you're not already
doing this). But indeed on searches returning several million records,
it's kind of silly to keep spinning.
Kind regards,
Devon H. O'Dell
> Thanks.
>
> - Chris
>
I have my own collector, but implemented this functionality by running
the search in a thread pool and terminating the FutureTask running the
job if it took longer than some configurable amount of time. That
seemed to do the trick for me. (In my case, the IndexReader is
explicitly opened readonly,
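
A minimal sketch of the thread-pool timeout approach described above, using
only java.util.concurrent. The names TimedSearch and runWithTimeout are
hypothetical, and the Callable stands in for whatever actually runs the
search; this is not the code from the original message. Note that
Future.cancel(true) only interrupts the worker thread, so the search (or the
collector) has to notice the interrupt for the work to stop early.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedSearch {
    // Runs the search on the pool and returns its result, or null if it
    // did not finish within timeoutMillis.
    static <T> T runWithTimeout(ExecutorService pool, Callable<T> search, long timeoutMillis) {
        Future<T> future = pool.submit(search);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Give up on the search; this interrupts the worker thread.
            future.cancel(true);
            return null;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("search failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // A fast "search" completes; a slow one is abandoned.
            Integer fast = runWithTimeout(pool, () -> 42, 1000);
            Integer slow = runWithTimeout(pool, () -> { Thread.sleep(5000); return 1; }, 100);
            System.out.println("fast=" + fast + ", slow=" + slow);
        } finally {
            pool.shutdownNow();
        }
    }
}
```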
2011/4/1 Yogesh Dabhi :
> Hi
>
> Five users concurrently access the same Lucene directory to search documents.
>
> At that point I got the exception below:
>
> org.apache.lucene.store.AlreadyClosedException: this IndexReader is
> closed
>
> Is there a way to handle this error?
Use a ReentrantReadWriteLock aro
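
A minimal sketch of one way to apply a read/write lock here, using
java.util.concurrent.locks.ReentrantReadWriteLock. The message above is cut
off, so this is my assumption of the intent, not the author's code: searches
share the read lock, and replacing the reader takes the write lock, so no
search can be using a reader while it is swapped out and closed. GuardedReader
and FakeReader are hypothetical names; FakeReader only stands in for an
IndexReader.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

public class GuardedReader<R extends Closeable> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private R reader;

    public GuardedReader(R initial) {
        this.reader = initial;
    }

    // Many searches may hold the read lock at the same time.
    public <T> T search(Function<R, T> query) {
        lock.readLock().lock();
        try {
            return query.apply(reader);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock waits for in-flight searches, then swaps the reader.
    public void swap(R replacement) {
        R old;
        lock.writeLock().lock();
        try {
            old = reader;
            reader = replacement;
        } finally {
            lock.writeLock().unlock();
        }
        // No search can still hold the old reader at this point.
        try {
            old.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical stand-in for an index reader, for demonstration only.
    static final class FakeReader implements Closeable {
        private final String name;
        private volatile boolean closed;

        FakeReader(String name) {
            this.name = name;
        }

        String lookup(String q) {
            if (closed) {
                throw new IllegalStateException("this reader is closed");
            }
            return name + ":" + q;
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        GuardedReader<FakeReader> guarded = new GuardedReader<>(new FakeReader("v1"));
        System.out.println(guarded.search(r -> r.lookup("q")));
        guarded.swap(new FakeReader("v2"));
        System.out.println(guarded.search(r -> r.lookup("q")));
    }
}
```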
2011/3/30 Simon Willnauer :
> On Wed, Mar 30, 2011 at 8:14 AM, Li Li wrote:
>> merge will also change docid
>> all segments' docId begin with 0
>
> for all released version this is not true. Before trunk (and I think
> its in 3.1 also) merge only merged continuous segments so the actual
> per-segm
2011/3/24 Uwe Schindler :
> Don't use MultiSearcher. Instead create a MultiReader around the separate
> IndexReaders for each index and pass that MultiReader to a conventional
> IndexSearcher as IndexReader. MultiSearcher is very buggy.
Could you elaborate on this point at all, Uwe? I'm using
Para
2011/3/17 Ganesh :
> Was this bug https://issues.apache.org/jira/browse/LUCENE-2249 fixed in
> 3.0.3?
The linked ticket shows that it was fixed in 3.0.3.
--dho
> Regards
> Ganesh
>
> - Original Message -
> From: "Ganesh"
> To:
> Sent: Thursday, March 17, 2011 7:03 PM
> Subject: Re:
There is a DuplicateFilter class in contrib that works pretty well.
2011/3/5 Grant Ingersoll :
> See http://wiki.apache.org/solr/Deduplication. Should be fairly easy to pull
> out if you are doing just Lucene.
>
> On Mar 5, 2011, at 1:49 AM, Mark wrote:
>
>> Is there a way one could detect dupli