Uwe,
I was using a bounded thread pool.
I don't know if the problem was task overload or something about the
actual efficiency of searching a single segment rather than iterating over
multiple AtomicReaderContexts, but I'd lean toward task overload. I will
do some testing tonight to find out.
Hi,
use a bounded thread pool.
Uwe
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -----Original Message-----
> From: Desidero [mailto:desid...@gmail.com]
> Sent: Tuesday, October 01, 2013 11:37 PM
> To: java-user@lucene.apache.org
> Su
For anyone who was wondering, this was actually resolved in a different
thread today. I misread the information in the
IndexSearcher(IndexReader,ExecutorService) constructor documentation - I
was under the impression that it was submitting a task for each index
shard (MultiReader wraps 20 shards), when it actually submits one task per
segment, so far more tasks were queued than I expected.
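
For anyone following along, a minimal sketch of that constructor in use,
assuming the Lucene 4.x API current when this thread was written;
multiReader and query are placeholders:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;

// One task is submitted per segment (leaf), not per shard, so a bounded
// pool keeps a many-segment MultiReader from flooding the executor.
ExecutorService pool =
    Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
IndexSearcher searcher = new IndexSearcher(multiReader, pool);
TopDocs hits = searcher.search(query, 10);
// IndexSearcher does not own the pool; shut it down yourself when done.
pool.shutdown();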
Benson,
Rather than forcing a random number of small segments into the index using
maxMergedSegmentMB, it might be better to split your index into multiple
shards. You can create a specific number of balanced shards to control the
parallelism and then forceMerge each shard down to 1 segment to avoid
extra per-segment overhead within each shard.
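
A rough sketch of that shard-then-merge idea (shardDirs, analyzer, and the
document routing are assumptions, and the Version constant reflects the
Lucene 4.x era of this thread):

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;

for (Directory shardDir : shardDirs) {
  IndexWriter writer =
      new IndexWriter(shardDir, new IndexWriterConfig(Version.LUCENE_44, analyzer));
  // ... add this shard's share of the documents ...
  writer.forceMerge(1);  // leave exactly one segment per shard
  writer.close();
}
// With one segment per shard, IndexSearcher's per-segment fan-out
// maps one task to one shard:
MultiReader multi = new MultiReader(
    DirectoryReader.open(shardDirs.get(0)),
    DirectoryReader.open(shardDirs.get(1)) /* ... one per shard ... */);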
I am really sorry if something confused you. As I said, I am indexing a
folder which contains mylogs.log, mylogs1.log, mylogs2.log, etc. I am not
indexing them as flat files. I have tokenized each line of text with a
regex and am storing the pieces as fields like "messageType", "timeStamp",
and "message".
So
I'm still a bit confused about exactly what you're indexing, when, but
if you have a unique id and don't want to add or update a doc that's
already present, add the unique id to the index and search (TermQuery
probably) for each one and skip if already present.
Can't you change the log rotation/co
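
A minimal sketch of that skip-if-present check; the "id" field name, dir,
writer, and uniqueId are assumptions:

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;

IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));
TopDocs hits = searcher.search(new TermQuery(new Term("id", uniqueId)), 1);
if (hits.totalHits == 0) {
  writer.addDocument(doc);  // not in the index yet, so add it
}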
Hi
Basically my log folder consists of four log files like abc.log, abc1.log,
abc2.log, abc3.log, as my log appender rotates them. Every 30 minutes the
content of all these files changes; for example, after a 30-minute refresh
the content of abc1.log will be replaced with the existing abc.log content,
and abc2.log's with the old abc1.log content, and so on.
Milliseconds as unique keys are a bad idea unless you are 100% certain
you'll never be creating 2 docs in the same millisecond. And are you
saying the log record A1 from file a.log indexed at 14:00 will have
the same unique id as the same record from the same file indexed at
14:30, or will it be different?
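
One illustrative way around the millisecond-collision problem, with
made-up names, is a composite key that is stable across re-indexing runs:

// Unique per record and identical every time the same file is re-indexed:
String uniqueKey = logFileName + ":" + lineNumber;  // e.g. "abc.log:1742"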
I am afraid my document in the above code already has a unique key (with
milliseconds; I hope this is enough to differentiate it from other
records). My requirement is simple: I have a folder with a.log, b.log and
c.log files which will be updated every 30 minutes, and I want to update
the index of the
You might want to set a smallish maxMergedSegmentMB in
TieredMergePolicy to "force" enough segments in the index ... sort of
the opposite of optimizing.
Really, IndexSearcher's approach of using one thread per segment is
rather silly, and it's annoying/bad to expose a change in behavior due
to segment structure.
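
A sketch of that maxMergedSegmentMB idea; the 64 MB cap is an arbitrary
illustration, and dir/analyzer are assumed:

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.util.Version;

TieredMergePolicy tmp = new TieredMergePolicy();
tmp.setMaxMergedSegmentMB(64.0);  // small cap "forces" several segments to survive
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_44, analyzer);
iwc.setMergePolicy(tmp);
IndexWriter writer = new IndexWriter(dir, iwc);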
Maybe Lucene's new replication module is useful for this?
Mike McCandless
http://blog.mikemccandless.com
On Mon, Sep 30, 2013 at 3:08 PM, Neda Grbic wrote:
> Hi all
>
> I'm hoping to use Lucene in my project, but I have two master-master
> servers. Is there any good tutorial on how to make Lucene work in a
> setup like that?
Hi Benson,
On Mon, Sep 30, 2013 at 5:21 PM, Benson Margulies wrote:
> The multithreaded index searcher fans out across segments. How aggressively
> does 'optimize' reduce the number of segments? If the segment count goes
> way down, is there some other way to exploit multiple cores?
forceMerge(1) (the replacement for "optimize") merges the index down to a
single segment.
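
For concreteness, a hedged sketch on an open IndexWriter; the segment
count of 4 in the alternative is illustrative:

// The old "optimize": merge everything down to a single segment.
writer.forceMerge(1);
// Alternatively, keep a few segments so per-segment threads still have work:
// writer.forceMerge(4);
writer.commit();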
I'm not aware of a Lucene (rather than Solr or whatever) tutorial. A
search for something like "lucene sharding" will get hits.
Why don't you want to use Solr or Katta or similar? They've already
done much of the hard work.
How much data are you talking about?
What are your master-master requirements?
You have to call updateDocument with the unique key of the document to update.
The unique key must be a separate, indexed, not necessarily stored key.
addDocument just adds a new instance of the document to the index, it cannot
determine if it’s a duplicate.
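
A minimal sketch of that, with the "id" field name and uniqueKey as
assumptions; StringField keeps the key indexed as a single untokenized
term, and Store.NO leaves it unstored:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.Term;

Document doc = new Document();
doc.add(new StringField("id", uniqueKey, Field.Store.NO));
// ... the rest of the fields ...
// Deletes any existing doc with this key, then adds the new one:
writer.updateDocument(new Term("id", uniqueKey), doc);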
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de