Thanks for the response. Actually, I am more concerned with trying to use an
Object Store for the indexes. The next concern is the use of a local index
versus the sharded ones, but I'm more relaxed about that now after thinking
about it. I see that index shards could be up to 100 million documen
OK, thanks.
I modified my program as you suggested, but another problem appeared:
java.lang.ArrayIndexOutOfBoundsException: -1
at org.apache.lucene.index.TermInfosReader.seekEnum(TermInfosReader.java:203)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:273)
at
Solr already logs the queries themselves although there isn't any way
that I know of to associate that with a user.
That said, in Solr land, whatever servlet container you use for Solr
should be able to log all the URLs that hit the server.
Best
Erick
On Mon, Feb 6, 2012 a
To complete this thread, I read the document itself with a one-field
fieldSelector, so as not to bother with anything but exactly what I needed at
this point in the code (in particular, not the text body).
Then I saved the primary key (the path) of documents that visited this
CustomScoreQuery (functi
I have a set of Lucene indexes for which I need to log all accesses and possibly
queries. I can use kernel-level auditing to record file accesses, but what would
be the best approach to logging the strings for all queries against these indexes?
What comes to mind is a Lucene analogy to a databa
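One way to picture the query log asked about above is a thin wrapper that records each query string before handing it to the real search call. The sketch below is only an illustration under stated assumptions: AuditingSearcher, its delegate function and the in-memory log are hypothetical names, not part of Lucene, and a real application would presumably write to a file or logging framework instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

/** Records every query string before delegating to the real search (all names hypothetical). */
class AuditingSearcher<R> {
    // The actual search call, supplied by the application (assumption: it takes the raw query text).
    private final Function<String, R> delegate;
    // Thread-safe in-memory log; a stand-in for a file or logger.
    private final List<String> auditLog = new CopyOnWriteArrayList<>();

    AuditingSearcher(Function<String, R> delegate) {
        this.delegate = delegate;
    }

    /** Logs the raw query text with a timestamp, then runs the search. */
    R search(String queryText) {
        auditLog.add(System.currentTimeMillis() + "\t" + queryText);
        return delegate.apply(queryText);
    }

    /** Returns a snapshot copy of the log entries recorded so far. */
    List<String> auditLog() {
        return new ArrayList<>(auditLog);
    }
}
```

Because every search goes through the wrapper, the log captures the query text itself, which kernel-level file auditing cannot see.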
Will do.
On Tue, Feb 7, 2012 at 12:52 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> You tell NRTCachingDirectory how much RAM it's allowed to use, and it
> then caches newly flushed segments in a private RAMDirectory.
>
> But you should first test performance w/o it (after removing
You tell NRTCachingDirectory how much RAM it's allowed to use, and it
then caches newly flushed segments in a private RAMDirectory.
But you should first test performance w/o it (after removing the
commit calls). NRT is very fast...
Mike McCandless
http://blog.mikemccandless.com
On Mon, Feb 6,
Good point. I should remove the commits.
Is there any difference between NRTCachingDirectory and RAMDirectory? How is
"small" defined?
On Tue, Feb 7, 2012 at 12:42 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> You shouldn't call IW.commit when using NRT; that's the point of NRT
> (makin
You shouldn't call IW.commit when using NRT; that's the point of NRT
(making changes visible w/o calling commit).
Only call commit when you require that all changes be durable (survive
OS / JVM crash, power loss, etc.) on disk.
Also, you can use NRTCachingDirectory which acts like RAMDirectory for
Agree.
On Mon, Feb 6, 2012 at 11:53 PM, Uwe Schindler wrote:
> Hi Cheng,
>
> all pros and cons are explained in those articles written by Mike! As soon
> as there are harddisks in the game, there is a slowdown, what do you
> expect?
> If you need it faster, buy SSDs! :-)
>
> Uwe
>
> -
> Uwe
My original question was whether there is a way to configure when the writer
writes to the FSDirectory. I think there may be something in
IndexWriterConfig that can help.
On Mon, Feb 6, 2012 at 11:50 PM, Ian Lea wrote:
> Well, yes. What would you expect? From the javadocs for
> IndexWriter.commi
Hi Cheng,
all pros and cons are explained in those articles written by Mike! As soon
as there are harddisks in the game, there is a slowdown, what do you expect?
If you need it faster, buy SSDs! :-)
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@t
Well, yes. What would you expect? From the javadocs for IndexWriter.commit()
Commits all pending changes (added & deleted documents, segment
merges, added indexes, etc.) to the index, and syncs all referenced
index files ... This may be a costly operation, so you should test the
cost in your app
I meant that when I use NRTManager and use commit(), the speed is slower
than when I use RAMDirectory.
In my case, the NRTManager instance not only performs searches but also
updates/modifies indexes, and those changes should be visible to other
threads. In RAMDirectory, the
commit() doesn't synchronize indexes with the FSDi
Uwe, when I said the speed is slow, I wasn't referring to instant visibility of
changes, but to the changes being synchronized with FSDirectory when I
use writer.commit().
When I use RAMDirectory, the writer.commit() seems much faster than using
NRTManager built upon FSDirectory. So, I am guessing the
What exactly do you mean by the "speed is slower"? Time taken to
update the index? Time taken for updates to become visible in search
results? Time taken for searches to run on the IndexSearcher returned
from SearcherManager? Something else?
--
Ian.
On Mon, Feb 6, 2012 at 3:27 PM, Cheng wr
Please review the following articles about NRT. Absolutely instant updates
that are visible as they are done are almost impossible (even with
RAMDirectory):
http://goo.gl/mzAHt
http://goo.gl/5RoPx
http://goo.gl/vSJ7x
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetap
Ian,
I have run into an issue: I need to update the index frequently. NRTManager
does not seem very helpful on this front, as it is slower than when
RAMDirectory is used.
Any advice on improving this?
On Mon, Feb 6, 2012 at 10:24 PM, Cheng wrote:
> That really helps! I will try it out.
>
> Than
That really helps! I will try it out.
Thanks.
On Mon, Feb 6, 2012 at 10:12 PM, Ian Lea wrote:
> You would use NRTManagerReopenThread as a standalone thread, not
> plugged into your Executor stuff. It is a utility class which you
> don't have to use. See the javadocs.
>
> But in your case I'd
You would use NRTManagerReopenThread as a standalone thread, not
plugged into your Executor stuff. It is a utility class which you
don't have to use. See the javadocs.
But in your case I'd use it, to start with anyway. Fire it up with
suitable settings and forget about it, except to call close(
Not sure if you got an answer to this or not. Don't recall seeing one
and gmail threading says not.
> Is the use of payloads I've described appropriate?
Sounds OK to me, although I'm not sure why you can't store the
metadata as a Document Field.
> Can I exclude/filter the matching terms based o
I don't understand the following portion:
IndexWriter iw = new IndexWriter(whatever - some standard disk index);
NRTManager nrtm = new NRTManager(iw, null);
NRTManagerReopenThread ropt = new NRTManagerReopenThread(nrtm, ...);
ropt.setXxx(...);
ropt.start();
I have a java ExecutorServices in
If you can use NRTManager and SearcherManager things should be easy
and blazingly fast rather than unbearably slow. The latter phrase is
not one often associated with lucene.
IndexWriter iw = new IndexWriter(whatever - some standard disk index);
NRTManager nrtm = new NRTManager(iw, null);
NRTMana
int doc will be for the subreader, not for the entire index.
oal.search.Collector has setNextReader(IndexReader reader, int
docBase) which you might somehow be able to use. Failing that I'd go
for FieldCache, or store the docids in a Set in a Map keyed by current
Reader, if that would give you wha
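The relationship between a per-segment doc and an index-wide doc id can be sketched without Lucene at all. The class below is a hypothetical stand-in for a Collector, not the Lucene API itself: setNextSegment plays the role of setNextReader(IndexReader reader, int docBase), and the index-wide id is assumed to be docBase plus the segment-relative doc.

```java
import java.util.ArrayList;
import java.util.List;

/** Collects index-wide doc ids from per-segment callbacks (hypothetical stand-in for a Collector). */
class GlobalDocIdCollector {
    private final List<Integer> globalIds = new ArrayList<>();
    private int docBase = 0; // offset of the current segment within the whole index

    /** Plays the role of Collector.setNextReader: remember the new segment's offset. */
    void setNextSegment(int docBase) {
        this.docBase = docBase;
    }

    /** Plays the role of Collector.collect: segmentDoc is relative to the current segment. */
    void collect(int segmentDoc) {
        globalIds.add(docBase + segmentDoc);
    }

    /** Index-wide ids gathered so far, in collection order. */
    List<Integer> globalIds() {
        return globalIds;
    }
}
```

The same segment-relative value (say, 3) maps to different index-wide ids depending on which segment is current, which is why the offset has to be captured each time the segment changes.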
At least it doesn't give the same score for a doc which doesn't have
all the terms, which I think you claimed at one point.
So to try and simplify this, you've got one field called content and
doc1: pqrst uvwx abcd
doc2: abcd pqrst uvwx
and the query "abcd^10.0 content:pqrst^5.0" gives the same s
Hi
The issue of searching by file name is resolved with some
modifications to SearchFiles.java.
A field named path has been added to the code.
String field = "path";
I also added parser.setAllowLeadingWildcard(true) to allow searches with
leading wildcards, which was not available
26 matches