>
> The codec intercepts merges in order to clean up files that are no longer
> referenced
>
What happens if a document is deleted while there's a reader open on the
index, and the segments are merged? Maybe I misunderstand what you meant by
this statement, but if the external file is deleted, sin
On 10/13/13 8:09 PM, Michael Sokolov wrote:
On 10/13/2013 1:52 PM, Adrien Grand wrote:
Hi Michael,
I'm not aware enough of operating system internals to know what
exactly happens when a file is open but it sounds to me like having
separate files per document or field adds levels of indirection
You can still use TopDocs.totalHits from searchAfter; that will be correct.
Providing "Last" with searchAfter is not really possible; it's also
somewhat strange (does anybody really use that?). Maybe you could
reverse your sort, take page 1, reverse its hits?
Mike McCandless
http://blog.mikemcc
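
The suggestion above (reverse the sort, take page 1, reverse its hits) can be sketched in plain Java without any Lucene calls. This is only an illustration under stated assumptions: `LastPageSketch` and `lastPage` are hypothetical names, an in-memory list stands in for the search results, and sizing the last page from the total hit count is an added assumption rather than part of the original advice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: an in-memory stand-in for "search with the reversed sort".
final class LastPageSketch {

    /** Returns the last page of hits under the given sort, in that sort's order. */
    static <T> List<T> lastPage(List<T> hits, Comparator<T> sort, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        if (hits.isEmpty()) {
            return new ArrayList<>();
        }
        // Assumption: the last page holds the remainder, or a full page if it divides evenly.
        int remainder = hits.size() % pageSize;
        int lastPageSize = (remainder == 0) ? pageSize : remainder;

        // Step 1: sort by the reversed comparator, so the final hits come first.
        List<T> reversed = new ArrayList<>(hits);
        reversed.sort(sort.reversed());

        // Step 2: take "page 1" of the reversed ordering.
        List<T> page = new ArrayList<>(reversed.subList(0, lastPageSize));

        // Step 3: reverse that page to restore the original sort direction.
        Collections.reverse(page);
        return page;
    }

    public static void main(String[] args) {
        List<Integer> hits = Arrays.asList(5, 3, 9, 1, 7, 2, 8);
        System.out.println(lastPage(hits, Comparator.<Integer>naturalOrder(), 4));
    }
}
```

With a real index the first two steps would correspond to running the query with the reversed sort and asking only for the top hits, so the earlier pages never need to be walked.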
Hi,
In my current implementation of Lucene 4.3 where there are millions of indexed
records, I do a regular search() and get the topDocs.totalHits as the count of
results.
As part of this, I store all the results in the session and then let the user
paginate through the results. With this, I am
Ian - Thank you for your inputs.
Regards,
Raghu
-----Original Message-----
From: Ian Lea [mailto:ian@gmail.com]
Sent: Tuesday, October 15, 2013 11:43 AM
To: java-user@lucene.apache.org
Subject: Re: QueryParser stripping off Hyphen from query
If you want to keep hyphens you could try Whites
Mike,
For now I'm using just a SpanQuery over a ~600MB index segment,
single-threaded (one thread per segment; the complete setup is 30 segments
with a total of 20GB).
I'm trying to use Lucene for a morphologically annotated text corpus (namely,
the Russian National Corpus).
The main query
Mike, you are right. I used StringField, but id_to_delete has a typo and
thus a mismatch.
Still good to confirm the understanding is correct.
Thanks for your help.
On Thu, Oct 17, 2013 at 3:54 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> Your understanding is correct, and afte
DirectPostingsFormat holds all postings in RAM, uncompressed, as
simple java arrays. But it's quite RAM heavy...
The hotspots may also be in the queries you are running ... maybe you
can describe more how you're using Lucene?
Mike McCandless
http://blog.mikemccandless.com
On Thu, Oct 17, 2013
Hello!
I've tried two approaches: 1) RAMDirectory, 2) MMapDirectory + tmpfs. Both work
the same for me (equally bad :( ).
Thus, I think my problem is not disk access (although I always see getPayload()
in the VisualVM top).
So, maybe the hard part in the postings traversal is decompression?
Are
Boosting query clauses means more "this clause is more important than
that clause" rather than "make the score for this search higher". I
use it for biblio searching when I want to search across multiple fields
and want matches in titles to be more important than matches in
blurbs. Amended version
Your understanding is correct, and after reopen you should see the
document deleted, so I'm not sure offhand why you aren't.
BTW it's w.deleteDocuments not w.removeDocuments.
And you don't need to commit in order to see changes in the reopened
NRT reader (this is the whole point: commit is very c
Yes, I think you should have a play. But on an index that is as
realistic as you can make it - there may be variations in performance
of the different queries and filters depending on term frequencies and
loads of other stuff I don't understand. General point being simply
that YMMV.
--
Ian.
On
If you're using Solr you'd be better off asking this on the Solr list:
http://lucene.apache.org/solr/discussion.html.
You might also like to clarify what you want with regard to sentence
vs document. If you want to display the sentences of a matched doc,
surely you just do it: store what you need
Hi Team,
I have a requirement where I have to display the sentences of a valid document
if the keyword (input string) is found in that document.
I am wondering whether a parent-child relation would work.
DocBean
    int doc_id
    String doc_path
    String content_id
ContentBean
    int content_id
    String content
Need y
14 matches