RE: recovering payload from fields

2010-02-27 Thread Christopher Condit
> It sounds like you need to iterate through all terms sequentially in a given > field in the doc, accessing offset & payload? In which case reanalyzing at > search time may be the best way to go. If it matters, it doesn't need to be sequential. I just need access to all the payloads for a given

Changing TF method

2010-02-27 Thread PlusPlus
Hi, I want to change Lucene's similarity so that I can add fuzzy memberships to the terms of a document. Thus, the TF value of a term in a document is not always 1; a term can, for example, add 0.7 to the TF value. (In my application, each term is contained in a document at most once.) This mem
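
To make the idea above concrete, here is a minimal sketch in plain Java of a term frequency that sums a membership weight per occurrence instead of adding 1. It does not use Lucene's Similarity API; the class name, method name, and the parallel-array input are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: models a "fuzzy" term frequency.
final class FuzzyTermFrequency {

    /**
     * Sums the membership weight of every occurrence of each term.
     * A conventional TF would add 1.0 per occurrence instead.
     */
    static Map<String, Double> weightedTf(String[] terms, double[] memberships) {
        if (terms.length != memberships.length) {
            throw new IllegalArgumentException("one membership per term is required");
        }
        Map<String, Double> tf = new LinkedHashMap<>();
        for (int i = 0; i < terms.length; i++) {
            // merge() inserts the weight or adds it to the running total
            tf.merge(terms[i], memberships[i], Double::sum);
        }
        return tf;
    }

    public static void main(String[] args) {
        Map<String, Double> tf = weightedTf(
                new String[] {"lucene", "payload"},
                new double[] {0.7, 1.0});
        System.out.println(tf);
    }
}
```

How such a weight would be stored per term in an index (for example as a payload) is a separate question that this sketch does not address.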

Re: Infinite loop when searching empty index

2010-02-27 Thread Michael McCandless
Hmm -- can you give more details on the possible file descriptor leak? Or a test case? Thanks. Mike On Sat, Feb 27, 2010 at 12:24 PM, Justin wrote: > Thanks for checking.  I think I tracked down the problem.  We apparently > extended most of these classes and more work was necessary to facili

Re: Infinite loop when searching empty index

2010-02-27 Thread Justin
Thanks for checking. I think I tracked down the problem. We apparently extended most of these classes and more work was necessary to facilitate the latest API. I just didn't dig deep enough, into nextDoc() which I thought too trivial to step into. The extended Scorer repeatedly returned the
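
The message above is cut off, so the exact bug is not known here. As a general illustration of why a misbehaving nextDoc() can hang a search loop, the following plain-Java sketch models the iteration contract: each call must advance, and exhaustion must be signalled with a sentinel. The class is hypothetical and does not extend any Lucene type; the sentinel value mirrors Lucene's DocIdSetIterator.NO_MORE_DOCS (Integer.MAX_VALUE).

```java
// Hypothetical stand-in for a scorer's doc iteration; not a Lucene class.
final class SketchDocIterator {
    // Same value that Lucene's DocIdSetIterator.NO_MORE_DOCS uses.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs; // matching doc ids, ascending
    private int pos = -1;     // index of the current doc, -1 before the first call

    SketchDocIterator(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    /** Advances to the next doc, or returns NO_MORE_DOCS once exhausted. */
    int nextDoc() {
        // Clamp so repeated calls after exhaustion keep returning the sentinel.
        pos = Math.min(pos + 1, docs.length);
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    /** A caller's loop terminates only because nextDoc() eventually returns the sentinel. */
    static int countHits(SketchDocIterator it) {
        int hits = 0;
        while (it.nextDoc() != NO_MORE_DOCS) {
            hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(countHits(new SketchDocIterator(new int[0])));
        System.out.println(countHits(new SketchDocIterator(new int[] {1, 5, 9})));
    }
}
```

An implementation that keeps returning the same non-sentinel value, including on an empty index, would make a loop like countHits spin forever.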

Re: custom FieldCache cost too much time. how can I preload the custom fieldCache when new segment exists!

2010-02-27 Thread Michael McCandless
If you look at the javadocs for IndexWriter it explains how to do it. You just provide a class that implements the warm method, and inside that method you do whatever app specific things you need to do to the provided IndexReader to warm it. Note that the SearcherManager class from LIA2 handles se

Re: FieldCache cost too much time. how can I preload the custom fieldCache when new segment exists!

2010-02-27 Thread Michael McCandless
Sounds like you should simply open & warm the reader in a background thread... You might want to use the SearcherManager class from upcoming Lucene in Action 2nd edition (NOTE: I'm a co-author). You can download the source code @ http://manning.com/hatcher3. Mike ---
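
The "open & warm in a background thread" pattern can be sketched without any Lucene types. This is not the SearcherManager class from the book; the class name, the generic parameter R (standing in for a reader or searcher), and the open/warm callbacks are all assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper: R stands in for an index reader or searcher.
final class WarmingSwapper<R> {
    private final AtomicReference<R> current;
    private final ExecutorService background = Executors.newSingleThreadExecutor();

    WarmingSwapper(R initial) {
        this.current = new AtomicReference<>(initial);
    }

    /** Search threads call this; they never wait for warming. */
    R get() {
        return current.get();
    }

    /**
     * Opens a fresh resource and warms it off the search threads,
     * then publishes it. Closing the old resource is left to the caller.
     */
    Future<?> refresh(Supplier<R> open, Consumer<R> warm) {
        return background.submit(() -> {
            R fresh = open.get();  // e.g. reopen the reader
            warm.accept(fresh);    // e.g. run queries, populate caches
            current.set(fresh);    // only now do searches see it
        });
    }

    void shutdown() {
        background.shutdown();
    }

    public static void main(String[] args) throws Exception {
        WarmingSwapper<String> swapper = new WarmingSwapper<>("reader-v1");
        swapper.refresh(() -> "reader-v2", r -> { /* warming work goes here */ }).get();
        System.out.println(swapper.get());
        swapper.shutdown();
    }
}
```

The point of the design is that the slow first use of the new reader happens in the background thread, while queries keep hitting the old, already-warm one until the swap.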

Re: If you could have one feature in Lucene...

2010-02-27 Thread Glen Newton
Hello Uwe. That will teach me for not keeping up with the versions! :-) So it is up to the application to keep track of what it used for compression. Understandable. Thanks! Glen On 27 February 2010 10:17, Uwe Schindler wrote: > Hi Glen, > > >> Pluggable compression allowing for alternatives to
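
One way an application can "keep track of what it used for compression" is to prefix each stored value with a one-byte codec tag. The sketch below uses only the JDK (GZIP from java.util.zip); the class name, tag values, and layout are assumptions, and a further tag could be assigned to bzip2 if a library providing it is on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical application-side helper; the tag values are arbitrary.
final class TaggedCompression {
    static final byte RAW = 0;
    static final byte GZIP = 1;

    /** Returns [tag byte][payload], so the decoder knows which codec was used. */
    static byte[] encode(byte codec, String text) {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(codec);
        try {
            if (codec == GZIP) {
                try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                    gz.write(data);
                }
            } else if (codec == RAW) {
                out.write(data);
            } else {
                throw new IllegalArgumentException("unknown codec tag " + codec);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Reads the tag byte and dispatches to the matching decompressor. */
    static String decode(byte[] stored) {
        byte codec = stored[0];
        if (codec == RAW) {
            return new String(stored, 1, stored.length - 1, StandardCharsets.UTF_8);
        }
        if (codec != GZIP) {
            throw new IllegalArgumentException("unknown codec tag " + codec);
        }
        try (GZIPInputStream gz = new GZIPInputStream(
                new ByteArrayInputStream(stored, 1, stored.length - 1))) {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
            return new String(plain.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] stored = encode(GZIP, "some stored field text");
        System.out.println(decode(stored));
    }
}
```

The resulting byte array could then be stored as a binary field; that step is outside this sketch.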

RE: If you could have one feature in Lucene...

2010-02-27 Thread Uwe Schindler
Hi Glen, > Pluggable compression allowing for alternatives to gzip for text > compression for storing. > Specifically I am interested in bzip2[1] as implemented in Apache > Commons Compress[2]. > While bzip2 compression is considerably slower than gzip (although > decompression is not too much s

Re: If you could have one feature in Lucene...

2010-02-27 Thread Glen Newton
Pluggable compression allowing for alternatives to gzip for text compression for storing. Specifically I am interested in bzip2[1] as implemented in Apache Commons Compress[2]. While bzip2 compression is considerably slower than gzip (although decompression is not too much slower than gzip) it comp

Re: NumericField exact match

2010-02-27 Thread Yonik Seeley
On Fri, Feb 26, 2010 at 3:33 PM, Ivan Vasilev wrote: > Does the precision step matter when I use NumericRangeQuery for exact > matches? No. There is a full-precision version of the value indexed regardless of the precision step, and that's used for an exact match query. > I mean if I use the def
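
The answer above can be illustrated with a simplified model of trie indexing: one term is produced per shift 0, precisionStep, 2 x precisionStep, and so on, and the shift-0 term always carries the full value. This plain-Java sketch is an assumption-laden model, not Lucene's actual encoding (which uses sortable bit patterns and prefix-coded terms); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of trie terms; Lucene's real byte encoding differs.
final class TriePrefixSketch {

    /** Returns {shift, value >>> shift} pairs, one per indexed precision level. */
    static List<long[]> prefixes(long value, int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        List<long[]> terms = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            // shift == 0 keeps every bit: this is the full-precision term
            terms.add(new long[] {shift, value >>> shift});
        }
        return terms;
    }

    public static void main(String[] args) {
        for (int step : new int[] {4, 8, 64}) {
            List<long[]> terms = prefixes(1234567L, step);
            long[] exact = terms.get(0);
            System.out.println("step=" + step + " terms=" + terms.size()
                    + " shift0=" + exact[1]);
        }
    }
}
```

Whatever the step, the first term is the unshifted value, which is why an exact-match query does not depend on the precision step; the step only changes how many coarser terms exist for range queries.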

Re: custom FieldCache cost too much time. how can I preload the custom fieldCache when new segment exists!

2010-02-27 Thread luocanrao
PS: In our environment a document has more than ten fields, and there may be many updates in a short time. Regarding installing a mergedSegmentWarmer on the writer, can you give me a small example? Thanks very much! -Original Message- From: luocanrao [mailto:luocan19826...@sohu.com] Sent: February 27, 2010 19:09 To: java-user@luce

Re: custom FieldCache cost too much time. how can I preload the custom fieldCache when new segment exists!

2010-02-27 Thread luocanrao
I set the merge factor to 4, and every five minutes I reopen the reader. Yes, most of the time it is very fast, but sometimes it is very slow. For example, when the program starts, the first query consumes 10s! And when a newly created segment is generated, the query consumes more than 1s. Performance is a key point for us. Sorry, m

RE: Infinite loop when searching empty index

2010-02-27 Thread Uwe Schindler
I was doing the same, MatchAllDocsScorer is fine and also AbstractAllTermDocs. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Saturday, Febru

Re: Infinite loop when searching empty index

2010-02-27 Thread Michael McCandless
I turned this into a unit test... but I don't see it never returning... the test passes. How did you create your empty reader? Patch: Index: src/test/org/apache/lucene/search/TestMatchAllDocsQuery.java === --- src/test/org/apache/lu

Re: recovering payload from fields

2010-02-27 Thread Michael McCandless
You can also access payloads through the TermPositions enum, but, this is by term and then by doc. It sounds like you need to iterate through all terms sequentially in a given field in the doc, accessing offset & payload? In which case reanalyzing at search time may be the best way to go. You ca

Re: custom FieldCache cost too much time. how can I preload the custom fieldCache when new segment exists!

2010-02-27 Thread Michael McCandless
How are you opening a new reader? If it's a near-real-time reader (IndexWriter.getReader), or you use IndexReader.reopen, it should only be the newly created segments that have to generate the field cache entry, which most of the time should be fast. If you are already using those APIs and it's st

custom FieldCache cost too much time. how can I preload the custom fieldCache when new segment exists!

2010-02-27 Thread luocanrao
The custom FieldCache costs too much time. So the first time after every reopen of the reader, it interferes with search performance. I hope someone can tell me how I can preload the custom FieldCache when a new segment exists! Thanks again! Here is the source, in the FieldComparator.setNextReader method ((C2