Re: how does lucene deal with intersection?

2012-07-20 Thread 齐保元
Thank you, but what should I do if I want to query "ratings" as well as "content" at the same time? From the page you provided, I don't see how the join would work. Michael McCandless wrote: > Maybe query time join? > > See http://www.searchworkings.org/blog/-/blogs/query-time-joining-in-lucene > > Mike McCandless

ReferenceManager.maybeRefreshBlocking() should not be declared throwing InterruptedException

2012-07-20 Thread Vitaly Funstein
This probably belongs in the JIRA, and is related to https://issues.apache.org/jira/browse/LUCENE-4025, but java.util.concurrent.locks.Lock.lock() doesn't throw anything. I believe the author of the change originally meant to use lockInterruptibly() inside but forgot to adjust the method sig after changing it back
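To illustrate the point about the two locking calls, here is a minimal sketch using only the JDK. The class `RefreshLockSketch` and its method names are hypothetical and are not Lucene's ReferenceManager code; the sketch only shows that a method built on `Lock.lock()` needs no `throws` clause, while one built on `lockInterruptibly()` must declare `InterruptedException`.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a refresh-style class; NOT Lucene's ReferenceManager.
final class RefreshLockSketch {
    private final Lock refreshLock = new ReentrantLock();
    private int refreshCount = 0;

    // Lock.lock() declares no checked exception, so this method
    // does not need "throws InterruptedException".
    void refreshBlocking() {
        refreshLock.lock();
        try {
            refreshCount++; // placeholder for the real refresh work
        } finally {
            refreshLock.unlock();
        }
    }

    // Lock.lockInterruptibly() declares InterruptedException,
    // so the signature has to declare (or handle) it.
    void refreshInterruptibly() throws InterruptedException {
        refreshLock.lockInterruptibly();
        try {
            refreshCount++; // placeholder for the real refresh work
        } finally {
            refreshLock.unlock();
        }
    }

    int refreshCount() {
        return refreshCount;
    }
}
```

If the blocking variant keeps `throws InterruptedException` in its signature, every caller has to catch or propagate an exception that the method body can never raise.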

Re: Problem with TermVector offsets and positions not being preserved

2012-07-20 Thread Robert Muir
On Fri, Jul 20, 2012 at 8:24 PM, Mike O'Leary wrote: > Hi Robert, > I'm not trying to determine whether a document has term vectors, I'm trying > to determine whether the term vectors that are in the index have offsets and > positions > stored. Right: what I'm trying to tell you is that offsets

RE: Problem with TermVector offsets and positions not being preserved

2012-07-20 Thread Mike O'Leary
Hi Robert, I'm not trying to determine whether a document has term vectors, I'm trying to determine whether the term vectors that are in the index have offsets and positions stored. Shouldn't the Field instance variables called storeOffsetWithTermVector and storePositionWithTermVector be set to

Re: Problem with TermVector offsets and positions not being preserved

2012-07-20 Thread Robert Muir
I think it's wrong for DumpIndex to look at term vector information from the Document that was retrieved from IndexReader.document; that's basically just a way of getting access to your stored fields. This tool should be using something like IndexReader.getTermFreqVector for the document to determin

RE: Problem with TermVector offsets and positions not being preserved

2012-07-20 Thread Mike O'Leary
I neglected to mention that CreateTestIndex uses a collection of data files with .properties extensions that are included in the Lucene In Action source code download. Mike -Original Message- From: Mike O'Leary [mailto:tmole...@uw.edu] Sent: Friday, July 20, 2012 2:10 PM To: java-user@l

RE: Problem with TermVector offsets and positions not being preserved

2012-07-20 Thread Mike O'Leary
Hi Robert, I put together the following two small applications to try to separate the problem I am having from my own software and any bugs it contains. One of the applications is called CreateTestIndex, and it comes with the Lucene In Action book's source code that you can download from Manning

Re: how does lucene deal with intersection?

2012-07-20 Thread Michael McCandless
Maybe query time join? See http://www.searchworkings.org/blog/-/blogs/query-time-joining-in-lucene Mike McCandless http://blog.mikemccandless.com On Fri, Jul 20, 2012 at 5:58 AM, 齐保元 wrote: > hi, > I have two collections: the first collection has documents like > 'docID,content', and th

Re: Problem with TermVector offsets and positions not being preserved

2012-07-20 Thread Robert Muir
Hi Mike: I wrote up some tests last night against 3.6 trying to find some way to reproduce what you are seeing, e.g. adding additional segments with the field specified without term vectors, without tv offsets, omitting TF, and merging them and checking everything out. I couldn't find any problems.

RE: Flushing Thread

2012-07-20 Thread Simon McDuff
Hi Simon W., See comments below. > Date: Fri, 20 Jul 2012 11:49:03 +0200 > Subject: Re: Flushing Thread > From: simon.willna...@gmail.com > To: java-user@lucene.apache.org > > hey simon ;) > > > On Fri, Jul 20, 2012 at 2:29 AM, Simon McDuff wrote: > > > > Thank you Simon Willnauer! > > > > With

RE: RAM or SSD...

2012-07-20 Thread Dragon Fly
Thank you. > From: dawid.we...@gmail.com > Date: Thu, 19 Jul 2012 13:34:26 +0200 > Subject: Re: RAM or SSD... > To: java-user@lucene.apache.org > > Read this: > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html > > Dawid > > On Thu, Jul 19, 2012 at 1:32 PM, Dragon Fly wr

how does lucene deal with intersection?

2012-07-20 Thread 齐保元
Hi, I have two collections: the first collection has documents like 'docID,content', and the other collection has documents like 'docID,ratings'. Is there any fast algorithm to get the intersection between these two collections after search? I cannot merge the fields together for particular
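As a rough illustration of what "intersection after search" could mean here, the sketch below intersects two sorted arrays of application-level docIDs with a linear merge. It is plain Java with no Lucene calls, and it is not the query-time join suggested in the replies. The class `DocIdIntersection` and the assumption that each search yields its matching docIDs as a sorted int array are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes both inputs are sorted ascending
// and hold the shared application-level docID of each hit.
final class DocIdIntersection {
    static List<Integer> intersect(int[] contentHits, int[] ratingHits) {
        List<Integer> common = new ArrayList<>();
        int i = 0;
        int j = 0;
        // Advance whichever side has the smaller docID; emit on a match.
        while (i < contentHits.length && j < ratingHits.length) {
            if (contentHits[i] == ratingHits[j]) {
                common.add(contentHits[i]);
                i++;
                j++;
            } else if (contentHits[i] < ratingHits[j]) {
                i++;
            } else {
                j++;
            }
        }
        return common;
    }
}
```

The merge visits each element at most once, so the cost is linear in the combined number of hits, provided both result lists are already sorted by docID.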

Re: Flushing Thread

2012-07-20 Thread Simon Willnauer
hey simon ;) On Fri, Jul 20, 2012 at 2:29 AM, Simon McDuff wrote: > > Thank you Simon Willnauer! > > With your explanation, we've decided to control the flushing by spawning > another thread. So the thread is still available to ingest! :-) (correct me > if I'm wrong) We do so by checking the R

Re: how to deal with multi subject problem?

2012-07-20 Thread Ian Lea
Just add the different subjects to the document, e.g.

    Document doc = new Document();
    for (String subject : subjects) {
        Field f = new Field("subject", subject, ...);
        doc.add(f);
    }

Or concatenate the subjects and store the one long string. If you don't want a search to potentially match terms from m