Reusing Query instances

2011-04-29 Thread Otis Gospodnetic
Hi, Is there any reason why one would *not* want to reuse Query instances? I'm using MemoryIndex with a fixed set of queries and I'm executing them all on each new document that comes in. Because each document needs to have many tens of thousands of queries executed against it, I thought I'd j

[ANN] Luke 3.1.0 released

2011-04-29 Thread Andrzej Bialecki
Hi, I'm happy to announce the release of Luke 3.1.0. This release is based on Lucene 3.1.0. Binaries and source code are available from the project's page at Google Code: http://code.google.com/p/luke/ Changes in version 3.1.0 (released on 2011.04.30): * Issue 35: Lucene 3.1 compatible luke

RE: Lucene 3.0.3 with debug information

2011-04-29 Thread Steven A Rowe
Thanks Dawid. – Steve From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of Dawid Weiss Sent: Friday, April 29, 2011 4:45 PM To: java-user@lucene.apache.org Cc: Steven A Rowe Subject: Lucene 3.0.3 with debug information This is the e-mail you're looking for, Steven (it wasn't

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Michael McCandless
On Fri, Apr 29, 2011 at 4:25 PM, Paul Taylor wrote: >> Hmm maybe that is enough, I'm not sure. I'm profiling with YourkitProfiler >> and it doesn't show anything within the lucene classes so I assumed this >> meant they didn't contain the necessary debugging info but I would have >> thought that -g

Link to nightly build test reports on main Lucene site needs updating

2011-04-29 Thread Burton-West, Tom
Hello, I went to look at the "Hudson nightly builds" and tried to follow the link from the main Lucene page http://lucene.apache.org/java/docs/developer-resources.html#Nightly The links to the Clover Test Coverage Reports point to http://hudson.zones.apache.org/hudson/view/Lucene/job/Lucene-

Lucene 3.0.3 with debug information

2011-04-29 Thread Dawid Weiss
This is the e-mail you're looking for, Steven (it wasn't forwarded to the list, apparently). Dawid -- Forwarded message -- From: Paul Taylor Date: Fri, Apr 29, 2011 at 10:11 PM Subject: Re: Lucene 3.0.3 with debug information To: Dawid Weiss On 29/04/2011 15:17, Dawid Weiss w

RE: Lucene 3.0.3 with debug information

2011-04-29 Thread Steven A Rowe
Hi Paul, On 4/29/2011 at 4:14 PM, Paul Taylor wrote: > On 29/04/2011 16:03, Steven A Rowe wrote: > > What did you find about Luke that's buggy? Bug reports are very > > useful; please contribute in this way. > > Please see previous post, in summary mistake on my part. Okay... Which previous post

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Paul Taylor
On 29/04/2011 21:14, Paul Taylor wrote: Hmm maybe that is enough, I'm not sure. I'm profiling with YourkitProfiler and it doesn't show anything within the lucene classes so I assumed this meant they didn't contain the necessary debugging info but I would have thought that -g is all I need tha

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Dawid Weiss
Instead of profiling, provide some more info about the following: - what are the problematic (slow) queries -- are they generated from the code, are they parsed from text? What are they? Certain query types are slow(er) than other query types. - what is the index built from? Natural language (tex

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Paul Taylor
On 29/04/2011 16:03, Steven A Rowe wrote: Hi Paul, What did you find about Luke that's buggy? Bug reports are very useful; please contribute in this way. Please see previous post, in summary mistake on my part. The official Lucene 3.0.3 distribution jars were compiled using the -g cmdline a

RE: SorterTemplate.quickSort causes StackOverflowError

2011-04-29 Thread Uwe Schindler
Hi Otis, Thanks for trying it out. From what I see, the problem is not in MemoryIndex at all, so I suggest that you replace the mergeSort with quickSort again (for MemoryIndex, see below). The problem seems to be the comparators in those Queries, which have no tie-breaker. MergeSort can handl
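
A minimal sketch of why a comparator without a tie-breaker can matter for a recursive sort. This is a toy quicksort written for illustration, not Lucene's SorterTemplate, and the names TieBreakerSketch, KEY_ONLY, WITH_TIE and depthFor are hypothetical. It sorts items that all share one key and records how deep the recursion goes, once comparing on the key alone and once with a unique id as tie-breaker.

```java
import java.util.Comparator;

// Toy illustration only: NOT Lucene's SorterTemplate.
class TieBreakerSketch {
    static int maxDepth;

    // Compares on the key alone, so every item in the test data ties.
    static final Comparator<int[]> KEY_ONLY =
        (x, y) -> Integer.compare(x[0], y[0]);

    // Same key comparison, but falls back to a unique id on ties.
    static final Comparator<int[]> WITH_TIE = (x, y) -> {
        int c = Integer.compare(x[0], y[0]);
        return c != 0 ? c : Integer.compare(x[1], y[1]);
    };

    // Naive quicksort (Lomuto partition, middle element as pivot)
    // that tracks the deepest recursion level reached.
    static <T> void quickSort(T[] a, int lo, int hi,
                              Comparator<? super T> cmp, int depth) {
        if (depth > maxDepth) maxDepth = depth;
        if (lo >= hi) return;
        swap(a, lo + (hi - lo) / 2, hi);   // move pivot to the end
        T pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (cmp.compare(a[j], pivot) < 0) {
                swap(a, i, j);
                i++;
            }
        }
        swap(a, i, hi);                    // pivot into final position
        quickSort(a, lo, i - 1, cmp, depth + 1);
        quickSort(a, i + 1, hi, cmp, depth + 1);
    }

    static <T> void swap(T[] a, int i, int j) {
        T t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Builds n items {key, id} with an identical key and returns the
    // maximum recursion depth seen while sorting them with cmp.
    static int depthFor(Comparator<int[]> cmp, int n) {
        int[][] items = new int[n][];
        for (int k = 0; k < n; k++) items[k] = new int[] {7, k};
        maxDepth = 0;
        quickSort(items, 0, n - 1, cmp, 1);
        return maxDepth;
    }
}
```

In this toy version, equal keys make every partition maximally lopsided, so the depth grows roughly linearly with the number of items, while the tie-breaker keeps the partitions balanced for this input. Whether the same mechanism is at work in the thread's stack trace is not something the snippet above establishes.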

Re: SorterTemplate.quickSort causes StackOverflowError

2011-04-29 Thread Otis Gospodnetic
Hi, Yeah, that's what we were going to do, but instead we did: * changed MemoryIndex to use ArrayUtil.mergeSort * ran the app and did a thread dump that shows that SorterTemplate.quickSort in deep recursion again! * looked for other places where this call is made - found it in MultiPhraseQuery$Mu

RE: Lucene 3.0.3 with debug information

2011-04-29 Thread Steven A Rowe
Hi Paul, What did you find about Luke that's buggy? Bug reports are very useful; please contribute in this way. The official Lucene 3.0.3 distribution jars were compiled using the -g cmdline argument to javac - by default, though, only line number and source file information is generated. If

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Dawid Weiss
> lucene/Search that is taking the time, I also had another attempt using > luke > > but find it incredibly buggy and of little use > Can you expand on this too? What kind of "incredible bugs" did you see? Without feedback there is little progress, so bug reports count. Dawid

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Simon Willnauer
Hey paul, you can simply checkout the tag or download the sources right? http://svn.apache.org/repos/asf/lucene/java/tags/lucene_3_0_3/ or http://ftp.download-by.net/apache//lucene/java/3.0.3/ simon On Fri, Apr 29, 2011 at 1:09 PM, Paul Taylor wrote: > Is there a built debug version of lucene 3

ComplexPhraseQueryParser with multiple fields

2011-04-29 Thread Chris Salem
Hi, I've just started using the ComplexPhraseQueryParser and it works great with one field but is there a way for it to work with multiple fields? For example, right now the query: job_title: "sales man*" AND NOT contact_name: "Chris Salem" throws this exception Caused by: org.apache.lucene.que

Re: SorterTemplate.quickSort causes StackOverflowError

2011-04-29 Thread Dawid Weiss
Don't know if this helps, but debugging stuff like this I simply add a (manually inserted or aspectj-injected) recursion count, add a breakpoint inside an if checking for recursion count >> X and run the vm with an attached socket debugger. This lets you run at (nearly) full speed and once you hit
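
A minimal sketch of the manually inserted recursion count described above, assuming a stand-in recursive method. RecursionGuardSketch, DEPTH_LIMIT and recurse are hypothetical names, and the static counter is not thread-safe.

```java
// Sketch of a hand-inserted recursion counter with a guard to break on.
class RecursionGuardSketch {
    static final int DEPTH_LIMIT = 1000; // the "X" threshold; pick to taste
    static int depth = 0;                // current recursion depth
    static int maxSeen = 0;              // deepest level observed so far
    static boolean limitHit = false;

    // Stand-in for the method under suspicion; recurses n times.
    static int recurse(int n) {
        depth++;
        try {
            if (depth > maxSeen) maxSeen = depth;
            if (depth > DEPTH_LIMIT) {
                limitHit = true;         // put the debugger breakpoint here
            }
            return n <= 0 ? 0 : 1 + recurse(n - 1);
        } finally {
            depth--;                     // unwind the counter on every exit path
        }
    }
}
```

Because the breakpoint sits inside the guard, the debugger only stops once the depth exceeds the limit, so the program otherwise runs close to normal speed. A JVM can be made to accept a socket debugger with an option such as -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 (the port is an arbitrary choice).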

Re: SorterTemplate.quickSort causes StackOverflowError

2011-04-29 Thread jm
maybe http://youdebug.kenai.com/ could be useful. If you are lucky you could get it to set a breakpoint when the recursive call has reached depth X. On Fri, Apr 29, 2011 at 1:40 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Hi, > > OK, so it looks like it's not MemoryIndex and its C

Re: Are Okapi BM25 scores normalized into 0 and 1 ?

2011-04-29 Thread Paul Libbrecht
Patrick, if the question is about the code snippet at the page you mention, which I copy below, I believe the answer is no and the author is aware of it since he is adding a comment about not-normalized in the second example. ScoreDocs and TopDocs are not returning normalized scores. Normalized
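
A minimal sketch of one common way to map raw scores into the range 0 to 1, by dividing each score by the largest one. This is an assumption for illustration, not something the thread or the BM25 library prescribes; ScoreNormalizationSketch and normalizeByMax are hypothetical names, and the input is a plain float array rather than Lucene's ScoreDoc objects.

```java
// Sketch: scale non-negative raw scores so the top score becomes 1.0.
class ScoreNormalizationSketch {
    static float[] normalizeByMax(float[] scores) {
        float max = 0f;
        for (float s : scores) {
            max = Math.max(max, s);
        }
        float[] out = new float[scores.length];
        if (max <= 0f) {
            return out;                  // nothing to scale; all zeros
        }
        for (int i = 0; i < scores.length; i++) {
            out[i] = scores[i] / max;    // relative to the best hit
        }
        return out;
    }
}
```

Scores scaled this way are only comparable within a single result list, since the divisor changes from query to query.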

Re: SorterTemplate.quickSort causes StackOverflowError

2011-04-29 Thread Otis Gospodnetic
Hi, OK, so it looks like it's not MemoryIndex and its Comparator that are funky. After switching from quickSort call in MemoryIndex to mergeSort, the problem persists: '1205215856@qtp-684754483-7' Id=18, RUNNABLE on lock=, total cpu time=497060.ms user time=495210.ms at org.apache.luc

Re: Are Okapi BM25 scores normalized into 0 and 1 ?

2011-04-29 Thread Patrick Diviacco
Can anybody provide me some information about it? Even a small clue, I'm kinda stuck on this and the owner of the libraries does not answer emails. Thanks On 28 April 2011 13:49, Patrick Diviacco wrote: > Is Okapi BM25 (its implementation in Lucene: > nlp.uned.es/~jperezi/Lucene-BM25) returning

Lucene 3.0.3 with debug information

2011-04-29 Thread Paul Taylor
Is there a built debug version of lucene 3.0.3 so I can profile it properly to find what part of the search is taking the time? Note: I've already profiled my application and determined that it is the lucene/Search that is taking the time, I also had another attempt using luke but find it incred

Re: document with parent-child relationship

2011-04-29 Thread harsh srivastava
Hi, You can create three fields for a document to index, e.g.
Fields   => parent_id  parent_text   child_text
Contents => 1          low pressure  engine wheel, etc
            2          Electronics   laptop pc ...

document with parent-child relationship

2011-04-29 Thread svonec
Hello, I need advice on how to create a document that has a parent-child relationship. Here is an example: "low pressure" -> "engine" -> "wheel" -> "low pressure" string is the parent and "engine" and "wheel" are children. I'd like to be able to