Re: OpenRelevance

2009-10-16 Thread Chris Hostetter
: Sorry, Open Relevance Project More specifically... http://lucene.apache.org/openrelevance/ Omar: I've updated the old wiki page you found to make it clear that the proposal has moved forward and is already a real sub-project... : > > > I would like to know if people are interested in t

Re: IndexWriter optimize() deadlock

2009-10-16 Thread Christopher Tignor
The doWait() call is synchronized on IndexWriter, but it is also, as you suggest, in a loop in a block synchronized on IndexWriter. The doWait() call returns immediately, still holding the IndexWriter lock from the loop in the synchronized block as my stack trace shows, without blocking and giving t

Re: IndexWriter optimize() deadlock

2009-10-16 Thread Michael McCandless
I'm glad you worked around it! But I don't fully understand the issue. That doWait is inside a sync(writer) block... if the Future manages to interrupt it, then that thread will release the lock when it exits that sync block. Actually, if the thread was indeed interrupted, you may be hitting thi
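
One way a timed wait can "return immediately" is general Java behavior: Object.wait() throws InterruptedException at once if the calling thread's interrupt flag is already set. The sketch below illustrates only that behavior. It is not a copy of Lucene's doWait(); the class and method names are hypothetical, and the catch block that restores the flag is an assumption made for the demonstration.

```java
final class InterruptedWaitDemo {
    private final Object lock = new Object();

    /**
     * Hypothetical wait helper: waits up to timeoutMs on a private monitor.
     * Returns true if the wait was cut short by an interrupt.
     */
    boolean waitRestoringInterrupt(long timeoutMs) {
        synchronized (lock) {
            try {
                lock.wait(timeoutMs);
                return false; // timed out (or was notified)
            } catch (InterruptedException e) {
                // Restoring the flag is a common idiom, but it means the
                // next wait() on this thread will also throw immediately.
                Thread.currentThread().interrupt();
                return true;
            }
        }
    }

    /** Sets the interrupt flag once, then counts how many waits return early. */
    static int countImmediateReturns(int attempts) {
        InterruptedWaitDemo demo = new InterruptedWaitDemo();
        Thread.currentThread().interrupt();
        int early = 0;
        for (int i = 0; i < attempts; i++) {
            if (demo.waitRestoringInterrupt(1000)) {
                early++;
            }
        }
        Thread.interrupted(); // clear the flag so the caller is unaffected
        return early;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int early = countImmediateReturns(5);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(early + " of 5 waits returned early, elapsed ms: " + elapsedMs);
    }
}
```

A loop built around such a helper never actually pauses once the flag is set, so a thread that expects the pause to let other threads make progress can end up spinning instead.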

Re: IndexWriter optimize() deadlock

2009-10-16 Thread Christopher Tignor
I discovered the problem and fixed its effect on my code: Using the source for Lucene version 2.4.1, in IndexWriter.optimize() there is a call to doWait() on line 2283; this method attempts to wait for a second in order to give other threads it has spawned a chance to acquire its mutex and comple

Re: IndexWriter optimize() deadlock

2009-10-16 Thread Christopher Tignor
Indeed, it looks like the thread running MergeThread, started (after passing off to ConcurrentMergeScheduler) by the thread calling IndexWriter.optimize(), is waiting on the mutex for the IndexWriter to be free so it can use the object to call mergeInit(). The IndexWriter, however, has entered a

RE: Filter for searching in result lists with 2.9

2009-10-16 Thread Uwe Schindler
Hi Christian, Why not create the filter using QueryWrapperFilter using the previous Query? Or simply combine the previous with the new query using a BooleanQuery with Occur.MUST? Even with the new API it is not possible what you want to do. The IndexReader passed to getDocIdSet/bits is not the sa
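
The suggestion to combine the previous query with the new one using Occur.MUST amounts to a conjunction: a document must match both searches. The sketch below models that idea with plain java.util.BitSet rather than Lucene classes, so it runs without Lucene on the classpath. The class name, method names, and document ids are invented for illustration.

```java
import java.util.BitSet;

final class SearchWithinResults {
    /** Builds a bit set with one bit set per matching document id. */
    static BitSet toBitSet(int... docIds) {
        BitSet bits = new BitSet();
        for (int id : docIds) {
            bits.set(id);
        }
        return bits;
    }

    /** Keeps only the documents that matched both the previous and the new search. */
    static BitSet refine(BitSet previousHits, BitSet newHits) {
        BitSet result = (BitSet) previousHits.clone(); // leave the inputs untouched
        result.and(newHits);
        return result;
    }

    public static void main(String[] args) {
        BitSet previous = toBitSet(1, 3, 5, 8);
        BitSet current = toBitSet(3, 4, 8, 9);
        System.out.println(refine(previous, current)); // prints {3, 8}
    }
}
```

Expressing the restriction as a query conjunction, instead of caching document ids from an earlier TopDocs, avoids depending on document ids staying valid between searches.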

Filter for searching in result lists with 2.9

2009-10-16 Thread Christian Reuschling
Hi guys, in our app we offer the possibility to search inside a set of documents, which is the result list of a former search. Thus, someone can narrow down a search according to different criteria. For this, we implemented a simple Filter that gets a TopDocs object and creates a bitSet out

Re: OpenRelevance

2009-10-16 Thread Grant Ingersoll
Sorry, Open Relevance Project On Oct 16, 2009, at 12:44 PM, Omar Alonso wrote: What's ORP? --- On Fri, 10/16/09, Grant Ingersoll wrote: From: Grant Ingersoll Subject: Re: OpenRelevance To: java-user@lucene.apache.org Date: Friday, October 16, 2009, 3:32 AM I definitely am, but am lacking i

Re: IndexWriter optimize() deadlock

2009-10-16 Thread Christopher Tignor
After tracing through the Lucene source more, it seems that what is happening is this: after I call Future.cancel(true) on my parent thread, optimize() is called and this method launches its own thread using a ConcurrentMergeScheduler$MergeThread to do the actual merging. When this Thread comes around t

Re: IndexWriter optimize() deadlock

2009-10-16 Thread Michael McCandless
But if the Future.cancel call turns out to be a no-op (simply waits instead of interrupting the thread), how could it be that the deadlock only happens when you call it? Weird. Are you really sure it's not actually calling Thread.interrupt? That stack trace looks like a normal "optimize is waiti

Re: IndexWriter optimize() deadlock

2009-10-16 Thread Christopher Tignor
It doesn't look like my Future.cancel(true) is actually interrupting the thread. It only does so "if necessary" and in this case seems to be letting the Thread finish gracefully without need for interruption. The stack trace leading up to the hanging IndexWriter.optimize() method is below, though

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Mark Miller
Mark Miller wrote: > Michael Busch wrote: > >> Why will just saying once again "Hey, let's just release more often" >> work now if it hasn't in the last two years? >> >> Mich >> > > I don't know that we need to release more often to take advantage of > major numbers. 2.2 was released in 0

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Mark Miller
Michael Busch wrote: > Why will just saying once again "Hey, let's just release more often" > work now if it hasn't in the last two years? > > Mich I don't know that we need to release more often to take advantage of major numbers. 2.2 was released in 07 - we could have just released 2.9 right a

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Michael Busch
On 10/16/09 10:27 AM, Steven A Rowe wrote: On 10/16/2009 at 2:58 AM, Michael Busch wrote: B) best effort drop-in back compatibility for the next minor version number only, and deprecations may be removed after one minor release (e.g. v3.3 will be compat with v3.2, but not v3.4) This i

Re: IndexWriter optimize() deadlock

2009-10-16 Thread Michael McCandless
My guess is it's the invocation of Thread.interrupt (which Future.cancel(true) calls if the task is running) that led to the deadlock. Is it possible to get the stack trace of the thrown exception when the thread was interrupted? Maybe indeed something in IW isn't cleaning up its state on being
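
The claim that Future.cancel(true) interrupts a task that is already running can be checked with the standard library alone. The sketch below is a minimal demonstration, assuming Java 8 or later for the lambda syntax; the class and method names are hypothetical, and the sleeping task merely stands in for long-running work such as an optimize() call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class CancelInterruptDemo {
    /** Returns true if the running task observed an interrupt after cancel(true). */
    static boolean cancelInterruptsRunningTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        try {
            Future<?> task = pool.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(60_000); // stand-in for long-running work
                } catch (InterruptedException e) {
                    sawInterrupt.set(true);
                } finally {
                    finished.countDown();
                }
            });
            started.await();                      // make sure the task is running
            task.cancel(true);                    // mayInterruptIfRunning = true
            finished.await(5, TimeUnit.SECONDS);  // give the task time to react
            return sawInterrupt.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("task was interrupted: " + cancelInterruptsRunningTask());
    }
}
```

If the task has already completed when cancel(true) is called, no interrupt is delivered, which is one reason the interruption can appear to happen only some of the time.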

Re: IndexWriter optimize() deadlock

2009-10-16 Thread Christopher Tignor
Thanks for getting back. I do not lock on the IndexWriter object itself but all methods in my consumer class that use IndexWriter are synchronized (locking my singleton consumer object itself). The thread is waiting at IndexWriter.doWait(). What might cause this? thanks - C>T> On Fri, Oct 16,

Re: Difference between 2.4.1 and 2.9.0 (possible regression?)

2009-10-16 Thread stefcl
Apologies, my previous message crossed yours. Good to hear that it's not intended behavior, I was worried. thanks for the fix! Kind regards stefcl wrote: > > Thanks, > Even if you add to the example a document called "giga", I'm not sure that > searching "giga~0.8" would return anything.

Re: Difference between 2.4.1 and 2.9.0 (possible regression?)

2009-10-16 Thread Mark Miller
It was a bug and Mike fixed it. The bug was that exact matches were not being returned, as you state. Will be fixed in 2.9.1. stefcl wrote: > Thanks, > Even if you add to the example a document called "giga", I'm not sure that > searching "giga~0.8" would return anything. > > It seems a bit weir

Re: Difference between 2.4.1 and 2.9.0 (possible regression?)

2009-10-16 Thread stefcl
Thanks, Even if you add to the example a document called "giga", I'm not sure that searching "giga~0.8" would return anything. It seems a bit weird because an exact search (which I guess should be more or less equivalent to a fuzzy search with nearly ~1 similarity) would actually return some re

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Mark Miller
Steven A Rowe wrote: > On 10/16/2009 at 2:58 AM, Michael Busch wrote: > >> B) best effort drop-in back compatibility for the next minor version >> number only, and deprecations may be removed after one minor release >> (e.g. v3.3 will be compat with v3.2, but not v3.4) >> > > This is only t

RE: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Steven A Rowe
On 10/16/2009 at 2:58 AM, Michael Busch wrote: > B) best effort drop-in back compatibility for the next minor version > number only, and deprecations may be removed after one minor release > (e.g. v3.3 will be compat with v3.2, but not v3.4) This is only true on a per-feature basis. For example,

RE: IndexWriter optimize() deadlock

2009-10-16 Thread Uwe Schindler
Do you use the IndexWriter as mutex in a synchronized() block? This is not supported and may hang. Never lock on IndexWriter instances. IndexWriter itself is thread safe. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Messag
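
The advice not to use a shared, internally synchronized object as your own mutex can be sketched with a private lock object. The example below is an illustration under stated assumptions: SharedWriter is a hypothetical stand-in for a thread-safe component and is not Lucene's IndexWriter, and all other names are invented as well.

```java
import java.util.ArrayList;
import java.util.List;

final class PrivateLockConsumer {
    /** Hypothetical stand-in for a thread-safe shared component; NOT Lucene's IndexWriter. */
    static final class SharedWriter {
        private final List<String> docs = new ArrayList<>();

        synchronized void add(String doc) {
            docs.add(doc);
        }

        synchronized int size() {
            return docs.size();
        }
    }

    private final SharedWriter writer = new SharedWriter();
    private final Object lock = new Object(); // guards `pending` only
    private int pending = 0;

    void submit(String doc) {
        synchronized (lock) {
            pending++;       // our own state, guarded by our own monitor
        }
        writer.add(doc);     // rely on the writer's own thread safety; no synchronized (writer)
        synchronized (lock) {
            pending--;
        }
    }

    int written() {
        return writer.size();
    }

    /** Runs several producer threads and returns the number of documents written. */
    static int runDemo(int threads, int docsPerThread) {
        PrivateLockConsumer consumer = new PrivateLockConsumer();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    consumer.submit("doc-" + id + "-" + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return consumer.written();
    }

    public static void main(String[] args) {
        System.out.println("documents written: " + runDemo(4, 250));
    }
}
```

Keeping the application's lock separate from the shared object's monitor means the application can never hold a monitor that the shared object's internal threads are waiting for.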

Re: Difference between 2.4.1 and 2.9.0 (possible regression?)

2009-10-16 Thread Michael McCandless
OK I've committed the fix on the 2.9.x branch, so it'll be included in the 2.9.1 release. Thanks for raising this! Mike On Fri, Oct 16, 2009 at 12:02 PM, Michael McCandless wrote: > This looks to have been caused by: > >    http://issues.apache.org/jira/browse/LUCENE-1124 > > Which short circui

IndexWriter optimize() deadlock

2009-10-16 Thread Christopher Tignor
Hello, I am trying to track down the cause of my code hanging on calling IndexWriter.optimize() at its doWait() method. It appears, thus, that it is waiting on other merges to happen, which is a bit confusing to me: My application is a simple producer-consumer model where documents are added to a q

Re: OpenRelevance

2009-10-16 Thread Omar Alonso
What's ORP? --- On Fri, 10/16/09, Grant Ingersoll wrote: > From: Grant Ingersoll > Subject: Re: OpenRelevance > To: java-user@lucene.apache.org > Date: Friday, October 16, 2009, 3:32 AM > I definitely am, but am lacking in > time at the moment.  A good place to ask is over on the > ORP mailing

Re: NPE in NearSpansUnordered

2009-10-16 Thread Peter Keegan
I can reproduce this with a unit test - will post to JIRA shortly. Peter On Fri, Oct 16, 2009 at 8:06 AM, Peter Keegan wrote: > next() is called in PayloadNearQuery->setFreqCurrentDoc: > super.setFreqCurrentDoc(); > But, I think it should be called before 'getPayloads'. That doesn't fix the > NPE

Re: Difference between 2.4.1 and 2.9.0 (possible regression?)

2009-10-16 Thread Michael McCandless
This looks to have been caused by: http://issues.apache.org/jira/browse/LUCENE-1124 Which short circuits all matching if the term is too short relative to the min similarity. But I guess something must be wrong w/ the formula. I'll reopen that issue & mark fix for 2.9.1. Mike On Fri, Oct

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Yonik Seeley
On Fri, Oct 16, 2009 at 4:54 AM, Jukka Zitting wrote: > Hi, > > On Fri, Oct 16, 2009 at 10:23 AM, Danil ŢORIN wrote: >> What about creating major version more often? > > +1 We're not going to run out of version numbers, so I don't see a > reason not to upgrade the major version number when making

Difference between 2.4.1 and 2.9.0 (possible regression?)

2009-10-16 Thread stefcl
Hello, We are currently migrating from 2.4.1 to 2.9.0. We've noticed some changes in the results of fuzzy queries. We have made this small test case: StandardAnalyzer analyzer = new StandardAnalyzer(); Directory index = new RAMDirectory(); IndexWriter w = new IndexWriter(index, ana

Re: NPE in NearSpansUnordered

2009-10-16 Thread Peter Keegan
next() is called in PayloadNearQuery->setFreqCurrentDoc: super.setFreqCurrentDoc(); But, I think it should be called before 'getPayloads'. That doesn't fix the NPE, though. The empty PQ occurs in the outermost span in the query, and seems to fail on the last document it scores. Prior to 2.9, I'd be

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Erdinc Yilmazel
I'd go with B. I never do drop-in replacement of a jar even if it is a minor release for any library. I always recompile. I think the major version number shouldn't be changed unless there are lots of API changes or changes in the index format. On Fri, Oct 16, 2009 at 12:38 PM, Mark Miller wrote

RE: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Uwe Schindler
> > So please tell us which you prefer as a back compatibility policy for > > Lucene: > > I don't do drop in but recompile anyway, so it doesn't matter for me. > It is only important that the documentation is clear about what has to > be done. > > > B) best effort drop-in back compatibility for t

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Mark Miller
Jukka Zitting wrote: > Hi, > > On Fri, Oct 16, 2009 at 10:23 AM, Danil ŢORIN wrote: > >> What about creating major version more often? >> > > +1 We're not going to run out of version numbers, so I don't see a > reason not to upgrade the major version number when making > backwards-incompat

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Stefan Trcek
On Friday 16 October 2009 08:57:37 Michael Busch wrote: > > So please tell us which you prefer as a back compatibility policy for > Lucene: I don't do drop in but recompile anyway, so it doesn't matter for me. It is only important that the documentation is clear about what has to be done. > B) b

Re: NPE in NearSpansUnordered

2009-10-16 Thread Grant Ingersoll
And, you don't get this on 2.4.1? Are you sure you've called next()? Is it by chance on the first document it tries to score that it fails? -Grant On Oct 15, 2009, at 1:28 PM, Peter Keegan wrote: The query is: +payloadNear([spanNear([contents:insurance, contents:agent], 1, false),

Re: is Lucene 3.0 coming soon?

2009-10-16 Thread Ivan Vasilev
OK, thanks guys! Grant Ingersoll wrote: On Oct 16, 2009, at 6:05 AM, Uwe Schindler wrote: I would recommend adapting your app to 2.9 and enabling deprecation warnings. As soon as all deprecation warnings disappear during compile, you are able to just go to 3.0 (just drop in jars when available)

Re: OpenRelevance

2009-10-16 Thread Paul Libbrecht
Not something for the near future, but I'd be interested in building on such an infrastructure for a mathematical-formulæ search corpus (both semantic and presentation math). I believe the OpenRelevance infrastructure might offer a best practice or foundation for such a corpus.

Re: which version

2009-10-16 Thread R.A.Ittoo
Thanks. I had an old Lucene (bundled in another application called GATE). I can now use the new 2.9 version. ashwin On Fri, 16 Oct 2009 12:09:10 +0200 "Uwe Schindler" wrote: Are you sure that there is no older Lucene version somewhere in your classpath? Such problems are mostly caused by this

Re: OpenRelevance

2009-10-16 Thread Grant Ingersoll
I definitely am, but am lacking in time at the moment. A good place to ask is over on the ORP mailing lists, maybe that will help kickstart things over there. -Grant On Oct 15, 2009, at 3:32 PM, Omar Alonso wrote: Hi folks, I would like to know if people are interested in the OpenRelevan

Re: is Lucene 3.0 coming soon?

2009-10-16 Thread Grant Ingersoll
On Oct 16, 2009, at 6:05 AM, Uwe Schindler wrote: I would recommend adapting your app to 2.9 and enabling deprecation warnings. As soon as all deprecation warnings disappear during compile, you are able to just go to 3.0 (just drop in jars when available). This is why we have 2.9. 2.9: it is

Re: which version

2009-10-16 Thread Ian Lea
http://www.mail-archive.com/java-user@lucene.apache.org/msg27095.html shows a way to check what version you actually are using. -- Ian. On Fri, Oct 16, 2009 at 11:09 AM, Uwe Schindler wrote: > Are you sure that there is no older Lucene version somewhere in your > classpath? Such problems are m
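
A generic way to see which jar a class was actually loaded from uses only the standard library. This is a sketch and not necessarily the approach described in the linked message; the class name WhichJar and its output format are invented for illustration.

```java
import java.security.CodeSource;

final class WhichJar {
    /** Describes the manifest version and load location of the given class. */
    static String describe(Class<?> type) {
        Package pkg = type.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();

        // Classes loaded by the bootstrap loader have no code source.
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "(bootstrap or unknown)"
                : source.getLocation().toString();

        return type.getName()
                + " version=" + (version == null ? "(not in manifest)" : version)
                + " from=" + location;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, e.g. org.apache.lucene.index.IndexWriter,
        // with the application's real classpath in effect.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(describe(Class.forName(name)));
    }
}
```

If the printed location points at a jar bundled with another application, that older copy is shadowing the version you meant to use.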

RE: which version

2009-10-16 Thread Uwe Schindler
Are you sure that there is no older Lucene version somewhere in your classpath? Such problems are mostly caused by this. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: R.A.Ittoo [mailto:r.a.it...@rug.n

RE: is Lucene 3.0 coming soon?

2009-10-16 Thread Uwe Schindler
I would recommend adapting your app to 2.9 and enabling deprecation warnings. As soon as all deprecation warnings disappear during compile, you are able to just go to 3.0 (just drop in jars when available). This is why we have 2.9. 2.9: it is just 3.0 with the deprecations not yet removed. No other ch

is Lucene 3.0 coming soon?

2009-10-16 Thread Ivan Vasilev
Hi Lucene Guys, I would like to know your planned date for releasing Lucene 3.0. I am asking because, looking at the changes in Lucene 2.9 (especially changes in backward compatibility), I guess that it will be difficult for us to adapt our app to Lucene 2.9. I see in your Jira there are not many

which version

2009-10-16 Thread R.A.Ittoo
Hi, I am using Lucene version 2.9. When calling the StandardAnalyzer constructor with Version.LUCENE_VERSION as parameter, I get the error "symbol not found constructor StandardAnalyzer(org.apache.lucene.util.Version)". This is strange, as it is supposed to be correct according to the API doc an

Re: PrefixQueries on large indexes (4M+ Documents) using a partial Query partial Filter solution

2009-10-16 Thread Michael McCandless
Super! Mike On Fri, Oct 16, 2009 at 4:06 AM, Shaun Senecal wrote: > Thanks Mike.  The queries are now running faster than they ever were before, > and are returning the expected results! > > > On Fri, Oct 16, 2009 at 7:39 AM, Shaun Senecal wrote: > >> Ah!  I thought that the ConstantScoreQuery w

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Jukka Zitting
Hi, On Fri, Oct 16, 2009 at 10:23 AM, Danil ŢORIN wrote: > What about creating major version more often? +1 We're not going to run out of version numbers, so I don't see a reason not to upgrade the major version number when making backwards-incompatible changes. BR, Jukka Zitting

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Christian Reuschling
Hello Michael, I also would prefer B - it also shortens the time to have a benefit of new Lucene features in our applications. It forces our lazy programmers (I am of course ;) ) to deal with them - and reduces the effort to change to a major release afterwards. Maybe some minimum time waiting bef

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread gabriele renzi
On Fri, Oct 16, 2009 at 9:39 AM, Paul Elschot wrote: > I'd prefer B), with a minimum period of about two months to the > next release in case it removes deprecations. For what my vote counts, seconded

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Danil ŢORIN
I'd vote A with following addition: What about creating major version more often? If there are incremental improvements which don't clutter the code too much continue with 3.0 -> 3.1 -> 3.2 -> .. -> 3.X Once there are significant changes which are hard to maintain backward compatible start a 4.0

Re: PrefixQueries on large indexes (4M+ Documents) using a partial Query partial Filter solution

2009-10-16 Thread Shaun Senecal
Thanks Mike. The queries are now running faster than they ever were before, and are returning the expected results! On Fri, Oct 16, 2009 at 7:39 AM, Shaun Senecal wrote: > Ah! I thought that the ConstantScoreQuery would also be rewritten into a > BooleanQuery, resulting in the same exception.

Re: Proposal for changing Lucene's backwards-compatibility policy

2009-10-16 Thread Paul Elschot
On Friday 16 October 2009 08:57:37 Michael Busch wrote: > Hello Lucene users: > > In the past we have discussed our backwards-compatibility policy > frequently on the Lucene developer mailinglist and we are thinking about > making some significant changes. In this mail I'd like to outline the > pr