RE: arguments in favour of lucene over commercial competition

2010-06-24 Thread Itamar Syn-Hershko
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> Sent: Friday, June 25, 2010 1:09 AM
> To: java-user@lucene.apache.org
> Subject: Re: arguments in favour of lucene over commercial competition
>
> And I was just thinking the other day how it would be cool

Re: arguments in favour of lucene over commercial competition

2010-06-24 Thread Otis Gospodnetic
And I was just thinking the other day how it would be cool to take, say, Lucene 1.4, then some 2.* version and now the latest 3.* version and compare. :) Want to do it and share? I don't think anyone has done this before. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene

Re: Problems with homebrew ParallelWriter

2010-06-24 Thread Justin
Never mind, it is blocking...

  public void optimize() throws CorruptIndexException, IOException {
    optimize(true);
  }

----- Original Message -----
From: Justin
To: java-user@lucene.apache.org
Sent: Thu, June 24, 2010 3:56:17 PM
Subject: Re: Problems with homebrew ParallelWriter

So is In

Re: Problems with homebrew ParallelWriter

2010-06-24 Thread Justin
So is IndexWriter::optimize() non-blocking, even with SerialMergeScheduler? That might explain our problem in trying to use optimize() to make maxDoc() match between the two indexes before adding readers to ParallelReader. I see that we could call optimize(true). - Original Message ---

Re: Could multiple indexers change same collections at the same time?

2010-06-24 Thread Yonik Seeley
Yes, all of that still applies to Lucene 3x and 4x, and is unlikely to change any time soon. -Yonik http://www.lucidimagination.com On Thu, Jun 24, 2010 at 1:51 PM, Zhang, Lisheng wrote: > Hi, > > I remembered I tested earlier lucene 1.4 and 2.4, and found the following: > > # it is OK for multi

Could multiple indexers change same collections at the same time?

2010-06-24 Thread Zhang, Lisheng
Hi,

I remember testing Lucene 1.4 and 2.4 earlier and finding the following:

# It is OK for multiple searchers to search the same collection.
# It is OK for one IndexWriter to edit while multiple searchers search at the same time.
# It is generally NOT OK for multiple IndexWriters to

RE: URL Tokenization

2010-06-24 Thread Steven A Rowe
Hi Sudha, Sorry, I should have mentioned that the existing patch is intended for use only against the trunk version (i.e., version 4.0-dev). Instructions for checking out a working copy from Subversion are here: http://wiki.apache.org/lucene-java/HowToContribute Once you've done that, chang

Re: Problems with homebrew ParallelWriter

2010-06-24 Thread Justin
Hi Mike, We did use IndexWriter::setInfoStream. Apparently there is a lot to sift through. I'll let you know if we make any discoveries useful for others. Thanks! Justin - Original Message From: Michael McCandless To: java-user@lucene.apache.org Sent: Thu, June 24, 2010 4:04:47 A

Re: Problems with homebrew ParallelWriter

2010-06-24 Thread Justin
Hi Shai,

> Is it synchronized

  public synchronized void addDocument(Document document)
      throws CorruptIndexException, IOException {
    Document document2 = new Document();
    document2.add(...);
    writer1.addDocument(document);
    writer2.addDocument(document2);
  }

> did you encounter

Re: URL Tokenization

2010-06-24 Thread Sudha Verma
Hi Steve, Thanks for the quick reply and implementing support for URL tokenization. Another newbie question about applying this patch. I have the Lucene 3.0.2 source and I downloaded the patch and tried to apply it: lucene-3.0.2> patch -p0 < LUCENE-2167.patch Comes back with the error message:

Roadmap for next major release

2010-06-24 Thread Ganesh
Hello all, What is the roadmap for the next major release? A few days back, many people posted their expectations and ideas for the next release. What is the plan, and what can we expect from the next release? Regards, Ganesh

Re: Problems with homebrew ParallelWriter

2010-06-24 Thread Michael McCandless
I agree w/ Shai -- from your description it looks like your docs should be in sync (assuming no exceptions, and a serial doc/del stream going in). If you turn on infoStream for all the writers & post the results, we can look for where they diverge... Mike On Wed, Jun 23, 2010 at 11:48 PM, Shai E

Re: Overriding Lucene's term weights computation

2010-06-24 Thread Naama Kraus
OK, got it. Thanks Yuval. Naama On Thu, Jun 24, 2010 at 10:44 AM, Yuval Feinstein wrote: > Naama, > AFAIK, payloads store an arbitrary byte array per position > (see > > http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/ > and > > http://www.lucidimagination.com/blog/

Re: Overriding Lucene's term weights computation

2010-06-24 Thread Naama Kraus
Great ! Thanks Dionisis. Naama On Thu, Jun 24, 2010 at 10:27 AM, Dionisis Koumouras wrote: > Naama, > I recently faced a similar problem. Overriding the way Lucene uses > TermVectors seemed quite complex for me. I used the payloads mechanism > instead so that I could store a float payload with ea

RE: Help with Numeric Range

2010-06-24 Thread Uwe Schindler
Hi Todd, I found the bug(s) in your Lucene-only RAMDir test: - Move the reader=writer.getReader() after the writer.commit(), else you see an empty index from the reader (the IR is only a snapshot of the IW at the time it was retrieved. After the commit, you have to reopen the rea

Re: arguments in favour of lucene over commercial competition

2010-06-24 Thread jm
I want to add some perf numbers too, to show how it has improved in recent versions (not that it was bad before). Does anyone have a link to a nice page with numbers/graphs? On Thu, Jun 24, 2010 at 7:43 AM, Otis Gospodnetic wrote: > Coincidentally, just after I replied to this thread I received

RE: Overriding Lucene's term weights computation

2010-06-24 Thread Yuval Feinstein
Naama, AFAIK, payloads store an arbitrary byte array per position (see http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/ and http://www.lucidimagination.com/blog/2010/04/18/refresh-getting-started-with-payloads/) You define what you put in the payload during indexing,

Re: Overriding Lucene's term weights computation

2010-06-24 Thread Dionisis Koumouras
Naama, I recently faced a similar problem. Overriding the way Lucene uses TermVectors seemed quite complex for me. I used the payloads mechanism instead so that I could store a float payload with each word. Then, I overloaded the similarity class to change the way results are scored, based on the p