RE: Help needed figuring out reason for maxClauseCount is set to 1024 error

2009-10-07 Thread Uwe Schindler
The precision depends on how you convert your datetime to an integer value. For NumericRangeQuery the precision doesn't really matter; only the index gets bigger. If you want day resolution, just divide Date.getTime() by e.g. 86400000L (the number of milliseconds in a day). You have to do the same conversion on both
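A minimal sketch of the day-resolution conversion described above, in plain Java. The class and method names (DayResolution, toDay) are hypothetical and not part of Lucene; the sketch assumes non-negative timestamps and UTC day boundaries, and it only shows the arithmetic, not the Lucene indexing or query calls.

```java
import java.util.Date;

/** Hypothetical helper (not a Lucene class): reduces a timestamp to whole days since the epoch. */
final class DayResolution {
    /** Milliseconds in one day: 24 h * 60 min * 60 s * 1000 ms = 86400000. */
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    private DayResolution() {}

    /** Integer-divides epoch milliseconds down to a day number (assumes epochMillis >= 0). */
    static long toDay(long epochMillis) {
        return epochMillis / MILLIS_PER_DAY;
    }

    /** Same conversion for a java.util.Date. */
    static long toDay(Date date) {
        return toDay(date.getTime());
    }

    public static void main(String[] args) {
        // The same conversion is applied to the value stored at index time
        // and to the bound used at query time, so both land in the same day bucket.
        long indexed = toDay(new Date(1254873600000L));
        long queried = toDay(1254873600000L + 3600000L); // one hour later
        System.out.println(indexed + " " + queried);
    }
}
```

Using one shared helper for both the indexed value and the query bounds is one way to make sure the two sides cannot drift apart.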

Re: Help needed figuring out reason for maxClauseCount is set to 1024 error

2009-10-07 Thread Jake Mannix
When such precision is needed, this is a great idea. When it's far more than overkill (like when only days are necessary), is there anything to gain by doing this? -jake On Wed, Oct 7, 2009 at 10:17 PM, Uwe Schindler wrote: > I would propose to use NumericRangeQuery and NumericField supplie

RE: 2.9: TopScoreDocCollector

2009-10-07 Thread Uwe Schindler
The simplest is (and also recommended): Do not create such collectors for yourself. Just use the Query methods returning TopDocs. The developers of these classes strongly discourage direct use, as a lot of automatism is ignored then. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen

RE: Help needed figuring out reason for maxClauseCount is set to 1024 error

2009-10-07 Thread Uwe Schindler
I would propose to use NumericRangeQuery and NumericField supplied by Lucene 2.9. This has no such limitations. You can index your dates as numeric values (e.g. Date.getTime()) and query down to the milliseconds. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail:
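As a rough illustration of the "index dates as numeric values" step, the sketch below turns timestamp strings like the ones in this thread's range query (20090410184806) into Date.getTime() values. The yyyyMMddHHmmss pattern and the UTC time zone are assumptions inferred from the shape of those strings, and TimestampToLong is a hypothetical helper name; the Lucene NumericField and NumericRangeQuery calls themselves are not shown.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

/** Hypothetical helper (not a Lucene class): parses a compact timestamp into epoch milliseconds. */
final class TimestampToLong {
    private TimestampToLong() {}

    /**
     * Parses a string such as "20090410184806" (assumed pattern yyyyMMddHHmmss,
     * assumed UTC) and returns the same value Date.getTime() would give.
     */
    static long toEpochMillis(String timestamp) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmss");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false); // reject impossible dates instead of rolling them over
        try {
            return format.parse(timestamp).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyyMMddHHmmss timestamp: " + timestamp, e);
        }
    }

    public static void main(String[] args) {
        // The two bounds of the range query quoted in this thread, as long values.
        long lower = toEpochMillis("20090410184806");
        long upper = toEpochMillis("20091007184806");
        System.out.println(lower + " TO " + upper);
    }
}
```

The resulting long values are what would be stored at index time and used as range bounds at query time; whichever time zone is chosen has to be the same on both sides.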

Re: 2.9: TopScoreDocCollector

2009-10-07 Thread Jake Mannix
Hi Eric, Different Query classes have different options on whether they can score docs out of order, or if they always proceed in order, so the way to make sure you're choosing the right value, if you don't know which you need, is to ask your Query (or more appropriately, its Weight): Query

Search By Phrase Not Working

2009-10-07 Thread sadronmeldir
Hello all, I'm having some difficulty getting queries on phrases to work properly, and I can't figure out why. For example, a search for ("Heart of Fire") yields no results when it should be returning two. Below is a snippet of my code. I'm probably overlooking something trivial, but any help wo

2.9: TopScoreDocCollector

2009-10-07 Thread Angel, Eric
According to the documentation for 2.9, TopScoreDocCollector.create(numHits, boolean), the second parameter is whether documents are scored in order by the input - How do I choose? In other words, how would I know if the documents are scored in order or not? Eric

Re: Help needed figuring out reason for maxClauseCount is set to 1024 error

2009-10-07 Thread Jake Mannix
On Wed, Oct 7, 2009 at 4:42 PM, mitu2009 wrote: > > Hi, > > I've two sets of search indexes. TestIndex (used in our test environment) > and ProdIndex(used in PRODUCTION environment). Lucene search query: > +date:[20090410184806 TO 20091007184806] works fine for test index but > gives > this error

Re: Help needed figuring out reason for maxClauseCount is set to 1024 error

2009-10-07 Thread Adriano Crestani
Hi, Can you provide to us the exception stack trace? Thanks, Adriano Crestani On Wed, Oct 7, 2009 at 7:42 PM, mitu2009 wrote: > > Hi, > > I've two sets of search indexes. TestIndex (used in our test environment) > and ProdIndex(used in PRODUCTION environment). Lucene search query: > +date:[200

Help needed figuring out reason for maxClauseCount is set to 1024 error

2009-10-07 Thread mitu2009
Hi, I've two sets of search indexes. TestIndex (used in our test environment) and ProdIndex(used in PRODUCTION environment). Lucene search query: +date:[20090410184806 TO 20091007184806] works fine for test index but gives this error message for Prod index. "maxClauseCount is set to 1024" If I

Re: Best strategy for reindexing large amount of data

2009-10-07 Thread Jake Mannix
Hi Maarten, Five minutes is not tremendously frequent, and I imagine should be pretty fine, but again: it depends on how big your index is, how fast your grandfathering events are trickling in, how fast your new events are coming in, and how heavy your query load is. All of those factors ca

Re: Best strategy for reindexing large amount of data

2009-10-07 Thread Maarten_D
Hi Jake, Thanks for your answer. I hadn't realised that doing the updates in reverse chronological order actually plays well with the IO cache and the way Lucene writes its indices to disk. Good to hear. One question though, if you don't mind: you say that updating can work as long as I don't reo

Re: Index splitter

2009-10-07 Thread Jake Mannix
As long as you don't have to split up a fully optimized index, or one with the wrong number of segments for the division you want to do, that would be useful. Of course, sometimes you need to split up the big segments into smaller ones too, but the only way I've done that in the past is basically:

Index splitter

2009-10-07 Thread Jason Rutherglen
We have a way to merge indexes together with IW.addIndexes, however not the opposite, split up an index with multiple segments. I think I can simply manufacture a new segmentinfos in a new directory, copy over the segments files from those segments, delete the copied segments from the source, and

Reducing memory use!

2009-10-07 Thread Felipe Lobo
Hi, I read in some places that reusing the Document and Field instances reduces memory use when indexing. Is that true? But I looked into the Field class: it can only set the name, value and the other params in the constructor; there is no set method. If reusing them really reduces memory use, how

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-07 Thread Mark Miller
Solr just copies them into the same directory - Lucene files are write once, so it's not much different than what happens locally. Nigel wrote: > Right now we logically re-open an index by making an updated copy of the > index in a new directory (using rsync etc.), opening the new copy, and > closi

Re: Best strategy for reindexing large amount of data

2009-10-07 Thread Jake Mannix
I think a Hadoop cluster is maybe a bit overkill for this kind of thing - it's pretty common to have to do "grandfathering" of an index when you have new features, and just doing it in place with IndexWriter.update() can work just fine as long as you are not very frequently reopening your index. T

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-07 Thread Nigel
Right now we logically re-open an index by making an updated copy of the index in a new directory (using rsync etc.), opening the new copy, and closing the old one. We don't use IndexReader.reopen() because the updated index is in a different directory (as opposed to being updated in-place). (Rea

Re: Best strategy for reindexing large amount of data

2009-10-07 Thread Jason Rutherglen
Maarten, Depending on the hardware available you can use a Hadoop cluster to reindex more quickly. With Amazon EC2 one can spin up several nodes, reindex, then tear them down when they're no longer needed. Also you can simply update in place the existing documents in the index, though you'd need t

Best strategy for reindexing large amount of data

2009-10-07 Thread Maarten_D
Hi, I've searched the mailing lists and documentation for a clear answer to the following question, but haven't found one, so here goes: We use Lucene to index and search a constant stream of messages: our index is always growing. In the past, if we added new features to the software that required

Re: fa package

2009-10-07 Thread mastcheshmi
mastcheshmi wrote: > > > Simon Willnauer wrote: >> >> see contrib/analyzers/ >> >> http://lucene.apache.org/java/2_9_0/api/contrib-analyzers/org/apache/lucene/analysis/fa/PersianAnalyzer.html >> >> simon >> >> On Wed, Oct 7, 2009 at 10:17 AM, mastcheshmi >> wrote: >>> >>> I download lucen

RE: Document loading

2009-10-07 Thread Dragon Fly
Thanks. > Date: Wed, 7 Oct 2009 00:05:43 +0200 > Subject: Re: Document loading > From: simon.willna...@googlemail.com > To: java-user@lucene.apache.org > > Hi, > a call to IndexSearcher.doc(docId) will load the document. Internally > this call forwards to IndexReader.document(docId) which could

Re: fa package

2009-10-07 Thread Simon Willnauer
Download this: http://apache.autinity.de/lucene/java/lucene-2.9.0.zip; extract it into /some/path/lucene29; go to the directory: cd /some/path/lucene29/contrib/analyzers/common; copy the jar file: cp lucene-analyzers-2.9.0.jar /your/project/path; find the class :) If that does not help, I'm clueless. @robert: nice try :D

Re: fa package

2009-10-07 Thread mastcheshmi
mastcheshmi wrote: > > > Simon Willnauer wrote: >> >> see contrib/analyzers/ >> >> http://lucene.apache.org/java/2_9_0/api/contrib-analyzers/org/apache/lucene/analysis/fa/PersianAnalyzer.html >> >> simon >> >> On Wed, Oct 7, 2009 at 10:17 AM, mastcheshmi >> wrote: >>> >>> I download lucen

Re: fa package

2009-10-07 Thread Robert Muir
the fa package is in lucene-2.9.0.zip, in contrib\analyzers\common\lucene-analyzers-2.9.0.jar On Wed, Oct 7, 2009 at 6:39 AM, mastcheshmi wrote: > > where can I download fa package? > :confused: > -- > View this message in context: > http://www.nabble.com/fa-package-tp25782364p25783879.html > Se

Re: fa package

2009-10-07 Thread mastcheshmi
Where can I download the fa package? :confused: -- View this message in context: http://www.nabble.com/fa-package-tp25782364p25783879.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To unsubscribe, e-m

ApacheCon US

2009-10-07 Thread Grant Ingersoll
Just a friendly reminder to all about Lucene ecosystem events at ApacheCon US this year. We have two days of talks on pretty much every project under Lucene (see http://lucene.apache.org/#14+August+2009+-+Lucene+at+US+ApacheCon ) plus a meetup and a two day training on Lucene and a 1 day trai

Re: fa package

2009-10-07 Thread Simon Willnauer
If you download the Lucene distribution there should be a folder named contrib. This folder contains a lot of contrib directories. Go to contrib/analyzers/common and add the file lucene-analyzers-2.9.0.jar to your build path. This jar should contain the Persian analyzer. Again this is not part of the luc

Re: How to setup a scalable deployment?

2009-10-07 Thread Chris Were
Thanks for all the excellent replies. Lots of great software mentioned that I'd never heard of -- and I thought I'd Googled this subject to death already! Cheers, Chris.

Re: fa package

2009-10-07 Thread mastcheshmi
Simon Willnauer wrote: > > see contrib/analyzers/ > > http://lucene.apache.org/java/2_9_0/api/contrib-analyzers/org/apache/lucene/analysis/fa/PersianAnalyzer.html > > simon > > On Wed, Oct 7, 2009 at 10:17 AM, mastcheshmi > wrote: >> >> I downloaded Lucene 2.9. >> I didn't find the fa package. >>

Re: fa package

2009-10-07 Thread Simon Willnauer
see contrib/analyzers/ http://lucene.apache.org/java/2_9_0/api/contrib-analyzers/org/apache/lucene/analysis/fa/PersianAnalyzer.html simon On Wed, Oct 7, 2009 at 10:17 AM, mastcheshmi wrote: > > I downloaded Lucene 2.9. > I didn't find the fa package. > I want to use PersianAnalyzer. > What do I do? > --

fa package

2009-10-07 Thread mastcheshmi
I downloaded Lucene 2.9. I didn't find the fa package. I want to use PersianAnalyzer. What do I do?