Re: lucene-contrib maven artifact

2009-11-04 Thread Chris Hostetter
I don't use maven, but ... : When I add this dependency to my pom.xml, an error occurs. org.apache.lucene:lucene-contrib:jar:2.9.0 is missing. ... : I am trying to use org.apache.lucene.search.vectorhighlight.*; : : http://repo1.maven.org/maven2/org/apache/lucene/lucene-contrib/2.9.0/

Re: addIndexesNoOptimize on shards --> is docid deterministic and calculable?

2009-11-04 Thread Britske
Yeah I understand. Thanks anyway, it cleared my head a bit, Geert-Jan Erick Erickson wrote: > > You're right, my comment was irrelevant. Mostly, I try to make sure > that people aren't asking an "XY problem", that is, asking how > to do X when what they really want is Y. And most of the p

Re: addIndexesNoOptimize on shards --> is docid deterministic and calculable?

2009-11-04 Thread Erick Erickson
You're right, my comment was irrelevant. Mostly, I try to make sure that people aren't asking an "XY problem", that is, asking how to do X when what they really want is Y. And most of the posts I've seen wondering about doc IDs were exactly that, but yours clearly isn't. And I'm going to have

Re: addIndexesNoOptimize on shards --> is docid deterministic and calculable?

2009-11-04 Thread Britske
Please ignore the garbage at the end ;-) Britske wrote: > > This issue is related to post: "merging Parallel indexes (can > indexWriter.addIndexesNoOptimize be used?)" > > Among other things described in the post above, I'm experimenting with a > combination of sharding and vertical partitio

Re: addIndexesNoOptimize on shards --> is docid deterministic and calculable?

2009-11-04 Thread Britske
This issue is related to post: "merging Parallel indexes (can indexWriter.addIndexesNoOptimize be used?)" Among other things described in the post above, I'm experimenting with a combination of sharding and vertical partitioning which I feel will increase my indexing performance a lot, which at

RE: void touchFile() should return the boolean result of the setLastModified

2009-11-04 Thread Uwe Schindler
We discussed this method yesterday evening. The abstract base class defines the method as throwing an IOException. So the correct behaviour would be to throw an IOException if setLastModified returns false (which happens according to the docs, if the date cannot be changed because of a
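
The behaviour proposed above can be sketched in plain Java. This is only an illustration of the idea, not Lucene's actual code: the class name `TouchFileSketch` and the static helper `touchFile(File)` are hypothetical, and the only real API used is `java.io.File.setLastModified`, which returns `false` when the timestamp could not be changed.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch: surface a failed setLastModified as an IOException
// instead of silently discarding the boolean result.
final class TouchFileSketch {

    // Sets the file's last-modified time to "now".
    // Throws IOException if the underlying call reports failure.
    static void touchFile(File file) throws IOException {
        long now = System.currentTimeMillis();
        if (!file.setLastModified(now)) {
            throw new IOException("Could not set last-modified time of " + file);
        }
    }
}
```

Because the method already declares `IOException`, callers compiled against a `void` signature would not need source changes under this approach; whether that is acceptable for binary compatibility is a separate question raised elsewhere in the thread.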

Re: addIndexesNoOptimize on shards --> is docid deterministic and calculable?

2009-11-04 Thread Erick Erickson
Hmm, why do you care? That is, what is it you're trying to do that makes this question necessary? There might be a better solution than trying to depend on doc IDs. Because I don't think you can assume that, even if it is deterministic with the version you're using now, that it would be in some o

Re: addIndexesNoOptimize on shards --> is docid deterministic and calculable? (IF docids of shards separately are known)

2009-11-04 Thread Britske
Just to clarify the question, I changed the subject: addIndexesNoOptimize on shards --> is docid deterministic and calculable? (IF docids of shards separately are known) Britske wrote: > > Hi, > > say I have: > - IndexReader[] readers = {reader1, reader2, reader3} //containing all > different doc

addIndexesNoOptimize on shards --> is docid deterministic and calculable?

2009-11-04 Thread Britske
Hi, say I have: - IndexReader[] readers = {reader1, reader2, reader3} //containing all different docs - I know the internal docids of documents in reader1, reader2, reader3 separately Does doing IndexWriter.addIndexesNoOptimize(IndexReader[] readers) on these readers give me a deterministic and
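
To make the question concrete, here is the arithmetic the poster appears to be hoping for, written without any Lucene dependency. It rests on two assumptions that the thread does not confirm: that the merged index lays the shards out back to back in array order, and that no shard contains deleted documents. The class and method names (`DocIdOffsetSketch`, `docBases`, `mergedDocId`) are hypothetical.

```java
// Hypothetical sketch: maps (shard, local docid) to a merged docid, ASSUMING
// the shards are concatenated in array order with no deletions.
// Nothing here is a documented guarantee of IndexWriter.
final class DocIdOffsetSketch {

    // bases[i] = number of documents in all shards before shard i.
    static int[] docBases(int[] maxDocs) {
        int[] bases = new int[maxDocs.length];
        int sum = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            bases[i] = sum;
            sum += maxDocs[i];
        }
        return bases;
    }

    // Merged docid under the concatenation assumption.
    static int mergedDocId(int[] bases, int shard, int localDocId) {
        return bases[shard] + localDocId;
    }
}
```

For shards holding 10, 5 and 7 documents, the bases would be 0, 10 and 15, so local docid 3 in the third shard would map to 18. If deletions are compacted or segments are reordered during the merge, this mapping no longer holds.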

Re: merging Parallel indexes (can indexWriter.addIndexesNoOptimize be used?)

2009-11-04 Thread Britske
Yeah excellent! This should indeed work! Thanks, Geert-Jan Jérôme Thièvre wrote: > > Hello Geert-Jan, > > it's possible to merge several parallel physical indexes (viewed as one > logical index with a ParallelReader). > Just use the method IndexWriter.addIndexes(IndexReader[] readers): > >

Re: merging Parallel indexes (can indexWriter.addIndexesNoOptimize be used?)

2009-11-04 Thread Britske
Yeah, passing in ParallelReader should work. I'll try that, thanks! Probably still going to look into this low-level stuff though. Also because some indexes I use (Index B in my example) are pretty costly to index at the moment. At the same time the indexing client has a lot of knowledge about the

Free live video streaming of ApacheCon US 2009

2009-11-04 Thread Michael McCandless
Team, For those Lucene fanatics not in Oakland this week for ApacheCon US, don't miss the FREE live video streaming, starting today: http://streaming.linux-magazin.de/en/program-apachecon-us-2009.htm Note that there are many talks available, covering Apache Hadoop, Apache HTTPD, Lucene, as wel

Re: merging Parallel indexes (can indexWriter.addIndexesNoOptimize be used?)

2009-11-04 Thread Jérôme Thièvre
Hello Geert-Jan, it's possible to merge several parallel physical indexes (viewed as one logical index with a ParallelReader). Just use the method IndexWriter.addIndexes(IndexReader[] readers): IndexReader[] physicalReaders = ...; // Your readers here IndexWriter iw = new IndexWriter(...); P

Re: merging Parallel indexes (can indexWriter.addIndexesNoOptimize be used?)

2009-11-04 Thread Michael McCandless
Roughly, your approach sounds correct. You essentially need to concatenate tis, tii, frq, prx, but adjusting all absolute pointers accordingly. If you look at how SegmentMerger, in its mergeTerms/mergeTermInfos/appendPostings, makes use of the FormatPostingsFields/Terms/Docs/PositionsConsumer, that

Re: merging Parallel indexes (can indexWriter.addIndexesNoOptimize be used?)

2009-11-04 Thread Britske
Thanks, but it's already guaranteed that the indexes are in sync. So I could (and do) use ParallelReader to search them both at the same time. This is what my running index looks like. However at certain points I was considering storing a frozen index from the parallel index for backup/ other

Re: void touchFile() should return the boolean result of the setLastModified

2009-11-04 Thread Michael McCandless
I agree it's not great that touchFile swallows the return status from File.setLastModified, but, technically changing it would break our jar drop-in back compat. Actually, I think instead, we should deprecate the method? As best I can tell, Lucene does not use this anywhere. I'll open an issue,

Re: rewrite()ing BooleanQuery results in empty clauses

2009-11-04 Thread Michael McCandless
For 2.9, I believe there's hardly any runtime cost to the embedded BooleanQuery instances that have no clauses -- when the scorer method is invoked, it will return null, which will in turn translate into a DocIdSet.EMPTY_DOCIDSET that ends the iteration immediately on the first call to nextDoc. B

rewrite()ing BooleanQuery results in empty clauses

2009-11-04 Thread Shaun Senecal
I am rewriting some BooleanQueries and the end result contains some empty queries. The initial query is of the form: Field1:foo* Field2:foo* Field3:foo* Field4:foo* Field5:foo* Field6:foo* The rewritten query is of the form: ConstantScore(Field1:foo*) ConstantScore(Field2:foo*) ConstantScore(Quer