Hi,

> Goofing off with my index, I ran across this example
> http://www.lucidimagination.com/blog/2009/05/26/accessing-words-around-a-positional-match-in-lucene/
> for using span queries to see what else is around a word that hits. Noticeably,
> there's a nice getSpans(IndexReader) method that just takes in the index reader
> and returns all the span objects, something not present in Lucene 4.
> I'm trying to replicate this in Lucene 4.0 to see how viable it is and despite
> having my span query hit on 10 documents, I cannot retrieve any spans. The API
> for doing this got remarkably more complex!
>
> My code reads as follows:
> IndexReader ir = search.getIndexReader();
> TermContext tmctxt = TermContext.build(ir.getTopReaderContext(), testSpan.getTerm(), false);
> Map termMap = new HashMap();
> termMap.put(testSpan.getTerm(), tmctxt);
> AtomicReaderContext ac = new IndexReader.AtomicReaderContext(ir);

Don't do this. To get a top-level IndexReader context, use IR.getTopReaderContext(). Here you create an atomic context on an index reader that might not be atomic; that can be the reason for the failures, and it may also throw random exceptions.

BTW: There is currently a lot of work going on to refactor IndexReaders into two separate classes (CompositeIndexReader and AtomicIndexReader), so the many methods that throw UnsupportedOperationException will go away; see https://issues.apache.org/jira/browse/LUCENE-2858. You will then only be able to get and execute spans/queries/filters/TermsEnum/DocsEnum on an AtomicIndexReader, and the corresponding contexts will be type safe. Currently this is one of the parts of the Lucene API that is very inconsistent and programmer unfriendly, because most IndexReaders in Lucene (like DirectoryReader or MultiReader) are composite readers that no longer have the low-level terms/postings APIs. The new API will strictly separate the two types. Stuff like reopen will also move away from the abstract IndexReader interface. The above code will completely fail to compile after the IR refactoring :-)

The problem here is that you get the IndexReader from the IndexSearcher, which is a composite reader, but you try to execute queries on it. This is no longer possible. You have to ask the reader for its index segments and do the search on each low-level atomic SegmentReader separately. Alternatively, wrap your IR with SlowMultiReaderWrapper, which creates an atomic "view" on the index. It is simply slow, but it emulates the behavior still possible in Lucene 3.x [which is also slow there] :-)

> Bits bits = new Bits.MatchAllBits(0);
> Spans spans = testSpan.getSpans(ac, bits, termMap);

This asks for spans with no deleted documents on an index of size 0, so it cannot work.

> However, spans never returns a spans object, spans.next() always returns false.
>
> Am I missing anything?
>
> Thanks!
> Stephen
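
To make the per-segment idea concrete, here is a minimal sketch in plain Java. It is a toy model and does not use the Lucene API at all: Segment, Hit and findTerm are hypothetical names invented for illustration. It only shows the shape of the loop: visit each atomic segment on its own, skip deleted documents using a live-docs array sized to that segment, and add the segment's doc base to turn segment-local doc IDs into top-level ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types are Lucene classes.
final class SegmentSpanSketch {

    /** One "atomic" segment of a composite index (hypothetical type). */
    static final class Segment {
        final int docBase;      // top-level ID of this segment's first document
        final String[][] docs;  // docs[localDoc] = tokens of that document
        final boolean[] live;   // live[localDoc]; null means "no deletions"

        Segment(int docBase, String[][] docs, boolean[] live) {
            this.docBase = docBase;
            this.docs = docs;
            this.live = live;
        }
    }

    /** A positional match: top-level doc ID plus token position (hypothetical type). */
    static final class Hit {
        final int doc;
        final int position;

        Hit(int doc, int position) {
            this.doc = doc;
            this.position = position;
        }

        @Override
        public String toString() {
            return "doc=" + doc + " pos=" + position;
        }
    }

    /** Searches every segment separately and rebases local doc IDs by docBase. */
    static List<Hit> findTerm(List<Segment> segments, String term) {
        List<Hit> hits = new ArrayList<>();
        for (Segment seg : segments) {
            for (int local = 0; local < seg.docs.length; local++) {
                // Skip documents marked as deleted in this segment.
                if (seg.live != null && !seg.live[local]) {
                    continue;
                }
                String[] tokens = seg.docs[local];
                for (int pos = 0; pos < tokens.length; pos++) {
                    if (tokens[pos].equals(term)) {
                        hits.add(new Hit(seg.docBase + local, pos));
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
            new Segment(0, new String[][] {{"quick", "brown", "fox"}, {"lazy", "dog"}}, null),
            new Segment(2, new String[][] {{"the", "fox", "jumps"}}, null));
        for (Hit h : findTerm(segments, "fox")) {
            System.out.println(h);
        }
    }
}
```

In real code the segments would come from the reader itself and the per-segment work would be the getSpans call with that segment's context and its own deleted-docs bits. The exact method names are still moving on trunk, so check the javadocs of the build you are using.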