Hello, I cannot extract document term vectors from an index, and have not turned up much in some determined googling. In short, when I call IndexReader.getTermVector(docID, field) or IndexReader.getTermVectors(docID) and then navigate down to the Terms for the specified field, I get a null result.
// Indexing:
String bodyText = "this is foobar";

final FieldType BodyOptions = new FieldType();
BodyOptions.setIndexed(true);
BodyOptions.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
BodyOptions.setStored(true);
BodyOptions.setStoreTermVectors(true);
BodyOptions.setTokenized(true);

Document doc = new Document();
doc.add(new Field("body", bodyText, BodyOptions));

When I examine docs in Luke, I can see the term vectors.

// Retrieving (at a later time)
DirectoryReader dirRdr = DirectoryReader.open(FSDirectory.open(new File(path)));
SlowCompositeReaderWrapper rdr = new SlowCompositeReaderWrapper(dirRdr);

for (int i = 0; i < rdr.maxDoc(); ++i) {
    int numTerms = 0;
    Terms terms = rdr.getTermVector(i, "body");
    if (terms != null) {
        TermsEnum term = terms.iterator(null);
        while (term.next() != null) {
            ++numTerms;
        }
        System.out.println("doc " + i + " had " + numTerms + " terms");
    } else {
        System.err.println("null term vector on doc " + i);
    }
}

On every doc, the Terms object I get back from getTermVector(i, "body") is null.

Jon

--
Jon Stewart, Principal
(646) 719-0317 | j...@lightboxtechnologies.com | Arlington, VA