Thanks Doron, that's definitely completely backwards!! Good thing the API is gone.

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jul 18, 2013 at 7:50 AM, Doron Cohen <[email protected]> wrote:
> Hi, just an FYI - may be helpful for anyone obliged to use 4.0.0 or 4.1.0 -
> it seems that this method is actually doing the opposite of its intention.
>
> I did not find mentions of this in the lists or elsewhere.
>
> This is the code for o.a.l.search.FieldCacheImpl.DocTermsImpl.exists(int):
>
>   public boolean exists(int docID) {
>     return docToOffset.get(docID) == 0;
>   }
>
> Its description says: "Returns true if this doc has this field and is not
> deleted".
> But it returns true for docs not containing the field and false for those
> that do contain it.
>
> A simple workaround is not to call this method before calling getTerm(),
> but rather to rely on getTerm() logic: "... returns the same BytesRef, or
> an empty (length=0) BytesRef if the doc did not have this field or was
> deleted."
>
> So usage code can be like this:
>
>   DocTerms values = FieldCache.DEFAULT.getTerms(reader, FIELD_NAME);
>   BytesRef term = new BytesRef();
>   for (int docid=0; docid<values.size(); docid++) {
>     term = values.getTerm(docid, term);
>     if (term.length>0) {
>       doSomethingWith(term.utf8ToString());
>     }
>   }
>   FieldCache.DEFAULT.purge(reader);
>
> I am not sure about the overhead of this compared to first checking
> exists(), but at least it works correctly.
>
> The code for exists() was as above until R1442497
> (http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/search/FieldCacheImpl.java?revision=1442497&view=markup)
> and then in R1443717
> (http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/search/FieldCacheImpl.java?r1=1442497&r2=1443717&diff_format=h)
> the API was changed as part of LUCENE-4547 (DocValues improvements), which
> was included in 4.2.
>
> Simple code to demonstrate this (here with 4.1 but same results with 4.0):
>
>   RAMDirectory d = new RAMDirectory();
>   IndexWriter w = new IndexWriter(d, new IndexWriterConfig(Version.LUCENE_41,
>       new SimpleAnalyzer(Version.LUCENE_41)));
>   w.addDocument(new Document()); // Empty doc (0, 0)
>   Document doc = new Document(); // Real doc (1, 1)
>   doc.add(new StringField("f1", "v1", Store.NO));
>   w.addDocument(doc);
>   w.addDocument(new Document()); // Empty doc (2, 2)
>   w.addDocument(new Document()); // Empty doc (3, 3)
>   w.commit(); // Commit - so we'll have two atomic readers
>   doc = new Document(); // Real doc (0, 4)
>   doc.add(new StringField("f1", "v2", Store.NO));
>   w.addDocument(doc);
>   w.addDocument(new Document()); // Empty doc (1, 5)
>   w.close();
>
>   IndexReader r = DirectoryReader.open(d);
>   BytesRef br = new BytesRef();
>   for (AtomicReaderContext leaf : r.leaves()) {
>     System.out.println("--- new atomic reader");
>     AtomicReader reader = leaf.reader();
>     DocTerms a = FieldCache.DEFAULT.getTerms(reader, "f1");
>     for (int i = 0; i < reader.maxDoc(); ++i) {
>       int n = leaf.docBase + i;
>       System.out.println(n+" exists: "+a.exists(i));
>       br = a.getTerm(i, br);
>       if (br.length > 0) {
>         System.out.println(n + " "+br.utf8ToString());
>       }
>     }
>   }
>
> The result printing:
>
>   --- new atomic reader
>   0 exists: true
>   1 exists: false
>   1 v1
>   2 exists: true
>   3 exists: true
>   --- new atomic reader
>   4 exists: false
>   4 v2
>   5 exists: true
>
> Indeed, exists() results are wrong.
>
> So again, just an FYI, as this API no longer exists...
>
> Regards,
> Doron
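
For anyone who wants to see the inversion without setting up a 4.0/4.1 index, here is a minimal, Lucene-free sketch of the logic described above. Everything in it is a stand-in and an assumption: the class DocTermsSketch, its plain int[] offset table, and the method names existsBuggy, existsIntended and collectTerms are hypothetical and are not Lucene's actual types or API. The only parts taken from the thread are the "== 0" comparison and the convention, implied by the observed behavior, that an offset of 0 marks a doc with no term.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a per-document term lookup. This is NOT Lucene's DocTermsImpl. */
class DocTermsSketch {
    // Assumption: offset 0 means "this doc has no term for the field".
    private final int[] docToOffset;
    // Hypothetical term table indexed by offset; slot 0 is an unused placeholder.
    private final String[] termsByOffset;

    DocTermsSketch(int[] docToOffset, String[] termsByOffset) {
        this.docToOffset = docToOffset;
        this.termsByOffset = termsByOffset;
    }

    /** Same comparison as the quoted 4.0/4.1 code: true for docs WITHOUT a term. */
    boolean existsBuggy(int docID) {
        return docToOffset[docID] == 0;
    }

    /** What the description promises: true only for docs that have a term. */
    boolean existsIntended(int docID) {
        return docToOffset[docID] != 0;
    }

    /** Analogue of getTerm(): an empty value when the doc has no term. */
    String getTerm(int docID) {
        int offset = docToOffset[docID];
        return offset == 0 ? "" : termsByOffset[offset];
    }

    int size() {
        return docToOffset.length;
    }

    /** The workaround from the thread: skip exists() and test the term length. */
    static List<String> collectTerms(DocTermsSketch values) {
        List<String> found = new ArrayList<>();
        for (int docID = 0; docID < values.size(); docID++) {
            String term = values.getTerm(docID);
            if (term.length() > 0) {
                found.add(term);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Modeled on the first segment of the demo: only doc 1 has a value.
        DocTermsSketch segment =
            new DocTermsSketch(new int[] {0, 1, 0, 0}, new String[] {"", "v1"});
        for (int docID = 0; docID < segment.size(); docID++) {
            System.out.println(docID + " buggy: " + segment.existsBuggy(docID)
                + ", intended: " + segment.existsIntended(docID));
        }
        System.out.println("terms: " + collectTerms(segment));
    }
}
```

On 4.0.0 or 4.1.0 itself, the length check on the result of getTerm() plays the role of existsIntended here, which is why the workaround in the quoted message does not need exists() at all.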
