As the documentation states:
Lucene is an inverted index that does not have per-document fields; it only
knows terms pointing to documents. The query you are asking for is one that
returns all documents which have no term in a given field. To execute it,
Lucene has to read the term index, iterate over all terms of the field, mark
the matching documents in a bitset, and then negate that bitset. The filter
I mentioned uses the FieldCache to do this. Since 3.6 (also in 3.5, but
there it is buggy and the API is different) there is an additional
FieldCache entry that returns exactly that bitset, and the filter simply
uses it. The FieldCache is populated on first access and stays alive as long
as the underlying index segment is open (that is, as long as the IndexReader
is open and that part of the index has not been refreshed). If you are also
sorting on the field or running other queries that use the FieldCache, there
is no overhead; otherwise the bitset is populated on the first access to the
filter.
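
To make the "iterate all terms, mark, negate" step concrete, here is a
minimal sketch in plain Java. It does not use Lucene at all: the class name
MissingFieldSketch, the map-based postings structure, and the maxDoc
parameter are stand-ins invented for illustration, not Lucene's internal
data structures.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MissingFieldSketch {

    /**
     * Toy inverted index for ONE field: each term maps to the ids of the
     * documents that contain it. Returns the ids in [0, maxDoc) that no
     * term of the field points to.
     */
    public static BitSet docsWithoutField(Map<String, List<Integer>> postings, int maxDoc) {
        BitSet hasValue = new BitSet(maxDoc);
        // Full scan over every term of the field: this is the expensive part.
        for (List<Integer> docIds : postings.values()) {
            for (int docId : docIds) {
                hasValue.set(docId);
            }
        }
        // Negate: what remains set are the documents without the field.
        hasValue.flip(0, maxDoc);
        return hasValue;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> color = new TreeMap<String, List<Integer>>();
        color.put("blue", Arrays.asList(0, 3));
        color.put("red", Arrays.asList(1));
        // With 5 documents in total, documents 2 and 4 carry no "color" term.
        System.out.println(docsWithoutField(color, 5));
    }
}
```

The cost is proportional to the number of terms and postings in the field,
which is why keeping the resulting bitset cached per segment pays off.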

Lucene 3.5 has no easy way to implement that filter; a "NULL" pseudo term is
the only solution there (and it is also much faster than the filter on the
first access in Lucene 3.6). Later accesses that hit the cache in 3.6 will
be faster, of course.

Another hacky way to achieve the same result (works with almost any Lucene
version) is a BooleanQuery consisting of MatchAllDocsQuery() as a MUST
clause and PrefixQuery(field, "") as a MUST_NOT clause. But the PrefixQuery
will do a full term index scan without caching :-). You may use
CachingWrapperFilter with PrefixFilter instead.
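
A sketch of both variants, assuming the Lucene 3.x API (a mutable
BooleanQuery with a no-argument constructor; later major versions changed
this) and lucene-core on the classpath. The class and method names are made
up for illustration, and wrapping the cached filter in a ConstantScoreQuery
is just one way to use it as a MUST_NOT clause.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.ConstantScoreQuery;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.PrefixFilter;
import org.apache.lucene.search.PrefixQuery;

public class MissingFieldQueries {

    /** All documents, minus those with any term in the field. Scans the terms on every search. */
    public static BooleanQuery uncached(String field) {
        BooleanQuery bq = new BooleanQuery();
        bq.add(new MatchAllDocsQuery(), Occur.MUST);
        // An empty prefix matches every term of the field.
        bq.add(new PrefixQuery(new Term(field, "")), Occur.MUST_NOT);
        return bq;
    }

    /** Cached "has a value" filter. Create it once and reuse the same instance. */
    public static Filter cachedHasValueFilter(String field) {
        return new CachingWrapperFilter(new PrefixFilter(new Term(field, "")));
    }

    /** Same query shape as uncached(), but excluding via the reusable cached filter. */
    public static BooleanQuery cached(Filter hasValue) {
        BooleanQuery bq = new BooleanQuery();
        bq.add(new MatchAllDocsQuery(), Occur.MUST);
        bq.add(new ConstantScoreQuery(hasValue), Occur.MUST_NOT);
        return bq;
    }
}
```

The caching only helps if the Filter returned by cachedHasValueFilter() is
kept around and passed to cached() for each search; building a new
CachingWrapperFilter per query starts with an empty cache every time.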

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Tim Eck [mailto:tim...@gmail.com]
> Sent: Thursday, February 16, 2012 10:14 PM
> To: java-user@lucene.apache.org
> Subject: RE: query for documents WITHOUT a field?
> 
> Thanks for the fast response. I'll certainly have a look at the upcoming
> 3.6.x release. What is the expected performance for using a negated filter?
> In particular does it defeat the index in any way and require a full index
> scan? Is it different between regular fields and numeric fields?
> 
> For 3.5 and earlier though, is there any suggestion other than magic
> values?
> 
> -----Original Message-----
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Thursday, February 16, 2012 1:07 PM
> To: java-user@lucene.apache.org
> Subject: RE: query for documents WITHOUT a field?
> 
> Lucene 3.6 will have a FieldValueFilter that can be negated:
> 
> Query q = new ConstantScoreQuery(new FieldValueFilter("field", true));
> 
> (see http://goo.gl/wyjxn)
> 
> Lucene 3.5 does not yet have it; you can download 3.6 snapshots from
> Jenkins: http://goo.gl/Ka0gr
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
> > -----Original Message-----
> > From: Tim Eck [mailto:t...@terracottatech.com]
> > Sent: Thursday, February 16, 2012 9:59 PM
> > To: java-user@lucene.apache.org
> > Subject: query for documents WITHOUT a field?
> >
> > My apologies if this answer is readily available someplace, I've
> > searched around and not found a definitive answer.
> >
> > I'd like to run a query for documents that _do not_ contain particular
> > indexed fields to implement something like a SQL-like query where a
> > column is null.
> >
> > I understand I could possibly use a magic value to represent "null",
> > but the data I'm searching doesn't lend itself to reserving a value for
> > null. I also understand I could add an extra field to hold this boolean
> > isNull state but would love a better solution :-)
> >
> >
> >
> > TIA
> >
> >
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
