I already mentioned that pseudo NULL term, but the user asked for another solution...
--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de

Jamie Johnson <jej2...@gmail.com> wrote:

Another possible solution: while indexing, insert a custom token that cannot
otherwise show up in the index, then filter on that token.

On Thu, Feb 16, 2012 at 4:41 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> As the documentation states:
> Lucene is an inverted index that does not have per-document fields. It only
> knows terms pointing to documents. The query you are looking for returns
> all documents that have no term in the field. To execute this query, Lucene
> has to get the term index, iterate over all terms of the field, mark the
> matching documents in a bitset, and negate that bitset. The filter/query I
> told you about uses the FieldCache to do this. Since 3.6 (also in 3.5, but
> there it is buggy and the API is different) there is another FieldCache that
> returns exactly that bitset. The filter mentioned only uses that bitset from
> this new FieldCache. The FieldCache is populated on first access and stays
> alive as long as the underlying index segment is open (that is, as long as
> the IndexReader is open and that part of the index is not refreshed). If you
> are also sorting against your fields or doing other queries using the
> FieldCache, there is no overhead; otherwise the bitset is populated on first
> access to the filter.
>
> Lucene 3.5 has no easy way to implement that filter; a "NULL" pseudo term is
> the only solution (and is also much faster on the first access in Lucene
> 3.6). Later accesses hitting the cache in 3.6 will be faster, of course.
>
> Another hacky way to achieve the same result (works with almost any
> Lucene version):
> a BooleanQuery consisting of MatchAllDocsQuery() as a MUST clause and
> PrefixQuery(field, "") as a MUST_NOT clause. But the PrefixQuery will do a
> full term index scan without caching :-). You may use CachingWrapperFilter
> with PrefixFilter instead.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -----Original Message-----
>> From: Tim Eck [mailto:tim...@gmail.com]
>> Sent: Thursday, February 16, 2012 10:14 PM
>> To: java-user@lucene.apache.org
>> Subject: RE: query for documents WITHOUT a field?
>>
>> Thanks for the fast response. I'll certainly have a look at the upcoming
>> 3.6.x release. What is the expected performance for using a negated filter?
>> In particular, does it defeat the index in any way and require a full index
>> scan? Is it different between regular fields and numeric fields?
>>
>> For 3.5 and earlier though, is there any suggestion other than magic
>> values?
>>
>> -----Original Message-----
>> From: Uwe Schindler [mailto:u...@thetaphi.de]
>> Sent: Thursday, February 16, 2012 1:07 PM
>> To: java-user@lucene.apache.org
>> Subject: RE: query for documents WITHOUT a field?
>>
>> Lucene 3.6 will have a FieldValueFilter that can be negated:
>>
>> Query q = new ConstantScoreQuery(new FieldValueFilter("field", true));
>>
>> (see http://goo.gl/wyjxn)
>>
>> Lucene 3.5 does not yet have it; you can download 3.6 snapshots from
>> Jenkins: http://goo.gl/Ka0gr
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>>
>> > -----Original Message-----
>> > From: Tim Eck [mailto:t...@terracottatech.com]
>> > Sent: Thursday, February 16, 2012 9:59 PM
>> > To: java-user@lucene.apache.org
>> > Subject: query for documents WITHOUT a field?
>> >
>> > My apologies if this answer is readily available someplace; I've
>> > searched around and not found a definitive answer.
>> >
>> > I'd like to run a query for documents that _do not_ contain particular
>> > indexed fields, to implement something like a SQL query where a column
>> > is null.
>> >
>> > I understand I could possibly use a magic value to represent "null",
>> > but the data I'm searching doesn't lend itself to reserving a value for
>> > null. I also understand I could add an extra field to hold this boolean
>> > isNull state, but would love a better solution :-)
>> >
>> > TIA
>> >

_____________________________________________
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
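
For readers following the thread, the two approaches discussed above (scan all terms of a field, mark the documents in a bitset and negate it, versus indexing a sentinel token for documents that lack the field) can be sketched with a toy inverted index in plain Java. This is not Lucene code and uses no Lucene API; the class name MissingFieldSketch, the NULL_TERM value and all method names are made up for illustration, and the sentinel is only assumed never to collide with real data.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy inverted index (hypothetical, not Lucene): field -> term -> doc ids. */
class MissingFieldSketch {
    /** Hypothetical sentinel term; assumed never to occur as a real value. */
    static final String NULL_TERM = "\u0000NULL";

    private final Map<String, TreeMap<String, BitSet>> postings = new HashMap<>();
    private final Set<String> sentinelFields;
    private int maxDoc = 0;

    /** Fields listed here get NULL_TERM indexed whenever a document lacks them. */
    MissingFieldSketch(String... sentinelFields) {
        this.sentinelFields = new HashSet<>(Arrays.asList(sentinelFields));
    }

    /** Indexes one document (field name -> single term) and returns its doc id. */
    int addDocument(Map<String, String> fields) {
        int docId = maxDoc++;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            addTerm(e.getKey(), e.getValue(), docId);
        }
        for (String field : sentinelFields) {
            if (!fields.containsKey(field)) {
                addTerm(field, NULL_TERM, docId); // mark "field is missing"
            }
        }
        return docId;
    }

    private void addTerm(String field, String term, int docId) {
        postings.computeIfAbsent(field, f -> new TreeMap<>())
                .computeIfAbsent(term, t -> new BitSet())
                .set(docId);
    }

    /** Full term scan: mark every doc with any term in the field, then negate. */
    BitSet docsWithoutField(String field) {
        BitSet hasField = new BitSet(maxDoc);
        for (BitSet docs : postings.getOrDefault(field, new TreeMap<>()).values()) {
            hasField.or(docs);
        }
        hasField.flip(0, maxDoc); // negate within the range of existing doc ids
        return hasField;
    }

    /** Sentinel approach: a single term lookup, no scan over the field's terms. */
    BitSet docsWithSentinel(String field) {
        BitSet docs = postings.getOrDefault(field, new TreeMap<>()).get(NULL_TERM);
        return docs == null ? new BitSet() : (BitSet) docs.clone();
    }

    public static void main(String[] args) {
        MissingFieldSketch plain = new MissingFieldSketch();
        MissingFieldSketch marked = new MissingFieldSketch("color");
        for (MissingFieldSketch idx : List.of(plain, marked)) {
            idx.addDocument(Map.of("title", "a", "color", "red"));  // doc 0
            idx.addDocument(Map.of("title", "b"));                  // doc 1, no color
            idx.addDocument(Map.of("title", "c", "color", "blue")); // doc 2
            idx.addDocument(Map.of("title", "d"));                  // doc 3, no color
        }
        System.out.println(plain.docsWithoutField("color"));  // prints {1, 3}
        System.out.println(marked.docsWithSentinel("color")); // prints {1, 3}
    }
}
```

In this toy model the scan touches every term of the field on each call, which is the cost the thread attributes to the uncached PrefixQuery, while the sentinel lookup reads a single postings entry at the price of reserving a value at index time.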