I have a bunch of Lucene indices lying around, and I want to start adding a new field to documents in new indices that I'm generating. So, for a given index, either every document in the index will have that field or no document will have that field.
The new field has a default value, and I would like to write a query that, when applied to old indices, matches all documents, while when applied to new indices, it will only match documents with that specific default value. (Probably the query will include other restrictions, but the other restrictions have nothing to do with the new field, so they'll apply to both indices.)

I can, of course, write two different queries, one for the old indices and one for the new indices; for layering reasons, I'd prefer not to do that, but it's a possibility. (I can't, however, go back to the old indices and add the new field in.)

Any suggestions for how to write a single query that will work in both places? Basically, what I want is a query that says something like

    (field IS MISSING) OR (field = DEFAULT_VALUE)

If it matters, the new field will only take one of a small number of values, ten or so.

The one hint I've turned up when googling is this:

    http://stackoverflow.com/questions/4365369/solr-search-for-documents-where-a-field-doesnt-exist

It talks in terms of Solr, but hopefully I can figure out how to translate that into stock Lucene? Thinking out loud about what it suggests, I guess maybe I can generate a WildcardQuery for my field with * (which I hope won't be too expensive, given how few values my field has), and then do something like

    (field = DEFAULT_VALUE) OR NOT (field matches *)

And then I have to translate that into Lucene BooleanQuery syntax; I think I can probably handle that step of things (I've done that sort of thing before), but if anybody has tips, I'm all ears.

Basically, any suggestions would be welcome, whether about the basic approach or about the details. And I would in particular very much appreciate advice as to whether or not WildcardQuery(field, *) will have good performance if field only takes a small number of values.

--
David Carlton
carl...@sumologic.com
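
For reference, here is a toy model of the set logic behind (field = DEFAULT_VALUE) OR NOT (field matches *). It is plain Java on in-memory maps, not Lucene API; the class name, the method names, and the sample field "kind" with value "default" are all made up for illustration. The point it tries to show is that the NOT part has to be subtracted from "all documents" before it is OR'ed in. As far as I understand it, a Lucene BooleanQuery with only MUST_NOT clauses matches nothing, so the usual shape is a nested BooleanQuery with a MatchAllDocsQuery as MUST and the WildcardQuery as MUST_NOT, added as a SHOULD clause next to the TermQuery.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Toy model only: each "document" is a map from field name to value,
// and a query result is the set of matching document positions.
class MissingOrDefault {

    // Documents whose field has exactly the given value (plays the role of a term query).
    static BitSet fieldEquals(List<Map<String, String>> docs, String field, String value) {
        BitSet hits = new BitSet(docs.size());
        for (int i = 0; i < docs.size(); i++) {
            if (value.equals(docs.get(i).get(field))) {
                hits.set(i);
            }
        }
        return hits;
    }

    // Documents that have any value for the field (plays the role of field:*).
    static BitSet fieldExists(List<Map<String, String>> docs, String field) {
        BitSet hits = new BitSet(docs.size());
        for (int i = 0; i < docs.size(); i++) {
            if (docs.get(i).containsKey(field)) {
                hits.set(i);
            }
        }
        return hits;
    }

    // (field = defaultValue) OR (all documents AND NOT field exists)
    static BitSet match(List<Map<String, String>> docs, String field, String defaultValue) {
        BitSet missing = new BitSet(docs.size());
        missing.set(0, docs.size());               // start from "match all"
        missing.andNot(fieldExists(docs, field));  // then remove docs that have the field
        BitSet result = fieldEquals(docs, field, defaultValue);
        result.or(missing);
        return result;
    }
}
```

On an "old index" where no document has the field, match returns every document; on a "new index" it returns only the documents carrying the default value.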