Perhaps the most straightforward way would be to index a known unique value for each document that would otherwise have a null entry. Conceptually, whenever a field would be null, index the value "nothinghere" instead. Then you can simply search for documents where that field equals "nothinghere".
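A minimal sketch of that substitution, in plain Java with no Lucene dependency. The class and method names (NullSentinel, valueToIndex) are made up for illustration, and "nothinghere" is just the placeholder from the text above. In a real indexer, the returned string would be the value handed to the field at indexing time, and the search would be a term query on the marker.

```java
class NullSentinel {
    // Placeholder from the message above; any token that can never occur
    // as a real field value would work equally well.
    static final String NULL_MARKER = "nothinghere";

    // Returns the value to index: the marker when the raw value is null
    // or empty, otherwise the raw value unchanged.
    static String valueToIndex(String raw) {
        return (raw == null || raw.isEmpty()) ? NULL_MARKER : raw;
    }

    public static void main(String[] args) {
        String[] rawValues = { "Berlin", null, "" };
        for (String raw : rawValues) {
            System.out.println(valueToIndex(raw));
        }
    }
}
```

The marker has to survive analysis unchanged, so the field is probably best indexed untokenized, or at least with an analyzer that leaves the marker intact.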
Alternatively, you could build a filter using TermEnum/TermDocs that marks every document with an entry in the field in question, and then invert it to get your real filter. The trick here is to build your term as something like new Term("field", ""), which lets you enumerate all the terms in that field (stop as soon as the enumeration moves on to a different field). Since a filter is just a bitset, there are operations for inverting it. You can even store these filters somewhere and reuse them (note that they will have to be rebuilt when you change your index). For 2 million documents they would be quite small, around 250K bytes at one bit per document. Building these filters is quite fast, so the first thing I would try is building them dynamically. If you use a CachingWrapperFilter, they will be cached automatically (per IndexReader) for the duration of your program.

But I'd go with the simple way first, that of indexing a unique value for the null fields and searching on that.

Best
Erick

On 12/6/06, Supriya Kumar Shyamal <[EMAIL PROTECTED]> wrote:
Hi,

I have a question regarding search. A document can have several fields, but not every field has a value in every document, i.e. some fields in a document can be null or an empty string. How can I search for a null field value in a document using the IndexSearcher? Any ideas would be greatly appreciated, since I have an index with 2 million documents and I can't look through all of them using Luke.

Thanks,
Supriya
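As a footnote to the filter suggestion in the reply at the top, the inversion step and the size estimate can be sketched with java.util.BitSet alone. This is only an illustration under stated assumptions: the int array stands in for the document ids you would collect by walking TermEnum/TermDocs over the field, and the names MissingFieldFilter, docsWithoutField and sizeInBytes are hypothetical, not part of Lucene.

```java
import java.util.BitSet;

class MissingFieldFilter {
    // Sets one bit per document that HAS a value in the field, then flips
    // the range [0, maxDoc) so the set bits mark documents WITHOUT a value.
    static BitSet docsWithoutField(int[] docsWithField, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc : docsWithField) {
            bits.set(doc);
        }
        bits.flip(0, maxDoc);
        return bits;
    }

    // Approximate storage at one bit per document, ignoring object overhead.
    static int sizeInBytes(int maxDoc) {
        return (maxDoc + 7) / 8;
    }

    public static void main(String[] args) {
        // Hypothetical index of 5 documents; docs 0, 2 and 3 have the field.
        BitSet missing = docsWithoutField(new int[] { 0, 2, 3 }, 5);
        System.out.println(missing);
        System.out.println(sizeInBytes(2000000));
    }
}
```

At one bit per document, 2,000,000 documents come to 250,000 bytes, which matches the "around 250K bytes" figure in the reply.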