Hi List, I have a situation similar to indexing a mailing list, with each mail indexed as a Document. Mails from the same thread share the same thread ID, which is indexed in a separate field. Now I want to search through all the mails using some keywords, and list all the unique thread IDs of the matches so that I can pass them to database calls.
I tried DuplicateFilter, but it didn't work well: some results were missing. I went through the code and found that all the filters are basically pre-filters. In other words, they generate their bitsets from the whole index and, in the case of DuplicateFilter, filter the duplicates out before the query results reach the result collector. This causes a problem when a mail contains the search keywords but has already been set to false in the bitset: if the one mail kept for its thread does not match the keywords, the whole thread is missing from the results. Are there any solutions for this? Is there any sort of post-filtering that filters records in the search result (it could be slow) rather than in the whole collection? Thanks.
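P.S. To make the question concrete, here is a rough plain-Java sketch of the post-filtering I have in mind. It does not use any Lucene API: `Hit`, `ThreadDedup`, and `uniqueThreadIds` are hypothetical names made up for illustration, and I am assuming the thread ID of each matching mail can be read back for every hit. The point is only that de-duplication runs over the mails that actually matched the keywords, in rank order, instead of over the whole index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ThreadDedup {

    // Hypothetical stand-in for one search hit: the matched doc and its thread ID.
    static final class Hit {
        final int docId;
        final String threadId;

        Hit(int docId, String threadId) {
            this.docId = docId;
            this.threadId = threadId;
        }
    }

    // Post-filter: walk only the hits that matched the query, in rank order,
    // and keep the first occurrence of each thread ID.
    static List<String> uniqueThreadIds(List<Hit> hits) {
        Set<String> seen = new LinkedHashSet<>(); // preserves insertion order
        for (Hit hit : hits) {
            seen.add(hit.threadId);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Assumed example data: four matching mails from three threads.
        List<Hit> hits = Arrays.asList(
            new Hit(7, "t2"), new Hit(3, "t1"), new Hit(9, "t2"), new Hit(4, "t3"));
        System.out.println(uniqueThreadIds(hits)); // prints [t2, t1, t3]
    }
}
```

Done this way, a thread shows up as long as any of its mails matches, at the cost of visiting every hit.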