Hi,
 
BitSet-based filters in Lucene currently work only on docIDs, i.e. I can 
construct a filter out of a BitSet only if I already have the docIDs handy.

However, the docIDs change with every update or delete (i.e. any CRUD 
modification), and I then have to redo the whole process to fetch the latest 
docIDs.

Assume a scenario where I need to tag millions of documents with a tag like 
"Finance", "IT", "Legal", etc.

Unless I can cache these filters in memory, the cost of constructing them at 
run time for every query makes this impractical. If I could map the documents 
to a numeric long identifier and put those identifiers in a bitmap, I could 
cache the filters, because the size reduces drastically. However, I cannot use 
this numeric long identifier in Lucene filters, because it is not a docID but 
just another regular field.
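
To make the idea concrete, here is a rough sketch in plain Java of what I have 
in mind. It is not Lucene API: the names TagFilterSketch, toDocIdBits and 
idByDocId are hypothetical, and I am assuming that the identifiers are small 
enough to index a java.util.BitSet (i.e. they fit in an int) and that the 
identifier of each document can be loaded into an array indexed by docID. The 
cached bitmap is keyed by my own stable identifier, and only the translation 
to docIDs would be redone when the index changes.

```java
import java.util.BitSet;

class TagFilterSketch {

    /**
     * Translates a cached bitmap keyed by stable identifier into a BitSet
     * keyed by docID.
     *
     * idByDocId[docId] is assumed to hold the stable identifier of the
     * document that currently has that docID (hypothetical input; how it
     * is loaded from the index is left open here).
     */
    static BitSet toDocIdBits(long[] idByDocId, BitSet taggedIds) {
        BitSet docs = new BitSet(idByDocId.length);
        for (int docId = 0; docId < idByDocId.length; docId++) {
            long id = idByDocId[docId];
            // Assumption: identifiers fit in an int so they can index a BitSet.
            if (id >= 0 && id <= Integer.MAX_VALUE && taggedIds.get((int) id)) {
                docs.set(docId);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        // docIDs 0..3 currently hold the documents with these identifiers.
        long[] idByDocId = {42L, 7L, 13L, 99L};

        // Cached "Finance" tag: the documents with identifiers 7 and 99.
        BitSet finance = new BitSet();
        finance.set(7);
        finance.set(99);

        System.out.println(toDocIdBits(idByDocId, finance));
    }
}
```

The tag bitmap itself never has to be rebuilt after an update or delete; only 
the idByDocId array does. What I am missing is a way to plug something like 
this into a Lucene filter.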

Please help with this scenario. Thanks,

-----------------------
Thanks n Regards,
Sandeep Ramesh Khanzode
