I have a design where I will be using multiple index shards to hold approx 7.5 million documents per index per month over many years. These will be large static R/O indexes but the corresponding smaller parallel index will get many frequent changes.

I understand from previous replies by Hoss that the technique to handle this is to use parallel indexes where the parallel index gets rebuilt periodically with the changing data.

However, this 'periodically' needs to be quite frequent to try to provide responsive changes to the index, potentially several times a dat. One problem is that there can be updates to any of the data in almost any month, so an update by a user to 120 documents, one document per month for 10 years, requires a full rebuild of the 120 index shards of 7.5m docs each...

I was wondering what the technical reasons were why a 'delete+add' could not allow the original docId to be re-used, thus keeping the two parallel indexes in sync without requiring a rebuild.

If this could be overcome, this would make this parallel index pattern so much more useful for large volume data sets.

Any thoughts
Antony





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to