We have a primary table and we need search capability by batchid column. So we are creating a manual index for search by batch id. We are using buckets to restrict a row size in batch id index table to 50mb. As batch size may vary drastically ( ie one batch id may be associated to 100k row keys in primary table while other may be associated with 100million row keys), we are creating a metadata table to track the approximate data while insertions for a batch in primary table, so that batch id index table has dynamic no of buckets/rows. As more data is inserted for a batch in primary table, new set of 10 buckets are added. At any point in time, clients will write to latest 10 buckets created for a batch od index in round robin to avoid hotspots.
Comments required on the following: 1. I want to know any suggestios on above design? 2. Whats the best approach for updating/deleting from index table. When a row is manually purged from primary table, we dont know where that row key exists in x number of buckets created for its batch id? Thanks Anuj Sent from Yahoo Mail on Android