klsince commented on issue #7320: URL: https://github.com/apache/pinot/issues/7320#issuecomment-901581362
Re 2, local index cleanup runs as part of segment reloading. It happens in the same thread that reloads the segment, and it relies on the existing failure handling of segment reloading (as in [code](https://github.com/apache/pinot/blob/master/pinot-server/src/main/java/org/apache/pinot/server/starter/helix/HelixInstanceDataManager.java#L276)) to keep on-disk state consistent upon failures and to swap in the cleaned segment atomically. Because of this atomicity, queries continue to work with the existing segment until the new one is swapped in. Just like adding new indices during segment reloading, removing indices is done inside a segment folder that is **not** accessed by ongoing queries, so it's safe to modify the files in that folder.

To clean up, we simply copy the indices defined in the table config into a temp file, then rename it back to the index file, effectively removing those that are no longer set in the table config. In the implementation, we use transferTo() for the copy, as in [the PR](https://github.com/apache/pinot/pull/7301/files#diff-52f126f7138a706a5fddb9257af1c558c4623269bc69308212a77c06021cbef7R433).

Besides, cleanup happens after adding new indices, so that newly created indices are kept after cleanup. In the implementation, it just needs to do the cleanup after closing all PinotDataBuffers, as in [the PR](https://github.com/apache/pinot/pull/7301/files#diff-52f126f7138a706a5fddb9257af1c558c4623269bc69308212a77c06021cbef7R365).

Hope this helps clarify a bit more, and feel free to comment on the PR. Thanks!

Re 1, I assume it's not always possible to ssh to the servers to delete segments, so an option to force a segment download via the Pinot RESTful API can be convenient.
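
For illustration only, the copy-then-rename cleanup described above could look roughly like the sketch below. It is not the code from the PR: the class `IndexCleanupSketch`, the `IndexEntry` descriptor, and the `copyAndSwap` method are hypothetical names, and the sketch assumes each index occupies one contiguous byte range in a single index file. The only real APIs used are `FileChannel.transferTo()` and `Files.move()` from the JDK.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Pinot's actual implementation.
final class IndexCleanupSketch {

  /** Hypothetical descriptor: the byte range of one index inside the index file. */
  static final class IndexEntry {
    public final String name;
    public final long offset;
    public final long size;

    IndexEntry(String name, long offset, long size) {
      this.name = name;
      this.offset = offset;
      this.size = size;
    }
  }

  /**
   * Copies only the entries in toKeep into a temp file next to indexFile,
   * then renames the temp file over indexFile. Returns the entries with
   * their offsets in the rewritten file.
   */
  static List<IndexEntry> copyAndSwap(Path indexFile, List<IndexEntry> toKeep)
      throws IOException {
    Path tmp = indexFile.resolveSibling(indexFile.getFileName() + ".tmp");
    List<IndexEntry> rewritten = new ArrayList<>();
    try (FileChannel src = FileChannel.open(indexFile, StandardOpenOption.READ);
         FileChannel dst = FileChannel.open(tmp, StandardOpenOption.CREATE,
             StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
      long newOffset = 0;
      for (IndexEntry entry : toKeep) {
        long copied = 0;
        // transferTo() may copy fewer bytes than requested, so loop until done.
        while (copied < entry.size) {
          long n = src.transferTo(entry.offset + copied, entry.size - copied, dst);
          if (n <= 0) {
            throw new IOException("Unexpected end of file while copying " + entry.name);
          }
          copied += n;
        }
        rewritten.add(new IndexEntry(entry.name, newOffset, entry.size));
        newOffset += entry.size;
      }
    }
    // Rename the temp file back to the index file; indices not copied are gone.
    Files.move(tmp, indexFile, StandardCopyOption.REPLACE_EXISTING);
    return rewritten;
  }
}
```

In this sketch the original index file is left untouched until the final rename, so a failure during the copy leaves only a stray temp file behind. The caller would be expected to close any buffers mapped over the index file before calling `copyAndSwap`, matching the ordering described above.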
