anuengineer commented on issue #108: HDDS-1987. Fix listStatus API URL: https://github.com/apache/hadoop-ozone/pull/108#issuecomment-548105219 1. First of all, this is complicated code. But not your fault, what you need to do at this level is complicated. 2. I agree that is one way of doing this change; save one question. How are we guaranteeing that the iterator that you are holding is the only valid one to the double buffer. Think about this edge case. Suppose you enter the listing with the iterator pointing to the buffer A. You build the deletedKeys list, or you build the added keys list. Then you are doing an/O to the underlying DB — and read a set of keys, then you are walking that set of keys to do a subtract from deletedKeySet and addition of addedKeys. makes sense. But this can take a while, during this time two things might happen. 1. The Buffer A might get tumbled over and Buffer B might becomes the active one. I want you to consider if this is a possibility. -- The new Buffer B is active, a Key is deleted, but you don't know about it. Then you are operating against an inconsistent snap shot of data -- which is ok, since we can define that as the behavior of the system. In other words, the list operation might return key names which are already deleted. The DB I/O delay might make it acute. 2. The buffer A might get committed -- now all the work you did is for nothing -- the data is already there in the DB.. You might want to define what should be the behavior at this level. 3. In your patch, you don't check if you have seen all the deleted Key from the DB, which is again ok, since you have no way of predicting if the double buffer is actually syncing and when. Another way to write this code might be to ask the table that controls the double buffer to do the right thing. That way, you are make these reasonings more local. Just my 2 cents. 4. Also the keyManagerImpl is a file that we want to depricate in the long run.. we made a mistake and wrote the file system code inside that class, just FYI. Nothing to do with this patch.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org