Sadanand Shenoy created HDDS-12984:
--------------------------------------

             Summary: Use InodeID to identify the SST files inside the tarball
                 Key: HDDS-12984
                 URL: https://issues.apache.org/jira/browse/HDDS-12984
             Project: Apache Ozone
          Issue Type: Sub-task
            Reporter: Sadanand Shenoy
            Assignee: Sadanand Shenoy

The current behaviour is that the leader constructs a tarball of SST files, sized according to the batch size, and sends it to the follower. On receiving it, the follower untars it and sends another request to the leader for the next tarball, passing the list of files received so far via the excludedSSTFiles param on the HTTP request. This continues until only one tarball remains, at which point a hardlink file is constructed that describes the file paths to be created on the follower.

Suppose an SST file is present in 10 snapshots and the active OM DB. The hardlink file will have entries like this:

s1/1.sst -> activeOMDB/1.sst
s2/1.sst -> activeOMDB/1.sst
..
s10/1.sst -> activeOMDB/1.sst

When the leader receives the exclude file list, the algorithm it uses to exclude an already-sent SST file is:

{code:java}
function constructTarball(excludedFiles):
    tobeSentFiles = []
    seenInodes = set()
    for all sst files in om.db and snapshot dbs:
        if file in excludedFiles:
            continue
        inodeVal = inode(file)
        if inodeVal in seenInodes:
            continue
        seenInodes.add(inodeVal)
        tobeSentFiles.append(file)
    return tobeSentFiles
{code}

The time complexity here becomes O(n^2).

As part of this Jira, the idea is to send inode IDs as the file names, which reduces the time complexity. Something like this:

{code:java}
function constructTarball(excludedInodes):
    tobeSentFiles = []
    seenInodes = set()
    for all sst files in om.db and snapshot dbs:
        inodeID = inode(file)
        if inodeID in excludedInodes or inodeID in seenInodes:
            continue
        seenInodes.add(inodeID)
        tobeSentFiles.append(file)
    return tobeSentFiles
{code}
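
The inode-based exclusion proposed above could look roughly like this in Java. This is a minimal sketch under stated assumptions, not the actual Ozone implementation: the class name InodeDedupSketch, both method names, and the pluggable inode resolver are hypothetical, and inodeOf assumes the file system exposes a non-null BasicFileAttributes.fileKey() (on Unix-like file systems the device ID and inode are commonly used for this).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class InodeDedupSketch {

  // Hypothetical helper: derives an identifier for the underlying file.
  // fileKey() may be null on file systems that expose no such key.
  static String inodeOf(Path file) {
    try {
      Object key = Files.readAttributes(file, BasicFileAttributes.class).fileKey();
      if (key == null) {
        throw new IllegalStateException("No file key available for " + file);
      }
      return key.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns the files to put into the next tarball: one path per inode,
  // skipping inodes the follower has already reported as received.
  static List<Path> selectFilesToSend(List<Path> sstFiles,
                                      Set<String> excludedInodes,
                                      Function<Path, String> inodeResolver) {
    List<Path> toBeSent = new ArrayList<>();
    Set<String> seenInodes = new HashSet<>();
    for (Path file : sstFiles) {
      String inodeId = inodeResolver.apply(file);
      // Set.add returns false when the inode was already seen in this pass.
      if (excludedInodes.contains(inodeId) || !seenInodes.add(inodeId)) {
        continue;
      }
      toBeSent.add(file);
    }
    return toBeSent;
  }
}
```

A caller would pass the candidate SST paths, the set of inode IDs reported by the follower, and `InodeDedupSketch::inodeOf` as the resolver. Each candidate then costs two hash-set lookups, so one pass over n files is O(n) on average.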