[ https://issues.apache.org/jira/browse/HDDS-12984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HDDS-12984:
----------------------------------
    Labels: pull-request-available  (was: )

> Use InodeID to identify the SST files inside the tarball
> --------------------------------------------------------
>
>                 Key: HDDS-12984
>                 URL: https://issues.apache.org/jira/browse/HDDS-12984
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Sadanand Shenoy
>            Assignee: Sadanand Shenoy
>            Priority: Major
>              Labels: pull-request-available
>
> In the current behaviour, the leader constructs a tarball of SST files
> based on the batch size and sends it to the follower. On receiving it, the
> follower untars it and sends another request to the leader (with the list of
> received files passed via the excludedSSTFiles param on the HTTP request)
> to get the next tarball. This continues until only one tarball remains, at
> which point a hardlink file is constructed that describes the file paths to
> be created on the follower. Suppose an SST file is present in 10 snapshots
> and the active OM DB; the hardlink file will have entries like this:
> s1/1.sst -> activeOMDB/1.sst
> s2/1.sst -> activeOMDB/1.sst
> ..
> s10/1.sst -> activeOMDB/1.sst
> When the leader receives the excluded file list, the algorithm it uses to
> skip an already-sent SST file is:
>
> {code:java}
> function constructTarball(excludedFiles):
>     tobeSentFiles = []
>     seenInodes = set()
>     for all sst files in om.db and snapshot dbs:
>         if file in excludedFiles:
>             continue
>         inodeVal = inode(file)
>         if inodeVal in seenInodes:
>             continue
>         seenInodes.add(inodeVal)
>         tobeSentFiles.append(file)
>     return tobeSentFiles
> {code}
> The time complexity here becomes O(n^2).
> As part of this Jira, the idea is to send inode IDs as filenames, which
> allows the time complexity to be reduced.
> Something like this:
>
> {code:java}
> function constructTarball(excludedInodes):
>     tobeSentFiles = []
>     seenInodes = set()
>     for all sst files in om.db and snapshot dbs:
>         inodeID = inode(file)
>         if inodeID in excludedInodes or inodeID in seenInodes:
>             continue
>         seenInodes.add(inodeID)
>         tobeSentFiles.append(file)
>     return tobeSentFiles
> {code}
>
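
The inode-based selection described in the issue can be sketched in plain Java as below. This is only an illustration under stated assumptions, not the code in the linked pull request: the class name InodeTarballSketch, the method selectFilesToSend, and the pluggable inodeOf function are hypothetical names introduced here. The fileKey helper uses the standard BasicFileAttributes.fileKey() call as one possible way to obtain an inode-like identifier; whether Ozone obtains it that way is not stated in the issue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of the inode-based file selection from the issue. */
final class InodeTarballSketch {

  private InodeTarballSketch() { }

  /**
   * Returns the files to put in the next tarball: one path per inode,
   * skipping every inode the follower has already reported as received.
   * Both membership checks are hash lookups, so the loop is a single
   * pass over the candidate files.
   */
  static List<Path> selectFilesToSend(List<Path> candidates,
                                      Set<Object> excludedInodes,
                                      Function<Path, Object> inodeOf) {
    List<Path> toBeSent = new ArrayList<>();
    Set<Object> seenInodes = new HashSet<>();
    for (Path file : candidates) {
      Object inodeId = inodeOf.apply(file);
      // Skip inodes already on the follower, and hard links to an
      // inode that was already selected in this pass.
      if (excludedInodes.contains(inodeId) || !seenInodes.add(inodeId)) {
        continue;
      }
      toBeSent.add(file);
    }
    return toBeSent;
  }

  /**
   * One possible inode lookup (an assumption, not taken from the issue).
   * fileKey() may return null on file systems that expose no such key.
   */
  static Object fileKey(Path file) {
    try {
      return Files.readAttributes(file, BasicFileAttributes.class).fileKey();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A caller would pass InodeTarballSketch::fileKey as inodeOf for real files. With the hardlink example from the description, s1/1.sst through s10/1.sst share an inode with the active DB copy, so only the first path encountered for that inode is selected.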