[ 
https://issues.apache.org/jira/browse/HDDS-12984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-12984:
----------------------------------
    Labels: pull-request-available  (was: )

> Use InodeID to identify the SST files inside the tarball
> --------------------------------------------------------
>
>                 Key: HDDS-12984
>                 URL: https://issues.apache.org/jira/browse/HDDS-12984
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Sadanand Shenoy
>            Assignee: Sadanand Shenoy
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, the leader builds a tarball of SST files, limited by a batch 
> size, and sends it to the follower. The follower untars it and sends 
> another request to the leader for the next tarball, passing the list of 
> files received so far via the excludedSSTFiles parameter of the HTTP 
> request. This continues until only one tarball remains, and a hardlink file 
> is constructed that describes the file paths to be created on the follower. 
> For example, if an SST file is present in 10 snapshots and the active OM 
> DB, the hardlink file will have entries like this:
> s1/1.sst -> activeOMDB/1.sst
> s2/1.sst -> activeOMDB/1.sst
> ...
> s10/1.sst -> activeOMDB/1.sst
> When the leader receives the exclude file list, the algorithm used to 
> exclude an already-sent SST file is this:
>  
> {code:java}
> function constructTarball(excludedFiles):
>     tobeSentFiles = []
>     seenInodes = set()
>     for all sst files in om.db and snapshot dbs:
>         if file in excludedFiles:
>             continue
>         inodeVal = inode(file)
>         if inodeVal in seenInodes:
>             continue
>         seenInodes.add(inodeVal)
>         tobeSentFiles.append(file)
>     return tobeSentFiles
> {code}
> The time complexity here becomes O(n^2).
> As part of this Jira, the idea is to send inode IDs as filenames, so that 
> the time complexity can be reduced.
> Something like this:
>  
> {code:java}
> function constructTarball(excludedInodes):
>     tobeSentFiles = []
>     seenInodes = set()
>     for all sst files in om.db and snapshot dbs:
>         inodeID = inode(file)
>         if inodeID in excludedInodes or inodeID in seenInodes:
>             continue
>         seenInodes.add(inodeID)
>         tobeSentFiles.append(file)
>     return tobeSentFiles
> {code}
>  
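
The second pseudocode block above can be sketched in Java roughly as follows. This is a minimal illustration, not the code from the pull request: the class name InodeTarballSketch and the methods inodeId and filesToSend are hypothetical, and it assumes a POSIX-like file system where BasicFileAttributes.fileKey() is non-null and is the same for all hard links to one file. Both the exclude list and the seen set are hash sets keyed by that identity, so each file costs one attribute read plus constant-time lookups.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; none of these names come from the Ozone code base.
final class InodeTarballSketch {

  // Returns a string identity for the file. Assumption: fileKey() is non-null
  // and shared by all hard links to the same file (typical on POSIX systems).
  static String inodeId(Path file) throws IOException {
    Object key = Files.readAttributes(file, BasicFileAttributes.class).fileKey();
    if (key == null) {
      throw new IOException("File system exposes no file key for " + file);
    }
    return key.toString();
  }

  // Picks the files to send: skips any file whose identity the follower
  // already reported, and sends each distinct identity at most once.
  static List<Path> filesToSend(List<Path> sstFiles, Set<String> excludedInodes)
      throws IOException {
    List<Path> toBeSent = new ArrayList<>();
    Set<String> seenInodes = new HashSet<>();
    for (Path file : sstFiles) {
      String id = inodeId(file);
      // Set.add returns false when the identity was already seen.
      if (excludedInodes.contains(id) || !seenInodes.add(id)) {
        continue;
      }
      toBeSent.add(file);
    }
    return toBeSent;
  }
}
```

With this shape, a snapshot hard link such as s1/1.sst and its target activeOMDB/1.sst resolve to the same identity, so reporting either one in the exclude set is enough to skip both on the next request.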


