[
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jingcheng Du updated HBASE-18693:
---------------------------------
Comment: was deleted
(was: Thanks Huaxiang.
Right, an HDFS move doesn't copy the data; it is supposed to be a
rename operation.
My concern is that a snapshot can be restored twice. How should we handle
that case?
In HBase, we compact the hfile links during compaction, so I think compacting
hfile links in MOB compaction is reasonable too.
Or we can skip the hfile links in most MOB compactions, and compact the links
at a longer interval (like a month)?
I prefer the first option. What do you think? Thanks.
)
> adding an option to restore_snapshot to move mob files from archive dir to
> working dir
> --------------------------------------------------------------------------------------
>
> Key: HBASE-18693
> URL: https://issues.apache.org/jira/browse/HBASE-18693
> Project: HBase
> Issue Type: Improvement
> Components: mob
> Affects Versions: 2.0.0-alpha-2
> Reporter: huaxiang sun
> Assignee: huaxiang sun
>
> Today, there is a single mob region where mob files for all user regions are
> saved. There could be many files (one million) in a single mob directory.
> When one mob table is restored or cloned from snapshot, links are created for
> these mob files. This creates a scaling issue for mob compaction. In mob
> compaction's select() logic, each hFileLink requires a call to the NameNode's
> getFileStatus() to get the size of the linked hfile. Assuming one such
> call takes 20ms, 20ms * 1000000 = 20000s, or about 5.6 hours.
> To avoid this overhead, we want to add an option so that restore_snapshot can
> move mob files from the archive dir to the working dir. clone_snapshot is more
> complicated: it can clone a snapshot to a different table, so moving the
> files could destroy the snapshot. No option will be added for clone_snapshot.
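
As a rough check of the estimate in the description, here is a minimal Java
sketch of the arithmetic. It assumes the per-link calls are made one after
another, and it takes the one million links and 20ms per call straight from
the description. The class and method names (MobLinkScanCost, estimate) are
hypothetical and are not part of HBase.

```java
import java.time.Duration;
import java.util.Locale;

public class MobLinkScanCost {
    // Hypothetical helper: total wall-clock time if each link costs one
    // sequential metadata call of the given latency (an assumption).
    public static Duration estimate(long linkCount, long millisPerCall) {
        return Duration.ofMillis(Math.multiplyExact(linkCount, millisPerCall));
    }

    public static void main(String[] args) {
        // Figures from the issue description: one million links, 20ms per call.
        Duration total = estimate(1_000_000L, 20L);
        double hours = total.getSeconds() / 3600.0;
        System.out.println(String.format(Locale.ROOT,
                "%d s (about %.1f hours)", total.getSeconds(), hours));
    }
}
```

The total grows linearly with the number of links, which is why removing the
links (by moving the files) helps more than making each call faster.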