initTableSnapshotMapperJob writes into this directory (indirectly) via RestoreSnapshotHelper.restoreHdfsRegions.

Is this expected? I would have expected writes to be limited to the temp directory passed in the init call.

Brian

On Sep 7, 2014, at 8:17 AM, Ted Yu <[email protected]> wrote:

> The files under archive directory are referenced by snapshots.
> Please don't delete them manually.
>
> You can delete unused snapshots.
>
> Cheers
>
> On Sep 7, 2014, at 4:08 AM, Brian Jeltema <[email protected]> wrote:
>
>>
>> On Sep 6, 2014, at 9:32 AM, Ted Yu <[email protected]> wrote:
>>
>>> Can you post your hbase-site.xml ?
>>>
>>> /apps/hbase/data/archive/data/default is where HFiles are archived (e.g.
>>> when a column family is deleted, HFiles for this column family are stored
>>> here).
>>> /apps/hbase/data/data/default seems to be your hbase.rootdir
>>
>> hbase.rootdir is defined to be hdfs://foo:8020/apps/hbase/data. I think
>> that's the default that Ambari creates.
>>
>> So the HFiles in the archive subdirectory have been discarded and can be
>> deleted safely?
>>
>>> bq. a problem I'm having running map/reduce jobs against snapshots
>>>
>>> Can you describe the problem in a bit more detail ?
>>
>> I don't understand what I'm seeing well enough to ask an intelligent
>> question yet.
>> I appear to be scanning duplicate rows when using initTableSnapshotMapperJob,
>> but I'm trying to get a better understanding of how this works, since it's
>> probably just something I'm doing wrong.
>>
>> Brian
>>
>>> Cheers
>>>
>>>
>>> On Sat, Sep 6, 2014 at 6:09 AM, Brian Jeltema <[email protected]> wrote:
>>>
>>>> I'm trying to track down a problem I'm having running map/reduce jobs
>>>> against snapshots.
>>>> Can someone explain the difference between files stored in:
>>>>
>>>> /apps/hbase/data/archive/data/default
>>>>
>>>> and files stored in
>>>>
>>>> /apps/hbase/data/data/default
>>>>
>>>> (Hadoop 2.4, HBase 0.98)
>>>>
>>>> Thanks
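
For reference, both directories discussed in this thread sit under hbase.rootdir. The following is a minimal sketch in plain Java (no HBase dependency) of how the two paths relate, assuming the layout described above for HBase 0.98. The class and method names are hypothetical and are not part of the HBase API; only the rootdir value and the resulting paths come from the thread.

```java
// Hypothetical helper illustrating the directory layout described in this thread.
// It only builds path strings; it does not touch HDFS or call any HBase API.
public final class HBaseLayoutSketch {

    // Drop a single trailing slash so concatenation does not produce "//".
    private static String stripTrailingSlash(String s) {
        return s.endsWith("/") ? s.substring(0, s.length() - 1) : s;
    }

    // Directory holding the live table data for a namespace.
    static String liveDataDir(String rootDir, String namespace) {
        return stripTrailingSlash(rootDir) + "/data/" + namespace;
    }

    // Directory holding archived HFiles for a namespace. Per the thread, these
    // files may still be referenced by snapshots and should not be deleted by hand.
    static String archiveDataDir(String rootDir, String namespace) {
        return stripTrailingSlash(rootDir) + "/archive/data/" + namespace;
    }

    public static void main(String[] args) {
        // Value of hbase.rootdir as quoted in the thread.
        String rootDir = "hdfs://foo:8020/apps/hbase/data";
        System.out.println(liveDataDir(rootDir, "default"));
        System.out.println(archiveDataDir(rootDir, "default"));
    }
}
```

With the rootdir quoted above, the two printed paths end in /apps/hbase/data/data/default and /apps/hbase/data/archive/data/default, the two locations asked about in the original message.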
