> Your cluster is an insecure HBase deployment, right?

Yes.

> Are all files under /apps/hbase/data/archive/data/default owned by user
> 'hdfs'?

No. However, the ownership failure isn't what I'm concerned about; I
understand what caused that. But the stack trace illustrates behavior of
initTableSnapshotMapperJob that I didn't expect, and I'm just trying to
understand what it's doing.
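
For reference, the call under discussion looks roughly like this. This is
a minimal sketch, not my actual job: the class names SnapshotScanSketch
and HostMapper, the snapshot name 'host_snapshot', and the restore path
/tmp/snapshot_restore are all placeholders.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    public class SnapshotScanSketch {

      // Placeholder mapper; the real per-row logic is omitted.
      public static class HostMapper extends TableMapper<Text, Text> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context ctx) {
          // process one snapshot row here
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "snapshot-scan");
        job.setJarByClass(SnapshotScanSketch.class);
        TableMapReduceUtil.initTableSnapshotMapperJob(
            "host_snapshot",       // snapshot to read (hypothetical name)
            new Scan(),            // scan applied to the snapshot regions
            HostMapper.class,      // mapper
            Text.class,            // map output key class
            Text.class,            // map output value class
            job,
            true,                  // ship HBase dependency jars with the job
            new Path("/tmp/snapshot_restore"));  // restore dir handed to
                                                 // TableSnapshotInputFormat.setInput
        job.setNumReduceTasks(0);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Given that last Path argument, I would have expected every write to stay
under /tmp/snapshot_restore, but the trace below shows cloneHdfsRegions
doing a mkdirs under the archive tree instead.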
> BTW, in the tip of 0.98, with HBASE-11742, the related code looks a bit
> different.
>
> Cheers
>
> On Sun, Sep 7, 2014 at 8:27 AM, Brian Jeltema <[email protected]> wrote:
>
>>> Eclipse doesn't show that RestoreSnapshotHelper.restoreHdfsRegions() is
>>> called by initTableSnapshotMapperJob (in the master branch).
>>>
>>> Looking at TableMapReduceUtil.java in 0.98, I don't see a direct
>>> relation between the two.
>>>
>>> Do you have a stack trace or something else showing the relationship?
>>
>> Right. That's what I meant by 'indirectly'. This is a stack trace that
>> was caused by an ownership conflict:
>>
>> java.io.IOException: java.util.concurrent.ExecutionException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/apps/hbase/data/archive/data/default/Host/c41d632d5eee02e1883215460e5c261d/p":hdfs:hdfs:drwxr-xr-x
>>   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
>>   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
>>   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
>>   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
>>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5509)
>>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5491)
>>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5465)
>>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3608)
>>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3578)
>>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3552)
>>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
>>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
>>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:396)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>>   at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:131)
>>   at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.cloneHdfsRegions(RestoreSnapshotHelper.java:475)
>>   at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:208)
>>   at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.copySnapshotForScanner(RestoreSnapshotHelper.java:733)
>>   at org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat.setInput(TableSnapshotInputFormat.java:397)
>>   at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableSnapshotMapperJob(TableMapReduceUtil.java:301)
>>   at net.digitalenvoy.hp.job.ParseHostnamesJob.run(ParseHostnamesJob.java:77)
>>   at net.digitalenvoy.hp.HostProcessor.run(HostProcessor.java:165)
>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>   at net.digitalenvoy.hp.HostProcessor.main(HostProcessor.java:47)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>
>>> Cheers
>>>
>>> On Sun, Sep 7, 2014 at 5:48 AM, Brian Jeltema <[email protected]> wrote:
>>>
>>>> initTableSnapshotMapperJob writes into this directory (indirectly) via
>>>> RestoreSnapshotHelper.restoreHdfsRegions.
>>>>
>>>> Is this expected? I would have expected writes to be limited to the
>>>> temp directory passed in the init call.
>>>>
>>>> Brian
>>>>
>>>> On Sep 7, 2014, at 8:17 AM, Ted Yu <[email protected]> wrote:
>>>>
>>>>> The files under the archive directory are referenced by snapshots.
>>>>> Please don't delete them manually.
>>>>>
>>>>> You can delete unused snapshots.
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Sep 7, 2014, at 4:08 AM, Brian Jeltema <[email protected]> wrote:
>>>>>
>>>>>> On Sep 6, 2014, at 9:32 AM, Ted Yu <[email protected]> wrote:
>>>>>>
>>>>>>> Can you post your hbase-site.xml?
>>>>>>>
>>>>>>> /apps/hbase/data/archive/data/default is where HFiles are archived
>>>>>>> (e.g. when a column family is deleted, HFiles for this column
>>>>>>> family are stored here).
>>>>>>> /apps/hbase/data/data/default seems to be your hbase.rootdir
>>>>>>
>>>>>> hbase.rootdir is defined to be hdfs://foo:8020/apps/hbase/data. I
>>>>>> think that's the default that Ambari creates.
>>>>>>
>>>>>> So the HFiles in the archive subdirectory have been discarded and
>>>>>> can be deleted safely?
>>>>>>
>>>>>>> bq. a problem I'm having running map/reduce jobs against snapshots
>>>>>>>
>>>>>>> Can you describe the problem in a bit more detail?
>>>>>>
>>>>>> I don't understand what I'm seeing well enough to ask an intelligent
>>>>>> question yet. I appear to be scanning duplicate rows when using
>>>>>> initTableSnapshotMapperJob, but I'm trying to get a better
>>>>>> understanding of how this works, since it's probably just something
>>>>>> I'm doing wrong.
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On Sat, Sep 6, 2014 at 6:09 AM, Brian Jeltema <[email protected]> wrote:
>>>>>>>
>>>>>>>> I'm trying to track down a problem I'm having running map/reduce
>>>>>>>> jobs against snapshots.
>>>>>>>> Can someone explain the difference between files stored in
>>>>>>>>
>>>>>>>> /apps/hbase/data/archive/data/default
>>>>>>>>
>>>>>>>> and files stored in
>>>>>>>>
>>>>>>>> /apps/hbase/data/data/default
>>>>>>>>
>>>>>>>> (Hadoop 2.4, HBase 0.98)
>>>>>>>>
>>>>>>>> Thanks
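
P.S. On Ted's earlier point about deleting unused snapshots rather than
touching the archive directory by hand: with the 0.98 client API that
would look roughly like the sketch below. This is an illustration only;
the class name SnapshotCleanupSketch and the snapshot name
'obsolete_snapshot' are hypothetical.

    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.protobuf.generated.HBaseProtos.SnapshotDescription;

    public class SnapshotCleanupSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
          // See what exists before removing anything.
          List<SnapshotDescription> snapshots = admin.listSnapshots();
          for (SnapshotDescription sd : snapshots) {
            System.out.println(sd.getName());
          }
          // Dropping a snapshot releases its references into the archive
          // tree; my understanding is the master's cleaner chore can then
          // reclaim HFiles that no remaining snapshot references.
          admin.deleteSnapshot("obsolete_snapshot");
        } finally {
          admin.close();
        }
      }
    }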
