Hello Ravi,

Thank you for your response. I have another question:
I am trying to trace a call from org.apache.hadoop.fs.FsShell to the NameNode. I am running a simple "ls" to understand how the mechanism works. What I want to know is which classes are used along the way when the "ls" is computed. To follow the flow, I print some messages along the way; my trace output is below. I get lost between trace lines 34 and 35. I also could not find how DFSClient, DistributedFileSystem, and FileSystem are used along this path when running "ls". Any comment/help is appreciated. I basically want to know the call sequence of the basic shell command "ls".

Thanks
Yasin

1 - FsShell.main: [-ls, /test]
2 - FsShell.run: [-ls, /test]
3 - FsShell.init 1
4 - FsShell.init 2
5 - FsShell.registerCommands
6 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.FsCommand
7 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.AclCommands
8 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.CopyCommands
9 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Count
10 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Delete
11 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Display
12 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.find.Find
13 - CommandFactory.registerCommands org.apache.hadoop.fs.FsShellPermissions
14 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.FsUsage
15 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Ls
16 - LS.registerCommands: org.apache.hadoop.fs.shell.CommandFactory@51c8530f
17 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Mkdir
18 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.MoveCommands
19 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.SetReplication
20 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Stat
21 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Tail
22 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Test
23 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Touch
24 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.Truncate
25 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.SnapshotCommands
26 - CommandFactory.registerCommands org.apache.hadoop.fs.shell.XAttrCommands
27 - FsShell.run:cmd: -ls
28 - FsShell.run:Command Name: ls
29 - FsShell.run: Started:********************************** org.apache.hadoop.fs.shell.Ls@3bd94634
30 - Command.run
31 - LS.processOptions: args[0]: /test
32 - Command.run. After ProcessOptions
33 - Command.processRawArguments
34 - Command.expandArguments: /test
35 - UserGroupInformation.getCurrentUser()
36 - UserGroupInformation.getLoginUser() 1
37 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.ensureInitialized() 1
38 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.initialize: 1
39 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.initialize: 2
40 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.initialize: 3
41 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.initialize: 4
42 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.ensureInitialized() 2
43 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.loginUserFromSubject: 1 yasin (auth:null)
44 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.loginUserFromSubject: 2
45 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.loginUserFromSubject: 3
46 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.loginUserFromSubject: 4
47 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.spawnAutoRenewalThreadForUserCreds() 1
48 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.loginUserFromSubject: 7
49 - 16/09/17 14:34:05 INFO security.UserGroupInformation: UserGroupInformation.loginUserFromSubject: 8
50 - UserGroupInformation.getLoginUser() 2
51 - UserGroupInformation.getCurrentUser()
52 - DistributedFileSystem.DistributedFileSystem()
53 - DistributedFileSystem.DistributedFileSystem()
54 - FileSystem.createFileSystemhdfs://localhost:54310
55 - FileSystem.initializehdfs://localhost:54310
56 - DistributedFileSystem.initialize hdfs://localhost:54310
57 - UserGroupInformation.getCurrentUser()
58 - DFSClient.DFSClient 2:
59 - NamenodeProxies.creatProxy :
60 - NamenodeProxies.creatNNProxyWithClientProtocol:localhost/127.0.0.1:54310
61 - UserGroupInformation.getCurrentUser()
62 - DFSClient.DFSClient 1:
63 - Command.processArguments 1
64 - Command.processArguments /test
65 - Command.processArgument: item: /test
66 - Command.recursePath: /test
67 - DistributedFileSystem.listStatusInternal:Path:test
68 - DFSCleint.listpaths: 1 /test
69 - DFSCleint.listpaths: 2 /test
70 - Found 2 items
71 - LS.processPaths items: [Lorg.apache.hadoop.fs.shell.PathData;@75f95314
72 - Command.processPaths: item: /test/output
73 - =====LS.processPath:item: /test/output
74 - =====LS.processPath Before out: drwxr-xr-x - yasin supergroup 0 2016-09-09 11:40 /test/output
75 - drwxr-xr-x - yasin supergroup 0 2016-09-09 11:40 /test/output
76 - =====LS.processPath After out: drwxr-xr-x - yasin supergroup 0 2016-09-09 11:40 /test/output
77 - Command.processPaths: item: /test/text2gb.txt
78 - =====LS.processPath:item: /test/text2gb.txt
79 - =====LS.processPath Before out: -rw-r--r-- 3 yasin supergroup 2002096300 2016-09-06 17:04 /test/text2gb.txt
80 - -rw-r--r-- 3 yasin supergroup 2002096300 2016-09-06 17:04 /test/text2gb.txt
81 - =====LS.processPath After out: -rw-r--r-- 3 yasin supergroup 2002096300 2016-09-06 17:04 /test/text2gb.txt
82 - Command.run. After processRawArguments
83 - FsShell.run: STOP:**********************************

Yasin Celik

On Wed, Aug 10, 2016 at 1:32 PM, Ravi Prakash <ravihad...@gmail.com> wrote:

> Hi Yasin!
>
> Without knowing more about your project, here are answers to your questions.
>
> It's trivially easy to start only the Datanode. The HDFS code is very modular:
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
> and
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop
> is a script you can use.
>
> Obviously, though, the Datanode will try to talk to a Namenode via the Namenode RPC mechanism:
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
>
> If you wanted to modify the Namenode, here's the RPC interface it exports:
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
>
> Good luck with your project!
> HTH
> Ravi
>
> On Wed, Aug 10, 2016 at 9:11 AM, Yasin Celik <yasinceli...@gmail.com> wrote:
>
>> Hello All,
>>
>> I am working on a P2P storage project for research purposes.
>> I want to use the HDFS DataNode as part of this research project.
>> One possibility is using only the DataNode as a storage engine and doing
>> everything else at an upper level. In this case I will have all the metadata
>> management and replication mechanisms at the upper level and use the DataNode
>> only for storing data per node.
>>
>> The second possibility is also using the NameNode for metadata management and
>> modifying it to fit my project.
>>
>> I have been trying to find where to start. How much modularity is there in
>> HDFS?
>> Can I use the DataNode alone and modify it to fit my project? What are the
>> inputs and outputs of the DataNode? Where should I start?
>>
>> If I decide to also use the NameNode, where should I start?
>>
>> Any comment/help is appreciated.
>>
>> Thanks
>>
>> Yasin Celik
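[Editor's note on the trace in the question above: between trace entries 51 and 62, the trace itself shows the hand-off being asked about. FileSystem.createFileSystem builds a DistributedFileSystem for the hdfs:// scheme, DistributedFileSystem.initialize constructs a DFSClient, and the DFSClient creates the RPC proxy to the NameNode; the later listStatusInternal/listpaths entries then go through that client. The sketch below illustrates only the general shape of that scheme-keyed dispatch. All Mini* class and method names are invented for illustration; this is not Hadoop's actual code.]

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified sketch of scheme-based FileSystem dispatch.
// Names are illustrative only, not Hadoop's real API.
abstract class MiniFileSystem {
    // One cached filesystem instance per URI scheme, created on first use.
    private static final Map<String, MiniFileSystem> CACHE = new HashMap<>();

    static MiniFileSystem get(URI uri) {
        return CACHE.computeIfAbsent(uri.getScheme(), scheme -> {
            MiniFileSystem fs = createFileSystem(scheme);
            fs.initialize(uri); // analogous role to DistributedFileSystem.initialize
            return fs;
        });
    }

    private static MiniFileSystem createFileSystem(String scheme) {
        if ("hdfs".equals(scheme)) {
            return new MiniDistributedFileSystem();
        }
        throw new IllegalArgumentException("No filesystem for scheme " + scheme);
    }

    abstract void initialize(URI uri);

    abstract String[] listStatus(String path);
}

// Stands in for DistributedFileSystem: it delegates to a client object
// that would own the RPC proxy to the NameNode (DFSClient's role in HDFS).
class MiniDistributedFileSystem extends MiniFileSystem {
    private MiniDfsClient client;

    @Override
    void initialize(URI uri) {
        this.client = new MiniDfsClient(uri.getHost(), uri.getPort());
    }

    @Override
    String[] listStatus(String path) {
        return client.listPaths(path); // would be an RPC round trip in real HDFS
    }
}

class MiniDfsClient {
    MiniDfsClient(String host, int port) {
        // A real client would create the NameNode proxy here.
    }

    String[] listPaths(String path) {
        // Canned answer standing in for the NameNode's directory listing.
        return new String[] { path + "/output", path + "/text2gb.txt" };
    }
}

public class LsSketch {
    public static void main(String[] args) {
        MiniFileSystem fs = MiniFileSystem.get(URI.create("hdfs://localhost:54310/"));
        for (String entry : fs.listStatus("/test")) {
            System.out.println(entry);
        }
    }
}
```

In this sketch, as in the trace, the concrete filesystem only appears once a path is resolved: the shell command works with abstract paths until the scheme lookup picks the hdfs implementation, which is why DistributedFileSystem and the client show up between trace entries 34 and 63.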
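[Editor's note on the RPC pointers above: the Datanode-to-Namenode conversation Ravi links to goes through a shared Java protocol interface (DatanodeProtocol), which the Namenode's RPC server implements and the Datanode calls through a proxy. The sketch below shows only that protocol-interface pattern, in-process, using java.lang.reflect.Proxy in place of Hadoop's networked RPC. The Mini* names are invented; only the pattern is taken from the links.]

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical protocol interface, playing the role of DatanodeProtocol:
// both sides compile against it; the server implements it.
interface MiniDatanodeProtocol {
    // Loosely modeled on a heartbeat call: the Datanode reports in and
    // the Namenode can hand back instructions in the reply.
    String sendHeartbeat(String datanodeId, long capacityBytes);
}

// Server side, playing the role of NameNodeRpcServer.
class MiniNameNodeRpcServer implements MiniDatanodeProtocol {
    @Override
    public String sendHeartbeat(String datanodeId, long capacityBytes) {
        return "ack:" + datanodeId;
    }
}

public class RpcSketch {
    // Stands in for obtaining an RPC proxy: this one forwards in-process,
    // where real Hadoop RPC would serialize the call over the network.
    static MiniDatanodeProtocol getProxy(MiniDatanodeProtocol server) {
        InvocationHandler handler = (Object p, Method m, Object[] a) -> m.invoke(server, a);
        return (MiniDatanodeProtocol) Proxy.newProxyInstance(
                MiniDatanodeProtocol.class.getClassLoader(),
                new Class<?>[] { MiniDatanodeProtocol.class },
                handler);
    }

    public static void main(String[] args) {
        MiniDatanodeProtocol namenode = getProxy(new MiniNameNodeRpcServer());
        System.out.println(namenode.sendHeartbeat("dn-1", 1_000_000L));
    }
}
```

The point of the pattern for the P2P project is that a replacement "namenode" only has to implement the protocol interface the Datanode expects; callers never see the concrete server class, only the interface behind the proxy.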