We set up a dedicated LAN connecting the two computers through a switch. I think that made the difference; the two-node cluster is working fine now. We have also moved from Fedora to Ubuntu.
Thanks for all the help.

Shefali

On Thu, 12 Feb 2009 shefali pawar wrote:
>I changed the value... It is still not working!
>
>Shefali
>
>On Tue, 10 Feb 2009 22:23:10 +0530 wrote
> >in hadoop-site.xml
> >change master:54311
> >
> >to hdfs://master:54311
> >
> >--nitesh
> >
> >On Tue, Feb 10, 2009 at 9:50 PM, shefali pawar wrote:
> >
> >> I tried that, but it is not working either!
> >>
> >> Shefali
> >>
> >> On Sun, 08 Feb 2009 05:27:54 +0530 wrote
> >> >I ran into this trouble again. This time, formatting the namenode didn't
> >> >help. So, I changed the directories where the metadata and the data were
> >> >being stored. That made it work.
> >> >
> >> >You might want to check this at your end too.
> >> >
> >> >Amandeep
> >> >
> >> >PS: I don't have an explanation for how and why this made it work.
> >> >
> >> >Amandeep Khurana
> >> >Computer Science Graduate Student
> >> >University of California, Santa Cruz
> >> >
> >> >On Sat, Feb 7, 2009 at 9:06 AM, jason hadoop wrote:
> >> >
> >> >> On your master machine, use the netstat command to determine what
> >> >> ports and addresses the namenode process is listening on.
> >> >>
> >> >> On the datanode machines, examine the log files to verify that the
> >> >> datanode has attempted to connect to the namenode IP address on one
> >> >> of those ports, and was successful.
> >> >>
> >> >> The common ports used for the datanode -> namenode rendezvous are
> >> >> 50010, 54320 and 8020, depending on your Hadoop version.
> >> >>
> >> >> If the datanodes have been started, and the connection to the namenode
> >> >> failed, there will be a log message with a socket error, indicating
> >> >> what host and port the datanode used to attempt to communicate with
> >> >> the namenode.
> >> >> Verify that that IP address is correct for your namenode, and
> >> >> reachable from the datanode host (for multi-homed machines this can
> >> >> be an issue), and that the port listed is one of the TCP ports that
> >> >> the namenode process is listening on.
> >> >>
> >> >> For Linux, you can use the command
> >> >> *netstat -a -t -n -p | grep java | grep LISTEN*
> >> >> to determine the IP addresses, ports and pids of the java processes
> >> >> that are listening for TCP socket connections,
> >> >>
> >> >> and the jps command from the bin directory of your Java installation
> >> >> to determine the pid of the namenode.
> >> >>
> >> >> On Sat, Feb 7, 2009 at 6:27 AM, shefali pawar wrote:
> >> >>
> >> >> > Hi,
> >> >> >
> >> >> > No, not yet. We are still struggling! If you find the solution
> >> >> > please let me know.
> >> >> >
> >> >> > Shefali
> >> >> >
> >> >> > On Sat, 07 Feb 2009 02:56:15 +0530 wrote
> >> >> > >I had to change the master on my running cluster and ended up with
> >> >> > >the same problem. Were you able to fix it at your end?
> >> >> > >
> >> >> > >Amandeep
> >> >> > >
> >> >> > >Amandeep Khurana
> >> >> > >Computer Science Graduate Student
> >> >> > >University of California, Santa Cruz
> >> >> > >
> >> >> > >On Thu, Feb 5, 2009 at 8:46 AM, shefali pawar wrote:
> >> >> > >
> >> >> > >> Hi,
> >> >> > >>
> >> >> > >> I do not think that the firewall is blocking the port because it
> >> >> > >> has been turned off on both the computers! And also since it is a
> >> >> > >> random port number I do not think it should create a problem.
> >> >> > >>
> >> >> > >> I do not understand what is going wrong!
> >> >> > >>
> >> >> > >> Shefali
> >> >> > >>
> >> >> > >> On Wed, 04 Feb 2009 23:23:04 +0530 wrote
> >> >> > >> >I'm not certain that the firewall is your problem, but if that
> >> >> > >> >port is blocked on your master you should open it to let
> >> >> > >> >communication through. Here is one website that might be
> >> >> > >> >relevant:
> >> >> > >> >
> >> >> > >> >http://stackoverflow.com/questions/255077/open-ports-under-fedora-core-8-for-vmware-server
> >> >> > >> >
> >> >> > >> >but again, this may not be your problem.
> >> >> > >> >
> >> >> > >> >John
> >> >> > >> >
> >> >> > >> >On Wed, Feb 4, 2009 at 12:46 PM, shefali pawar wrote:
> >> >> > >> >
> >> >> > >> >> Hi,
> >> >> > >> >>
> >> >> > >> >> I will have to check. I can do that tomorrow in college. But
> >> >> > >> >> if that is the case what should I do?
> >> >> > >> >>
> >> >> > >> >> Should I change the port number and try again?
> >> >> > >> >>
> >> >> > >> >> Shefali
> >> >> > >> >>
> >> >> > >> >> On Wed, 04 Feb 2009 S D wrote:
> >> >> > >> >>
> >> >> > >> >> >Shefali,
> >> >> > >> >> >
> >> >> > >> >> >Is your firewall blocking port 54310 on the master?
> >> >> > >> >> >
> >> >> > >> >> >John
> >> >> > >> >> >
> >> >> > >> >> >On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar wrote:
> >> >> > >> >> >
> >> >> > >> >> > > Hi,
> >> >> > >> >> > >
> >> >> > >> >> > > I am trying to set up a two-node cluster using Hadoop
> >> >> > >> >> > > 0.19.0, with 1 master (which should also work as a slave)
> >> >> > >> >> > > and 1 slave node.
> >> >> > >> >> > >
> >> >> > >> >> > > But while running bin/start-dfs.sh the datanode is not
> >> >> > >> >> > > starting on the slave. I had read the previous mails on
> >> >> > >> >> > > the list, but nothing seems to be working in this case.
> >> >> > >> >> > > I am getting the following error in the
> >> >> > >> >> > > hadoop-root-datanode-slave log file while running the
> >> >> > >> >> > > command bin/start-dfs.sh =>
> >> >> > >> >> > >
> >> >> > >> >> > > 2009-02-03 13:00:27,516 INFO
> >> >> > >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> >> >> > >> >> > > /************************************************************
> >> >> > >> >> > > STARTUP_MSG: Starting DataNode
> >> >> > >> >> > > STARTUP_MSG:   host = slave/172.16.0.32
> >> >> > >> >> > > STARTUP_MSG:   args = []
> >> >> > >> >> > > STARTUP_MSG:   version = 0.19.0
> >> >> > >> >> > > STARTUP_MSG:   build =
> >> >> > >> >> > > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r
> >> >> > >> >> > > 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
> >> >> > >> >> > > ************************************************************/
> >> >> > >> >> > > 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 0 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 1 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 2 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 3 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 4 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 5 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 6 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 7 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 8 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > >> >> > > connect to server: master/172.16.0.46:54310. Already tried 9 time(s).
> >> >> > >> >> > > 2009-02-03 13:00:37,738 ERROR
> >> >> > >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
> >> >> > >> >> > > Call to master/172.16.0.46:54310 failed on local exception: No route to host
> >> >> > >> >> > >     at org.apache.hadoop.ipc.Client.call(Client.java:699)
> >> >> > >> >> > >     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
> >> >> > >> >> > >     at $Proxy4.getProtocolVersion(Unknown Source)
> >> >> > >> >> > >     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
> >> >> > >> >> > >     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
> >> >> > >> >> > >     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
> >> >> > >> >> > >     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
> >> >> > >> >> > >     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258)
> >> >> > >> >> > >     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
> >> >> > >> >> > >     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
> >> >> > >> >> > >     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
> >> >> > >> >> > >     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
> >> >> > >> >> > >     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
> >> >> > >> >> > > Caused by: java.net.NoRouteToHostException: No route to host
> >> >> > >> >> > >     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >> >> > >> >> > >     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> >> >> > >> >> > >     at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
> >> >> > >> >> > >     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
> >> >> > >> >> > >     at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
> >> >> > >> >> > >     at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
> >> >> > >> >> > >     at org.apache.hadoop.ipc.Client.call(Client.java:685)
> >> >> > >> >> > >     ... 12 more
> >> >> > >> >> > >
> >> >> > >> >> > > 2009-02-03 13:00:37,739 INFO
> >> >> > >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> >> >> > >> >> > > /************************************************************
> >> >> > >> >> > > SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32
> >> >> > >> >> > > ************************************************************/
> >> >> > >> >> > >
> >> >> > >> >> > > Also, the pseudo-distributed operation is working on both
> >> >> > >> >> > > the machines. And I am able to ssh from 'master to master'
> >> >> > >> >> > > and 'master to slave' via a password-less ssh login. I do
> >> >> > >> >> > > not think there is any problem with the network because
> >> >> > >> >> > > cross pinging is working fine.
> >> >> > >> >> > >
> >> >> > >> >> > > I am working on Linux (Fedora 8).
> >> >> > >> >> > >
> >> >> > >> >> > > The following is the configuration which I am using.
> >> >> > >> >> > >
> >> >> > >> >> > > On master and slave, conf/masters looks like this:
> >> >> > >> >> > >
> >> >> > >> >> > > master
> >> >> > >> >> > >
> >> >> > >> >> > > On master and slave, conf/slaves looks like this:
> >> >> > >> >> > >
> >> >> > >> >> > > master
> >> >> > >> >> > > slave
> >> >> > >> >> > >
> >> >> > >> >> > > On both the machines conf/hadoop-site.xml looks like this:
> >> >> > >> >> > >
> >> >> > >> >> > > <property>
> >> >> > >> >> > >   <name>fs.default.name</name>
> >> >> > >> >> > >   <value>hdfs://master:54310</value>
> >> >> > >> >> > >   <description>The name of the default file system. A URI whose
> >> >> > >> >> > >   scheme and authority determine the FileSystem implementation. The
> >> >> > >> >> > >   uri's scheme determines the config property (fs.SCHEME.impl) naming
> >> >> > >> >> > >   the FileSystem implementation class. The uri's authority is used to
> >> >> > >> >> > >   determine the host, port, etc. for a filesystem.</description>
> >> >> > >> >> > > </property>
> >> >> > >> >> > >
> >> >> > >> >> > > <property>
> >> >> > >> >> > >   <name>mapred.job.tracker</name>
> >> >> > >> >> > >   <value>master:54311</value>
> >> >> > >> >> > >   <description>The host and port that the MapReduce job tracker runs
> >> >> > >> >> > >   at. If "local", then jobs are run in-process as a single map
> >> >> > >> >> > >   and reduce task.</description>
> >> >> > >> >> > > </property>
> >> >> > >> >> > >
> >> >> > >> >> > > <property>
> >> >> > >> >> > >   <name>dfs.replication</name>
> >> >> > >> >> > >   <value>2</value>
> >> >> > >> >> > >   <description>Default block replication. The actual number of
> >> >> > >> >> > >   replications can be specified when the file is created. The default
> >> >> > >> >> > >   is used if replication is not specified in create time.</description>
> >> >> > >> >> > > </property>
> >> >> > >> >> > >
> >> >> > >> >> > > The namenode is formatted successfully by running
> >> >> > >> >> > >
> >> >> > >> >> > > "bin/hadoop namenode -format"
> >> >> > >> >> > >
> >> >> > >> >> > > on the master node.
> >> >> > >> >> > >
> >> >> > >> >> > > I am new to Hadoop and I do not know what is going wrong.
> >> >> > >> >> > >
> >> >> > >> >> > > Any help will be appreciated.
> >> >> > >> >> > >
> >> >> > >> >> > > Thanking you in advance,
> >> >> > >> >> > >
> >> >> > >> >> > > Shefali Pawar
> >> >> > >> >> > > Pune, India
> >
> >--
> >Nitesh Bhatia
> >Dhirubhai Ambani Institute of Information & Communication Technology
> >Gandhinagar
> >Gujarat
> >
> >"Life is never perfect. It just depends where you draw the line."
> >
> >visit:
> >http://www.awaaaz.com - connecting through music
> >http://www.volstreet.com - lets volunteer for better tomorrow
> >http://www.instibuzz.com - Voice opinions, Transact easily, Have fun
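Jason's netstat/jps diagnostic from the thread above can be sketched as a short shell session. This is only a sketch: netstat and jps are standard Linux/JDK tools, but the canned sample line and the awk extraction at the end are illustrative additions, not Hadoop output.

```shell
# List TCP listeners owned by java processes; "|| true" keeps the snippet
# going when nothing matches (e.g. no Hadoop daemon is running here).
netstat -tlnp 2>/dev/null | grep java || true

# jps (in the JDK's bin directory) lists java pids by main class name.
jps 2>/dev/null | grep -i namenode || true

# Pulling the port out of a netstat "Local Address" column. A canned line
# (hypothetical pid, address from this thread) is used so the extraction
# can be checked without a live namenode:
sample="tcp  0  0 172.16.0.46:54310  0.0.0.0:*  LISTEN  4231/java"
port=$(echo "$sample" | awk '{n=split($4, a, ":"); print a[n]}')
echo "namenode appears to listen on port $port"
```

The port printed this way should match the port the datanodes are configured to contact in fs.default.name (54310 in this thread).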

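One pattern worth noting in the thread: NoRouteToHostException while plain ping succeeds is typically a host firewall rejecting the TCP connection (Fedora's default iptables rules reject with icmp-host-prohibited, which Java surfaces as "No route to host"). A minimal reachability probe from the slave, using bash's /dev/tcp pseudo-device; the hostname and port are the ones from this thread, so adjust for your cluster:

```shell
check_port() {
    # Return 0 if a TCP connection to host $1, port $2 can be opened.
    # The subshell opens (and implicitly closes) fd 3 via /dev/tcp.
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if check_port master 54310; then
    echo "master:54310 reachable -- the firewall is not the problem"
else
    echo "cannot connect to master:54310 -- check iptables on the master"
fi
```

/dev/tcp is a bash feature, not a real device, so this must run under bash; on other shells, telnet or nc to the same host and port gives the same answer.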