Aaron: Which log file do I look into? There are a lot of them. Here's what the error looks like:

[mith...@node19:~]$ cd hadoop
[mith...@node19:~/hadoop]$ bin/hadoop dfs -ls
09/04/14 10:09:29 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
09/04/14 10:09:30 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 1 time(s).
09/04/14 10:09:31 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s).
09/04/14 10:09:32 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 3 time(s).
09/04/14 10:09:33 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 4 time(s).
09/04/14 10:09:34 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 5 time(s).
09/04/14 10:09:35 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 6 time(s).
09/04/14 10:09:36 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 7 time(s).
09/04/14 10:09:37 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 8 time(s).
09/04/14 10:09:38 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 9 time(s).
Bad connection to FS. command aborted.
Node19 is a slave and Node18 is the master.

Mithila

On Tue, Apr 14, 2009 at 8:53 PM, Aaron Kimball <[email protected]> wrote:

> Are there any error messages in the log files on those nodes?
> - Aaron
>
> On Tue, Apr 14, 2009 at 9:03 AM, Mithila Nagendra <[email protected]> wrote:
>
> > I've drawn a blank here! Can't figure out what's wrong with the
> > ports. I can ssh between the nodes but can't access the DFS from
> > the slaves - says "Bad connection to DFS". Master seems to be fine.
> > Mithila
> >
> > On Tue, Apr 14, 2009 at 4:28 AM, Mithila Nagendra <[email protected]> wrote:
> >
> > > Yes I can..
> > >
> > > On Mon, Apr 13, 2009 at 5:12 PM, Jim Twensky <[email protected]> wrote:
> > >
> > >> Can you ssh between the nodes?
> > >>
> > >> -jim
> > >>
> > >> On Mon, Apr 13, 2009 at 6:49 PM, Mithila Nagendra <[email protected]> wrote:
> > >>
> > >> > Thanks Aaron.
> > >> > Jim: The three clusters I set up had Ubuntu running on them and
> > >> > the dfs was accessed at port 54310. The new cluster which I've
> > >> > set up has Red Hat Linux release 7.2 (Enigma) running on it. Now
> > >> > when I try to access the dfs from one of the slaves I get the
> > >> > following response: dfs cannot be accessed. When I access the
> > >> > DFS through the master there's no problem. So I feel there's a
> > >> > problem with the port. Any ideas? I did check the list of
> > >> > slaves, it looks fine to me.
> > >> >
> > >> > Mithila
> > >> >
> > >> > On Mon, Apr 13, 2009 at 2:58 PM, Jim Twensky <[email protected]> wrote:
> > >> >
> > >> > > Mithila,
> > >> > >
> > >> > > You said all the slaves were being utilized in the 3 node
> > >> > > cluster. Which application did you run to test that and what
> > >> > > was your input size? If you tried the word count application
> > >> > > on a 516 MB input file on both cluster setups, then some of
> > >> > > your nodes in the 15 node cluster may not be running at all.
> > >> > > Generally, one map job is assigned to each input split and if
> > >> > > you are running your cluster with the defaults, the splits
> > >> > > are 64 MB each. I got confused when you said the Namenode
> > >> > > seemed to do all the work. Can you check conf/slaves and make
> > >> > > sure you put the names of all task trackers there? I also
> > >> > > suggest comparing both clusters with a larger input size, say
> > >> > > at least 5 GB, to really see a difference.
> > >> > >
> > >> > > Jim
> > >> > >
> > >> > > On Mon, Apr 13, 2009 at 4:17 PM, Aaron Kimball <[email protected]> wrote:
> > >> > >
> > >> > > > in hadoop-*-examples.jar, use "randomwriter" to generate
> > >> > > > the data and "sort" to sort it.
> > >> > > > - Aaron
> > >> > > >
> > >> > > > On Sun, Apr 12, 2009 at 9:33 PM, Pankil Doshi <[email protected]> wrote:
> > >> > > >
> > >> > > > > Your data is too small I guess for 15 clusters ..So it
> > >> > > > > might be overhead time of these clusters making your
> > >> > > > > total MR jobs more time consuming.
> > >> > > > > I guess you will have to try with larger set of data..
> > >> > > > >
> > >> > > > > Pankil
> > >> > > > >
> > >> > > > > On Sun, Apr 12, 2009 at 6:54 PM, Mithila Nagendra <[email protected]> wrote:
> > >> > > > >
> > >> > > > > > Aaron
> > >> > > > > >
> > >> > > > > > That could be the issue, my data is just 516MB -
> > >> > > > > > wouldn't this see a bit of speed up?
> > >> > > > > > Could you guide me to the example? I'll run my cluster
> > >> > > > > > on it and see what I get. Also for my program I had a
> > >> > > > > > java timer running to record the time taken to
> > >> > > > > > complete execution. Does Hadoop have an inbuilt timer?
> > >> > > > > >
> > >> > > > > > Mithila
> > >> > > > > >
> > >> > > > > > On Mon, Apr 13, 2009 at 1:13 AM, Aaron Kimball <[email protected]> wrote:
> > >> > > > > >
> > >> > > > > > > Virtually none of the examples that ship with Hadoop
> > >> > > > > > > are designed to showcase its speed. Hadoop's speedup
> > >> > > > > > > comes from its ability to process very large volumes
> > >> > > > > > > of data (starting around, say, tens of GB per job,
> > >> > > > > > > and going up in orders of magnitude from there). So
> > >> > > > > > > if you are timing the pi calculator (or something
> > >> > > > > > > like that), its results won't necessarily be very
> > >> > > > > > > consistent. If a job doesn't have enough fragments
> > >> > > > > > > of data to allocate one per each node, some of the
> > >> > > > > > > nodes will also just go unused.
> > >> > > > > > >
> > >> > > > > > > The best example for you to run is to use
> > >> > > > > > > randomwriter to fill up your cluster with several GB
> > >> > > > > > > of random data and then run the sort program. If
> > >> > > > > > > that doesn't scale up performance from 3 nodes to
> > >> > > > > > > 15, then you've definitely got something strange
> > >> > > > > > > going on.
> > >> > > > > > >
> > >> > > > > > > - Aaron
> > >> > > > > > >
> > >> > > > > > > On Sun, Apr 12, 2009 at 8:39 AM, Mithila Nagendra <[email protected]> wrote:
> > >> > > > > > >
> > >> > > > > > > > Hey all
> > >> > > > > > > > I recently set up a three node hadoop cluster and
> > >> > > > > > > > ran an example on it. It was pretty fast, and all
> > >> > > > > > > > the three nodes were being used (I checked the log
> > >> > > > > > > > files to make sure that the slaves are utilized).
> > >> > > > > > > >
> > >> > > > > > > > Now I've set up another cluster consisting of 15
> > >> > > > > > > > nodes. I ran the same example, but instead of
> > >> > > > > > > > speeding up, the map-reduce task seems to take
> > >> > > > > > > > forever! The slaves are not being used for some
> > >> > > > > > > > reason. This second cluster has lower per-node
> > >> > > > > > > > processing power, but should that make any
> > >> > > > > > > > difference?
> > >> > > > > > > > How can I ensure that the data is being mapped to
> > >> > > > > > > > all the nodes? Presently, the only node that seems
> > >> > > > > > > > to be doing all the work is the Master node.
> > >> > > > > > > >
> > >> > > > > > > > Does 15 nodes in a cluster increase the network
> > >> > > > > > > > cost? What can I do to set up the cluster to
> > >> > > > > > > > function more efficiently?
> > >> > > > > > > >
> > >> > > > > > > > Thanks!
> > >> > > > > > > > Mithila Nagendra
> > >> > > > > > > > Arizona State University
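
Jim's point in the quoted thread, that a 516 MB input with 64 MB splits cannot keep 15 nodes busy, can be checked with quick arithmetic. The sketch below is only a rough estimate: the function name and the assumption of one map task per split on a distinct node are illustrative, and Hadoop's real split computation may round slightly differently.

```python
import math

MB = 1024 * 1024

def estimated_map_tasks(input_bytes: int, split_bytes: int) -> int:
    """Rough count of input splits (one map task each) for a single file.

    Hypothetical helper: a plain ceiling division, not Hadoop's own logic.
    """
    return math.ceil(input_bytes / split_bytes)

# Figures taken from the thread: 516 MB input, 64 MB default split, 15 nodes.
tasks = estimated_map_tasks(516 * MB, 64 * MB)

# Assumption: each map task lands on a different node, the best case for spread.
idle_nodes = max(0, 15 - tasks)

print(tasks, idle_nodes)  # 9 6
```

With the 5 GB input Jim suggests, the same estimate gives 80 map tasks, enough to occupy all 15 nodes several times over.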

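The retry messages at the top of this thread show the client on node19 never reaching node18 on port 54310, even though ssh between the nodes works. A minimal sketch for testing that port independently of Hadoop follows; the function name `can_connect` and the 3 second timeout are assumptions for illustration, while the host and port come from the log output.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it only tests reachability of the port and says
    nothing about whether a NameNode is actually answering on it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Running `can_connect("node18", 54310)` from node19 and again from node18 itself would show whether the port is reachable only locally. A False result from the slave alone would point at the network path or the address the master is listening on, rather than at HDFS itself.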