I contacted the administrator of our cluster and he gave me access. Now my program works in fully distributed mode.

Thanks a lot.

Jasmine
----- Original Message -----
From: "jason hadoop" <[email protected]>
To: <[email protected]>
Sent: Sunday, April 26, 2009 12:13 PM
Subject: Re: Can't start fully-distributed operation of Hadoop in Sun Grid Engine


It may be that the Sun Grid is similar to EC2, where the machines have an
internal IP address/name that MUST be used for inter-machine communication
and an external IP address/name that is only for internet access.

The above overly complex sentence basically states that there may be some
firewall rules/tools in the Sun Grid that you need to be aware of and use.
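As a quick sanity check along these lines, you can parse the address out of your fs.default.name value and probe it from a worker node. A minimal sketch (the URI and hostname below are placeholders, not your actual settings):

```shell
# Hypothetical fs.default.name value; substitute the internal name of your
# namenode host, not its external/internet-facing name.
uri="hdfs://internal-node01:9000"

hostport="${uri#hdfs://}"   # strip the scheme -> internal-node01:9000
host="${hostport%%:*}"      # everything before the colon -> internal-node01
port="${hostport##*:}"      # everything after the colon  -> 9000

echo "host=$host port=$port"

# From a worker node, confirm the namenode port is reachable on the
# internal interface (uncomment to run against a real cluster):
# nc -z "$host" "$port" && echo "reachable"
```

If the port is only open on the internal interface, the `nc` probe will succeed with the internal name and fail with the external one.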

On Sun, Apr 26, 2009 at 6:31 AM, Jasmine (Xuanjing) Huang <
[email protected]> wrote:

Hi, Jason,

Thanks for your advice. After inserting the port into hadoop-site.xml, I can
start the namenode and run jobs now.
But my system works only when I set localhost in the masters file and add
localhost (as well as some other nodes) to the slaves file. And all the
tasks are data-local map tasks. I wonder whether I have entered fully
distributed mode, or am still in pseudo-distributed mode.
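One way to check for leftover localhost entries is to grep the masters and slaves files; if either still lists localhost, the daemons it launches stay on one machine and the setup behaves much like pseudo-distributed mode. A sketch using throwaway config files (the hostnames masterhost, node01, node02 are hypothetical):

```shell
# Build a demo conf directory; in a real cluster this would be
# $HADOOP_HOME/conf with your actual hostnames.
conf=$(mktemp -d)
echo "masterhost"         > "$conf/masters"
printf 'node01\nnode02\n' > "$conf/slaves"

# -x matches whole lines, so "localhost" inside a longer name doesn't trip it.
if grep -qx 'localhost' "$conf/masters" "$conf/slaves"; then
  echo "localhost found: daemons will run locally (pseudo-distributed-like)"
else
  echo "no localhost entries: configuration looks fully distributed"
fi
```

Note that seeing only "Data-local map tasks" is not by itself proof of pseudo mode; it just means each map ran on a node that already held its input block.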

As for the SGE, I am only a user and know little about it. This is the user
manual of our cluster:
http://www.cs.umass.edu/~swarm/index.php?n=Main.UserDoc

Best,
Jasmine

----- Original Message -----
From: "jason hadoop" <[email protected]>
To: <[email protected]>
Sent: Sunday, April 26, 2009 12:06 AM
Subject: Re: Can't start fully-distributed operation of Hadoop in Sun Grid
Engine



The parameter you specify for fs.default.name should be of the form
hdfs://host:port, and the parameter you specify for mapred.job.tracker
MUST be host:port. I haven't looked at 18.3, but it appears that the :port
is mandatory.

In your case, the piece of code parsing the fs.default.name variable is not
able to tokenize it into protocol, host, and port correctly.

Recap:
fs.default.name  hdfs://namenodeHost:port
mapred.job.tracker  jobtrackerHost:port
Specify all the parts above and try again.
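Put into hadoop-site.xml form, the two properties above would look something like this. The hostnames are placeholders and the ports 9000/9001 are just conventional choices, not requirements:

```xml
<!-- Fragment of conf/hadoop-site.xml; hostnames and ports are placeholders -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenodeHost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtrackerHost:9001</value>
  </property>
</configuration>
```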

Can you please point me at information on using the Sun Grid? I want to
include a paragraph or two about it in my book.

On Sat, Apr 25, 2009 at 4:28 PM, Jasmine (Xuanjing) Huang <
[email protected]> wrote:

 Hi, there,

My Hadoop system (version 0.18.3) works well under standalone and
pseudo-distributed operation. But if I try to run Hadoop in
fully-distributed mode in Sun Grid Engine, it always fails -- in fact,
the JobTracker and TaskTracker can be started, but the namenode and
secondary namenode cannot be started. Could anyone help me with it?

My SGE scripts looks like:

#!/bin/bash
#$ -cwd
#$ -S /bin/bash
#$ -l long=TRUE
#$ -v JAVA_HOME=/usr/java/latest
#$ -v HADOOP_HOME=*********
#$ -pe hadoop 6
PATH="$HADOOP_HOME/bin:$PATH"
hadoop fs -put ********
hadoop jar *****
hadoop fs -get *********

Then the output looks like:
Exception in thread "main" java.lang.NumberFormatException: For input string: ""
     at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
     at java.lang.Integer.parseInt(Integer.java:468)
     at java.lang.Integer.parseInt(Integer.java:497)
     at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:144)
     at org.apache.hadoop.dfs.NameNode.getAddress(NameNode.java:116)
     at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:66)
     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1339)
     at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1351)
     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:213)
     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:118)
     at org.apache.hadoop.fs.FsShell.init(FsShell.java:88)
     at org.apache.hadoop.fs.FsShell.run(FsShell.java:1703)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
     at org.apache.hadoop.fs.FsShell.main(FsShell.java:1852)

And the log of NameNode looks like
2009-04-25 17:27:17,032 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ************
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.18.3
************************************************************/
2009-04-25 17:27:17,147 ERROR org.apache.hadoop.dfs.NameNode: java.lang.NumberFormatException: For input string: ""
     at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
     at java.lang.Integer.parseInt(Integer.java:468)
     at java.lang.Integer.parseInt(Integer.java:497)
     at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:144)
     at org.apache.hadoop.dfs.NameNode.getAddress(NameNode.java:116)
     at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:136)
     at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:193)
     at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:179)
     at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:830)
     at org.apache.hadoop.dfs.NameNode.main(NameNode.java:839)

2009-04-25 17:27:17,149 INFO org.apache.hadoop.dfs.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ***************

Best,
Jasmine




--
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422





--
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422

