(Moving to user list, hdfs-dev bcc'd.)

Hi Prithvi,

From a quick scan, it looks to me like one of your commands ends up using "input_path" as a string literal instead of substituting the value of the input_path variable. I've pasted the command below. Notice that one of the -file options uses "input_path" instead of "$input_path". Is that the problem?

Hope this helps,
--Chris

    $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
        -D mapred.task.timeout=0 \
        -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
        -D mapred.reduce.tasks=$num_of_reducer \
        -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
        -output $output_path \
        -file brandes_mapper \
        -file src/mslab/BC_reducer.py \
        -file src/mslab/MapReduceUtil.py \
        -file input_path \
        -mapper "./brandes_mapper $input_path $num_of_node" \
        -reducer "./BC_reducer.py"

On Mon, Apr 22, 2013 at 10:11 AM, prithvi dammalapati <d.prithvi...@gmail.com> wrote:

> I have the following Hadoop code to find the betweenness centrality of a
> graph:
>
> java_home=/usr/lib/jvm/java-1.7.0-openjdk-amd64
> hadoop_home=/usr/local/hadoop/hadoop-1.0.4
> hadoop_lib=$hadoop_home/hadoop-core-1.0.4.jar
> hadoop_bin=$hadoop_home/bin/hadoop
> hadoop_config=$hadoop_home/conf
> hadoop_streaming=$hadoop_home/contrib/streaming/hadoop-streaming-1.0.4.jar
>
> # task specific parameters
> source_code=BetweennessCentrality.java
> jar_file=BetweennessCentrality.jar
> main_class=mslab.BetweennessCentrality
> num_of_node=38012
> num_of_mapper=100
> num_of_reducer=8
> input_path=/data/dblp_author_conf_adj.txt
> output_path=dblp_bc_N$((num_of_node))_M$((num_of_mapper))
>
> rm -rf build
> mkdir build
> $java_home/bin/javac -d build -classpath .:$hadoop_lib src/mslab/$source_code
> rm -f $jar_file
> $java_home/bin/jar -cf $jar_file -C build/ .
> $hadoop_bin --config $hadoop_config fs -rmr $output_path
> $hadoop_bin --config $hadoop_config jar $jar_file $main_class $num_of_node $num_of_mapper
>
> rm brandes_mapper
> g++ src/mslab/mapred_brandes.cpp -O3 -o brandes_mapper
>
> $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
>     -D mapred.task.timeout=0 \
>     -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
>     -D mapred.reduce.tasks=$num_of_reducer \
>     -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
>     -output $output_path \
>     -file brandes_mapper \
>     -file src/mslab/BC_reducer.py \
>     -file src/mslab/MapReduceUtil.py \
>     -file input_path \
>     -mapper "./brandes_mapper $input_path $num_of_node" \
>     -reducer "./BC_reducer.py"
>
> When I run this code in a shell script, I get the following errors:
>
>     Warning: $HADOOP_HOME is deprecated.
>     File: /home/hduser/Downloads/mgmf/trunk/input_path does not exist, or is not readable.
>     Streaming Command Failed!
>
> but the file exists at the specified path:
>
>     /Downloads/mgmf/trunk/data$ ls
>     dblp_author_conf_adj.txt
>
> I have also added the input file into HDFS using:
>
>     /usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /source /destination
>
> Can someone help me solve this problem?
>
> Any help is appreciated,
> Thanks,
> Prithvi
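P.S. For reference, here is roughly how the streaming invocation could look once the literal is replaced with the variable. This is an untested sketch, and it assumes $input_path resolves to a readable local file on the machine submitting the job, since -file arguments are validated locally. The basename step is my assumption about what brandes_mapper expects: a file shipped with -file is staged into each task's working directory under its basename, so the mapper argument probably should not be the full local path.

    # Untested sketch: substitute the variable, and pass the shipped file's
    # basename to the mapper (-file stages the local file into each task's
    # working directory under its basename).
    input_name=$(basename "$input_path")   # -> dblp_author_conf_adj.txt

    $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
        -D mapred.task.timeout=0 \
        -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
        -D mapred.reduce.tasks=$num_of_reducer \
        -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
        -output $output_path \
        -file brandes_mapper \
        -file src/mslab/BC_reducer.py \
        -file src/mslab/MapReduceUtil.py \
        -file "$input_path" \
        -mapper "./brandes_mapper $input_name $num_of_node" \
        -reducer "./BC_reducer.py"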
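P.P.S. A quick way to catch this class of bug in the future: run the script under bash -x, which echoes each command after variable expansion, so a literal "input_path" stands out immediately. The script name below is a placeholder for whatever yours is called:

    # Trace expanded commands; filter to the lines mentioning -file.
    bash -x run_bc.sh 2>&1 | grep -- '-file'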