(Moving to user list, hdfs-dev bcc'd.)

Hi Prithvi,

From a quick scan, it looks to me like one of your commands ends up using "input_path" as a string literal instead of substituting the value of the input_path variable. I've pasted the command below. Notice that one of the -file options uses "input_path" instead of "$input_path". Is that the problem?

Hope this helps,
--Chris

    $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
        -D mapred.task.timeout=0 \
        -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
        -D mapred.reduce.tasks=$num_of_reducer \
        -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
        -output $output_path \
        -file brandes_mapper \
        -file src/mslab/BC_reducer.py \
        -file src/mslab/MapReduceUtil.py \
        -file input_path \
        -mapper "./brandes_mapper $input_path $num_of_node" \
        -reducer "./BC_reducer.py"

On Mon, Apr 22, 2013 at 10:11 AM, prithvi dammalapati <d.prithvi...@gmail.com> wrote:

> I have the following Hadoop code to find the betweenness centrality of a
> graph:
>
> java_home=/usr/lib/jvm/java-1.7.0-openjdk-amd64
> hadoop_home=/usr/local/hadoop/hadoop-1.0.4
> hadoop_lib=$hadoop_home/hadoop-core-1.0.4.jar
> hadoop_bin=$hadoop_home/bin/hadoop
> hadoop_config=$hadoop_home/conf
> hadoop_streaming=$hadoop_home/contrib/streaming/hadoop-streaming-1.0.4.jar
>
> # task specific parameters
> source_code=BetweennessCentrality.java
> jar_file=BetweennessCentrality.jar
> main_class=mslab.BetweennessCentrality
> num_of_node=38012
> num_of_mapper=100
> num_of_reducer=8
> input_path=/data/dblp_author_conf_adj.txt
> output_path=dblp_bc_N$((num_of_node))_M$((num_of_mapper))
>
> rm -rf build
> mkdir build
> $java_home/bin/javac -d build -classpath .:$hadoop_lib src/mslab/$source_code
> rm -f $jar_file
> $java_home/bin/jar -cf $jar_file -C build/ .
> $hadoop_bin --config $hadoop_config fs -rmr $output_path
> $hadoop_bin --config $hadoop_config jar $jar_file $main_class $num_of_node $num_of_mapper
>
> rm brandes_mapper
> g++ src/mslab/mapred_brandes.cpp -O3 -o brandes_mapper
>
> $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
>     -D mapred.task.timeout=0 \
>     -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
>     -D mapred.reduce.tasks=$num_of_reducer \
>     -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
>     -output $output_path \
>     -file brandes_mapper \
>     -file src/mslab/BC_reducer.py \
>     -file src/mslab/MapReduceUtil.py \
>     -file input_path \
>     -mapper "./brandes_mapper $input_path $num_of_node" \
>     -reducer "./BC_reducer.py"
>
> When I run this code in a shell script, I get the following errors:
>
>     Warning: $HADOOP_HOME is deprecated.
>     File: /home/hduser/Downloads/mgmf/trunk/input_path does not exist, or is not readable.
>     Streaming Command Failed!
>
> but the file exists at the specified path:
>
>     /Downloads/mgmf/trunk/data$ ls
>     dblp_author_conf_adj.txt
>
> I have also added the input file into HDFS using:
>
>     /usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /source /destination
>
> Can someone help me solve this problem?
>
> Any help is appreciated,
> Thanks,
> Prithvi
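P.S. For reference, here is roughly how the streaming invocation could look once the literal is replaced with the variable. This is an untested sketch, and it assumes $input_path resolves to a readable local file on the machine submitting the job, since -file arguments are validated locally. The basename step is my assumption about what brandes_mapper expects: a file shipped with -file is staged into each task's working directory under its basename, so the mapper argument probably should not be the full local path.

    # Untested sketch: substitute the variable, and pass the shipped file's
    # basename to the mapper (-file stages the local file into each task's
    # working directory under its basename).
    input_name=$(basename "$input_path")   # -> dblp_author_conf_adj.txt

    $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
        -D mapred.task.timeout=0 \
        -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
        -D mapred.reduce.tasks=$num_of_reducer \
        -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
        -output $output_path \
        -file brandes_mapper \
        -file src/mslab/BC_reducer.py \
        -file src/mslab/MapReduceUtil.py \
        -file "$input_path" \
        -mapper "./brandes_mapper $input_name $num_of_node" \
        -reducer "./BC_reducer.py"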
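P.P.S. A quick way to catch this class of bug in the future: run the script under bash -x, which echoes each command after variable expansion, so a literal "input_path" stands out immediately. The script name below is a placeholder for whatever yours is called:

    # Trace expanded commands; filter to the lines mentioning -file.
    bash -x run_bc.sh 2>&1 | grep -- '-file'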