On 01/25/2015 03:23 PM, Shalini Ravishankar wrote:
Hello Everyone,
I am trying to read (open) and write files in HDFS inside a Python script, but
I am getting an error.
Please copy/paste the full error traceback.
Can someone tell me what is wrong here?
Code (full): sample.py
#!/usr/bin/python
from subprocess import Popen, PIPE
print "Before Loop"
cat = Popen(["hadoop", "fs", "-cat", "./sample.txt"],
            stdout=PIPE)
I don't know anything about hadoop, and when you ran it separately, you
used different parameters. Still, you can do a lot towards testing it yourself.
Start by running hadoop fs -cat ... from shell to see whether it
displays anything. You should be able to use exactly the same arguments
as you use in the Popen call.
Then if that seems to work as you expect, comment out your 'put' code
below, and add some prints to the loop. Does the printed output look reasonable?
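As a rough sketch of that first check (not something from the original script), the helper below runs a command, prints every line it produces with repr() so stray whitespace and empty output are obvious, and also reports the exit status and anything written to stderr. The name dump_command_output is made up for illustration, and the hadoop arguments in the closing comment are simply the ones from the posted code; run it directly from the shell rather than through the streaming job.

```python
from __future__ import print_function  # keeps print() working on Python 2 as well
import subprocess


def dump_command_output(argv):
    """Run argv, print each stdout line with repr(), and return
    (lines, returncode, stderr). Hypothetical helper for debugging only."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()   # reads both pipes to EOF, then waits
    lines = out.splitlines(True)    # keep line endings so they show in repr()
    for number, line in enumerate(lines, 1):
        print("line %d: %r" % (number, line))
    print("exit status:", proc.returncode)
    if err:
        print("stderr: %r" % err)
    return lines, proc.returncode, err


# Example call with the arguments from the posted script
# (assumes the hadoop command is on PATH):
# dump_command_output(["hadoop", "fs", "-cat", "./sample.txt"])
```

If this prints no lines at all, or a non-zero exit status, the problem is on the reading side and the 'put' half has not even been exercised yet.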
At that point, if both look reasonable, then try the inverse. Write
some known data to the 'put' command, and see if it makes it into the
appropriate file. Once again, you should be able to also test the
program parameters and behavior from the shell, typing manually into stdin.
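The inverse test could look something like the sketch below: send a couple of known lines to a command's stdin, then check the exit status and stderr before looking for the file. The helper name feed_known_data and the sample data are invented for illustration; the hadoop arguments in the comment are the ones from the posted code, and I am assuming the hadoop command is on PATH.

```python
from __future__ import print_function  # keeps print() working on Python 2 as well
import subprocess


def feed_known_data(argv, data):
    """Write data (bytes) to argv's stdin and return (returncode, stderr).
    Hypothetical helper for debugging only."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stderr=subprocess.PIPE)
    _, err = proc.communicate(data)  # writes data, closes stdin, waits for exit
    print("exit status:", proc.returncode)
    if err:
        print("stderr: %r" % err)
    return proc.returncode, err


# Example call with the arguments from the posted script:
# feed_known_data(["hadoop", "fs", "-put", "-", "./modifiedfile.txt"],
#                 b"first line\nsecond line\n")
# Afterwards, from the shell: hadoop fs -cat ./modifiedfile.txt
```

A non-zero exit status or any text on stderr here says more about why the file never appears than the script's normal output does.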
put = Popen(["hadoop", "fs", "-put", "-", "./modifiedfile.txt"],
            stdin=PIPE)
for line in cat.stdout:
    line += "Blah"
    print line
    put.stdin.write(line)
cat.stdout.close()
cat.wait()
put.stdin.close()
put.wait()
When I execute :
hadoop jar
/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.5.1.jar -file
./sample.py -mapper './sample.py' -input sample.txt -output fileRead
It executes properly, but I couldn't find the file it is supposed to create
in HDFS (modifiedfile.txt).
And when I execute:
hadoop fs -getmerge ./fileRead/ file.txt
Inside the file.txt, I got :
Before Loop
Before Loop
Can someone please tell me what I am doing wrong here? I don't think it reads
from sample.txt.
I would really appreciate the help.
--
Thanks & Regards,
Shalini Ravishankar.
--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list