Hi,

I have written a script that generates JSON files, loads each one into a
dictionary, adds a few attributes, and uploads the modified files to HDFS.
After the files are generated, if I run select * from ..; on the table
that points to this location, I get "null, null...." as the result. I also
tried without the added attributes and it made no difference. I
strongly suspect the data.
Currently I am using strip() to remove leading and trailing whitespace
and newlines. I am wondering whether an embedded "\n", that is, a JSON
string value containing "\n", could cause such issues.
There are no parsing errors, so I am not able to debug this issue. Are
there any flags I can set to see what is happening within the
parser code?
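
One way to test the embedded-newline theory is to check that every record
occupies exactly one physical line, since Hive's text-based JSON SerDes
generally expect one JSON object per line. The Python sketch below is only
an illustration: to_record_line and find_bad_lines are hypothetical helper
names, not part of the original script. It relies on json.dumps escaping a
newline inside a string value as the two characters \ and n, so a record
stays on one line unless an indent is requested.

```python
import json


def to_record_line(record):
    """Serialize one dict as a single-line JSON string (hypothetical helper)."""
    # Without indent=, json.dumps emits no raw newlines; a newline inside a
    # string value is written as the escape sequence \n.
    return json.dumps(record)


def find_bad_lines(text):
    """Return (line_number, reason) for lines that are not a JSON object."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            value = json.loads(line)
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            problems.append((number, str(exc)))
            continue
        if not isinstance(value, dict):
            problems.append((number, "not a JSON object"))
    return problems
```

Running find_bad_lines over the contents of a generated file before
uploading it would flag pretty-printed (multi-line) records or stray raw
newlines, either of which could explain rows that come back as null.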

I set this:
hive -hiveconf hive.root.logger=DEBUG,console

But the output is not really useful:

blocks=[LocatedBlock{BP-330966259-192.168.1.61-1351349834344:blk_-6076570611719758877_116734;
getBlockSize()=20635; corrupt=false; offset=0; locs=[192.168.1.61:50010,
192.168.1.66:50010, 192.168.1.63:50010]}]

lastLocatedBlock=LocatedBlock{BP-330966259-192.168.1.61-1351349834344:blk_-6076570611719758877_116734;
getBlockSize()=20635; corrupt=false; offset=0; locs=[192.168.1.61:50010,
192.168.1.66:50010, 192.168.1.63:50010]}
  isLastBlockComplete=true}
13/07/30 11:49:41 DEBUG hdfs.DFSClient: Connecting to datanode
192.168.1.61:50010
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
13/07/30 11:49:41 INFO exec.

Also, the attributes I am adding are the current year, month, day and time,
so they are not null for any record. I even moved the existing files that
did not have these fields set, so that there are no records with these
fields as null. However, I don't think this is the issue, since the
advantage of the Hive JSON SerDe is that it allows the object structure to
be dynamic. Right?
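
For reference, the attribute step described above can be sketched in Python
roughly as follows. This is an assumption about the shape of the script, not
the actual code: add_time_fields and the key names year, month, day and time
are hypothetical, and would have to match whatever column names the table
definition uses.

```python
import json
from datetime import datetime


def add_time_fields(record, now=None):
    """Return a copy of record with date/time attributes added (hypothetical)."""
    now = now or datetime.now()
    enriched = dict(record)  # leave the caller's dict untouched
    enriched.update({
        "year": now.year,
        "month": now.month,
        "day": now.day,
        "time": now.strftime("%H:%M:%S"),
    })
    return enriched


# One record serialized as one line of JSON text.
line = json.dumps(add_time_fields({"id": 1}, datetime(2013, 7, 30, 11, 49, 41)))
```

Since the added values are plain integers and a short string, they are
unlikely to be the cause on their own, which is consistent with the nulls
appearing with and without the extra attributes.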

Any suggestions on how to debug this would be very helpful.

thanks
Sunita
