Hi, I have written a script which generates JSON files, loads each one into a dictionary, adds a few attributes, and uploads the modified files to HDFS. After the files are generated, if I run a select * from..; on the table which points to this location, I get "null, null...." as the result. I also tried without the added attributes and it made no difference, so I strongly suspect the data. Currently I am using strip() to remove leading and trailing whitespace and newlines. I am wondering whether embedded "\n", that is, JSON string values containing "\n", can cause such issues. There are no parsing errors, so I am not able to debug this. Are there any flags that I can set to figure out what is happening within the parser code?
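One way to test the suspicion about the data outside Hive is sketched below. It assumes the table is read as plain text with one JSON object per physical line, which would mean a record spread over several lines (pretty-printed output, or a raw line break written inside a value) no longer parses line by line. The helper names find_bad_lines and to_record_line are hypothetical, not part of my script or of Hive, and whether unparsable rows come back as NULLs or as errors depends on the SerDe in use.

```python
import json


def find_bad_lines(lines):
    """Return (line_number, reason) for each line that is not a standalone JSON object.

    Assumption: the table reads one JSON object per physical line.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            problems.append((number, "blank line"))
            continue
        try:
            value = json.loads(text)
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            problems.append((number, "invalid JSON: %s" % err))
            continue
        if not isinstance(value, dict):
            problems.append((number, "not a JSON object"))
    return problems


def to_record_line(record):
    """Serialize a dictionary as one line of JSON.

    json.dumps without indent escapes a newline inside a string value as the
    two characters backslash and n, so the record stays on one physical line.
    """
    return json.dumps(record)


if __name__ == "__main__":
    import sys

    # Usage: python check_json_lines.py some_file.json
    with open(sys.argv[1]) as handle:
        for number, reason in find_bad_lines(handle):
            print("line %d: %s" % (number, reason))
```

Running this over a copy of one generated file, before it is uploaded to HDFS, should show whether every line is a complete JSON object. If it reports nothing, the data layout is probably not the cause and the table definition (column names versus JSON keys) is the next thing to compare.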
To get more detail, I started Hive with debug logging:

    hive -hiveconf hive.root.logger=DEBUG,console

But the output is not really useful:

    blocks=[LocatedBlock{BP-330966259-192.168.1.61-1351349834344:blk_-6076570611719758877_116734; getBlockSize()=20635; corrupt=false; offset=0; locs=[192.168.1.61:50010, 192.168.1.66:50010, 192.168.1.63:50010]}]
    lastLocatedBlock=LocatedBlock{BP-330966259-192.168.1.61-1351349834344:blk_-6076570611719758877_116734; getBlockSize()=20635; corrupt=false; offset=0; locs=[192.168.1.61:50010, 192.168.1.66:50010, 192.168.1.63:50010]}
    isLastBlockComplete=true}
    13/07/30 11:49:41 DEBUG hdfs.DFSClient: Connecting to datanode 192.168.1.61:50010
    null null null null null null null null null null null null null null null null
    13/07/30 11:49:41 INFO exec.

Also, the attributes I am adding are the current year, month, day, and time, so they are not null for any record. I even moved the existing files that did not have these fields set, so that there are no records with these fields as null. However, I don't think this is the issue, since an advantage of JSON and the Hive JSON SerDe is that the object structure can be dynamic. Right?

Any suggestions regarding debugging would be very helpful.

Thanks,
Sunita