Re: Error when loading json to spark

2017-01-01 Thread Raymond Xie
Thank you very much, Marco. Is your code in Scala? Do you have a Python example? Can anyone give me a Python example of handling JSON data on Spark? *Sincerely yours,* *Raymond* On Sun, Jan 1, 2017 at 12:29 PM, Marco Mistroni wrote: > Hi >y

Re: Error when loading json to spark

2017-01-01 Thread Marco Mistroni
Hi, you will need to pass the schema, as in the snippet below (though the code may have been superseded in Spark 2.0): import sqlContext.implicits._ val jsonRdd = sc.textFile("file:///c:/tmp/1973-01-11.json") val schema = (new StructType).add("hour", StringType).add("month",
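For readers who asked for Python, here is a rough sketch of the same idea (supplying a schema rather than letting Spark infer one). It is not Marco's code: it assumes Spark 2.3 or later, where `DataFrameReader.schema` also accepts a DDL-formatted string, and it assumes `spark` is an existing `SparkSession`. The helper names `schema_ddl` and `read_json_with_schema` are hypothetical, and only the two field names visible in the snippet above are used.

```python
def schema_ddl(field_names):
    """Build a DDL schema string that types every listed field as STRING.

    Names are backtick-quoted so that words such as `hour` are always
    treated as plain column names.
    """
    return ", ".join(f"`{name}` STRING" for name in field_names)


def read_json_with_schema(spark, path, field_names):
    """Read JSON from `path` with an explicit schema instead of inferring one.

    Assumption: `spark` is an existing pyspark.sql.SparkSession (Spark 2.3+).
    """
    return spark.read.schema(schema_ddl(field_names)).json(path)


# Usage (needs a working Spark installation; path and fields are placeholders):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   df = read_json_with_schema(spark, "/json/", ["hour", "month"])
#   df.show(10)
```

On older releases such as the Spark 1.x `sqlContext` used elsewhere in this thread, the schema would instead have to be built as a `StructType` from `pyspark.sql.types`.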

Re: Error when loading json to spark

2017-01-01 Thread Raymond Xie
I found the cause: I need to "put" the JSON file onto HDFS before it can be used. Here is what I did: hdfs dfs -put /root/Downloads/data/json/world_bank.json hdfs://localhost:9000/json df = sqlContext.read.json("/json/") df.show(10) However, there is a new problem here, the json da

Re: Error when loading json to spark

2017-01-01 Thread Raymond Xie
Thank you Miguel, here is the output: >>>df = sqlContext.read.json("/root/Downloads/data/json") 17/01/01 07:28:19 INFO json.JSONRelation: Listing hdfs://localhost:9000/root/Downloads/data/json on driver 17/01/01 07:28:19 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estim
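The log line above shows the scheme-less path "/root/Downloads/data/json" being listed as hdfs://localhost:9000/root/Downloads/data/json, i.e. resolved against the cluster's default filesystem rather than the local disk. The following is only a stdlib illustration of that behavior, not Hadoop's or Spark's actual code; the function name `qualify` is hypothetical, and it handles absolute paths only.

```python
from urllib.parse import urlparse


def qualify(path, default_fs="hdfs://localhost:9000"):
    """Illustrate how an absolute path without a scheme ends up on the
    default filesystem, while a path with an explicit scheme is kept as-is.

    Assumption: `default_fs` stands in for the cluster's configured default
    filesystem; the value here is taken from the log output in this thread.
    """
    if urlparse(path).scheme:
        # e.g. file:///... or hdfs://... already names its filesystem
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


print(qualify("/root/Downloads/data/json"))
print(qualify("file:///c:/tmp/1973-01-11.json"))
```

This is why a file that exists only on the local disk is not found unless it is first copied to HDFS or addressed with an explicit file:// URI.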

Re: Error when loading json to spark

2016-12-31 Thread Miguel Morales
Looks like it's trying to treat that path as a folder; try omitting the file name and using just the folder path. On Sat, Dec 31, 2016 at 7:58 PM, Raymond Xie wrote: > Happy new year!!! > > I am trying to load a json file into spark, the json file is attached here. > > I received the following erro