Thank you very much, Marco. Is your code in Scala? Do you have a Python
example? Can anyone give me a Python example for handling JSON data in Spark?
*Sincerely yours,*
*Raymond*
On Sun, Jan 1, 2017 at 12:29 PM, Marco Mistroni wrote:
> Hi
>y
Hi
You will need to pass the schema, as in the snippet below (though
the code may have been superseded in Spark 2.0):
import sqlContext.implicits._
val jsonRdd = sc.textFile("file:///c:/tmp/1973-01-11.json")
val schema = (new StructType).add("hour", StringType).add("month",
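For the Python side, a minimal PySpark sketch of the same idea (passing an
explicit schema to the JSON reader) might look like the following. It assumes
the Spark 1.x `sqlContext` used elsewhere in this thread. `read_json_with_schema`
is a hypothetical helper name, and treating every listed field as a string is an
assumption, since only the type of "hour" is visible in the Scala snippet.

```python
def read_json_with_schema(sql_context, path, string_fields):
    """Read JSON from `path`, declaring each name in `string_fields` as a
    nullable string column (an assumption made for illustration)."""
    # Imported lazily so this module loads even where Spark is not installed.
    from pyspark.sql.types import StructType, StructField, StringType

    # Build the schema up front so Spark does not have to infer it.
    schema = StructType(
        [StructField(name, StringType(), True) for name in string_fields]
    )

    # DataFrameReader.schema(...) sets the schema; .json(...) loads the data.
    return sql_context.read.schema(schema).json(path)
```

In the pyspark shell this could be called as
`df = read_json_with_schema(sqlContext, "file:///c:/tmp/1973-01-11.json", ["hour"])`.
Further fields would need to be added to match the actual data.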
I found the cause:
I need to "put" the JSON file onto HDFS before it can be used. Here
is what I did:
hdfs dfs -put /root/Downloads/data/json/world_bank.json hdfs://localhost:9000/json
df = sqlContext.read.json("/json/")
df.show(10)
However, there is a new problem here, the json da
Thank you Miguel, here is the output:
>>> df = sqlContext.read.json("/root/Downloads/data/json")
17/01/01 07:28:19 INFO json.JSONRelation: Listing
hdfs://localhost:9000/root/Downloads/data/json on driver
17/01/01 07:28:19 INFO storage.MemoryStore: Block broadcast_2 stored as
values in memory (estim
It looks like it's trying to treat that path as a folder. Try omitting
the file name and using just the folder path.
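
A note on the path handling seen in this thread: the "Listing
hdfs://localhost:9000/root/Downloads/data/json" log line suggests that a path
given without a scheme is resolved against the default filesystem (HDFS here),
not the local disk. The sketch below only illustrates that behaviour in plain
Python. It is not Hadoop's actual resolution code, `qualify` is a hypothetical
helper, and the default of hdfs://localhost:9000 is taken from the log output.

```python
from urllib.parse import urlparse


def qualify(path, default_fs="hdfs://localhost:9000"):
    """Roughly mimic how an absolute, scheme-less path ends up on the
    default filesystem. Illustration only."""
    # A path that already names a scheme (file://, hdfs://, ...) is kept as-is.
    if urlparse(path).scheme:
        return path
    # Relative paths are resolved differently and are out of scope here.
    if not path.startswith("/"):
        raise ValueError("only absolute paths are handled in this sketch")
    return default_fs.rstrip("/") + path


print(qualify("/root/Downloads/data/json"))
print(qualify("file:///root/Downloads/data/json"))
```

Under that reading, a local file can be addressed either by copying it to HDFS
first or by giving an explicit file:// URI, as in the Scala snippet elsewhere
in the thread.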
On Sat, Dec 31, 2016 at 7:58 PM, Raymond Xie wrote:
> Happy new year!!!
>
> I am trying to load a JSON file into Spark; the JSON file is attached here.
>
> I received the following erro