Cc: user@spark.apache.org
Subject: Re: Problem while loading saved data
Hi Ewan,
Yes, 'people.parquet' is from the first attempt and in that attempt it tried to
save the same people.json.
It seems that the same folder is created on both the nodes and contents of the
files are distributed between the two servers
[...] which would
normally be created when the write completes. Can you show us your write output?
Thanks,
Ewan
From: Amila De Silva [mailto:jaa...@gmail.com]
Sent: 03 September 2015 05:44
To: Guru Medasani
Cc: user@spark.apache.org
Subject: Re: Problem while loading saved data
Hi Guru,
Thanks for the reply.
Yes, I checked whether the file exists. But instead of a single file, what I
found was a directory with the following structure:
people.parquet
└── _temporary
    └── 0
        ├── task_201509030057_4699_m_00
        │   └── part-r-0-b921ed54-53fa-459b-881c-ccc
Hi Amila,
The error says that the ‘people.parquet’ file does not exist. Can you manually
check whether that file exists?
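
If it helps, here is a minimal sketch of such a check, assuming the output was written to the local filesystem of the node you run it on. The helper name inspect_output_dir is hypothetical, and the '_SUCCESS' marker is an assumption about what the default Hadoop output committer normally leaves behind after a completed write; a directory that holds only '_temporary' would suggest the write was never committed.

```python
import os


def inspect_output_dir(path):
    """Report the top-level state of a Spark output directory (local filesystem only)."""
    report = {
        "exists": os.path.isdir(path),
        "has_temporary": False,   # leftover staging directory from an uncommitted write
        "has_success": False,     # assumed marker of a completed write
        "committed_parts": [],    # part files sitting directly under the output directory
    }
    if not report["exists"]:
        return report

    entries = sorted(os.listdir(path))
    report["has_temporary"] = "_temporary" in entries
    report["has_success"] = "_SUCCESS" in entries
    report["committed_parts"] = [e for e in entries if e.startswith("part-")]
    return report
```

Calling inspect_output_dir("people.parquet") on each node would show whether either of them holds committed part files or only the staging directory.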
> Py4JJavaError: An error occurred while calling o53840.parquet.
> : java.lang.AssertionError: assertion failed: No schema defined, and no
> Parquet data file or summary file f
Hi All,
I have a two-node Spark cluster, to which I'm connecting from an IPython
notebook.
To see how data saving/loading works, I simply created a DataFrame from
people.json using the code below:
df = sqlContext.read.json("examples/src/main/resources/people.json")
Then called the following to sa