Hi Naresh,

Thank you for the quick response; I appreciate it.
Removing the option("header","true") and trying

df = spark.read.parquet("test.parquet") now works; the parquet file can be read.
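
For a quick readable view, the rows can also be printed directly; a minimal sketch:

df = spark.read.parquet("test.parquet")
df.show(truncate=False)  # prints the rows, the vector column included
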
However, I would still like to have the data saved as csv so it is readable
outside of Spark, but saving df as csv throws:
java.lang.UnsupportedOperationException: CSV data source does not support
struct<type:tinyint,size:int,indices:array<int>,values:array<double>> data
type.

Any idea?
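
One direction that might work, if I read the error right: that struct is the
SQL form of the ML Vector column produced by OneHotEncoder, which the csv
writer cannot serialize, so the vector would have to be converted to a
csv-friendly type first. A rough sketch, not tested; the column name
"features" is hypothetical:

from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

# Render each ML Vector as its string form, e.g. "(5,[1],[1.0])",
# since the csv writer cannot handle the underlying struct type.
vector_to_str = udf(lambda v: None if v is None else str(v), StringType())

df_str = df.withColumn("features", vector_to_str("features"))
df_str.coalesce(1).write.option("header","true").mode("overwrite").csv("output.csv")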


Best regards,

Mina


On Tue, Mar 27, 2018 at 10:51 PM, naresh Goud <nareshgoud.du...@gmail.com>
wrote:

> When storing as a parquet file, I don’t think the
> option("header","true") is required.
>
> Give it a try: remove the header option and then try to read it. I haven’t
> tried it myself; just a thought.
>
> Thank you,
> Naresh
>
>
> On Tue, Mar 27, 2018 at 9:47 PM Mina Aslani <aslanim...@gmail.com> wrote:
>
>> Hi,
>>
>>
>> I am using pyspark. To transform my sample data and create a model, I use
>> StringIndexer and OneHotEncoder.
>>
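>> A minimal sketch of that transformation; the column names "category",
>> "category_index", and "category_vec" are hypothetical, and it assumes
>> Spark 2.x, where OneHotEncoder is a plain Transformer:
>>
>> from pyspark.ml.feature import StringIndexer, OneHotEncoder
>>
>> indexer = StringIndexer(inputCol="category", outputCol="category_index")
>> indexed = indexer.fit(df).transform(df)
>> encoder = OneHotEncoder(inputCol="category_index", outputCol="category_vec")
>> df = encoder.transform(indexed)
>>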
>>
>> However, when I try to write the data as csv using the command below
>>
>> df.coalesce(1).write.option("header","true").mode("overwrite").csv("output.csv")
>>
>>
>> I get an UnsupportedOperationException:
>>
>> java.lang.UnsupportedOperationException: CSV data source does not
>> support struct<type:tinyint,size:int,indices:array<int>,values:array<double>>
>> data type.
>>
>> Therefore, to save the data and avoid the error, I use
>>
>>
>> df.coalesce(1).write.option("header","true").mode("overwrite").save("output")
>>
>>
>> The above command saves the data, but in parquet format.
>> How can I read the parquet file and convert it to csv to observe the data?
>>
>> When I use
>>
>> df = spark.read.parquet("1.parquet"), it throws:
>>
>> ERROR RetryingBlockFetcher: Exception while beginning fetch of 1
>> outstanding blocks
>>
>> Your input is appreciated.
>>
>>
>> Best regards,
>>
>> Mina
>>
>>
>>
> --
> Thanks,
> Naresh
> www.linkedin.com/in/naresh-dulam
> http://hadoopandspark.blogspot.com/
>
>
