Hi,

I am using the Spark 2.2 CSV reader.

I have data in the following format:

123|123|"abc"||""|"xyz"

The requirement is that || (an empty, unquoted field) has to be treated as
null, and "" has to be treated as an empty string of length 0.

I set the sep option to pipe (|) and the quote option to the double-quote
character ("). After parsing the data, I used a regex to fulfill both of
the conditions mentioned above.
This started failing once column values like "|" appeared, i.e. the
separator itself is a quoted column value: the Spark CSV reader treated it
as a delimiter and created extra columns.

After this I tried the escape option with "|", but the results were similar.

I then tried reading the data as a Dataset and splitting each line on "\\|",
which had a similar outcome.
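
To make the expected result concrete, here is a minimal sketch in plain
Python (not Spark code) of the behaviour I am after. The function name
parse_line is hypothetical, and the sketch assumes that quotes are never
escaped or doubled inside a value:

```python
def parse_line(line, sep="|", quote='"'):
    """Split one line on sep, ignoring separators inside quotes.

    An empty unquoted field becomes None; an empty quoted field becomes "".
    Assumption: quote characters are never escaped or doubled in a value.
    """
    fields = []
    buf = []            # characters of the field being read
    in_quotes = False   # True while between an opening and closing quote
    was_quoted = False  # True if the current field contained any quote

    def finish(chars, quoted):
        text = "".join(chars)
        # Only an unquoted, empty field counts as null.
        return text if (quoted or text) else None

    for ch in line:
        if ch == quote:
            in_quotes = not in_quotes
            was_quoted = True
        elif ch == sep and not in_quotes:
            fields.append(finish(buf, was_quoted))
            buf = []
            was_quoted = False
        else:
            buf.append(ch)
    fields.append(finish(buf, was_quoted))
    return fields


print(parse_line('123|123|"abc"||""|"xyz"'))
# ['123', '123', 'abc', None, '', 'xyz']
print(parse_line('1|"|"|2'))
# ['1', '|', '2']
```

A plain split on "\\|" cannot produce the second result, because it does not
know whether a pipe sits inside quotes.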

Is there any way to resolve this with the CSV reader?


Thanks and Regards,
Snehasish
