Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

2023-01-05 Thread Saurabh Gulati
and 2 single quotes together'' are looking like a single double quote ". Mvg/Regards Saurabh Gulati From: Saurabh Gulati Sent: 05 January 2023 12:24 To: Sean Owen Cc: User Subject: Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used w

Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

2023-01-05 Thread Saurabh Gulati
It's the same input, except that the headers are also being read with the CSV reader. Mvg/Regards Saurabh Gulati From: Sean Owen Sent: 04 January 2023 15:12 To: Saurabh Gulati Cc: User Subject: Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

2023-01-04 Thread Sean Owen
h Gulati > Data Platform > ---------- > *From:* Sean Owen > *Sent:* 04 January 2023 14:25 > *To:* Saurabh Gulati > *Cc:* Mich Talebzadeh ; User <user@spark.apache.org> > *Subject:* Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter u

Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

2023-01-04 Thread Saurabh Gulati
s from df.show() and df.select("c").show() Mvg/Regards Saurabh Gulati Data Platform ____________________ From: Sean Owen Sent: 04 January 2023 14:25 To: Saurabh Gulati Cc: Mich Talebzadeh ; User Subject: Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter use

Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

2023-01-04 Thread Sean Owen
That input is just invalid as CSV for any parser. You end a quoted col without following with a col separator. What would the intended parsing be and how would it work? On Wed, Jan 4, 2023 at 4:30 AM Saurabh Gulati wrote: > > @Sean Owen Also see the example below with quotes > feedback: > > "a"
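
To illustrate the point about quoting, here is a minimal sketch using Python's standard csv module rather than Spark's reader. The sample rows and the helper names (parse, is_rejected) are made up for illustration and are not taken from the thread. It shows that a delimiter inside a properly quoted field is kept as data, that an embedded quote is written by doubling it, and that a quoted field followed by something other than a separator is rejected when the parser runs in strict mode.

```python
import csv
import io


def parse(text, strict=False):
    """Parse CSV text into a list of rows with the standard csv reader."""
    return list(csv.reader(io.StringIO(text), strict=strict))


# A comma inside a quoted field is treated as data, not as a separator.
quoted = parse('1,"good, but slow",3\n')

# A literal double quote inside a quoted field is escaped by doubling it.
doubled = parse('1,"she said ""hi""",3\n')


def is_rejected(text):
    """Return True if the strict parser refuses the input."""
    try:
        parse(text, strict=True)
    except csv.Error:
        return True
    return False


# The quoted field ends, but the next character is not a separator.
malformed_rejected = is_rejected('1,"good"bad,3\n')

print(quoted)
print(doubled)
print(malformed_rejected)
```

Spark's CSV reader has its own options and its own handling of malformed rows, so this only shows the general quoting convention, not how Spark would treat the same input.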

Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

2023-01-04 Thread Saurabh Gulati
Hey guys, I much appreciate your quick responses. To answer your questions, @Mich Talebzadeh We get data from multiple sources, and we don't have any control over what they put in. In this case the column is supposed to contain some feedback and it can also contai