[ 
https://issues.apache.org/jira/browse/SPARK-51199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17927475#comment-17927475
 ] 

Snehal Bhatnagar commented on SPARK-51199:
------------------------------------------

Hi [~andreasfranz], I would like to start contributing here. Is there any way I 
might help with this?

> Valid CSV records considered malformed
> --------------------------------------
>
>                 Key: SPARK-51199
>                 URL: https://issues.apache.org/jira/browse/SPARK-51199
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.5.4
>         Environment: SparkContext: Running Spark version 3.5.4
> SparkContext: OS info Mac OS X, 15.3, aarch64
> SparkContext: Java version 17.0.14 2025-01-21 LTS
> OpenJDK Runtime Environment Corretto-17.0.14.7.1 (build 17.0.14+7-LTS)
> OpenJDK 64-Bit Server VM Corretto-17.0.14.7.1 (build 17.0.14+7-LTS, mixed 
> mode, sharing)
>            Reporter: Andreas Franz
>            Priority: Major
>
> There is an issue parsing CSV files with a combination of escaped double 
> quotes and commas in a field.
> I've created a small example that demonstrates the issue:
> {code:java}
> package com.example
> import org.apache.spark.sql.SparkSession
> object Example {
>     def main(args: Array[String]): Unit = {
>         val spark = SparkSession.builder()
>             .appName("CSV Example")
>             .master("local[*]")
>             .config("spark.driver.host", "localhost")
>             .config("spark.ui.enabled", "false")
>             .getOrCreate()
>         val csv = spark
>             .read
>             .option("header", "true")
>             .option("mode", "FAILFAST")
>             .csv("./src/main/scala/com/example/example.csv")
>         csv.show(2, truncate = false)
>         spark.stop()
>     }
> } {code}
> {code:java}
> id,region_name,gp_id,gp_name,gp_group_id,gp_group_name,gp_group_region_name
> 111234567,east,1122723,"Test 1",,,
> 001234567,east,1122723,"Foo ""Bar"", New York, US",,,
> {code}
> According to RFC 4180 ([https://www.ietf.org/rfc/rfc4180.txt]), this is a 
> valid CSV record.
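
The RFC 4180 claim above can be checked independently of Spark. The following is a minimal sketch, not Spark's parser: `Rfc4180Check` and `splitRecord` are hypothetical names introduced only for illustration. It assumes a single-line record, a comma delimiter, and the RFC 4180 rule that a doubled quote inside a quoted field stands for one literal quote.

```scala
// Illustrative RFC 4180 field splitter for one record; not Spark's implementation.
object Rfc4180Check {
  def splitRecord(line: String): Vector[String] = {
    val fields = Vector.newBuilder[String]
    val current = new StringBuilder
    var inQuotes = false
    var i = 0
    while (i < line.length) {
      val c = line.charAt(i)
      if (inQuotes) {
        if (c == '"' && i + 1 < line.length && line.charAt(i + 1) == '"') {
          current.append('"') // doubled quote inside a quoted field: literal quote
          i += 1              // skip the second quote of the pair
        } else if (c == '"') {
          inQuotes = false    // closing quote of the field
        } else {
          current.append(c)   // commas inside quotes are ordinary characters
        }
      } else if (c == '"') {
        inQuotes = true       // opening quote of a field
      } else if (c == ',') {
        fields += current.toString // unquoted comma ends the field
        current.clear()
      } else {
        current.append(c)
      }
      i += 1
    }
    fields += current.toString // last field has no trailing comma
    fields.result()
  }

  def main(args: Array[String]): Unit = {
    // Second data record from the example file in the report.
    val record = "001234567,east,1122723,\"Foo \"\"Bar\"\", New York, US\",,,"
    val fields = splitRecord(record)
    println(fields.length) // 7, matching the seven header columns
    println(fields(3))     // Foo "Bar", New York, US
  }
}
```

Under these rules the record splits into seven fields, the same count as the header. Separately, Spark's CSV reader documents an `escape` option whose default is a backslash; whether setting it to `"` changes the behavior described in this report is not confirmed here.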



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
