Hi all,

Apparently we can only specify a single-character delimiter for tokenizing data with Spark-CSV. But what if we have a log file with multiple delimiters, or even a multi-character delimiter? For example, (field1,field2:field3) uses the delimiters [,:], while (field1::field2::field3) uses the single multi-character delimiter [::].
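One common workaround (not a feature of Spark-CSV itself) is to read the file as plain text and split each line yourself, e.g. with a regex inside a map over the lines, before building the DataFrame. A minimal sketch of just the splitting logic in plain Python; the sample lines and patterns are illustrative:

```python
import re

# Hypothetical sample lines matching the examples above.
mixed = "field1,field2:field3"      # mixed single-char delimiters [,:]
multi = "field1::field2::field3"    # one multi-char delimiter [::]

# A character class splits on any of several single-char delimiters;
# a literal (or alternation) pattern handles a multi-char delimiter.
print(re.split(r"[,:]", mixed))   # ['field1', 'field2', 'field3']
print(re.split(r"::", multi))     # ['field1', 'field2', 'field3']
```

In Spark, the same pattern would go inside the function you map over the RDD of raw lines before converting to a DataFrame.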
Further, is there a way to specify null fields? For example, if the data contains "\n" in any field, a null should be stored for that field in the DataFrame.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-CSV-Multiple-delimiters-and-Null-fields-support-tp23644.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
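If the null marker is a known sentinel string, one option is to map it to None after splitting, so it becomes a null once the rows are turned into a DataFrame. A sketch in plain Python, assuming the sentinel is the literal two-character sequence "\n" (the sample row and helper name are illustrative):

```python
def nullify(fields, sentinel="\\n"):
    # Replace the sentinel string with None; None becomes null in a DataFrame.
    return [None if f == sentinel else f for f in fields]

row = ["a", "\\n", "c"]
print(nullify(row))  # ['a', None, 'c']
```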