Hi all,

Apparently we can only specify a single-character delimiter for tokenizing data with Spark-CSV. But what if we have a log file with multiple delimiters, or even a multi-character delimiter? For example, (field1,field2:field3) uses the delimiters [,:], while (field1::field2::field3) uses the single multi-character delimiter [::].
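One common workaround (not a feature of Spark-CSV itself) is to read the file as plain text and split each line yourself, e.g. with a regex inside a map over the lines, before building the DataFrame. A minimal sketch of just the splitting logic in plain Python; the sample lines and patterns are illustrative:

```python
import re

# Hypothetical sample lines matching the examples above.
mixed = "field1,field2:field3"      # mixed single-char delimiters [,:]
multi = "field1::field2::field3"    # one multi-char delimiter [::]

# A character class splits on any of several single-char delimiters;
# a literal (or alternation) pattern handles a multi-char delimiter.
print(re.split(r"[,:]", mixed))   # ['field1', 'field2', 'field3']
print(re.split(r"::", multi))     # ['field1', 'field2', 'field3']
```

In Spark, the same pattern would go inside the function you map over the RDD of raw lines before converting to a DataFrame.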
Further, is there a way to specify null fields? For example, if the data contains "\n" in any field, a null should be stored for that field in the DataFrame.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-CSV-Multiple-delimiters-and-Null-fields-support-tp23644.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
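If the null marker is a known sentinel string, one option is to map it to None after splitting, so it becomes a null once the rows are turned into a DataFrame. A sketch in plain Python, assuming the sentinel is the literal two-character sequence "\n" (the sample row and helper name are illustrative):

```python
def nullify(fields, sentinel="\\n"):
    # Replace the sentinel string with None; None becomes null in a DataFrame.
    return [None if f == sentinel else f for f in fields]

row = ["a", "\\n", "c"]
print(nullify(row))  # ['a', None, 'c']
```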