Hi all,

I am using the Databricks spark-csv library to load some data into a DataFrame.
https://github.com/databricks/spark-csv


I am trying to confirm that FAILFAST mode works correctly and aborts
execution when it receives an invalid CSV file, but I have not been able
to make it fail after testing numerous invalid CSV files. Any advice?

Environment: Spark 1.3.1 running on MapR VM 4.1.0, Java 1.7


SparkConf conf = new SparkConf().setAppName("Dataframe testing");

JavaSparkContext sc = new JavaSparkContext(conf);


SQLContext sqlContext = new SQLContext(sc);
HashMap<String, String> options = new HashMap<String, String>();
options.put("header", "true");
options.put("path", args[0]);
options.put("mode", "FAILFAST");
//partner data
DataFrame partnerData = sqlContext.load("com.databricks.spark.csv", options);
//register partnerData table in spark sql
partnerData.registerTempTable("partnerData");

partnerData.printSchema();
partnerData.show();


It runs normally and prints the schema and the data, even with an invalid
CSV file.
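
To pin down what "invalid" means for a given test file, here is a minimal
standalone sketch that reports the first data row whose field count differs
from the header's. It assumes that a row with too few or too many fields is
the kind of malformed input under test; whether spark-csv treats such rows
as failures in FAILFAST mode is exactly the open question, not something
this sketch confirms. The class name CsvShapeCheck, the method
firstMalformedRow, and the naive comma split (no support for quoted fields)
are hypothetical and not part of spark-csv.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, independent of Spark and spark-csv.
class CsvShapeCheck {

    // Returns the 1-based line number of the first data row whose field
    // count differs from the header's, or -1 if every row matches.
    // Assumption: fields are separated by plain commas, with no quoting.
    static int firstMalformedRow(List<String> lines) {
        if (lines.isEmpty()) {
            return -1;
        }
        // A limit of -1 keeps trailing empty fields, so "a,b," counts as 3.
        int expected = lines.get(0).split(",", -1).length;
        for (int i = 1; i < lines.size(); i++) {
            int actual = lines.get(i).split(",", -1).length;
            if (actual != expected) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> good = Arrays.asList("id,name", "1,alice", "2,bob");
        List<String> bad = Arrays.asList("id,name", "1,alice", "2,bob,extra");
        System.out.println(firstMalformedRow(good)); // prints -1
        System.out.println(firstMalformedRow(bad));  // prints 3
    }
}
```

If a test file passes this check, it may be "invalid" only in a way that a
CSV parser would not notice, such as unexpected values in a column.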


Thanks!
