Re: Skipping Bad Records in Spark
Hi Qiuzhuang,

You have two options: 1) Within the map step, define a validation function that is executed on every record. 2) Use the filter function to create a filtered dataset prior to processing.

On 11/14/14, 10:28 AM, "Qiuzhuang Lian" wrote:
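A minimal sketch of both options in plain Python (the record layout and the `is_valid` rule are invented for illustration); with PySpark the same callables would be passed to `rdd.filter` and `rdd.map`:

```python
# Toy input: two good "id,name" records and one malformed record.
records = ["1,alice", "2,bob", "bad-record", "4,dave"]

def is_valid(record):
    """Return True for records matching the assumed 'id,name' layout."""
    parts = record.split(",")
    return len(parts) == 2 and parts[0].isdigit()

# Option 2: filter out bad records before processing
# (rdd.filter(is_valid) in Spark).
clean = [r for r in records if is_valid(r)]

# Option 1: validate inside the map step, returning None for bad
# records instead of raising, then drop the Nones.
def parse(record):
    if not is_valid(record):
        return None  # skip the bad record
    ident, name = record.split(",")
    return (int(ident), name)

parsed = [p for p in map(parse, records) if p is not None]
print(parsed)  # -> [(1, 'alice'), (2, 'bob'), (4, 'dave')]
```

Either way the bad record never reaches the downstream processing; the filter version keeps the pipeline stages cleaner, while validating inside map lets you parse and skip in one pass.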
Skipping Bad Records in Spark
Hi,

MapReduce has the feature of skipping bad records. Is there any equivalent in Spark? Should I use the filter API to do this?

Thanks,
Qiuzhuang