combinedTextFile and CombineTextInputFormat

2016-05-14 Thread Alexander Pivovarov
Hello everyone, do you think it would be useful to add a combinedTextFile method (which uses CombineTextInputFormat) to SparkContext? It allows one task to read data from multiple text files and lets you control the number of RDD partitions by setting mapreduce.input.fileinputformat.split.maxsize def combine
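
For readers of the archive, here is a minimal sketch of what such a helper could look like. It is not the definition from the message, which is cut off above. The object name CombinedTextFileSketch, the standalone signature, and the maxSplitBytes parameter are assumptions made for illustration. The sketch relies only on SparkContext.newAPIHadoopFile and Hadoop's CombineTextInputFormat, and sets the split size on a copy of the context's Hadoop configuration.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object CombinedTextFileSketch {

  // Hypothetical helper: reads text files under `path`, letting one task
  // read several files. `maxSplitBytes` is an assumed parameter name; it
  // caps the size of each combined split and so influences how many
  // partitions the resulting RDD has.
  def combinedTextFile(sc: SparkContext, path: String, maxSplitBytes: Long): RDD[String] = {
    // Copy the context's Hadoop configuration so the setting stays local
    // to this read and does not leak into other jobs.
    val conf = new Configuration(sc.hadoopConfiguration)
    conf.setLong("mapreduce.input.fileinputformat.split.maxsize", maxSplitBytes)

    sc.newAPIHadoopFile(
        path,
        classOf[CombineTextInputFormat],
        classOf[LongWritable],
        classOf[Text],
        conf)
      // Hadoop reuses Text instances, so convert each line to an
      // immutable String right away.
      .map { case (_, line) => line.toString }
  }
}
```

A call such as CombinedTextFileSketch.combinedTextFile(sc, "/data/logs", 128L * 1024 * 1024) would then ask for splits of at most 128 MB; the path and size here are placeholders.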

Re: Nested/Chained case statements generate codegen over 64k exception

2016-05-14 Thread Reynold Xin
It might be best to fix this with fallback first, and then figure out how we can do it more intelligently.

On Sat, May 14, 2016 at 2:29 AM, Jonathan Gray wrote:
> Hi,
>
> I've raised JIRA SPARK-15258 (with code attached to re-produce problem)
> and would like to have a go at fixing it but don'

Nested/Chained case statements generate codegen over 64k exception

2016-05-14 Thread Jonathan Gray
Hi, I've raised JIRA SPARK-15258 (with code attached to reproduce the problem) and would like to have a go at fixing it, but I don't really know where to start. Could anyone provide some pointers? I've looked at the code associated with SPARK-13242 but was hoping to find a way to avoid the codegen fall