Re: Parsing Json object definition spanning multiple lines

2014-08-26 Thread Matei Zaharia
You can use sc.wholeTextFiles to read each file as a complete String, though it requires each file to be small enough for one task to process.

On August 26, 2014 at 4:01:45 PM, Chris Fregly (ch...@fregly.com) wrote:
> i've seen this done using mapPartitions() where each partition represents a sin
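
A minimal sketch of the sc.wholeTextFiles suggestion, in Python rather than Scala. wholeTextFiles yields (path, content) pairs, so a multi-line JSON document can be parsed from the content string in one call. The helper name parse_whole_file, the input directory, and the sample record are hypothetical and only for illustration; the Spark call is shown as a comment because it assumes an existing SparkContext named sc.

```python
import json


def parse_whole_file(path_and_content):
    """Turn one (path, content) pair into (path, parsed JSON document)."""
    path, content = path_and_content
    # The whole file is one string, so line breaks inside the JSON are harmless.
    return path, json.loads(content)


# Hypothetical Spark usage (assumes a SparkContext `sc` and an illustrative path):
#   docs = sc.wholeTextFiles("/data/json").map(parse_whole_file)

# Local check without Spark: simulate one (path, content) pair.
sample = ("file:/data/json/a.json", '{\n  "id": 1,\n  "tags": ["x", "y"]\n}')
parsed_path, parsed_doc = parse_whole_file(sample)
```

As noted above, this only works when every file is small enough to be held in memory by a single task.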

Re: Parsing Json object definition spanning multiple lines

2014-08-26 Thread Chris Fregly
i've seen this done using mapPartitions() where each partition represents a single, multi-line json file. you can rip through each partition (json file) and parse the json doc as a whole. this assumes you use sc.textFile("/*.json") or equivalent to load in multiple files at once. each json file
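
A rough sketch of the mapPartitions() approach described above, in Python. It assumes each partition holds exactly one complete JSON file, which holds only when each file fits in a single input split; a file large enough to be split across partitions would fail to parse this way. The helper name parse_partition and the sample lines are hypothetical.

```python
import json


def parse_partition(lines):
    """Join every line of a partition and parse the result as one JSON document."""
    text = "\n".join(lines)
    # Skip empty partitions instead of raising on an empty string.
    if text.strip():
        yield json.loads(text)


# Hypothetical Spark usage (assumes a SparkContext `sc`):
#   docs = sc.textFile("/*.json").mapPartitions(parse_partition)

# Local check without Spark: an iterator of lines stands in for one partition.
partition = iter(["{", '  "id": 2,', '  "name": "b"', "}"])
parsed_docs = list(parse_partition(partition))
empty_docs = list(parse_partition(iter([])))
```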