OK, I found these slides by Yin Huai (http://spark-summit.org/wp-content/uploads/2014/07/Easy-json-Data-Manipulation-Yin-Huai.pdf).
Reading a JSON file looks pretty simple:

    sqlContext.jsonFile("data.json")   <---- Is this already available in the master branch?

But my question about using a combination of resources (memory processing and disk processing) still remains.

Thanks!

On Fri, Jul 4, 2014 at 9:49 AM, Abel Coronado Iruegas <
acoronadoirue...@gmail.com> wrote:

> Hi everybody,
>
> Can someone tell me whether it is possible to read and filter a 60 GB file
> of tweets (JSON docs) in a standalone Spark deployment running on a single
> machine with 40 GB RAM and 8 cores?
>
> I mean, is it possible to configure Spark to work with some amount of
> memory (say 20 GB) and do the rest of the processing on disk, avoiding
> OutOfMemory exceptions?
>
> Regards,
>
> Abel
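For concreteness, here is a minimal sketch of the pipeline I have in mind, assuming I'm in spark-shell (so sc is in scope), that jsonFile has landed in the build I'm running, and that each tweet has a "text" field; the path, table name, and filter condition are placeholders:

    import org.apache.spark.sql.SQLContext
    import org.apache.spark.storage.StorageLevel

    val sqlContext = new SQLContext(sc)

    // jsonFile infers the schema by scanning the input.
    val tweets = sqlContext.jsonFile("tweets.json")
    tweets.registerAsTable("tweets")

    // Keep only the tweets of interest; the result is a SchemaRDD,
    // which is an RDD[Row], so the usual RDD operations apply.
    val filtered = sqlContext.sql(
      "SELECT text FROM tweets WHERE text LIKE '%spark%'")

    // MEMORY_AND_DISK spills partitions that don't fit in RAM to disk
    // instead of failing with OutOfMemoryError.
    filtered.persist(StorageLevel.MEMORY_AND_DISK)
    filtered.saveAsTextFile("filtered_tweets")

My understanding is that a plain filter-and-save pipeline like this streams through the input partition by partition, so the whole 60 GB never has to fit in memory at once; the MEMORY_AND_DISK persistence only matters if the filtered result is cached for reuse. But I'd appreciate confirmation of that.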