Tar is not supported out of the box. Just store the file as .json.bz2, without using tar.
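
If the data already exists only as .tar.bz2, a one-off repack along these lines might be enough. This is a minimal sketch, assuming Python 3 and only the standard library; the function name `repack_tar_bz2` and its arguments are hypothetical, not part of Spark or any library.

```python
import bz2
import os
import shutil
import tarfile


def repack_tar_bz2(archive_path, out_dir):
    """Write each regular file inside a .tar.bz2 archive as its own .bz2 file.

    Hypothetical helper: returns the list of paths it wrote.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with tarfile.open(archive_path, "r:bz2") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            # Only the base name is kept, so members with the same file name
            # in different directories would overwrite each other.
            out_path = os.path.join(out_dir, os.path.basename(member.name) + ".bz2")
            src = archive.extractfile(member)
            with bz2.open(out_path, "wb") as dst:
                shutil.copyfileobj(src, dst)  # stream, do not load it all in memory
            written.append(out_path)
    return written
```

A member named events.json would end up as events.json.bz2 in out_dir. Once that file is in a location Spark can reach, something like sqlContext.read.json("/path/to/events.json.bz2") should be able to load it in Spark 1.6, since bzip2 is handled by the Hadoop compression codecs.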
> On 8 Dec 2016, at 20:18, Maurin Lenglart <mau...@cuberonlabs.com> wrote:
>
> Hi,
> I am trying to load a JSON file compressed as .tar.bz2, but Spark throws an error.
> I am using PySpark with Spark 1.6.2 (Cloudera 5.9).
>
> What would be the best way to handle that?
> I don’t want to have a non-Spark job that just uncompresses the data…
>
> Thanks