Re: .tar.bz2 in spark

Jörn Franke Thu, 08 Dec 2016 15:42:06 -0800

Tar is not supported out of the box. Just store the file as .json.bz2, without 
using tar.
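
If the archives cannot be re-packaged as plain .json.bz2, one possible workaround (not something proposed in this thread) is to read each archive whole with `sc.binaryFiles` and unpack it in Python with the standard `tarfile` module. The sketch below makes several assumptions: the helper name `json_lines_from_tar_bz2` and the HDFS path are hypothetical, the archived files hold one JSON object per line, and the text is UTF-8.

```python
import io
import tarfile


def json_lines_from_tar_bz2(raw_bytes):
    """Yield every non-empty line of every regular file in a .tar.bz2 archive.

    raw_bytes is the complete content of one archive, as handed over by
    sc.binaryFiles (hypothetical helper, not part of Spark).
    """
    with tarfile.open(fileobj=io.BytesIO(raw_bytes), mode="r:bz2") as archive:
        for member in archive:
            # Skip directories, links and other non-file entries.
            if not member.isfile():
                continue
            extracted = archive.extractfile(member)
            for line in extracted:
                text = line.decode("utf-8").strip()
                if text:
                    yield text


# Assumed usage inside a PySpark 1.6 shell (the path is a placeholder):
#   archives = sc.binaryFiles("hdfs:///path/to/archives/*.tar.bz2")
#   lines = archives.flatMap(lambda kv: json_lines_from_tar_bz2(kv[1]))
#   df = sqlContext.read.json(lines)
```

Each archive is read and decompressed in memory by a single task, so this only suits archives that fit comfortably on one executor. Storing the data as plain .json.bz2 avoids that limit and lets Spark read the file directly.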



> On 8 Dec 2016, at 20:18, Maurin Lenglart <mau...@cuberonlabs.com> wrote:
> 
> Hi,
> I am trying to load a JSON file compressed as .tar.bz2, but Spark throws an error.
> I am using PySpark with Spark 1.6.2 (Cloudera 5.9).
>  
> What would be the best way to handle that?
> I don’t want to have a non-Spark job that just uncompresses the data…
>  
> thanks
