Actually, how many tables are involved here?

What version of Hive are you using? Sorry, I am not familiar with the Cloudera
5.5.1 spec.

HTH

Dr Mich Talebzadeh



LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 13 April 2016 at 19:20, pseudo oduesp <pseudo20...@gmail.com> wrote:

> Hi guys,
> I get this error after 5 hours of processing. I perform a lot of joins: 14
> left joins with small tables.
>
> In the Spark UI and the console log everything looks fine, but when the job
> saves the last join I get this error:
>
> Py4JJavaError: An error occurred while calling o115.parquet. _metadata is
> not a Parquet file (too small)
>
> I use 4 containers with 26 GB each and 8 cores. I increased the number of
> partitions and used a broadcast join, without success. I have a log file,
> but it is large (57 MB), so I can't share it with you.
>
> I use PySpark 1.5.0 on Cloudera 5.5.1 with YARN, and I use HiveContext to
> work with the data.
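For reference, a minimal PySpark 1.5 sketch of the kind of pipeline described
above: a HiveContext, a chain of left outer joins against small tables, and a
final Parquet write. The table names (big_fact, dim_1 ... dim_14), the join
key "id", the partition count and the output path are all hypothetical, not
taken from the actual job.

# Minimal sketch only -- hypothetical table names and join key.
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="left-join-chain")
sqlContext = HiveContext(sc)

# Let Spark broadcast small tables automatically instead of shuffling them
# (threshold here is roughly 50 MB).
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", str(50 * 1024 * 1024))

fact = sqlContext.table("big_fact")
result = fact

for i in range(1, 15):
    dim = sqlContext.table("dim_%d" % i)
    # Left outer join keeps every fact row; drop the duplicate join key
    # coming from the dimension side so the output schema stays clean.
    result = result.join(dim, fact["id"] == dim["id"], "left_outer").drop(dim["id"])

# Keep the number of output files under control before the final write.
result = result.repartition(200)

# Overwrite mode clears any previous (possibly partial) output at this path
# before writing the new Parquet files and their _metadata summary.
result.write.mode("overwrite").parquet("/tmp/joined_output")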
