Hi,

> The job completes fine if we reduce the # of rows processed by reducing
>the # of days data being processed.
>

> It just gets stuck after all maps are completed. We checked the logs and
>it says the containers are released.

Looks like you're inserting into a bucketed & partitioned table and facing
connection timeouts due to GC pauses?

By default that optimization is disabled, because it slows down the common
1-partition-at-a-time ETL case.

If your load writes to more than one partition and the target table is
bucketed, you need to set

set hive.optimize.sort.dynamic.partition=true;
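
Something like this, as a rough sketch (clicks_part / clicks_staging and the
columns are just made-up placeholder names, not anything from your job):

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.optimize.sort.dynamic.partition=true;

-- target: partitioned + bucketed ORC table
CREATE TABLE clicks_part (user_id BIGINT, url STRING)
PARTITIONED BY (dt STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS
STORED AS ORC;

-- dynamic-partition insert; the partition column (dt) goes last in the SELECT
INSERT OVERWRITE TABLE clicks_part PARTITION (dt)
SELECT user_id, url, dt
FROM clicks_staging;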


The largest data load done using a single SQL statement was the 100TB ETL
load for TPC-DS.

Back in Hive 0.11, people used explicit "DISTRIBUTE BY" or "SORT BY" as a
workaround, which didn't scale as well.

If you have those in your query, remove them.
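
For reference, that old workaround looked roughly like this (same made-up
names as above); with the sort-dynamic-partition setting on, drop the trailing
clause and let Hive plan the shuffle/sort itself:

INSERT OVERWRITE TABLE clicks_part PARTITION (dt)
SELECT user_id, url, dt
FROM clicks_staging
DISTRIBUTE BY dt SORT BY user_id;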

>2016-01-08 19:33:33,119 INFO [Socket Reader #1 for port 43451]
>org.apache.hadoop.ipc.Server: Socket Reader #1 for port 43451:
>readAndProcess from client 39.0.8.17 threw exception
>[java.io.IOException: Connection reset by peer]

Whether that fixes it or not, there are other low-level issues which
trigger similar errors as you scale your cluster to 300+ nodes [1].

https://github.com/t3rmin4t0r/notes/wiki/Hadoop-Tuning-notes



Cheers,
Gopal
[1] - 
<http://www.slideshare.net/Hadoop_Summit/w-1205p230-aradhakrishnan-v3/10>





