Hi,

> The job completes fine if we reduce the # of rows processed by reducing
> the # of days of data being processed.
>
> It just gets stuck after all maps are completed. We checked the logs and
> it says the containers are released.

Looks like you're inserting into a bucketed & partitioned table and facing
connection timeouts due to GC pauses?

By default the sorted dynamic-partition optimization is disabled, because it
slows down the common one-partition-at-a-time ETL case. If your data load
falls into the >1 partition & bucketed category, you need to set

  set hive.optimize.sort.dynamic.partition=true;
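For context, here is a minimal sketch of the kind of load this applies to.
The table, column, and partition names (events_bucketed, events_staging,
user_id, ds) and the bucket count are made up for illustration; only the
sort.dynamic.partition setting comes from this thread, the other SETs are
the usual dynamic-partition / bucketing prerequisites:

  -- hypothetical bucketed + dynamically partitioned target table
  CREATE TABLE events_bucketed (
    user_id BIGINT,
    event   STRING
  )
  PARTITIONED BY (ds STRING)
  CLUSTERED BY (user_id) INTO 64 BUCKETS
  STORED AS ORC;

  -- allow a fully dynamic partition insert
  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;
  -- needed on older Hive versions so inserts actually populate buckets
  SET hive.enforce.bucketing=true;
  -- the setting discussed above: sort rows by partition/bucket before writing
  SET hive.optimize.sort.dynamic.partition=true;

  -- a single INSERT that writes many partitions in one statement;
  -- the dynamic partition column (ds) must come last in the SELECT
  INSERT OVERWRITE TABLE events_bucketed PARTITION (ds)
  SELECT user_id, event, ds
  FROM events_staging
  WHERE ds BETWEEN '2016-01-01' AND '2016-01-31';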
The largest data load done using a single SQL statement was the 100 TB ETL
load for TPC-DS. In hive-11, people had workarounds using explicit
"DISTRIBUTE BY" or "SORT BY", which didn't scale as well. If you have those
in your query, remove them.

> 2016-01-08 19:33:33,119 INFO [Socket Reader #1 for port 43451]
> org.apache.hadoop.ipc.Server: Socket Reader #1 for port 43451:
> readAndProcess from client 39.0.8.17 threw exception
> [java.io.IOException: Connection reset by peer]

Whether that fixes it or not, there are other low-level issues which trigger
similar errors as you scale your cluster to 300+ nodes [1].

https://github.com/t3rmin4t0r/notes/wiki/Hadoop-Tuning-notes

Cheers,
Gopal

[1] - <http://www.slideshare.net/Hadoop_Summit/w-1205p230-aradhakrishnan-v3/10>