OK, more updates. Today I ran the query on YARN with DEBUG logging turned on. Here's what I found in the task log for the dangling reduce task.
I noticed that after the RowContainer has been created by Hive, there are a lot of IPC/RPC-related logs (still printing), 3 seconds apart. My guess is that Hive is reading the rows back from disk one row at a time, one row every 3 seconds? No wonder my job never ends. Could anyone confirm this? Appreciate it!

2017-10-20 13:43:23,144 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 1000 rows for join key [999]
2017-10-20 13:43:23,146 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 2000 rows for join key [999]
2017-10-20 13:43:23,153 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 4000 rows for join key [999]
2017-10-20 13:43:23,178 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 8000 rows for join key [999]
2017-10-20 13:43:23,205 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 16000 rows for join key [999]
2017-10-20 13:43:23,222 INFO [main] org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file /tmp/hadoop-cloud1/nm-local-dir/usercache/dbruce/appcache/application_1507669571623_4416/container_e26_1507669571623_4416_01_000418/tmp/hive-rowcontainer5891981157220982604/RowContainer8985737720168249326.tmp
2017-10-20 13:43:23,354 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 32000 rows for join key [999]
2017-10-20 13:43:23,427 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 64000 rows for join key [999]
2017-10-20 13:43:23,597 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 128000 rows for join key [999]
2017-10-20 13:43:23,817 DEBUG [main] org.apache.hadoop.util.Progress: Illegal progress value found, progress is larger than 1. Progress will be changed to 1
2017-10-20 13:43:23,817 DEBUG [main] org.apache.hadoop.util.Progress: Illegal progress value found, progress is larger than 1. Progress will be changed to 1
... ...
2017-10-20 13:43:25,239 DEBUG [main] org.apache.hadoop.util.Progress: Illegal progress value found, progress is larger than 1. Progress will be changed to 1
2017-10-20 13:43:25,239 DEBUG [main] org.apache.hadoop.util.Progress: Illegal progress value found, progress is larger than 1. Progress will be changed to 1
2017-10-20 13:43:25,248 DEBUG [main] org.apache.hadoop.io.nativeio.NativeIO: Got UserName dbruce for ID 1573865840 from the native implementation
2017-10-20 13:43:25,249 DEBUG [main] org.apache.hadoop.io.nativeio.NativeIO: Got GroupName db_finan for ID 10100 from the native implementation
2017-10-20 13:43:25,249 DEBUG [main] org.apache.hadoop.mapred.FileInputFormat: Time taken to get FileStatuses: 7
2017-10-20 13:43:25,249 INFO [main] org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1
2017-10-20 13:43:25,255 DEBUG [main] org.apache.hadoop.mapred.FileInputFormat: Total # of splits generated by getSplits: 1, TimeTaken: 14
2017-10-20 13:43:27,095 DEBUG [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: [ 7382 7463 ]
2017-10-20 13:43:27,096 DEBUG [IPC Parameter Sending Thread #0] org.apache.hadoop.ipc.Client: IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416 sending #7
2017-10-20 13:43:27,097 DEBUG [IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416] org.apache.hadoop.ipc.Client: IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416 got value #7
2017-10-20 13:43:27,097 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: Call: statusUpdate 2
2017-10-20 13:43:30,161 DEBUG [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: [ 7382 7463 ]
2017-10-20 13:43:30,161 DEBUG [IPC Parameter Sending Thread #0] org.apache.hadoop.ipc.Client: IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416 sending #8
2017-10-20 13:43:30,162 DEBUG [IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416] org.apache.hadoop.ipc.Client: IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416 got value #8
2017-10-20 13:43:30,162 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: Call: statusUpdate 1
2017-10-20 13:43:33,163 DEBUG [IPC Parameter Sending Thread #0] org.apache.hadoop.ipc.Client: IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416 sending #9
2017-10-20 13:43:33,163 DEBUG [IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416] org.apache.hadoop.ipc.Client: IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416 got value #9
2017-10-20 13:43:33,163 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: Call: ping 1
2017-10-20 13:43:36,164 DEBUG [IPC Parameter Sending Thread #0] org.apache.hadoop.ipc.Client: IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416 sending #10
2017-10-20 13:43:36,165 DEBUG [IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416] org.apache.hadoop.ipc.Client: IPC Client (13419636) connection to /10.224.174.71:33647 from job_1507669571623_4416 got value #10
2017-10-20 13:43:36,165 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: Call: ping 1
... ...

On Thu, Oct 19, 2017 at 11:35 PM, Daniel Bruce <dbse...@gmail.com> wrote:

> Hi Gopal,
>
> Thanks for your input! In my case I'm using MapReduce, not Tez. I figured
> I'd better be more specific so as to give you more details.
>
> For this job there are 298 maps and 74 reduces. All the maps completed
> very fast, within 1 minute, and 73 of the reduces completed in about 2
> minutes.
>
> Now there is only 1 reduce task running (forever). Here's a screenshot of
> the job details: https://ibb.co/eBDj6R
>
> I noticed one interesting thing: MAP_OUTPUT_RECORDS and
> REDUCE_INPUT_RECORDS don't match for the whole job (99,073,863 vs.
> 98,105,913).
> Here's a screenshot of the counters for the dangling reduce task:
> https://ibb.co/dHgyY6
>
> The ratio of REDUCE_INPUT_RECORDS to REDUCE_INPUT_GROUPS is 1. What does
> that mean?
>
> For comparison, here's a screenshot of the counters for a different
> reduce task, one which completed within 1 minute. It also has a ratio of
> 1: https://ibb.co/mzoHRR
>
> Another comparison I did is of the task logs. Below, No. 1 is the
> dangling reduce task, and No. 2 and 3 are two completed reduce tasks.
> 1. https://ibb.co/caKVfm
> 2. https://ibb.co/earJ0m
> 3. https://ibb.co/edQiY6
>
> I don't understand what the running reduce task is doing. Are there any
> other logs that could be helpful?
>
> Regards,
> Daniel
>
> On Thu, Oct 19, 2017 at 9:45 PM, Gopal Vijayaraghavan <gop...@apache.org>
> wrote:
>
>> > I didn't see data skew for that reducer. It has a similar amount of
>> > REDUCE_INPUT_RECORDS as other reducers.
>> …
>> > org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 8000
>> > rows for join key [4092813312923569]
>>
>> The ratio of REDUCE_INPUT_RECORDS to REDUCE_INPUT_GROUPS is what is
>> relevant.
>>
>> The row containers being spilled to disk means that at least 1 key in
>> the join has > 10000 values.
>>
>> If you have Tez, this comes up when you run the SkewAnalyzer:
>> https://github.com/apache/tez/blob/master/tez-tools/analyzers/job-analyzer/src/main/java/org/apache/tez/analyzer/plugins/SkewAnalyzer.java#L41
>>
>> Cheers,
>> Gopal
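For anyone hitting this later, the two checks discussed in the thread can be sketched as a small script. This is an illustrative sketch only, not Hive or Tez code: `hot_keys` and `skew_ratio` are hypothetical helpers I wrote for this post, the sample log lines are abridged from the excerpt above, and the counter values are the ones quoted in the thread.

```python
import re
from collections import defaultdict

# Pattern for CommonJoinOperator lines like:
#   "... table 0 has 8000 rows for join key [999]"
JOIN_LOG = re.compile(r"table (\d+) has (\d+) rows for join key \[(\w+)\]")

def hot_keys(log_lines):
    """Return {join_key: max rows logged} seen in CommonJoinOperator output."""
    counts = defaultdict(int)
    for line in log_lines:
        m = JOIN_LOG.search(line)
        if m:
            rows, key = int(m.group(2)), m.group(3)
            counts[key] = max(counts[key], rows)
    return dict(counts)

def skew_ratio(reduce_input_records, reduce_input_groups):
    """Average rows per key for one reducer: records / groups.

    A ratio near 1 means every key is (nearly) unique; a large ratio means
    a few keys carry most of the rows -- the situation Gopal describes.
    """
    return reduce_input_records / max(reduce_input_groups, 1)

# Abridged sample lines from the dangling task's log above.
lines = [
    "2017-10-20 13:43:23,144 INFO [main] CommonJoinOperator: table 0 has 1000 rows for join key [999]",
    "2017-10-20 13:43:23,597 INFO [main] CommonJoinOperator: table 0 has 128000 rows for join key [999]",
]
print(hot_keys(lines))                   # key 999 peaks at 128000 buffered rows
print(skew_ratio(98105913, 98105913))    # ratio of 1.0, as in the screenshots
```

The 128000-row count for one key is well past the point where the row container spills to disk (at least one key with > 10000 values, per Gopal's note), so scanning the task log with something like `hot_keys` is a quick way to find the hot key even when the per-reducer counters look balanced.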