[ https://issues.apache.org/jira/browse/HIVE-20587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977412#comment-16977412 ]
Abhishek commented on HIVE-20587: --------------------------------- it can be tuned through by editing vi /usr/local/hadoop-3.1.3/etc/hadoop/mapred-site.xml <property> <name>mapreduce.map.memory.mb</name> <value>4096</value> </property> <property> <name>mapreduce.reduce.memory.mb</name> <value>4096</value> </property> rest could be fine > Reduce phase could not be completed on MapReduce engine > ------------------------------------------------------- > > Key: HIVE-20587 > URL: https://issues.apache.org/jira/browse/HIVE-20587 > Project: Hive > Issue Type: Bug > Reporter: Mykhailo Kysliuk > Priority: Minor > > Steps to reproduce: > {code} > CREATE TABLE IF NOT EXISTS `test`( > `account_id` bigint, > `subscription_id` bigint, > `previous_subscription_id` bigint, > `subscription_effective_from_dt` date, > `subscription_effective_to_dt` date, > `order_completed_dtm` timestamp, > `subscription_status` string) STORED AS ORC; > insert into test values (1, 1, 1, NULL, NULL, NULL, 'ss'); > SELECT > account_id, > subscription_id, > previous_subscription_id, > subscription_status, > subscription_effective_from_dt, > subscription_effective_to_dt > FROM > ( > SELECT > DISTINCT account_id, > subscription_id, > previous_subscription_id, > subscription_status, > subscription_effective_from_dt, > subscription_effective_to_dt, > order_completed_dtm, > ROW_NUMBER()OVER(PARTITION BY ACCOUNT_ID) as r1 > FROM test > WHERE account_id=1 > ) AS RTE where r1=1; > {code} > Expected result: > {code} > OK > 1 1 1 ss NULL NULL > {code} > Actual result: > {code} > Starting Job = job_1537276416358_0006, Tracking URL = > https://node10.cluster.com:8090/proxy/application_1537276416358_0006/ > Kill Command = /opt/mapr/hadoop/hadoop-2.7.0/bin/hadoop job -kill > job_1537276416358_0006 > Hadoop job information for Stage-1: number of mappers: 1; number of reducers: > 1 > 2018-09-18 13:21:00,754 Stage-1 map = 0%, reduce = 0% > 2018-09-18 13:21:04,913 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:22:05,152 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:23:05,540 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:24:06,532 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:25:07,477 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:26:07,914 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:27:08,021 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:28:08,935 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:29:08,970 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:30:09,054 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:31:09,965 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:32:10,922 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:33:10,964 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:34:11,005 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:35:11,978 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:36:13,057 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:37:13,208 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:38:13,394 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:39:13,719 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:40:13,729 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:41:14,103 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:42:14,160 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:43:14,241 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:44:14,345 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:45:15,353 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:46:15,546 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:47:15,608 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:48:16,481 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:49:16,522 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:50:16,774 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:51:17,784 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:52:17,838 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:53:17,868 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:54:18,010 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:55:18,287 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:56:18,310 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:57:18,628 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:58:19,061 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 13:59:19,359 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 14:00:19,534 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 14:01:19,707 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 14:02:20,670 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.3 > sec > 2018-09-18 14:03:01,193 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 1.3 sec > MapReduce Total cumulative CPU time: 1 seconds 300 msec > Ended Job = job_1537276416358_0006 with errors > Error during job, obtaining debugging information... > Examining task ID: task_1537276416358_0006_m_000000 (and more) from job > job_1537276416358_0006 > Task with the most failures(4): > ----- > Task ID: > task_1537276416358_0006_r_000000 > ----- > Diagnostic Messages for this Task: > AttemptID:attempt_1537276416358_0006_r_000000_3 Timed out after 600 secs > FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.mr.MapRedTask > MapReduce Jobs Launched: > Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 1.3 sec HDFS Read: 0 > HDFS Write: 0 FAIL > Total MapReduce CPU Time Spent: 1 seconds 300 msec > {code} > On tez engine this query runs fine. > At yarn logs: > {code} > 2018-09-18 14:53:18,012 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated > containers 1 > 2018-09-18 14:53:18,012 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce > 2018-09-18 14:53:18,013 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned > container container_e1322_1537276416358_0014_01_000003 to > attempt_1537276416358_0014_r_000000_0 > 2018-09-18 14:53:18,013 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: > PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:1 > CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:1 RackLocal:0 > 2018-09-18 14:53:18,020 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapred.Task: mapOutputFile class: > org.apache.hadoop.mapred.MapRFsOutputFile > 2018-09-18 14:53:18,021 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved node10.cluster.com to > /default-rack > 2018-09-18 14:53:18,021 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1537276416358_0014_r_000000_0 TaskAttempt Transitioned from > UNASSIGNED to ASSIGNED > 2018-09-18 14:53:18,021 INFO [ContainerLauncher #2] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing > the event EventType: CONTAINER_REMOTE_LAUNCH for container > container_e1322_1537276416358_0014_01_000003 taskAttempt > attempt_1537276416358_0014_r_000000_0 > 2018-09-18 14:53:18,021 INFO [ContainerLauncher #2] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching > attempt_1537276416358_0014_r_000000_0 > 2018-09-18 14:53:18,021 INFO [ContainerLauncher #2] > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: > Opening proxy : node10.cluster.com:8099 > 2018-09-18 14:53:18,030 INFO [ContainerLauncher #2] > org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptContainerLaunchedEvent: > Shuffle port returned by ContainerManager for > attempt_1537276416358_0014_r_000000_0 : 13562 > 2018-09-18 14:53:18,030 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: > [attempt_1537276416358_0014_r_000000_0] using containerId: > [container_e1322_1537276416358_0014_01_000003 on NM: [node10.cluster.com:8099] > 2018-09-18 14:53:18,030 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1537276416358_0014_r_000000_0 TaskAttempt Transitioned from ASSIGNED > to RUNNING > 2018-09-18 14:53:18,030 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START > task_1537276416358_0014_r_000000 > 2018-09-18 14:53:18,031 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1537276416358_0014_r_000000 Task Transitioned from SCHEDULED to RUNNING > 2018-09-18 14:53:19,018 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() > for application_1537276416358_0014: ask=1 release= 0 newContainers=0 > finishedContainers=0 resourcelimit=<memory:8784, vCores:1, disks:0.0> > knownNMs=1 > 2018-09-18 15:03:41,844 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics > report from attempt_1537276416358_0014_r_000000_0: > AttemptID:attempt_1537276416358_0014_r_000000_0 Timed out after 600 secs > 2018-09-18 15:03:41,845 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1537276416358_0014_r_000000_0 TaskAttempt Transitioned from RUNNING > to FAIL_CONTAINER_CLEANUP > 2018-09-18 15:03:41,846 INFO [ContainerLauncher #3] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing > the event EventType: CONTAINER_REMOTE_CLEANUP for container > container_e1322_1537276416358_0014_01_000003 taskAttempt > attempt_1537276416358_0014_r_000000_0 > 2018-09-18 15:03:41,847 INFO [ContainerLauncher #3] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING > attempt_1537276416358_0014_r_000000_0 > 2018-09-18 15:03:41,848 INFO [ContainerLauncher #3] > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: > Opening proxy : node10.cluster.com:8099 > 2018-09-18 15:03:41,881 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1537276416358_0014_r_000000_0 TaskAttempt Transitioned from > FAIL_CONTAINER_CLEANUP to FAIL_TASK_CLEANUP > 2018-09-18 15:03:41,882 INFO [CommitterEvent Processor #1] > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing > the event EventType: TASK_ABORT > 2018-09-18 15:03:41,885 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1537276416358_0014_r_000000_0 TaskAttempt Transitioned from > FAIL_TASK_CLEANUP to FAILED > 2018-09-18 15:03:41,888 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1537276416358_0014_r_000000_1 TaskAttempt Transitioned from NEW to > UNASSIGNED > 2018-09-18 15:03:41,888 INFO [Thread-49] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on > node node10.cluster.com > 2018-09-18 15:03:42,510 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before > Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 > AssignedReds:1 CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 > HostLocal:1 RackLocal:0 > 2018-09-18 15:03:42,513 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating > schedule, headroom=<memory:8784, vCores:1, disks:0.0> > 2018-09-18 15:03:42,513 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: All maps > assigned. Ramping up all remaining reduces:1 > 2018-09-18 15:03:42,513 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: > PendingReds:0 ScheduledMaps:0 ScheduledReds:1 AssignedMaps:0 AssignedReds:1 > CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:1 RackLocal:0 > 2018-09-18 15:03:43,515 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() > for application_1537276416358_0014: ask=1 release= 0 newContainers=0 > finishedContainers=1 resourcelimit=<memory:12880, vCores:2, disks:1.33> > knownNMs=1 > 2018-09-18 15:03:43,515 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received > completed container container_e1322_1537276416358_0014_01_000003 > 2018-09-18 15:03:43,515 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: > PendingReds:0 ScheduledMaps:0 ScheduledReds:1 AssignedMaps:0 AssignedReds:0 > CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:1 RackLocal:0 > 2018-09-18 15:03:43,515 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics > report from attempt_1537276416358_0014_r_000000_0: Container killed by the > ApplicationMaster. > Container killed on request. Exit code is 143 > Container exited with a non-zero exit code 143. > {code} > Maybe this is not a bug, but MR has not enough resources to run this query. -- This message was sent by Atlassian Jira (v8.3.4#803005)