Hi Rajesh,

Thanks for your quick answer. It seems we also have a problem with the logs, as even with aggregation enabled we get the following:

/app-logs/yarn/logs/application_1499426430661_0113 does not exist.
Log aggregation has not completed or is not enabled.
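(For reference, the aggregated-log path in that error is governed by YARN's log-aggregation settings. A minimal sketch of the relevant yarn-site.xml properties, assuming the path layout implied by the error message — verify against your cluster's actual configuration:)

```xml
<!-- yarn-site.xml: log aggregation must be enabled for "yarn logs" to work -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <!-- base dir of the aggregated logs; the error path suggests /app-logs here -->
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/app-logs</value>
</property>
```

(Note that aggregated logs only become available once the application has finished, so a still-running or very recently killed application can also produce that "does not exist" message.)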
Still, I tried what you suggested and still got the same problem:

INFO : Map 1: 255(+85,-31)/340
INFO : Map 1: 256(+84,-31)/340
INFO : Map 1: 257(+77,-33)/340
INFO : Map 1: 257(+0,-33)/340
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1499426430661_0119_1_00, diagnostics=[Task failed, taskId=task_1499426430661_0119_1_00_000273, diagnostics=[TaskAttempt 0 failed, info=[Container container_e17_1499426430661_0119_01_000170 finished with diagnostics set to [Container failed, exitCode=-104. Container [pid=37464,containerID=container_e17_1499426430661_0119_01_000170] is running beyond physical memory limits. Current usage: 2.5 GB of 2.5 GB physical memory used; 4.4 GB of 5.3 GB virtual memory used. Killing container.

Do you have another idea?

Regards,
Loïc

Loïc CHANEL
System Big Data engineer
MS&T - Worldline Analytics Platform - Worldline (Villeurbanne, France)

2017-07-07 16:51 GMT+02:00 Rajesh Balamohan <rbalamo...@apache.org>:

> You can run "yarn logs -applicationId application_1499426430661_0113 > application_1499426430661_0113.log" to get the app logs.
>
> I would suggest running your Hive query with "hive --hiveconf tez.grouping.max-size=134217728 --hiveconf tez.grouping.min-size=134217728". You may want to adjust this parameter (to say 256 MB or so) in case too many mappers are created.
>
> ~Rajesh.B
>
> On Fri, Jul 7, 2017 at 8:02 PM, Loïc Chanel <loic.cha...@telecomnancy.net> wrote:
>
>> Hi guys,
>>
>> I'm having some trouble with Tez when I try to load data stored in small JSON files in HDFS into a Hive table.
>>
>> At first I got some Out of memory exceptions, so I tried increasing the amount of memory allocated to Tez, until the problem turned into a GC Overhead limit exceeded after 10 GB of RAM was allocated to Tez containers.
>>
>> So I upgraded my common sense and put the memory limits back to a normal level, and now the problem I hit is the following:
>>
>> INFO : Map 1: 276(+63,-84)/339
>> INFO : Map 1: 276(+63,-85)/339
>> INFO : Map 1: 276(+63,-85)/339
>> INFO : Map 1: 276(+0,-86)/339
>> INFO : Map 1: 276(+0,-86)/339
>> ERROR : Status: Failed
>> ERROR : Status: Failed
>> ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1499426430661_0113_1_00, diagnostics=[Task failed, taskId=task_1499426430661_0113_1_00_000241, diagnostics=[TaskAttempt 0 failed, info=[Container container_e17_1499426430661_0113_01_000170 finished with diagnostics set to [Container failed, exitCode=-104. Container [pid=59528,containerID=container_e17_1499426430661_0113_01_000170] is running beyond physical memory limits. Current usage: 2.7 GB of 2.5 GB physical memory used; 4.4 GB of 5.3 GB virtual memory used. Killing container.
>>
>> The problem is that I can't see how the container could have been allocated so much memory, or why Tez can't split the jobs into smaller ones when they fail for memory reasons.
>>
>> FYI, in YARN the max container memory is 92160 MB, in MR2 Map can have 4 GB and Reduce 5 GB, the Tez container size is set to 2560 MB, and tez.grouping.max-size is set to 1073741824.
>>
>> If you need more information, feel free to ask.
>>
>> I am currently running out of ideas on how to debug this, as I have limited access to the Tez container logs, so any input will be highly appreciated.
>>
>> Thanks!
>>
>> Loïc
>>
>> Loïc CHANEL
>> System Big Data engineer
>> MS&T - Worldline Analytics Platform - Worldline (Villeurbanne, France)
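(For what it's worth, the limits in the error messages above are consistent with the configuration described in the thread: a 2560 MB Tez container gives the 2.5 GB physical cap, and YARN's default vmem-to-pmem ratio of 2.1 — an assumption here, set by yarn.nodemanager.vmem-pmem-ratio, so check yarn-site.xml — gives the ~5.3 GB virtual cap. A quick sanity check:)

```python
# Sanity check of the memory limits reported in the container diagnostics.
# Assumes YARN's default yarn.nodemanager.vmem-pmem-ratio of 2.1 -- verify
# against the cluster's yarn-site.xml before relying on this.

container_mb = 2560          # Tez container size from the thread (2560 MB)
vmem_pmem_ratio = 2.1        # YARN default vmem-to-pmem ratio (assumed)

physical_limit_gb = container_mb / 1024                   # physical memory cap
virtual_limit_gb = container_mb * vmem_pmem_ratio / 1024  # virtual memory cap

print(f"physical limit: {physical_limit_gb:.2f} GB")  # 2.50 GB, as in the log
print(f"virtual limit:  {virtual_limit_gb:.2f} GB")   # 5.25 GB, reported as 5.3 GB
```

(Since the container is killed for exceeding the physical limit, exitCode=-104, the available levers are raising the Tez container size or shrinking the amount of data each task handles, which is what the tez.grouping.* suggestion earlier in the thread aims at.)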