Silly question but...

2015-01-20 15:01:45,366 INFO [IPC Server handler 0 on 41329]
org.apache.tez.dag.app.rm.container.AMContainerImpl: AMContainer
container_1420748315294_70716_01_000002 transitioned from IDLE to
RUNNING via event C_PULL_TA
2015-01-20 15:01:45,366 INFO [IPC Server handler 0 on 41329]
org.apache.tez.dag.app.TaskAttemptListenerImpTezDag: Container with
id: container_1420748315294_70716_01_000002 given task:
attempt_1420748315294_70716_1_01_000000_0
2015-01-20 15:01:45,367 INFO [AsyncDispatcher event handler]
org.apache.hadoop.yarn.util.RackResolver: Resolved
production-hadoop-cdh-64-77.use1.huffpo.net to /default
2015-01-20 15:01:45,369 INFO [AsyncDispatcher event handler]
org.apache.tez.dag.app.dag.impl.TaskAttemptImpl: TaskAttempt:
[attempt_1420748315294_70716_1_01_000000_0] started. Is using
containerId: [container_1420748315294_70716_01_000002] on NM:
[production-hadoop-cdh-64-77.use1.huffpo.net:8041]
2015-01-20 15:01:45,374 INFO [AsyncDispatcher event handler]
org.apache.tez.dag.history.HistoryEventHandler:
[HISTORY][DAG:dag_1420748315294_70716_1][Event:TASK_ATTEMPT_STARTED]:
vertexName=Map 1,
taskAttemptId=attempt_1420748315294_70716_1_01_000000_0,
startTime=1421766105367,
containerId=container_1420748315294_70716_01_000002,
nodeId=production-hadoop-cdh-64-77.use1.huffpo.net:8041,
inProgressLogs=production-hadoop-cdh-64-77.use1.huffpo.net:8042/node/containerlogs/container_1420748315294_70716_01_000002/ecapriolo,
completedLogs=
2015-01-20 15:01:45,374 INFO [AsyncDispatcher event handler]
org.apache.tez.dag.app.dag.impl.TaskAttemptImpl:
attempt_1420748315294_70716_1_01_000000_0 TaskAttempt Transitioned
from START_WAIT to RUNNING due to event TA_STARTED_REMOTELY
2015-01-20 15:01:45,375 INFO [AsyncDispatcher event handler]
org.apache.tez.common.counters.Limits: Counter limits initialized with
parameters:  GROUP_NAME_MAX=128, MAX_GROUPS=500, COUNTER_NAME_MAX=64,
MAX_COUNTERS=1200
2015-01-20 15:01:45,384 INFO [AsyncDispatcher event handler]
org.apache.tez.dag.app.dag.impl.TaskImpl:
task_1420748315294_70716_1_01_000000 Task Transitioned from SCHEDULED
to RUNNING
2015-01-20 15:01:45,758 INFO [IPC Server handler 2 on 41329]
org.apache.tez.dag.app.dag.impl.TaskImpl:
TaskAttempt:attempt_1420748315294_70716_1_01_000000_0 sent events:
(0-1)
2015-01-20 15:01:49,495 INFO [AMRM Callback Handler Thread]
org.apache.tez.dag.app.rm.TaskScheduler: Allocated container
completed:container_1420748315294_70716_01_000002 last allocated to
task: attempt_1420748315294_70716_1_01_000000_0
2015-01-20 15:01:49,498 INFO [AsyncDispatcher event handler]
org.apache.tez.dag.app.rm.container.AMContainerImpl: Container
container_1420748315294_70716_01_000002 exited with diagnostics set to
Exception from container-launch.
Container id: container_1420748315294_70716_01_000002
Exit code: 255
Stack trace: ExitCodeException exitCode=255:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
        at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
        at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
        at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Is there any way to log the command being run that causes then shell to fail?



On Tue, Jan 20, 2015 at 2:09 PM, Edward Capriolo <edlinuxg...@gmail.com>
wrote:

> Java is on the PATH of our datanode/nodemanager systems
>
> [ecapriolo@production-hadoop-cdh-67-142 ~]$ which java
> /usr/bin/java
>
> [ecapriolo@production-hadoop-cdh-67-142 ~]$ java -version
> java version "1.7.0_65"
> OpenJDK Runtime Environment (rhel-2.5.1.2.el6_5-x86_64 u65-b17)
> OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
>
>
> On Tue, Jan 20, 2015 at 2:02 PM, Prasanth Jayachandran <
> pjayachand...@hortonworks.com> wrote:
>
>> My guess is..
>>  "java" binary is not in PATH of the shell script that launches the
>> container.. try creating a symbolic link in /bin/ to point to java..
>>
>> On Tue, Jan 20, 2015 at 7:22 AM, Edward Capriolo <edlinuxg...@gmail.com>
>> wrote:
>>
>>> It seems that CDH does not ship with enough jars to run tez out of the
>>> box.
>>>
>>> I have found the related cloudera forked pom.
>>>
>>> In this pom hive is built against tez 0.4.1-incubating-tez2.0-SHAPSHOT
>>>
>>> Thus I followed the instructions here:
>>>
>>> http://tez.apache.org/install_pre_0_5_0.html
>>>
>>> hive> dfs -lsr /apps
>>>     > ;
>>> lsr: DEPRECATED: Please use 'ls -R' instead.
>>> drwxr-xr-x   - ecapriolo supergroup          0 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating
>>> drwxr-xr-x   - ecapriolo supergroup          0 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib
>>> -rw-r--r--   3 ecapriolo supergroup     303139 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/avro-1.7.4.jar
>>> -rw-r--r--   3 ecapriolo supergroup      41123 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/commons-cli-1.2.jar
>>> -rw-r--r--   3 ecapriolo supergroup     610259 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/commons-collections4-4.0.jar
>>> -rw-r--r--   3 ecapriolo supergroup    1648200 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/guava-11.0.2.jar
>>> -rw-r--r--   3 ecapriolo supergroup     710492 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/guice-3.0.jar
>>> -rw-r--r--   3 ecapriolo supergroup     656365 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/hadoop-mapreduce-client-common-2.2.0.jar
>>> -rw-r--r--   3 ecapriolo supergroup    1455001 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/hadoop-mapreduce-client-core-2.2.0.jar
>>> -rw-r--r--   3 ecapriolo supergroup      21537 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/hadoop-mapreduce-client-shuffle-2.2.0.jar
>>> -rw-r--r--   3 ecapriolo supergroup      81743 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/jettison-1.3.4.jar
>>> -rw-r--r--   3 ecapriolo supergroup     533455 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/protobuf-java-2.5.0.jar
>>> -rw-r--r--   3 ecapriolo supergroup     995968 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/lib/snappy-java-1.0.4.1.jar
>>> -rw-r--r--   3 ecapriolo supergroup     752332 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/tez-api-0.4.1-incubating.jar
>>> -rw-r--r--   3 ecapriolo supergroup      34089 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/tez-common-0.4.1-incubating.jar
>>> -rw-r--r--   3 ecapriolo supergroup     980132 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/tez-dag-0.4.1-incubating.jar
>>> -rw-r--r--   3 ecapriolo supergroup     246395 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/tez-mapreduce-0.4.1-incubating.jar
>>> -rw-r--r--   3 ecapriolo supergroup     199984 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/tez-mapreduce-examples-0.4.1-incubating.jar
>>> -rw-r--r--   3 ecapriolo supergroup     114676 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/tez-runtime-internals-0.4.1-incubating.jar
>>> -rw-r--r--   3 ecapriolo supergroup     352835 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/tez-runtime-library-0.4.1-incubating.jar
>>> -rw-r--r--   3 ecapriolo supergroup       6832 2015-01-16 23:00
>>> /apps/tez-0.4.1-incubating/tez-tests-0.4.1-incubating.jar
>>>
>>> This is my tez-site.xml
>>>
>>> <configuration>
>>>   <property>
>>>     <name>tez.lib.uris</name>
>>>     <value>${fs.default.name}/apps/tez-0.4.1-incubating,${
>>> fs.default.name}/apps/tez-0.4.1-incubating/lib/</value>
>>>   </property>
>>> </configuration>
>>>
>>> [ecapriolo@production-hadoop-cdh-69-7 ~]$ ls -lahR
>>> /home/ecapriolo/tez-0.4.1-incubating/
>>> /home/ecapriolo/tez-0.4.1-incubating/:
>>> total 2.7M
>>> drwxrwxr-x 3 ecapriolo ecapriolo 4.0K Jan 16 22:54 .
>>> drwx------ 7 ecapriolo ecapriolo  20K Jan 20 15:20 ..
>>> drwxrwxr-x 2 ecapriolo ecapriolo 4.0K Jan 16 22:54 lib
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 735K Jan 16 22:54
>>> tez-api-0.4.1-incubating.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo  34K Jan 16 22:54
>>> tez-common-0.4.1-incubating.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 958K Jan 16 22:54
>>> tez-dag-0.4.1-incubating.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 241K Jan 16 22:54
>>> tez-mapreduce-0.4.1-incubating.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 196K Jan 16 22:54
>>> tez-mapreduce-examples-0.4.1-incubating.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 112K Jan 16 22:54
>>> tez-runtime-internals-0.4.1-incubating.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 345K Jan 16 22:54
>>> tez-runtime-library-0.4.1-incubating.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 6.7K Jan 16 22:54
>>> tez-tests-0.4.1-incubating.jar
>>>
>>> /home/ecapriolo/tez-0.4.1-incubating/lib:
>>> total 6.8M
>>> drwxrwxr-x 2 ecapriolo ecapriolo 4.0K Jan 16 22:54 .
>>> drwxrwxr-x 3 ecapriolo ecapriolo 4.0K Jan 16 22:54 ..
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 297K Jan 16 22:54 avro-1.7.4.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo  41K Jan 16 22:54 commons-cli-1.2.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 596K Jan 16 22:54
>>> commons-collections4-4.0.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 1.6M Jan 16 22:54 guava-11.0.2.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 694K Jan 16 22:54 guice-3.0.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 641K Jan 16 22:54
>>> hadoop-mapreduce-client-common-2.2.0.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 1.4M Jan 16 22:54
>>> hadoop-mapreduce-client-core-2.2.0.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo  22K Jan 16 22:54
>>> hadoop-mapreduce-client-shuffle-2.2.0.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo  80K Jan 16 22:54 jettison-1.3.4.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 521K Jan 16 22:54
>>> protobuf-java-2.5.0.jar
>>> -rw-rw-r-- 1 ecapriolo ecapriolo 973K Jan 16 22:54
>>> snappy-java-1.0.4.1.jar
>>>
>>> tez.sh
>>> TEZ_CONF_DIR=/home/ecapriolo
>>> TEZ_JARS=/home/ecapriolo/tez-0.4.1-incubating
>>> export HADOOP_CLASSPATH=${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*
>>> #hive -hiveconf mapreduce.framework.name=yarn-tez
>>> #hive -hiveconf hive.root.logger=DEGUG,console
>>> hive
>>>
>>> hive> set
>>> hive.execution.engine=tez;
>>> hive> select sum(viral_count) from author_article_hourly where dt=
>>> 2015011622;
>>> Total jobs = 1
>>> Launching Job 1 out of 1
>>>
>>>
>>> Status: Running (application id: application_1420748315294_70716)
>>>
>>> Map 1: -/-    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Map 1: 0/1    Reducer 2: 0/1
>>> Status: Failed
>>> Vertex failed, vertexName=Map 1,
>>> vertexId=vertex_1420748315294_70716_1_01, diagnostics=[Task failed,
>>> taskId=task_1420748315294_70716_1_01_000000,
>>> diagnostics=[AttemptID:attempt_1420748315294_70716_1_01_000000_0
>>> Info:Container container_1420748315294_70716_01_000002 COMPLETED with
>>> diagnostics set to [Exception from container-launch.
>>> Container id: container_1420748315294_70716_01_000002
>>> Exit code: 255
>>> Stack trace: ExitCodeException exitCode=255:
>>>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>>>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>>>     at
>>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>>>     at
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
>>>     at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
>>>     at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>>
>>> Other logs are here (from the containers that I can extract from yawn.
>>>
>>> Exit code: 255
>>> Stack trace: ExitCodeException exitCode=255:
>>>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>>>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>>>     at
>>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>>>     at
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
>>>     at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
>>>     at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:744)
>>>
>>> application_1420748315294_70716
>>>
>>> digraph _1 {
>>> graph [ label="_1", fontsize=24, fontname=Helvetica];
>>> node [fontsize=12, fontname=Helvetica];
>>> edge [fontsize=9, fontcolor=blue, fontname=Arial];
>>> "_1.Map_1_author_article_hourly" [ label =
>>> "Map_1[author_article_hourly]", shape = "box" ];
>>> "_1.Map_1_author_article_hourly" -> "_1.Map_1" [ label = "Input
>>> [inputClass=MRInputLegacy,\n initializer=MRInputAMSplitGenerator]" ];
>>> "_1.Reducer_2" [ label = "Reducer_2[ReduceTezProcessor]" ];
>>> "_1.Reducer_2" -> "_1.Reducer_2_out_Reducer_2" [ label = "Output
>>> [outputClass=MROutput,\n initializer=]" ];
>>> "_1.Map_1" [ label = "Map_1[MapTezProcessor]" ];
>>> "_1.Map_1" -> "_1.Reducer_2" [ label = "[input=OnFileSortedOutput,\n
>>> output=ShuffledMergedInputLegacy,\n dataMovement=SCATTER_GATHER,\n
>>> schedulingType=SEQUENTIAL]" ];
>>> "_1.Reducer_2_out_Reducer_2" [ label = "Reducer_2[out_Reducer_2]", shape
>>> = "box" ];
>>> }
>>>
>>>
>>> Container exited with a non-zero exit code 255
>>> ]], Vertex failed as one or more tasks failed. failedTasks:1]
>>> Vertex killed, vertexName=Reducer 2,
>>> vertexId=vertex_1420748315294_70716_1_00, diagnostics=[Vertex received Kill
>>> while in RUNNING state., Vertex killed as other vertex failed.
>>> failedTasks:0]
>>> DAG failed due to vertex failure. failedVertices:1 killedVertices:1,
>>> counters=Counters: 2, org.apache.tez.common.counters.DAGCounter,
>>> NUM_FAILED_TASKS=4, TOTAL_LAUNCHED_TASKS=4
>>> 2015-01-20 15:02:21,934 INFO [AsyncDispatcher event handler]
>>> org.apache.tez.dag.app.dag.impl.DAGImpl: DAG: dag_1420748315294_70716_1
>>> finished with state: FAILED
>>> 2015-01-20 15:02:21,934 INFO [AsyncDispatcher event handler]
>>> org.apache.tez.dag.app.dag.impl.DAGImpl: dag_1420748315294_70716_1
>>> transitioned from TERMINATING to FAILED
>>> 2015-01-20 15:02:21,935 INFO [AsyncDispatcher event handler]
>>> org.apache.tez.dag.app.DAGAppMaster: DAG completed,
>>> dagId=dag_1420748315294_70716_1, dagState=FAILED
>>> 2015-01-20 15:02:21,935 INFO [AsyncDispatcher event handler]
>>> org.apache.tez.common.TezUtils: Redirecting log files based on addend:
>>> dag_1420748315294_70716_1_post
>>>
>>> Has anyone got this working or have and ideas as to what is up here?
>>>
>>> Thanks,
>>> Edward
>>>
>>>
>>>
>>
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity
>> to which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>
>
>

Reply via email to