Hi,
I got the following error while executing "select count(1) from
tweettrend;".
Below are the exact log messages from the JobTracker web interface.
*Hive CLI Error:*
Exception in thread "Thread-21" java.lang.RuntimeException: Error while reading from task log url
        at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:240)
        at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:227)
        at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.UnknownHostException: savitha-VirtualBox
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
        at java.net.Socket.connect(Socket.java:579)
        at java.net.Socket.connect(Socket.java:528)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:378)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:473)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:203)
        at sun.net.www.http.HttpClient.New(HttpClient.java:290)
        at sun.net.www.http.HttpClient.New(HttpClient.java:306)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:995)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:931)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:849)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1299)
        at java.net.URL.openStream(URL.java:1037)
        at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:192)
        ... 3 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 2  Reduce: 1  Cumulative CPU: 9.0 sec  HDFS Read: 408671053  HDFS Write: 0  FAIL
Total MapReduce CPU Time Spent: 9 seconds 0 msec
*_syslog logs_*
utCopier.copyOutput(ReduceTask.java:1394)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1326)
2013-01-23 12:15:44,884 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201301231151_0002_r_000000_0: Failed fetch #10 from attempt_201301231151_0002_m_000001_0
2013-01-23 12:15:44,884 INFO org.apache.hadoop.mapred.ReduceTask: Failed to fetch map-output from attempt_201301231151_0002_m_000001_0 even after MAX_FETCH_RETRIES_PER_MAP retries... or it is a read error, reporting to the JobTracker
2013-01-23 12:15:44,885 FATAL org.apache.hadoop.mapred.ReduceTask: Shuffle failed with too many fetch failures and insufficient progress! Killing task attempt_201301231151_0002_r_000000_0.
2013-01-23 12:15:44,889 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201301231151_0002_r_000000_0 adding host savitha-VirtualBox to penalty box, next contact in 137 seconds
2013-01-23 12:15:44,889 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201301231151_0002_r_000000_0: Got 1 map-outputs from previous failures
2013-01-23 12:15:45,218 FATAL org.apache.hadoop.mapred.Task: attempt_201301231151_0002_r_000000_0 GetMapEventsThread Ignoring exception :
org.apache.hadoop.ipc.RemoteException: java.io.IOException: JvmValidate Failed. Ignoring request from task: attempt_201301231151_0002_r_000000_0, with JvmId: jvm_201301231151_0002_r_1079250852
        at org.apache.hadoop.mapred.TaskTracker.validateJVM(TaskTracker.java:3278)
        at org.apache.hadoop.mapred.TaskTracker.getMapCompletionEvents(TaskTracker.java:3537)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
        at org.apache.hadoop.ipc.Client.call(Client.java:1070)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at $Proxy1.getMapCompletionEvents(Unknown Source)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2846)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2810)
2013-01-23 12:15:45,220 FATAL org.apache.hadoop.mapred.Task: Failed to contact the tasktracker
org.apache.hadoop.ipc.RemoteException: java.io.IOException: JvmValidate Failed. Ignoring request from task: attempt_201301231151_0002_r_000000_0, with JvmId: jvm_201301231151_0002_r_1079250852
        at org.apache.hadoop.mapred.TaskTracker.validateJVM(TaskTracker.java:3278)
        at org.apache.hadoop.mapred.TaskTracker.fatalError(TaskTracker.java:3520)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
        at org.apache.hadoop.ipc.Client.call(Client.java:1070)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at $Proxy1.fatalError(Unknown Source)
        at org.apache.hadoop.mapred.Task.reportFatalError(Task.java:298)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2829)
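For what it's worth, both traces bottom out in the same symptom: the hostname savitha-VirtualBox does not resolve (java.net.UnknownHostException), which would also explain the failed shuffle fetches from that host in the syslog above. On a small or single-box cluster, the usual fix is to map that hostname in /etc/hosts on every node; a sketch of the entry (the IP address here is illustrative, not taken from the logs):

```
# /etc/hosts on every node (192.168.1.10 is an assumed IP; use the VM's real address)
127.0.0.1      localhost
192.168.1.10   savitha-VirtualBox
```

After editing, a ping of savitha-VirtualBox from each node should succeed before restarting the Hadoop daemons.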
thanks,
Venkat
-------- Original Message --------
Subject: Re: Configure Hive in Cluster
Date: Thu, 17 Jan 2013 17:23:03 +0530
From: venkatramanan <venkatraman...@smartek21.com>
Reply-To: <user@hive.apache.org>
To: <user@hive.apache.org>
Can you suggest the mandatory Hive parameters and the cluster
configuration steps?
On Thursday 17 January 2013 12:56 PM, Nitin Pawar wrote:
Looks like a very small cluster with very limited memory for running
MapReduce jobs; also, the number of map/reduce slots on the nodes is
low, so only one map runs at a time.
But still, 15 minutes is a lot of time for 600 MB.
On Thu, Jan 17, 2013 at 12:47 PM, venkatramanan
<venkatraman...@smartek21.com> wrote:
Below are the cluster configuration details:
Configured Capacity : 82.8 GB
DFS Used : 1.16 GB
Non DFS Used : 31.95 GB
DFS Remaining : 49.69 GB
DFS Used% : 1.4 %
DFS Remaining% : 60.01 %
Live Nodes <http://localhost:50070/dfsnodelist.jsp?whatNodes=LIVE> : 2
Dead Nodes <http://localhost:50070/dfsnodelist.jsp?whatNodes=DEAD> : 0
Decommissioning Nodes <http://localhost:50070/dfsnodelist.jsp?whatNodes=DECOMMISSIONING> : 0
Number of Under-Replicated Blocks : 0
My select query is:
"select * from tweet where Id = 810;"
This query takes 15 minutes to complete.
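A point lookup like this scans the whole table, since nothing tells Hive which files could contain Id = 810. One common mitigation is partitioning, so that queries filtering on the partition column read only the matching directories. A hypothetical sketch (the table name, columns, and partition column are illustrative, not from the thread):

```sql
-- Hypothetical sketch: partition the data by a coarse column such as date,
-- so Hive prunes whole directories instead of scanning every file.
CREATE TABLE tweet_partitioned (Id BIGINT, content STRING)
PARTITIONED BY (dt STRING);

-- A filter on the partition column limits the scan to one directory:
SELECT * FROM tweet_partitioned WHERE dt = '2013-01-17' AND Id = 810;
```

Whether this helps depends on having a column the queries naturally filter on; for purely random Id lookups, Hive of this era offers little beyond a full scan.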
On Thursday 17 January 2013 12:29 PM, Nitin Pawar wrote:
How many nodes do you have for the select query?
What's your select query?
If it's just a "select * from table", it does not run any MapReduce
job, so it's just taking time to display the data on your screen.
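The short-circuit Nitin describes (serving simple SELECTs as a plain fetch with no MapReduce job) is governed by a configuration property in stock Hive; a hedged sketch of how it can be toggled (availability and default value depend on the Hive version):

```sql
-- When fetch-task conversion applies, simple SELECTs are read directly
-- from HDFS without launching a MapReduce job.
SET hive.fetch.task.conversion=more;
SELECT * FROM tweet LIMIT 10;
```

Adding a WHERE clause or an aggregate such as count(1) generally forces a real MapReduce job regardless.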
On Thu, Jan 17, 2013 at 12:24 PM, venkatramanan
<venkatraman...@smartek21.com> wrote:
I didn't set any Hive parameters, and my total table size is
only 610 MB.
On Thursday 17 January 2013 12:11 PM, Nitin Pawar wrote:
A bit more detail on the table size and the select query will help.
Also, did you set any Hive parameters?
On Thu, Jan 17, 2013 at 12:12 PM, venkatramanan
<venkatraman...@smartek21.com> wrote:
Hi All,
I'm a newbie to Apache Hive. I have created a table that points
to an HDFS folder path, and it takes 15 minutes to execute a
simple "*select*" statement. Can anyone suggest best practices
and performance improvements for Hive?
Thanks in advance,
Venkat
--
Nitin Pawar