When you ran the query, did the VM shut down?
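Also worth checking: the UnknownHostException on savitha-VirtualBox in the trace below suggests the Hive client cannot resolve that hostname, and the reducer's fetch failures point at the same name. A minimal sketch of the check, assuming Python is available on the client box (only the hostname comes from the log; the helper is illustrative):

```python
import socket

def resolvable(host: str) -> bool:
    """Return True if `host` resolves to an address from this machine."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        # Name lookup failed -- the same condition that surfaces as
        # java.net.UnknownHostException in the Hive/Hadoop logs.
        return False

# "savitha-VirtualBox" is the hostname taken from the stack trace below.
print(resolvable("savitha-VirtualBox"))
```

If this prints False on the machine running the Hive CLI (and on each node), a common fix is mapping savitha-VirtualBox to the VM's IP in /etc/hosts on every host in the cluster.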

On Wed, Jan 23, 2013 at 12:57 PM, venkatramanan <
venkatraman...@smartek21.com> wrote:

>  Hi,
>
> I got the following error while executing "select count(1) from
> tweettrend;":
>
> Below are the exact log messages from the JobTracker web interface.
>
> *Hive CLI Error:*
>
> Exception in thread "Thread-21" java.lang.RuntimeException: Error while
> reading from task log url
>     at
> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:240)
>     at
> org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:227)
>     at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
>     at java.lang.Thread.run(Thread.java:722)
> Caused by: java.net.UnknownHostException: savitha-VirtualBox
>     at
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
>     at java.net.Socket.connect(Socket.java:579)
>     at java.net.Socket.connect(Socket.java:528)
>     at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>     at sun.net.www.http.HttpClient.openServer(HttpClient.java:378)
>     at sun.net.www.http.HttpClient.openServer(HttpClient.java:473)
>     at sun.net.www.http.HttpClient.<init>(HttpClient.java:203)
>     at sun.net.www.http.HttpClient.New(HttpClient.java:290)
>     at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>     at
> sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:995)
>     at
> sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:931)
>     at
> sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:849)
>     at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1299)
>     at java.net.URL.openStream(URL.java:1037)
>     at
> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:192)
>     ... 3 more
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
> MapReduce Jobs Launched:
> Job 0: Map: 2  Reduce: 1   Cumulative CPU: 9.0 sec   HDFS Read: 408671053
> HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 9 seconds 0 msec
>
> *syslog logs*
>
> utCopier.copyOutput(ReduceTask.java:1394)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1326)
>
> 2013-01-23 12:15:44,884 INFO org.apache.hadoop.mapred.ReduceTask: Task 
> attempt_201301231151_0002_r_000000_0: Failed fetch #10 from 
> attempt_201301231151_0002_m_000001_0
> 2013-01-23 12:15:44,884 INFO org.apache.hadoop.mapred.ReduceTask: Failed to 
> fetch map-output from attempt_201301231151_0002_m_000001_0 even after 
> MAX_FETCH_RETRIES_PER_MAP retries...  or it is a read error,  reporting to 
> the JobTracker
> 2013-01-23 12:15:44,885 FATAL org.apache.hadoop.mapred.ReduceTask: Shuffle 
> failed with too many fetch failures and insufficient progress!Killing task 
> attempt_201301231151_0002_r_000000_0.
> 2013-01-23 12:15:44,889 WARN org.apache.hadoop.mapred.ReduceTask: 
> attempt_201301231151_0002_r_000000_0 adding host savitha-VirtualBox to 
> penalty box, next contact in 137 seconds
> 2013-01-23 12:15:44,889 INFO org.apache.hadoop.mapred.ReduceTask: 
> attempt_201301231151_0002_r_000000_0: Got 1 map-outputs from previous failures
> 2013-01-23 12:15:45,218 FATAL org.apache.hadoop.mapred.Task: 
> attempt_201301231151_0002_r_000000_0 GetMapEventsThread Ignoring exception : 
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: JvmValidate 
> Failed. Ignoring request from task: attempt_201301231151_0002_r_000000_0, 
> with JvmId: jvm_201301231151_0002_r_1079250852
>       at 
> org.apache.hadoop.mapred.TaskTracker.validateJVM(TaskTracker.java:3278)
>       at 
> org.apache.hadoop.mapred.TaskTracker.getMapCompletionEvents(TaskTracker.java:3537)
>       at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:601)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
>
>       at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>       at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>       at $Proxy1.getMapCompletionEvents(Unknown Source)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2846)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2810)
>
> 2013-01-23 12:15:45,220 FATAL org.apache.hadoop.mapred.Task: Failed to 
> contact the tasktracker
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: JvmValidate 
> Failed. Ignoring request from task: attempt_201301231151_0002_r_000000_0, 
> with JvmId: jvm_201301231151_0002_r_1079250852
>       at 
> org.apache.hadoop.mapred.TaskTracker.validateJVM(TaskTracker.java:3278)
>       at 
> org.apache.hadoop.mapred.TaskTracker.fatalError(TaskTracker.java:3520)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:601)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
>
>       at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>       at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>       at $Proxy1.fatalError(Unknown Source)
>       at org.apache.hadoop.mapred.Task.reportFatalError(Task.java:298)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2829)
>
> thanks,
> Venkat
>
> -------- Original Message --------
> Subject: Re: Configure Hive in Cluster
> Date: Thu, 17 Jan 2013 17:23:03 +0530
> From: venkatramanan <venkatraman...@smartek21.com>
> Reply-To: <user@hive.apache.org>
> To: <user@hive.apache.org>
>
>
> Can you suggest the mandatory Hive parameters and the cluster
> configuration steps?
>
> On Thursday 17 January 2013 12:56 PM, Nitin Pawar wrote:
>
> It looks like a very small cluster with very limited memory for running
> MapReduce jobs; the number of map/reduce slots on the nodes is also low, so
> only one map runs at a time.
>
>  But still, 15 minutes is a lot of time for 600 MB.
>
>
> On Thu, Jan 17, 2013 at 12:47 PM, venkatramanan <
> venkatraman...@smartek21.com> wrote:
>
>>  Below are the cluster configuration details:
>>
>> Configured Capacity         : 82.8 GB
>> DFS Used                          : 1.16 GB
>> Non DFS Used                  : 31.95 GB
>> DFS Remaining                : 49.69 GB
>> DFS Used%                      : 1.4 %
>> DFS Remaining%              : 60.01 %
>> Live Nodes <http://localhost:50070/dfsnodelist.jsp?whatNodes=LIVE> : 2
>> Dead Nodes <http://localhost:50070/dfsnodelist.jsp?whatNodes=DEAD> : 0
>> Decommissioning Nodes <http://localhost:50070/dfsnodelist.jsp?whatNodes=DECOMMISSIONING> : 0
>> Number of Under-Replicated Blocks : 0
>>
>> My Select Query is:
>>
>> "select * from tweet where Id = 810;"
>>
>> This query takes 15 min to complete
>>
>>
>>
>> On Thursday 17 January 2013 12:29 PM, Nitin Pawar wrote:
>>
>> How many nodes do you have for the select query?
>> What is your select query?
>>
>>  If it is just a "select * from table", then it does not run any MapReduce
>>  job; the time is just spent showing the data on your screen if you are
>> using that query.
>>
>>
>> On Thu, Jan 17, 2013 at 12:24 PM, venkatramanan <
>> venkatraman...@smartek21.com> wrote:
>>
>>>  I didn't set any Hive parameters, and my total table size is only 610 MB.
>>>
>>>
>>>
>>> On Thursday 17 January 2013 12:11 PM, Nitin Pawar wrote:
>>>
>>> A bit more detail on the table size and the select query will help.
>>> Also, did you set any Hive parameters?
>>>
>>>
>>> On Thu, Jan 17, 2013 at 12:12 PM, venkatramanan <
>>> venkatraman...@smartek21.com> wrote:
>>>
>>>>  Hi All,
>>>>
>>>> I am new to Apache Hive. I have created a table that points to an HDFS
>>>> folder path, and it takes 15 minutes to execute a simple "*select*"
>>>> statement. Can anyone suggest best practices and performance
>>>> improvements for Hive?
>>>>
>>>> Thanks in Advance
>>>>
>>>> Venkat
>>>>
>>>
>>>
>>>
>>>  --
>>> Nitin Pawar
>>>
>>>
>>>
>>
>>
>>  --
>> Nitin Pawar
>>
>>
>>
>
>
>  --
> Nitin Pawar
>
>
>
>
>


-- 
Nitin Pawar
