Hi All,

Updating further on this issue: I did some more debugging and can now see
some of the threads that are using high CPU.
The operation I tried is listPartition on a table that has a huge number
of partitions (around 1.5 million).
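
For context, this is roughly the call involved. A minimal sketch, assuming a
HiveMetaStoreClient pointed at the Thrift server; the metastore URI and the
database/table names are illustrative placeholders, not our real ones:

import java.util.List;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

public class ListPartitionsProbe {
    public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        // Illustrative URI; in our setup this would point at the ELB
        // in front of the two Thrift servers.
        conf.set("hive.metastore.uris", "thrift://metastore.example.com:9083");
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        try {
            // maxParts = -1 requests all partitions; on a table with
            // ~1.5 million partitions this pulls one huge Thrift
            // response through a single server worker thread.
            List<Partition> parts =
                client.listPartitions("default", "big_table", (short) -1);
            System.out.println("Fetched " + parts.size() + " partitions");
        } finally {
            client.close();
        }
    }
}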

Here is one of the threads that was using a large chunk of CPU:

838 "pool-1-thread-141" prio=10 tid=0x00007fa51c8d3800 nid=0x2fe runnable
[0x00007fa4f8b8a000]
 839    java.lang.Thread.State: RUNNABLE
 840     at java.net.SocketInputStream.socketRead0(Native Method)
 841     at java.net.SocketInputStream.read(SocketInputStream.java:150)
 842     at java.net.SocketInputStream.read(SocketInputStream.java:121)
 843     at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
 844     at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
 845     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
 846     - locked <0x0000000564aba418> (a java.io.BufferedInputStream)
 847     at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
 848     at
org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 849     at
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
 850     at
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 851     at
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
 852     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
 853     at
org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48)
 854     at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
 855     at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 856     at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 857     at java.lang.Thread.run(Thread.java:722)

Regards,
Manish


On Tue, Jan 20, 2015 at 5:58 PM, Manish Malhotra <
manish.hadoop.w...@gmail.com> wrote:

> Hi All,
>
> I'm using the Hive Thrift Server in production, which at peak handles
> around 500 req/min.
> After a certain point the Hive Thrift Server goes into a no-response mode
> and throws the following exception:
> "org.apache.hadoop.hive.ql.metadata.HiveException:
> org.apache.thrift.transport.TTransportException:
> java.net.SocketTimeoutException: Read timed out"
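>
> For reference on where the "Read timed out" comes from: the client's read
> blocks on the socket until the Thrift transport's timeout expires. A
> minimal sketch of the same call path with an explicit socket timeout;
> the host, port, and the 20s value are illustrative, not our actual
> settings:
>
> import org.apache.hadoop.hive.metastore.api.Database;
> import org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore;
> import org.apache.thrift.protocol.TBinaryProtocol;
> import org.apache.thrift.transport.TSocket;
> import org.apache.thrift.transport.TTransport;
>
> public class MetastoreTimeoutProbe {
>     public static void main(String[] args) throws Exception {
>         // The third argument is the socket timeout in ms; a read that
>         // blocks longer throws java.net.SocketTimeoutException, which
>         // Thrift wraps in a TTransportException as in the exception above.
>         TTransport transport = new TSocket("metastore.example.com", 9083, 20000);
>         transport.open();
>         ThriftHiveMetastore.Client client =
>             new ThriftHiveMetastore.Client(new TBinaryProtocol(transport));
>         Database db = client.get_database("default");
>         System.out.println("Metastore responded: " + db.getName());
>         transport.close();
>     }
> }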
>
> As the metastore backend we are using MySQL, which is what the Thrift
> server talks to.
> The design / architecture is like this:
>
> Oozie --> Hive Action --> ELB (AWS) --> Hive Thrift (2 servers) -->
> MySQL (Master) --> MySQL (Slave).
>
> Software versions:
>
>    Hive version : 0.10.0
>    Hadoop: 1.2.1
>
>
> Looks like when the load goes beyond some threshold for certain
> operations, it has problems responding.
> As the Hive jobs sometimes fail because of this issue, we also have an
> auto-restart check: if the Thrift server is not responding, it stops /
> kills and restarts the service.
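>
> (The check itself is just a liveness probe; a minimal sketch of the idea,
> where the host/port and the restart script path are illustrative
> placeholders, not our real deployment:)
>
> import java.net.InetSocketAddress;
> import java.net.Socket;
>
> public class ThriftWatchdog {
>     public static void main(String[] args) throws Exception {
>         try (Socket s = new Socket()) {
>             // Fail fast if the server cannot even accept a connection.
>             // Note a wedged server may still accept, so a cheap metastore
>             // call with a short timeout is a stronger probe.
>             s.connect(new InetSocketAddress("localhost", 9083), 5000);
>         } catch (Exception e) {
>             // Hypothetical restart hook; ours is an external script.
>             Runtime.getRuntime().exec(
>                 new String[]{"/usr/local/bin/restart-hive-thrift.sh"});
>         }
>     }
> }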
>
> Other tuning done:
>
> Thrift Server:
>
> Given an 11 GB heap, and configured the CMS GC algorithm.
>
> MySQL:
>
> Tuned the innodb_buffer, tmp_table, and max_heap parameters.
>
> So, can somebody please help me understand what could be the root cause
> of this, or has somebody faced a similar issue?
>
> I found one related JIRA :
> https://issues.apache.org/jira/browse/HCATALOG-541
>
> But that JIRA describes the Hive Thrift Server hitting an OOM error,
> and I didn't see any OOM error in my case.
>
>
> Regards,
> Manish
>
> Full Exception Stack:
>
>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:412)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:399)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:736)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
>     at $Proxy7.getDatabase(Unknown Source)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1110)
>     at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> Caused by: java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:150)
>     at java.net.SocketInputStream.read(SocketInputStream.java:121)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>     ... 34 more
> 2015-01-20 22:44:12,978 ERROR exec.Task (SessionState.java:printError(401)) - FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
> org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>     at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1114)
>     at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
>
>
