Hi,

I'm a long-time user of HCatalog 0.4 and am testing an upgrade to Hive /
HCatalog 0.11.0, as we need windowing functions and ORC.

I'm testing the HCatLoader from Pig and am getting the exceptions below
using this simple Pig script:

sigs_in = load 'signals' using org.apache.hcatalog.pig.HCatLoader();
describe sigs_in;
sigs = filter sigs_in by datetime_partition == '2013-10-07_0000';
...

The exceptions (see below) occur during Pig front-end processing, while it is
trying to get the input paths. The Pig describe command returns the schema, so
I know there is some communication going on between the LoadFunc and the
metastore. Also, if I run hcat -e "show partitions signals;" I get the list of
expected partitions on that table.

Any ideas on where to start troubleshooting this issue? I'm using Pig 0.10
with Hive / HCatalog 0.11.0 running on Hadoop 2.0.0-cdh4.1.2.

I built Hive/HCatalog from source using:

ant clean package -Dmvn.hadoop.profile=hadoop23 -Dhadoop.mr.rev=23

Exception:

Caused by: java.io.IOException: org.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:87)
    at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:63)
    at org.apache.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:119)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:380)
    ... 17 more
Caused by: org.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.shaded.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.shaded.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.shaded.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.shaded.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.shaded.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.shaded.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions_by_filter(ThriftHiveMetastore.java:1738)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_by_filter(ThriftHiveMetastore.java:1722)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:780)
    at org.apache.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:112)
    at org.apache.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:85)
    at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:85)
    ... 20 more
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.shaded.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)


NOTE: Don't worry about the org.shaded.thrift package names. I had to build
a shaded JAR for my HCatalog clients to work around Thrift version issues on
my classpath. I tested the same without the shading and received the same
error.
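
Since the innermost cause is a client-side read timeout inside
get_partitions_by_filter, one setting that may be relevant is the metastore
client socket timeout. The fragment below is only a sketch of what a
client-side hive-site.xml entry could look like. It assumes that the property
hive.metastore.client.socket.timeout is honored by this build and that the
file is on the classpath seen by the Pig front end. The value of 120 seconds
is purely illustrative, not a recommendation.

```xml
<!-- Sketch only: client-side hive-site.xml entry (assumed to be picked up by
     the HCatalog client). The value is in seconds and is an illustrative
     guess, not a tuned setting. -->
<property>
  <name>hive.metastore.client.socket.timeout</name>
  <value>120</value>
</property>
```

If a larger timeout makes the error go away, that would suggest the partition
filter call is slow on the metastore side rather than failing outright.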
