Hi,

I'm a long-time user of HCatalog 0.4 and am testing an upgrade to Hive / HCatalog 0.11.0, as we need windowing functions and ORC.
I'm testing the HCatLoader from Pig and am getting the exceptions below using this simple Pig script:

    sigs_in = load 'signals' using org.apache.hcatalog.pig.HCatLoader();
    describe sigs_in;
    sigs = filter sigs_in by datetime_partition == '2013-10-07_0000';
    ...

The exceptions occur in the Pig front-end processing, while trying to get the input paths. The Pig describe command returns the schema, so I know there is some communication going on between the LoadFunc and the metastore. Also, if I run:

    hcat -e "show partitions signals;"

I get the expected list of partitions on that table.

Any ideas on where to start troubleshooting this issue?

I'm using Pig 0.10 with Hive / HCatalog 0.11.0 running on Hadoop 2.0.0-cdh4.1.2. I built Hive/HCatalog from source using:

    ant clean package -Dmvn.hadoop.profile=hadoop23 -Dhadoop.mr.rev=23

Exception:

Caused by: java.io.IOException: org.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:87)
    at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:63)
    at org.apache.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:119)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:380)
    ... 17 more
Caused by: org.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.shaded.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.shaded.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.shaded.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.shaded.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.shaded.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.shaded.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions_by_filter(ThriftHiveMetastore.java:1738)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_by_filter(ThriftHiveMetastore.java:1722)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:780)
    at org.apache.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:112)
    at org.apache.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:85)
    at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:85)
    ... 20 more
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.shaded.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)

NOTE: Don't worry about the org.shaded.thrift package names; I had to build a shaded JAR for my HCatalog clients to work around Thrift version issues on my classpath. I tested the same without the shading and received the same error.
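
Since the trace ends in a read timeout while the client waits on get_partitions_by_filter, one possible first check (an assumption, not a confirmed fix) is to raise the metastore client socket timeout and see whether the call is merely slow or never returns at all. The sketch below uses the Hive property hive.metastore.client.socket.timeout; the value 120 is an arbitrary illustrative choice, and it assumes the hive-site.xml being edited is the one on the Pig client's classpath.

```xml
<!-- hive-site.xml fragment: illustrative value only, not a recommended setting -->
<property>
  <name>hive.metastore.client.socket.timeout</name>
  <!-- Assumption: interpreted as seconds by this Hive version -->
  <value>120</value>
</property>
```

If the filter call succeeds with a longer timeout, the problem is likely a slow partition-filter query on the metastore side rather than a connectivity or Thrift version issue.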