[ https://issues.apache.org/jira/browse/HIVE-10837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096061#comment-15096061 ]
Bohumir Zamecnik commented on HIVE-10837:
-----------------------------------------

I came across this issue on HiveServer2 1.1.1 used via Beeline on CDH-5.4.4-1.cdh5.4.4.p0.4. The problem wasn't limited to inserting; a plain SELECT failed as well. I queried quite a big partitioned table backed by SequenceFiles of Protobuf records. Each partition has about 6 billion records and is around 700 GB. A query on a single partition was OK, but querying e.g. 30 partitions (~19 TB) fails. Note that the same query executed via the Hive CLI 1.1.1 works fine. The resulting number of rows is really small (the number of partitions, e.g. <= 31), and the HQL query string itself is small.

> Running large queries (inserts) fails and crashes hiveserver2
> -------------------------------------------------------------
>
>                 Key: HIVE-10837
>                 URL: https://issues.apache.org/jira/browse/HIVE-10837
>             Project: Hive
>          Issue Type: Bug
>         Environment: Hive 1.1.0 on RHEL with Cloudera (cdh5.4.0)
>            Reporter: Patrick McAnneny
>            Priority: Critical
>
> When running a large insert statement through beeline or pyhs2, a thrift
> error is returned and hiveserver2 crashes.
> I ran into this with large insert statements -- my initial failing query was
> around 6 million characters. After further testing, however, it seems the
> failure threshold is based on the number of inserted rows rather than the
> query's size in characters. My testing puts the failure threshold between
> 199,000 and 230,000 inserted rows.
> The thrift error is as follows:
> Error: org.apache.thrift.transport.TTransportException:
> java.net.SocketException: Broken pipe (state=08S01,code=0)
> Also note for anyone testing this issue: when testing different queries I
> ran into https://issues.apache.org/jira/browse/HIVE-10836

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
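For anyone trying to reproduce the row-count threshold described above, a minimal sketch of a query generator follows. The table name and single-column schema are made up for illustration; the row counts come from the reporter's observed failure window (between 199,000 and 230,000 inserted rows). The generated SQL would be submitted via beeline or pyhs2, not executed here.

```python
# Hypothetical reproduction sketch for HIVE-10837.
# Builds a multi-row INSERT ... VALUES statement whose row count can be
# swept across the reported failure threshold. Table/column names are
# assumptions, not from the issue.

def build_large_insert(table, n_rows):
    """Return an INSERT statement with n_rows single-column VALUES rows."""
    values = ", ".join("({})".format(i) for i in range(n_rows))
    return "INSERT INTO TABLE {} VALUES {}".format(table, values)

# Per the report, ~199,000 rows still succeeded...
ok_query = build_large_insert("test_table", 199000)
# ...while ~230,000 rows triggered the Broken pipe / crash.
bad_query = build_large_insert("test_table", 230000)

print(len(ok_query), len(bad_query))
```

Sweeping `n_rows` between those two values (e.g. by bisection) would narrow down the exact threshold on a given cluster.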