[ https://issues.apache.org/jira/browse/HIVE-11325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harsh J reassigned HIVE-11325:
------------------------------

    Assignee:     (was: Harsh J)

> Infinite loop in HiveHFileOutputFormat
> --------------------------------------
>
>                 Key: HIVE-11325
>                 URL: https://issues.apache.org/jira/browse/HIVE-11325
>             Project: Hive
>          Issue Type: Bug
>          Components: HBase Handler
>    Affects Versions: 1.0.0
>            Reporter: Harsh J
>         Attachments: HIVE-11325.patch
>
>
> No idea why {{hbase_handler_bulk.q}} does not catch this if it's being run
> regularly in Hive builds, but here's the gist of the issue:
> The condition at
> https://github.com/apache/hive/blob/master/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java#L152-L164
> indicates that we will keep looping until we find a file whose last path
> component (the name) is equal to the column family name.
> In execution, however, the iteration enters an actual infinite loop because
> the file we end up considering as the srcDir is actually the region file,
> whose name will never match the family name.
> This is an example of the IPC calls that the listing loop of a task at 100%
> progress gets stuck in:
> {code}
> 2015-07-21 10:32:20,662 TRACE [main] org.apache.hadoop.ipc.ProtobufRpcEngine: 1: Call -> cdh54.vm/172.16.29.132:8020: getListing {src: "/user/hive/warehouse/hbase_test/_temporary/1/_temporary/attempt_1436935612068_0011_m_000000_0/family/97112ac1c09548ae87bd85af072d2e8c" startAfter: "" needLocation: false}
> 2015-07-21 10:32:20,662 DEBUG [IPC Parameter Sending Thread #1] org.apache.hadoop.ipc.Client: IPC Client (1551465414) connection to cdh54.vm/172.16.29.132:8020 from hive sending #510346
> 2015-07-21 10:32:20,662 DEBUG [IPC Client (1551465414) connection to cdh54.vm/172.16.29.132:8020 from hive] org.apache.hadoop.ipc.Client: IPC Client (1551465414) connection to cdh54.vm/172.16.29.132:8020 from hive got value #510346
> 2015-07-21 10:32:20,662 DEBUG [main] org.apache.hadoop.ipc.ProtobufRpcEngine: Call: getListing took 0ms
> 2015-07-21 10:32:20,662 TRACE [main] org.apache.hadoop.ipc.ProtobufRpcEngine: 1: Response <- cdh54.vm/172.16.29.132:8020: getListing {dirList { partialListing { fileType: IS_FILE path: "" length: 863 permission { perm: 4600 } owner: "hive" group: "hive" modification_time: 1437454718130 access_time: 1437454717973 block_replication: 1 blocksize: 134217728 fileId: 33960 childrenNum: 0 storagePolicy: 0 } remainingEntries: 0 }}
> {code}
> The path we are getting out of the listing results is
> {{/user/hive/warehouse/hbase_test/_temporary/1/_temporary/attempt_1436935612068_0011_m_000000_0/family/97112ac1c09548ae87bd85af072d2e8c}},
> but instead of checking the path's parent {{family}}, we keep looping
> over its hashed filename {{97112ac1c09548ae87bd85af072d2e8c}}, which never
> matches {{family}}.
> The task therefore stays in the infinite loop until the MR framework kills it
> due to an idle task timeout (and then, since the subsequent task attempts fail
> outright, the job fails).
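The difference between checking the listed entry's own name and checking its parent's name can be sketched in isolation. This is only an illustration under stated assumptions, not the code from HiveHFileOutputFormat or from the attached patch: it uses java.nio.file.Path as a stand-in for org.apache.hadoop.fs.Path, and the class and method names (FamilyDirCheck, matchesByName, matchesByParent) are hypothetical. The path is the one from the log above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; java.nio.file.Path stands in for Hadoop's Path.
public class FamilyDirCheck {

    // Mirrors the failing comparison: last path component vs. family name.
    static boolean matchesByName(Path listed, String family) {
        Path name = listed.getFileName();
        return name != null && name.toString().equals(family);
    }

    // Mirrors the suggested comparison: the parent's name vs. family name.
    static boolean matchesByParent(Path listed, String family) {
        Path parent = listed.getParent();
        return parent != null && matchesByName(parent, family);
    }

    public static void main(String[] args) {
        Path listed = Paths.get(
            "/user/hive/warehouse/hbase_test/_temporary/1/_temporary/"
            + "attempt_1436935612068_0011_m_000000_0/family/"
            + "97112ac1c09548ae87bd85af072d2e8c");

        // The hashed file name never equals "family", so a loop that exits
        // only on this condition cannot terminate once it reaches the file.
        System.out.println(matchesByName(listed, "family"));   // prints false

        // The parent directory is the family directory.
        System.out.println(matchesByParent(listed, "family")); // prints true
    }
}
```

Since listing a file path returns that same file again (as the getListing response above shows), a name-based exit condition is re-evaluated on the same input forever, whereas the parent-based one succeeds on the first pass.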
> While checking {{getPath().getParent()}} would resolve the infinite loop, is
> the loop even necessary? Especially given that we throw exceptions if there
> are no entries or there is more than one entry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)