[ https://issues.apache.org/jira/browse/HIVE-28409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868934#comment-17868934 ]
Krisztian Kasa commented on HIVE-28409:
---------------------------------------

The goal of this jira is to
 * decouple Hive and {{org.apache.atlas.hive.hook.HiveHook}} by removing the hardcoded constant and introducing a general setting to specify which statements should collect column lineage info.
 * restore column lineage info collection at create view.

> Column lineage when creating view is missing if atlas HiveHook is set
> ---------------------------------------------------------------------
>
>                 Key: HIVE-28409
>                 URL: https://issues.apache.org/jira/browse/HIVE-28409
>             Project: Hive
>          Issue Type: Bug
>          Components: lineage
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>
> Column lineage info is collected by
> {{org.apache.hadoop.hive.ql.optimizer.lineage.Generator}}. This is called
> during Hive optimizations and view creation if one of these conditions is met:
> {code:java}
> hiveConf.getBoolVar(HiveConf.ConfVars.HIVE_LINEAGE_INFO)
>     || postExecHooks.contains("org.apache.hadoop.hive.ql.hooks.PostExecutePrinter")
>     || postExecHooks.contains("org.apache.hadoop.hive.ql.hooks.LineageLogger")
>     || postExecHooks.contains("org.apache.atlas.hive.hook.HiveHook")
> {code}
> [https://github.com/apache/hive/blob/09553fca66ff69ff870c8a181750b70d81a8640e/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java#L78-L81]
> and
> [https://github.com/apache/hive/blob/09553fca66ff69ff870c8a181750b70d81a8640e/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L13226-L13228]
> However, HIVE-17125 introduced more conditions, which affect only
> {{org.apache.atlas.hive.hook.HiveHook}}:
> [https://github.com/apache/hive/blob/09553fca66ff69ff870c8a181750b70d81a8640e/ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/Generator.java#L75-L86]
>
> Later, HIVE-23244 changed the code that handles view creation. Because no
> tests cover view creation when {{org.apache.atlas.hive.hook.HiveHook}} is
> specified, the new code skips column lineage info collection in that case
> without any test failing.
> The tests we have for column lineage info collection use
> [LineageLogger.java|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/LineageLogger.java],
> which has no restriction in the Generator, so column lineage info
> is always collected when LineageLogger is set.
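
For illustration only, the condition quoted in the description and the kind of hook-independent setting this jira aims for could be sketched roughly as below. This is a standalone sketch that does not use Hive classes: {{LineageConditionSketch}}, {{StatementKind}}, {{lineageRequested}} and {{shouldCollect}} are hypothetical names, modelling {{postExecHooks}} as a set of class names is an assumption, and the actual setting name and values are not specified here.

```java
import java.util.Set;

/** Standalone sketch; it does not depend on any Hive class. */
final class LineageConditionSketch {

    // Hook class names copied from the condition quoted in the issue description.
    static final Set<String> LINEAGE_HOOKS = Set.of(
        "org.apache.hadoop.hive.ql.hooks.PostExecutePrinter",
        "org.apache.hadoop.hive.ql.hooks.LineageLogger",
        "org.apache.atlas.hive.hook.HiveHook");

    /** Hypothetical statement categories, for illustration only. */
    enum StatementKind { QUERY, CREATE_TABLE_AS_SELECT, CREATE_VIEW, OTHER }

    /** Mirrors the quoted condition: the flag is set or a known hook is configured. */
    static boolean lineageRequested(boolean lineageInfoEnabled, Set<String> postExecHooks) {
        if (lineageInfoEnabled) {
            return true;
        }
        for (String hook : postExecHooks) {
            if (LINEAGE_HOOKS.contains(hook)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Hypothetical general filter: whether lineage is collected depends on the
     * statement kind and a configured set of kinds, not on which hook is present.
     */
    static boolean shouldCollect(boolean lineageInfoEnabled, Set<String> postExecHooks,
                                 StatementKind kind, Set<StatementKind> enabledKinds) {
        return lineageRequested(lineageInfoEnabled, postExecHooks)
            && enabledKinds.contains(kind);
    }
}
```

With a filter of this shape, every hook is treated the same way, so a view creation with only the Atlas hook configured collects lineage whenever {{CREATE_VIEW}} is among the enabled kinds.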