[ https://issues.apache.org/jira/browse/HIVE-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474570#comment-16474570 ]
Vineet Garg commented on HIVE-19474:
------------------------------------

Pushed to branch-3

> Decimal type should be casted as part of the CTAS or INSERT Clause.
> -------------------------------------------------------------------
>
> Key: HIVE-19474
> URL: https://issues.apache.org/jira/browse/HIVE-19474
> Project: Hive
> Issue Type: Bug
> Components: Druid integration
> Reporter: slim bouguerra
> Assignee: slim bouguerra
> Priority: Major
> Labels: druid
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19474.patch
>
> HIVE-18569 introduced a runtime config variable that allows indexing Decimal as Double. This leads to a messy state: the Hive metadata still says the column is decimal while it is actually stored as double. Because the column's Hive metadata type is Decimal, the logical optimizer will not push down aggregates. I tried to fix this by adding logic to the application, but it made the code very clumsy, with lots of branches. Instead, I propose to revert HIVE-18569 and let the user introduce an explicit cast. This is better: the metadata reflects the actual storage type, aggregate push-down kicks in, and no config variable or extra code is needed.
> cc [~ashutoshc] and [~nishantbangarwa]
> You can see the difference with the following DDL:
> {code:java}
> create table test_base_table(`timecolumn` timestamp, `interval_marker` string, `num_l` DECIMAL(10,2));
> insert into test_base_table values ('2015-03-08 00:00:00', 'i1-start', 4.5);
>
> set hive.druid.approx.result=true;
>
> CREATE TABLE druid_test_table
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.segment.granularity" = "DAY")
> AS
> select cast(`timecolumn` as timestamp with local time zone) as `__time`,
>   `interval_marker`, cast(`num_l` as double)
> FROM test_base_table;
>
> describe druid_test_table;
> explain select sum(num_l), min(num_l) FROM druid_test_table;
>
> CREATE TABLE druid_test_table_2
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.segment.granularity" = "DAY")
> AS
> select cast(`timecolumn` as timestamp with local time zone) as `__time`,
>   `interval_marker`, `num_l`
> FROM test_base_table;
>
> describe druid_test_table_2;
> explain select sum(num_l), min(num_l) FROM druid_test_table_2;
> {code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
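As a side note on why the metadata/storage mismatch is messy, here is a minimal Python sketch (illustrative only, not Hive or Druid code) of what happens when a value declared as DECIMAL is actually held in a double-typed column: values that are exactly representable in binary, like the 4.5 in the DDL above, survive the conversion, but most decimal fractions do not.

```python
from decimal import Decimal

# A DECIMAL(10,2) value that the metadata claims is exact...
declared = Decimal("10.10")
# ...but the column actually stores an IEEE 754 double.
stored = float(declared)

# The stored double no longer equals the declared decimal value.
print(Decimal(stored) == declared)  # False

# 4.5 (the value in the example DDL) is exactly representable as a
# double, so that particular row round-trips without loss.
exact_case = Decimal("4.50")
print(Decimal(float(exact_case)) == exact_case)  # True
```

This is one reason the proposed fix keeps the metadata honest: if the user wants double storage, the explicit `cast(... as double)` makes the loss of exactness visible in the table schema itself.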