[ https://issues.apache.org/jira/browse/HIVE-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
slim bouguerra resolved HIVE-16026.
-----------------------------------
    Resolution: Fixed

> Generated query will timeout and/or kill the druid cluster.
> -----------------------------------------------------------
>
>                 Key: HIVE-16026
>                 URL: https://issues.apache.org/jira/browse/HIVE-16026
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration
>            Reporter: slim bouguerra
>            Priority: Major
>
> Grouping by `__time` and another dimension generates a query with granularity
> NONE and an interval from 1970 to 3000. This will kill the Druid cluster
> because the Druid groupBy strategy creates a cursor for every millisecond, and
> there are a lot of milliseconds between 1970 and 3000. Hence such a query can
> be turned into a select query, with the group by then done within Hive. This
> should only happen when we don't know the `__time` granularity.
> {code}
> explain select `__time`, userid from login_druid group by `__time`, userid;
> OK
> Plan optimized by CBO.
> Stage-0
>   Fetch Operator
>     limit:-1
>     Select Operator [SEL_1]
>       Output:["_col0","_col1"]
>       TableScan [TS_0]
>         Output:["__time","userid"],properties:{"druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_user_login\",\"granularity\":\"NONE\",\"dimensions\":[\"userid\"],\"limitSpec\":{\"type\":\"default\"},\"aggregations\":[{\"type\":\"longSum\",\"name\":\"dummy_agg\",\"fieldName\":\"dummy_agg\"}],\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"]}","druid.query.type":"groupBy"}
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
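
To put a number on "a lot of milliseconds", here is a rough Python sketch that counts the 1 ms buckets covered by the interval in the generated query above. It assumes, as the description says, one time bucket per millisecond when the granularity is NONE. The helper name `millisecond_buckets` is made up for illustration and is not part of Hive or Druid.

```python
from datetime import datetime

# Interval copied from the "intervals" field of the generated groupBy query.
INTERVAL = "1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"

MS_PER_DAY = 24 * 60 * 60 * 1000


def millisecond_buckets(interval: str) -> int:
    """Count the 1 ms time buckets covered by a start/end ISO-8601 interval.

    Hypothetical helper: an upper bound on buckets, assuming one per millisecond.
    """
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    start_text, end_text = interval.split("/")
    span = datetime.strptime(end_text, fmt) - datetime.strptime(start_text, fmt)
    # Integer arithmetic on the timedelta fields avoids float rounding.
    return span.days * MS_PER_DAY + span.seconds * 1000 + span.microseconds // 1000


if __name__ == "__main__":
    print(millisecond_buckets(INTERVAL))
```

For the interval shown in the plan this comes to about 3.5 x 10^13 potential buckets, which is why falling back to a select query and grouping within Hive is the safer plan when the `__time` granularity is unknown.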