slim bouguerra created HIVE-19155:
-------------------------------------
Summary: Daylight saving time causes Druid inserts to fail with
org.apache.hive.druid.io.druid.java.util.common.UOE: Cannot add overlapping
segments
Key: HIVE-19155
URL: https://issues.apache.org/jira/browse/HIVE-19155
Project: Hive
Issue Type: Bug
Components: Druid integration
Reporter: slim bouguerra
Assignee: slim bouguerra
If you try to insert data around the daylight saving time transition, the
query fails with the following exception:
{code}
2018-04-10T11:24:58,836 ERROR [065fdaa2-85f9-4e49-adaf-3dc14d51be90 main] exec.DDLTask: Failed
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hive.druid.io.druid.java.util.common.UOE: Cannot add overlapping segments [2015-03-08T05:00:00.000Z/2015-03-09T05:00:00.000Z and 2015-03-09T04:00:00.000Z/2015-03-10T04:00:00.000Z] with the same version [2018-04-10T11:24:48.388-07:00]
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:914) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:919) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4831) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:394) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2443) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2114) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1797) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1538) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1532) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:204) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) [hive-cli-3.1.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) [hive-cli-3.1.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) [hive-cli-3.1.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) [hive-cli-3.1.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1455) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1429) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:177) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver(TestMiniDruidCliDriver.java:59) [test-classes/:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_92]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_92]
{code}
You can reproduce this using the following statements:
{code}
create database druid_test;
use druid_test;
create table test_table(`timecolumn` timestamp, `userid` string, `num_l` float);
insert into test_table values ('2015-03-08 00:00:00', 'i1-start', 4);
insert into test_table values ('2015-03-08 23:59:59', 'i1-end', 1);
insert into test_table values ('2015-03-09 00:00:00', 'i2-start', 4);
insert into test_table values ('2015-03-09 23:59:59', 'i2-end', 1);
insert into test_table values ('2015-03-10 00:00:00', 'i3-start', 2);
insert into test_table values ('2015-03-10 23:59:59', 'i3-end', 2);
CREATE TABLE druid_table
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.segment.granularity" = "DAY")
AS
select cast(`timecolumn` as timestamp with local time zone) as `__time`,
`userid`, `num_l` FROM test_table;
{code}
The fix is to always adjust the Druid segment identifiers to UTC.
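
To illustrate why aligning to UTC avoids the overlap, here is a minimal standalone sketch using java.time. It is not the Hive or Druid code path. The class and method names are hypothetical, and the zone America/New_York is an assumption, chosen only because its offsets (UTC-5 before the 2015-03-08 transition, UTC-4 after) match the interval boundaries in the exception above. The sketch shows one way such an overlap can arise: a DAY bucket that starts at local midnight and ends a fixed 24 hours later runs past the next local midnight on the 23-hour transition day.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class SegmentIntervalSketch {
    // Assumed zone (not stated in the report); its offsets match the exception.
    static final ZoneId LOCAL = ZoneId.of("America/New_York");

    /** Start of the calendar day in the given zone, expressed as a UTC instant. */
    static Instant dayStart(LocalDate day, ZoneId zone) {
        return day.atStartOfDay(zone).toInstant();
    }

    /** Naive bucket end: the start plus a fixed 24 hours, ignoring DST. */
    static Instant naiveDayEnd(LocalDate day, ZoneId zone) {
        return dayStart(day, zone).plus(Duration.ofHours(24));
    }

    /** Two consecutive buckets overlap if the second starts before the first ends. */
    static boolean overlaps(Instant firstEnd, Instant secondStart) {
        return secondStart.isBefore(firstEnd);
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2015, 3, 8); // US DST transition day (23 hours long)
        LocalDate d2 = LocalDate.of(2015, 3, 9);

        // Buckets aligned to local midnight: the first ends at 05:00Z,
        // but the second already starts at 04:00Z.
        System.out.println(dayStart(d1, LOCAL) + "/" + naiveDayEnd(d1, LOCAL));
        System.out.println(dayStart(d2, LOCAL) + "/" + naiveDayEnd(d2, LOCAL));
        System.out.println("local overlap: "
                + overlaps(naiveDayEnd(d1, LOCAL), dayStart(d2, LOCAL)));

        // Buckets aligned to UTC midnight: every day is exactly 24 hours,
        // so consecutive buckets only touch at their boundary.
        System.out.println("UTC overlap: "
                + overlaps(naiveDayEnd(d1, ZoneOffset.UTC), dayStart(d2, ZoneOffset.UTC)));
    }
}
```

With the assumed zone, the two local-midnight buckets reproduce the boundaries reported in the exception, while the UTC-aligned buckets do not overlap.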