[ https://issues.apache.org/jira/browse/HIVE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758063#comment-16758063 ]
Seung-Hyun Cheong commented on HIVE-21194:
------------------------------------------

[~bslim]

I'm using HDP 3.1.0. (Hive 3.1.0, Druid 0.12.1)

A query to insert data from HDFS to druid.
{code:java}
INSERT INTO TABLE druid.data_table
SELECT
  `time` AS `__time`,
  . . .
FROM hdfs.data_table
WHERE . . .{code}
The inserted segment meta

!image-2019-02-01-16-31-56-958.png!
 * The interval of the segment: UTC (Green one)
 * The version of the segment: UTC+9 (Red one, it's KST.)

To delete one of the segments I inserted
{code:java}
// Disabling the segment
DELETE /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}

// Deleting the segment
DELETE /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}{code}
Then the exception occurs
{code:java}
2019-01-30T16:58:35,354 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KillTask{id=kill_upload_2018-12-31T00:00:00.000Z_2019-02-05T00:00:00.000Z_2019-02-01T16:52:31.851Z, type=kill, dataSource=upload}]
io.druid.java.util.common.ISE: WTF?! Unused segment[upload_2019-01-01T00:00:00.000Z_2019-01-02T00:00:00.000Z_2019-01-31T01:12:32.289+09:00] has version[2019-01-31T01:12:32.289+09:00] > task version[2019-01-30T16:58:29.992Z]
	at io.druid.indexing.common.task.KillTask.run(KillTask.java:94) ~[druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
{code}
So, in KST(UTC+9), I can't delete a segment for 9 hours...

In Druid, a segment published by IndexTask has a version UTC (below)

!image-2019-02-01-16-32-17-093.png!
 * The interval of the segment: UTC
 * The version of the segment: UTC

So, there is no such problem.

And, I didn't test my patch actually... sorry for that.

I'll submit a patch again after testing.

(QA failed, because I forgot the import statement on my patch. "import org.joda.time.DateTimeZone;")

> DruidStorageHandler should set a version of segment to UTC
> ----------------------------------------------------------
>
>                 Key: HIVE-21194
>                 URL: https://issues.apache.org/jira/browse/HIVE-21194
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration
>            Reporter: Seung-Hyun Cheong
>            Assignee: Seung-Hyun Cheong
>            Priority: Minor
>         Attachments: image-2019-02-01-12-34-44-731.png,
> image-2019-02-01-12-44-22-331.png, image-2019-02-01-15-07-06-893.png,
> image-2019-02-01-16-17-36-598.png, image-2019-02-01-16-17-52-419.png,
> image-2019-02-01-16-31-56-958.png, image-2019-02-01-16-32-17-093.png
>
>
> h1. Exception while running a KillTask
> {code:java}
> 2019-01-30T16:58:35,354 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KillTask{id=kill_upload_2018-12-31T00:00:00.000Z_2019-02-05T00:00:00.000Z_2019-02-01T16:52:31.851Z, type=kill, dataSource=upload}]
> io.druid.java.util.common.ISE: WTF?! Unused segment[upload_2019-01-01T00:00:00.000Z_2019-01-02T00:00:00.000Z_2019-01-31T01:12:32.289+09:00] has version[2019-01-31T01:12:32.289+09:00] > task version[2019-01-30T16:58:29.992Z]
> 	at io.druid.indexing.common.task.KillTask.run(KillTask.java:94) ~[druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
> 	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
> 	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.1.3.1.0.0-78.jar:0.12.1.3.1.0.0-78]
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
> 	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> {code}
>
> h1. Reason
> h3. KillTask compares versions
> [KillTask.java#L88|https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillTask.java#L88]
> {code:java}
> if (unusedSegment.getVersion().compareTo(myLock.getVersion()) > 0) {
>   throw new ISE(
>       "WTF?! Unused segment[%s] has version[%s] > task version[%s]",
>       unusedSegment.getId(),
>       unusedSegment.getVersion(),
>       myLock.getVersion()
>   );
> }
> {code}
>
> h3. KillTask version (UTC, e.g. "2019-01-30T16:58:29.992Z")
> [TaskLockbox.java#L593|https://github.com/apache/incubator-druid/blob/8eae26fd4e7572060d112864dd3d5f6a865b9c89/indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskLockbox.java#L593]
> {code:java}
> version = DateTimes.nowUtc().toString();
> {code}
>
> h3. Segment version (UTC+9, e.g. "2019-01-31T01:12:32.289+09:00")
> [DruidStorageHandler.java#L755|https://github.com/apache/hive/blob/master/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java#L755]
> {code:java}
> jobProperties.put(DruidConstants.DRUID_SEGMENT_VERSION, new DateTime().toString());
> {code}
>
>
> h1. Suggestion
> h3. Because druid uses UTC only, DruidStorageHandler should set a version of segment to UTC.
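The ordering problem described under "Reason" can be reproduced with nothing more than string comparison. The sketch below is only an illustration, not the Hive or Druid code: the class and method names are hypothetical, and java.time stands in for Joda-Time so that it runs without extra dependencies (its string formatting can differ from Joda's in details such as trailing zeros). The two version strings are the ones from the log above.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Illustrative sketch only; class and method names are hypothetical.
class VersionOrderSketch {

    // Same shape as the check quoted from KillTask: the versions are
    // compared as plain strings, character by character.
    static boolean wouldRejectKill(String segmentVersion, String taskVersion) {
        return segmentVersion.compareTo(taskVersion) > 0;
    }

    // Re-expresses an ISO-8601 timestamp as the same instant in UTC.
    // Assumption: java.time is used here in place of Joda-Time.
    static String toUtcVersion(String isoTimestamp) {
        return OffsetDateTime.parse(isoTimestamp)
                .withOffsetSameInstant(ZoneOffset.UTC)
                .toString();
    }

    public static void main(String[] args) {
        String segmentVersion = "2019-01-31T01:12:32.289+09:00"; // written by the Hive insert
        String taskVersion = "2019-01-30T16:58:29.992Z";         // taken by the kill task lock

        // The KST string sorts after the UTC string, so the kill is rejected.
        System.out.println(wouldRejectKill(segmentVersion, taskVersion));

        // The same instant written in UTC sorts before the task version.
        String utcVersion = toUtcVersion(segmentVersion);
        System.out.println(utcVersion);
        System.out.println(wouldRejectKill(utcVersion, taskVersion));
    }
}
```

The segment was actually written about 46 minutes before the kill task took its lock, but its "+09:00" string still compares greater until the UTC clock catches up with the local wall-clock time, which is where the 9-hour window comes from.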