[ 
https://issues.apache.org/jira/browse/HIVE-25026?focusedWorklogId=585094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-585094
 ]

ASF GitHub Bot logged work on HIVE-25026:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 19/Apr/21 12:10
            Start Date: 19/Apr/21 12:10
    Worklog Time Spent: 10m 
      Work Description: zhangheihei opened a new pull request #2189:
URL: https://github.com/apache/hive/pull/2189


   **A Hive job generates duplicate data because the same task is resubmitted**
   ```
   2021-04-05 06:05:52 CONSOLE# Number of reduce tasks is set to 0 since there's no reduce operator
   2021-04-05 06:05:52 CONSOLE# Launching Job 5 out of 4
   2021-04-05 06:05:52 CONSOLE# Number of reduce tasks is set to 0 since there's no reduce operator
   ```
   <img src="https://user-images.githubusercontent.com/13237066/115213523-2d945800-a134-11eb-94c3-52095c748283.png" width="300" height="300">
   For example, suppose EXPLAIN shows 4 tasks for a Hive SQL query. When hive.exec.parallel=true and task2 and task3 can run in parallel (canExecuteInParallel), task4 is executed twice:
   
   1. task1 is FINISHED; task2 and task3 enter the runnable queue.
   <img src="https://user-images.githubusercontent.com/13237066/115233371-65a69580-a14a-11eb-81fb-5a0c3582e3dc.png" width="400" height="150">
   2. task2 and task3 are executed in parallel and end at the same time. Both task2 and task3 are now FINISHED.
   <img src="https://user-images.githubusercontent.com/13237066/115233876-06955080-a14b-11eb-9570-7334eff8dcad.png" width="400" height="150">
   3. task2 is removed from the running queue, and task4 enters the runnable queue.
   4. 
   4. 
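
The remaining steps are cut off in this message. As a rough illustration only, the following sketch models how a child task could be queued twice when both parents are already FINISHED before either one is taken off the running queue. It assumes a simplified scheduler: the `Task` class, `countTask4Submissions`, and the `alreadyQueued` guard are hypothetical names for this example, not Hive's Driver code and not necessarily the change made in the pull request.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

public class DuplicateSubmitSketch {

    /** Hypothetical, simplified task: a name, parents, children, and a finished flag. */
    static final class Task {
        final String name;
        final List<Task> parents = new ArrayList<>();
        final List<Task> children = new ArrayList<>();
        boolean finished;

        Task(String name) {
            this.name = name;
        }

        void addChild(Task child) {
            children.add(child);
            child.parents.add(this);
        }

        boolean allParentsFinished() {
            for (Task parent : parents) {
                if (!parent.finished) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Returns how many times task4 is placed on the runnable queue. */
    static int countTask4Submissions(boolean dedupe) {
        Task task2 = new Task("task2");
        Task task3 = new Task("task3");
        Task task4 = new Task("task4");
        task2.addChild(task4);
        task3.addChild(task4);

        Queue<Task> runnable = new ArrayDeque<>();
        Set<Task> alreadyQueued = new HashSet<>(); // assumed guard, used only when dedupe is true

        // Step 2: both parallel tasks finish before either leaves the running queue.
        task2.finished = true;
        task3.finished = true;

        // Step 3 onward (assumed): each finished task is handled in turn and its
        // children are queued if all of their parents are finished.
        for (Task done : Arrays.asList(task2, task3)) {
            for (Task child : done.children) {
                if (!child.allParentsFinished()) {
                    continue;
                }
                if (dedupe && !alreadyQueued.add(child)) {
                    continue; // child was queued by an earlier parent
                }
                runnable.add(child);
            }
        }

        int count = 0;
        for (Task queued : runnable) {
            if (queued == task4) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("without guard: " + countTask4Submissions(false)); // prints 2
        System.out.println("with guard:    " + countTask4Submissions(true));  // prints 1
    }
}
```

In this model the "all parents finished" check passes for task4 once per parent, so without the guard the child is queued twice. Remembering which tasks were already queued is one possible way to avoid that.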
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 585094)
    Remaining Estimate: 0h
            Time Spent: 10m

> hive sql result is duplicate data cause of same task resubmission
> -----------------------------------------------------------------
>
>                 Key: HIVE-25026
>                 URL: https://issues.apache.org/jira/browse/HIVE-25026
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.1.1
>            Reporter: hezhang
>            Assignee: hezhang
>            Priority: Critical
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue is the same as HIVE-24577



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
