[jira] [Commented] (HIVE-17976) HoS: don't set output collector if there's no data to process

Rui Li (JIRA) Wed, 08 Nov 2017 00:37:35 -0800

    [ https://issues.apache.org/jira/browse/HIVE-17976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243533#comment-16243533 ]


Rui Li commented on HIVE-17976:
-------------------------------

It seems this logic should apply only to map work, not reduce work. For 
example, if we run a count() on a table with no data, we should return 0 
instead of nothing. Therefore we should set the output collector for reduce 
work even if it has no data to process. MR doesn't need to do this because 
MR reducers always end with operators that write to disk, e.g. FS (FileSink).

> HoS: don't set output collector if there's no data to process
> -------------------------------------------------------------
>
>                 Key: HIVE-17976
>                 URL: https://issues.apache.org/jira/browse/HIVE-17976
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Rui Li
>            Assignee: Rui Li
>            Priority: Minor
>         Attachments: HIVE-17976.1.patch, HIVE-17976.2.patch
>
>
> MR doesn't set an output collector if no row is processed, i.e. 
> {{ExecMapper::map}} is never called. Let's investigate whether Spark should 
> do the same.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
