[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken

Remus Rusanu (JIRA) Tue, 26 Nov 2013 10:02:32 -0800

    [ https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832810#comment-13832810 ]


Remus Rusanu commented on HIVE-5817:
------------------------------------

[~sershe] Let me think about that (the SELECT and other operators). My patch
handles the cases where the input batch is different from the output batch
(e.g. JOIN, GROUP BY), but it is not a good fit for the SELECT operator because
it forces a copy from the input batch to the output batch. What I uploaded is
still needed for JOIN, and it is much better than how JOIN used to handle this
issue, but you have a point that the solution may not be complete.

> column name to index mapping in VectorizationContext is broken
> --------------------------------------------------------------
>
>                 Key: HIVE-5817
>                 URL: https://issues.apache.org/jira/browse/HIVE-5817
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 0.13.0
>            Reporter: Sergey Shelukhin
>            Assignee: Remus Rusanu
>            Priority: Critical
>         Attachments: HIVE-5817-uniquecols.broken.patch, 
> HIVE-5817.00-broken.patch, HIVE-5817.4.patch, HIVE-5817.5.patch, 
> HIVE-5817.6.patch
>
>
> Columns coming from different operators may have the same internal names
> ("_colNN"). There is a query of the form {{select b.cb, a.ca from a JOIN
> b ON ... JOIN x ON ...;}} (distilled from a more complex query) that runs
> fine without vectorization. With vectorization, it runs fine for most ca, but
> for some ca it fails (or possibly returns incorrect results). The cause is
> that, when the column-to-VRG-index map is built in VectorizationContext, the
> internal column name for ca that the first map join operator adds to the
> mapping may be the same as the internal name for cb that the second one tries
> to add. The second VMJ doesn't add it (see the code in the ctor), and when it
> is time for it to produce output, it retrieves the wrong index from the map
> by name, and then the wrong vector from the VRG.
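
The collision described above can be illustrated with a small, self-contained
sketch. It is not Hive code: NameToIndex, addIfAbsent, and the index values 0
and 3 are hypothetical stand-ins chosen for illustration, and the "_col0" name
is just one instance of the "_colNN" pattern. It only assumes what the
description states, namely that a name already present in the map is not
overwritten. The per-operator variant is one possible way to avoid the stale
lookup, not necessarily what the attached patches do.

```java
import java.util.HashMap;
import java.util.Map;

public class ColumnMapSketch {

    // Hypothetical stand-in for a column-name-to-batch-index map.
    // This is NOT Hive's VectorizationContext, only an illustration.
    static final class NameToIndex {
        private final Map<String, Integer> map = new HashMap<>();

        // Assumed behavior from the description: if the internal name is
        // already mapped, the new entry is silently dropped.
        void addIfAbsent(String internalName, int batchIndex) {
            map.putIfAbsent(internalName, batchIndex);
        }

        int lookup(String internalName) {
            return map.get(internalName);
        }
    }

    // Both map joins register their output column in one shared map.
    static int sharedMapLookup() {
        NameToIndex shared = new NameToIndex();
        shared.addIfAbsent("_col0", 0); // first map join: a.ca at index 0
        shared.addIfAbsent("_col0", 3); // second map join: b.cb at index 3, dropped
        return shared.lookup("_col0");  // second join now reads a.ca's index
    }

    // Hypothetical alternative: the second map join keeps its own map.
    static int perOperatorLookup() {
        NameToIndex second = new NameToIndex();
        second.addIfAbsent("_col0", 3); // b.cb at index 3
        return second.lookup("_col0");
    }

    public static void main(String[] args) {
        System.out.println("shared map: _col0 -> " + sharedMapLookup());
        System.out.println("per-operator map: _col0 -> " + perOperatorLookup());
    }
}
```

With the shared map, the second operator resolves "_col0" to the index that
the first operator registered, which matches the "wrong index, then wrong
vector" symptom in the description.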



--
This message was sent by Atlassian JIRA
(v6.1#6144)
