[ https://issues.apache.org/jira/browse/HIVE-23022?focusedWorklogId=402790&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-402790 ]

ASF GitHub Bot logged work on HIVE-23022:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Mar/20 11:19
            Start Date: 13/Mar/20 11:19
    Worklog Time Spent: 10m 
      Work Description: ShubhamChaurasia commented on pull request #954: 
HIVE-23022: Arrow deserializer should ensure size of hive vector equal to arrow 
vector
URL: https://github.com/apache/hive/pull/954
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 402790)
    Remaining Estimate: 0h
            Time Spent: 10m

> Arrow deserializer should ensure size of hive vector equal to arrow vector
> --------------------------------------------------------------------------
>
>                 Key: HIVE-23022
>                 URL: https://issues.apache.org/jira/browse/HIVE-23022
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Serializers/Deserializers
>            Reporter: Shubham Chaurasia
>            Assignee: Shubham Chaurasia
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Arrow deserializer ({{org.apache.hadoop.hive.ql.io.arrow.Deserializer}}) in 
> some cases does not set the size of the Hive vector correctly. The Hive 
> vector must be sized at least as large as the Arrow vector so that it can 
> accommodate all of the Arrow vector's values (see the sketch after the 
> stack trace below).
> The following exception is thrown when reading (via 
> {{LlapArrowRowInputFormat}}) a table that contains complex types 
> (specifically a struct nested in an array) and has more rows than the 
> default batch/vector size of 1024.
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.readStruct(Deserializer.java:440)
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.read(Deserializer.java:143)
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.readList(Deserializer.java:394)
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.read(Deserializer.java:137)
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.deserialize(Deserializer.java:122)
>   at org.apache.hadoop.hive.ql.io.arrow.ArrowColumnarBatchSerDe.deserialize(ArrowColumnarBatchSerDe.java:284)
>   at org.apache.hadoop.hive.llap.LlapArrowRowRecordReader.next(LlapArrowRowRecordReader.java:75)
>   ... 23 more
> {code}
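
A minimal sketch of the sizing check described above, assuming Hive's 
{{ColumnVector.ensureSize(int, boolean)}} and Arrow's 
{{ValueVector.getValueCount()}}; the helper name below is hypothetical, and 
this only illustrates the idea, not the actual change in PR #954:

{code:java}
import org.apache.arrow.vector.FieldVector;
import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;

// Hypothetical helper (not part of the patch): grow the target Hive vector so
// it can hold every entry of the source Arrow vector before values are copied.
public class VectorSizeCheck {
  static void ensureHiveVectorSize(ColumnVector hiveVector, FieldVector arrowVector) {
    int arrowSize = arrowVector.getValueCount();
    // ensureSize enlarges the backing arrays if needed; 'true' preserves existing data.
    hiveVector.ensureSize(arrowSize, true);
  }
}
{code}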



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
