Re: Hive Cli ORC table read error with limit option

Biswajit Nayak Sun, 06 Mar 2016 19:26:09 -0800

Hi Gopal,


I had already pasted the table format in this thread, but here it is again.


hive> desc formatted testdb.table_orc;
OK
# col_name              data_type               comment

row_id                  bigint
a                       int
b                       int
c                       varchar(2)
d                       bigint
e                       int
f                       bigint
g                       float
h                       int
i                       int

# Partition Information
# col_name              data_type               comment

year                    int
month                   int
day                     int

# Detailed Table Information
Database:               testdb
Owner:                  *************
CreateTime:             Mon Jan 25 22:32:22 UTC 2016
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs://***************:8020/hive/testdb.db/table_orc
Table Type:             MANAGED_TABLE
Table Parameters:
        last_modified_by        **************
        last_modified_time      **************
        orc.compress            SNAPPY
        transient_lastDdlTime   1454104669

# Storage Information
SerDe Library:          org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:            org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed:             No
Num Buckets:            7
Bucket Columns:         [f]
Sort Columns:           []
Storage Desc Params:
        field.delim             \t
        serialization.format    \t
Time taken: 0.105 seconds, Fetched: 46 row(s)
hive>


>>>Depends on whether any of those columns are partition columns or not &
>>>whether the table is marked transactional.

Yes, the columns in the filter (year, month, day) are partition columns, and
the table is not marked as transactional.
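
For anyone who wants to check the same two things on their own table, here is
a minimal Python sketch that reads pasted "desc formatted" output and reports
the partition columns and whether a "transactional" table parameter is set to
true. The function name summarize_desc_formatted is hypothetical, and the
parsing assumes the layout shown in this mail (section headers starting with
"#", one "name type" pair per line); it is not part of Hive.

```python
def summarize_desc_formatted(text):
    """Return (partition_columns, is_transactional) from `desc formatted` output."""
    partition_cols = []
    transactional = False
    in_partition_section = False
    for raw in text.splitlines():
        # Drop surrounding whitespace and the '*' emphasis markers mail clients add.
        line = raw.strip().strip("*").strip()
        if not line:
            continue
        if line.startswith("#"):
            # "# col_name ..." is a column header, not a new section,
            # so it must not change which section we are in.
            if not line.startswith("# col_name"):
                in_partition_section = line.startswith("# Partition Information")
            continue
        fields = line.split()
        if in_partition_section and len(fields) >= 2:
            # Lines in this section look like "year   int".
            partition_cols.append(fields[0])
        elif fields[0] == "transactional" and len(fields) >= 2:
            # Assumption: the flag appears as "transactional  true" under Table Parameters.
            transactional = fields[1].lower() == "true"
    return partition_cols, transactional
```

Feeding it the output pasted in this mail should list year, month and day as
partition columns and report the table as not transactional, since no
"transactional" parameter appears under Table Parameters.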


>>>Usually that and a copy of --orcfiledump output to check the offsets/types.

There are around 10 files, so pasting the orcfiledump output for all of them
would be a mess here. Is there any way to find the defective file, so that I
can isolate it and paste only its orcfiledump output?
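
One way to narrow this down, as a rough Python sketch: dump each file with
"hive --orcfiledump <path>" and compare the "Type:" line across files. The
helper names (dump_orc_file, schema_of, find_suspect_files) are hypothetical,
and the sketch assumes each dump contains a line of the form
"Type: struct<...>". Whether an odd schema is actually what triggers the
IndexOutOfBoundsException is an assumption, not something confirmed in this
thread.

```python
import re
import subprocess


def dump_orc_file(path):
    """Run `hive --orcfiledump <path>` and return stdout and stderr as one string."""
    result = subprocess.run(
        ["hive", "--orcfiledump", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def schema_of(dump_text):
    """Return the text after 'Type:' in a dump, or None if no such line exists."""
    match = re.search(r"^Type:\s*(.+)$", dump_text, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def find_suspect_files(dumps):
    """Given {path: dump text}, return the sorted paths whose schema looks odd.

    A file is flagged when its schema line is missing, is an empty struct,
    or differs from the most common schema among the files.
    """
    schemas = {path: schema_of(text) for path, text in dumps.items()}
    counts = {}
    for schema in schemas.values():
        if schema is not None:
            counts[schema] = counts.get(schema, 0) + 1
    majority = max(counts, key=counts.get) if counts else None
    return sorted(
        path
        for path, schema in schemas.items()
        if schema is None or schema == "struct<>" or schema != majority
    )
```

To use it, build the dictionary with something like
{p: dump_orc_file(p) for p in paths}, where paths is the list of files under
the affected partition directory (gathered by hand or from a directory
listing), and then post the orcfiledump output only for the flagged files.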

Thanks
Biswa


On Sat, Mar 5, 2016 at 12:21 AM, Gopal Vijayaraghavan <gop...@apache.org>
wrote:

>
> > Does anyone have any idea about this? I'm really stuck with this.
> ...
> > hive> select h from testdb.table_orc where year = 2016 and month = 1 and day > 29 limit 10;
>
> Depends on whether any of those columns are partition columns or not &
> whether the table is marked transactional.
>
> > Caused by: java.lang.IndexOutOfBoundsException: Index: 0
> > at java.util.Collections$EmptyList.get(Collections.java:3212)
> > at org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)
>
> If you need answers to rare problems, these emails need at least the table
> format ("desc formatted").
>
>
> Usually that and a copy of --orcfiledump output to check the offsets/types.
>
> Cheers,
> Gopal
>
>
>
