Hi,

While tuning the initial number of mappers with Hive on Tez, I found that if an
ORC table is clustered, the total length returned by the estimator is always 2^31.

Hive: 2.3.3
Tez: 0.8.4 (TezSplitGrouper.java:197)

How to replicate:

create table test (f1 string, f2 string) clustered by (f1) into 1 buckets
stored as orc tblproperties('transactional'='true');
insert into test values('s1', 's2'), ('s3', 's4');
select count(*) from test;

Search for 'Total length' in the log sys_dag_xxx; the value is 2147483648.
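
As a quick sanity check on that number, the sketch below (plain Python, not Hive or Tez code; the helper name `describe` is made up for illustration) shows how the reported value relates to the 32-bit signed integer range. It only confirms the arithmetic and does not show where the estimator gets the value from.

```python
# Value copied from the 'Total length' entry in the DAG log reported above.
reported_total_length = 2147483648

# Largest 32-bit signed integer (the same value as Java's Integer.MAX_VALUE).
INT32_MAX = 2**31 - 1


def describe(value):
    """Return how `value` relates to powers of two and the int32 range.

    Hypothetical helper for illustration only.
    """
    return {
        # A positive integer is a power of two if exactly one bit is set.
        "is_power_of_two": value > 0 and value & (value - 1) == 0,
        # Position of the highest set bit, i.e. floor(log2(value)).
        "exponent": value.bit_length() - 1,
        # How far the value sits above the int32 maximum.
        "over_int32_max_by": value - INT32_MAX,
        # The same value expressed in GiB, assuming the unit is bytes.
        "gib": value / 2**30,
    }


print(describe(reported_total_length))
```

The value is exactly 2^31, one more than the 32-bit signed maximum, and would be 2 GiB if the unit is bytes. For a two-row table that looks more like a fixed placeholder than a measured size, though that is only a guess.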

Thanks for any suggestions.

Bob He
Thanks
