terminal_type = 0: 260,000,000 rows, almost half of the whole data.
terminal_type = 25066: just 3,800 rows.
orc
tblproperties("orc.compress"="SNAPPY","orc.compress.size"="262141","orc.stripe.size"="268435456","orc.row.index.stride"="10","orc.create.index"="true","orc.bloom.filter.columns"
Hi Professor Gopal,
> Most of your ~300s looks to be the fixed overheads of setting up each task.
Maybe you are right. Perhaps the ORC indexes do work normally in Hive,
and it is only because the fixed overhead is so long that the
performance improvement is not obvious. I will check this later.