[ https://issues.apache.org/jira/browse/HIVE-25557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17421905#comment-17421905 ]
katty he commented on HIVE-25557:
---------------------------------

count(*) on MR is faster than on Tez. Normally, a count operation can read only the Parquet metadata, but in this case it reads all the data and computes the count, so I am confused. Here is the plan:

!image-2021-09-29-11-07-04-118.png!

> Hive 3.1.2 with Tez is slow to count data in parquet format
> -----------------------------------------------------------
>
>                 Key: HIVE-25557
>                 URL: https://issues.apache.org/jira/browse/HIVE-25557
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 3.1.2
>         Environment: Tez *0.10.1*
>            Reporter: katty he
>            Priority: Major
>         Attachments: image-2021-09-29-11-07-04-118.png
>
>
> Recently, I tested a query like select count(*) from table in Hive 3.1.2 with
> Tez, where the table is in Parquet format. Normally, when counting, the query
> engine can read the metadata instead of reading the full data, but in my case
> Tez cannot get the count from metadata alone; it reads the data, so it is slow.
> When counting 2 billion rows, Tez takes 500s and spends 60s initializing.
> Is that a problem?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)