Hi,
Why does Hive still read so many records even with filter pushdown enabled, when the returned dataset is very small (about 4k out of 30 billion records)? Hive's RECORDS_IN counter still shows the full 30 billion count, and the MapReduce log shows lines like this:

org.apache.hadoop.hive.ql.exec.MapOperator: MAP[4]: records read - 100000

BTW, I am using Parquet as the storage format, and the filter pushdown does seem to be working, since I see this in the log:

AM INFO: parquet.filter2.compat.FilterCompat: Filtering using predicate: eq(myid, 223)

Thanks,
Keith
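
P.S. In case it helps, here is a minimal sketch of what I mean by "filter pushdown enabled" (the table name is just a placeholder, and these are the settings I believe are the relevant ones):

    -- predicate pushdown settings
    SET hive.optimize.ppd=true;
    SET hive.optimize.index.filter=true;

    -- the kind of query in question; my_parquet_table stands in for my real Parquet table
    SELECT count(*) FROM my_parquet_table WHERE myid = 223;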