[
https://issues.apache.org/jira/browse/HIVE-21599?focusedWorklogId=824813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-824813
]
ASF GitHub Bot logged work on HIVE-21599:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 10/Nov/22 04:32
Start Date: 10/Nov/22 04:32
Worklog Time Spent: 10m
Work Description: amansinha100 commented on code in PR #3742:
URL: https://github.com/apache/hive/pull/3742#discussion_r1018639310
##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java:
##########
@@ -209,6 +214,22 @@ public FilterCompat.Filter setFilter(final JobConf conf, MessageType schema) {
     }
   }
+  private MessageType getSchemaWithoutPartitionColumns(JobConf conf, MessageType schema) {
+    String partCols = conf.get(IOConstants.PARTITION_COLUMNS);
+    if (partCols != null && partCols.length() > 0) {
+      Set<String> partitionColumns = new HashSet<>(Arrays.asList(partCols.split(",")));
+      List<Type> newFields = new ArrayList<>();
+
+      for (Type field: schema.getFields()) {
+        if(!partitionColumns.contains(field.getName())) {
Review Comment:
Could you confirm whether this comparison is case-insensitive? For example,
if the partition column is defined as 'part_col' and the query has a predicate
on 'PART_COL'.
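To make the concern concrete, here is a minimal sketch of a case-insensitive variant of the check. It is not the PR's code: the class and method names are hypothetical, plain field-name strings stand in for Parquet `Type` objects, and it assumes the partition column names arrive as a comma-separated string as in the diff above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the filtering loop in the diff; not Hive code.
public class PartitionColumnFilterSketch {

  // Returns the field names that are not partition columns, comparing
  // names case-insensitively by lowercasing both sides.
  static List<String> withoutPartitionColumns(String partCols, List<String> fieldNames) {
    if (partCols == null || partCols.isEmpty()) {
      return new ArrayList<>(fieldNames);
    }
    Set<String> partitionColumns = Arrays.stream(partCols.split(","))
        .map(name -> name.trim().toLowerCase(Locale.ROOT))
        .collect(Collectors.toSet());

    List<String> kept = new ArrayList<>();
    for (String field : fieldNames) {
      if (!partitionColumns.contains(field.toLowerCase(Locale.ROOT))) {
        kept.add(field);
      }
    }
    return kept;
  }
}
```

With a plain `HashSet<String>` and `contains(field.getName())`, as in the diff, 'PART_COL' would not match 'part_col'; normalizing both sides as above is one way to avoid that, if the names can in fact differ in case at this point.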
##########
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:
##########
@@ -4272,6 +4273,21 @@ public static void addTableSchemaToConf(Configuration conf,
       LOG.info("schema.evolution.columns and schema.evolution.columns.types not available");
     }
   }
+  public static void setPartitionColumnsToConf(Configuration conf, TableScanOperator tableScanOp) {
+    TableScanDesc scanDesc = tableScanOp.getConf();
+    if (scanDesc != null && scanDesc.getTableMetadata() != null) {
+      List<String> partitionColsList = scanDesc.getTableMetadata().getPartColNames();
+      if (!partitionColsList.isEmpty()) {
+        conf.set(IOConstants.PARTITION_COLUMNS, String.join(",", partitionColsList));
+      }
+    } else {
+      LOG.info(IOConstants.PARTITION_COLUMNS + " not available");
Review Comment:
Is this else branch placed correctly? It seems the INFO message saying that
partition columns are not available should be logged when
partitionColsList.isEmpty() returns true.
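One possible shape for that control flow is sketched below. This is an illustration only, under stated assumptions: a `Map` stands in for the Hadoop `Configuration`, a nullable list stands in for the table metadata lookup, the key string is a made-up placeholder rather than the real value of `IOConstants.PARTITION_COLUMNS`, and `System.out` stands in for the logger.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for setPartitionColumnsToConf; not Hive code.
public class PartitionConfSketch {

  // Placeholder key, assumed for this sketch only.
  static final String PARTITION_COLUMNS = "sketch.partition.columns";

  // Sets the partition column list in conf and returns true, or logs and
  // returns false when the list is missing or empty.
  static boolean setPartitionColumns(Map<String, String> conf, List<String> partCols) {
    if (partCols == null || partCols.isEmpty()) {
      // Both "no metadata" and "no partition columns" end up here,
      // so the message is logged whenever nothing is set.
      System.out.println(PARTITION_COLUMNS + " not available");
      return false;
    }
    conf.put(PARTITION_COLUMNS, String.join(",", partCols));
    return true;
  }
}
```

Folding the null check and the empty check into one early return means the message is emitted in every case where the property is left unset, which seems to be what the comment is asking for.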
##########
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:
##########
@@ -4272,6 +4273,21 @@ public static void addTableSchemaToConf(Configuration conf,
       LOG.info("schema.evolution.columns and schema.evolution.columns.types not available");
     }
   }
+  public static void setPartitionColumnsToConf(Configuration conf, TableScanOperator tableScanOp) {
Review Comment:
nit: please add a comment for this method. Also, the unset method's name ends
in 'InConf' whereas this one ends in 'ToConf'. Can the set and unset method
names use the same suffix?
Issue Time Tracking
-------------------
Worklog Id: (was: 824813)
Time Spent: 1.5h (was: 1h 20m)
> Remove predicate on partition columns from Table Scan operator
> --------------------------------------------------------------
>
> Key: HIVE-21599
> URL: https://issues.apache.org/jira/browse/HIVE-21599
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21599.1.patch
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Filter predicates are pushed to the Table Scan operator (to be pushed to and
> used by the storage handler/input format). Such predicates may reference
> partition columns, which are of no use to storage handlers or input formats.
> They should therefore be removed from the TS filter expression.