[ 
https://issues.apache.org/jira/browse/HIVE-21599?focusedWorklogId=824813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-824813
 ]
ASF GitHub Bot logged work on HIVE-21599:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Nov/22 04:32
            Start Date: 10/Nov/22 04:32
    Worklog Time Spent: 10m 
      Work Description: amansinha100 commented on code in PR #3742:
URL: https://github.com/apache/hive/pull/3742#discussion_r1018639310


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java:
##########
@@ -209,6 +214,22 @@ public FilterCompat.Filter setFilter(final JobConf conf, MessageType schema) {
     }
   }
 
+  private MessageType getSchemaWithoutPartitionColumns(JobConf conf, MessageType schema) {
+    String partCols = conf.get(IOConstants.PARTITION_COLUMNS);
+    if (partCols != null && partCols.length() > 0) {
+      Set<String> partitionColumns = new HashSet<>(Arrays.asList(partCols.split(",")));
+      List<Type> newFields = new ArrayList<>();
+
+      for (Type field: schema.getFields()) {
+        if(!partitionColumns.contains(field.getName())) {

Review Comment:
   Could you confirm whether this comparison is case-insensitive? For example,
if the partition column is defined as 'part_col' and the query has a predicate
on 'PART_COL'.
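
   For reference, `HashSet.contains` on strings is case-sensitive, so one way
to make such a check case-insensitive is to lower-case both sides before
comparing. The following is only a minimal stand-alone sketch: plain strings
stand in for the Parquet field names, and the class and method names are
hypothetical and not part of the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: plain strings stand in for Parquet schema field names.
class PartitionColumnFilterSketch {

  // Returns the field names that are not partition columns, ignoring case.
  static List<String> withoutPartitionColumns(String partCols, List<String> fieldNames) {
    if (partCols == null || partCols.isEmpty()) {
      return new ArrayList<>(fieldNames);
    }
    // Normalize the partition column names once, up front.
    Set<String> partitionColumns = new HashSet<>();
    for (String col : partCols.split(",")) {
      partitionColumns.add(col.trim().toLowerCase(Locale.ROOT));
    }
    List<String> kept = new ArrayList<>();
    for (String name : fieldNames) {
      // Normalize each field name the same way before the lookup.
      if (!partitionColumns.contains(name.toLowerCase(Locale.ROOT))) {
        kept.add(name);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> fields = Arrays.asList("id", "PART_COL", "value");
    // prints [id, value]
    System.out.println(withoutPartitionColumns("part_col", fields));
  }
}
```

   Whether normalization is needed at all depends on how the column names
reach this method, which the diff above does not show.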



##########
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:
##########
@@ -4272,6 +4273,21 @@ public static void addTableSchemaToConf(Configuration conf,
       LOG.info("schema.evolution.columns and schema.evolution.columns.types not available");
     }
   }
+  public static void setPartitionColumnsToConf(Configuration conf, TableScanOperator tableScanOp) {
+    TableScanDesc scanDesc = tableScanOp.getConf();
+    if (scanDesc != null && scanDesc.getTableMetadata() != null) {
+      List<String> partitionColsList = scanDesc.getTableMetadata().getPartColNames();
+      if (!partitionColsList.isEmpty()) {
+        conf.set(IOConstants.PARTITION_COLUMNS, String.join(",", partitionColsList));
+      }
+    } else {
+      LOG.info(IOConstants.PARTITION_COLUMNS + " not available");

Review Comment:
   Is this else branch placed correctly? It seems the INFO message for
partition columns not being available should be logged when
partitionColsList.isEmpty() returns true.
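
   One possible reading of this suggestion is to fold both "nothing to set"
cases (no metadata, or an empty partition column list) into a single path that
emits the message. The sketch below is an assumption, not the PR's code: a
`Map` stands in for the Hadoop `Configuration`, `System.out` stands in for the
logger, and the key string and class name are hypothetical placeholders.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a Map stands in for the job configuration.
class PartitionColumnsConfSketch {

  // Placeholder key; the real value of IOConstants.PARTITION_COLUMNS is not shown in the diff.
  static final String PARTITION_COLUMNS_KEY = "partition.columns.sketch";

  // Sets the comma-joined partition columns, or reports that none are available.
  // Returns true only when the key was written.
  static boolean setPartitionColumns(Map<String, String> conf, List<String> partitionCols) {
    if (partitionCols != null && !partitionCols.isEmpty()) {
      conf.put(PARTITION_COLUMNS_KEY, String.join(",", partitionCols));
      return true;
    }
    // Reached both when the list is missing and when it is empty.
    System.out.println(PARTITION_COLUMNS_KEY + " not available");
    return false;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    setPartitionColumns(conf, Arrays.asList("year", "month"));
    System.out.println(conf.get(PARTITION_COLUMNS_KEY));
    setPartitionColumns(new HashMap<>(), Collections.emptyList());
  }
}
```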



##########
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:
##########
@@ -4272,6 +4273,21 @@ public static void addTableSchemaToConf(Configuration conf,
       LOG.info("schema.evolution.columns and schema.evolution.columns.types not available");
     }
   }
+  public static void setPartitionColumnsToConf(Configuration conf, TableScanOperator tableScanOp) {

Review Comment:
   nit: please add a comment for this method. Also, the unset method's name
ends in 'InConf' whereas this one ends in 'ToConf'. Can the set and unset
method names use the same suffix?





Issue Time Tracking
-------------------

    Worklog Id:     (was: 824813)
    Time Spent: 1.5h  (was: 1h 20m)

> Remove predicate on partition columns from Table Scan operator
> --------------------------------------------------------------
>
>                 Key: HIVE-21599
>                 URL: https://issues.apache.org/jira/browse/HIVE-21599
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Planning
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21599.1.patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Filter predicates are pushed to Table Scan (to be pushed to and used by the
> storage handler/input format). Such predicates can include partition
> columns, which are of no use to storage handlers or input formats. Therefore
> they should be removed from the TS filter expression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
