[ 
https://issues.apache.org/jira/browse/HIVE-19703?focusedWorklogId=444963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-444963
 ]

ASF GitHub Bot logged work on HIVE-19703:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Jun/20 15:28
            Start Date: 12/Jun/20 15:28
    Worklog Time Spent: 10m 
      Work Description: belugabehr commented on a change in pull request #377:
URL: https://github.com/apache/hive/pull/377#discussion_r439487202



##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
##########
@@ -201,10 +214,16 @@ public HiveSplitGenerator(InputInitializerContext initializerContext) throws IOE
           LOG.info("The preferred split size is " + preferredSplitSize);
         }
 
+        float waves;
         // Create the un-grouped splits
-        float waves =
-          conf.getFloat(TezMapReduceSplitsGrouper.TEZ_GROUPING_SPLIT_WAVES,
-            TezMapReduceSplitsGrouper.TEZ_GROUPING_SPLIT_WAVES_DEFAULT);
+        if (numSplits.isPresent()) {
+          waves = (float)numSplits.getAsInt() / availableSlots;

Review comment:
       How about `numSplits.get().floatValue()` ?
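
The suggestion above only works if the field is an `Optional<Integer>` (as the next comment requests), since `OptionalInt` has no `get()` method. A minimal sketch of the branch under that assumption follows; the class `WavesSketch`, the method `computeWaves`, and the `configuredWaves` parameter are hypothetical names used for illustration and are not part of the Hive patch.

```java
import java.util.Optional;

public class WavesSketch {
  // Hypothetical helper that mirrors the branch in the diff; not the actual Hive code.
  public static float computeWaves(Optional<Integer> numSplits, int availableSlots,
      float configuredWaves) {
    if (numSplits.isPresent()) {
      // Integer.floatValue() converts to float before the division,
      // so no explicit (float) cast is needed.
      return numSplits.get().floatValue() / availableSlots;
    }
    // Fall back to the configured number of waves when no split count was given.
    return configuredWaves;
  }

  public static void main(String[] args) {
    System.out.println(computeWaves(Optional.of(10), 4, 1.7f));   // prints 2.5
    System.out.println(computeWaves(Optional.empty(), 4, 1.7f));  // prints 1.7
  }
}
```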

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
##########
@@ -91,9 +92,16 @@
   private final MapWork work;
   private final SplitGrouper splitGrouper = new SplitGrouper();
   private final SplitLocationProvider splitLocationProvider;
+  private final OptionalInt numSplits;

Review comment:
       Since the constructor accepts an `Integer` type anyway, can you please make this `Optional<Integer>`?

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
##########
@@ -117,6 +125,11 @@ public HiveSplitGenerator(Configuration conf, MapWork work, final boolean genera
     // initialized, which may cause it to drop events.
     // No dynamic partition pruning
     pruner = null;
+    if (numSplits == null) {

Review comment:
       If `Optional<Integer>` is used, this just becomes `Optional.ofNullable()`
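
To illustrate the two comments above, here is a minimal sketch of a constructor that takes a nullable `Integer` and stores it as `Optional<Integer>`, which makes the explicit null check unnecessary. The class `NumSplitsHolder` and its accessor are hypothetical stand-ins; the real constructor in `HiveSplitGenerator` takes more arguments.

```java
import java.util.Optional;

public class NumSplitsHolder {
  private final Optional<Integer> numSplits;

  // The Integer argument may be null when the caller does not request a split count.
  public NumSplitsHolder(Integer numSplits) {
    // ofNullable maps null to Optional.empty() and any other value to Optional.of(value).
    this.numSplits = Optional.ofNullable(numSplits);
  }

  public Optional<Integer> getNumSplits() {
    return numSplits;
  }
}
```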

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
##########
@@ -232,7 +251,7 @@ public HiveSplitGenerator(InputInitializerContext initializerContext) throws IOE
           splits[0] = new HiveInputFormat.HiveInputSplit(fileSplit, partIF);
         } else {
           // Raw splits
-          splits = inputFormat.getSplits(jobConf, (int) (availableSlots * waves));
+          splits = inputFormat.getSplits(jobConf, numSplits.orElse((int) (availableSlots * waves)));

Review comment:
       Take a look at using `Math.multiplyExact(availableSlots, waves)` here
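
One caveat when following this suggestion: `Math.multiplyExact` is overloaded only for `int` and `long` operands, and `waves` is declared as a `float` in this diff, so the call would not compile as written. The sketch below shows one possible way to get a comparable overflow check, by multiplying in `double` and narrowing with `Math.toIntExact`. The class `SplitCountSketch` and the method `desiredSplits` are hypothetical, and whether failing with an exception is preferable to the silent saturation of a plain `(int)` cast is a judgment call for the patch authors.

```java
public class SplitCountSketch {
  // Hypothetical helper; not part of the Hive patch.
  public static int desiredSplits(int availableSlots, float waves) {
    // Multiply in double precision, then truncate toward zero into a long.
    long product = (long) ((double) availableSlots * waves);
    // toIntExact throws ArithmeticException if the value does not fit in an int,
    // whereas a plain (int) cast of a float would clamp silently.
    return Math.toIntExact(product);
  }
}
```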




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 444963)
    Time Spent: 20m  (was: 10m)

> GenericUDTFGetSplits never uses num splits argument
> ---------------------------------------------------
>
>                 Key: HIVE-19703
>                 URL: https://issues.apache.org/jira/browse/HIVE-19703
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>            Reporter: Eric Wohlstadter
>            Assignee: Jaume M
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-19703.1.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The description for GenericUDTFGetSplits says
> {code}
> Returns an array of length int serialized splits for the referenced tables 
> string.
> {code}
> but the argument to control the number of splits is DOA.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
