vegarsti commented on code in PR #18981:
URL: https://github.com/apache/datafusion/pull/18981#discussion_r2574789320


##########
datafusion/common/src/hash_utils.rs:
##########
@@ -535,6 +636,10 @@ fn hash_single_array(
             let array = as_union_array(array)?;
             hash_union_array(array, random_state, hashes_buffer)?;
         }
+        DataType::RunEndEncoded(_, _) => downcast_run_array! {
+            array => hash_run_array(array, random_state, hashes_buffer, rehash)?,
+            _ => unreachable!()

Review Comment:
   Maybe return `_internal_err!` here instead of `unreachable!()`, like below (line 646)?
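
   To illustrate the difference, here is a minimal std-only sketch. `SketchError`, `SketchType`, and `hash_dispatch` are hypothetical stand-ins invented for this example; they are not DataFusion's error type, Arrow's `DataType`, or the `_internal_err!` macro itself. The point is only that the fallback arm returns an `Err` rather than panicking:

   ```rust
   use std::fmt;

   /// Hypothetical stand-in for an internal error (not DataFusion's type).
   #[derive(Debug, PartialEq)]
   pub struct SketchError(pub String);

   impl fmt::Display for SketchError {
       fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
           write!(f, "Internal error: {}", self.0)
       }
   }

   /// Hypothetical stand-in for the run-end index types a dispatch might see.
   #[derive(Debug)]
   pub enum SketchType {
       Int16,
       Int32,
       Int64,
       Other,
   }

   /// Returns an error for the "impossible" arm instead of panicking, so a
   /// violated invariant surfaces as an error value rather than a crash.
   pub fn hash_dispatch(t: &SketchType) -> Result<u64, SketchError> {
       match t {
           SketchType::Int16 => Ok(16),
           SketchType::Int32 => Ok(32),
           SketchType::Int64 => Ok(64),
           // With `unreachable!()` this arm would panic instead.
           other => Err(SketchError(format!("unsupported run end type: {other:?}"))),
       }
   }
   ```

   A caller can then propagate the failure with `?` like any other error, which is the behaviour the existing fallback arm already has.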



##########
datafusion/common/src/hash_utils.rs:
##########
@@ -484,6 +484,107 @@ fn hash_fixed_list_array(
     Ok(())
 }
 
+#[cfg(not(feature = "force_hash_collisions"))]
+fn hash_run_array<R: RunEndIndexType>(
+    array: &RunArray<R>,
+    random_state: &RandomState,
+    hashes_buffer: &mut [u64],
+    rehash: bool,
+) -> Result<()> {
+    // We find the relevant runs that cover potentially sliced arrays, so we can only hash those
+    // values. Then we find the runs refer to the original runs and ensure that we apply hashes

Review Comment:
   ```suggestion
       // values. Then we find the runs that refer to the original runs and ensure that we apply hashes
   ```



##########
datafusion/common/src/hash_utils.rs:
##########
@@ -484,6 +484,107 @@ fn hash_fixed_list_array(
     Ok(())
 }
 
+#[cfg(not(feature = "force_hash_collisions"))]
+fn hash_run_array<R: RunEndIndexType>(
+    array: &RunArray<R>,
+    random_state: &RandomState,
+    hashes_buffer: &mut [u64],
+    rehash: bool,
+) -> Result<()> {
+    // We find the relevant runs that cover potentially sliced arrays, so we can only hash those
+    // values. Then we find the runs refer to the original runs and ensure that we apply hashes

Review Comment:
   Good comment, though!
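
   For anyone following along, the idea in that code comment can be sketched roughly as below. This is a simplified std-only illustration, not the PR's implementation: `hash_one` and `hash_runs_sketch` are hypothetical names, the run ends are plain `usize` values rather than a `RunArray`, `DefaultHasher` stands in for the `RandomState`-based hashing, and the `rehash` (combine with existing hashes) path is left out.

   ```rust
   use std::collections::hash_map::DefaultHasher;
   use std::hash::{Hash, Hasher};

   /// Hash one value with the standard library's default hasher
   /// (an assumed stand-in for the hashing used in the PR).
   pub fn hash_one(v: i64) -> u64 {
       let mut h = DefaultHasher::new();
       v.hash(&mut h);
       h.finish()
   }

   /// Hypothetical sketch: `run_ends[i]` is the exclusive logical end of run `i`
   /// and `values[i]` is its value; `offset` and `len` describe a logical slice.
   /// Each covered run's value is hashed once and written to every row of the
   /// slice that the run covers.
   pub fn hash_runs_sketch(
       run_ends: &[usize],
       values: &[i64],
       offset: usize,
       len: usize,
   ) -> Vec<u64> {
       let mut out = vec![0u64; len];
       let slice_end = offset + len;
       let mut run_start = 0usize;
       for (&run_end, &value) in run_ends.iter().zip(values) {
           // Clamp the run to the slice; runs fully outside it are skipped.
           let start = run_start.max(offset);
           let end = run_end.min(slice_end);
           if start < end {
               let h = hash_one(value);
               out[start - offset..end - offset].fill(h);
           }
           run_start = run_end;
           // Later runs lie entirely past the slice, so stop early.
           if run_end >= slice_end {
               break;
           }
       }
       out
   }
   ```

   With `run_ends = [2, 5, 6]`, `values = [10, 20, 30]`, `offset = 1`, and `len = 4`, only the first two values are hashed: the first row of the result takes the hash of `10` and the remaining three rows share the hash of `20`.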



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

