This is an automated email from the ASF dual-hosted git repository.

scovich pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git


The following commit(s) were added to refs/heads/main by this push:
     new 01d34a8bee Add `append_value_n` to GenericByteBuilder (#9426)
01d34a8bee is described below

commit 01d34a8bee7fae52afd167469ef9e75ff9533309
Author: Fokko Driesprong <[email protected]>
AuthorDate: Mon Mar 2 22:50:41 2026 +0100

    Add `append_value_n` to GenericByteBuilder (#9426)
    
    # Which issue does this PR close?
    
    - Closes #9425.
    
    # Rationale for this change
    
    I noticed that this method is available on PrimitiveTypeBuilder but
    missing on GenericByteBuilder. That makes sense, since the gain is
    smaller, but benchmarking shows a solid speedup of 10% or more (8% to
    19% in the table below), mostly because of the more efficient
    allocation of the null mask.
    
    ```
    ┌───────────────────┬────────────────┬───────────────────┬─────────┐
    │     Benchmark     │ append_value_n │ append_value loop │ Speedup │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=100/len=5       │ 371 ns         │ 408 ns            │ 10%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=100/len=30      │ 456 ns         │ 507 ns            │ 10%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=100/len=1024    │ 1.81 µs        │ 1.95 µs           │ 8%      │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=1000/len=5      │ 2.39 µs        │ 2.87 µs           │ 17%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=1000/len=30     │ 3.41 µs        │ 3.89 µs           │ 12%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=1000/len=1024   │ 12.3 µs        │ 14.4 µs           │ 15%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=10000/len=5     │ 23.8 µs        │ 29.3 µs           │ 19%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=10000/len=30    │ 33.7 µs        │ 39.0 µs           │ 14%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=10000/len=1024  │ 115.9 µs       │ 135.0 µs          │ 14%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=100000/len=5    │ 227.5 µs       │ 278.6 µs          │ 18%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=100000/len=30   │ 328.1 µs       │ 377.9 µs          │ 13%     │
    ├───────────────────┼────────────────┼───────────────────┼─────────┤
    │ n=100000/len=1024 │ 1.16 ms        │ 1.34 ms           │ 14%     │
    └───────────────────┴────────────────┴───────────────────┴─────────┘
    ```
    
    I think this is still worthwhile to be added. Let me know what the
    community thinks!
    
    # What changes are included in this PR?
    
    A new public API.
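    
    To illustrate the idea, here is a minimal, std-only sketch of how a
    batched append can reserve space once and extend the validity
    information in a single call. `MiniByteBuilder`, its fields, and
    `build_example` are hypothetical names made up for this example; it
    is a simplified model, not the arrow-rs implementation, and it uses a
    `Vec<bool>` where a real builder would use a packed null bitmap.
    
    ```rust
    /// Hypothetical, simplified stand-in for a byte-array builder.
    /// This is NOT the arrow-rs `GenericByteBuilder`.
    struct MiniByteBuilder {
        values: Vec<u8>,     // concatenated bytes of all values
        offsets: Vec<i32>,   // offsets[i]..offsets[i + 1] delimits slot i
        validity: Vec<bool>, // one flag per slot (assumption: unpacked)
    }
    
    impl MiniByteBuilder {
        fn new() -> Self {
            // The offsets buffer always starts with a leading zero.
            Self { values: Vec::new(), offsets: vec![0], validity: Vec::new() }
        }
    
        /// End offset of the bytes written so far; panics on i32 overflow.
        fn next_offset(&self) -> i32 {
            i32::try_from(self.values.len()).expect("byte array offset overflow")
        }
    
        /// Appends a single non-null value.
        fn append_value(&mut self, value: &str) {
            self.values.extend_from_slice(value.as_bytes());
            self.offsets.push(self.next_offset());
            self.validity.push(true);
        }
    
        /// Appends the same value `n` times, reserving capacity up front.
        fn append_value_n(&mut self, value: &str, n: usize) {
            let bytes = value.as_bytes();
            self.values.reserve(bytes.len() * n);
            self.offsets.reserve(n);
            for _ in 0..n {
                self.values.extend_from_slice(bytes);
                self.offsets.push(self.next_offset());
            }
            // One bulk update of the validity flags instead of n pushes.
            self.validity.extend(std::iter::repeat(true).take(n));
        }
    
        /// Appends a null slot: the offset repeats and the flag is false.
        fn append_null(&mut self) {
            self.offsets.push(self.next_offset());
            self.validity.push(false);
        }
    
        /// Returns the value in slot `i`, or `None` if the slot is null.
        fn value(&self, i: usize) -> Option<&str> {
            if !self.validity[i] {
                return None;
            }
            let start = self.offsets[i] as usize;
            let end = self.offsets[i + 1] as usize;
            std::str::from_utf8(&self.values[start..end]).ok()
        }
    }
    
    /// Builds five slots: "hello", three copies of "world", then a null.
    fn build_example() -> MiniByteBuilder {
        let mut builder = MiniByteBuilder::new();
        builder.append_value("hello");
        builder.append_value_n("world", 3);
        builder.append_null();
        builder
    }
    ```
    
    In this model, `append_value_n("world", 3)` leaves the builder in the
    same state as three `append_value("world")` calls; only the number of
    capacity checks and validity updates differs.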
    
    # Are these changes tested?
    
    Yes!
    
    # Are there any user-facing changes?
    
    A new public API.
---
 arrow-array/src/builder/generic_bytes_builder.rs | 32 ++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/arrow-array/src/builder/generic_bytes_builder.rs b/arrow-array/src/builder/generic_bytes_builder.rs
index 7ed4bc5826..0a83ff989d 100644
--- a/arrow-array/src/builder/generic_bytes_builder.rs
+++ b/arrow-array/src/builder/generic_bytes_builder.rs
@@ -110,6 +110,21 @@ impl<T: ByteArrayType> GenericByteBuilder<T> {
         self.offsets_builder.push(self.next_offset());
     }
 
+    /// Appends a value of type `T` into the builder `n` times.
+    ///
+    /// See [`Self::append_value`] for more panic information.
+    #[inline]
+    pub fn append_value_n(&mut self, value: impl AsRef<T::Native>, n: usize) {
+        let bytes: &[u8] = value.as_ref().as_ref();
+        self.value_builder.reserve(bytes.len() * n);
+        self.offsets_builder.reserve(n);
+        for _ in 0..n {
+            self.value_builder.extend_from_slice(bytes);
+            self.offsets_builder.push(self.next_offset());
+        }
+        self.null_buffer_builder.append_n_non_nulls(n);
+    }
+
     /// Append an `Option` value into the builder.
     ///
     /// - A `None` value will append a null value.
@@ -939,4 +954,21 @@ mod tests {
 
         assert!(matches!(result, Err(ArrowError::OffsetOverflowError(_))));
     }
+
+    #[test]
+    fn test_append_value_n() {
+        let mut builder = GenericStringBuilder::<i32>::new();
+        builder.append_value("hello");
+        builder.append_value_n("world", 3);
+        builder.append_null();
+        let array = builder.finish();
+
+        assert_eq!(5, array.len());
+        assert_eq!(1, array.null_count());
+        assert_eq!("hello", array.value(0));
+        assert_eq!("world", array.value(1));
+        assert_eq!("world", array.value(2));
+        assert_eq!("world", array.value(3));
+        assert!(array.is_null(4));
+    }
 }
