This is an automated email from the ASF dual-hosted git repository.
scovich pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git
The following commit(s) were added to refs/heads/main by this push:
new 01d34a8bee Add `append_value_n` to GenericByteBuilder (#9426)
01d34a8bee is described below
commit 01d34a8bee7fae52afd167469ef9e75ff9533309
Author: Fokko Driesprong <[email protected]>
AuthorDate: Mon Mar 2 22:50:41 2026 +0100
Add `append_value_n` to GenericByteBuilder (#9426)
# Which issue does this PR close?
- Closes #9425.
# Rationale for this change
I noticed that this method is available on `PrimitiveBuilder` but
missing on `GenericByteBuilder`. That makes sense, since the gain is
smaller, but benchmarking still shows a solid improvement of roughly
8-19% (see the table below), mostly because of the more efficient
allocation of the null mask.
```
┌───────────────────┬────────────────┬───────────────────┬─────────┐
│ Benchmark │ append_value_n │ append_value loop │ Speedup │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=100/len=5 │ 371 ns │ 408 ns │ 10% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=100/len=30 │ 456 ns │ 507 ns │ 10% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=100/len=1024 │ 1.81 µs │ 1.95 µs │ 8% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=1000/len=5 │ 2.39 µs │ 2.87 µs │ 17% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=1000/len=30 │ 3.41 µs │ 3.89 µs │ 12% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=1000/len=1024 │ 12.3 µs │ 14.4 µs │ 15% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=10000/len=5 │ 23.8 µs │ 29.3 µs │ 19% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=10000/len=30 │ 33.7 µs │ 39.0 µs │ 14% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=10000/len=1024 │ 115.9 µs │ 135.0 µs │ 14% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=100000/len=5 │ 227.5 µs │ 278.6 µs │ 18% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=100000/len=30 │ 328.1 µs │ 377.9 µs │ 13% │
├───────────────────┼────────────────┼───────────────────┼─────────┤
│ n=100000/len=1024 │ 1.16 ms │ 1.34 ms │ 14% │
└───────────────────┴────────────────┴───────────────────┴─────────┘
```
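One plausible reading of the null-mask gain is that marking `n` slots as valid in one call replaces `n` separate branch-and-update steps. The sketch below illustrates that idea with a hypothetical `LazyNullMask` type that only materializes its bitmap once a null arrives. It is a simplified model written for this description, not the arrow-rs `NullBufferBuilder`, and every name in it is an assumption.

```rust
/// Hypothetical, simplified validity mask (not the arrow-rs type).
/// While no null has been appended, only a length counter is tracked.
#[derive(Default)]
pub struct LazyNullMask {
    /// Number of valid slots appended before any null was seen.
    len: usize,
    /// One entry per slot (`true` = valid), allocated on the first null.
    bits: Option<Vec<bool>>,
}

impl LazyNullMask {
    /// Mark a single slot as valid.
    pub fn append_non_null(&mut self) {
        self.append_n_non_nulls(1);
    }

    /// Mark `n` slots as valid with one branch and one update.
    pub fn append_n_non_nulls(&mut self, n: usize) {
        match &mut self.bits {
            Some(bits) => {
                let new_len = bits.len() + n;
                bits.resize(new_len, true);
            }
            None => self.len += n,
        }
    }

    /// Mark a single slot as null, materializing the mask if needed.
    pub fn append_null(&mut self) {
        let len = self.len;
        let bits = self.bits.get_or_insert_with(|| vec![true; len]);
        bits.push(false);
    }

    /// Total number of slots appended so far.
    pub fn len(&self) -> usize {
        match &self.bits {
            Some(bits) => bits.len(),
            None => self.len,
        }
    }

    /// Number of slots marked as null.
    pub fn null_count(&self) -> usize {
        match &self.bits {
            Some(bits) => bits.iter().filter(|valid| !**valid).count(),
            None => 0,
        }
    }

    /// Whether the per-slot mask has been allocated yet.
    pub fn is_materialized(&self) -> bool {
        self.bits.is_some()
    }
}
```

In this model, a builder that appends the same value many times pays for the validity bookkeeping once per batch instead of once per element.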
I think this is still worth adding. Let me know what the community
thinks!
# What changes are included in this PR?
A new public method, `GenericByteBuilder::append_value_n`, which appends
the same value `n` times.
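To make the semantics concrete without depending on the arrow crates, here is a minimal standard-library sketch of a variable-length byte builder with both a single-value and a repeated-value append. `ToyByteBuilder` and all of its methods are hypothetical stand-ins that loosely follow the shape of the change in this commit; they are not the arrow-rs types.

```rust
/// Hypothetical stand-in for a variable-length byte builder: one contiguous
/// value buffer, an offsets vector, and a per-slot validity vector.
pub struct ToyByteBuilder {
    values: Vec<u8>,
    offsets: Vec<i32>,
    validity: Vec<bool>,
}

impl ToyByteBuilder {
    pub fn new() -> Self {
        // Offsets start with a single 0, so slot `i` spans
        // offsets[i]..offsets[i + 1] in the value buffer.
        Self { values: Vec::new(), offsets: vec![0], validity: Vec::new() }
    }

    /// End of the value buffer as an `i32` offset; panics on overflow.
    fn next_offset(&self) -> i32 {
        i32::try_from(self.values.len()).expect("byte array offset overflow")
    }

    /// Append one value.
    pub fn append_value(&mut self, value: impl AsRef<[u8]>) {
        self.values.extend_from_slice(value.as_ref());
        let offset = self.next_offset();
        self.offsets.push(offset);
        self.validity.push(true);
    }

    /// Append the same value `n` times: reserve once, then update the
    /// validity vector in a single call after the loop.
    pub fn append_value_n(&mut self, value: impl AsRef<[u8]>, n: usize) {
        let bytes: &[u8] = value.as_ref();
        self.values.reserve(bytes.len() * n);
        self.offsets.reserve(n);
        for _ in 0..n {
            self.values.extend_from_slice(bytes);
            let offset = self.next_offset();
            self.offsets.push(offset);
        }
        let new_len = self.validity.len() + n;
        self.validity.resize(new_len, true);
    }

    /// Append a null slot: the offset repeats and the slot is marked invalid.
    pub fn append_null(&mut self) {
        let offset = self.next_offset();
        self.offsets.push(offset);
        self.validity.push(false);
    }

    /// Number of slots (valid or null) appended so far.
    pub fn len(&self) -> usize {
        self.validity.len()
    }

    /// All offsets, including the leading 0.
    pub fn offsets(&self) -> &[i32] {
        &self.offsets
    }

    /// Bytes stored in slot `i`, or `None` if the slot is null.
    pub fn value(&self, i: usize) -> Option<&[u8]> {
        if !self.validity[i] {
            return None;
        }
        let (start, end) = (self.offsets[i] as usize, self.offsets[i + 1] as usize);
        Some(&self.values[start..end])
    }
}
```

Calling `append_value_n("world", 3)` on this toy builder leaves it in the same state as three consecutive `append_value("world")` calls; the difference is only in how often capacity and validity are updated.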
# Are these changes tested?
Yes, by a new unit test, `test_append_value_n`.
# Are there any user-facing changes?
Yes: the new public method `GenericByteBuilder::append_value_n`. No
existing API changes.
---
arrow-array/src/builder/generic_bytes_builder.rs | 32 ++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/arrow-array/src/builder/generic_bytes_builder.rs b/arrow-array/src/builder/generic_bytes_builder.rs
index 7ed4bc5826..0a83ff989d 100644
--- a/arrow-array/src/builder/generic_bytes_builder.rs
+++ b/arrow-array/src/builder/generic_bytes_builder.rs
@@ -110,6 +110,21 @@ impl<T: ByteArrayType> GenericByteBuilder<T> {
         self.offsets_builder.push(self.next_offset());
     }
 
+    /// Appends a value of type `T` into the builder `n` times.
+    ///
+    /// See [`Self::append_value`] for more panic information.
+    #[inline]
+    pub fn append_value_n(&mut self, value: impl AsRef<T::Native>, n: usize) {
+        let bytes: &[u8] = value.as_ref().as_ref();
+        self.value_builder.reserve(bytes.len() * n);
+        self.offsets_builder.reserve(n);
+        for _ in 0..n {
+            self.value_builder.extend_from_slice(bytes);
+            self.offsets_builder.push(self.next_offset());
+        }
+        self.null_buffer_builder.append_n_non_nulls(n);
+    }
+
     /// Append an `Option` value into the builder.
     ///
     /// - A `None` value will append a null value.
@@ -939,4 +954,21 @@ mod tests {
         assert!(matches!(result, Err(ArrowError::OffsetOverflowError(_))));
     }
+
+    #[test]
+    fn test_append_value_n() {
+        let mut builder = GenericStringBuilder::<i32>::new();
+        builder.append_value("hello");
+        builder.append_value_n("world", 3);
+        builder.append_null();
+        let array = builder.finish();
+
+        assert_eq!(5, array.len());
+        assert_eq!(1, array.null_count());
+        assert_eq!("hello", array.value(0));
+        assert_eq!("world", array.value(1));
+        assert_eq!("world", array.value(2));
+        assert_eq!("world", array.value(3));
+        assert!(array.is_null(4));
+    }
 }