This is an automated email from the ASF dual-hosted git repository.
agrove pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion-comet.git
The following commit(s) were added to refs/heads/main by this push:
     new 7886a3d45 docs: fix inaccurate claim about mutable buffers in parquet scan docs (#3378)
7886a3d45 is described below
commit 7886a3d452989008e50a382bef120ca10bb08477
Author: Andy Grove <[email protected]>
AuthorDate: Wed Feb 4 05:46:26 2026 -0700
    docs: fix inaccurate claim about mutable buffers in parquet scan docs (#3378)
---
docs/source/contributor-guide/parquet_scans.md | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/docs/source/contributor-guide/parquet_scans.md b/docs/source/contributor-guide/parquet_scans.md
index 626b65b4e..bbacff4d9 100644
--- a/docs/source/contributor-guide/parquet_scans.md
+++ b/docs/source/contributor-guide/parquet_scans.md
@@ -37,9 +37,13 @@ implementation:
- Leverages the DataFusion community's ongoing improvements to `DataSourceExec`
- Provides support for reading complex types (structs, arrays, and maps)
-- Removes the use of reusable mutable-buffers in Comet, which is complex to maintain
+- Delegates Parquet decoding to native Rust code rather than JVM-side decoding
- Improves performance
+> **Note on mutable buffers:** Both `native_comet` and `native_iceberg_compat` use reusable mutable buffers
+> when transferring data from JVM to native code via Arrow FFI. The `native_iceberg_compat` implementation uses DataFusion's native Parquet reader for data columns, bypassing Comet's mutable buffer infrastructure entirely. However, partition columns still use `ConstantColumnReader`, which relies on Comet's mutable buffers that are reused across batches. This means native operators that buffer data (such as `SortExec` or `ShuffleWriterExec`) must perform deep copies to avoid data corruption.
+> See the [FFI documentation](ffi.md) for details on the `arrow_ffi_safe` flag and ownership semantics.
+
The `native_datafusion` and `native_iceberg_compat` scans share the following limitations:
- When reading Parquet files written by systems other than Spark that contain columns with the logical type `UINT_8`
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]