Kontinuation opened a new issue, #1849: URL: https://github.com/apache/datafusion-comet/issues/1849
This was found when working on https://github.com/apache/datafusion-comet/pull/1845. A newly added Rust test that calls into `SparkUnsafeRow.is_null_at(idx)` was identified to have undefined behavior by Miri: ``` test execution::planner::tests::test_unpack_dictionary_primitive ... ok test execution::planner::tests::test_unpack_dictionary_string ... ok test execution::planner::tests::to_datafusion_filter ... ok test execution::shuffle::row::test::test_append_null_row_to_struct_builder ... ok warning: integer-to-pointer cast --> core/src/execution/shuffle/row.rs:260:31 | 260 | let word_offset = (self.row_addr + (((index >> 6) as i64) << 3)) as *const i64; | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ integer-to-pointer cast | = help: this program is using integer-to-pointer casts or (equivalently) `ptr::with_exposed_provenance`, which means that Miri might miss pointer bugs in this program = help: see https://doc.rust-lang.org/nightly/std/ptr/fn.with_exposed_provenance.html for more details on that operation = help: to ensure that Miri does not miss bugs in your program, use Strict Provenance APIs (https://doc.rust-lang.org/nightly/std/ptr/index.html#strict-provenance, https://crates.io/crates/sptr) instead = help: you can then set `MIRIFLAGS=-Zmiri-strict-provenance` to ensure you are not relying on `with_exposed_provenance` semantics = help: alternatively, `MIRIFLAGS=-Zmiri-permissive-provenance` disables this warning = note: BACKTRACE on thread `execution::shuf`: = note: inside `execution::shuffle::row::SparkUnsafeRow::is_null_at` at core/src/execution/shuffle/row.rs:260:31: 260:91 note: inside `execution::shuffle::row::append_field` --> core/src/execution/shuffle/row.rs:447:54 | 447 | let nested_row = if row.is_null_row() || row.is_null_at(idx) { | ^^^^^^^^^^^^^^^^^^^ note: inside `execution::shuffle::row::test::test_append_null_struct_field_to_struct_builder` --> core/src/execution/shuffle/row.rs:3332:9 | 3332 | append_field(&data_type, &mut struct_builder, &row, 0).expect("append field"); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> core/src/execution/shuffle/row.rs:3322:57 | 3321 | #[test] | ------- in this procedural macro expansion 3322 | fn test_append_null_struct_field_to_struct_builder() { | ^ error: Undefined Behavior: accessing memory based on pointer with alignment 1, but alignment 8 is required --> core/src/execution/shuffle/row.rs:261:29 | 261 | let word: i64 = *word_offset; | ^^^^^^^^^^^^ accessing memory based on pointer with alignment 1, but alignment 8 is required | = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information = note: BACKTRACE on thread `execution::shuf`: = note: inside `execution::shuffle::row::SparkUnsafeRow::is_null_at` at core/src/execution/shuffle/row.rs:261:29: 261:41 note: inside `execution::shuffle::row::append_field` --> core/src/execution/shuffle/row.rs:447:54 | 447 | let nested_row = if row.is_null_row() || row.is_null_at(idx) { | ^^^^^^^^^^^^^^^^^^^ note: inside `execution::shuffle::row::test::test_append_null_struct_field_to_struct_builder` --> core/src/execution/shuffle/row.rs:3332:9 | 3332 | append_field(&data_type, &mut struct_builder, &row, 0).expect("append field"); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> core/src/execution/shuffle/row.rs:3322:57 | 3321 | #[test] | ------- in this procedural macro expansion 3322 | fn test_append_null_struct_field_to_struct_builder() { | ^ note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace error: aborting due to 1 previous error; 1 warning emitted error: test failed, to rerun pass `-p datafusion-comet --lib` Caused by: process didn't exit successfully: `/home/runner/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri runner /home/runner/work/datafusion-comet/datafusion-comet/native/target/miri/x86_64-unknown-linux-gnu/debug/deps/comet-92e75f0cb556895b` (exit status: 1) note: test exited abnormally; to see the full output pass --nocapture to the harness. test execution::shuffle::row::test::test_append_null_struct_field_to_struct_builder ... ``` We made Miri to ignore the test in that PR to pass CI, but we still need to address this undefined behavior anyway. https://github.com/apache/datafusion-comet/pull/1845/files#r2126985383 Although unaligned memory access are allowed by mostly used architectures including amd64 and aarch64, and they do not exhibit performance penalties in recent CPU models, there are still some architectures do not support unaligned memory access. We need to eliminate this undefined behavior in our native code. When the native library is compiled to target platforms that supports unaligned memory access, the compiler should emit unaligned memory instructions so there should be no performance penalty. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org