Kontinuation opened a new issue, #1849:
URL: https://github.com/apache/datafusion-comet/issues/1849

   This was found when working on 
https://github.com/apache/datafusion-comet/pull/1845. A newly added Rust test 
that calls into `SparkUnsafeRow.is_null_at(idx)` was identified to have 
undefined behavior by Miri:
   
   ```
   test execution::planner::tests::test_unpack_dictionary_primitive ... ok
   test execution::planner::tests::test_unpack_dictionary_string ... ok
   test execution::planner::tests::to_datafusion_filter ... ok
   test execution::shuffle::row::test::test_append_null_row_to_struct_builder 
... ok
   warning: integer-to-pointer cast
       --> core/src/execution/shuffle/row.rs:260:31
        |
   260  |             let word_offset = (self.row_addr + (((index >> 6) as i64) 
<< 3)) as *const i64;
        |                               
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ integer-to-pointer 
cast
        |
        = help: this program is using integer-to-pointer casts or 
(equivalently) `ptr::with_exposed_provenance`, which means that Miri might miss 
pointer bugs in this program
        = help: see 
https://doc.rust-lang.org/nightly/std/ptr/fn.with_exposed_provenance.html for 
more details on that operation
        = help: to ensure that Miri does not miss bugs in your program, use 
Strict Provenance APIs 
(https://doc.rust-lang.org/nightly/std/ptr/index.html#strict-provenance, 
https://crates.io/crates/sptr) instead
        = help: you can then set `MIRIFLAGS=-Zmiri-strict-provenance` to ensure 
you are not relying on `with_exposed_provenance` semantics
        = help: alternatively, `MIRIFLAGS=-Zmiri-permissive-provenance` 
disables this warning
        = note: BACKTRACE on thread `execution::shuf`:
        = note: inside `execution::shuffle::row::SparkUnsafeRow::is_null_at` at 
core/src/execution/shuffle/row.rs:260:31: 260:91
   note: inside `execution::shuffle::row::append_field`
       --> core/src/execution/shuffle/row.rs:447:54
        |
   447  |             let nested_row = if row.is_null_row() || 
row.is_null_at(idx) {
        |                                                      
^^^^^^^^^^^^^^^^^^^
   note: inside 
`execution::shuffle::row::test::test_append_null_struct_field_to_struct_builder`
       --> core/src/execution/shuffle/row.rs:3332:9
        |
   3332 |         append_field(&data_type, &mut struct_builder, &row, 
0).expect("append field");
        |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   note: inside closure
       --> core/src/execution/shuffle/row.rs:3322:57
        |
   3321 |     #[test]
        |     ------- in this procedural macro expansion
   3322 |     fn test_append_null_struct_field_to_struct_builder() {
        |                                                         ^
   
   error: Undefined Behavior: accessing memory based on pointer with alignment 
1, but alignment 8 is required
       --> core/src/execution/shuffle/row.rs:261:29
        |
   261  |             let word: i64 = *word_offset;
        |                             ^^^^^^^^^^^^ accessing memory based on 
pointer with alignment 1, but alignment 8 is required
        |
        = help: this indicates a bug in the program: it performed an invalid 
operation, and caused Undefined Behavior
        = help: see 
https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html 
for further information
        = note: BACKTRACE on thread `execution::shuf`:
        = note: inside `execution::shuffle::row::SparkUnsafeRow::is_null_at` at 
core/src/execution/shuffle/row.rs:261:29: 261:41
   note: inside `execution::shuffle::row::append_field`
       --> core/src/execution/shuffle/row.rs:447:54
        |
   447  |             let nested_row = if row.is_null_row() || 
row.is_null_at(idx) {
        |                                                      
^^^^^^^^^^^^^^^^^^^
   note: inside 
`execution::shuffle::row::test::test_append_null_struct_field_to_struct_builder`
       --> core/src/execution/shuffle/row.rs:3332:9
        |
   3332 |         append_field(&data_type, &mut struct_builder, &row, 
0).expect("append field");
        |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   note: inside closure
       --> core/src/execution/shuffle/row.rs:3322:57
        |
   3321 |     #[test]
        |     ------- in this procedural macro expansion
   3322 |     fn test_append_null_struct_field_to_struct_builder() {
        |                                                         ^
   
   note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` 
for a verbose backtrace
   
   error: aborting due to 1 previous error; 1 warning emitted
   
   error: test failed, to rerun pass `-p datafusion-comet --lib`
   
   Caused by:
     process didn't exit successfully: 
`/home/runner/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri
 runner 
/home/runner/work/datafusion-comet/datafusion-comet/native/target/miri/x86_64-unknown-linux-gnu/debug/deps/comet-92e75f0cb556895b`
 (exit status: 1)
   note: test exited abnormally; to see the full output pass --nocapture to the 
harness.
   test 
execution::shuffle::row::test::test_append_null_struct_field_to_struct_builder 
... 
   ```
   
   We made Miri to ignore the test in that PR to pass CI, but we still need to 
address this undefined behavior anyway. 
https://github.com/apache/datafusion-comet/pull/1845/files#r2126985383
   
   Although unaligned memory access are allowed by mostly used architectures 
including amd64 and aarch64, and they do not exhibit performance penalties in 
recent CPU models, there are still some architectures do not support unaligned 
memory access. We need to eliminate this undefined behavior in our native code. 
When the native library is compiled to target platforms that supports unaligned 
memory access, the compiler should emit unaligned memory instructions so there 
should be no performance penalty.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to