andygrove commented on PR #549:
URL: https://github.com/apache/datafusion-comet/pull/549#issuecomment-2159574852
> Should we investigate why we have unaligned memory in the first place?
The input is a slice of `u8`, so it has no particular alignment, and we are
trying to turn it into a slice of `u32` using `from_raw_parts`. When we run
this with Rust 1.78 we get this error:
```
unsafe precondition(s) violated: slice::from_raw_parts requires the pointer
to be aligned and non-null, and the total size of the slice not to exceed
`isize::MAX`
```
The original code seems to be copying data over using two loops. The first
loop was copying in slices of 32 items and then the second loop copied any
remaining items. I can only assume that this was some sort of optimization, but
it relied on undefined behavior, which is now caught by Rust 1.78.
I simplified the code to use a single loop and removed the unsafe
optimization. The code is now almost identical to the decode method for other
types such as this one, which I based it on:
```rust
impl PlainDecoding for Int32To64Type {
fn decode(src: &mut PlainDecoderInner, dst: &mut ParquetMutableVector,
num: usize) {
let src_ptr = src.data.as_ptr() as *const i32;
let dst_ptr = dst.value_buffer.as_mut_ptr() as *mut i64;
unsafe {
for i in 0..num {
dst_ptr
.add(dst.num_values + i)
.write_unaligned(src_ptr.add(src.offset +
i).read_unaligned() as i64);
}
}
src.offset += 4 * num;
}
```
I don't know why we were trying to do the optimization for some cases and
not others. Perhaps @sunchao can remember.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]