[jira] [Comment Edited] (CASSANDRA-20190) MemoryUtil.setInt/getInt and similar use the wrong endianness

Dmitry Konstantinov (Jira) Thu, 20 Mar 2025 06:46:12 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-20190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937117#comment-17937117
 ]


Dmitry Konstantinov edited comment on CASSANDRA-20190 at 3/20/25 1:45 PM:
--------------------------------------------------------------------------

Yes, I agree that we should not use a machine-dependent byte order on disk.
A short summary of the index summary file format:
* before 5.0, native order was used for both offsets and positions in
entries.
* since 5.0, due to the CASSANDRA-17723 changes, positions in entries are LE
on all architectures.
* since 5.0, due to the CASSANDRA-17723 changes, offsets are LE on LE and
unaligned BE architectures, but BE on aligned BE architectures.

So the most reasonable option here is to fix the offsets case on aligned BE
architectures and use LE there as well (which in practice means using LE in
Memory#get/putXByByte).
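
To illustrate what "use LE in Memory#get/putXByByte" could look like, here is a minimal sketch. It is not the actual Cassandra code: the class name, the method signatures and the use of a plain byte[] in place of native memory are all assumptions made for the example. The point is only that byte-by-byte access written this way produces the same layout on every architecture.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for the ByByte accessors; a byte[] replaces native memory.
final class LittleEndianByByte {

    // Writes the int least significant byte first, regardless of native order.
    static void putIntByByte(byte[] mem, int offset, int value) {
        mem[offset]     = (byte) value;
        mem[offset + 1] = (byte) (value >>> 8);
        mem[offset + 2] = (byte) (value >>> 16);
        mem[offset + 3] = (byte) (value >>> 24);
    }

    // Reads the int back assuming the least significant byte comes first.
    static int getIntByByte(byte[] mem, int offset) {
        return (mem[offset] & 0xFF)
             | ((mem[offset + 1] & 0xFF) << 8)
             | ((mem[offset + 2] & 0xFF) << 16)
             | ((mem[offset + 3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        byte[] mem = new byte[4];
        putIntByByte(mem, 0, 0x11223344);

        // The layout matches an explicitly little-endian ByteBuffer view.
        int viaBuffer = ByteBuffer.wrap(mem).order(ByteOrder.LITTLE_ENDIAN).getInt(0);
        System.out.println(viaBuffer == getIntByByte(mem, 0)); // prints "true"
    }
}
```

Because the shifts are explicit, the result does not depend on ByteOrder.nativeOrder(), which is the property needed for data that ends up on disk.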

I am now checking the BloomFilter and CompressionMetadata read/write flows.




was (Author: dnk):
Yes, I have the same opinion that we should not use machine-dependent on-disk 
order. 
A short summary for index summary:
* before 5.0 we had native order used in index summary file format for offsets 
and positions in entries. 
* since 5.0 due to CASSANDRA-17723 changes we have LE now for positions in 
entries. 
* since 5.0 due to CASSANDRA-17723 changes we have LE now for offsets in case 
of LE and unaligned BE architectures and BE in case of aligned BE architectures.

so, here the most reasonable option then is to fix the case with aligned BE 
architecture in case of offsets and use LE in this case as well (which actually 
means use LE in Memory#get/putXByByte). 

I am checking now BloomFilter and CompressionMetadata read/write flows.



> MemoryUtil.setInt/getInt and similar use the wrong endianness
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-20190
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20190
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Local/Other
>            Reporter: Branimir Lambov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> `NativeCell`, `NativeClustering` and `NativeDecoratedKey` use the above 
> methods from `MemoryUtil` to write and read data from native memory. As far 
> as I can see they are meant to write data in big endian. They do not (they 
> always correct to little endian).
> Moreover, they disagree with their `ByByte` versions on big-endian machines,
> which is likely an issue only on aligned-access architectures (x86 and ARM
> should be fine).
> The same is true for the methods in `Memory`, used by compression metadata as 
> well as index summaries.
> We need to verify that this does not cause any problems, and to change the 
> methods to behave as expected and document the behaviour by explicitly using 
> `ByteOrder.LITTLE_ENDIAN` for any data that may have been persisted on disk 
> with the wrong endianness.
>  
> The current MemoryUtil behaviour (before the fix):
> ||Native order||MemoryUtil.setX||MemoryUtil.setXByByte||MemoryUtil.getX||MemoryUtil.getXByByte||
> |BE|LE|BE|LE|BE|
> |LE|LE|LE|LE|LE|
> In short: MemoryUtil.setX/getX is LE, MemoryUtil.setXByByte/getXByByte is
> native order.
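
The table in the issue description above can be modelled with plain ByteBuffer calls. This is only an illustrative sketch: the class and method names are hypothetical, and the native order is passed in as a parameter so that the big-endian row can be reproduced on any machine. It does not call MemoryUtil itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical model of the pre-fix behaviour described in the table.
final class EndiannessTable {

    // Models setX: always little endian, whatever the native order is.
    static byte[] setInt(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
    }

    // Models setXByByte: follows the (simulated) native order.
    static byte[] setIntByByte(int value, ByteOrder nativeOrder) {
        return ByteBuffer.allocate(4).order(nativeOrder).putInt(value).array();
    }

    // True when both variants produce the same bytes for the given native order.
    static boolean agree(ByteOrder nativeOrder) {
        int value = 0x11223344;
        return Arrays.equals(setInt(value), setIntByByte(value, nativeOrder));
    }

    public static void main(String[] args) {
        System.out.println("LE native, variants agree: " + agree(ByteOrder.LITTLE_ENDIAN));
        System.out.println("BE native, variants agree: " + agree(ByteOrder.BIG_ENDIAN));
    }
}
```

Under this model the two variants agree only when the native order is little endian, which corresponds to the BE row of the table being the inconsistent one.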



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
