[jira] [Comment Edited] (CASSANDRA-20190) MemoryUtil.setInt/getInt and similar use the wrong endianness

Dmitry Konstantinov (Jira) Sat, 05 Apr 2025 10:25:47 -0700


    [ https://issues.apache.org/jira/browse/CASSANDRA-20190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937127#comment-17937127 ]


Dmitry Konstantinov edited comment on CASSANDRA-20190 at 3/20/25 2:22 PM:
--------------------------------------------------------------------------

CompressionMetadata:
* write and store flow
** org.apache.cassandra.io.compress.CompressionMetadata.Writer#addOffset - uses 
Memory.setLong();
** org.apache.cassandra.io.compress.CompressionMetadata.Writer#doPrepare - 
reads each offset from Memory as a primitive long and writes it to a 
FileOutputStreamPlus: out.writeLong(offsets.getLong(i * 8L));
* load and read flow
** org.apache.cassandra.io.compress.CompressionMetadata#readChunkOffsets - 
reads each offset from the input stream as a primitive long and stores it in 
Memory: offsets.setLong(i * 8L, input.readLong()), where the input stream is a 
standard FileInputStreamPlus with BE order;
** org.apache.cassandra.io.compress.CompressionMetadata#chunkFor - uses 
Memory.getLong(..).

In this case data is transferred between the file and Memory only as primitive 
long values, so the logic is agnostic to the byte order used inside Memory (as 
long as the get and put methods use the same order). The input and output 
streams use BE order, so the file format here is BE.
So this logic will not be affected by also switching Memory#get/putXByByte to 
LE.
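
The argument above can be checked with a minimal sketch. It is not Cassandra code: a ByteBuffer with an explicit byte order stands in for Memory, DataOutputStream/DataInputStream (which are big-endian by specification) stand in for the file streams, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical model: ByteBuffer plays the role of Memory, the byte array plays the role of the file.
final class OffsetRoundTrip {
    // Store offsets in "memory" with the given internal order, then write them out as primitive longs.
    static byte[] store(long[] offsets, ByteOrder memoryOrder) {
        ByteBuffer memory = ByteBuffer.allocate(offsets.length * 8).order(memoryOrder);
        for (int i = 0; i < offsets.length; i++)
            memory.putLong(i * 8, offsets[i]);            // analogous to Memory.setLong
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(file)) {
            for (int i = 0; i < offsets.length; i++)
                out.writeLong(memory.getLong(i * 8));     // primitive long -> BE bytes in the file
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file.toByteArray();
    }

    // Read primitive longs from the "file" into "memory", then read them back from memory.
    static long[] load(byte[] fileBytes, ByteOrder memoryOrder) {
        int count = fileBytes.length / 8;
        ByteBuffer memory = ByteBuffer.allocate(count * 8).order(memoryOrder);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileBytes))) {
            for (int i = 0; i < count; i++)
                memory.putLong(i * 8, in.readLong());     // BE bytes in the file -> primitive long
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        long[] result = new long[count];
        for (int i = 0; i < count; i++)
            result[i] = memory.getLong(i * 8);            // analogous to Memory.getLong
        return result;
    }

    public static void main(String[] args) {
        long[] offsets = {0L, 65536L, 0x0102030405060708L};
        byte[] viaLe = store(offsets, ByteOrder.LITTLE_ENDIAN);
        byte[] viaBe = store(offsets, ByteOrder.BIG_ENDIAN);
        // The file bytes do not depend on the order used inside the in-memory buffer.
        System.out.println(Arrays.equals(viaLe, viaBe));
        System.out.println(Arrays.equals(load(viaLe, ByteOrder.LITTLE_ENDIAN), offsets));
    }
}
```

Because every transfer goes through a primitive long, the in-memory order cancels out as long as the same order is used for the put and the matching get.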




> MemoryUtil.setInt/getInt and similar use the wrong endianness
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-20190
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20190
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Local/Other
>            Reporter: Branimir Lambov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> `NativeCell`, `NativeClustering` and `NativeDecoratedKey` use the above 
> methods from `MemoryUtil` to write and read data from native memory. As far 
> as I can see, they are meant to write data in big endian. They do not: they 
> always correct to little endian.
> Moreover, they disagree with their `ByByte` versions on big-endian machines, 
> which is likely to be an issue only on aligned-access architectures (x86 and 
> arm should be fine).
> The same is true for the methods in `Memory`, used by compression metadata as 
> well as index summaries.
> We need to verify that this does not cause any problems, and to change the 
> methods to behave as expected and document the behaviour by explicitly using 
> `ByteOrder.LITTLE_ENDIAN` for any data that may have been persisted on disk 
> with the wrong endianness.
>  
> The current MemoryUtil behaviour (before the fix):
> ||Native order||MemoryUtil.setX||MemoryUtil.setXByByte||MemoryUtil.getX||MemoryUtil.getXByByte||
> |BE|LE|BE|LE|BE|
> |LE|LE|LE|LE|LE|
> In short: MemoryUtil.setX/getX is always LE, while 
> MemoryUtil.setXByByte/getXByByte uses the native order.
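
The table in the quoted description can be modelled with a short sketch. This is an illustration only, not the MemoryUtil implementation: ByteBuffer is used to show the byte layout for each order, and the class and method names are hypothetical. The two `...Order` methods simply encode the pre-fix behaviour stated in the table.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical model of the pre-fix behaviour described in the table.
final class EndiannessTable {
    // Bytes left in memory by a 4-byte write of `value` in the given order.
    static byte[] layout(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(0, value).array();
    }

    // Per the table: setX/getX always end up little-endian, whatever the native order is.
    static ByteOrder plainAccessOrder(ByteOrder nativeOrder) {
        return ByteOrder.LITTLE_ENDIAN;
    }

    // Per the table: setXByByte/getXByByte follow the native order.
    static ByteOrder byByteAccessOrder(ByteOrder nativeOrder) {
        return nativeOrder;
    }

    // True when a value written by one variant reads back correctly through the other.
    static boolean variantsAgree(ByteOrder nativeOrder) {
        int value = 0x01020304;
        byte[] written = layout(value, plainAccessOrder(nativeOrder));
        int readBack = ByteBuffer.wrap(written).order(byByteAccessOrder(nativeOrder)).getInt(0);
        return readBack == value;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(layout(0x01020304, ByteOrder.LITTLE_ENDIAN)));
        System.out.println(Arrays.toString(layout(0x01020304, ByteOrder.BIG_ENDIAN)));
        System.out.println(variantsAgree(ByteOrder.LITTLE_ENDIAN));
        System.out.println(variantsAgree(ByteOrder.BIG_ENDIAN));
    }
}
```

In this model the two variants agree on a little-endian host and disagree on a big-endian one, which matches the BE row of the table.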


