Hi Daniel,
thanks for replying!
IMO, that reading of network byte order is a common misconception. The
RFCs for IP/UDP/TCP define that the fields of the protocol headers are
encoded in big endian, which was a very common native byte order at the
time. AFAIK, there has never been any kind of standard defining that
network protocols should be big endian - that is just an interpretation
of the term "network byte order".
Any application-layer protocol in the ISO/OSI stack can define byte
orders as its author likes. And in times when likely >95% of all
compute resources natively speak little endian, the sensible choice is
pretty obvious in my view.
But back to the actual problem here: WebSockets, like any layered
network protocol, need to be transparent. I transfer a binary message
(which is a byte blob; in the case of the WebSocket API it is a
ByteBuffer) to the websocket layer, which uses the WS protocol on top
of TCP on top of IP on top of ethernet/wifi/5G/whatever to transport it
to the receiver. And on the other end, another websocket layer strips
all the layers around it and again just delivers a byte blob to the
application. That final byte blob must have the exact same content as
the original one, or the protocol is unusable. And this expectation is
what the bug violates. The byte order of the source buffer is just a
hint for extracting multi-byte values from it. It cannot influence the
byte blob received by the communication partner.
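To make that concrete, a quick sketch (my own example, just the plain
ByteBuffer API, nothing WebSocket-specific):

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    public class ByteOrderDemo {
        public static void main(String[] args) {
            // The same eight bytes, wrapped twice with different byte orders.
            byte[] payload = {1, 2, 3, 4, 5, 6, 7, 8};
            ByteBuffer be = ByteBuffer.wrap(payload).order(ByteOrder.BIG_ENDIAN);
            ByteBuffer le = ByteBuffer.wrap(payload).order(ByteOrder.LITTLE_ENDIAN);

            // The multi-byte *views* differ...
            System.out.printf("BE: %016x%n", be.getLong(0)); // 0102030405060708
            System.out.printf("LE: %016x%n", le.getLong(0)); // 0807060504030201

            // ...but the underlying byte blob is identical - and that blob
            // is what a transparent transport layer has to deliver unchanged.
            System.out.println(be.get(0) == le.get(0)); // true
        }
    }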
It is very likely that very few people have stumbled upon that problem,
because (for historic reasons, but unreasonably so) you have to
actually work hard in Java to use little endian ByteBuffers - see the
sketch below - although the JVM internally works with the native byte
order, which again is >95% little endian. But to be frank: if any
developer has ever built a system on the Java WebSocket API, using
little endian ByteBuffers in the client and hacking the server to work
with the unintentional re-ordering of bytes at the API boundary (which
is actually hard, because it has to be applied selectively once
fragmentation and alignment come into play), they surely deserve the
pain of a malfunctioning system when some undocumented misbehaviour
present since Java 11 is fixed...
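To illustrate the "work hard" part, another small sketch (again just
the plain ByteBuffer API):

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    public class LittleEndianPain {
        public static void main(String[] args) {
            // Every fresh ByteBuffer starts out big endian, whatever the
            // platform speaks natively - the order must be set explicitly:
            ByteBuffer buf = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);

            // ...and derived views silently fall back to big endian, so the
            // order has to be re-applied after every slice()/duplicate():
            System.out.println(buf.slice().order());     // BIG_ENDIAN
            System.out.println(buf.duplicate().order()); // BIG_ENDIAN
            System.out.println(ByteOrder.nativeOrder()); // LITTLE_ENDIAN on x86 and most ARM
        }
    }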
Ok, rant over, sorry :-)
have a good evening
Simon Fischer
--
Simon Fischer
Developer - CoDaC
Department Operation
Max Planck Institute for Plasma Physics
Wendelsteinstrasse 1
17491 Greifswald, Germany
Phone: +49(0)3834 88 1215
On March 5, 2025 19:50:33 Daniel Fuchs <daniel.fu...@oracle.com> wrote:
Hi Simon,
Thank you for the report.
I am not too familiar with WebSocket. But since this is
a networking protocol I would expect binary data to be
transferred on the wire in network byte order.
So the expectation from WebSocket::sendBinary that the
byte buffer is in network byte order (big endian) does
not seem unrealistic to me.
More concretely, since the API has been there since Java 11
(at least in its standard form), it would be difficult
to change this long-standing behavior.
That said - a deeper analysis of the issue and possible
options is probably warranted. At least the expectations
should be documented.
best regards,
-- daniel
On 05/03/2025 17:53, Fischer, Simon wrote:
Hi all,
no idea if I am in the right place here, but I have no account to create
a tracker issue and also could not find out how to get one…
I was just using java.net.http.WebSocket (JDK 17 specifically, but the
issue should still be there at least in 21, from a quick look into the
code) for testing purposes and stumbled upon what I think is a bug:
When I supply a *little endian* ByteBuffer to WebSocket.sendBinary,
the payload will be scrambled on the way to the server.
The actual issue is in jdk.internal.net.http.websocket.Frame. The
loop() portion of the transferMasking algorithm uses
ByteBuffer.putLong(ByteBuffer.getLong) ("dst.putLong(j, src.getLong(i)
^ maskLong);") to optimize the masking and data transfer. The problem
is that src is a ByteBuffer provided by the user, with the endianness
set by the user, while dst is an internally allocated ByteBuffer with
the default byte order. This can obviously lead to the bytes within
each 8-byte block of the original message being reversed on the way to
the receiver.
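To show the mechanism in isolation, a standalone sketch of that
transfer pattern (not the actual Frame code; the zero mask is only
there to make the byte-order effect visible):

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    public class MaskingScrambleDemo {
        public static void main(String[] args) {
            // User-supplied source buffer, little endian (as the API allows).
            ByteBuffer src = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
            src.put(new byte[]{1, 2, 3, 4, 5, 6, 7, 8}).flip();

            // Internally allocated destination buffer, default (big endian) order.
            ByteBuffer dst = ByteBuffer.allocate(8);

            long maskLong = 0L; // zero mask, to isolate the byte-order effect

            // The problematic pattern: getLong honours src's byte order,
            // putLong honours dst's, so each 8-byte block comes out reversed.
            dst.putLong(0, src.getLong(0) ^ maskLong);

            for (int k = 0; k < 8; k++) {
                System.out.print(dst.get(k) + " "); // prints 8 7 6 5 4 3 2 1
            }
        }
    }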
The solution IMO would be to ensure that both buffers are set to the
same endianness. And it should probably be *native endian*, as the use
of one fixed endianness would likely render the long-vectorization
optimization useless on a machine that does not support that endianness
natively (getLong would reverse the byte order when loading into a
native CPU register and putLong would reorder it again). In that case -
actually in any case - care must also be taken with regard to the
correct encoding of the "maskLong" 64-bit integer.
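A possible shape of the fix, as a sketch (the method name and
parameters are mine, not the actual Frame internals):

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    public class MaskingFixSketch {
        // Force both buffers to the same (native) order for the optimized
        // 8-byte transfer, restoring the callers' orders afterwards.
        static void transferMaskedLongs(ByteBuffer src, ByteBuffer dst,
                                        long maskLong, int i, int j, int longs) {
            ByteOrder srcOrder = src.order();
            ByteOrder dstOrder = dst.order();
            src.order(ByteOrder.nativeOrder());
            dst.order(ByteOrder.nativeOrder());
            try {
                // maskLong must be assembled in native order as well, so that
                // each mask byte lines up with the same payload byte it would
                // hit in a plain byte-at-a-time loop.
                for (int n = 0; n < longs; n++, i += 8, j += 8) {
                    dst.putLong(j, src.getLong(i) ^ maskLong);
                }
            } finally {
                src.order(srcOrder);
                dst.order(dstOrder);
            }
        }
    }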
Alternatively, one could just adapt the documentation to require the
ByteBuffer provided to WebSocket.sendBinary() to use the default
big-endian encoding. A semi-solution, IMO.
Would be interested in feedback and hope this finds somebody who can
make use of it :)
Best regards
Simon Fischer
--
Simon Fischer
Developer
E5-CoDaC
Max Planck Institute for Plasma Physics
Wendelsteinstrasse 1
17491 Greifswald, Germany
Phone: +49(0)3834 88 1215