Hello, Igniters!

Recently we encountered an unexpected issue. Let me start with its root cause
before discussing potential fixes.

We noticed that certain benchmarks ran inefficiently on new MacBooks. The
slowdown was in low-level serialization code, and its cause was an unaligned
read in GridUnsafe. "aarch64" allows unaligned access, but the architecture is
not included in the "GridUnsafe#unaligned" check, so the fall-back code that
reads and writes everything byte by byte was executed instead.
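
To illustrate the two code paths, here is a minimal sketch. It is not the
actual GridUnsafe code: the class and method names are hypothetical, the
architecture list is assumed to be the four x86 names mentioned later in this
message, and a ByteBuffer read stands in for a single wide Unsafe access.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Set;

/** Hypothetical sketch of an "unaligned" check with a byte-by-byte fallback. */
final class UnalignedCheckSketch {
    /** Assumed list of architectures treated as supporting unaligned access. */
    private static final Set<String> UNALIGNED_ARCHS = Set.of("i386", "x86", "amd64", "x86_64");

    static boolean unaligned(String osArch) {
        return UNALIGNED_ARCHS.contains(osArch);
    }

    /** Fast path: one 8-byte little-endian read at an arbitrary offset. */
    static long readLongFast(byte[] buf, int off) {
        return ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).getLong(off);
    }

    /** Fallback path: assembles the same little-endian value one byte at a time. */
    static long readLongByByte(byte[] buf, int off) {
        long res = 0;
        for (int i = 7; i >= 0; i--) {
            res = (res << 8) | (buf[off + i] & 0xFFL);
        }
        return res;
    }

    /** Picks the path by architecture name; both paths return the same value. */
    static long readLong(byte[] buf, int off, String osArch) {
        return unaligned(osArch) ? readLongFast(buf, off) : readLongByByte(buf, off);
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        System.out.println(arch + " takes the fast path: " + unaligned(arch));
    }
}
```

With such a list, "aarch64" always ends up on the slow path even though the
hardware could take the fast one.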

The fix seemed trivial, and we did it in [1] by adding "aarch64" to the list of
architectures that support unaligned memory access. Some time later, when we
enabled "ItCompatibilityTest#testCompatibility", we realized that compatibility
on MacBooks was broken. The incompatibility was caused by [1], and as a hotfix
it has been temporarily reverted in [2].

How was that possible?
When we finished the investigation, it turned out that
"DirectByteBufferStreamImplV1#writeUuid" and
"DirectByteBufferStreamImplV1#readUuid" have a particularly nasty bug in them.
This is how these methods behave in 3.0:
 - If we run on "i386", "x86", "amd64", or "x86_64", we write the parts of a
   UUID in Big Endian.
 - If we run on any other Little Endian architecture, we write these parts in
   Little Endian.
 - If we run on a Big Endian architecture, we write these parts in Big Endian.
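
The three rules above can be modelled as follows. This is only a sketch of the
observable behavior, not the real stream code: the names are hypothetical, and
I assume the "parts" are the two 64-bit halves of the UUID.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.Set;
import java.util.UUID;

/** Hypothetical model of the UUID byte order described for 3.0. */
final class UuidOrderSketch {
    private static final Set<String> X86_FAMILY = Set.of("i386", "x86", "amd64", "x86_64");

    /** x86 family: Big Endian; everything else: the platform's native order. */
    static ByteOrder uuidPartOrder(String osArch, ByteOrder nativeOrder) {
        return X86_FAMILY.contains(osArch) ? ByteOrder.BIG_ENDIAN : nativeOrder;
    }

    /** Encodes the two 64-bit halves (assumed to be the "parts") in the given order. */
    static byte[] encode(UUID id, ByteOrder order) {
        return ByteBuffer.allocate(16).order(order)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        UUID id = new UUID(0x0102030405060708L, 0x090A0B0C0D0E0F10L);
        ByteOrder onAmd64 = uuidPartOrder("amd64", ByteOrder.LITTLE_ENDIAN);
        ByteOrder onAarch64 = uuidPartOrder("aarch64", ByteOrder.LITTLE_ENDIAN);
        // The same UUID produces different bytes on the two Little Endian platforms.
        System.out.println("amd64:   " + Arrays.toString(encode(id, onAmd64)));
        System.out.println("aarch64: " + Arrays.toString(encode(id, onAarch64)));
    }
}
```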

When we added "aarch64" to the list of "unaligned" architectures, we started
treating its data as BE in "main", while Ignite 3.0 treats it as LE. To clarify,
this stream is used for:
- Network communication, runtime only.
- Serialization of raft commands; this data is written to the storage.
That's why fix [1] broke compatibility.

Such behavior is a problem, because the network protocol and raft
serialization must be architecture-independent:
- It is possible that nodes in the same cluster run in different environments
  with different architectures.
- It is possible, and almost guaranteed, that raft command serialization
  happens on a different node than the one that stores the command, so it must
  also be architecture-independent (node A does the serialization, node B
  writes the resulting payload into the log storage).

That's issue number 1. Issue number 2 was found when we inspected the code of
"DirectByteBufferStreamImplV1". The "writeFixedInt"/"readFixedInt" methods (and
their long counterparts) are not symmetric on BE architectures: writes are
always LE, but reads use the native byte order.
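
A small sketch of why this asymmetry matters. The method names below are
hypothetical stand-ins, not the real implementation, and the native order is
passed as a parameter so that a BE platform can be simulated on any machine.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical model of an LE write paired with a native-order read. */
final class FixedIntParitySketch {
    /** Write side: always Little Endian. */
    static byte[] writeFixedIntLe(int val) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(val).array();
    }

    /** Read side: uses whatever order the platform reports as native. */
    static int readFixedIntNative(byte[] bytes, ByteOrder nativeOrder) {
        return ByteBuffer.wrap(bytes).order(nativeOrder).getInt();
    }

    public static void main(String[] args) {
        byte[] bytes = writeFixedIntLe(1);
        // On an LE platform the round trip is correct.
        System.out.println(readFixedIntNative(bytes, ByteOrder.LITTLE_ENDIAN));
        // On a BE platform the bytes come back reversed, i.e. Integer.reverseBytes(1).
        System.out.println(readFixedIntNative(bytes, ByteOrder.BIG_ENDIAN));
    }
}
```

On a simulated BE platform the value 1 is read back as 16777216 (0x01000000),
so the round trip only works where the native order happens to be LE.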

In other words, Ignite 3.0 doesn't really work on Big Endian architectures.
Fixing this particular place is trivial, and we will do it in 3.1. Fixing the
broken Little Endian architectures might not be as trivial.

My proposal is the following:
- We fix the bug in UUID serialization and always use Big Endian for encoding
  there. This will make our protocols correct on all architectures at once.
  This fix will break backwards compatibility on Little Endian architectures
  that are NOT in the following list: "i386", "x86", "amd64", and "x86_64".
  This means that an upgrade from 3.0 to 3.1 will be impossible* on those
  architectures.
- We add "aarch64" to the list of architectures that support unaligned memory
  access.
- We explicitly disable "ItCompatibilityTest#testCompatibility" on the affected
  architectures.
- * If it turns out that we have a user who runs on one of those architectures
  and who must upgrade their cluster from 3.0, we will prepare and provide a
  log storage conversion tool that will convert all Little Endian UUIDs to Big
  Endian format. As far as I'm aware, only the log storage is affected at the
  moment.
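
To make the first and the last items more concrete, here is a rough sketch of
an architecture-independent UUID encoding and of the per-UUID step a conversion
tool might perform. Everything here is an assumption for illustration: the
names are hypothetical, and I assume the UUID is stored as two 64-bit halves
whose bytes are swapped individually. The actual change will be done in [3].

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.UUID;

/** Hypothetical sketch: UUIDs are always encoded in Big Endian. */
final class UuidBigEndianSketch {
    /** Writes both halves in Big Endian, regardless of the platform. */
    static byte[] writeUuid(UUID id) {
        return ByteBuffer.allocate(16).order(ByteOrder.BIG_ENDIAN)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    /** Reads both halves in Big Endian, regardless of the platform. */
    static UUID readUuid(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN);
        long msb = buf.getLong();
        long lsb = buf.getLong();
        return new UUID(msb, lsb);
    }

    /** Conversion step: re-encodes a UUID that was written with Little Endian halves. */
    static byte[] convertLeToBe(byte[] leBytes) {
        ByteBuffer src = ByteBuffer.wrap(leBytes).order(ByteOrder.LITTLE_ENDIAN);
        long msb = src.getLong();
        long lsb = src.getLong();
        return writeUuid(new UUID(msb, lsb));
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        System.out.println(id.equals(readUuid(writeUuid(id))));
    }
}
```

Because the byte order no longer depends on "os.arch" or on the native order,
the same bytes are produced and understood by every node in the cluster.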

It's better to fix it in 3.1, because 3.1 will be more widely adopted than 3.0.
I will do that in [3]. Please share your feedback on the proposal. What are
your thoughts? Thank you!

[1] https://issues.apache.org/jira/browse/IGNITE-25564
[2] https://issues.apache.org/jira/browse/IGNITE-25796
[3] https://issues.apache.org/jira/browse/IGNITE-25797

-- 
Sincerely yours,
Ivan Bessonov
