[jira] [Created] (IGNITE-28835) BinaryMarshaller: avoid redundant full-object byte[] copy and double node-name scope on marshal-to-stream

Anton Vinogradov (Jira) Mon, 29 Jun 2026 16:00:16 -0700

Anton Vinogradov created IGNITE-28835:
-----------------------------------------


             Summary: BinaryMarshaller: avoid redundant full-object byte[] copy 
and double node-name scope on marshal-to-stream
                 Key: IGNITE-28835
                 URL: https://issues.apache.org/jira/browse/IGNITE-28835
             Project: Ignite
          Issue Type: Task
            Reporter: Anton Vinogradov
            Assignee: Anton Vinogradov


{{BinaryMarshaller.marshal0(Object, OutputStream)}} has two avoidable 
inefficiencies on the hot marshalling-to-stream path:

_1. Full-size temporary array copy (main issue)._

{code:java}
@Override protected void marshal0(Object obj, OutputStream out) {
    // GridBinaryMarshaller.marshal -> writer.array() == Arrays.copyOf (full copy)
    byte[] arr = marshal(obj);
    out.write(arr);
    ...
}
{code}

The whole serialized object is allocated and copied once 
({{writer.array()}} trims via {{Arrays.copyOf}}) only to be immediately 
streamed out. The temporary array and the copy are unnecessary: the writer's 
internal buffer can be written straight to {{out}} by length.

_2. Redundant node-name scope._

{{marshal0(obj, out)}} calls the *public* {{marshal(obj)}}, which re-enters 
{{AbstractNodeNameAwareMarshaller}}'s 
{{setCurrentIgniteName}}/{{restoreOldIgniteName}} scope a second time, 
although {{marshal0}} already runs inside the scope established by the outer 
{{marshal(obj, out)}}. It should call the protected {{marshal0(obj)}} (or a 
new stream method) directly.
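
The double scope entry can be illustrated with a minimal, self-contained sketch. It is not Ignite code: {{NameScopeSketch}}, its {{ThreadLocal}}, the {{callPublicMarshal}} switch and the {{scopeEntries}} counter are hypothetical stand-ins for the set/restore behavior described above, and the "serialization" is just a UTF-8 string conversion.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a node-name-aware marshaller (not Ignite code). */
final class NameScopeSketch {
    /** Assumed analogue of the per-thread "current Ignite name". */
    private static final ThreadLocal<String> CURRENT_NAME = new ThreadLocal<>();

    private final String nodeName;

    /** true = current shape (re-enters public marshal), false = proposed shape. */
    private final boolean callPublicMarshal;

    /** Counts how many times the name scope was entered. */
    int scopeEntries;

    NameScopeSketch(String nodeName, boolean callPublicMarshal) {
        this.nodeName = nodeName;
        this.callPublicMarshal = callPublicMarshal;
    }

    /** Public entry point: opens the scope, then delegates. */
    byte[] marshal(Object obj) {
        String old = enterScope();
        try {
            return marshal0(obj);
        }
        finally {
            CURRENT_NAME.set(old);
        }
    }

    /** Public stream entry point: opens the scope, then delegates. */
    void marshal(Object obj, OutputStream out) throws IOException {
        String old = enterScope();
        try {
            marshal0(obj, out);
        }
        finally {
            CURRENT_NAME.set(old);
        }
    }

    private String enterScope() {
        scopeEntries++;
        String old = CURRENT_NAME.get();
        CURRENT_NAME.set(nodeName);
        return old;
    }

    /** Placeholder serialization, runs inside an already established scope. */
    private byte[] marshal0(Object obj) {
        return String.valueOf(obj).getBytes(StandardCharsets.UTF_8);
    }

    private void marshal0(Object obj, OutputStream out) throws IOException {
        // Calling the public overload opens the scope again; marshal0 does not.
        byte[] arr = callPublicMarshal ? marshal(obj) : marshal0(obj);
        out.write(arr);
    }
}
```

With {{callPublicMarshal = true}} a single {{marshal(obj, out)}} call enters the scope twice; with {{false}} it enters once and writes the same bytes.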

*Proposed change*
 * Add a "marshal directly into {{OutputStream}}" path in 
{{GridBinaryMarshaller}} that writes {{writer.internalArray()}} for 
{{writer.outputSize()}} bytes to {{out}}, avoiding the trim-copy and the 
temporary array.
 * In {{BinaryMarshaller.marshal0(obj, out)}} call that path instead of the 
public {{marshal(obj)}}, removing the second name-scope.
 * Minor: reuse a shared single-byte {{NULL}} array for the {{obj == null}} 
fast path instead of allocating per call.
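
A rough sketch of the copy-versus-direct-write difference, under stated assumptions: {{SketchWriter}} is a hypothetical growable buffer that only borrows the method names {{array()}}, {{internalArray()}} and {{outputSize()}} from the description above, the payload is a pre-serialized {{byte[]}}, and the {{NULL}} flag value is a placeholder rather than Ignite's actual constant.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

/** Hypothetical growable writer; only the method names follow the ticket text. */
final class SketchWriter {
    private byte[] buf = new byte[64];
    private int size;

    void write(byte[] src) {
        if (size + src.length > buf.length)
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, size + src.length));

        System.arraycopy(src, 0, buf, size, src.length);
        size += src.length;
    }

    /** Trimmed copy: allocates a new array of exactly outputSize() bytes. */
    byte[] array() {
        return Arrays.copyOf(buf, size);
    }

    /** Backing array; may be longer than the written payload. */
    byte[] internalArray() {
        return buf;
    }

    int outputSize() {
        return size;
    }
}

final class MarshalToStreamSketch {
    /** Shared one-byte encoding of null. The value is a placeholder (assumption). */
    private static final byte[] NULL = {(byte)0x7F};

    /** Current shape: trim-copy the whole payload, then write the copy. */
    static void marshalWithCopy(byte[] payload, OutputStream out) throws IOException {
        if (payload == null) {
            out.write(new byte[] {NULL[0]}); // fresh one-byte array on every call
            return;
        }

        SketchWriter writer = new SketchWriter();
        writer.write(payload);

        byte[] arr = writer.array(); // full-size temporary copy
        out.write(arr);
    }

    /** Proposed shape: write the backing array by length, no temporary copy. */
    static void marshalDirect(byte[] payload, OutputStream out) throws IOException {
        if (payload == null) {
            out.write(NULL); // shared array, nothing allocated
            return;
        }

        SketchWriter writer = new SketchWriter();
        writer.write(payload);

        out.write(writer.internalArray(), 0, writer.outputSize());
    }
}
```

Both methods produce identical bytes on the stream. The length argument matters because the backing array is usually larger than the payload, so writing it whole would emit trailing garbage.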

*Benchmark*

Isolated JMH micro-benchmark of the avoided work (serialization excluded, 
identical in both variants; AverageTime, ns/op, JDK 17):

{noformat}
#1 redundant name-scope (per marshal-to-stream call):
  current  : 13.00 ns
  proposed : 11.16 ns  (-14%, -1.84 ns/call)

#2 full-object copy in marshal0(obj, out), copy+write vs write-directly:
  object size   current     proposed   gain
  64 B          4.96 ns     2.14 ns    -57%
  1 KB          43.7 ns     18.8 ns    -57%
  16 KB         667 ns      364 ns     -45%
  256 KB        10563 ns    3756 ns    -64% (-6.8 us)
{noformat}

The stream-tail cost drops by roughly half (45-64% in the runs above), and the 
absolute saving grows with object size. A full-size temporary allocation per 
call is also eliminated (reduced GC pressure), which the ns/op figures do not 
fully capture.

Note: these are isolated deltas of the avoided copy/scope, not end-to-end 
marshaller throughput. In the full cycle the relative gain is smaller because 
serialization dominates, but the eliminated copy is an absolute saving 
proportional to object size, so the benefit is meaningful for large objects at 
high throughput.

*Scope / risk*
Local change in the {{binary}} module, no public API or wire-format change. 
Behavior-preserving; covered by existing marshaller tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
