MacChen02 commented on PR #28514:
URL: https://github.com/apache/flink/pull/28514#issuecomment-4861914525
> > I am curious is there anything wrong with the implementation that is
already there, you have replaced an Atomic boolean with a a boolean with
synchronization - why is this better? If there is no difference apart from
style, I suggest not making this change.
>
> Hi @davidradl Good question! Sorry for not making the motivation clearer
in the PR description (I've updated the PR description now to reflect this as
well). As detailed in the Jira ticket, the core motivation is to reduce object
allocation overhead in Flink's high-frequency data path. MemorySegment is
instantiated heavily. The current implementation allocates a separate
AtomicBoolean instance for every segment, leading to billions of redundant
objects at scale. Each AtomicBoolean object consumes 16 bytes. For 1 billion
data entries, an extra 15 GB (1 billion * 16 bytes) of memory must be
allocated. By switching to a primitive boolean, we eliminate this allocation
cost entirely. Although free() now uses synchronized, it's a low-frequency,
lifecycle-ending operation, making this trade-off highly beneficial for overall
construction throughput. Let me know if you have further concerns!
**Each AtomicBoolean object consumes 16 bytes. For 1 billion data entries,
an extra 15 GB (1 billion * 16 bytes) of memory**
Such a large memory space is saved, oh, Great~
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]