[
https://issues.apache.org/jira/browse/CASSANDRA-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18042601#comment-18042601
]
Dmitry Konstantinov edited comment on CASSANDRA-20166 at 12/3/25 6:56 PM:
--------------------------------------------------------------------------
{quote}
What did you run before taking the allocation screenshots ?
Can you provide a comparison of allocations with the patch ?
{quote}
Let me capture clean results and share the details. I have a set of different
optimisations and this is one of them. At a high level, it is the same e2e stress
test as in CASSANDRA-20226.
{quote}
Should we put this into a jmh test (test/microbench) ?
{quote}
I am not sure.
On one hand, it can be quite expensive to implement as an isolated test, taking
into account the level of coupling in the Cassandra code.
On the other hand, I am not sure anyone would run such a test afterwards
:-) to pay off the effort.
I think JMH is more a tool for method/class-level microbenchmarks (like I did
for thread-local metrics); here the scale is a bit different.
> Avoid ByteBuffer allocation during decoding of prepared CQL write requests
> --------------------------------------------------------------------------
>
> Key: CASSANDRA-20166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20166
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: CQL/Interpreter
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 5.x
>
> Attachments: CASSANDRA-20166-trunk_ci_summary.htm,
> CASSANDRA-20166-trunk_results_details.tar.xz, async_profiler_alloc.png,
> image-2024-12-26-17-33-39-031.jpg, image-2024-12-26-17-33-39-031.png,
> image-2024-12-26-17-35-05-485.png
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> A lot of ByteBuffer objects are allocated when we decode CQL queries;
> frequently the space taken by these objects is larger than the actual amount
> of data received.
> There was a similar optimization (use byte[] directly and wrap them into
> ArrayCell instead of BufferCell) done some time ago for the place where a
> Mutation object is deserialized during Cassandra cross-node communication
> or when reading from disk: CASSANDRA-15393
> While a complete replacement of ByteBuffer with byte[] during the CQL decoding
> step looks like a very complex task (ByteBuffer is a part of too many
> entities involved in CQL parsing), we can optimize 20% of the logic to get 80%
> of the benefit by focusing only on batch and modification statements when
> prepared statements are used and cell values are provided as bind variables.
> !image-2024-12-26-17-33-39-031.jpg|width=570!
> For the 10-character String values I used in the test, the wrapping ByteBuffer
> objects are costlier than the inner byte[] holding the data:
>
> !image-2024-12-26-17-35-05-485.png|width=570!
> Async profiler (-e alloc) view:
> !async_profiler_alloc.png|width=570!
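
To illustrate the allocation pattern described in the issue, here is a minimal sketch in plain Java. It is not the Cassandra decoding code and not the patch: the class and method names (BindValueDecoding, encode, readWrapped, readArray) are hypothetical, a java.nio.ByteBuffer stands in for the incoming network frame, and a simple [int length][bytes] layout is assumed for a bind value. Null or unset values (negative lengths) are not handled. The point is only that the wrapped variant allocates two objects per value (the byte[] plus the ByteBuffer wrapper), while the array variant allocates one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; names and layout are assumptions.
final class BindValueDecoding {

    // Builds a tiny frame holding one value as [int length][bytes].
    static ByteBuffer encode(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer frame = ByteBuffer.allocate(Integer.BYTES + bytes.length);
        frame.putInt(bytes.length);
        frame.put(bytes);
        frame.flip(); // switch the frame from writing to reading
        return frame;
    }

    // Variant 1: copy the value into a byte[] and wrap it.
    // Two allocations per value: the array and the ByteBuffer wrapper.
    static ByteBuffer readWrapped(ByteBuffer in) {
        int length = in.getInt();
        byte[] data = new byte[length];
        in.get(data);
        return ByteBuffer.wrap(data);
    }

    // Variant 2: copy the value into a byte[] and return it directly.
    // One allocation per value: only the array.
    static byte[] readArray(ByteBuffer in) {
        int length = in.getInt();
        byte[] data = new byte[length];
        in.get(data);
        return data;
    }

    public static void main(String[] args) {
        ByteBuffer frame = encode("0123456789");

        // duplicate() gives each reader its own position over the same bytes.
        ByteBuffer wrapped = readWrapped(frame.duplicate());
        byte[] raw = readArray(frame.duplicate());

        System.out.println("wrapped bytes: " + wrapped.remaining());
        System.out.println("raw bytes: " + raw.length);
        System.out.println("same content: " + Arrays.equals(raw, wrapped.array()));
    }
}
```

The sketch does not measure object sizes; the actual per-object overhead depends on the JVM version and settings, and the screenshots attached to the issue show the numbers observed in the test.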
--
This message was sent by Atlassian Jira
(v8.20.10#820010)