Dmitry Konstantinov created CASSANDRA-20052:
-----------------------------------------------
Summary: Size of CQL messages is not limited in V5 protocol logic
Key: CASSANDRA-20052
URL: https://issues.apache.org/jira/browse/CASSANDRA-20052
Project: Cassandra
Issue Type: Bug
Components: Messaging/Client
Reporter: Dmitry Konstantinov
Size of CQL messages is not limited in V5 protocol logic
- After introducing of v5 frames we do not have any CQL message limit anymore,
native_transport_max_frame_size_in_mb which had such limit in pre-V5 epoch is
applicable now only to pre-V5 protocol sessions, otherwise it is applied only
to the initial STARTUP/OPTIONS messages handling, it is not checked in any v5
logic. So, currently a v5 CQL message of any size can be sent to Cassandra
server.
- The overload logic just allows to process huge messages for free to avoid
starvation, so it does not provide any protection against the most dangerous
requests from a memory pressure point of view.
- The situation even more dangerous: the v5 framing logic is enabled just
after AUTH response, so we do not limit message size even for AUTH_RESPONSE
messages from a client. It can be used as a DoS attack: a non-authenticated
client can send a huge username/password to Cassandra server to cause troubles
with GC or even kill it.
An easy example:
{code:java}
public class TestBigAuthRequest {
public static void main(String[] args) {
String password = getString(500_000_000, '-');
try (CqlSession session = CqlSession.builder()
.addContactEndPoint(new DefaultEndPoint(new
InetSocketAddress("localhost", 9042)))
.withAuthCredentials("cassandra", password)
.withLocalDatacenter("datacenter1")
.build()) {
session.execute("select * from system.local");
}
}
private static String getString(int length, char charToFill) {
if (length > 0) {
char[] array = new char[length];
Arrays.fill(array, charToFill);
return new String(array);
}
return "";
}
}
{code}
A thread stack of such invocation (captured to show the execution flow):
{code:java}
"nioEventLoopGroup-5-21@9164" prio=10 tid=0x86 nid=NA runnable
java.lang.Thread.State: RUNNABLE
at
org.apache.cassandra.transport.messages.AuthResponse$1.decode(AuthResponse.java:45)
at
org.apache.cassandra.transport.messages.AuthResponse$1.decode(AuthResponse.java:39)
at
org.apache.cassandra.transport.Message$Decoder.decodeMessage(Message.java:432)
at
org.apache.cassandra.transport.Message$Decoder$RequestDecoder.decode(Message.java:467)
at
org.apache.cassandra.transport.Message$Decoder$RequestDecoder.decode(Message.java:459)
at
org.apache.cassandra.transport.CQLMessageHandler.processRequest(CQLMessageHandler.java:377)
at
org.apache.cassandra.transport.CQLMessageHandler$LargeMessage.onComplete(CQLMessageHandler.java:755)
at
org.apache.cassandra.net.AbstractMessageHandler$LargeMessage.supply(AbstractMessageHandler.java:561)
at
org.apache.cassandra.net.AbstractMessageHandler.processSubsequentFrameOfLargeMessage(AbstractMessageHandler.java:257)
at
org.apache.cassandra.net.AbstractMessageHandler.processIntactFrame(AbstractMessageHandler.java:229)
at
org.apache.cassandra.net.AbstractMessageHandler.process(AbstractMessageHandler.java:216)
at
org.apache.cassandra.transport.CQLMessageHandler.process(CQLMessageHandler.java:147)
at
org.apache.cassandra.net.FrameDecoder.deliver(FrameDecoder.java:330)
at
org.apache.cassandra.net.FrameDecoder.channelRead(FrameDecoder.java:294)
at
org.apache.cassandra.net.FrameDecoder.channelRead(FrameDecoder.java:277)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:829)
{code}
The provided MR contains a fix for the issue which introduces 2 new parameters:
native_transport_max_message_size - to limit any CQL message size
native_transport_max_auth_message_size (default = 128KiB) - to limit auth
response message size
Design questions:
* The current implementation closes a CQL connection if a message is bigger
than the limits. A skip message body logic can be implemented to continue the
connection usage but it is more complicated and error prone.
* The tricky question is the default value for
native_transport_max_message_size,
from one side - we want to have it not more than
min(native_transport_max_request_data_in_flight_per_ip,
native_transport_max_request_data_in_flight) to reduce chances to invoke the
branch of logic when a error handling does not work
from another size - (native_transport_max_request_data_in_flight_per_ip,
native_transport_max_request_data_in_flight) can be too small and break a
backward compatibility for existing deployments where people uses large
messages and small heaps (while it is not a good idea).
Related observations:
1) https://issues.apache.org/jira/browse/CASSANDRA-16886 - Reduce
native_transport_max_frame_size_in_mb (from 256M to 16M)
2) A correspondent logic for Cassandra server internode protocol a message
limit exists and rate limiting parameters are validated to be smaller than a
single message max size:
internode_max_message_size =
min(internode_application_receive_queue_reserve_endpoint_capacity,
internode_application_send_queue_reserve_endpoint_capacity)
internode_application_receive_queue_reserve_endpoint_capacity = 128MiB
internode_application_send_queue_reserve_endpoint_capacity = 128MiB
internode_max_message_size <=
internode_application_receive_queue_reserve_endpoint_capacity
internode_max_message_size <=
internode_application_receive_queue_reserve_global_capacity
internode_max_message_size <=
internode_application_send_queue_reserve_endpoint_capacity
internode_max_message_size <=
internode_application_send_queue_reserve_global_capacity
3) Request types according to CQL specification:
4.1.1. STARTUP, in normal cases should be small
4.1.2. AUTH_RESPONSE, in normal cases should be small
4.1.3. OPTIONS, in normal cases should be small
4.1.4. QUERY, in normal cases should be small
4.1.5. PREPARE, in normal cases should be small
4.1.6. EXECUTE <-- potentially large in case of inserts, max_mutation_size =
commitlog_segment_size / 2; where commitlog_segment_size_in_mb = 32MiB
4.1.7. BATCH <-- potentially large, max_mutation_size = commitlog_segment_size
/ 2; where commitlog_segment_size_in_mb = 32MiB
4.1.8. REGISTER, in normal cases should be small
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]