[
https://issues.apache.org/jira/browse/CASSANDRA-19270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18048637#comment-18048637
]
Nadav Har'El edited comment on CASSANDRA-19270 at 1/1/26 9:53 AM:
------------------------------------------------------------------
I checked again on Cassandra 5.0.6, and the problem still exists.
You're right that the unit test InsertInvalidateSizedRecordsTest.java contains the check
{code:java}
// single column is too large
Assertions.assertThatThrownBy(() -> executeNet("INSERT INTO %s (a, b) VALUES (?, ?)", MEDIUM_BLOB, LARGE_BLOB))
          .hasRootCauseInstanceOf(InvalidQueryException.class)
          .hasRootCauseMessage("Key length of " + (MEDIUM_BLOB.remaining() + LARGE_BLOB.remaining()) + " is longer than maximum of 65535");
{code}
This supposedly verifies that an attempt to use an oversized component in a
compound partition key produces the correct error. But when I run exactly the
same query manually, using the Python driver (as in the example above), it
doesn't actually work. Instead, I get the error:
{code}
cassandra.cluster.NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 127.18.233.206:9042 datacenter1>: error("'H' format requires 0 <= number <= 65535")})
{code}
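Incidentally, that "'H' format" text is exactly the message Python's {{struct}} module raises when a value doesn't fit an unsigned 16-bit field, which suggests the failure occurs while packing the 2-byte length prefix of a composite-key component. A minimal illustration of the mechanism (the {{encode_component}} helper is hypothetical, not actual driver or server code):

```python
import struct

def encode_component(component: bytes) -> bytes:
    # A composite partition-key component is serialized with a 2-byte
    # unsigned length prefix ('H'), so its length must fit in 16 bits.
    return struct.pack(">H", len(component)) + component

encode_component(b"x" * 100)        # length fits in 16 bits, fine
try:
    encode_component(b"x" * 70000)  # 70000 > 65535
except struct.error as e:
    print(e)                        # 'H' format requires 0 <= number <= 65535
```

This would explain why the oversized component surfaces as a low-level serialization error rather than as a proper InvalidRequest.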
Are we sure the unit test is running? :) If it is, I wonder whether the Python
driver does something unexpected in this case that then confuses Cassandra. In
any case, Cassandra should report an InvalidRequest, not a "NoHostAvailable"
(which implies the problem was detected on the replica, not the coordinator,
and leads the user to think the same query should be retried).
> Incorrect error type on oversized compound partition key
> --------------------------------------------------------
>
> Key: CASSANDRA-19270
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19270
> Project: Apache Cassandra
> Issue Type: Bug
> Reporter: Nadav Har'El
> Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
>
> Cassandra limits key lengths (partition and clustering) to 64 KB. If a user
> attempts to INSERT data with a partition key or clustering key exceeding that
> size, the result is a clear InvalidRequest error with a message like "{{Key
> length of 66560 is longer than maximum of 65535}}".
> There is one exception: If you have a *compound* partition key (i.e., two or
> more partition key components) and attempt to write one of them larger than
> 64 KB, then instead of the orderly InvalidRequest you get with a single
> component, you get a NoHostAvailable with the message
> "{{error("'H' format requires 0 <= number <= 65535")}}". This is not only
> uglier, it can also confuse the Cassandra driver into retrying the request,
> because the driver doesn't realize that the request itself is broken and
> there is no point in repeating it.
> Interestingly, if there are multiple clustering key columns, this problem
> doesn't happen: we still get a nice InvalidRequest if any one of these is
> more than 64 KB.