Sorry to veer off from a vote in a vote thread. @Alex, can you say more about this statement: > " I think I would prefer to not introduce the change I have proposed (the one that would bring back 
non-binary compatibility)." If we have a path that resolves the issue and also maintains full compatibility for this (semi- / reluctantly-accessible) interface, that would seem ideal. Interested to 
learn more about the drawbacks to that approach. Regarding the value of C-19534 I'm happy to attest to the fact that it addresses severe metastable failure modes in clusters under heavy traffic on the 
verge of tipping. Jon Haddad's independent testing validated this as discussed on the ticket as well: https://issues.apache.org/jira/browse/CASSANDRA-19534 Last, @Tommy this is a great catch and I'm glad 
you raised it. Thanks for watching so closely and appreciate you bringing it to everyone's attention. – Scott On Jul 31, 2024, at 1:05 PM, Caleb Rackliffe <calebrackli...@gmail.com> wrote: +1 to 
proceeding with a simple upgrade note in NEWS On Wed, Jul 31, 2024 at 12:50 PM Josh McKenzie < jmcken...@apache.org > wrote: Unfortunately, I can not immediately see a good way to provide the 
critical bugfix of CASSANDRA-19534 , affecting all Cassandra users, without making at least some change in this API. I personally think that this method is very tightly coupled to the implementation to 
expose it via -D. If anyone using it could provide some context about why it is an important part of API, it would be give some useful context. Nobody stepping up to engage on the technical piece of 
this? Unless / until somebody does, Alex' argument holds the most weight as the expert with what's going on IMO. The question we're facing is - when we find a defect that requires a change in a public 
facing API, which of the following 2 is more important: Keeping the API stable Having the defect resolved Obviously this will be case-by-case. What CASSANDRA-19534 addresses: When a node is under 
pressure, hundreds of thousands of requests can show up in the native transport queue, and it looks like it can take way longer to timeout than is configured. ... After stopping the load test altogether, 
it took nearly a minute before the requests were no longer queued. I believe our priority here should be having this defect resolved. On Tue, Jul 30, 2024, at 1:43 PM, Jordan West wrote: I would make the 
case that loss of availability / significant performance issue, regardless of the amount of time it has existed for, is worth fixing on the branches that are widely deployed by the community. Especially 
when weighed against a loosely defined public interface issue. The queuing issue has been a persistent problem (like you said 10 years) and I regularly (approx once every 1-2 weeks) have to tell my 
customers “we either have to wait for Cassandra to clear the queues or do a rolling restart to fix it” both which come at a cost during an incident where a client overloaded the DB and the impact is 
severe or business impacting. Especially for customers doing LWTs or using non-standard RFs which are also more prevalent in my experience than an external implementation of QueryHandler. While not data 
loss, I would argue this is a critical bug and if we did find a data loss issue dormant for 10 years (which has happened in the past) we would fix it as soon as it was found and a patch was made 
available on all actively maintained versions. Jordan On Tue, Jul 30, 2024 at 10:30 Jeff Jirsa < jji...@gmail.com > wrote: It’s a 10 year old flaw in an 18 month old branch. Why does it need to go 
into 4.1, it’s not a regression and it clearly breaks compatibility? On Jul 30, 2024, at 8:52 AM, Jon Haddad < j...@jonhaddad.com > wrote: This patch fixes a long standing issue that's the root 
cause of availability failures. Even though folks can specify a custom query handler with the -D flag, the number of users impacted by this is going to be incredibly small. On the other hand, the fix 
helps every single user of 4.1 that puts too much pressure on the cluster, which happens fairly regularly. My POV is that it's a fairly weak argument that this is a public interface, but I don't consider 
it worth debating whether it is or not, because even if it is, this improves stability of the database for all users, so it's worth going in. Let's not be dogmatic about fixes that help 99% of users 
because an incredibly small number that actually implement a custom query handler will need to make a trivial update in order to use the latest 4.1.6 dependency. Jon On Tue, Jul 30, 2024 at 8:09 AM J. D. 
Jordan < jeremiah.jor...@gmail.com > wrote: Given we allow a pluggable query handler implementation to be specified for the server with a -D during startup. So I would consider the query handler 
one of our public interfaces. On Jul 30, 2024, at 9:35 AM, Alex Petrov < al...@coffeenco.de > wrote: Hi Tommy, Thank you for spotting this and bringing this to community's attention. I believe our 
primary interfaces are native and internode protocol, and CLI tools. Most interfaces are used to to abstract implementations internally. Few interfaces, such as DataType, Partitioner, and Triggers can be 
depended upon by external tools using Cassandra as a library. There is no official way to plug in a QueryHandler, so I did not consider it to be a part of our public API. From [1]: > These 
considerations are especially important for public APIs, including CQL, virtual tables, JMX, yaml, system properties, etc. Any planned additions must be carefully considered in the context of any 
existing APIs. Where possible the approach of any existing API should be followed. Maybe we should have an exhaustive list of public APIs, and explicitly mention that native and internode protocols are 
included, alongside with nodetool command API and output, but also which classes/interfaces specifically should be evolved with care. Thank you, --Alex [1] 
https://cassandra.apache.org/_/development/index.html On Tue, Jul 30, 2024, at 10:56 AM, Tommy Stendahl via dev wrote: Hi, There is a change in the QueryHandler interface introduced by 
https://issues.apache.org/jira/browse/CASSANDRA-19534 Do we allow changes such changes between 4.1.5 and 4.1.6? CASSANDRA-19534 looks like a very good change so maybe there is an exception in this case? 
/Tommy -----Original Message----- From : Brandon Williams < brandonwilli...@apache.org > Reply-To : dev@cassandra.apache.org To : dev < dev@cassandra.apache.org > Subject : [VOTE] Release 
Apache Cassandra 4.1.6 Date : Mon, 29 Jul 2024 09:36:04 -0500 Proposing the test build of Cassandra 4.1.6 for release. sha1: b662744af59f3a3dfbfeb7314e29fecb93abfd80 Git: 
https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fcassandra%2Ftree%2F4.1.6-tentative&data=05%7C02%7Ctommy.stendahl%40ericsson.com%7C30a819344e48491e561908dcafdbddf4%7C92e84cebfbfd47abbe52080c6b87953f%7C0%7C0%7C638578606055937277%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=BWaJmvRTXvrMh%2FFBRzt%2FOost%2Bn6xAkgePP2ObtmTnbY%3D&reserved=0
 Maven Artifacts: 
https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Frepository.apache.org%2Fcontent%2Frepositories%2Forgapachecassandra-1339%2Forg%2Fapache%2Fcassandra%2Fcassandra-all%2F4.1.6%2F&data=05%7C02%7Ctommy.stendahl%40ericsson.com%7C30a819344e48491e561908dcafdbddf4%7C92e84cebfbfd47abbe52080c6b87953f%7C0%7C0%7C638578606055947610%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=2baa1fUTwQqDpPtFAdv%2FFU6sqax3LSkKEm%2FUdbcHsbE%3D&reserved=0
 The Source and Build Artifacts, and the Debian and RPM packages and repositories, are available here: 
https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdist.apache.org%2Frepos%2Fdist%2Fdev%2Fcassandra%2F4.1.6%2F&data=05%7C02%7Ctommy.stendahl%40ericsson.com%7C30a819344e48491e561908dcafdbddf4%7C92e84cebfbfd47abbe52080c6b87953f%7C0%7C0%7C638578606055951106%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=9FUMT0F7c%2B0y7NbvgN9fQrSNgNO2YGfKMwk9ajy2MKA%3D&reserved=0
 The vote will be open for 72 hours (longer if needed). Everyone who has tested the build is invited to vote. Votes by PMC members are considered binding. A vote passes if there are at least three 
binding +1s and no -1's. [1]: CHANGES.txt: 
https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fcassandra%2Fblob%2F4.1.6-tentative%2FCHANGES.txt&data=05%7C02%7Ctommy.stendahl%40ericsson.com%7C30a819344e48491e561908dcafdbddf4%7C92e84cebfbfd47abbe52080c6b87953f%7C0%7C0%7C638578606055954173%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=3u1LazTB3GixsR7MEwxT%2ByqMrnwHjBL72r8Vy0C1HhE%3D&reserved=0
 [2]: NEWS.txt: 
https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fcassandra%2Fblob%2F4.1.6-tentative%2FNEWS.txt&data=05%7C02%7Ctommy.stendahl%40ericsson.com%7C30a819344e48491e561908dcafdbddf4%7C92e84cebfbfd47abbe52080c6b87953f%7C0%7C0%7C638578606055957376%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=4TROx5HB5vJuLTYNoAqMx2A3%2FUUtZ3Edr6aa4JVvHEA%3D&reserved=0
 Kind Regards, Brandon

Reply via email to