On Wed, Oct 13, 2021 at 3:52 AM Henrik Ingo <henrik.i...@datastax.com> wrote:
> Aren't you actually pointing out a limitation in any "single shot"
> transactional algorithm? Including Accord itself, without any interactive
> part?
>
> What you are saying is that an Accord transaction is limited by the need
> for both the client, and coordinator, to be able to keep the entire
> transaction in memory and process it?

I also believe that any single-shot transaction protocol would require some limits on transaction size and/or duration, and those limits would then be imposed on SQL in a way that users coming from a standard RDBMS (e.g. Postgres) wouldn't expect.

The closest I've seen databases get to sidestepping this is a distributed layer in the database that serves as an in-memory lock manager. Both Spanner and leanXcale maintain locks in memory in the database while clients execute transactions, which allows a much higher limit on what one can do in a transaction, but still leaves some complexity to manage in making sure that clients can't drive servers out of memory.

One could just state that the particular SQL implementation *is* limited to whatever the constraints of the single-shot transaction protocol are, deliver clear documentation of those limits to users, and be loud about the fact that there are limits. This has gone okay in other, non-SQL systems. My personal experience in this subject comes from FoundationDB, which offers a rather conservative 5-second transaction duration limit and 10MB transaction size limit. When presenting a raw key-value API and a database specifically geared towards supporting OLTP workloads, it works out in most situations, as users already need to write their transactions from scratch using the database's documentation. OLTP is characterized by short and small transactions, and so things tend to align anyway. Some users still tried to implement workloads which weren't strictly OLTP, and ran into problems.

Offering SQL carries with it a set of expectations for supported workloads, and I can't think of a concrete example of a SQL system with strict and conservative limits on queries.
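
To make the "limits on size and/or duration" point concrete, here is a minimal sketch of a client-side write buffer that refuses to grow past a byte budget or live past a deadline. Everything in it is hypothetical (the class name, the exception names, and the way bytes are counted); the defaults only mirror the 5 second / 10MB figures quoted above, and this is not how FoundationDB or Accord actually accounts for transaction size.

```python
import time


class TransactionTooLarge(Exception):
    """Raised when buffered writes would exceed the byte budget."""


class TransactionTooOld(Exception):
    """Raised when the transaction has outlived its duration limit."""


class BoundedTransaction:
    """Hypothetical single-shot transaction buffer with hard limits.

    Size is counted as len(key) + len(value) per buffered write, which is
    an assumption made for illustration only.
    """

    def __init__(self, max_bytes=10_000_000, max_seconds=5.0,
                 clock=time.monotonic):
        self._max_bytes = max_bytes
        self._max_seconds = max_seconds
        self._clock = clock          # injectable so tests can fake time
        self._started = clock()
        self._writes = {}
        self._size = 0

    @property
    def size(self):
        return self._size

    def _check_age(self):
        if self._clock() - self._started > self._max_seconds:
            raise TransactionTooOld("transaction exceeded duration limit")

    def set(self, key: bytes, value: bytes):
        self._check_age()
        new_size = self._size + len(key) + len(value)
        old = self._writes.get(key)
        if old is not None:
            # Overwriting a key replaces its earlier contribution.
            new_size -= len(key) + len(old)
        if new_size > self._max_bytes:
            # Reject before mutating, so the buffer stays consistent.
            raise TransactionTooLarge("transaction exceeded size limit")
        self._writes[key] = value
        self._size = new_size

    def commit(self):
        # The whole write set is handed over in one shot, which is why
        # it has to fit in memory on both client and coordinator.
        self._check_age()
        return dict(self._writes)
```

A caller that hits either exception has to split the work into several smaller transactions itself, which is exactly the kind of restructuring that someone writing a single large SQL statement would not expect to do.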

My only notes of wisdom here come from an ex-AWS person I once spoke to, who maintained a system with partial SQL support and commented that it was a mistake due to the support load and customer confusion (but that was more about a restricted SQL feature set than transaction limitations).

That's not to say that single-shot transaction algorithms aren't useful, even in the context of SQL. CockroachDB uses a 3-phase transaction protocol, which is reduced to only 1 phase when it's a single-partition transaction and Raft may perform the atomic commitment on its own. A 1-RTT transaction protocol would allow one to extend that optimized 1-phase protocol to a handful of partitions. Instead of only supporting 1-phase execution of a point insert, one could support 1-phase execution of point-ish queries, such as an insert into a table along with a handful of indexes on that table.

I think there would still need to be a way to degrade into some other transaction protocol to support extremely large or long-running queries, but any single-shot multi-partition transaction protocol (Accord or otherwise) would likely offer ways to optimize your slow-path transaction protocol. Maybe it's not really surprising, though, that protocols designed for "let me transact my entire database at once" versus "let me transact a few related keys together" turn out to be relatively different sorts of protocols...

On Wed, Oct 13, 2021 at 3:52 AM Henrik Ingo <henrik.i...@datastax.com> wrote:
> I responded to Blake's similar comment on this topic. Out of respect for
> his request to move the discussion to a newly created thread, I will not
> elaborate here rather just reference my reply to Blake.

Oh! I missed the new thread. Thanks! More transaction processing~~!