On Wed, Oct 13, 2021 at 3:52 AM Henrik Ingo <henrik.i...@datastax.com> wrote:
> Aren't you actually pointing out a limitation in any "single shot"
> transactional algorithm? Including Accord itself, without any interactive
> part?
>
> What you are saying is that an Accord transaction is limited by the need
> for both the client, and coordinator, to be able to keep the entire
> transaction in memory and process it?

I also believe that any single-shot transaction protocol would require
some limits on transaction size and/or duration, and that those limits
would then be imposed on SQL in a way that users coming from a standard
RDBMS (e.g. Postgres) wouldn't expect.  The closest I've seen databases
come to lifting those limits is a distributed layer in the database
that serves as an in-memory lock manager.  Both Spanner and leanXcale
maintain locks in memory in the database while clients execute
transactions, which allows a much higher limit on what one can do in a
transaction, but still leaves some complexity to manage in making sure
that clients can't drive servers out of memory.

One could just state that the particular SQL implementation *is*
limited to whatever the constraints of the single-shot transaction
protocol are, and deliver clear documentation of what those limits are
to users, along with being loud about the fact that there are limits.
This has gone okay in non-SQL systems.  My personal experience
in this subject comes from FoundationDB, which offers a rather
conservative 5 second transaction duration limit and 10MB transaction
size limit.  With a raw key-value API and a database specifically
geared towards supporting OLTP workloads, this works out in most
situations, as users already need to write their transactions from
scratch using the database's documentation.  OLTP is characterized by
short and small transactions, and so things tend to align anyway.
Some users still tried to implement workloads which weren't strictly
OLTP, and ran into problems.  Offering SQL carries with it a set of
expectations for supported workloads, and I can't think of a concrete
example of a SQL system with strict and conservative limits on
queries.  My only notes of wisdom
here come from an ex-AWS person I once spoke to, who maintained a
system with partial SQL support, and commented that it was a mistake
due to the support load and customer confusion (but that was more
about a restricted SQL feature set than transaction limitations).
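
To make the kind of limits described above concrete, here is a minimal
Python sketch of a client-side write buffer that enforces a size cap
and a duration cap.  Everything in it is an assumption for
illustration: the names (BoundedTransaction, TransactionTooLarge,
TransactionTooOld) are hypothetical and this is not FoundationDB's
API.  Only the default values of 5 seconds and 10MB mirror the figures
mentioned above.

```python
import time


class TransactionTooLarge(Exception):
    """Raised when buffered writes would exceed the size limit."""


class TransactionTooOld(Exception):
    """Raised when the transaction has outlived the duration limit."""


class BoundedTransaction:
    """Hypothetical write buffer with a size cap and a duration cap."""

    def __init__(self, max_bytes=10_000_000, max_seconds=5.0,
                 clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self._clock = clock          # injectable so the limit is testable
        self._started = clock()
        self._writes = {}
        self.size_bytes = 0

    def _check_age(self):
        if self._clock() - self._started > self.max_seconds:
            raise TransactionTooOld("transaction exceeded duration limit")

    def set(self, key: bytes, value: bytes) -> None:
        self._check_age()
        old = self._writes.get(key)
        if old is None:
            delta = len(key) + len(value)
        else:
            # Overwrite: the key is already counted, only the value changes.
            delta = len(value) - len(old)
        if self.size_bytes + delta > self.max_bytes:
            raise TransactionTooLarge("transaction exceeded size limit")
        self._writes[key] = value
        self.size_bytes += delta

    def commit(self) -> dict:
        # A real system would submit the buffered writes in a single shot;
        # this sketch just hands them back to the caller.
        self._check_age()
        return dict(self._writes)
```

The point of the sketch is only that both limits are cheap to check on
the client and fail loudly, which matches the "be loud about the
limits" approach rather than letting a server discover the problem.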

That's not to say that single-shot transaction algorithms aren't
useful, even in the context of SQL.  CockroachDB uses a 3-phase
transaction protocol, which is reduced to only 1 phase when it's a
single-partition transaction and Raft may perform the atomic
commitment on its own.  A 1-RTT transaction protocol would allow one to
extend that optimized 1-phase protocol to a handful of partitions.
Instead of only supporting 1 phase execution of a point insert, one
could support 1 phase execution of point-ish queries, such as an
insert into a table along with a handful of indexes on that table.  I
think there would still need to be a way to degrade into some other
transaction protocol to support extremely large or long-running
queries, but any single-shot multi-partition transaction protocol
(Accord or otherwise) would likely offer ways to optimize your slow
path transaction protocol.  Maybe it's not really surprising though
that protocols designed for "let me transact my entire database at
once" versus "let me transact a few related keys together" turn out to
be relatively different sorts of protocols...
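
As a rough illustration of the tiering described above, the following
Python sketch picks a commit path from the shape of a transaction.  It
is not how CockroachDB or Accord actually decide anything; the names
(Protocol, TxnShape, choose_protocol) and both thresholds are
hypothetical values made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Protocol(Enum):
    SINGLE_PARTITION_ONE_PHASE = "single-partition 1-phase"
    SINGLE_SHOT_ONE_RTT = "multi-partition single-shot"
    MULTI_PHASE_FALLBACK = "multi-phase slow path"


@dataclass(frozen=True)
class TxnShape:
    partitions: int      # number of partitions the transaction touches
    payload_bytes: int   # total size of reads and writes
    long_running: bool   # e.g. the client holds the transaction open


# Hypothetical limits; a real system would have to document its own.
MAX_SINGLE_SHOT_PARTITIONS = 8
MAX_SINGLE_SHOT_BYTES = 1_000_000


def choose_protocol(txn: TxnShape) -> Protocol:
    # Anything too large or too long-lived degrades to the slow path.
    if (txn.long_running
            or txn.payload_bytes > MAX_SINGLE_SHOT_BYTES
            or txn.partitions > MAX_SINGLE_SHOT_PARTITIONS):
        return Protocol.MULTI_PHASE_FALLBACK
    # One partition: its replication group can commit on its own.
    if txn.partitions <= 1:
        return Protocol.SINGLE_PARTITION_ONE_PHASE
    # A handful of partitions: the point-ish case, e.g. a row plus indexes.
    return Protocol.SINGLE_SHOT_ONE_RTT
```

For example, an insert that touches a table and three of its indexes
would be TxnShape(partitions=4, payload_bytes=2_000,
long_running=False) and would take the single-shot path under these
made-up thresholds.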

On Wed, Oct 13, 2021 at 3:52 AM Henrik Ingo <henrik.i...@datastax.com> wrote:
> I responded to Blake's similar comment on this topic. Out of respect for
> his request to move the discussion to a newly created thread, I will not
> elaborate here rather just reference my reply to Blake.

Oh!  I missed the new thread.  Thanks!  More transaction processing~~!
