AFAIK, Cassandra will not process schema changes in parallel. However,
by sending requests in parallel, you can minimise the time Cassandra
stays idle while the client is waiting for schema agreement after each
CREATE KEYSPACE statement.
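
As a rough illustration of the client side, here is a minimal Python sketch using only the standard library. It is not a tested restore tool. The names `create_keyspaces_in_parallel`, `execute` and `max_workers` are made up for this example, and `execute` stands in for whatever call your driver uses to run one CQL statement against the single remaining node.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def create_keyspaces_in_parallel(execute, statements, max_workers=8):
    """Submit every CQL statement through `execute` from a pool of threads.

    `execute` is a hypothetical callable that takes one CQL string and
    blocks until the statement has completed (including any wait for
    schema agreement that the driver performs). It is assumed to be
    thread-safe and to always talk to the same single Cassandra node.

    Returns (succeeded, failed): a list of statements that completed and
    a dict mapping each failed statement to the exception it raised.
    """
    succeeded = []
    failed = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its statement so failures can be reported.
        futures = {pool.submit(execute, cql): cql for cql in statements}
        for future in as_completed(futures):
            cql = futures[future]
            try:
                future.result()
                succeeded.append(cql)
            except Exception as exc:
                # Collect the error instead of aborting the whole batch,
                # so the failed statements can be retried afterwards.
                failed[cql] = exc
    return succeeded, failed
```

With the DataStax Python driver, `execute` could be the `execute` method of a session that is restricted to one contact point (for example via a whitelist load balancing policy), but that wiring is an assumption on my part and depends on your client. The server will still apply the schema changes one at a time; the pool only ensures there is always another statement queued while an earlier caller is waiting.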
On 09/03/2022 20:46, Leon Zaruvinsky wrote:
Hi Bowen,
Haha, agree with you on wanting fewer keyspaces but unfortunately
we're kind of locked in to our architecture for the time being.
We do part of what you're saying, in that we shut down all but one
node and then run CREATE against that single node. But we do that
serially, O(keyspaces). If we were to submit the CREATE statements in
parallel, is your claim that Cassandra would process these in parallel
as well?
Thanks,
Leon
On Wed, Mar 9, 2022 at 12:46 PM Bowen Song <bo...@bso.ng> wrote:
First of all, you really shouldn't have that many keyspaces. Putting that
aside, the quickest way to create a large number of keyspaces without
causing schema disagreement is to create them in parallel over a
connection pool, with all connections going to the same single
Cassandra node. Because all CREATE KEYSPACE statements are sent to the
same node, you don't need to worry about the schema disagreement they
might otherwise cause, as the server side will internally ensure the
consistency of the schema.
On 09/03/2022 18:35, Leon Zaruvinsky wrote:
> Hey folks,
>
> A step in our Cassandra restore process is to re-create every keyspace
> that existed in the backup in a brand new cluster. Because these
> creations are sequential, and because we have _a lot_ of keyspaces,
> this ends up being the slowest part of our restore. We already have
> some optimizations in place to speed up schema agreement after each
> create, but even so we'd like to get the time down significantly more.
>
> I was curious if anyone has any guidance or has experimented with ways
> of creating keyspaces that are faster than a bunch of CREATE calls.
> It's fine for the cluster to be offline during the process.
>
> Thanks,
> Leon