KEYSPACE isn’t a terrible name for a namespace that also configures how keys are replicated. NAMESPACE is accurate but not comprehensive. DATABASE doesn’t seem to have the advantages of either.
I’m neutral on NAMESPACE and slightly -1 on DATABASE. It’s hard for me to believe KEYSPACE is really a stumbling block for new users, especially when it connotes something those users should understand about keyspaces (the replication configuration).
On Apr 5, 2023, at 4:16 AM, Aleksey Yeshchenko <alek...@apple.com> wrote:
FYI we support SCHEMA as an alias to KEYSPACE today (and always have). You can use CREATE SCHEMA in place of CREATE KEYSPACE, etc.
On 4 Apr 2023, at 19:23, Henrik Ingo <henrik.i...@datastax.com> wrote:
I find the Postgres terminology overly complex. Where most SQL databases will have several *databases*, each containing several *tables*, in Postgres we have namespaces, databases, schemas and tables...
Oracle seems to also use the words database, schema and tables. I don't know if it has namespaces.
Ah, ok, so SQL Server actually is like Oracle too!
So in MySQL, referring unambiguously (aka full path) to a table would be:
SELECT * FROM mydb.mytable;
Whereas in PostgreSQL, Oracle and SQL Server you'd have to write:
SELECT * FROM mydb.myschema.mytable; /* And I don't even know what to do with the namespace! */
The Microsoft docs perhaps best explain the role of each: The Database contains the configuration of physical things, like where on disk the database is stored. The Schema, on the other hand, contains "logical" objects like tables, views and procedures.
MongoDB has databases and collections. As an easter egg / inside joke, it also supports the command `SHOW TABLES` as a synonym for collections.
Personally I would be in favor of introducing `DATABASE` as a synonym for KEYSPACE. The latter could remain the "official" usage.
henrik
While for someone who already knows Cassandra keyspace is something natural, for newcomers it is yet another concept to understand.
If namespace is used in PostgreSQL, it sounds even better to me.
Thanks,
- - -- --- ----- -------- -------------
Jacek Lewandowski
My 2 cents:
Keeping it keyspace works for me; namespace could be cool also, since we decide where that namespace exists in relation to datacenters, etc. In our case, a keyspace is more similar to a namespace than it is to a database, since we expect all the UDTs, UDFs, and indexes to refer only to the tables in that keyspace/namespace.
Alternatively, to throw some fuel into the discussion, it is interesting to look at Postgres (only because there are many distributed databases that are now PG compliant). From the interwebs: In PostgreSQL, a schema is a namespace that contains named database objects such as tables, views, indexes, data types, functions, stored procedures and operators. A database can contain one or multiple schemas and each schema belongs to only one database.
I used to gripe about this, but as a platform gets more complex it is useful to organize PG DBs into schemas. In the C* world, I found myself doing similar things by using a prefix, e.g. appprefix_system1 appprefix_system2 ...
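As a rough sketch of that prefix convention in CQL (the replication strategy, the datacenter name `dc1`, and the replication factor are illustrative assumptions, not taken from any real cluster):

```
-- Hypothetical example: keyspaces grouped by a shared application prefix.
-- 'dc1' and the replication factor of 3 are placeholders.
CREATE KEYSPACE IF NOT EXISTS appprefix_system1
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

CREATE KEYSPACE IF NOT EXISTS appprefix_system2
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};
```

The prefix only groups keyspaces by name; each keyspace still carries its own replication settings.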
Rahul Singh
KEYSPACE at least makes sense in the context that it is the unit that defines how those partition keys are going to be treated/replicated.
DATABASE may be ambiguous, but it's an ambiguity shared across the industry.
Creating a new name like TABLESPACE or TABLEGROUP sounds horrible because it'll be unique to us in the world, and therefore unintuitive for new users.
I think there are competing dynamics here.
1) KEYSPACE isn't that great of a name; it's not a space in which keys are necessarily unique, and you can't address things just by key w/out their respective tables
2) DATABASE isn't that great of a name either due to the aforementioned ambiguity.
Something like "TABLESPACE" or "TABLEGROUP" would theoretically better satisfy points 1 and 2 above, but subjectively I kind of recoil at both equally. So there's that.
On Tue, Apr 4, 2023, at 12:30 PM, Abe Ratnofsky wrote:
I agree with Bowen - I find Keyspace easier to communicate with. There are plenty of situations where the use of "database" is ambiguous (like "Could you help me connect to database x?"), but Keyspace refers to a single thing. I think more software is moving towards calling these things "namespaces" (like Kubernetes), and while "Keyspaces" is not a term used in this way elsewhere, I still find it leads to clearer communication.
--
Abe
I think supporting DATABASE is a great idea.
It's better aligned with SQL databases, and can save new users one of the first troubles they find.
Probably anyone starting to use Cassandra for the first time is going to face the "what is a keyspace?" question in the first minutes. Sparing users that question by offering a more common name would be a victory for usability IMO.
Hi,
I'd like to propose that we add DATABASE to the CQL grammar as an alternative to KEYSPACE.
Background: While TABLE was introduced as an alternative for COLUMNFAMILY in the grammar, we have kept KEYSPACE as the name for the container of a group of tables. Nearly all traditional SQL databases use DATABASE as the container name for a group of tables, so it would make sense for Cassandra to adopt this naming as well.
KEYSPACE would be kept in the grammar but we would update some logging and documentation to encourage use of the new name.
Mike Adamson