"X-store" refers to how data is stored, in almost every case it refers to what logical constructs are grouped together physically on disk. It has nothing to do with whether a database is relational or not.

Cassandra does, in fact, meet the definition of a row store. However, I would like to reiterate that it goes beyond that and also stores all rows for a single partition together on disk. "Row store" alone therefore does not do it justice, which is why I like the term "partitioned row store".

On Mon, Oct 3, 2016 at 12:37 PM, Benedict Elliott Smith <bened...@apache.org> wrote:

> ... and my response can be summed up as "you are not parsing English
> correctly." The word "like" does not mean what you think it means in this
> context. It does not mean "close relative." It is constrained to the
> similarities expressed, and no others. You don't seem to be reading any of
> my responses about this, though, so I'm not sure parsing is your issue.
>
> PostgreSQL has had arrays for years, and all RDBMS (pretty much) avoid
> persisting nulls in exactly the same way C* does - encoding their absence
> in the row header.
>
> I empathise with the recent unsubscriber.
>
> On 3 October 2016 at 15:53, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
>> My original point can be summed up as:
>>
>> Do not define Cassandra in terms of SIMILES & METAPHORS. Such words include
>> "like" and "close relative".
>>
>> For the specifics:
>>
>> Any relational db could (and I'm sure one does!) allow for sparse fields
>> as well. MySQL can be backed by rocksdb now, does that make it not a row
>> store?
>>
>> Let's draw some lines: a relational database is clearly defined.
>>
>> https://en.wikipedia.org/wiki/Edgar_F._Codd
>>
>> Codd's theorem <https://en.wikipedia.org/wiki/Codd%27s_theorem>, a
>> result proven in his seminal work on the relational model, equates the
>> expressive power of relational algebra
>> <https://en.wikipedia.org/wiki/Relational_algebra> and relational
>> calculus <https://en.wikipedia.org/wiki/Relational_calculus> (both of
>> which, lacking recursion, are strictly less powerful than first-order
>> logic <https://en.wikipedia.org/wiki/First-order_logic>).
>>
>> As the relational model started to become fashionable in the early 1980s,
>> Codd fought a sometimes bitter campaign to prevent the term being misused
>> by database vendors who had merely added a relational veneer to older
>> technology. As part of this campaign, he published his 12 rules
>> <https://en.wikipedia.org/wiki/Codd%27s_12_rules> to define what
>> constituted a relational database. This made his position in IBM
>> increasingly difficult, so he left to form his own consulting company with
>> Chris Date and others.
>>
>> Cassandra is not a relational database.
>>
>> I have attempted to illustrate that a "row store" is defined as well.
>> I do not believe Cassandra is a "row store".
>>
>> "Just because it uses log structured storage, sparse fields, and
>> semi-flexible collections doesn't disqualify it from calling it a "row
>> store""
>>
>> What is the definition of "row store"? Is it a logical construct or a
>> physical one?
>>
>> Why isn't MongoDB a "row store"? I can drop a schema on top of Mongo and
>> present it as rows and columns. It seems to pass the litmus test being
>> presented.
>>
>> https://github.com/mongodb/mongo-hadoop/wiki/Hive-Usage
>>
>> On Mon, Oct 3, 2016 at 10:02 AM, Jonathan Haddad <j...@jonhaddad.com>
>> wrote:
>>
>>> Sorry Ed, but you're really stretching here.
>>> A table in Cassandra is structured by a schema with the data for each
>>> row stored together in each data file. Just because it uses log
>>> structured storage, sparse fields, and semi-flexible collections doesn't
>>> disqualify it from calling it a "row store"
>>>
>>> Postgres added flexible storage through hstore, I don't hear anyone
>>> arguing that it needs to be renamed.
>>>
>>> Any relational db could (and I'm sure one does!) allow for sparse fields
>>> as well. MySQL can be backed by rocksdb now, does that make it not a row
>>> store?
>>>
>>> You're arguing that everything is wrong but you're not proposing an
>>> alternative, which is not productive.
>>>
>>> On Mon, Oct 3, 2016 at 9:40 AM Edward Capriolo <edlinuxg...@gmail.com>
>>> wrote:
>>>
>>>> Also, every piece of technical information that describes a row store
>>>>
>>>> http://cs-www.cs.yale.edu/homes/dna/talks/abadi-sigmod08-slides.pdf
>>>> https://en.wikipedia.org/wiki/Column-oriented_DBMS#Row-oriented_systems
>>>>
>>>> depicts it like this:
>>>>
>>>> 001:10,Smith,Joe,40000;
>>>> 002:12,Jones,Mary,50000;
>>>> 003:11,Johnson,Cathy,44000;
>>>> 004:22,Jones,Bob,55000;
>>>>
>>>> They never depict a scenario where the data looks like this on disk:
>>>>
>>>> 001:10,Smith
>>>>
>>>> 001:10,40000;
>>>>
>>>> Which is much closer to how Cassandra *stores* its data.
>>>>
>>>> On Fri, Sep 30, 2016 at 5:12 PM, Benedict Elliott Smith <
>>>> bened...@apache.org> wrote:
>>>>
>>>> Absolutely. A "partitioned row store" is exactly what I would call
>>>> it. As it happens, our README thinks the same, which is fantastic.
>>>>
>>>> I thought I'd take a look at the rest of our cohort, and didn't get far
>>>> before disappointment. HBase literally calls itself a
>>>> "*column-oriented* store" - which is so totally wrong it's
>>>> simultaneously hilarious and tragic.
>>>>
>>>> I guess we can't blame the wider internet for
>>>> misunderstanding/misnaming us poor "wide column stores" if even one of the
>>>> major examples doesn't know what it, itself, is!
>>>>
>>>> On 30 September 2016 at 21:47, Jonathan Haddad <j...@jonhaddad.com>
>>>> wrote:
>>>>
>>>> +1000 to what Benedict says. I usually call it a "partitioned row
>>>> store" which usually needs some extra explanation but is more accurate than
>>>> "column family" or whatever other thrift era terminology people still use.
>>>>
>>>> On Fri, Sep 30, 2016 at 1:53 PM DuyHai Doan <doanduy...@gmail.com>
>>>> wrote:
>>>>
>>>> I used to present Cassandra as a NoSQL datastore with "distributed"
>>>> table. This definition is closer to CQL and has some academic background
>>>> (distributed hash table).
>>>>
>>>> On Fri, Sep 30, 2016 at 7:43 PM, Benedict Elliott Smith <
>>>> bened...@apache.org> wrote:
>>>>
>>>> Cassandra is not a "wide column store" anymore. It has a schema. Only
>>>> thrift users no longer think they have a schema (though they do), and
>>>> thrift is being deprecated.
>>>>
>>>> I really wish everyone would kill the term "wide column store" with
>>>> fire. It seems to have never meant anything beyond "schema-less,
>>>> row-oriented", and a "column store" means literally the opposite of this.
>>>>
>>>> Not only that, but people don't even seem to realise the term "column
>>>> store" existed long before "wide column store" and the latter is often
>>>> abbreviated to the former, as here:
>>>> http://www.planetcassandra.org/what-is-nosql/
>>>>
>>>> Since it no longer applies, let's all agree as a community to forget
>>>> this awful nomenclature ever existed.
>>>>
>>>> On 30 September 2016 at 18:09, Joaquin Casares <
>>>> joaq...@thelastpickle.com> wrote:
>>>>
>>>> Hi Mehdi,
>>>>
>>>> I can help clarify a few things.
>>>>
>>>> As Carlos said, Cassandra is a Wide Column Store.
>>>> Theoretically a row can have 2 billion columns, but in practice it
>>>> shouldn't have more than 100 million columns.
>>>>
>>>> Cassandra partitions data to certain nodes based on the partition
>>>> key(s), but does provide the option of setting zero or more clustering
>>>> keys. Together, the partition key(s) and clustering key(s) form the primary
>>>> key.
>>>>
>>>> When writing to Cassandra, you will need to provide the full primary
>>>> key. When reading from Cassandra, however, you only need to provide the
>>>> full partition key.
>>>>
>>>> When you only provide the partition key for a read operation, you're
>>>> able to return all columns that exist on that partition with low latency.
>>>> These columns are displayed as "CQL rows" to make it easier to reason
>>>> about.
>>>>
>>>> Consider the schema:
>>>>
>>>> CREATE TABLE foo (
>>>>   bar uuid,
>>>>   boz uuid,
>>>>   baz timeuuid,
>>>>   data1 text,
>>>>   data2 text,
>>>>   PRIMARY KEY ((bar, boz), baz)
>>>> );
>>>>
>>>> When you write to Cassandra you will need to send bar, boz, and baz and
>>>> optionally data*, if it's relevant for that CQL row. If you choose not to
>>>> define a data* field for a particular CQL row, then nothing is stored nor
>>>> allocated on disk. But I wouldn't consider that caveat to be "schema-less".
>>>>
>>>> However, all writes to the same bar/boz will end up on the same
>>>> Cassandra replica set (a configurable number of nodes) and be stored on the
>>>> same place(s) on disk within the SSTable(s). And on disk, each field that's
>>>> not a partition key is stored as a column, including clustering keys (this
>>>> is optimized in Cassandra 3+, but now we're getting deep into internals).
>>>>
>>>> In this way you can get fast responses for all activity for bar/boz
>>>> either over time, or for a specific time, with roughly the same number of
>>>> disk seeks, with varying lengths on the disk scans.
>>>>
>>>> Hope that helps!
>>>>
>>>> Joaquin Casares
>>>> Consultant
>>>> Austin, TX
>>>>
>>>> Apache Cassandra Consulting
>>>> http://www.thelastpickle.com
>>>>
>>>> On Fri, Sep 30, 2016 at 11:40 AM, Carlos Alonso <i...@mrcalonso.com>
>>>> wrote:
>>>>
>>>> Cassandra is a Wide Column Store
>>>> http://db-engines.com/en/system/Cassandra
>>>>
>>>> Carlos Alonso | Software Engineer | @calonso
>>>> <https://twitter.com/calonso>
>>>>
>>>> On 30 September 2016 at 18:24, Mehdi Bada <mehdi.b...@dbi-services.com>
>>>> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I have a theoretical question:
>>>> - Is Apache Cassandra really a column store?
>>>> A column store means storing the data as columns rather than as rows.
>>>>
>>>> In fact, C* stores the data as rows, and the data is partitioned by row key.
>>>>
>>>> Finally, for me, Cassandra is a row-oriented, schema-less DBMS.... Is that
>>>> true for you also?
>>>>
>>>> Many thanks in advance for your reply
>>>>
>>>> Best Regards
>>>> Mehdi Bada
>>>> ----
>>>>
>>>> *Mehdi Bada* | Consultant
>>>> Phone: +41 32 422 96 00 | Mobile: +41 79 928 75 48 | Fax: +41 32 422
>>>> 96 15
>>>> dbi services, Rue de la Jeunesse 2, CH-2800 Delémont
>>>> mehdi.b...@dbi-services.com
>>>> www.dbi-services.com