Re: Performance problem with large wide row inserts using CQL

Mohit Anchlia Thu, 20 Feb 2014 16:45:07 -0800

On Thu, Feb 20, 2014 at 4:37 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:


> Recommendations in Cassandra have a shelf life of about 1 to 2 years. If
> you try to assert a recommendation from a year ago you stand a solid chance of
> someone telling you there is now a better way.
>
> Cassandra once loved being a schemaless datastore. Imagine that?
>


> >> I agree with that. I also think that using CQL hides the basics of how
> Cassandra really stores the columns underneath, and many of its
> capabilities. People often confuse CQL with SQL and do not completely
> understand that the purpose of CQL is simply to make it easy for users to
> understand and use Cassandra. Some things, like INSERT, don't make
> sense from a DB standpoint, since everything in Cassandra is essentially a
> MUTATION.
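
To illustrate the point that INSERT is really just a mutation, here is a minimal Python sketch (not Cassandra code). It assumes a cell is a (timestamp, value) pair and that a write with a newer timestamp replaces the older cell. The names `apply_mutation` and `row` are made up for illustration, and tie-breaking on equal timestamps is not modeled.

```python
def apply_mutation(row, column, value, timestamp):
    """Write a cell; the same code path serves both 'INSERT' and 'UPDATE'."""
    current = row.get(column)
    # Keep the existing cell unless the incoming write is strictly newer.
    if current is None or timestamp > current[0]:
        row[column] = (timestamp, value)
    return row


row = {}
apply_mutation(row, "password", "mother", timestamp=1)  # looks like an INSERT
apply_mutation(row, "password", "father", timestamp=2)  # looks like an UPDATE
apply_mutation(row, "password", "stale", timestamp=1)   # older write, ignored
```

In this toy model there is no separate "row exists" check: a write to a new column and a write to an existing column go through the same function, which is the sense in which both are mutations.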
>


> On Thursday, February 20, 2014, Peter Lin <wool...@gmail.com> wrote:
> >
> > good example Ed.
> >
> > I'm so happy to see other people doing things like this. Even if the
> official DataStax docs recommend not mixing static and dynamic, to me that's
> a huge disservice to Cassandra users.
> >
> > If someone really wants to stick to the relational model, then NewSql is a
> better fit, and it gives users the full power of SQL with subqueries, LIKE,
> and joins. NewSql can't handle these kinds of use cases due to the static
> nature of relational tables, row size limits and column limits.
> >
> >
> >
> > On Thu, Feb 20, 2014 at 6:18 PM, Edward Capriolo <edlinuxg...@gmail.com>
> wrote:
> >
> > CASSANDRA-6561 is interesting, though having statically defined columns
> is not exactly a solution for doing everything in "thrift".
> >
> >
> http://planetcassandra.org/blog/post/poking-around-with-an-idea-ranged-metadata/
> >
> > Before collections or CQL existed I did some of these concepts myself.
> >
> > Say you have a column family named AllMyStuff
> >
> > Columns whose names start with "friends_" would have string names, and
> together they would be a "Map" of friends to age
> >
> > set AllMyStuff[edward][friends_bob]=34
> >
> > set AllMyStuff[edward][friends_sara]=33
> >
> > Column name password could be a string
> >
> > set AllMyStuff[edward][password]='mother'
> >
> > Columns named phone[00] phone[100] would be an array of phone numbers
> >
> > set AllMyStuff[edward][phone[00]]='555-5555'
> >
> > It was quite easy for me to slice all the phone numbers
> >
> > startkey: phone
> > endkey: phone[100]
> >
> > But then every column starting with "action_xxxx" could be a page hit
> and I could have thousands or tens of thousands of these
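
The prefix-slicing idea in the quoted message can be sketched in Python. This is a toy model, not the Thrift API: it assumes column names are kept sorted as plain strings, and the `WideRow` class, the second phone number, and the `action_0001` column are made up for illustration.

```python
from bisect import bisect_left, bisect_right, insort


class WideRow:
    """Toy stand-in for one row: column names are kept in sorted order."""

    def __init__(self):
        self._names = []   # sorted column names
        self._values = {}  # column name -> value

    def set(self, name, value):
        if name not in self._values:
            insort(self._names, name)
        self._values[name] = value

    def slice(self, start, end):
        """Return all (name, value) pairs with start <= name <= end."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, end)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


row = WideRow()
row.set("friends_bob", 34)
row.set("friends_sara", 33)
row.set("password", "mother")
row.set("phone[00]", "555-5555")
row.set("phone[01]", "555-1234")    # hypothetical second number
row.set("action_0001", "page hit")  # hypothetical page-hit column

phones = row.slice("phone", "phone[100]")
friends = row.slice("friends_", "friends_~")
```

Note that the comparison is lexicographic, so which names fall inside a slice depends on how the indexes are padded.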
> >
> > In many cases CQL has nice/nicer abstractions for some of these things.
> But its largest detraction for me is that I cannot take this already
> existing column family AllMyStuff and 'explain' it to CQL. It's a perfectly
> valid way to design something, and it is probably more space
> efficient than the system of composites CQL uses to pack things. I
> feel that as a data access language it dictates too much schema: not only
> what is in the row schema, but also the format of the data on disk.
> Also, schemas like mine above are very valid, but selecting them into
> a table of fixed rows and columns does not map well.
> >
> > The way Hive tackles this problem is that the metadata is
> interpreted by a SerDe, so that the physical data and the logical definition
> are not coupled.
> >
> >
> >
> >
> > On Thu, Feb 20, 2014 at 5:23 PM, DuyHai Doan <doanduy...@gmail.com>
> wrote:
> >
> > Rüdiger
> >
> > "SortedMap<byte[], SortedMap<byte[], Pair<Long, byte[]>>>"
> >
> >  When using a RandomPartitioner or Murmur3Partitioner, the outer map is
> a simple Map, not SortedMap.
> >
> >  The only case where you have a SortedMap for the row key is when using
> OrderPreservingPartitioner, which is clearly not advised for most cases
> because of hot spots in the cluster.
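
The difference between the two orderings can be sketched in Python. This is an illustration only: it uses an MD5 digest read as an integer as a stand-in for a hashing partitioner's token, which is an assumption and not the exact token computation of any real partitioner. The `rows` data and all function names are made up.

```python
import hashlib


def token(row_key: bytes) -> int:
    # Stand-in for a hashing partitioner: MD5 digest read as an unsigned int.
    return int.from_bytes(hashlib.md5(row_key).digest(), "big")


# Outer map: row key -> inner map of column name -> (timestamp, value).
rows = {
    b"edward": {b"password": (1, b"mother"), b"friends_bob": (1, b"34")},
    b"bob": {b"password": (1, b"secret")},
    b"sara": {b"password": (1, b"hunter2")},
}


def scan_hashed(rows):
    """Row keys in token order, as a hashing partitioner would return them."""
    return sorted(rows, key=token)


def scan_ordered(rows):
    """Row keys in byte order, as an order-preserving partitioner would."""
    return sorted(rows)


def columns_of(rows, row_key):
    """Column names within one row are always sorted, whatever the partitioner."""
    return sorted(rows[row_key])
```

With a hashing partitioner a range scan over row keys follows token order, which generally looks arbitrary relative to the keys themselves; only the inner map, the columns within a row, stays sorted by name.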
> >
> >
> >
> > On Thu, Feb 2
>
> --
> Sorry this was sent from mobile. Will do less grammar and spell check than
> usual.
>
