Re: Re: Dynamic Columns

Jonathan Lacefield Wed, 21 Jan 2015 04:39:34 -0800

Hello,

  Peter highlighted the tradeoff between Thrift and CQL3 nicely in this
case, i.e. requiring a different design approach for this solution.
Collections do not sound like a good fit for your current challenge, but is
there a different way to design/solve your challenge using CQL techniques?


  It is recommended to leverage CQL for new projects as this is the
direction that Cassandra is heading and where the majority of effort is
being applied from a development perspective.

  Sounds like you have a decision to make: leverage Thrift and the dynamic
column approach to solve this problem, or rethink the design approach
and leverage CQL.
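
  As a rough sketch of what the CQL route could look like (an illustration
only, not a tested design): the table review_attrs and the columns attr_name
and attr_value are hypothetical names, and the literal values are made up.
Making the attribute name a clustering column lets CQL read a single
attribute, or a range of attributes, without loading a whole collection.

```cql
-- Hypothetical table: one row per (review, attribute) pair.
CREATE TABLE review_attrs (
    product_id bigint,
    created_at timestamp,
    attr_name  text,
    attr_value text,
    PRIMARY KEY (product_id, created_at, attr_name)
);

-- Read a single attribute of one review.
SELECT attr_value
FROM review_attrs
WHERE product_id = 42
  AND created_at = '2015-01-20 13:40:00+0000'
  AND attr_name = 'color';

-- Read a slice of attributes by name range for the same review.
SELECT attr_name, attr_value
FROM review_attrs
WHERE product_id = 42
  AND created_at = '2015-01-20 13:40:00+0000'
  AND attr_name >= 'a' AND attr_name < 'n';
```

  The tradeoff is that every value shares one type here (text); typed values
would need one value column per type, or separate tables.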

  Please let the mailing list know the direction you choose.

Jonathan

Jonathan Lacefield

Solution Architect | (404) 822 3487 | jlacefi...@datastax.com


On Tue, Jan 20, 2015 at 9:46 PM, Peter Lin <wool...@gmail.com> wrote:

>
> The thing is, CQL only handles some types of dynamic column use cases.
> There are plenty of examples on datastax.com that show how to do CQL-style
> dynamic columns.
>
> based on what was described by Chetan, I don't feel CQL3 is a perfect fit
> for what he wants to do. To use CQL3, he'd have to change his approach.
>
> In my temporal database, I use both Thrift and CQL. They complement each
> other very nicely. I don't understand why people have to put down Thrift or
> pretend CQL supports 100% of the use cases. Lots of people started using
> Cassandra pre-CQL and had no problems using Thrift. Yes, you have to
> understand more and the learning curve is steeper, but taking time to learn
> the internals of Cassandra is a good thing.
>
> Using CQL3 lists or maps would force the query to load the entire
> collection, but that is by design. To get the full power of the old style
> of dynamic columns, Thrift is a better fit. I hope CQL continues to improve
> so that it supports 100% of the existing use cases.
>
>
>
> On Tue, Jan 20, 2015 at 8:50 PM, Xu Zhongxing <xu_zhong_x...@163.com>
> wrote:
>
>> I approximate dynamic columns by data_key and data_value columns.
>> Is there a better way to get dynamic columns in CQL 3?
>>
>> At 2015-01-21 09:41:02, "Peter Lin" <wool...@gmail.com> wrote:
>>
>>
>> I think that table example misses the point of Chetan's functional
>> requirement. He actually needs dynamic columns.
>>
>> On Tue, Jan 20, 2015 at 8:12 PM, Xu Zhongxing <xu_zhong_x...@163.com>
>> wrote:
>>
>>> Maybe this is the closest thing to "dynamic columns" in CQL 3.
>>>
>>> create table review (
>>>     product_id bigint,
>>>     created_at timestamp,
>>>     data_key text,
>>>     data_tvalue text,
>>>     data_ivalue int,
>>>     primary key ((product_id, created_at), data_key)
>>> );
>>>
>>> data_tvalue and data_ivalue are optional.
>>>
>>> At 2015-01-21 04:44:07, "chetan verma" <chetanverm...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> Adding to previous mail. For example: We have a column family named
>>> review (with some arbitrary data in map).
>>>
>>> CREATE TABLE review(
>>> product_id bigint,
>>> created_at timestamp,
>>> data_int map<text, int>,
>>> data_text map<text, text>,
>>> PRIMARY KEY (product_id, created_at)
>>> );
>>>
>>> Assume that I use these 2 maps to store arbitrary data (i.e. data_int
>>> and data_text for int and text values).
>>> When we look at the output in cassandra-cli, each map entry appears within
>>> a partition with <clustering_key>:data_int:map_key as the column name and
>>> the map value as the column value.
>>> Suppose I need to get this single value: I couldn't do that with CQL3, but
>>> in Thrift it's possible. Any solution?
>>>
>>> On Wed, Jan 21, 2015 at 1:06 AM, chetan verma <chetanverm...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Most of the time I will be querying on product_id and created_at, but
>>>> for analytics I need to query on almost all columns.
>>>> The multiple-collections idea is good, but the only problem is that
>>>> Cassandra reads a collection entirely. What if I need a slice of it, I mean
>>>> columns for certain keys, which is possible with Thrift? Please suggest.
>>>>
>>>> On Wed, Jan 21, 2015 at 12:36 AM, Jonathan Lacefield <
>>>> jlacefi...@datastax.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> There are probably lots of options to this challenge.  The more
>>>>> details around your use case that you can provide, the easier it will be
>>>>> for this group to offer advice.
>>>>>
>>>>> A few follow-up questions:
>>>>>   - How will you query this data?
>>>>>   - Do your queries require filtering on specific columns other than
>>>>> product_id and created_at, i.e. the dynamic columns?
>>>>>
>>>>> Depending on the answers to these questions, you have several options,
>>>>> of which here are a few:
>>>>>
>>>>>    - Cassandra efficiently stores sparse data, so you could create
>>>>>    columns and not populate them, without much of a penalty
>>>>>    - Could use a clustering column to store a column's type and
>>>>>    another col (potentially clustering) to store the value
>>>>>       - i.e. CREATE TABLE foo (col1 int, attname text, attvalue text,
>>>>>       col4...n, PRIMARY KEY (col1, attname, attvalue));
>>>>>       - where attname stores the name of the attribute/column and
>>>>>       attvalue stores the value of that attribute
>>>>>       - have seen users use this model and create a "main" attribute
>>>>>       row within a partition that stores the values associated with
>>>>>       col4...n
>>>>>    - Could store multiple collections
>>>>>    - Others probably have ideas as well
>>>>>
>>>>> You may want to look in the archives for a similar discussion topic.
>>>>> Believe this item was asked a few months ago as well.
>>>>>
>>>>> On Tue, Jan 20, 2015 at 1:40 PM, chetan verma <chetanverm...@gmail.com
>>>>> > wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am creating a review system. For instance, let's assume the following
>>>>>> are the attributes of the system:
>>>>>>
>>>>>> Review{
>>>>>> id bigint,
>>>>>> product_id bigint,
>>>>>> created_at timestamp,
>>>>>> summary text,
>>>>>> description text,
>>>>>> pros set<text>,
>>>>>> cons set<text>,
>>>>>> feature_rating map<text, int>
>>>>>> etc....
>>>>>> }
>>>>>> I created the partition key as product_id (so that all the reviews for a
>>>>>> given product will reside on the same node)
>>>>>> and the clustering key as created_at and id (Desc) so that reviews will
>>>>>> be sorted by time.
>>>>>>
>>>>>> I may have more columns, and I want to fulfil that requirement with
>>>>>> dynamic columns, but there are limitations to it, as explained earlier.
>>>>>> Could you please let me know the best way?
>>>>>>
>>>>>> On Tue, Jan 20, 2015 at 11:59 PM, Jonathan Lacefield <
>>>>>> jlacefi...@datastax.com> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>>   Have you looked at solving this challenge with clustering
>>>>>>> columns?  Also, please describe the problem set details for more
>>>>>>> specific advice from this group.
>>>>>>>
>>>>>>>   Starting new projects on Thrift isn't the recommended approach.
>>>>>>>
>>>>>>> Jonathan
>>>>>>>
>>>>>>> On Tue, Jan 20, 2015 at 1:24 PM, chetan verma <
>>>>>>> chetanverm...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am starting a new project with Cassandra as the database.
>>>>>>>> I have unstructured data, so I need dynamic columns.
>>>>>>>> In CQL3 we can achieve this via collections, but there are
>>>>>>>> some downsides to it:
>>>>>>>> 1. Collections are meant to store small amounts of data.
>>>>>>>> 2. The maximum size of an item in a collection is 64K.
>>>>>>>> 3. Cassandra reads a collection in its entirety.
>>>>>>>> 4. The number of items in a collection is limited to 64,000.
>>>>>>>>
>>>>>>>> There is also no support for getting a single column by map key, which
>>>>>>>> is possible via cassandra-cli.
>>>>>>>> Please suggest whether I should use CQL3 or Thrift, and which driver
>>>>>>>> is best.
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Regards,*
>>>>>>>> *Chetan Verma*
>>>>>>>> *+91 99860 86634*
>>>>>>>>
