If you have set up the table as described in my previous message, you could
run this python snippet to return the desired result:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import logging
logging.basicConfig()
from operator import itemgetter
import cassandra
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement
cql_cluster = Cluster()
cql_session = cql_cluster.connect()
cql_session.set_keyspace('latest')
select_stmt = "select * from time_series where userid = 'XYZ'"
query = SimpleStatement(select_stmt)
rows = cql_session.execute(query)
results = []
for row in rows:
max_time = max(row.colname.keys())
results.append((row.userid, row.pkid, max_time, row.colname[max_time]))
sorted_results = sorted(results, key=itemgetter(2), reverse=True)
for result in sorted_results: print result
# prints:
# (u'XYZ', u'1002', u'204', u'Col-Name-5')
# (u'XYZ', u'1000', u'203', u'Col-Name-4')
# (u'XYZ', u'1001', u'201', u'Col-Name-2')
On Tue, Sep 10, 2013 at 6:32 PM, Laing, Michael
<[email protected]>wrote:
> You could try this. C* doesn't do it all for you, but it will efficiently
> get you the right data.
>
> -ml
>
> -- put this in <file> and run using 'cqlsh -f <file>
>
> DROP KEYSPACE latest;
>
> CREATE KEYSPACE latest WITH replication = {
> 'class': 'SimpleStrategy',
> 'replication_factor' : 1
> };
>
> USE latest;
>
> CREATE TABLE time_series (
> userid text,
> pkid text,
> colname map<text, text>,
> PRIMARY KEY (userid, pkid)
> );
>
> UPDATE time_series SET colname = colname + {'200':'Col-Name-1'} WHERE
> userid = 'XYZ' AND pkid = '1000';
> UPDATE time_series SET colname = colname +
> {'201':'Col-Name-2'} WHERE userid = 'XYZ' AND pkid = '1001';
> UPDATE time_series SET colname = colname +
> {'202':'Col-Name-3'} WHERE userid = 'XYZ' AND pkid = '1000';
> UPDATE time_series SET colname = colname +
> {'203':'Col-Name-4'} WHERE userid = 'XYZ' AND pkid = '1000';
> UPDATE time_series SET colname = colname +
> {'204':'Col-Name-5'} WHERE userid = 'XYZ' AND pkid = '1002';
>
> SELECT * FROM time_series WHERE userid = 'XYZ';
>
> -- returns:
> -- userid | pkid | colname
>
> ----------+------+-----------------------------------------------------------------
> -- XYZ | 1000 | {'200': 'Col-Name-1', '202': 'Col-Name-3', '203':
> 'Col-Name-4'}
> -- XYZ | 1001 | {'201':
> 'Col-Name-2'}
> -- XYZ | 1002 | {'204':
> 'Col-Name-5'}
>
> -- use an app to pop off the latest key/value from the map for each row,
> then sort by key desc.
>
>
> On Tue, Sep 10, 2013 at 9:21 AM, Ravikumar Govindarajan <
> [email protected]> wrote:
>
>> I have been faced with a problem of grouping composites on the
>> second-part.
>>
>> Lets say my CF contains this
>>
>>
>> TimeSeriesCF
>> key: UserID
>> composite-col-name: TimeUUID:PKID
>>
>> Some sample data
>>
>> UserID = XYZ
>> Time:PKID
>> Col-Name1 = 200:1000
>> Col-Name2 = 201:1001
>> Col-Name3 = 202:1000
>> Col-Name4 = 203:1000
>> Col-Name5 = 204:1002
>>
>> Whenever a time-series query is issued, it should return the following in
>> time-desc order.
>>
>> UserID = XYZ
>> Col-Name5 = 204:1002
>> Col-Name4 = 203:1000
>> Col-Name2 = 201:1001
>>
>> Is something like this possible in Cassandra? Is there a different way to
>> design and achieve the same objective?
>>
>> --
>> Ravi
>>
>>
>
>