Hi Jon,

Thanks for the quick reply. I'm a newbie to Cassandra. Even though I made a mistake in my previous mail, you got it right. I'll check what you've said.

Cheers,
Rajith

On Fri, Oct 18, 2013 at 11:47 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> I'd avoid using super columns. I don't believe they're recommended
> anymore, and with CQL3 they aren't even supported (if you're interested in
> going that route). I think it's unlikely that you'll want a column family
> per company either.
>
> How many "ticker" entries do you plan on writing per company? You've got
> a lot of ellipses in there as well, which makes me wonder what other data
> you're looking to store.
>
> To take a guess, I'd wager you'd be looking for a trades table, and
> another table that tracks the closing price per day. In the trades table,
> something along the lines of this CQL3 definition might be helpful:
>
> create table trades (
>     company text,
>     ts timeuuid,
>     price decimal,
>     primary key(company, ts) );
>
> This would give you a single row in the traditional Cassandra sense, and
> it would be ordered by the timestamp you supply. You can use a timeuuid to
> avoid the duplicate timestamp problem.
>
> This is about as far as I can go without knowing more about what you're
> actually trying to do... I think it's going to be difficult for anyone to
> give you helpful advice unless you can elaborate a bit on what your
> requirements are.
>
> Jon
>
> On Thu, Oct 17, 2013 at 10:51 PM, Rajith Siriwardana <
> rajithsiriward...@gmail.com> wrote:
>
>> Hi all,
>>
>> I have a problem like this:
>>
>> I have stock transaction data, as follows.
>> Ticker data:
>> Company name:
>> timestamp:
>> closing price (N): (V)
>> trades (N) : (V)
>> ......
>> .....
>> ......
>>
>> In my model, I want to execute range queries on timestamps (in sorted
>> order).
>>
>> Approaches I currently have in mind:
>>
>> 1. I can have ticker data : column family, company name : row key,
>> timestamp : super column, and other attributes as columns. In this way
>> there will be around *100 rowkeys*, around *1M timestamps*, around *10
>> columns under one super column.*
>>
>> Problems
>>
>> - Cassandra best practices are to use the RandomPartitioner - this
>> gives you 'free' load balancing, as long as your tokens are evenly
>> distributed. So the load balancing would happen on 100 row keys. Is
>> this an acceptable approach?
>> - There is a possibility of duplicate timestamps, which will be a
>> problem.
>>
>> 2. I can have ticker data : keyspace, company name : column family,
>> timestamp : row key, and other attributes as columns. In this way there
>> will be around *100 column families*, around *1M row keys*, around *10
>> columns per row.*
>>
>> Problems
>>
>> - In this way, range queries are not in sorted order.
>> - I guess there is also a duplicate row key problem.
>>
>> Any suggestions on how I can overcome this?
>>
>> Cheers,
>> Rajith
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> skype: rustyrazorblade
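
For reference, here is a minimal sketch of how the trades table Jon suggests might be written to and range-queried in CQL3. The company value 'ACME', the price, and the date bounds are made-up placeholders and do not come from the thread. now(), minTimeuuid() and dateOf() are built-in CQL3 timeuuid functions.

```cql
-- Insert a trade. now() generates a unique timeuuid on the coordinator,
-- so two trades landing in the same millisecond do not overwrite each other.
-- ('ACME' and 12.34 are placeholder values.)
INSERT INTO trades (company, ts, price)
VALUES ('ACME', now(), 12.34);

-- Range query on time for one company. Rows are returned in ts order
-- because ts is the clustering column of the primary key.
-- minTimeuuid() gives the smallest possible timeuuid for a timestamp,
-- so this selects trades from 2013-10-01 up to (not including) 2013-10-18.
SELECT dateOf(ts), price
FROM trades
WHERE company = 'ACME'
  AND ts >= minTimeuuid('2013-10-01 00:00+0000')
  AND ts <  minTimeuuid('2013-10-18 00:00+0000');
```

Because company is the partition key, each such query reads from a single partition, and the timeuuid clustering column covers both the sorted range query and the duplicate timestamp concern raised in the original question. If the trade time comes from the data feed rather than from the moment of insertion, the timeuuid would presumably have to be generated on the client from that timestamp instead of using now().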