What is the upper limit on the number of super columns? Is it pretty much the same as for columns in general?
On Apr 28, 2010, at 10:09 PM, Schubert Zhang wrote:

> key: stock ID, e.g. AAPL+year
> column family: closing price and volume, two CFs.
> column name: timestamp LongType
>
> AAPL+2010 -> CF:closingPrice -> {'04-13' : 242, '04-14': 245}
> AAPL+2010 -> CF:volume -> {'04-13' : 242, '04-14': 245}
>
>
> On Thu, Apr 22, 2010 at 2:00 AM, Miguel Verde <miguelitov...@gmail.com> wrote:
> On Wed, Apr 21, 2010 at 12:17 PM, Steve Lihn <stevel...@gmail.com> wrote:
> [...]
>
> Design 1: Each attribute is a super column. Therefore each date is a column.
> So we have:
>
> AAPL -> closingPrice -> { '2010-04-13' : 242, '2010-04-14': 245 }
> AAPL -> volume -> { '2010-04-13' : 10.9m, '2010-04-14': 14.4m }
> etc.
> I would suggest not using this design, as each query involving an attribute
> will pull all dates for that attribute into memory on the server, i.e.
> getting the closingPrice for AAPL on '2010-04-13' would pull all closing
> prices for AAPL across all dates into memory.
>
>
> Design 2: Each date is a super column. Therefore each attribute is a column.
> So we have:
>
> AAPL -> '2010-04-13' -> { closingPrice -> 242, volume -> 10.9m }
> AAPL -> '2010-04-14' -> { closingPrice -> 245, volume -> 14.4m }
> etc.
>
> The date column / superColumn will need Order Preserving Partitioner since we
> are going to do a lot of range queries.
>
> Partitioners split up keys between nodes; the partitioner you use has no
> effect on your ability to query columns in a row.
>
> Examples are:
> Query 1: Give me the data between date1 and date2 for a set of tickers (say,
> the 100 tickers in QQQ).
> You could use http://wiki.apache.org/cassandra/API#multiget_slice for this.
>
> Query 2: More often than not, the query is: Give me the data for the max
> available dates (for each ticker) between date1 and date2 in a set of tickers.
> (Since not every day is traded, and we only want the most recent data, given
> a range of dates.)
> A http://wiki.apache.org/cassandra/API#SliceRange allows you to specify
> limits and ordering for columns you are slicing.
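
For reference, here is a rough in-memory sketch of how Query 1 and Query 2 map onto Design 2. It is not Cassandra client code: plain Python dicts stand in for rows, and the helper names (slice_dates, multiget_dates), the newest_first flag, and the MSFT row are made up for illustration. It only shows the idea that a slice over the date super columns, read newest first with a limit of 1, yields the most recent traded date per ticker.

```python
from bisect import bisect_left, bisect_right

# Design 2 from the thread: row key = ticker, super column = date,
# sub-columns = attributes. Plain dicts stand in for Cassandra rows.
ROWS = {
    "AAPL": {
        "2010-04-13": {"closingPrice": 242, "volume": "10.9m"},
        "2010-04-14": {"closingPrice": 245, "volume": "14.4m"},
    },
    "MSFT": {  # hypothetical ticker and values, for illustration only
        "2010-04-13": {"closingPrice": 30, "volume": "50.0m"},
    },
}


def slice_dates(row, first, last, newest_first=False, count=100):
    """Return up to `count` (date, attributes) pairs with first <= date <= last."""
    dates = sorted(row)  # ISO dates sort chronologically as strings
    selected = dates[bisect_left(dates, first):bisect_right(dates, last)]
    if newest_first:
        selected.reverse()
    return [(d, row[d]) for d in selected[:count]]


def multiget_dates(rows, tickers, first, last, **kwargs):
    """Apply the same date slice to every requested ticker."""
    return {t: slice_dates(rows.get(t, {}), first, last, **kwargs) for t in tickers}


# Query 1: everything between date1 and date2 for a set of tickers.
all_in_range = multiget_dates(ROWS, ["AAPL", "MSFT"], "2010-04-13", "2010-04-14")

# Query 2: only the most recent available date per ticker in that range.
latest = multiget_dates(
    ROWS, ["AAPL", "MSFT"], "2010-04-13", "2010-04-14", newest_first=True, count=1
)
```

Because the slice is taken within each row, it does not depend on how row keys are partitioned across nodes, which matches the point made above about partitioners.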