I would expect row per log entry to be substantially faster to query.

2010/7/5 Bartosz Kołodziej <bartosz.kolodz...@gmail.com>:
> I have a big and dynamic number of loggers.
> According to https://issues.apache.org/jira/browse/CASSANDRA-16, the 2GB
> size limit is no longer an issue in 0.7 (btw, mnesia has a similar issue ;-) ),
> so I think I can go with the svn release at the moment.
> Solving this with a composite key (logger+timestamp) would require
> OrderPreservingPartitioner to make range queries efficient, while with the
> first approach I can go with RandomPartitioner (data would be partitioned
> by logger - simple and effective).
> Btw, which model provides faster queries?
> (I only need to get a slice (timestamp1 to timestamp2) of data for logger X.)
>
> On Mon, Jul 5, 2010 at 6:23 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>
>> You don't want to have all the data from a single logger in a single
>> row b/c of the 2GB size limit.
>>
>> If you have a small, static number of loggers you could create one CF
>> per logger and use the timestamp as the row key. Otherwise use a
>> composite key (logger+timestamp) as the key in a single CF.
>>
>> 2010/7/2 Bartosz Kołodziej <bartosz.kolodz...@gmail.com>:
>> > I'm new to Cassandra, and I want to use it to store:
>> >
>> > loggers = {            // (super)ColumnFamily ?
>> >     logger1 : {        // row inside super CF ?
>> >         timestamp1 : {
>> >             value : 10
>> >         },
>> >         timestamp2 : {
>> >             value : 12
>> >         },
>> >         (many many many more)
>> >     },
>> >     logger2 : {        // logger of a different type (in this example
>> >                        // it logs 3 values instead of 1)
>> >         timestamp1 : {
>> >             v : 300,
>> >             c : 123,
>> >             s : 12.13
>> >         },
>> >         timestamp2 : {
>> >             v : 300,
>> >             c : 123,
>> >             s : 12.13
>> >         },
>> >         (many many many more)
>> >     },
>> >     (many many many more)
>> > }
>> >
>> > The only way I will be accessing this data is:
>> > - example: fetch a slice of data from logger2 (start = 1278009131
>> >   (timestamp), end = 1278109131), expecting a sorted array of data.
>> > - example: fetch a slice of data from (logger2 and logger10 and
>> >   logger20 and logger1234) (start = 1278009131 (timestamp),
>> >   end = 1278109131), expecting a map of sorted arrays of data.
>> >   [It is basically N queries of the first type.]
>> >
>> > Is this the right definition of the above:
>> > <ColumnFamily CompareWith="TimeUUIDType" ColumnType="Super"
>> >   CompareSubcolumnsWith="BytesType" Name="loggers"/> ?
>> >
>> > What's the best way to model this data in Cassandra (keeping in mind
>> > partitioning and other important stuff)?
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
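
For concreteness, a minimal pycassa sketch of the row-per-entry model. The
keyspace name, CF name, and zero-padded key format below are illustrative
(none of them are fixed by the thread), and it assumes
OrderPreservingPartitioner so the key range scan works:

    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    pool = ConnectionPool('Logging', ['localhost:9160'])  # hypothetical keyspace
    entries = ColumnFamily(pool, 'entries')  # standard CF, one row per log entry

    def entry_key(logger, ts):
        # zero-pad the timestamp so keys sort lexically under OPP
        return '%s:%010d' % (logger, ts)

    entries.insert(entry_key('logger2', 1278009131),
                   {'v': '300', 'c': '123', 's': '12.13'})

    # slice for logger2 between two timestamps = a key range scan
    for key, columns in entries.get_range(start=entry_key('logger2', 1278009131),
                                          finish=entry_key('logger2', 1278109131)):
        print key, columns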
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
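
For comparison, the same two access patterns against the row-per-logger
(wide row) model from the original post, again as a pycassa sketch. This
assumes a LongType column comparator so plain integer timestamps sort
correctly (the proposed schema used TimeUUIDType), and it works under
RandomPartitioner because each query addresses whole row keys:

    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    pool = ConnectionPool('Logging', ['localhost:9160'])  # hypothetical keyspace
    loggers = ColumnFamily(pool, 'loggers')  # super CF, one row per logger

    # slice of one logger's row, sorted by timestamp (the supercolumn name)
    one = loggers.get('logger2',
                      column_start=1278009131,
                      column_finish=1278109131,
                      column_count=10000)

    # several loggers at once -> map of logger -> sorted slice
    many = loggers.multiget(['logger2', 'logger10', 'logger20', 'logger1234'],
                            column_start=1278009131,
                            column_finish=1278109131,
                            column_count=10000)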