Hi All

I've another related question.

I am using a stream of records of the form (A, B, n), where the pair
(A, B) can occur multiple times. For example, you could have the
following set of records:

A, B, 2
P, Q, 5
X, Y, 3
A, B, 8
A, B, 2
...

The data store has a set of columns: (key, count, sum). Because A and B
can be duplicated, I am using the string A+B as my key. Every time there
is a duplicate A+B, I increment the count field and add "n" to the
existing value of sum. So, for the above set of records, Cassandra
should hold the following:

A+B, 3, 12
P+Q, 1, 5
X+Y, 1, 3
...

My question is: is it possible to have multiple threads reading
different streams so that I can parallelize the insertion? What happens
if two threads try to insert two different records with the same A+B
key?

Regards
Arijit

On 11 October 2010 18:32, Gary Dusbabek <gdusba...@gmail.com> wrote:
> On Mon, Oct 11, 2010 at 04:01, Arijit Mukherjee <ariji...@gmail.com> wrote:
>> Hi All
>>
>> I've just started reading about Cassandra and writing simple tests
>> using Cassandra 0.6.5 to see if we can use it for our product.
>>
>> I have a data store with a set of columns, like C1, C2, C3, and C4,
>> but the columns aren't mandatory. For example, there can be a list of
>> (k.v) pairs with only C1 and C2, but no C3 and C4. At the same time,
>> there can be a set of records with all the columns present. It's
>> possible to consider them as three sets A (with all columns), B (with
>> C1 and C2) and C (with C3 and C4). And I'm trying to find out the
>> following:
>>
>> 1. A - B (all records who don't have C3 and C4) and A - C (all record
>> who don't have C1 and C2)
>> 2. records for whom C2 != C4
>>
>> It's possible to pick all records and do this processing in my client
>> code - but that won't perform well. Is there any way to do these
>> within Cassandra? For example, by passing a list of column names so
>> that cassandra returns the records with only those columns?
>
> multiget_slice with the SlicePredicate specified using column_names
> can do the lookups. As far as doing the set operations: no, Cassandra
> doesn't have the ability to do this server-side.
>
> Gary.
>
>
>>
>> Regards
>> Arijit
>>
>

--
"And when the night is cloudy,
There is still a light that shines on me,
Shine on until tomorrow, let it be."
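
P.S. To make the update logic in my question concrete, here is a minimal Python sketch of what each thread would do. It is only an illustration. The `store` dict is an in-memory stand-in for the (key, count, sum) columns, not a Cassandra client, and the names `upsert`, `consume`, `run` and `store_lock` are hypothetical. The lock reflects my assumption that the read-modify-write has to be serialized somewhere on the client side, since otherwise two threads could read the same old row and one update would be lost.

```python
import threading
from collections import defaultdict

# In-memory stand-in for the data store: key -> [count, sum].
# This is NOT a Cassandra client; it only illustrates the update logic.
store = defaultdict(lambda: [0, 0])
store_lock = threading.Lock()


def upsert(a, b, n):
    """Read-modify-write for one (A, B, n) record, keyed by the string A+B."""
    key = f"{a}+{b}"
    # Without the lock, two threads could read the same old row and
    # one of the two updates would be overwritten (a lost update).
    with store_lock:
        row = store[key]
        row[0] += 1  # count
        row[1] += n  # sum


def consume(stream):
    """Worker: apply every record of one stream to the shared store."""
    for a, b, n in stream:
        upsert(a, b, n)


def run(streams):
    """Process each stream in its own thread and return the final rows."""
    store.clear()
    threads = [threading.Thread(target=consume, args=(s,)) for s in streams]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {k: tuple(v) for k, v in store.items()}


if __name__ == "__main__":
    # The example records from above, split across two streams.
    stream1 = [("A", "B", 2), ("P", "Q", 5), ("X", "Y", 3)]
    stream2 = [("A", "B", 8), ("A", "B", 2)]
    print(run([stream1, stream2]))
```

With a single process, a lock like this would be enough. With several client processes, the same serialization would have to come from somewhere else, which is really what I am asking about.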