Re: how to handle join properly in this case

2013-05-29 Thread Hiller, Dean
There is Cassandra partitioning, which puts all of one partition on a single node. PlayOrm's partitions are virtual, which we needed far more, since we are likely to want 5000 rows from a partition, and PlayOrm ends up reading from X disks instead of one disk for better performance. Then we leverage t

Re: how to handle join properly in this case

2013-05-29 Thread Jiaan Zeng
Thanks for all the comments and thoughts! I think Hiller points out a promising direction. I wonder whether the partition and filter are features shipped with Cassandra or features that come from PlayOrm. Any resources about that would be appreciated. Thanks! On Tue, May 28, 2013 at 11:39 AM, Hiller, Dean

Re: how to handle join properly in this case

2013-05-28 Thread Hiller, Dean
Another option is joins on partitions, to keep the number of rows that need to be joined relatively small. PlayOrm actually supports joins of partition 1 of table A with partition X of table B. You then just keep the number of rows in each partition below the millions and you can filter with the wher
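
To illustrate the idea of joining one partition of table A with one partition of table B, here is a minimal sketch using plain Python lists as stand-ins for partitions. It is not PlayOrm's API: the function name `join_partitions`, its parameters, and the sample rows are all hypothetical. It only shows why a small partition keeps the join cheap: the work is bounded by the rows in the two partitions, not by the whole tables.

```python
def join_partitions(part_a, part_b, key_a, key_b, where=lambda a, b: True):
    """Hash-join two partitions (lists of dict rows) held in client memory.

    Hypothetical helper for illustration only, not a PlayOrm or Cassandra call.
    """
    # Build an index over the rows of partition B, keyed by the join column.
    index = {}
    for row in part_b:
        index.setdefault(row[key_b], []).append(row)

    joined = []
    for a in part_a:
        for b in index.get(a[key_a], []):
            # Apply the filter after matching on the join key.
            if where(a, b):
                # On a column-name clash, the value from B wins.
                joined.append({**a, **b})
    return joined


# Hypothetical sample data: one partition from each table.
orders_p1 = [
    {"order_id": 1, "user_id": "u1", "total": 30},
    {"order_id": 2, "user_id": "u2", "total": 5},
]
users_px = [
    {"user_id": "u1", "name": "Ann"},
    {"user_id": "u2", "name": "Bob"},
]

big_orders = join_partitions(
    orders_p1, users_px, "user_id", "user_id",
    where=lambda a, b: a["total"] > 10,
)
```

Because only two partitions are loaded, memory use stays proportional to the partition size rather than the table size.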

Re: how to handle join properly in this case

2013-05-28 Thread aaron morton
A common pattern is to materialise views, that is, to store the join at the same time you write to CFs A and B. In this case it sounds like the two CFs are written to at different times. If that is the case, you may need to do the join client side (do two reads). Hope that helps.
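
The two approaches mentioned here can be sketched roughly as follows. This is an in-memory illustration only: the dicts `cf_a`, `cf_b` and `joined_view` stand in for column families, and the function names are made up for the example. A real client would issue reads and writes through a Cassandra driver instead.

```python
# Hypothetical in-memory stand-ins for column families A and B,
# plus a third "view" column family that holds the stored join.
cf_a = {}
cf_b = {}
joined_view = {}


def write_both(key, cols_a, cols_b):
    """Materialise the join at write time (A and B written together)."""
    cf_a[key] = cols_a
    cf_b[key] = cols_b
    # Store the already-joined row so reads need a single lookup.
    joined_view[key] = {**cols_a, **cols_b}


def client_side_join(keys):
    """Join in the client with two reads (A and B written at different times)."""
    rows_a = {k: cf_a[k] for k in keys if k in cf_a}    # read 1: CF A
    rows_b = {k: cf_b[k] for k in rows_a if k in cf_b}  # read 2: CF B
    # Keep only keys present in both column families.
    return {k: {**rows_a[k], **rows_b[k]} for k in rows_b}


# Written together: the view is ready immediately.
write_both("u1", {"name": "Ann"}, {"city": "Oslo"})

# Written separately: no view exists, so join on read.
cf_a["u2"] = {"name": "Bob"}
cf_b["u2"] = {"city": "Rome"}
result = client_side_join(["u2", "u3"])
```

The trade-off is the usual one: the materialised view costs an extra write but makes reads a single lookup, while the client-side join costs an extra read per query.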

Re: how to handle join properly in this case

2013-05-26 Thread Vegard Berget
Hi, I am no expert, but a couple of suggestions: 1) Remember that writes are very fast in Cassandra, so don't be afraid to store more information than you would in an SQL-ish server. 2) It would be better with an example, but again - by storing more than you would in an SQL schema, would you still