@paul - Cassandra is really good for storing indices. But I like Redis because it provides some really good data structures, such as sorted sets. So we use both to their strengths. For example, in a forum, all the posts and replies in a thread, plus which user is following which threads, etc., are stored in Cassandra.
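
A minimal sketch of why sorted sets are handy for this kind of listing, assuming one set scored by last-post time and one scored by post count. It is plain Python standing in for Redis, not the code of the forum discussed here; the class and method names are hypothetical, and the comments name the Redis commands (ZADD, ZINCRBY, ZREVRANGE) that each step roughly corresponds to.

```python
class ThreadRankings:
    """In-memory stand-in for two Redis sorted sets (hypothetical helper)."""

    def __init__(self):
        self.latest = {}  # thread_id -> timestamp of the most recent post
        self.active = {}  # thread_id -> number of posts seen so far

    def record_post(self, thread_id, timestamp):
        # Roughly: ZADD latest <timestamp> <thread_id>
        self.latest[thread_id] = timestamp
        # Roughly: ZINCRBY active 1 <thread_id>
        self.active[thread_id] = self.active.get(thread_id, 0) + 1

    @staticmethod
    def top(scores, n):
        # Roughly: ZREVRANGE <key> 0 <n-1>, i.e. highest score first.
        # Redis keeps the set ordered on every write; sorting on read here
        # is only to keep the sketch short.
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        return [thread_id for thread_id, _ in ranked[:n]]


rankings = ThreadRankings()
rankings.record_post("thread-1", 100)
rankings.record_post("thread-2", 101)
rankings.record_post("thread-1", 102)
rankings.record_post("thread-3", 103)

newest_first = ThreadRankings.top(rankings.latest, 2)
most_active = ThreadRankings.top(rankings.active, 1)
```

Each new post touches one entry per set, so both orderings stay current without rebuilding the whole listing.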
But for something like which threads make up the top page (which should be sortable both by latest posted thread first and by most active thread first, etc.), sorted sets make a lot of sense, as recomputing which threads make up the first page is too expensive to do every time a post is made. Hence in such cases I use Redis, but for storage of posts, follower lists, et al. I use Cassandra.

@michael - Well, with MySQL writes are fast but reads are slow because of joins et al. My main criterion is that reads should be really, really fast, which means I have to precompute indices so that my front-end code just reads the index and outputs it - no more complex joins. So you need a store which is really good at storing lots of indices, and in my opinion Cassandra is one of the best stores for that. Hence the use of Cassandra.

Also, complete denormalization isn't so easy. There will be many cases where it is a pain in the ass. I could handle most such cases with the help of Varnish's ESI module. For example: you have a timeline of posts stored in Cassandra, but we cannot store the author information in every post, as it is subject to change at any time. This is the classic case of a join. If you use a NoSQL store for that, you end up making multiple calls to the database (first to fetch the posts and then to fetch the user information). But if you use ESI there is no need to make so many calls, because you can cache that small user-info module for as long as the information doesn't change. This seems a much more elegant solution than a MySQL join.

Hope I explained the point.

Cheers,
Deepu.

On Tue, Jul 13, 2010 at 11:38 AM, Michael Dürgner <mich...@duergner.de> wrote:
> Are your PVs mostly read or write? If they are mostly reads, I'd think you wouldn't need a Cassandra-like storage, which is tuned towards writes.
>
> On 12.07.2010 at 23:40, Sandeep Kalidindi at PaGaLGuY.com wrote:
>
> > Well, we were going down constantly with vB running on 3-4 dedicated servers due to huge traffic (a couple of tens of millions of page views). We are also planning some major new features, hence the shift to Cassandra with the future in mind.
> >
> > Roughly, the architecture is like this (in the order a request proceeds):
> >
> > 1) Varnish - PHP reads from Cassandra, and the performance isn't always good (I have yet to master it, so this is probably my lack of expertise). So we make heavy use of Varnish to cache as much as possible. VCL means we can cache the same page differently for different logged-in users. ESI means no need to worry about joins. Varnish really is quite a good companion for NoSQL.
> >
> > 2) Front-end PHP servers - contain most of the template code; read directly from Cassandra and Redis.
> >
> > 3) Middleware (written in Scala + Python; planning to move it to Scala completely to reduce the number of languages in production) - all writes from PHP go directly to the middleware. Cassandra is in fact mostly a store of indices, which means you need to change your strategy from MySQL-style post-computation to precomputing all the needed indices and storing them in Cassandra. So the middleware takes care of computing the indices and storing them in Cassandra and Redis accordingly. This way PHP just submits the write to the middleware and the request can complete, while the middleware might take a couple of seconds at most to compute the indices and finish the write completely.
> >
> > 4) Cassandra + Redis clusters.
> >
> > So writes are taken care of by the middleware and hence complete very fast, and reads are also quite fast courtesy of Varnish wherever it helps.
> >
> > Still not in production though. Hope it helped.
> > Would welcome anybody's suggestions on the way I am using Cassandra and whether I can do anything better.
> >
> > Cheers,
> > Deepu.
> >
> > On Tue, Jul 13, 2010 at 2:48 AM, S Ahmed <sahmed1...@gmail.com> wrote:
> > What sort of traffic levels made you port the application to Cassandra?
> >
> > Very interested in seeing this go live.
> >
> > What sort of server setup are you looking at using?
> >
> > On Mon, Jul 12, 2010 at 4:39 PM, Sandeep Kalidindi at PaGaLGuY.com <sandeep.kalidi...@pagalguy.com> wrote:
> > No, we re-coded from scratch with most of the needed functionality.
> >
> > Cheers,
> > Deepu.
> >
> > On Mon, Jul 12, 2010 at 7:49 PM, S Ahmed <sahmed1...@gmail.com> wrote:
> > Very interesting!
> >
> > What kind of integration do you have between vB and Cassandra? It's not a port then?
> >
> > On Mon, Jul 12, 2010 at 3:34 AM, Sandeep Kalidindi at PaGaLGuY.com <sandeep.kalidi...@pagalguy.com> wrote:
> > We were one of the vBulletin customers, and our forums have been facing some bad scaling issues.
> >
> > We coded our forum software to work with Cassandra. We are still testing for bugs and might go live in a couple of weeks. You can ask any specific questions about vBulletin and Cassandra and I will answer to the best of my knowledge.
> >
> > In our case a combination of Cassandra and Redis took care of most of the functionality that vBulletin offers, and much more.
> >
> > Cheers,
> > Deepu.
> >
> > On Mon, Jul 12, 2010 at 9:58 AM, Paul Prescod <pres...@gmail.com> wrote:
> > On Sun, Jul 11, 2010 at 8:39 AM, S Ahmed <sahmed1...@gmail.com> wrote:
> > > I want to build a vBulletin-type application (forums, threads, posts, user management, etc.).
> > > Support multi-tenancy for a SaaS-type environment.
> > > Would Cassandra be suitable for this type of application?
> > >
> > > Thanks in advance.
> >
> > Most likely, it is technically a fine fit.
> > But Cassandra is very early-stage software, so you should expect that the documentation will not always be clear and things will change from version to version. If you are not extremely self-reliant, you may find it a frustrating experience. Unless you are confident you will have trouble scaling traditional technologies, it might not make business sense.
> >
> > Paul Prescod