The point about joins being slow is true (we experience that ourselves), but I 
still wonder why you use Cassandra for the indices. Couldn't you just store 
them in MySQL as well?

On 13.07.2010 at 08:26, Sandeep Kalidindi at PaGaLGuY.com wrote:

> @paul - Cassandra is really good for storing indices. But I like Redis 
> because it provides some really good data structures, like sorted sets. So 
> we use both to their strengths. For example, in a forum, all the posts and 
> replies in a thread, which user is following which threads, etc. are all 
> stored in Cassandra. 
> 
> But for something like which threads make up the top page (which should be 
> sortable both by latest posted thread first and by most active thread 
> first), sorted sets make a lot of sense, as recomputing which threads make 
> up the first page every time a post is made is too expensive. Hence in such 
> cases I use Redis, but for storage of posts, follower lists, et al. I use 
> Cassandra. 
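
The sorted-set idea for the front page can be sketched roughly as follows. This is only an illustration, not the code used at PaGaLGuY: the SortedSet class is a tiny in-memory stand-in that mimics the Redis commands ZADD, ZINCRBY and ZREVRANGE, and the thread ids, timestamps and the record_post helper are made up for the example.

```python
class SortedSet:
    """In-memory stand-in for a Redis sorted set (member -> score).

    Redis keeps members ordered internally; this stand-in simply sorts
    on read, which is fine for illustrating the access pattern.
    """

    def __init__(self):
        self._scores = {}

    def zadd(self, member, score):
        # Insert the member or overwrite its score, like Redis ZADD.
        self._scores[member] = score

    def zincrby(self, member, amount):
        # Add to the member's score, starting from 0, like Redis ZINCRBY.
        self._scores[member] = self._scores.get(member, 0) + amount

    def zrevrange(self, start, stop):
        # Members from highest to lowest score; stop is inclusive as in
        # Redis. Only non-negative indexes are handled in this sketch.
        ordered = sorted(
            self._scores,
            key=lambda member: (self._scores[member], member),
            reverse=True,
        )
        return ordered[start:stop + 1]


latest = SortedSet()  # score = timestamp of the most recent post
active = SortedSet()  # score = number of posts in the thread


def record_post(thread_id, timestamp):
    # Hypothetical write hook: each new post touches two small indices
    # instead of recomputing the whole front page.
    latest.zadd(thread_id, timestamp)
    active.zincrby(thread_id, 1)


record_post("thread:1", 100)
record_post("thread:2", 105)
record_post("thread:2", 107)
record_post("thread:1", 110)
record_post("thread:1", 112)
record_post("thread:3", 120)

front_by_latest = latest.zrevrange(0, 1)  # two most recently active threads
front_by_active = active.zrevrange(0, 1)  # two threads with the most posts
```

Each post costs two small updates, and rendering the front page in either order is a single range read.
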
> 
> @michael - Well, with MySQL writes are fast but reads are slow because of 
> joins et al. My main criterion is that reads should be really, really fast, 
> which means I have to precompute indices so that my front-end code just 
> reads the index and outputs it, with no more complex joins. So you need a 
> store which is really good at storing lots of indices, and in my opinion 
> Cassandra is one of the best stores for that. Hence the use of Cassandra. 
> 
> Also, complete denormalization isn't so easy. There will be many cases where 
> it will be a pain in the ass. I could handle most such cases with the help 
> of Varnish's ESI module. For example, you have a timeline of posts stored in 
> Cassandra, but you cannot store the author information in all the posts, as 
> it is subject to change at any time. This is the classic case of a join. If 
> you want to use a NoSQL store for that, you will end up making multiple 
> calls to the database (first to fetch the posts and then to fetch the user 
> information). But if you use ESI there is no need to make so many calls, 
> because you can cache that small user info module for as long as that 
> information doesn't change. This seems a much more elegant solution than a 
> MySQL join. Hope I explained the point. 
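
The ESI approach to the author "join" can be illustrated with a small simulation. This is not Varnish code, only a sketch of the behaviour: the page carries an `<esi:include>` tag per author, and a cache in front resolves each tag from a per-fragment cache. The /user/<id> URLs, the fetch_fragment backend and the post data are assumptions made up for the example.

```python
import re

# Matches ESI include tags of the form <esi:include src="..."/>.
ESI_TAG = re.compile(r'<esi:include src="([^"]+)"\s*/>')

fragment_cache = {}  # url -> rendered fragment (the role Varnish plays)
backend_calls = []   # records which fragments actually hit the backend


def fetch_fragment(url):
    # Hypothetical backend: pretend to look up the author in the user store.
    backend_calls.append(url)
    user_id = url.rsplit("/", 1)[-1]
    return f'<span class="author">user {user_id}</span>'


def resolve_esi(page):
    # Replace each include tag with its fragment, fetching only on a miss.
    def substitute(match):
        url = match.group(1)
        if url not in fragment_cache:
            fragment_cache[url] = fetch_fragment(url)
        return fragment_cache[url]

    return ESI_TAG.sub(substitute, page)


def render_timeline(posts):
    # The timeline stores only the author id, never the author details.
    return "\n".join(
        f'<li>{p["text"]} by <esi:include src="/user/{p["author_id"]}"/></li>'
        for p in posts
    )


posts = [
    {"text": "first", "author_id": 7},
    {"text": "second", "author_id": 7},
    {"text": "third", "author_id": 9},
]
html = resolve_esi(render_timeline(posts))
```

Three posts by two authors trigger only two backend lookups. When a user changes their details, only that user's fragment would need to be dropped from the cache; the stored posts stay untouched.
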
> 
> Cheers,
> Deepu.
> 
> On Tue, Jul 13, 2010 at 11:38 AM, Michael Dürgner <mich...@duergner.de> wrote:
> Are your page views mostly reads or writes? If they are reads, I'd think you 
> wouldn't need Cassandra-like storage, which is tuned towards writes.
> 
> On 12.07.2010 at 23:40, Sandeep Kalidindi at PaGaLGuY.com wrote:
> 
> > Well, we were going down constantly with vB running on 3-4 dedicated 
> > servers due to huge traffic (a couple of tens of millions of page views). 
> > We are also planning some major new features, hence the shift to Cassandra 
> > with the future in mind.
> >
> > Roughly, the architecture is like this (in the order in which a request 
> > proceeds):
> >
> > 1) Varnish - PHP reads from Cassandra, and the performance isn't always 
> > good (I have yet to master it, though, so it is probably my lack of 
> > expertise here). So we make heavy use of Varnish to cache as much as 
> > possible. VCL means we can cache the same page differently for different 
> > logged-in users. ESI means no need to worry about joins. Varnish really is 
> > quite a good companion for NoSQL.
> >
> > 2) Front-end PHP servers - contain most of the template code and read 
> > directly from Cassandra and Redis.
> >
> > 3) Middleware (written in Scala + Python; planning to move the middleware 
> > to Scala completely to reduce the number of languages in production) - all 
> > writes from PHP go directly to the middleware. Cassandra is in fact mostly 
> > a store of indices, which means you need to change your strategy from 
> > MySQL's (post-computation, i.e. at read time) to precomputing all the 
> > needed indices and storing them in Cassandra. So the middleware takes care 
> > of computing the indices and storing them in Cassandra and Redis 
> > accordingly. This way PHP just submits the write to the middleware and the 
> > request can complete, while the middleware might take a couple of seconds 
> > at most to compute the indices and finish the write completely.
> >
> > 4) Cassandra + Redis clusters.
> >
> >
> > So writes are taken care of by the middleware and hence complete uber 
> > fast, and reads are also quite fast courtesy of Varnish wherever it helps.
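
The write path through the middleware could look roughly like the sketch below. It is written in Python only for illustration and makes several assumptions: an in-process queue stands in for whatever transport sits between PHP and the middleware, plain dictionaries stand in for the Cassandra and Redis indices, and the names submit_write, index_worker and the post fields are hypothetical.

```python
import queue
import threading
from collections import defaultdict

write_queue = queue.Queue()

# Stand-ins for precomputed indices that would live in Cassandra.
posts_by_thread = defaultdict(list)  # thread id -> post ids in arrival order
posts_by_user = defaultdict(list)    # user id -> post ids in arrival order


def submit_write(post):
    # Called by the front end: enqueue the write and return immediately,
    # so the web request does not wait for index computation.
    write_queue.put(post)


def index_worker():
    # Runs in the middleware: computes every index a later read will need.
    while True:
        post = write_queue.get()
        posts_by_thread[post["thread_id"]].append(post["post_id"])
        posts_by_user[post["user_id"]].append(post["post_id"])
        write_queue.task_done()


# A single daemon worker keeps writes in order and exits with the process.
threading.Thread(target=index_worker, daemon=True).start()

submit_write({"post_id": "p1", "thread_id": "t1", "user_id": "u1"})
submit_write({"post_id": "p2", "thread_id": "t2", "user_id": "u1"})
submit_write({"post_id": "p3", "thread_id": "t1", "user_id": "u2"})

# Only for this demo: block until the worker has drained the queue.
write_queue.join()

# A read is now a plain lookup, with no join and no sorting at request time.
thread_page = posts_by_thread["t1"]
user_page = posts_by_user["u1"]
```

The trade-off is that a read issued right after a write may not see the new post until the worker has caught up.
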
> >
> > Still not in production, though. Hope it helped. I would welcome anybody's 
> > suggestions on the way I am using Cassandra and whether I can do anything 
> > better.
> >
> > Cheers,
> > Deepu.
> >
> > On Tue, Jul 13, 2010 at 2:48 AM, S Ahmed <sahmed1...@gmail.com> wrote:
> > What sort of traffic levels made you port the application to Cassandra?
> >
> > Very interested in seeing this go live.
> >
> > What sort of server setup are you looking at using?
> >
> >
> > On Mon, Jul 12, 2010 at 4:39 PM, Sandeep Kalidindi at PaGaLGuY.com 
> > <sandeep.kalidi...@pagalguy.com> wrote:
> > No, we re-coded from scratch with most of the needed functionality.
> >
> > Cheers,
> > Deepu.
> >
> >
> > On Mon, Jul 12, 2010 at 7:49 PM, S Ahmed <sahmed1...@gmail.com> wrote:
> > Very interesting!
> >
> > What kind of integration do you have between vB and Cassandra? It's not a 
> > port then?
> >
> >
> > On Mon, Jul 12, 2010 at 3:34 AM, Sandeep Kalidindi at PaGaLGuY.com 
> > <sandeep.kalidi...@pagalguy.com> wrote:
> > We were one of the vBulletin customers and our forums have been facing 
> > some bad scaling issues.
> >
> > We coded our forum software to work with Cassandra. We are still testing 
> > for bugs and might go live in a couple of weeks. You can ask any specific 
> > questions about vBulletin and Cassandra and I will answer to the best of 
> > my knowledge.
> >
> > In our case a combination of Cassandra and Redis took care of most of the 
> > functionality that vBulletin offers and much more.
> >
> > Cheers,
> > Deepu.
> >
> >
> > On Mon, Jul 12, 2010 at 9:58 AM, Paul Prescod <pres...@gmail.com> wrote:
> > On Sun, Jul 11, 2010 at 8:39 AM, S Ahmed <sahmed1...@gmail.com> wrote:
> > > I want to build a vBulletin type application (forums, threads, posts, user
> > > management, etc).
> > > It should support multi-tenancy for a SaaS-type environment.
> > > Would Cassandra be suitable for this type of application?
> > >
> > >
> > > Thanks in advance.
> >
> > Most likely, it is technically a fine fit. But Cassandra is very early
> > stage software, so you should expect that the documentation will not
> > always be clear and things will change from version to version. If you
> > are not extremely self-reliant, you may find it a frustrating
> > experience. Unless you are confident you will have trouble scaling
> > traditional technologies, it might not make business sense.
> >
> >  Paul Prescod
> >
> >
> >
> >
> >
> 
> 
