On Sun, Jun 8, 2008 at 2:11 AM, lgr888999 <[EMAIL PROTECTED]> wrote:
> how you would build a huge decentralized system. Now of course it
> would depend on what the purpose of the system is, so let's just take
> Twitter as an example. :)
It's easy: all you have to do is avoid every single point of failure and every possible bottleneck. Just that ;-) In practice, though, this is *very* complicated.

Take as an example a pretty simple website with the three classic tiers: web servers, application logic, and a database backend.

The first bottleneck and point of failure is the path to your internet-facing servers. This is relatively easy to avoid by acquiring your own provider-independent ("portable") block of IP addresses and having multiple paths to the wider net through independent ISPs. Provided you have datacenters in multiple locations, most of the internet will get pretty reliable access to your service.

Another major bottleneck is, indeed, the database. Unless you operate at the scale of Google or Yahoo, with their custom replicated/redundant datastore solutions, you'll probably end up with some sort of SQL backend. You shouldn't aim for every connected client seeing every DB update immediately, in no time. It helps a lot if you can identify "clouds" of objects that must appear to update synchronously, while the rest may be updated when their time comes. For instance, a Twitter user who posts a message must see it on his own page immediately, otherwise he'll get confused. Whether his friends see it in 1 second or 1 minute is, in most cases, not that important. Objects directly related to one user's session are obviously in the "synchronous cloud", the others are in the "async cloud", and it's not critical that one session has immediate access to other sessions' clouds.

The importance of this separation comes up once you have to deal with multiple geographically distant datacenters (DCs). You can run a DB cluster in each of them (Oracle RAC, MySQL NDB Cluster, or something similar) and then design replication strategies between the datacenters. This is probably one of the most difficult parts of the application design.
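The sync/async cloud split can be sketched in a few lines of Python. This is purely illustrative (all the names and the in-memory dict "datastore" are made up): the author's own timeline is updated synchronously so he gets read-your-writes, while followers' timelines are filled in later by a background worker, standing in for a cross-DC replication queue.

```python
# Hypothetical sketch of the "synchronous cloud" vs "async cloud" split.
# The poster's own timeline is updated synchronously (read-your-writes);
# followers' timelines are updated later by a background worker.
import queue
import threading

timelines = {}                            # user -> list of messages (datastore stand-in)
followers = {"alice": ["bob", "carol"]}   # who follows whom (hypothetical data)
fanout = queue.Queue()                    # stand-in for an async replication queue

def post(user, message):
    # Synchronous cloud: the author must see his post immediately.
    timelines.setdefault(user, []).append(message)
    # Async cloud: everyone else can wait; just enqueue the work.
    fanout.put((user, message))

def fanout_worker():
    while True:
        user, message = fanout.get()
        if user is None:
            break                         # shutdown sentinel
        for f in followers.get(user, []):
            timelines.setdefault(f, []).append(message)
        fanout.task_done()

worker = threading.Thread(target=fanout_worker)
worker.start()

post("alice", "hello world")              # alice sees it at once...
assert "hello world" in timelines["alice"]
fanout.join()                             # ...bob and carol only once the worker has run
assert "hello world" in timelines["bob"]
fanout.put((None, None))
worker.join()
```

In a real deployment the queue would be a persistent message broker and the worker would live in another DC, but the contract is the same: the write the user cares about commits locally, everything else is eventually consistent.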
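One of the simplest inter-DC replication strategies is timestamp-based last-write-wins. The sketch below is purely illustrative (row format and DC names are invented) and is only safe for data where losing one of two concurrent updates is acceptable; anything needing global uniqueness has to be handled differently.

```python
# Illustrative last-write-wins merge for rows replicated between DCs.
# Each row carries a timestamp; on conflict, the newer write wins.
# Ties are broken deterministically by DC id so all DCs converge
# to the same value regardless of replication order.

def merge_row(local, remote):
    """Return the winning version of a row replicated from another DC."""
    if local is None:
        return remote
    if remote is None:
        return local
    return max(local, remote, key=lambda row: (row["ts"], row["dc"]))

# Two DCs holding conflicting versions of the same row (hypothetical data):
dc1 = {"user:42": {"bio": "hello", "ts": 100, "dc": "DC1"}}
dc2 = {"user:42": {"bio": "hi there", "ts": 105, "dc": "DC2"}}

# Replicate DC2's changes into DC1; the newer write (ts=105) wins.
for key, remote_row in dc2.items():
    dc1[key] = merge_row(dc1.get(key), remote_row)
```

Note the deterministic tie-break: without it, two DCs that each saw the "other" row last could end up disagreeing forever.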
You must ensure the replication is resilient against things like conflicting updates (since transactions won't span multiple DCs), which lead e.g. to duplicate keys. And there's much more. Some things will require a "global ack" from all DCs worldwide before they can be committed; e.g. registering a new user must ensure that the same name is not being registered at the same time somewhere else. On the other hand, things like currently logged-in users and their session information may not need to be replicated elsewhere at all. These tend to be high-volume data and are often better treated differently from "real content".

Luckily for you, most user sessions will send all their requests to just one DC, because routing paths in the internet are quite stable. However, it may happen that a user starts a session talking to DC1 and after a while is transferred to DC2. In that case you can require a re-login or, better and more user-friendly, fetch his session data from his "home" DC.

As you can see, building a distributed, scalable website is not so much about Django or the web application; the core part is the datastore management. All of the above comes from my experience with the operational management of a major news site with three distinct datacenters on two continents and millions of page views a day. Our setup is indeed much more complex, with different subsystems having their own specific requirements, but the simplification above is sufficient for sharing some hints.

Hope that helps ;-)

JDJ

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/django-users?hl=en