Hey Everybody,

I've been using Django for almost a year now, and I've been spending
some time recently trying to optimize the Slicehost VPSes that I use
to run several Django sites I've developed.  I wanted to share my
findings with the larger group in hopes that my oversights can be
pointed out and whatever 'findings' I've made can be useful to folks
who are just starting off.  I've been developing a blow-by-blow of my
Slicehost setup - I gained a lot from the "dreamier django dream
server" blog post a while back.  But to keep things brief for the
first post, I'll just summarize my setup here:

512 MB Slicehost slice w/ Hardy Heron
memcached with cmemcache bindings doin' its cache thang with 256 MB
of RAM
nginx on port 80 serving static files
Apache mpm-worker on 8080 w/ mod_wsgi serving dynamic content
Postgres 8.3 w/ geo libraries
django_gis (thanks, Justin!)
my application

I'll keep it to 3 sections of musings for this post:

triage troubles
memcached musings
context-processor conundrum

triage troubles

At PyCon someone asked Jacob KM what he used to performance-test his
websites, and he said "siege".  A quick Google search turned it up
(http://www.joedog.org/JoeDog/Siege).
I seem to recall Jacob mentioning that this was his preferred method
because it was more of a "real life" test than perhaps benchmarking
tools that would profile the code.  Compiling and using siege was a
snap.  My test was of a site I wrote that does a lot of database
queries to draw up any given page (mostly because of a complex
sidebar).  When I turned siege loose, real easy like, on a dev
server, the server crumbled with only 10 simultaneous users and
anything higher than 5 clicks per user.

Observation #1: Make sure your debug settings are turned off.
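
For anyone following along, that boils down to a couple of lines in
settings.py (these are the standard Django settings, nothing exotic):

    # settings.py
    DEBUG = False
    TEMPLATE_DEBUG = DEBUG

    # With DEBUG = True, Django keeps every SQL query it runs in
    # memory (django.db.connection.queries), which adds overhead
    # under sustained load - hence part of the difference under siege.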

After I turned debug settings off, performance maybe doubled, but it
still wasn't anything that could handle even moderate traffic
gracefully.  20 simultaneous users at 3 clicks apiece were getting up
into the 20+ second wait-for-a-response range.  Basically awful.  I
wasn't shocked, because I knew that my db querying was horrendously
inefficient.  That was OK, because I had memcached up my sleeve.  One
observation from the first test that held constant throughout all
subsequent tests was that the initial requests were the fastest and
subsequent requests became progressively slower and slower.  I'm
assuming this is because of something like queries queuing up at the
db, or the box running through its memory, but I don't have enough
context or knowledge of the whole stack to isolate the problem - more
on this later.

memcached musings

I went on and compiled cmemcache because the consensus opinion on the
internets is that it's the fastest.  I'll just assume that's so
because it has 'c' in the name, and if you read it on the internets,
it must be true.

I put in all the cache settings, put in the cache middleware and ran
siege again, waiting for the glorious results.  Blam.  Exactly the
same.  Actually, a little worse.  I scratched my head for about 3
hours before I realized that I had mistyped the memcached port number
in the settings.  After that, much improved.  I could do 300
simultaneous visitors doing 3-5 clicks apiece with tolerable
performance.  1000 visitors doing 1 click each also held up very
well, the longest response time being in the 4-6 second range.
Without fail, the earliest requests were the shortest wait, many well
under a second, and the last requests were the longest waits.  Also,
as I ratcheted up the pressure from siege, I was running top on the
'besieged' server watching the running processes.  I noticed a ton of
postgres processes.  This challenged my notion of how memcached
worked.  I thought that memcached would take the resulting page for a
given view and spit it back out if the url was requested again, with
no database involved.  I was still hitting the db _a lot_.

Observation #2 (Is this thing on?): Memcached really does
dramatically improve your site's responsiveness under load; if you
don't see massive improvement, you haven't gotten memcached
configured correctly.
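
For reference, "configured correctly" in my case came down to roughly
this in settings.py - the timeout and key prefix are just example
values, and 11211 is memcached's default port (the setting I'd
managed to mistype):

    CACHE_BACKEND = 'memcached://127.0.0.1:11211/'  # default memcached port
    CACHE_MIDDLEWARE_SECONDS = 600                  # example timeout
    CACHE_MIDDLEWARE_KEY_PREFIX = 'mysite'          # example prefix

    MIDDLEWARE_CLASSES = (
        'django.middleware.cache.UpdateCacheMiddleware',    # must be first
        'django.middleware.common.CommonMiddleware',
        # ... the rest of the middleware ...
        'django.middleware.cache.FetchFromCacheMiddleware', # must be last
    )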

context-processor conundrum

Then I remembered that I had written a custom context processor that
was doing the bulk of the nasty database querying.  I reckon that,
whatever the order of operations is for request/response handling,
the result of the context processing was not getting cached.  So I
wrote 4-5 lines to check/set the cache in my custom
context_processors.py and voila, that instantly knocked the queries
to the db down to zero.  Despite the absence of postgres processes
stacking up, the same phenomenon of early requests fast, subsequent
requests slow still applied; at this point I'm not exactly sure
what's causing it.  It's not that it's surprising, it's just that I'd
like to understand exactly why it's happening.

Observation #3:  Low level cachin' works well in cases like
context_processors, or other expensive non-view functions.
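
In case it's useful, those 4-5 lines look roughly like this - the
model and cache key here are made-up placeholders, not my actual
code:

    # context_processors.py
    from django.core.cache import cache
    from myapp.models import Entry   # placeholder model

    def sidebar(request):
        """Cache the expensive sidebar queries instead of hitting
        the db on every request."""
        data = cache.get('sidebar_data')
        if data is None:
            # stand-in for the real (expensive) queries
            data = list(Entry.objects.all()[:10])
            cache.set('sidebar_data', data, 600)  # 10 minutes
        return {'sidebar': data}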

OK - I'll stop here for now.  I hope this was useful or at least
amusing.  I'd love to hear stories from other "optimization" newbies
or suggestions from the experts about how folks go about optimizing
their own projects.

Perhaps more on this to come.