Hey everybody,

I've been using Django for almost a year now, and I've recently been spending some time trying to optimize the Slicehost VPSes I use to run several Django sites I've developed. I wanted to share my findings with the larger group, in hopes that my oversights can be pointed out and that whatever 'findings' I've made can be useful to folks who are just starting off. I've been writing up a blow-by-blow of my Slicehost setup - I gained a lot from the "dreamier django dream server" blog post a while back - but to keep this first post brief, I'll just summarize my setup here:
- 512 MB Slicehost slice running Hardy Heron
- memcached with the cmemcached bindings, doin' its cache thang with 256 MB of RAM
- nginx on port 80 serving static files
- Apache mpm-worker on port 8080 with mod_wsgi serving dynamic content
- PostgreSQL 8.3 with the geo libraries
- django_gis (thanks Justin!)
- my application

I'll keep it to three sections of musings for this post:

- triage troubles
- memcached musings
- context-processor conundrum

triage troubles

At PyCon someone asked Jacob Kaplan-Moss what he used to performance-test his websites, and he said "siege". A quick Google search turned it up (http://www.joedog.org/JoeDog/Siege). I seem to recall Jacob mentioning that he preferred it because it's more of a "real life" test than benchmarking tools that profile the code. Compiling and using siege was a snap. My test was against a site I wrote that does a lot of database queries to draw any given page (mostly because of a complex sidebar). When I turned siege loose, real easy like, on a dev server, the server crumbled with only 10 simultaneous users and anything more than 5 clicks per user.

Observation #1: Make sure your debug settings are turned off. After I turned DEBUG off, performance maybe doubled, but it still wasn't anything that could handle even moderate traffic gracefully: 20 simultaneous users at 3 clicks apiece were waiting 20+ seconds for a response. Basically awful. I wasn't shocked, because I knew my db querying was horrendously inefficient. That was OK, though, because I had memcached up my sleeve.

One thing I noticed on that first test, and that stayed constant through all subsequent tests, was that the initial requests were the fastest and subsequent requests got progressively slower and slower. I assume this is something like queries queuing up at the db, or requests piling up in memory, but I don't have enough context or knowledge of the whole stack to isolate the problem - more on this later.

memcached musings

I went ahead and compiled cmemcache, because the consensus opinion on the internets is that it's the fastest. I'll just assume that's so, because it has 'c' in the name, and if you read it on the internets it must be true. I put in all the cache settings, added the cache middleware (settings sketch below), and ran siege again, waiting for the glorious results. Blam. Exactly the same. Actually, a little worse. I scratched my head for about 3 hours before I realized I had mistyped the memcached port number in the settings. After that, much improved: I could do 300 simultaneous visitors doing 3-5 clicks apiece with tolerable performance, and 1000 visitors doing 1 click each also held up very well, with the longest response times in the 4-6 second range. Without fail, the earliest requests had the shortest waits, many well under a second, and the last requests had the longest.

Also, as I ratcheted up the pressure from siege, I ran top on the 'besieged' server and watched the running processes. I noticed a ton of postgres processes. This challenged my notion of how memcached worked: I thought memcached would take the resulting page for a given view and spit it back out whenever that url was requested again, with no database involved. Instead, I was still hitting the db _a lot_.

Observation #2 (Is this thing on?): Memcached really does dramatically improve your site's responsiveness under load. If you don't see a massive improvement, you haven't got memcached configured correctly.
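Since both observations so far boil down to settings.py, here's roughly what the relevant bits look like - a sketch assuming a Django of this vintage, with example values rather than gospel (and the port is exactly the kind of thing I fat-fingered):

    # settings.py (sketch - values are examples, not gospel)
    DEBUG = False            # Observation #1: with DEBUG on, Django also keeps a copy of every query in memory
    TEMPLATE_DEBUG = False

    # old-style cache backend URI; 11211 is memcached's default port,
    # and this line is where I mistyped mine
    CACHE_BACKEND = 'memcached://127.0.0.1:11211/'

    CACHE_MIDDLEWARE_SECONDS = 600           # hypothetical timeout
    CACHE_MIDDLEWARE_KEY_PREFIX = 'mysite'   # hypothetical prefix

    MIDDLEWARE_CLASSES = (
        # check the cache docs for ordering on your version; newer trees split
        # this into UpdateCacheMiddleware / FetchFromCacheMiddleware
        'django.middleware.cache.CacheMiddleware',
        'django.middleware.common.CommonMiddleware',
        # ... the rest of your middleware ...
    )

If siege still shows no improvement after something like the above, telnet to the memcached port and make sure you can actually reach it before burning 3 hours like I did.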
context-processor conundrum

Then I remembered that I had written a custom context processor that was doing the bulk of the nasty database querying. I reckon that, wherever context processing falls in the order of operations for request/response handling, its result was not getting cached. So I wrote 4-5 lines to check/set the cache in my custom context_processors.py (sketch at the end of this post), and voila - that instantly knocked the queries hitting the db down to zero. Despite the absence of postgres processes stacking up, the same phenomenon of early requests fast, subsequent requests slow still applied, and at this point I'm not exactly sure what's causing it. It's not that it's surprising; it's just that I'd like to understand exactly why it's happening.

Observation #3: Low-level cachin' works well in cases like context processors and other expensive non-view functions.

OK - I'll stop here for now. I hope this was useful, or at least amusing. I'd love to hear stories from other "optimization" newbies, or suggestions from the experts about how they go about optimizing their own projects. Perhaps more on this to come.
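P.S. For the curious, the check/set dance in the context processor was essentially the following - a minimal sketch, where the function names, cache key, and timeout are all hypothetical stand-ins for whatever your sidebar actually needs:

    # context_processors.py (sketch - names, key, and timeout are hypothetical)
    from django.core.cache import cache

    def _expensive_sidebar_queries():
        # stand-in for the pile of queries the real sidebar runs
        return []

    def sidebar(request):
        # hit the db only on a cache miss; every page then shares one copy
        data = cache.get('sidebar_data')
        if data is None:
            data = _expensive_sidebar_queries()
            cache.set('sidebar_data', data, 60 * 10)   # ten-minute timeout
        return {'sidebar': data}

Hook it up by adding the processor's dotted path (e.g. 'myapp.context_processors.sidebar' - path hypothetical) to TEMPLATE_CONTEXT_PROCESSORS. Note that cache.get() returns None on a miss, which is what the sentinel check is keying off.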