Thanks for the whitepapers and the incredibly useful advice. I'm beginning to get a picture of what I should be thinking about and implementing to achieve this kind of scalability. Before I go down any particular route, here's a synopsis of the application.
1) User requests are received only at subscription time. We currently don't
have any problems with this, because subscription rates grow along a sigmoid
curve.

2) Once a user subscribes, we begin to send them content as it becomes
available.

3) The content is sports data, and content generation depends on the day. On
days when there's a lot of action, we can generate up to 20 separate items in
a second, in bursts roughly every 10 minutes.

4) The content is event driven, e.g. a goal is scored. It is therefore
imperative that we deliver the content to subscribers within 5 minutes.

> There is a difference between one million users each who make one request
> once a month, and one million users who are each hammering the system with
> ten requests a second. Number of users on its own is a meaningless
> indicator of requirements.

Quite true, and this lack of clarity was a mistake on my part. Requests from
users are not really a significant part of this equation because, as
described above, once a user subscribes the onus is on us to generate
messages throughout a given period, determined by the number of updates the
user has subscribed to receive.

5) Currently, hardware is a constraint (we're a startup and can't afford
high-end servers). I would prefer a solution that doesn't require any changes
to the hardware stack. For now, let's assume that hardware is not part of the
equation and every optimization has to be software based (apart from the
beautiful network optimizations suggested).

--
http://mail.python.org/mailman/listinfo/python-list
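To make points 3 and 4 concrete, here is a rough back-of-envelope estimate of the sustained delivery rate implied by those numbers. The subscriber count is a made-up assumption for illustration (the thread never states one); the burst size, burst interval, and deadline come from the points above.

```python
# Back-of-envelope load estimate for the fan-out described above.
# SUBSCRIBERS is an assumed figure, not one from the thread.

SUBSCRIBERS = 100_000          # assumed subscriber base (hypothetical)
ITEMS_PER_BURST = 20           # "up to 20 separate items in a second"
BURST_INTERVAL_S = 10 * 60     # a burst roughly every 10 minutes
DEADLINE_S = 5 * 60            # each item must reach everyone within 5 minutes

# Every item fans out to every subscriber.
messages_per_burst = ITEMS_PER_BURST * SUBSCRIBERS

# Worst case: a whole burst must be fully delivered inside the deadline.
required_rate = messages_per_burst / DEADLINE_S

print(f"{messages_per_burst:,} messages per burst")
print(f"~{required_rate:,.0f} messages/second sustained to meet the deadline")
# 2,000,000 messages per burst
# ~6,667 messages/second sustained to meet the deadline
```

Under these assumptions the system needs to push several thousand messages per second during a burst window, which is the number any software-only queuing or fan-out design would have to hit.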