On Wed, Apr 4, 2018 at 9:02 PM, Richard Damon <rich...@damon-family.org> wrote:
> Asynchronous processing will use a bit more of some processing resources
> to handle the multi-processing, but it can be more efficient at fully
> using many of the resources that are available.
>
> Take your file download example. When you are downloading a file, your
> processor sends out a request for a chunk of the data, then waits for
> the response. The computer on the other end gets that request, sends a
> packet, and then waits. You get that packet, send an acknowledgement,
> and then wait; the other computer gets that acknowledgement and sends
> more data, and then waits; and so on. Even if your pipe to the Internet
> is the limiting factor, there is a fair amount of dead time in this
> operation, so starting another download to fill more of that pipe can
> decrease the total time to get all the data downloaded.
Assuming that you're downloading this via a TCP/IP socket (e.g. from a
web server), the acknowledgements are going to be handled by the OS
kernel, not your process. Plus, TCP allows acknowledgements to stack,
so you're not really spending much time waiting on acks anyway. A
single socket is entirely capable of saturating one computer's uplink.
I once proved to my employer that a particular host had gigabit
internet by renting a dozen EC2 instances with 100Mbit uplinks and
having each of them transfer data to the same host concurrently - via
one socket connection each.

Much more interesting is operating a high-bandwidth server (let's say,
a web application) that is responding to requests from myriad
low-bandwidth clients. Traditional servers, such as Apache in prefork
mode, would follow a model like this:

while "more sockets":
    newsock = mainsock.accept()
    fork_to_subprocess(handler, newsock)

def handler(sock):
    while "need headers":
        sock.receive_headers()
    while "need body":
        sock.receive_body()
    generate_response()
    while "response still sending":
        sock.send_response()

A threaded model does the same thing, but instead of forking a
subprocess, it spawns a thread. The handler is pretty simple and
straightforward: it reads from the socket until it has everything it
needs, then it sends off a response. Both reading and writing can and
will block, and generating the response is the only part that's really
CPU-bound.

"Parallelism" here means two things: how many active clients can you
support (throughput), and how many dormant clients can you support
(saturation). In a forked model, you spend a lot of resources spinning
up processes (you can reduce this with process pools and such, at the
expense of code complexity and a slower spin-down when idle); in a
threaded model, you spend far less, but you're still paying a
significant price per connection, and saturation can be a problem. The
beauty of async I/O is that saturation becomes almost completely
insignificant; the cost is that throughput is capped at a single
thread's capabilities.

In theory, you could use async I/O with multiple threads pumping the
same set of events. I'm not sure if anyone has ever actually done
this, as it combines the complexities of both models, but it would
maximize both throughput and saturation levels - dormant clients cost
very little, and you're able to use multiple CPU cores. More commonly,
you could run a thread pool, doling out clients to whichever thread is
least busy, and then having each thread run an independent event loop,
which would be fine in the average case. But that's still more
complicated; you still have to think about threads. Rough sketches of
both ideas below, after my sig.

> Yes, if your single path of execution can fully use the critical
> resources, then adding asynchronous processing won't help, but rarely
> does it. Very few large machines today are single-threaded; most have
> multiple cores, and often even those cores have the ability to handle
> multiple threads at once. Thus there normally are extra resources that
> the asynchronous processing can make better use of, so even processor
> usage can be improved in many cases.

I'm not sure what tasks would allow you to reduce processor usage this
way. Got an example?

ChrisA
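
PS. Here's roughly what that handler shape looks like under asyncio
(Python 3.7+). A minimal sketch, not anyone's production server -
handle_client, the port, and the canned response are all invented for
illustration, and error handling is skipped. The point is that a
dormant client is just a coroutine parked on a read, costing little
more than its buffers:

import asyncio

async def handle_client(reader, writer):
    # Read until the blank line that ends the request headers. A real
    # server would also honour Content-Length and read the body.
    await reader.readuntil(b"\r\n\r\n")
    # Generating the response is still the only CPU-bound part.
    writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    await writer.drain()
    writer.close()

async def main():
    # One thread, one event loop, arbitrarily many parked clients.
    server = await asyncio.start_server(handle_client, "127.0.0.1", 8080)
    async with server:
        await server.serve_forever()

asyncio.run(main())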
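
And the "thread pool, one event loop per thread" idea might look
something like this. Again, just a sketch under assumptions - I
haven't run this in anger, round-robin is standing in for "pick the
least busy thread", and all the names are mine:

import asyncio, itertools, socket, threading

N_WORKERS = 4  # arbitrary; tune to your core count

def start_worker_loop():
    # Each worker thread runs its own independent event loop.
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop

async def handle(conn):
    # Wrap the already-accepted socket in asyncio streams on whichever
    # worker loop this coroutine was handed to.
    reader, writer = await asyncio.open_connection(sock=conn)
    await reader.readuntil(b"\r\n\r\n")
    writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    await writer.drain()
    writer.close()

loops = [start_worker_loop() for _ in range(N_WORKERS)]
assignments = itertools.cycle(loops)

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 8080))
listener.listen(128)
while True:
    conn, _addr = listener.accept()
    # run_coroutine_threadsafe is the piece that lets the accepting
    # thread hand work to another thread's loop.
    asyncio.run_coroutine_threadsafe(handle(conn), next(assignments))

Dormant clients still cost almost nothing, but now you get several
cores' worth of throughput - at the price of thinking about threads
again.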