In article <7xy5pgqwto....@ruckus.brouhaha.com>, Paul Rubin <no.email@nospam.invalid> wrote:
> John Nagle <na...@animats.com> writes:
> > I may do that to prevent the stall.  But the real problem was all
> > those DNS requests.  Parallelizing them wouldn't help much when it
> > took hours to grind through them all.
>
> True dat.  But building a DNS cache into the application seems like
> a kludge.  Unless the number of requests is insane, running a
> caching nameserver on the local box seems cleaner.

I agree that application-level name caching is "wrong", but sometimes
doing it the wrong way just makes sense.  I could whip up a simple
caching wrapper around getaddrinfo() in 5 minutes (there's a sketch at
the end of this post).  Depending on the environment (both technology
and bureaucracy), getting a caching nameserver installed might take
anywhere from 5 minutes, to a few days, to kicking a dead whale down
the beach (if you need to involve your corporate IT department), to it
just ain't happening (if you need to involve your corporate IT
department).

Doing DNS caching correctly is non-trivial.  In fact, if you're
building it on top of getaddrinfo(), it may be impossible, since I
don't think getaddrinfo() exposes all the data you need (i.e., TTL
values).  But doing a half-assed job of cache expiration is still
better than not expiring your cache at all.  I would suggest (from
experience) that if you build a getaddrinfo() wrapper, you have cache
entries time out after a fairly short time.  From the problem
description, it sounds like a 1-minute timeout would get you 99% of
the benefit and might keep you from doing some bizarre things.

PS -- I've also learned from experience that nscd can mess up.  If DNS
starts doing stuff that doesn't make sense, my first line of attack is
usually killing and restarting the local nscd.  Often enough, that
solves the problem, and it rarely causes any problems that anybody
would notice.
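For concreteness, here's a rough, untested sketch of the kind of
wrapper I mean.  The names are arbitrary, and the 60-second TTL is
just the 1-minute timeout suggested above, since getaddrinfo() gives
us no real DNS TTL to honor:

    import socket
    import threading
    import time

    _TTL = 60.0   # arbitrary expiry; getaddrinfo() exposes no real TTL
    _cache = {}   # (host, port, args, kwargs) -> (timestamp, result)
    _lock = threading.Lock()

    def cached_getaddrinfo(host, port, *args, **kwargs):
        key = (host, port, args, tuple(sorted(kwargs.items())))
        now = time.monotonic()
        with _lock:
            entry = _cache.get(key)
            if entry is not None and now - entry[0] < _TTL:
                return entry[1]
        # Cache miss or expired entry: do the real lookup.  Failures
        # (socket.gaierror) are deliberately not cached.
        result = socket.getaddrinfo(host, port, *args, **kwargs)
        with _lock:
            _cache[key] = (now, result)
        return result

Expired entries are never purged here, so a long-running process that
looks up many distinct hostnames would eventually want some eviction;
for a batch job grinding through a list of names once, it hardly
matters.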