Patrick Ben Koetter:
> > I think that a design (the stage before code is written) should
> > consider how scoring would play with the other tests that postscreen
> > implements, and how it would play with things that I intend to add
> > such as light-weight greylisting.
> >
> > We can model postscreen-like programs in several ways. In all cases
> > the program subjects each SMTP client to a number of tests (permanent
> > white/blacklist, RBL lookup, pregreet, greylist, other).
>
> How about a postscreen API to external logic somewhere along the concept of
> smtpd policy services?
I created postscreen because it is becoming too expensive to spend one server process per zombie connection. Instead, one postscreen process manages up to thousands of inbound connections simultaneously, and drops the majority of them before they can affect Postfix's availability for legitimate clients.

However, handling thousands of connections in one process presents some unique opportunities and challenges. The main challenge is that postscreen must not become a bottleneck itself.

- It would be a mistake to move arbitrary pieces of smtpd code into a single postscreen process. The reason that Postfix achieves a high level of robustness and security is that a process can simply die when it runs into an error; this simple error recovery model eliminates a vast number of opportunities to introduce security and other problems when trying to back out from an error. Obviously, dying on error is undesirable when a process like postscreen manages thousands of connections, even if it manages connections from "good" clients for only a split second. The practical way to avoid errors is to keep postscreen simple, and to implement all the complex functionality in processes that can safely die when they run into an error. These may be familiar Postfix processes like smtpd, or they may be dedicated postscreen helper processes (e.g., to implement DNS or TLS).

- It would be a mistake to make postscreen dependent on one policy daemon process per SMTP client (especially if it's a memory pig like a Perl or similar interpreter). This would defeat the purpose of managing thousands of zombie connections in one process.

- It would be a mistake to outsource all postscreen decisions to a single policy daemon process. I don't want to sound arrogant, but most people simply aren't up to building performant and robust servers that manage a large number of overlapping requests simultaneously, and that recover safely from all foreseeable errors.
  This is the wrong solution if we want an easy-to-deploy extension interface. Case in point: in the Postfix project, postscreen is the first code in 12 years that handles a large number of overlapping requests. If such code is not developed with great care, then it will become a single point of failure.

- It may be possible to outsource postscreen decisions to a pool of single-threaded policy daemon processes. This requires that the request/reply latency for each request is small enough; otherwise we still end up with one policy daemon process per zombie, or with one postscreen process that is waiting for policy daemon replies. When that happens, we have just added another failure mode to Postfix. Example: someone used MySQL lookups with postscreen, and ran into all kinds of bizarre performance anomalies under load. I had to force him to stop using MySQL, by logging nasty warning messages when a table access took more than a few 100 ms. These checks waste CPU cycles, but they are necessary to protect postscreen's reputation as a system that you can rely on.

Postscreen outsources DNSBL lookups to single-threaded dnsblog processes because of a lack of program development time, not because it is safe to do so. With a minor change to the Postfix DNS reply parser code, one postscreen process should be able to manage thousands of overlapping DNSBL lookups by itself.

> People might even be able to implement bandwidth throttling where Postfix
> provides the data and an external 'postscreen policy' daemon controls
> firewall settings.

Mailchannels already implements bandwidth throttling entirely in user space, without any need for firewall kludgery. Their technology is patented (I have no problem with that), but it means that I'm not likely to release code that emulates their functionality.

	Wietse
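[Editor's illustration.] Two of the design points above can be sketched together: one event-driven process serves many sockets at once without ever blocking, and a latency guard complains when a per-client lookup stalls the shared loop. This is a minimal sketch in Python, not postscreen source (postscreen is written in C around Postfix's own event library); all function names, thresholds, and replies here are invented for illustration:

```python
# Illustrative sketch (not postscreen code): a single event-driven process
# multiplexes many client sockets, and a latency guard warns when any
# lookup stalls the shared loop.  All names and values are invented.
import logging
import selectors
import socket
import time

LOOKUP_BUDGET_SEC = 0.1          # "more than a few 100 ms" -> pick a budget
sel = selectors.DefaultSelector()

def guarded_lookup(lookup, key):
    """Time a table lookup; in a single-process server, one slow lookup
    delays every other connection, so complain loudly when it happens."""
    start = time.monotonic()
    result = lookup(key)
    elapsed = time.monotonic() - start
    if elapsed > LOOKUP_BUDGET_SEC:
        logging.warning("lookup for %r took %.0f ms; this stalls every "
                        "connection in the event loop", key, elapsed * 1000)
    return result

def handle(conn, lookup):
    """Serve one readable client socket without ever blocking on it."""
    data = conn.recv(1024)
    if data and guarded_lookup(lookup, data.strip()) == "whitelisted":
        conn.sendall(b"220 welcome\r\n")      # a real screener hands off here
    elif data:
        conn.sendall(b"421 try again later\r\n")
    sel.unregister(conn)                       # done with this client
    conn.close()

def serve_once(timeout=1.0):
    """Run one iteration of the event loop over all registered sockets."""
    for key, _mask in sel.select(timeout):
        key.data(key.fileobj)                  # key.data is the callback
```

Registering a listening socket whose callback accepts and registers new client sockets completes the picture. The essential constraint is that no callback may block, which is why anything slow or failure-prone (TLS, DNS) belongs in helper processes that are allowed to die.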