Hello Andrzej,

thanks for your suggestion. However, increasing ulimit -n did not help.
Adding -v in master.cf produced a lot of debugging output, but none of it led to the cause. After attaching strace to the local daemon, I observed a kind of loop when it evaluated the .forward files in the users' home directories. Several of them have the form /home/username/.forward = "\username". These files are produced by a vacation plugin for our SquirrelMail web access to the mailboxes when users disable forwarding and/or vacation messages.

After mail had been delivered to almost all of the recipients (many of them have these .forward files, which were evaluated without any problem), strace showed repeated accesses to the .forward file of the same user, about 4300 (!) times, probably until the above ulimit is reached and the process segfaults.

As I have not yet figured out what causes the loop, I'd appreciate any other ideas.

Here is a snippet of the loop's output; there are lots of similar blocks before and after this one. The only parameter that changes is the 4304 here, which I believe to be the file handle. This number is incremented with each loop iteration.
=======================================================
Aug 9 12:06:05 postfix logger: lstat64("/home/wre/.forward",{st_mode=S_IFREG|0600, st_size=6, ...}) = 0
Aug 9 12:06:05 postfix logger: geteuid32() = 1193
Aug 9 12:06:05 postfix logger: setresuid32(-1, 0, -1) = 0
Aug 9 12:06:05 postfix logger: setresgid32(-1, 51, -1) = 0
Aug 9 12:06:05 postfix logger: setgroups32(1, [51]) = 0
Aug 9 12:06:05 postfix logger: setresuid32(-1, 51, -1) = 0
Aug 9 12:06:05 postfix logger: geteuid32() = 51
Aug 9 12:06:05 postfix logger: getegid32() = 51
Aug 9 12:06:05 postfix logger: geteuid32() = 51
Aug 9 12:06:05 postfix logger: setresuid32(-1, 0, -1) = 0
Aug 9 12:06:05 postfix logger: setresgid32(-1, 100, -1) = 0
Aug 9 12:06:05 postfix logger: setgroups32(1, [100]) = 0
Aug 9 12:06:05 postfix logger: setresuid32(-1, 1193, -1) = 0
Aug 9 12:06:05 postfix logger: open("/home/wre/.forward", O_RDONLY) = 4304
Aug 9 12:06:05 postfix logger: geteuid32() = 1193
Aug 9 12:06:05 postfix logger: setresuid32(-1, 0, -1) = 0
Aug 9 12:06:05 postfix logger: setresgid32(-1, 51, -1) = 0
Aug 9 12:06:05 postfix logger: setgroups32(1, [51]) = 0
Aug 9 12:06:05 postfix logger: setresuid32(-1, 51, -1) = 0
Aug 9 12:06:05 postfix logger: fcntl64(4304, F_GETFD) = 0
Aug 9 12:06:05 postfix logger: fcntl64(4304, F_SETFD, FD_CLOEXEC) = 0
Aug 9 12:06:05 postfix logger: read(4304, "\\wre\r\n", 4096) = 6
Aug 9 12:06:05 postfix logger: time(NULL) = 1281348364
Aug 9 12:06:05 postfix logger: time(NULL) = 1281348364
Aug 9 12:06:05 postfix logger: geteuid32() = 51
Aug 9 12:06:05 postfix logger: getegid32() = 51
Aug 9 12:06:05 postfix logger: geteuid32() = 51
Aug 9 12:06:05 postfix logger: setresuid32(-1, 0, -1) = 0
Aug 9 12:06:05 postfix logger: setresgid32(-1, 100, -1) = 0
Aug 9 12:06:05 postfix logger: setgroups32(1, [100]) = 0
Aug 9 12:06:05 postfix logger: setresuid32(-1, 1193, -1) = 0
============================================================================

For now, deleting all the "quasi empty" .forward files seems to solve the problem, but I fear running into the same thing again when the number of recipients increases.

Dominik

On 05.08.2010 12:58, Andrzej Kukuła wrote:
> On Wed, Aug 4, 2010 at 10:39, Dominik Storck <domi...@storck.net> wrote:
>>
>> This has been working perfectly for years. Now the number of recipients
>> for some of these lists has increased to more than 200.
>>
>> When a mail is sent to these recipients, mail delivery starts as expected
>> but stops shortly before the end of the list. The exact count changes,
>> probably due to the different state of concurrent mail queue entries.
>>
>> The error message is an "unknown mail transport error"; the mail stays
>> in the queue and delivery starts over again from the beginning until I
>> remove the mail from the queue.
>>
>> I believe there is some limit of 200 recipients, queue entries or whatever.
>
> I'd speculate it's a low open file limit in the operating system. I had this
> once when my 'everyone' alias exceeded several hundred users. See
> ulimit -n
> Increase it in your postfix startup script to, say, 100000, and
> observe the difference.
>
> Regards,
> Andrzej
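
P.S. In case it is useful to anyone hitting the same thing: the "quasi empty" .forward files mentioned above could be located with something like the following. This is only a rough sketch, not what I actually ran. The function name find_self_forwards is made up for illustration, and it assumes that the user name is identical to the name of the home directory (as with /home/wre above). It only lists the files and does not delete anything.

```python
from pathlib import Path


def find_self_forwards(home_root):
    """Return .forward files whose only entry is "\\<username>".

    Assumption: the user name equals the name of the home directory.
    """
    hits = []
    for home in sorted(Path(home_root).iterdir()):
        forward = home / ".forward"
        if not forward.is_file():
            continue
        # strip() also drops a trailing CR/LF like the "\r\n" in the strace read()
        content = forward.read_text(errors="replace").strip()
        if content == "\\" + home.name:
            hits.append(forward)
    return hits
```

Calling find_self_forwards("/home") with sufficient read permissions should return the candidate paths, which can then be reviewed by hand before removing anything.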