Hello Andrzej,

thanks for your suggestion. However, increasing ulimit -n did not help.

Adding -v in master.cf produced lots of debugging output, but none of it
pointed to the cause.

After attaching strace to the local daemon, I observed a kind of loop
when it evaluated the .forward files in the users' home directories.
Several of them have the form /home/username/.forward = "\username".
These files are produced by a vacation plugin for our SquirrelMail web
access to the mailboxes when users disable forwarding and/or vacation
messages.
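
To get an overview of how many such files exist, a small script along
these lines might help. It is only a sketch: it assumes that every home
directory sits directly under /home and is named after the login, and
the function names are made up for illustration. It also notes whether a
file contains carriage returns, since the plugin seems to write CRLF
line endings; whether that matters for the loop is not established here.

```python
from pathlib import Path


def is_self_forward(content: bytes, username: str) -> bool:
    """True if the file holds nothing but "\\username" (blank lines ignored)."""
    text = content.decode("utf-8", "replace")
    entries = [line.strip() for line in text.splitlines()]
    entries = [entry for entry in entries if entry]
    return entries == ["\\" + username]


def find_self_forwards(home_root="/home"):
    """Return (path, has_carriage_return) for every self-referencing .forward."""
    hits = []
    for fwd in sorted(Path(home_root).glob("*/.forward")):
        user = fwd.parent.name  # assumption: directory name equals login name
        try:
            data = fwd.read_bytes()
        except OSError:
            continue  # unreadable file, skip it
        if is_self_forward(data, user):
            hits.append((fwd, b"\r" in data))
    return hits


if __name__ == "__main__":
    for path, has_cr in find_self_forwards():
        print(path, "(contains CR)" if has_cr else "")
```

Run as root so that the 0600 files are readable; the script only lists
the files and does not change or delete anything.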

After mail had been delivered to almost all of the recipients (many of
them have these .forward files, which were evaluated without any
problem), strace showed repeated accesses to the .forward file of the
same user, about 4300 (!) times, probably until the above ulimit is
reached and the process segfaults.

As I haven't figured out yet what causes the loop, I'd appreciate any
other ideas.

Here is a snippet of the loop's output, with lots of similar blocks
before and after this one. The only parameter that changes is the 4304
here, which I believe to be the file descriptor. This number is
incremented with each loop iteration.

=======================================================
Aug  9 12:06:05 postfix logger: lstat64("/home/wre/.forward",{st_mode=S_IFREG|0600, st_size=6, ...}) = 0
Aug  9 12:06:05 postfix logger: geteuid32()                             = 1193
Aug  9 12:06:05 postfix logger: setresuid32(-1, 0, -1)                  = 0
Aug  9 12:06:05 postfix logger: setresgid32(-1, 51, -1)                 = 0
Aug  9 12:06:05 postfix logger: setgroups32(1, [51])                    = 0
Aug  9 12:06:05 postfix logger: setresuid32(-1, 51, -1)                 = 0
Aug  9 12:06:05 postfix logger: geteuid32()                             = 51
Aug  9 12:06:05 postfix logger: getegid32()                             = 51
Aug  9 12:06:05 postfix logger: geteuid32()                             = 51
Aug  9 12:06:05 postfix logger: setresuid32(-1, 0, -1)                  = 0
Aug  9 12:06:05 postfix logger: setresgid32(-1, 100, -1)                = 0
Aug  9 12:06:05 postfix logger: setgroups32(1, [100])                   = 0
Aug  9 12:06:05 postfix logger: setresuid32(-1, 1193, -1)               = 0
Aug  9 12:06:05 postfix logger: open("/home/wre/.forward", O_RDONLY)    = 4304
Aug  9 12:06:05 postfix logger: geteuid32()                             = 1193
Aug  9 12:06:05 postfix logger: setresuid32(-1, 0, -1)                  = 0
Aug  9 12:06:05 postfix logger: setresgid32(-1, 51, -1)                 = 0
Aug  9 12:06:05 postfix logger: setgroups32(1, [51])                    = 0
Aug  9 12:06:05 postfix logger: setresuid32(-1, 51, -1)                 = 0
Aug  9 12:06:05 postfix logger: fcntl64(4304, F_GETFD)                  = 0
Aug  9 12:06:05 postfix logger: fcntl64(4304, F_SETFD, FD_CLOEXEC)      = 0
Aug  9 12:06:05 postfix logger: read(4304, "\\wre\r\n", 4096)           = 6
Aug  9 12:06:05 postfix logger: time(NULL)                              = 1281348364
Aug  9 12:06:05 postfix logger: time(NULL)                              = 1281348364
Aug  9 12:06:05 postfix logger: geteuid32()                             = 51
Aug  9 12:06:05 postfix logger: getegid32()                             = 51
Aug  9 12:06:05 postfix logger: geteuid32()                             = 51
Aug  9 12:06:05 postfix logger: setresuid32(-1, 0, -1)                  = 0
Aug  9 12:06:05 postfix logger: setresgid32(-1, 100, -1)                = 0
Aug  9 12:06:05 postfix logger: setgroups32(1, [100])                   = 0
Aug  9 12:06:05 postfix logger: setresuid32(-1, 1193, -1)               = 0
============================================================================
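
For anyone who wants to check a longer capture, the pattern can be
summarized with a short script. This is a rough sketch, not a finished
tool: it assumes the strace output went through logger into a text file
as above, and the function names are invented for this example. It
counts successful open() calls per path and records the highest
descriptor returned, so a file that is opened thousands of times without
being closed stands out.

```python
import re
import sys
from collections import Counter

# Matches e.g.  open("/home/wre/.forward", O_RDONLY) = 4304
OPEN_RE = re.compile(r'open\("([^"]+)",[^)]*\)\s*=\s*(-?\d+)')


def join_wrapped(lines):
    """Re-attach results such as "= 4304" that were wrapped onto their own line."""
    joined = []
    for line in lines:
        stripped = line.strip()
        # "=====" separator lines are left alone
        if joined and stripped.startswith("=") and not stripped.startswith("=="):
            joined[-1] = joined[-1].rstrip() + " " + stripped
        else:
            joined.append(line.rstrip("\n"))
    return joined


def summarize_opens(lines):
    """Return (number of opens per path, highest descriptor seen per path)."""
    counts = Counter()
    highest = {}
    for line in join_wrapped(lines):
        match = OPEN_RE.search(line)
        if not match:
            continue
        path, fd = match.group(1), int(match.group(2))
        if fd < 0:
            continue  # failed open, no descriptor was allocated
        counts[path] += 1
        highest[path] = max(fd, highest.get(path, -1))
    return counts, highest


if __name__ == "__main__":
    with open(sys.argv[1]) as handle:
        counts, highest = summarize_opens(handle)
    for path, n in counts.most_common(10):
        print(n, "opens, highest fd", highest[path], path)
```

The script takes the log file name as its only argument. If the
descriptor numbers keep climbing for one path, the file is apparently
never closed between iterations.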

For now, deleting all the "quasi empty" .forward files seems to solve
the problem, but I fear running into the same thing when the number of
recipients increases.

Dominik



On 05.08.2010 12:58, Andrzej Kukuła wrote:
> On Wed, Aug 4, 2010 at 10:39, Dominik Storck <domi...@storck.net> wrote:
>>
>> This has been working perfectly for years. Now the number of recipients
>> for some of these lists has increased to more than 200.
>>
>> When a mail is sent to these recipients, mail delivery starts as expected
>> but stops shortly before the end of the list. The exact count changes,
>> probably due to the different state of concurrent mail queue entries.
>>
>> The error message is an "unknown mail transport error"; the mail stays
>> in the queue and delivery starts over again from the beginning until I
>> remove the mail from the queue.
>>
>> I believe there is some limit of 200 recipients, queue entries or
>> whatever.
>
> I'd speculate it's a low open file limit in the operating system. I had this
> once when my 'everyone' alias exceeded several hundred users. See
> ulimit -n
> Increase it in your postfix startup script to, say, 100000, and
> observe the difference.
>
> Regards,
> Andrzej

