On 7.10.2012, at 0.32, Peer Heinlein wrote:

> Several times already we have had the problem that accounts with more than
> 1.3 or 1.7 billion e-mails in one folder run out of memory, even if a
> vsize_limit of 750 MB is set.
> 
> In this case, the lmtpd-process wasn't able to allocate more
> memory to read/write/update the index-files and crashed (and the
> index-files ended up corrupted).

I don't think the dovecot.index file is much of a problem. With 1M mails it
usually only takes something like 8-32 MB of memory, depending on which mailbox
format is used. The dovecot.index.log file doesn't depend on the mailbox size
at all. The main problem is the dovecot.index.cache file.

I've thought about the cache file problems before, but it's a bit difficult to
figure out the best solution. And since nobody had actually complained about
it, I hadn't really done anything about it. I also hadn't previously thought of
LMTP/LDA processes crashing because of it, which is a bigger problem than an
IMAP process crashing. Although I think you're getting a lot more
"mmap(dovecot.index.cache) failed: Out of memory" errors than crashes for
large mailboxes?

So, subproblems related to this:

1. Filling up dovecot.index.cache too easily. A rather simple approach that
would catch all the possible ways would be to limit the max. size of a single
message's cache entry to X kilobytes (64?). If an entry becomes larger, it's
simply not written to the cache file.

2. Filling up memory too easily. If a long header needs to be cached or used
for other purposes (e.g. Message-ID), it's still fully read into memory. Add
some reasonable limit to the max. length of a single header. It can't be too
small, because some headers are legitimately pretty long (DKIM and such). Maybe
something like 10kB would be safe enough for everyone?

3. If existing dovecot.index.cache is larger than X MB, shrink it first below 
X. Shrinking could begin with trying to do it the nice way of removing only 
unneeded data, but if that fails it could forcibly just remove some old 
messages. The X would have to be related to the process's VSZ limit.

4. Dovecot currently doesn't close index files immediately when a mailbox is
closed, because it assumes that IMAP clients might reopen the index soon
anyway. Max 3 indexes can be kept open, so three different, already very large
indexes can be too much. I'm not sure if keeping them open is actually useful
at all. Maybe I should disable it for LMTP, or maybe just remove it completely.
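
As a rough illustration of items 1 and 2 above (this is not Dovecot code): the
sketch below caps how much of a single header line is kept in memory and skips
cache entries that grow too large. All names here (MAX_CACHE_ENTRY_SIZE,
MAX_HEADER_SIZE, read_header_capped, add_to_cache) are hypothetical, the limits
are just the 64 kB and 10 kB guesses from the list, and a "header" is
simplified to a single line without continuation lines.

```python
import io

# Hypothetical limits, taken from the guesses in the list above.
MAX_CACHE_ENTRY_SIZE = 64 * 1024  # item 1: max size of one message's cache entry
MAX_HEADER_SIZE = 10 * 1024       # item 2: max bytes of one header kept in memory


def read_header_capped(stream, limit=MAX_HEADER_SIZE):
    """Read one header line, keeping at most `limit` bytes of it in memory.

    Returns (data, truncated). The remainder of an over-long line is consumed
    in small chunks and thrown away, so the stream ends up at the next line.
    """
    data = stream.readline(limit)
    truncated = False
    last = data
    while last and not last.endswith(b"\n"):
        last = stream.readline(4096)
        if last:
            truncated = True
    return data, truncated


def add_to_cache(cache, uid, fields, limit=MAX_CACHE_ENTRY_SIZE):
    """Store a message's fields unless the entry would exceed `limit` bytes."""
    entry_size = sum(len(name) + len(value) for name, value in fields.items())
    if entry_size > limit:
        return False  # too large: simply not written to the cache
    cache[uid] = dict(fields)
    return True


if __name__ == "__main__":
    msg = io.BytesIO(b"Subject: " + b"x" * 20000 + b"\nMessage-ID: <a@b>\n")
    print(read_header_capped(msg)[1])  # whether the long Subject was truncated
    print(read_header_capped(msg)[0])  # the next header is read normally
```

The point of doing the check at the cache-entry level is that it doesn't matter
which header or field caused the growth; anything over the limit is just left
out of the cache.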

Part 3 is the one I like changing the least. An alternative solution would be 
to just not map the entire cache file into memory all at once. The code was 
actually originally designed to do just that, but munmap()ing + mmap()ing again 
wasn't very efficient. But for LMTP there's really no need to map the whole 
file. All it really wants is to read a couple of header records and then append 
to the file. Maybe it could use an alternative code path that would simply do 
that instead of mmap()ing anything. It wouldn't solve it for IMAP though.
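
A minimal sketch of what such a "read the header, then append" path might look
like, assuming a Unix system. The file layout here (a "CACH" magic plus a
version number, followed by length-prefixed records) is made up for the
example and is NOT the real dovecot.index.cache format; append_record is a
hypothetical name as well.

```python
import os
import struct
import tempfile

# Made-up layout for illustration only; NOT the real dovecot.index.cache format.
HEADER = struct.Struct("<4sI")        # magic, format version
RECORD_PREFIX = struct.Struct("<I")   # length of the record that follows


def append_record(path, payload):
    """Append one length-prefixed record after reading only the file header.

    Nothing is mmap()ed: memory use depends on the header and the new record,
    not on how large the file already is.
    """
    fd = os.open(path, os.O_RDWR | os.O_APPEND)
    try:
        magic, version = HEADER.unpack(os.pread(fd, HEADER.size, 0))
        if magic != b"CACH":
            raise ValueError("unexpected file header")
        # With O_APPEND the kernel places the write at the current end of file.
        os.write(fd, RECORD_PREFIX.pack(len(payload)) + payload)
    finally:
        os.close(fd)
    return version


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "cache")
        with open(demo_path, "wb") as f:
            f.write(HEADER.pack(b"CACH", 1))
        append_record(demo_path, b"hello")
        print(os.path.getsize(demo_path))
```

As noted above, this only helps a writer like LMTP; a reader that needs random
access to old records would still need some other approach.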

> I don't have a clear solution for that, Dovecot needs the subject
> information in its index files. But it looks like, it isn't a good idea
> to put the whole subject into the index. Maybe it's better/necessary to
> use just the first 50-70 characters for that and to keep the rest away
> from the index?

50-70 is way too little. The cached subject gets sent to the IMAP client. I 
think 200 bytes would be the minimum, and 1000 would be something I could 
probably even hardcode. But anyway, the subject isn't the only way to trigger 
this, and 1000 bytes is too low for some headers.
