The signature patterns are loaded into a pair of large AC tries. Patterns can 
be added but not removed, so they have to be swapped out all at once. It is 
conceivable that we could do 1 trie at a time, which would split the reload 
process into 2 chunks, but we have no time or budget to work on that, and the 
improvement would be minimal.
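
To put rough numbers on the "1 trie at a time" idea, here is a minimal Python sketch that compares peak memory under the two reload strategies. The trie sizes are invented for illustration, not measured, and the function names are hypothetical. It assumes a rebuilt trie is the same size as the one it replaces, and that the old copy is freed as soon as its replacement is ready.

```python
def peak_swap_all(sizes_mb):
    """Peak memory when every trie is rebuilt before any old one is freed."""
    # Old copies and new copies are all resident at the same time.
    return 2 * sum(sizes_mb)


def peak_swap_one_at_a_time(sizes_mb):
    """Peak memory when tries are rebuilt and swapped one after another."""
    # At any moment every trie is resident once (old or new), plus one
    # extra copy of whichever trie is currently being rebuilt.
    return sum(sizes_mb) + max(sizes_mb)


# Hypothetical sizes in MB for the two tries (assumed, not measured).
even = [600, 600]
lopsided = [1100, 100]

for sizes in (even, lopsided):
    print(sizes, peak_swap_all(sizes), peak_swap_one_at_a_time(sizes))
```

With these made-up sizes, two equal tries drop from a 2400 MB peak to 1800 MB, while a lopsided pair drops only from 2400 MB to 2300 MB. Under these assumptions, the gain depends on how evenly the patterns are split between the two tries.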

We are making an effort to reduce the signature count by filtering out sigs 
that haven't been effective. I'm hopeful that you will start to see the 
database size slowly decrease. We will consider providing these archived 
signatures in 
an archived-signature database for folks who wish to run with as much coverage 
as possible.

-Micah

From: clamav-users <clamav-users-boun...@lists.clamav.net> On Behalf Of 
PenguinWhispererThe via clamav-users
Sent: Wednesday, December 2, 2020 9:08 AM
To: ClamAV users ML <clamav-users@lists.clamav.net>
Cc: PenguinWhispererThe <th3penguinwhispe...@gmail.com>
Subject: Re: [clamav-users] Memory usage going up until OOM

The 2-hour interval you mention seems to explain why it always happens around 
the same time and seems to shift earlier or later by 2 hours.
I do not know if it literally coincides with that refresh (not sure if that's 
logged on my system).

Having the database reloaded in chunks looks a lot more efficient to me. OK, a 
machine with 7xx MB of memory is quite small. But even on a 16GB system the 
standard memory usage is still almost 10%, and so about 20% when reloading 
(very rough numbers, I admit). Of course I don't know how the database is 
implemented, so perhaps it's easier said than done.
In that regard, "just twice" is still a lot ;)


On Wed, 2 Dec 2020 at 11:00, Frans de Boer <fr...@fransdb.nl> wrote:
On 02/12/2020 09:34, PenguinWhispererThe via clamav-users wrote:
Hi,

I have a webserver with 4GB of memory that also functions as a mailserver.
The mail volume is rather low (perhaps a few hundred mails/day).
Almost every day around the same time I get a swap usage warning, and once in a 
while clamd crashes because it has no more swap space available, blocking mails 
from being processed.

Right now it uses about 1.3GB and all is fine. I'm using FreeBSD. I'm trying to 
see the logic why every day around the same time clamd decides to need so much 
more memory. It's not like it needs it to process the emails.

I've searched and read that clamd uses a lot of memory (30% is indeed quite a 
lot). But nowhere do I see these kinds of numbers (multiple gigabytes). Having 
it use 60% of memory (at least 2.3GB when it crashes) is getting ridiculous. 
Since all mails are processed just fine when clamd uses 1.3GB, I don't want to 
just increase the memory, as it might start using 60GB at some point for no 
clear reason.

I didn't modify cronjobs recently, so I'm unclear on why this seems to be a 
periodic thing. For weeks in a row it happened at around 15:45. Then it seemed 
to switch to 17:45, and now it has happened once at 19:45. There seems to be a 
2-hour shift in it, or it's something that happens every 2 hours and 
circumstances get "just right" later and later.

Anyone experienced the same? Knows what's going on? Has a solution to this (not 
looking for "don't run clamAV as a daemon")?

Thanks in advance!
I have not seen this before, but let me ask whether the extra memory usage 
coincides with reloading the signatures. By default, freshclam checks for 
database updates every 2 hours.

If so, it is because of a design decision made by the ClamAV team. When a 
database is reloaded, it is loaded into newly allocated memory. To allow 
continuous virus scanning, the old contents are only discarded after reloading 
completes, and processing then continues with the freshly loaded databases. 
This approach assumes that, as the databases keep growing (they now consume 
around 1.2GB of memory), server memory is expected to grow as well.
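
The reload behaviour described above can be sketched in a few lines of Python. This is a toy model, not ClamAV's actual implementation: the `Scanner` class, its methods, and the byte-string "signatures" are all hypothetical, and a plain substring check stands in for real pattern matching. The point is only that the old and new databases both exist in memory until the swap.

```python
import threading


class Scanner:
    """Toy engine that keeps scanning while a new database is loaded."""

    def __init__(self, signatures):
        self._lock = threading.Lock()
        self._db = frozenset(signatures)  # the database currently in use

    def scan(self, data):
        # Grab the current database reference; a reload may swap it later,
        # but this scan keeps using the copy it started with.
        with self._lock:
            db = self._db
        return any(sig in data for sig in db)

    def reload(self, load_signatures):
        # Build the replacement in fresh memory while the old database
        # keeps serving scans. Both copies exist until the swap below.
        new_db = frozenset(load_signatures())
        with self._lock:
            old_db, self._db = self._db, new_db
        # Dropping the last reference lets the old copy be reclaimed.
        del old_db


scanner = Scanner([b"EVIL"])
print(scanner.scan(b"some EVIL payload"))   # True
scanner.reload(lambda: [b"EVIL", b"WORSE"])
print(scanner.scan(b"even WORSE payload"))  # True
```

Scans never have to wait for a load to finish, which is the benefit of this design. The cost is that memory use roughly doubles for the duration of `reload`.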

My situation was similar to yours, and so I switched from a 32-bit system 
with 750 MB to a 64-bit one with 16GB. The switch was needed because some 
emails were not scanned due to the limited memory and the processing time 
needed by clamd.

Of course, running clamd is needed to avoid the cost of having clamscan reload 
the databases for every email, and the latency incurred by reading the 
databases from disk.

An alternative would be to use compressed databases, although I am not sure 
whether they get decompressed when the databases are loaded into memory. If 
they are not decompressed, processor speed might become a bottleneck.

As for your 60GB fear: unless the design changes, it will keep using just 
twice the "normal" memory size. So, as always, adding a few GB to your system 
memory, if possible, is the better as well as the cheapest way.
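
As a quick check of the "just twice" rule against the numbers reported in this thread (about 1.3GB steady state on a 4GB server), here is a small Python calculation. The factor of 2 is the rule of thumb stated above, not a measured value, and the helper names are hypothetical.

```python
def reload_peak_gb(steady_gb, factor=2.0):
    """Estimated peak while old and new databases coexist."""
    return steady_gb * factor


def share_of_ram(used_gb, ram_gb):
    """Percentage of installed RAM that a given usage represents."""
    return 100.0 * used_gb / ram_gb


steady = 1.3  # GB, clamd's steady-state usage reported in this thread
ram = 4.0     # GB, the server described in this thread

peak = reload_peak_gb(steady)
print(f"steady: {share_of_ram(steady, ram):.0f}% of RAM")
print(f"reload peak: {peak:.1f} GB, {share_of_ram(peak, ram):.0f}% of RAM")
```

The estimated peak of about 2.6GB, roughly 65% of 4GB, is in line with the 2.3GB or more seen when clamd crashed, so the reload explanation fits the reported figures.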

--- Frans.

--

A: Yes, just like that                            A: Ja, net zo
Q: Oh, Just like reading a book backwards         Q: Oh, net als een boek achterstevoren lezen
A: Because it upsets the natural flow of a story  A: Omdat het de natuurlijke gang uit het verhaal haalt
Q: Why is top-posting annoying?                   Q: Waarom is Top-posting zo irritant?

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml