On 26 Apr 2012 at 21:18, Török Edwin wrote:

> On 04/26/2012 08:37 PM, Michael Orlitzky wrote:
> > On 04/26/2012 10:32 AM, Dennis Peterson wrote:
> >> On 4/25/12 7:34 AM, Michael Orlitzky wrote:
> >>> On 04/25/12 07:55, Török Edwin wrote:
> >>>>>
> >>>>> I don't know if this can help speeding up the process but I
> >>>>> collected some statistics on clamscan of a small file
> >>>>> (wallclock duration: ~25sec):
> >>>>
> >>>> I think I'm missing some context here: which DB files are slow to
> >>>> load? The official ones? Just the sanesecurity ones? Any particular
> >>>> DB from the sanesecurity ones?
> >>>
> >>> My problem isn't so much that it takes a while to load the
> >>> signatures, but that clamd (and thus the mail server) is effectively
> >>> down the entire time.
> >>
> >> This has been a problem on every Sparc system I've ever installed
> >> ClamAV on and that goes back quite a few years. I still use it on
> >> several Netra 500 MHz pizza boxes. It is also quite a memory hole
> >> which is more related to the available memory and number of sigs, so
> >> on memory constrained systems I've cut back on the number of SS
> >> signatures. And at my peril, I might add, as they have long been the
> >> most valuable in terms of results. And because of the dead time when
> >> reloading I've cut freshclam to once a day. That has resulted in a
> >> net improvement in detections because of the higher availability
> >> time.
> >>
> >
> > The signature databases are created once, and loaded thousands of
> > times. They should just be sorted, so that lookups are instantaneous.
> >
> > Then it's trivial to update the databases in the background, because
> > you can quickly determine if a particular signature was added or
> > deleted. The wall-time-elapsed would be a bit worse, but nobody would
> > care.
>
> It's a bit more complicated than that. To ensure fast pattern-matching
> the signatures are loaded into an Aho-Corasick trie, for example.
> It would be possible to add to the trie (that's what happens when
> loading signatures), but removing is more tricky.
> And to determine what to remove you need to go through all the
> signatures in the database anyway.
> Also, updating the loaded signature database would require the scanning
> threads to take read locks, which would slow things down and make
> updating it harder (right now the loaded signature database is never
> modified, hence no locks are needed).
>
> It would be easier to just move reload_db to a different thread and
> allow scanning with the old database during the DB reload.
> Then, when the DB reload is finished, atomically replace the engine
> pointer and free the old engine.
> Downside would be that you get twice the memory usage during reload,
> but you don't have downtime, so this should probably be controlled by
> a flag in clamd.conf.
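
To make the reload scheme quoted above concrete, here is a minimal sketch in Python rather than clamd's C. Everything in it is an assumption for illustration: `Engine`, `Scanner`, `reload_in_background` and the `load_signatures` callback are hypothetical names, not ClamAV APIs, and a naive substring search stands in for the Aho-Corasick trie. It only shows the shape of the idea: the loaded database is never modified, a worker thread builds a second one, and a single reference swap puts it into service.

```python
import threading


class Engine:
    """Hypothetical stand-in for a loaded signature database.

    It is never modified after construction, so scans need no locks.
    """

    def __init__(self, signatures):
        self.signatures = frozenset(signatures)

    def scan(self, data):
        # Naive substring search; a real engine would use a trie here.
        return sorted(sig for sig in self.signatures if sig in data)


class Scanner:
    """Serves scans from the current engine while a new one loads."""

    def __init__(self, engine):
        self._engine = engine

    def scan(self, data):
        # Grab the reference once so the whole scan uses a single engine,
        # even if a reload swaps self._engine in the meantime.
        engine = self._engine
        return engine.scan(data)

    def reload_in_background(self, load_signatures):
        """Build a new engine in a worker thread, then swap it in."""

        def worker():
            # Slow part: both engines exist now, hence roughly 2x memory.
            new_engine = Engine(load_signatures())
            # Rebinding the attribute replaces the engine in one step; the
            # old engine is garbage-collected once no scan still holds it.
            self._engine = new_engine

        thread = threading.Thread(target=worker)
        thread.start()
        return thread
```

A caller would keep calling `scan()` as usual and invoke `reload_in_background(loader)` whenever the database files change. In C the swap would presumably need an atomic pointer exchange plus some form of reference counting, so that the old engine is freed only after the last scan using it has finished; Python's garbage collector hides that part here.
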

Doing that with two separate processes rather than two threads would at
least free all of the original process's memory once the "transfer of
service" is done and that process can exit. AFAIK, freeing memory inside
a process does not necessarily reduce the memory space it consumes. But
I'm not an expert.

Of course, that "transfer of service" would be trickier between two
processes...

Regards,
Pierre