"(don't you love C?)" I have never understood why the originators of C didn't give integers explicit widths in bits: their scheme made C code often non-portable.
When I wrote code in the mid 1990s for the DEC Alpha, ints were 32 bits
while longs were 64 (unlike "standard" C). This made Alpha C code not
portable to lesser CPUs. On the other hand, when I wrote C on DOS for the
IBM PC in the late 1980s, ints were only 16 bits! It took some time to
figure out why my C-compliant code failed so badly.

In spite of all that, having started programming before C was invented, I
can safely say that C is better than its predecessors for software like
ClamAV.

P.S. Good code these days tends to use typedefs defining things like
int32, uint64 etc. A shame the original ClamAV coders didn't do that.


On Tue, 3 Nov 2020 01:53:33 +0000
"Micah Snyder (micasnyd)" <micas...@cisco.com> wrote:

> I hadn't really looked at the code. You raise a good point.
>
> Changing it isn't super simple. The info.blocks variable is passed through
> cli_scandesc_callback() and scan_common() where it's placed into the scan
> context. When data is scanned, the amount scanned is divided by
> CL_COUNT_PRECISION (also found in clamav.h), which is what you multiply the
> number by to get the value in bytes. Provided that all downstream
> applications use CL_COUNT_PRECISION as clamscan does, we could shrink the
> count precision from 4k to something lower, but that would also decrease the
> max amount of data which could be scanned.
>
> If the variable were a uint64_t, that'd probably be fine... but it's an
> unsigned long int... aka maybe 4 bytes or maybe 8 bytes (don't you love C?).
> On systems where an unsigned long is 4 bytes, then that'd cap the scan limit
> at 4GB. Changing the variable to be an uint64_t would be "best", but it
> would be a non-backwards compatible change to the API which is very much not
> worth it.
>
> Sigh :-/
>
> > -----Original Message-----
> > From: clamav-users <clamav-users-boun...@lists.clamav.net> On Behalf Of
> > Paul Kosinski via clamav-users
> > Sent: Monday, November 2, 2020 5:23 PM
> > To: clamav-users@lists.clamav.net
> > Cc: Paul Kosinski <clamav-us...@iment.com>
> > Subject: Re: [clamav-users] ClamAV Scan - Data Read vs Data Scanned
> >
> > Can this really be done? I was looking at the code referred to by G.W.
> > Haywood, and I see that it uses "info.blocks" and "info.rblocks".
> > Looking at the definitions in "clamav-0.103.0/clamscan/", I see the
> > following:
> >
> > struct s_info {
> >     unsigned int sigs;         /* number of signatures */
> >     unsigned int dirs;         /* number of scanned directories */
> >     unsigned int files;        /* number of scanned files */
> >     unsigned int ifiles;       /* number of infected files */
> >     unsigned int errors;       /* number of errors */
> >     unsigned long int blocks;  /* number of *scanned* 16kb blocks */
> >     unsigned long int rblocks; /* number of *read* 16kb blocks */
> > };
> >
> > This suggests that the counts for "scanned" and "read" are not really byte
> > counts, and EICAR's 68 bytes would always be recorded as 0 (if normal
> > rounding rules are applied).
> >
> >
> >
> > On Mon, 2 Nov 2020 23:59:20 +0000
> > "Micah Snyder \(micasnyd\) via clamav-users" <clamav-users@lists.clamav.net>
> > wrote:
> >
> > > I agree. We already have some logic in freshclam to convert bytes to
> > > human readable B / KiB / MiB / GiB format. It should be pretty much a
> > > copypaste effort to improve the data scanned/read output.
> > >
> > > -Micah
> > >
> > > On 11/2/20, 9:47 AM, "clamav-users on behalf of G.W. Haywood via
> > > clamav-users" <clamav-users-boun...@lists.clamav.net on behalf of
> > > clamav-us...@lists.clamav.net> wrote:
> > >
> > > Hi there,
> > >
> > > On Mon, 2 Nov 2020, Paul Kosinski via clamav-users wrote:
> > >
> > > > ... I still think it is a bad message that should be fixed.
> > >
> > > +1
> > >
> > > If you want to try a very quick and dirty tweak to get more precise
> > > numbers, change the value of
> > >
> > > 1) CL_COUNT_PRECISION in .../libclamav/clamav.h from 4096 to 1
> > >
> > > 2) replace '1024' with '1' in four places in clamscan/clamscan.c
> > >
> > > 3) change 'MB' to 'Bytes' in two places in clamscan/clamscan.c and
> > >
> > > 4) rebuild.
> > >
> > > 8<----------------------------------------------------------------------
> > > ~/clamav-0.103.0-rc2: $ grep -C3 -r CL_COUNT_PRECISION clamscan libclamav | ...
> > > ...
> > > ...
> > > clamscan/clamscan.c:        mb = info.blocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
> > > clamscan/clamscan.c:        logg("Data scanned: %2.2lf MB\n", mb);
> > > clamscan/clamscan.c:        rmb = info.rblocks * (CL_COUNT_PRECISION / 1024) / 1024.0;
> > > clamscan/clamscan.c:        logg("Data read: %2.2lf MB (ratio %.2f:1)\n", rmb, info.rblocks ? (double)info.blocks / (double)info.rblocks : 0);
> > > ...
> > > ...
> > > libclamav/clamav.h:#define CL_COUNT_PRECISION 4096
> > > ...
> > > ...
> > > 8<----------------------------------------------------------------------
> > >
> > > This is untested, YMMV. Obviously, if you're skilled in the art, this
> > > can be done better. Note that 'MB' should in any case be 'MiB' as the
> > > values printed are the counts divided by 2^20 and not by 10^6.
> > >
> > > --
> > >
> > > 73,
> > > Ged.

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users

Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
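
The arithmetic discussed in this thread can be sketched roughly as follows.
This is an illustration only, not ClamAV code: bytes_to_blocks,
blocks_to_bytes and format_bytes are hypothetical helpers, the truncating
division is an assumption about how the count is accumulated, and only the
CL_COUNT_PRECISION value of 4096 is taken from the quoted clamav.h. The sketch
shows why a 68-byte file contributes nothing to a block count, and how a byte
count could be printed in B / KiB / MiB / GiB.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Value quoted in the thread from libclamav/clamav.h. */
#define CL_COUNT_PRECISION 4096

/* Hypothetical helper: count whole blocks, dropping the remainder
 * (assumed behaviour; the real accumulation may differ). */
uint64_t bytes_to_blocks(uint64_t bytes)
{
    return bytes / CL_COUNT_PRECISION;
}

/* Hypothetical helper: convert a block count back to bytes using
 * 64-bit arithmetic, so the result does not depend on the width
 * of unsigned long. */
uint64_t blocks_to_bytes(uint64_t blocks)
{
    return blocks * (uint64_t)CL_COUNT_PRECISION;
}

/* Hypothetical helper: write a byte count into buf using binary
 * units. Returns what snprintf returns. */
int format_bytes(uint64_t bytes, char *buf, size_t len)
{
    static const char *const units[] = {"B", "KiB", "MiB", "GiB"};
    double value = (double)bytes;
    size_t u = 0;

    /* Divide by 1024 until the value fits the unit, stopping at GiB. */
    while (value >= 1024.0 && u < 3) {
        value /= 1024.0;
        u++;
    }

    if (u == 0)
        return snprintf(buf, len, "%" PRIu64 " B", bytes);
    return snprintf(buf, len, "%.2f %s", value, units[u]);
}
```

Under these assumptions, bytes_to_blocks(68) is 0, which matches the concern
about EICAR above, and format_bytes(blocks_to_bytes(256), buf, sizeof buf)
writes "1.00 MiB" into buf.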