-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tomasz Kojm wrote:
> On Wed, 28 Feb 2007 16:03:08 +0100
> Gianluigi Tiesi <[EMAIL PROTECTED]> wrote:
> 
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Tomasz Kojm wrote:
>>> On Wed, 28 Feb 2007 15:21:52 +0100
>>> Gianluigi Tiesi <[EMAIL PROTECTED]> wrote:
>>>
>>>>>> I've noticed it too, in my port I have changed it to:
>>>>>>
>>>>>> if(!(iscntrl(buf[i]) || isprint(buf[i])) || !internat[buf[i] & 0xff])
>>>>> This one is much worse because it will lead to many false negatives with
>>>>> HTML and mail files.
>>>>>
>>>> yes so I've never posted it as official patch,
>>>> btw I do the check for the whole magic buffer (150?) to be more reliable
>>>> also I've noticed the internat table is quite different from the one in
>>>> file (magic) utility.
>>> In your case checking more data will only increase the chance for a false
>>> negative. After your change the first condition (i.e. !(iscntrl(buf[i]) ||
>>> isprint(buf[i]))) will disqualify LOTS (more than 100 for sure) of
>>> characters which can be valid international chars.
>>>
>> So what can we use as a better (or at least optimal) way to guess the
>> kind of data (rather than having an always true/false check)? isprint
> 
> First of all, you should drop your change which is erroneous and for now I'd
> strongly suggest to classify all unknown data as CL_TYPE_UNKNOWN_TEXT.
> 
> We will address this issue in the near future and depending on the results of
> regression testing decide which way to go.
> 
There is a reason we (ClamWin) changed this: we still prefer to skip
unknown files, and we don't need to care much about HTML and mail
files, so I've made some tweaks (not only this one) to save some
CPU cycles by not scanning unneeded files.
I'm aware that this is not the correct approach for a mail server
scanner, so my post was only a "comment"; it was never intended to
go into the ClamAV tree.
A scan of a real PC hard disk can take ages: unmodified clamscan
scans large AVI files in raw mode (there is only a specific check for
anim RIFFs), and other media files and e.g. ISO files are also scanned
in raw mode.
10-20 GB of media/ISO files is not uncommon on a user's PC, while
they are very unlikely to turn up in a mail.
Perhaps Linux itself (like the other Unixes) doesn't need a scanner
for executable files.
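
To make the difference between the two conditions discussed above
concrete, here is a minimal sketch. It is not the code from the ClamAV
tree: internat_ok() is a hypothetical stand-in for the real internat[]
table, the "and" variant is only my assumption about the shape of the
stock check, and the helper names are made up for illustration. It
assumes the default "C" locale, where bytes above 0x7F are neither
control nor printable characters.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for internat[]: accepts common whitespace,
 * printable ASCII and the 0xA0-0xFF range. NOT ClamAV's real table. */
static int internat_ok(unsigned char c)
{
    return c == '\t' || c == '\n' || c == '\r' ||
           (c >= 0x20 && c < 0x7F) || c >= 0xA0;
}

/* Variant quoted in this thread: a byte is rejected when it is neither
 * a control nor a printable character, OR when the table lacks it. */
static int rejected_or(unsigned char c)
{
    return !(iscntrl(c) || isprint(c)) || !internat_ok(c);
}

/* Assumed shape of the more permissive check: a byte is rejected only
 * when all three tests fail. */
static int rejected_and(unsigned char c)
{
    return !iscntrl(c) && !isprint(c) && !internat_ok(c);
}

/* Returns 1 if no byte in buf is rejected by the given predicate.
 * Bytes are unsigned char so the ctype calls never see negatives. */
static int looks_like_text(const unsigned char *buf, size_t len,
                           int (*rejected)(unsigned char))
{
    for (size_t i = 0; i < len; i++)
        if (rejected(buf[i]))
            return 0;
    return 1;
}
```

With these definitions, a Latin-1 buffer such as { 'c', 'a', 'f', 0xE9 }
passes looks_like_text() with rejected_and but fails with rejected_or,
because 0xE9 trips the first half of the "or" condition before the
table is ever consulted. That is the false negative Tomasz describes.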

Regards

- --
Gianluigi Tiesi <[EMAIL PROTECTED]>
EDP Project Leader
Netfarm S.r.l. - http://www.netfarm.it/
Free Software: http://oss.netfarm.it/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFF5cQ73UE5cRfnO04RAjjYAKCLeVZnaAqru8ghdCwBgJV4v6jh4QCff8w0
hHf6lO6xin6ZsQUTKhydaIA=
=4m9a
-----END PGP SIGNATURE-----
_______________________________________________
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net