On 02/16/2010 09:15 PM, Tom Shaw wrote:
> At 4:15 PM +0000 2/16/10, Steve Basford wrote:
>>
>>> Attached document? I did not see an attachment. Can you send a link?
>>
>> Is this the TargetType you are after...
>>
>> 2.3.4 Extended signature format
>>
>> The extended signature format allows for specification of additional
>> information such as a target file type, virus offset or engine version,
>> making the detection more reliable. The format is:
>>
>> MalwareName:TargetType:Offset:HexSignature[:MinEngineFunctionalityLevel:[Max]]
>>
>> where TargetType is one of the following numbers specifying the type
>> of the target file:
>>
>> 0 = any file
>> 1 = Portable Executable
>> 2 = OLE2 component (e.g. a VBA script)
>> 3 = HTML (normalised)
>> 4 = Mail file
>> 5 = Graphics
>> 6 = ELF
>> 7 = ASCII text file (normalised)
>>
>> And Offset is an asterisk or a decimal number n possibly combined with a
>> special modifier:
>>
>> Source: http://www.clamav.com/doc/latest/signatures.pdf
>
> Steve et al,
>
> Yes, I know all this. As I told Alain, I have read all the available docs,
> but neither they nor the wiki explain how a "7" is determined (e.g. by
> extension, and if so which ones, or by contents, and if so how). Are PHP
> and Perl files considered ASCII, Portable Executable, HTML, or what? Is
> an RTF considered OLE2 or ASCII or what? And what does a Zeus bin file
> get categorized as? I have searched the docs for answers to these and
> many other questions like them, with no joy.
Hi Tom,

Didn't my reply answer your question? [I forwarded it to -users, but I
forgot that the list strips attachments, so here it is again.]

The file type is determined by signatures in daily.ftm (or the builtin
ones in filetypes_int.h if that is missing), which are matched against a
portion at the beginning of the file:

    sigtool --unpack-current daily
    cat daily.ftm

As for binary versus ascii, utf8, utf16be, utf16le: see textdet.c. It
looks at the beginning of the file and determines which one it could be,
based on the ratio of good to bad ascii, utf8, etc. characters it has seen.

Also, there are some signatures that are detected on the fly (not only at
the beginning of the file), during a type 0 scan:

    /* bigger numbers have higher priority (in o-t-f detection) */
    CL_TYPE_HTML, /* on the fly */
    CL_TYPE_MAIL, /* magic + on the fly */
    CL_TYPE_SFX, /* foo SFX marker */
    CL_TYPE_ZIPSFX, /* on the fly */
    CL_TYPE_RARSFX, /* on the fly */
    CL_TYPE_CABSFX,
    CL_TYPE_ARJSFX,
    CL_TYPE_NULSFT, /* on the fly */
    CL_TYPE_AUTOIT,
    CL_TYPE_ISHIELD_MSI,

These file types are used both to determine which signatures to match and
which unpacker to run.

The mapping from CL_TYPE to signature target types is in matcher.h:

    { 0, "GENERIC", 0, 0, 1 },
    { CL_TYPE_MSEXE, "PE", 1, 0, 1 },
    { CL_TYPE_MSOLE2, "OLE2", 2, 1, 0 },
    { CL_TYPE_HTML, "HTML", 3, 1, 0 },
    { CL_TYPE_MAIL, "MAIL", 4, 1, 1 },
    { CL_TYPE_GRAPHICS, "GRAPHICS", 5, 1, 0 },
    { CL_TYPE_ELF, "ELF", 6, 1, 0 },
    { CL_TYPE_TEXT_ASCII, "ASCII", 7, 1, 1 },
    /* note that this actually includes utf8, utf16be, and utf16le too! */
    { CL_TYPE_ERROR, "NOT USED", 8, 1, 0 },
    { CL_TYPE_MACHO, "MACH-O", 9, 1, 0 }

Best regards,
--Edwin
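
To make the link between the extended signature format and the matcher.h
table concrete, here is a minimal Python sketch that splits one signature
line into its fields and looks up the name of its target type. It is not
ClamAV code: the function name parse_extended_signature, the TARGET_TYPES
dictionary, and the sample signature are all made up for illustration. Only
the number-to-name pairs are taken from the table above.

```python
# Target type numbers and names, copied from the matcher.h excerpt above.
TARGET_TYPES = {
    0: "GENERIC",
    1: "PE",
    2: "OLE2",
    3: "HTML",
    4: "MAIL",
    5: "GRAPHICS",
    6: "ELF",
    7: "ASCII",  # per the comment above, also covers utf8/utf16be/utf16le
    8: "NOT USED",
    9: "MACH-O",
}


def parse_extended_signature(line):
    """Split a line of the form
    MalwareName:TargetType:Offset:HexSignature[:MinFL:[MaxFL]]
    into named fields. Hypothetical helper, not part of ClamAV."""
    fields = line.strip().split(":")
    if not 4 <= len(fields) <= 6:
        raise ValueError("expected 4 to 6 colon-separated fields")

    name, target, offset, hexsig = fields[:4]
    target_num = int(target)

    # The two functionality-level fields are optional and may be empty.
    min_fl = int(fields[4]) if len(fields) > 4 and fields[4] else None
    max_fl = int(fields[5]) if len(fields) > 5 and fields[5] else None

    return {
        "name": name,
        "target": target_num,
        "target_name": TARGET_TYPES.get(target_num, "UNKNOWN"),
        "offset": offset,  # kept as text: "*" or a number with a modifier
        "hexsig": hexsig,
        "min_fl": min_fl,
        "max_fl": max_fl,
    }


if __name__ == "__main__":
    # Made-up signature, for illustration only.
    print(parse_extended_signature("Example.Test-1:7:*:deadbeef"))
```

The sketch only shows which table row a given TargetType number refers to.
Which type a real file is assigned is still decided by the daily.ftm
signatures and textdet.c, as described above.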