Zentara wrote:
> 
> On Fri, 21 Jun 2002 13:37:57 -0700, [EMAIL PROTECTED] (John W. Krahn)
> wrote:
> 
> >Sorry, it was late and I didn't test it.  :-(   The correct code should
> >be
> >
> >my $bin = pack 'H*', $hextest;
> 
> Thanks John, I thought I was losing my mind. :-)
> 
> >Sorry, my understanding is that the hex string is just an ASCII
> >representation of the binary data to search for.  Virus files don't have
> >actual "hex strings" in them but are compiled executables.
> 
> Yeah, I see the misunderstanding now. You were thinking of running
> a regex for a binary value directly against the binary file.
> Can Perl do "binary regexes"?

Yes, that is why I suggested quotemeta.

perldoc -q "binary data"

Found in /usr/lib/perl5/5.6.0/pod/perlfaq4.pod
       How do I handle binary data correctly?

       Perl is binary clean, so this shouldn't be a problem.  For
       example, this works fine (assuming the files are found):

           if (`cat /vmunix` =~ /gzip/) {
               print "Your kernel is GNU-zip enabled!\n";
           }

       On less elegant (read: Byzantine) systems, however, you
       have to play tedious games with "text" versus "binary"
       files.  See the binmode entry in the perlfunc manpage or
       the perlopentut manpage.  Most of these ancient-thinking
       systems are curses out of Microsoft, who seem to be
       committed to putting the backward into backward
       compatibility.

       If you're concerned about 8-bit ASCII data, then see the
       perllocale manpage.

       If you want to deal with multibyte characters, however,
       there are some gotchas.  See the section on Regular
       Expressions.
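
To make the quotemeta point concrete, here is a minimal sketch using the "1210" signature quoted further down. The helper name sig_to_regex and the test buffer are made up for illustration; only pack, \Q...\E and qr// are Perl's own. The signature happens to contain the byte 0x2e ("."), which would match any character if it were not escaped.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a hex signature into a compiled regex that
# matches the corresponding raw bytes literally.
sub sig_to_regex {
    my ($hex) = @_;
    my $bin = pack 'H*', $hex;    # "c474" becomes the two bytes 0xC4 0x74
    return qr/\Q$bin\E/;          # \Q...\E escapes bytes that are regex metacharacters
}

# Made-up test buffer: a few filler bytes around the signature bytes.
my $sig    = 'c474f02e803e2f040175';
my $buffer = "\x00\x01MZ" . pack( 'H*', $sig ) . "\xff\xfe";

my $re = sig_to_regex($sig);
print "signature found\n" if $buffer =~ $re;
```

Without the \Q...\E, the 0x2e byte would act as a wildcard and the pattern could match data that is not the signature.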



> I was looking at it the other way. I had the hex signature of the virus,
> so I converted the binary file into one long hex string, then ran the
> regexes against the hex values.
> My first attempt is below. It works, but is incredibly slow. I tested
> it against some commercial virus scanners like Trendmicro's vscan
> and the H+BEDV scanner for Linux. I took some executables, hex-edited
> them to put in some test signatures, and scanned them.
> The commercial scanners found the patterns in a microsecond.
> My scanner took about 1 second per megabyte of file data. Too
> slow for anything but the smallest files.
> 
> It's such a simple process that I'm now toying with doing it
> in assembly.
> 
> Anyway, here is what my slow kludge looks like.
> You get the virussignatures.txt file from
> http://www.openantivirus.org/VirusSignatures-latest.zip
> 
> This is what the signature file looks like:
> ....
> ....
> 10 past 3 (B)=ec020e1ff3a4b82125061fbab300cd21
> 10 past 3 (C)=b840008ed8a11300b106d3e02d00088e
> 100-Years=fe3a558bec50817e0400c0730c2ea147
> 1024-PrScr #1=8cc0488ec026a103002d800026a30300
> 1024-PrScr #2=a172041f3df0f07505a10301cd0526a1
> 1024-PrScr #3=00012ea30300b4400e1fba0004b90004e8e8007230
> 1024-PrScr #4=babf00b82125cd2133c08ec0b8f0f026
> 1210-Prudent=2f040175d00e0e1f07bed3042bc92e8a0446410ac0
> 1210=c474f02e803e2f040175
> 1241=8a4600a200018b4601a30101b8cc4bcd
> 1244=cd217252b91e00ba7d04b43fcd217246
> ....
> ....
> 
> This file has nearly 2000 entries, and I suspect that is why
> it is so slow to check all those values through the regex.

You should either use index() or pre-compile the regex.
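
A minimal sketch of the index() route, assuming the signatures have already been converted to raw bytes with pack. The helper name first_match and the test buffer are made up for illustration; the two signatures are taken from the list above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name of the first signature whose bytes
# occur in $buffer, or undef if none do.  @$sigs holds [ name, bytes ] pairs.
sub first_match {
    my ( $buffer, $sigs ) = @_;
    for my $sig (@$sigs) {
        my ( $name, $bytes ) = @$sig;
        # index() returns -1 when the substring is absent
        return $name if index( $buffer, $bytes ) >= 0;
    }
    return;
}

# Convert each hex signature to raw bytes once, up front.
my @sigs = map { [ $_->[0], pack( 'H*', $_->[1] ) ] } (
    [ '1210', 'c474f02e803e2f040175' ],
    [ '1241', '8a4600a200018b4601a30101b8cc4bcd' ],
);

# Made-up buffer containing the second signature.
my $buffer = "junk" . pack( 'H*', '8a4600a200018b4601a30101b8cc4bcd' ) . "more junk";

my $hit = first_match( $buffer, \@sigs );
print defined $hit ? "$hit found\n" : "clean\n";
```

Since every signature is a fixed byte string, index() needs no escaping at all, which is one reason it is worth trying before a regex.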


> #########################################################
> #!/usr/bin/perl
> use strict;
> use warnings;
> 
> my (@vs,@virname,@virsig,$numsigs,$i);
> open (VS,"< virussignatures.strings")
> or die "Can't open signature file: $!";
> @vs = <VS>;
> $numsigs = $#vs;
> close VS;
> 
> for ($i=0; $i <= $numsigs; $i++) {
>     chomp $vs[$i];
>     ($virname[$i],$virsig[$i])= split(/=/,$vs[$i]);
> }
> 
> $/ = undef;
> my $file = <>; #slurp binary file into 1 long string
> if (length $file == 0){print "Empty File\n";exit}
> my $hexfilestring = unpack "H*", $file; #convert binary file to hex
> 
> for (my $i =0; $i <= $numsigs; $i++){
> if ($hexfilestring =~ m/$virsig[$i]/i){print "$virname[$i] found\n";exit;}
> }
> 
> print "file clean\n";
> exit;
> ###############################################################



This is about ten times faster than your version.  :-)


#!/usr/bin/perl
use strict;
use warnings;

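open VS, '<', 'virussignatures.strings'
    or die "Can't open virussignatures.strings: $!";

# each entry is [ name, pre-compiled regex ]; lines without '=' are skipped
my @sigs = map { chomp;
                 my ( $name, $hex ) = split /=/;
                 my $bin = pack 'H*', $hex;    # convert to binary
                 [ $name, qr/\Q$bin\E/ ]       # pre-compile regex
                } grep { /=/ } <VS>;
close VS;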

FILE:
for my $file ( @ARGV ) {
    unless ( open FILE, '<', $file ) {
        warn "Cannot open $file: $!";
        next FILE;
        }
    binmode FILE;    # raw bytes, for systems that treat "text" files specially
    my $buffer;
    unless ( read FILE, $buffer, -s $file ) {
        warn "$file has no data.\n";
        next FILE;
        }
    for my $regex ( @sigs ) {
        if ( $buffer =~ /$regex->[1]/ ) {
            print "$regex->[0] found in $file\n";
            next FILE;
            }
        }
    }

__END__


John
-- 
use Perl;
program
fulfillment
