In v4.x, Unicode support will be better. That also means it may be easier
to make this sort of attack quieter in the future, as non-ASCII rules
won't be definitively wrong as they are now.
The question is whether malicious non-ASCII rules could do anything more
damaging than simply failing to match the obvious strings "visible" in
the rule, or alternatively deliberately matching some string that should
not be matched, in some form of DoS attempt.
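To make the "fails to match what it visibly says" case concrete, here is a rough sketch in Python. Python's re module is only a stand-in for the Perl regex engine, and the rule name DEMO_RULE, the sample text, and the helper non_ascii_chars are all made up for illustration. One letter in the pattern is a Cyrillic lookalike, so the rule reads the same on screen but never hits plain ASCII text:

```python
import re
import unicodedata

# Two patterns that render almost identically. The second uses
# U+043E (Cyrillic small o) in place of the Latin "o".
visible_pattern = "discount"
lookalike_pattern = "disc\u043eunt"

def non_ascii_chars(rule_line):
    """Return (column, char, Unicode name) for each non-ASCII character."""
    return [
        (col, ch, unicodedata.name(ch, "UNKNOWN"))
        for col, ch in enumerate(rule_line)
        if ord(ch) > 127
    ]

text = "big discount today"

# The honest pattern matches; the lookalike silently does not.
print(bool(re.search(visible_pattern, text, re.I)))    # True
print(bool(re.search(lookalike_pattern, text, re.I)))  # False

# A hypothetical rule line carrying the lookalike pattern.
demo_rule = "body DEMO_RULE /disc\u043eunt/i"
print(non_ascii_chars(demo_rule))
# [(20, 'о', 'CYRILLIC SMALL LETTER O')]
```

A check like non_ascii_chars only flags where the odd characters sit. Whether a given non-ASCII character is legitimate is still a judgment call, which is the part that gets harder once non-ASCII rules are normal.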
It's hard to see how someone could inject Perl (or any other) code with
screwy rules. There was a time when Perl code was allowed in rules, but
that was disallowed many years ago:
uri LW_PRINTIT /(^.*$)(?{ print "URI:\n$^N\nEnd URI\n\n" })/is
That was a real handy debugging rule once, but you can't get away with that
anymore.
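For anyone who wants to audit third-party rule files for that construct anyway, a crude scan is easy to sketch. This is not how SpamAssassin itself enforces the restriction; it is just a hypothetical helper (has_embedded_code is my name for it) that looks for the opening of Perl's embedded-code groups, (?{ and (??{ :

```python
import re

# Opening of Perl's embedded-code constructs: (?{ ... }) and (??{ ... }).
EMBEDDED_CODE = re.compile(r"\(\?\??\{")

def has_embedded_code(rule_line):
    """Return True if the rule line appears to contain a Perl code group."""
    return EMBEDDED_CODE.search(rule_line) is not None

# The old debugging rule quoted above, and a harmless made-up rule.
old_rule = r'uri LW_PRINTIT /(^.*$)(?{ print "URI:\n$^N\nEnd URI\n\n" })/is'
plain_rule = r"uri DEMO_RULE /example\.com/i"

print(has_embedded_code(old_rule))    # True
print(has_embedded_code(plain_rule))  # False
```

It is a plain text search, so it will also flag a rule that merely matches those characters literally (escaped in the pattern). Treat a hit as a reason to read the rule by eye, not as proof of anything.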
Loren