When adding debug to the source like this:

    if (exists $rule->{text}) {
      next unless $info->{anchor_text};
      my($op,$patt,$neg) = @{$rule->{text}};
      my $match;
      for my $text (@{ $info->{anchor_text} }) {
        if ( ($op eq '=~' && $text =~ $patt) ||
             ($op eq '!~' && $text !~ $patt) ) {
          dbg("uri: Match found: text:%s matches the pattern:%s with operator:%s", $text, $patt, $op);
          $match = $text;
          last;
        } else {
          dbg("uri: Not match: text:%s not matches the pattern:%s with operator:%s", $text, $patt, $op);
        }
      }
      if ( $neg ) {
        next if defined $match;
        dbg("uri: text negative matched: %s /%s/", $op,$patt);
      } else {
        next unless defined $match;
        dbg("uri: text matched: '%s' %s /%s/", $match,$op,$patt);
      }
    }

and the debug output is:


dbg: uri: Not match:
text:\x{E0}\x{B8}\x{95}\x{E0}\x{B9}\x{88}\x{E0}\x{B8}\x{AD}\x{E0}\x{B8}\x{AD}\x{E0}\x{B8}\x{B2}\x{E0}\x{B8}\x{A2}\x{E0}\x{B8}\x{B8}\x{E0}\x{B8}\x{97}\x{E0}\x{B8}\x{B1}\x{E0}\x{B8}\x{99}\x{E0}\x{B8}\x{97}\x{E0}\x{B8}\x{B5}
not matches the
pattern:(?^aa:\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{97\\}\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{B1\\}\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{99\\}\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{97\\}\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{B5\\})
with operator:=~
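
For comparison, here is a minimal standalone Perl sketch of the two ways such a pattern can end up being compiled. It runs outside SpamAssassin, so it only approximates what the plugin does. The byte values are transcribed from the debug line above; the variable names and the use of quotemeta to produce the escaped form are my own assumptions for illustration, not code taken from the plugin.

```perl
use strict;
use warnings;

# Raw UTF-8 bytes of the anchor text, transcribed from the "text:" part
# of the debug line above.
my $text = join '', map { chr(hex($_)) } qw(
    E0 B8 95  E0 B9 88  E0 B8 AD  E0 B8 AD  E0 B8 B2  E0 B8 A2
    E0 B8 B8  E0 B8 97  E0 B8 B1  E0 B8 99  E0 B8 97  E0 B8 B5
);

# Pattern source built from \x{..} escapes for the bytes the rule looks for.
my $byte_src = join '', map { sprintf('\\x{%s}', $_) } qw(
    E0 B8 97  E0 B8 B1  E0 B8 99  E0 B8 97  E0 B8 B5
);

# Case 1: the regex compiler sees \x{E0} and treats it as the byte 0xE0.
my $byte_re = qr/$byte_src/aa;

# Case 2 (assumption): the metacharacters get escaped before compilation,
# giving \\x\{E0\}..., which looks for the literal characters
# backslash, "x", "{", "E", "0", "}" instead of the byte 0xE0.
my $literal_src = quotemeta $byte_src;
my $literal_re  = qr/$literal_src/aa;

printf "byte-escape pattern matches the anchor text: %s\n",
    ($text =~ $byte_re ? 'yes' : 'no');
printf "escaped pattern matches the anchor text:     %s\n",
    ($text =~ $literal_re ? 'yes' : 'no');
printf "escaped pattern matches its own source text: %s\n",
    ($byte_src =~ $literal_re ? 'yes' : 'no');
```

In this sketch the first pattern matches, because the anchor text ends with exactly those fifteen bytes, while the second one matches only the literal string "\x{E0}\x{B8}...". Whether the backslashes in the logged pattern reflect the compiled regex or are only added when the debug line is printed is something I have not verified.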



On Sun, Feb 2, 2025 at 1:57 PM John Hardin <jhar...@impsec.org> wrote:

> On Sun, 2 Feb 2025, Jimmy wrote:
>
> > Hello,
> >
> > I am experiencing difficulties creating a rule to match UTF-8 anchor text
> > using the plugin, and I suspect there might be a bug related to UTF-8
> > matching.
> >
> > For example, I attempted to use the following rule:
> >
> > uri_detail UNICODE_LINK_TEXT text =~
> > /\\x{E0}\\x{B8}\\x{97}\\x{E0}\\x{B8}\\x{B1}\\x{E0}\\x{B8}\\x{99}\\x{E0}\\x{B8}\\x{97}\\x{E0}\\x{B8}\\x{B5}/
>
> ...do you also need to escape the curlies?
>
> /\\x\{E0\}\\x\{B8\}   etc...
>
>
> --
>   John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
>   jhar...@impsec.org                         pgpk -a jhar...@impsec.org
>   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
>    Are you a mildly tech-literate politico horrified by the level of
>    ignorance demonstrated by lawmakers gearing up to regulate online
>    technology they don't even begin to grasp? Cool. Now you have a
>    tiny glimpse into a day in the life of a gun owner.   -- Sean Davis
> -----------------------------------------------------------------------
>   Today: the 22nd anniversary of the loss of STS-107 Columbia
>
