*When adding debug to the source like this:*

    if (exists $rule->{text}) {
      next unless $info->{anchor_text};
      my($op,$patt,$neg) = @{$rule->{text}};
      my $match;
      for my $text (@{ $info->{anchor_text} }) {
        if ( ($op eq '=~' && $text =~ $patt) ||
             ($op eq '!~' && $text !~ $patt) ) {
          dbg("uri: Match found: text:%s matches the pattern:%s with operator:%s", $text, $patt, $op);
          $match = $text;
          last;
        } else {
          dbg("uri: Not match: text:%s not matches the pattern:%s with operator:%s", $text, $patt, $op);
        }
      }
      if ( $neg ) {
        next if defined $match;
        dbg("uri: text negative matched: %s /%s/", $op,$patt);
      } else {
        next unless defined $match;
        dbg("uri: text matched: '%s' %s /%s/", $match,$op,$patt);
      }
    }

*and the debug output as:*

dbg: uri: Not match: text:\x{E0}\x{B8}\x{95}\x{E0}\x{B9}\x{88}\x{E0}\x{B8}\x{AD}\x{E0}\x{B8}\x{AD}\x{E0}\x{B8}\x{B2}\x{E0}\x{B8}\x{A2}\x{E0}\x{B8}\x{B8}\x{E0}\x{B8}\x{97}\x{E0}\x{B8}\x{B1}\x{E0}\x{B8}\x{99}\x{E0}\x{B8}\x{97}\x{E0}\x{B8}\x{B5} not matches the pattern:(?^aa:\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{97\\}\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{B1\\}\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{99\\}\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{97\\}\\\\x\\{E0\\}\\\\x\\{B8\\}\\\\x\\{B5\\}) with operator:=~

On Sun, Feb 2, 2025 at 1:57 PM John Hardin <jhar...@impsec.org> wrote:

> On Sun, 2 Feb 2025, Jimmy wrote:
>
> > Hello,
> >
> > I am experiencing difficulties creating a rule to match UTF-8 anchor text
> > using the plugin, and I suspect there might be a bug related to UTF-8
> > matching.
> >
> > For example, I attempted to use the following rule:
> >
> > uri_detail UNICODE_LINK_TEXT text =~ /\\x{E0}\\x{B8}\\x{97}\\x{E0}\\x{B8}\\x{B1}\\x{E0}\\x{B8}\\x{99}\\x{E0}\\x{B8}\\x{97}\\x{E0}\\x{B8}\\x{B5}/
>
> ...do you also need to escape the curlies?
>
> /\\x\{E0\}\\x\{B8\} etc...
>
> --
>   John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
>   jhar...@impsec.org                         pgpk -a jhar...@impsec.org
>   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
>   Are you a mildly tech-literate politico horrified by the level of
>   ignorance demonstrated by lawmakers gearing up to regulate online
>   technology they don't even begin to grasp? Cool. Now you have a
>   tiny glimpse into a day in the life of a gun owner.  -- Sean Davis
> -----------------------------------------------------------------------
>   Today: the 22nd anniversary of the loss of STS-107 Columbia
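
For anyone wanting to reproduce the mismatch outside SpamAssassin, here is a minimal standalone sketch. It rests on two assumptions that the thread does not confirm: that dbg() prints the raw UTF-8 bytes of the anchor text in \x{..} form, and that the compiled pattern ends up with the backslashes and braces themselves escaped, as the (?^aa:...) form in the debug output appears to show. The variable names and the matches() helper are hypothetical and not part of the plugin.

```perl
use strict;
use warnings;

# Anchor text as raw UTF-8 bytes (assumption: this is what dbg() printed
# in \x{..} form in the debug line).
my $text = "\xE0\xB8\x95\xE0\xB9\x88\xE0\xB8\xAD\xE0\xB8\xAD"
         . "\xE0\xB8\xB2\xE0\xB8\xA2\xE0\xB8\xB8\xE0\xB8\x97"
         . "\xE0\xB8\xB1\xE0\xB8\x99\xE0\xB8\x97\xE0\xB8\xB5";

# Pattern with escaped backslashes and braces: it looks for the literal
# characters backslash, x, {, E, 0, } and so on, not for byte 0xE0.
my $escaped = qr/\\x\{E0\}\\x\{B8\}\\x\{97\}/aa;

# Pattern with unescaped \x{..}: each escape stands for one byte value.
my $bytes = qr/\x{E0}\x{B8}\x{97}\x{E0}\x{B8}\x{B1}\x{E0}\x{B8}\x{99}\x{E0}\x{B8}\x{97}\x{E0}\x{B8}\x{B5}/aa;

# The same sequence written out as printable characters (single quotes
# keep the backslashes literal).
my $printed = '\x{E0}\x{B8}\x{97}';

# Hypothetical helper: 1 if the string matches the pattern, else 0.
sub matches {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 1 : 0;
}

print "escaped pattern vs bytes:   ", (matches($text, $escaped)    ? "match" : "no match"), "\n";
print "byte pattern vs bytes:      ", (matches($text, $bytes)      ? "match" : "no match"), "\n";
print "escaped pattern vs printed: ", (matches($printed, $escaped) ? "match" : "no match"), "\n";
```

If those assumptions hold, the first line should report no match and the other two should report a match, which would point at how the rule's escapes are parsed rather than at the matching loop itself.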