Hi Todd,

First I should apologize for one of my earlier posts. The first token was a bit of a jumble. I think now you just want the literal string "download" to start your capture.

As usual, I tried a few different approaches to your regex problem and posted what I thought was the best one. However, an older iteration crept into one of my email posts: it used `^`, which is Raku's zero-width "start-of-string" regex token. If you use `^` you will capture from the start-of-string onward, in this case through the `.*?` any-character token and up to the `\>` angle. You may not want this, because it means the word "download" isn't required for you to capture that sequence of characters.

I'm not sure where you got the impression that `\...\` means anything specific in Raku. If you're asking for a match against alphanumeric characters in Raku, you don't have to escape them. Anything else (e.g. punctuation) you'll have to escape. So if you're trying to match ">", the "greater-than" sign (angle), you'll have to escape it with a backslash (e.g. `\>`) or by quoting (e.g. `">"`). Unescaped punctuation characters are reserved for special "metacharacter" purposes: for example, an unescaped "." dot means "any-character".

You'll also note backslashing used to denote characters that are difficult to represent otherwise. Think, for example, how `\n` means newline and `\t` means tab. There are others: `\s` means whitespace, `\h` means horizontal whitespace, and `\v` means vertical whitespace. Also `\S` means non-whitespace, `\H` means non-horizontal-whitespace, and `\V` means non-vertical-whitespace.

I've also posted direct links to Raku regex forms, such as `<?before ... >` (a positive lookahead) and `<?after ... >` (a positive lookbehind). You can try this in the REPL:

    [0] > my $a = "XYZ"
    XYZ
    [1] > say $a ~~ m/ <?after X > Y <?before Z > /;
    「Y」

Try reading that out loud in English: "say $a smartmatching against a requested `m` match comprising after-X, Y, before-Z". If you read it that way, you'll understand why only the `「Y」` ends up in the match variable.

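To make the escaping rules and the zero-width lookarounds concrete, here is a minimal sketch you can paste into a file and run. The test strings ("a>b", "a \t b", and so on) are made up for illustration; only the "XYZ" example comes from the REPL session above.

```raku
my $a = "XYZ";

# Lookbehind and lookahead are zero-width: X and Z must be present,
# but only Y ends up in the match.
my $m = $a ~~ m/ <?after X > Y <?before Z > /;
say $m.Str;                              # Y

# A literal ">" must be escaped or quoted; both forms match the same thing.
say ("a>b" ~~ / a \> b /).so;            # True
say ("a>b" ~~ / a ">" b /).so;           # True

# An unescaped dot is a metacharacter ("any-character");
# an escaped dot only matches a literal ".".
say ("a>b" ~~ / a .  b /).so;            # True
say ("a>b" ~~ / a \. b /).so;            # False

# \h matches horizontal whitespace only; \v matches vertical whitespace.
say ("a \t b" ~~ / a \h+ b /).so;        # True
say ("a\nb"   ~~ / a \h+ b /).so;        # False
say ("a\nb"   ~~ / a \v  b /).so;        # True
```

Whitespace inside a Raku regex is not significant by default, which is why the patterns can be spaced out for readability.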
You can also `andthen` the smartmatch, which will put the match in the `$_` topic variable for you, which can help with stringification:

    [1] > $a ~~ m/ <?after X > Y <?before Z > / andthen put $_.Str;
    Y

I'll try to go through and correct what you wrote below.

Best, Bill.

> On Jan 12, 2025, at 03:11, ToddAndMargo via perl6-users <perl6-us...@perl.org> wrote:
>
> Hi Bill,
>
> Please correct my notes.
>
> Many thanks,
> -T
>
>
> Explanation:
> my @y = $x ~~ m:g/ <?before ^ | download > .*? <?before \> | \h+ > /;
>
> `m:g` # match and global

CORRECT

> `\...\` # the constrains (beginning and end) of the match

NO, backslashes are used to escape non-alphanumeric characters, denote invisible characters (e.g. `\n`), etc.

> `<...>` # constraints of instructions inside the match

NO, `<?after ... >` is a lookbehind and `<?before ... >` is a lookahead.

>
> First instruction: `<?before ^ | download >`

NO, this should just be the literal string `download` (or `"download"`).

> `?download ^` # positive look-behind, match but don`t capture `download `
> # `^` means "look behind"
>
> `|` # This is logical "OR"
>
> `download ` # positive look-behind, match but don`t capture `download `
>
> summary: capture everything behind `before ` or capture just `download`
>
>
> Second instruction: `.*?`
> `.*?` # any-character, one-or-more, frugal up to the third instruction

MOSTLY CORRECT: `*` means zero-or-more (`+` would be one-or-more). The trailing `?` makes it frugal.

>
> Third instruction: `<?before \> | \h+ >`

NO, SIMPLIFY THIS TO `<?before \> >` and the match will stop when it encounters ">", the "greater-than" sign (angle). Because you're using a lookahead (match characters and "look ahead" to find a pattern, but don't capture it), the ">" angle doesn't get captured.

>
> `<?before \>` # positive look-ahead, match but don`t capture `download \>`

KINDA, the actual construct is `<?before \> >` or (even more readable) `<?before ">" >`.

> # Note that the `\` in `\>` is escaping the `>` and is removing

KINDA, putting a `\` backslash in front of a non-alphanumeric to match it literally is a general rule in Raku. If it isn't backslashed, Raku will try to interpret the non-alphanumeric as a metacharacter.

> # the `>` from the instructions constraints and making is part
> # of the match

The unescaped `>` is part of the lookahead/lookbehind construct, either `<?after ... >` (lookbehind) or `<?before ... >` (lookahead).

>
> `|` # This is logical "OR"

YES

>
> `\h+ ` # one-or-more horizontal whitespace character

YES

>
> summary: capture everything before `before` or one-or-more whitespace characters

KINDA. Match the previous tokens, and stop matching when (before) you find one-or-more whitespace characters.

HTH.
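
P.S. Putting the corrections together, here is a minimal sketch of the simplified pattern: the literal `download`, a frugal `.*?`, and a lookahead for ">". I don't have your actual input, so the contents of `$x` (and the second test string) are made up purely for illustration.

```raku
# Hypothetical input: the real $x was not shown in this thread.
my $x = '<download/file-1.2.3.zip> and <download/other.tar.gz>';

# Literal "download", then frugal zero-or-more any-character, stopping
# just before a ">". The lookahead is zero-width, so ">" is not captured.
my @y = ($x ~~ m:g/ download .*? <?before \> > /).map(*.Str);
.say for @y;
# download/file-1.2.3.zip
# download/other.tar.gz

# With `^` in place of the literal, "download" is no longer required:
# the match simply runs from the start of the string up to the first ">".
my $m = '<other/file>' ~~ m/ ^ .*? <?before \> > /;
say $m.Str;
# <other/file
```

The `.map(*.Str)` turns the Match objects returned by `m:g` into plain strings, which is usually what you want when storing them in an array.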