I need a better regex with a literal in it
Hi All,

I wish I had a Q[;;;] expression inside a regex, but I don't.

These are my notes on how to write a regex with special characters in it:

Regex with literals in it:

    $JsonAddr ~~ s| (';') .* ||;

It "usually" works.

Unfortunately, this one hangs my program (I am slicing up a web page):

    $NewRev ~~ s/ .*? ('Release Notes V') //;

I need a better way of doing the above.

Many thanks,
-T
Re: I need a better regex with a literal in it
Assuming that all you are trying to do is to delete everything at the start of a line up to and including that string, the first thing I would do is get rid of all the superfluous parts of the regex.

Does

    $NewRev ~~ s/ .* 'Release Notes V' //

have the problem?

".*" means any number of characters, including zero. The "?" only makes the match frugal (stop at the first occurrence of the string rather than the last), so unless the string can appear more than once it is superfluous. The parentheses simply capture the data that is matched. Since the string is constant, there is not much use capturing it, and I doubt you are using the captured value (available as $0).

Kevin.

On Sun, 24 Oct 2021 at 10:44, ToddAndMargo via perl6-users <perl6-us...@perl.org> wrote:
> Unfortunately this one hangs my program (I am slicing
> up a web page):
>
> $NewRev ~~ s/ .*? ('Release Notes V') //;
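One caveat worth checking before dropping the `?`: the two forms only behave the same when the literal occurs once. The sketch below uses a made-up sample string (not taken from the original web page) in which the literal appears twice, to show where the frugal and greedy forms diverge.

```raku
# Hypothetical sample text in which the literal occurs twice.
my $text = 'aaa Release Notes V1 bbb Release Notes V2 ccc';

# Frugal: .*? stops at the first occurrence of the literal.
my $frugal = $text;
$frugal ~~ s/ .*? 'Release Notes V' //;
say $frugal;   # 1 bbb Release Notes V2 ccc

# Greedy: .* runs on to the last occurrence of the literal.
my $greedy = $text;
$greedy ~~ s/ .* 'Release Notes V' //;
say $greedy;   # 2 ccc
```

If the page can contain the marker more than once, keeping `.*?` is probably the safer choice.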
Re: I need a better regex with a literal in it
On 10/23/21 16:56, Kevin Pye wrote:
> Does
>
> $NewRev ~~ s/ .* 'Release Notes V' //
>
> have the problem?

I fixed the hang. I was giving the regex a zero length file.

    $NewRev ~~ s/ .* "Release Notes V" //

worked.

Where I get into problems is with things like `` and `` and such.

So what I am still looking for is a better way to put special characters into a regex.

-T
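One possible way to get the effect of Q[...] inside a regex is to build the literal outside the regex and interpolate it. In Raku a plain scalar holding a string is matched as literal text inside a regex, so nothing in it needs escaping. The marker and page fragment below are invented for illustration; they are not the strings from the original program.

```raku
# Q[...] quotes with no escaping or interpolation at all.
my $literal = Q[</a> (v1.2)*];   # hypothetical marker full of metacharacters

my $page = 'header</a> (v1.2)*Release Notes';   # hypothetical page fragment

# Inside a regex, a plain scalar is matched as literal text, not as a pattern.
$page ~~ s/ .*? $literal //;
say $page;   # Release Notes
```

The same approach should work with the `s| ... ||` delimiters used earlier in the thread.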
Re: I need a better regex with a literal in it
> On Oct 23, 2021, at 6:43 PM, ToddAndMargo via perl6-users wrote:
>
> Unfortunately this one hangs my program (I am slicing
> up a web page):
>
> $NewRev ~~ s/ .*? ('Release Notes V') //;
>
> I need a better way of doing the above.

Just anchor the start of the pattern, using `^` :

    $NewRev ~~ s/ ^ .*? ('Release Notes V') //;

In the code below:
* Target_V is matched by the original pattern, at high speed.
* Target_V is matched by the anchored pattern, at high speed.
* Target_Z hangs the original pattern.
* Target_Z correctly fails to match the anchored pattern, at high speed.

    my $target_V = 'Release Notes V'; # Will match pattern(s) below
    my $target_Z = 'Release Notes Z'; # Will not match pattern(s) below

    # Uncomment one of these two lines:
    my $target = $target_V;
    # my $target = $target_Z;

    # Simulate a big HTML page:
    my $NewRev = ('abcdefghijklmnopqrstuvwxyz' x 100) ~ $target
               ~ ('ABCDEFGHIJKLMNOPQRSTUVWXYZ' x 100) ~ $target
               ~ ('12345678901234567890123456' x 100);

    # Uncomment one of these two lines:
    $NewRev ~~ s/ .*? ('Release Notes V') //;     # Original
    # $NewRev ~~ s/ ^ .*? ('Release Notes V') //; # Anchored

-- 
Hope this helps,
Bruce Gray (Util of PerlMonks)
Re: I need a better regex with a literal in it
On 10/23/21 17:37, Bruce Gray wrote:
> Just anchor the start of the pattern, using `^` :
> $NewRev ~~ s/ ^ .*? ('Release Notes V') //;

Hi Bruce,

I did fix the hang. I was accidentally feeding the regex a null string.

I do not understand what you mean by "anchor".

-T
Re: I need a better regex with a literal in it
> On Oct 23, 2021, at 7:48 PM, ToddAndMargo via perl6-users wrote:
>
> I did fix the hang. I was accidentally feeding the
> regex a nul string.
>
> I am not understanding what you mean by "anchor"

From https://docs.raku.org/language/regexes#Anchors :

    Regexes search an entire string for matches. Sometimes this is not what you want.
    Anchors match only at certain positions in the string, thereby anchoring the regex match to that position.
    The ^ anchor only matches at the start of the string.
    The $ anchor only matches at the end of the string.

Code examples are given on that web page, but I (and, I am sure, other readers) can expound on anchors on request if the examples fail to suffice.

As to your "null string", I am glad that you resolved your problem, but I cannot get this code to hang:

    $NewRev ~~ s/ ^ .*? ('Release Notes V') //;

just by preceding it with this line:

    $NewRev = '';

so I may misunderstand the nature of your accident.
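As a small illustration of the quoted definitions, the sketch below runs unanchored and anchored patterns against an invented sample string; the string and variable names are assumptions made for this example only.

```raku
my $line = 'abc Release abc';   # hypothetical sample string

my $first-abc    = ($line ~~ /   abc   /).from;   # first place the pattern fits
my $last-abc     = ($line ~~ /   abc $ /).from;   # only the occurrence at the end
my $at-start     = ($line ~~ / ^ Release /).so;   # 'Release' is not at the start
my $anywhere     = ($line ~~ /   Release /).so;   # unanchored: found mid-string

say $first-abc;   # 0
say $last-abc;    # 12
say $at-start;    # False
say $anywhere;    # True
```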
Re: I need a better regex with a literal in it
On 10/23/21 18:03, Bruce Gray wrote:
> As to your "null string", I am glad that you resolved your problem,
> but I cannot get this code to hang:
> $NewRev ~~ s/ ^ .*? ('Release Notes V') //;
> , just by preceding it with this line:
> $NewRev = '';
> , so I may misunderstand the nature of your accident.

It would be unusual for you to have been able to duplicate it. It had to travel through an external call to curl and be read back from STDIN. Come to think of it, whatever was in it (curl showed a zero length download), it probably was not a nul.
Re: I need a better regex with a literal in it
On 10/23/21 18:03, Bruce Gray wrote:
> ^ .*?

Hi Bruce,

I am not seeing the difference between

    ^ .*?

and

    .*?

`.*?` means to search to the first instance of the string. `^` means to start at the beginning. I do not see the difference.
Re: I need a better regex with a literal in it
On 10/23/21 19:02, ToddAndMargo via perl6-users wrote:
> `.*?` means to search to the first instance of the string.
> `^` means to start at the beginning. I do not see the
> difference.

I guess where the confusion is coming from is this: when does a regex NOT start reading at the beginning of the input data stream? Why does it need the `^`?
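One way to look at this question: an unanchored regex does start at the beginning, but when that attempt fails it starts over from the second character, then the third, and so on. With `.*?` each of those attempts can scan the rest of the input, which is where the slowdown on a non-matching page would come from. The sketch below counts attempts with a code block placed at the front of each pattern; the input string and counter names are invented for the example.

```raku
my $haystack = 'abcdefghij' x 20;   # hypothetical input with no 'V' marker in it

my $unanchored = 0;
my $anchored   = 0;

# The leading code block runs every time the engine starts a fresh attempt.
$haystack ~~ /   { $unanchored++ } .*? 'Release Notes V' /;
$haystack ~~ / ^ { $anchored++   } .*? 'Release Notes V' /;

say "unanchored attempts: $unanchored";
say "anchored attempts:   $anchored";
```

The anchored pattern should report a single attempt, while the unanchored one should report roughly one attempt per character of input. When the literal is present, both forms find it on the first attempt, which is why the difference only shows up on failure.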
Re: I need a better regex with a literal in it
On 24/10/2021 12:59, ToddAndMargo via perl6-users wrote:
> It would be unusual for you to have been able to
> duplicate it. It had to travel through an external
> call to curl and be read back from STDIN. Come
> to think of it, whatever was in it (curl showed a
> zero length download), it probably was not a nul.

You probably had a variable that was undefined, rather than containing an empty string, or perhaps you had something else in the variable, as Raku has many types. But why anything would actually hang the program instead of returning an error, I could not say.

-- 
   .~.     In my life God comes way first
   /V\     but Linux is very important for me too :-D
  /( )\    Francis (Grizzly) Smit
  ^^-^^    http://www.smit.id.au/
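For what it is worth, the distinction between undefined and empty can be checked directly before the regex runs. The guard below is only a sketch: the `strip-to-marker` routine is a hypothetical helper, not something from the original program, and it does not explain the hang itself.

```raku
my $undefined;        # no value assigned: holds a type object, not a string
my $empty = '';       # defined, zero-length string

say $undefined.defined;   # False
say $empty.defined;       # True
say $empty.chars;         # 0

# Hypothetical guard: only run the substitution on a defined, non-empty string.
sub strip-to-marker($text is copy) {
    return Nil unless $text.defined && $text.chars;
    $text ~~ s/ ^ .*? 'Release Notes V' //;
    return $text;
}

say strip-to-marker('xx Release Notes V9.1');   # 9.1
```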
Re: I need a better regex with a literal in it
On 10/23/21 21:31, Francis Grizzly Smit wrote:
> you probably had a variable that was undefined, rather than containing
> an empty string, or perhaps you had something else in the variable

That makes sense