Hi!

Does Pharo use the Regex11 package? If yes, has it already diverged from the 
version shipped with VisualWorks?


The reason I am asking is that I just pushed an update to the public store. It 
addresses a bug that prevented $[ from being used in a character class. For 
details, see the excerpt below. Also, do you have an opinion on allowing more 
escape sequences in character classes?


Kind regards,
Steffen


----- Forwarded message -----
From: Steffen Märcker <merk...@web.de>
To: 'VWNC' <v...@cs.uiuc.edu>
Subject: Re: [vwnc] Exception in Regex11 1.4.6
Date: Thu Jun 24 2021 18:47:01 GMT+0200 (Central European Summer Time)


Hi!


I just published Regex11 version 1.4.7 with the following changes:


1. Fix: Character sets could not contain an opening bracket $[.
2. Fix: Character sets could not contain the characters '[:', e.g. as in 
'[[:something]' asRegex.
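
A minimal workspace sketch of the two fixed cases, assuming Regex11 1.4.7 or 
later is loaded (the variable names are only illustrative). With the fixes 
above, both expressions should parse and both lines should print true; on 
older versions the parse itself is expected to fail.

```smalltalk
"Workspace sketch; assumes Regex11 1.4.7 or later is loaded."
| openBracket bracketColon |

"Fix 1: a character set that contains only the opening bracket $[."
openBracket := '[[]' asRegex.
Transcript show: (openBracket matches: '[') printString; cr.

"Fix 2: a character set that starts with the characters '[:' but is not
 the special [[:xxx:]] syntax. It should contain $[, $: and the letters
 of 'something'."
bracketColon := '[[:something]' asRegex.
Transcript show: (bracketColon matches: ':') printString; cr.
```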


I also provided additional tests for the improved functionality. I might tidy 
the code a bit later in a minor version bump.



Just to note that Regex11 uses [[:xxx:]] as a special syntax, which might 
interfere with attempts to allow [[] and []]. 



Indeed. Unless I made a mistake, the new version does not break this.



I agree with the idea to allow backslash escaping in character classes too, 
with the default being that backslash followed by any character is parsed as 
that character.

I also like the idea of allowing more backslash escaping in character classes. 
However, I still have a bad feeling that this might change the semantics of 
existing code. Hence, I will hold off on implementing this until I am more 
confident that it does not break other people's code.



Currently only a few explicitly defined backslash escapes are recognized, 
forcing the user to remember whether a given character can be used as-is in a 
given context, or must be escaped.

 

A couple of gotchas (probably not applicable in a character set?):

\<           an empty string at the beginning of a word

\>           an empty string at the end of a word

Thanks, I'll keep them in mind and consider them when I decide to implement the 
changes.


OT: I also noticed that repetition, e.g. '.{5}', behaves strangely. For 
instance, '.{{5}}' should match 'a{{{{{}' but it doesn't. Does anyone have an 
opinion on that one?
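
The repetition case can be tried in a workspace like this. It is only a 
sketch, assuming a Regex11 version that supports the {n} repetition syntax. 
The intended reading of the expression is: any character, then $\{ repeated 
five times, then a literal $\}.

```smalltalk
"Workspace sketch; assumes Regex11 with {n} repetition support is loaded."
| matcher |

"Intended reading: '.' matches $a, '{{5}' matches five opening braces,
 and the trailing '}' matches itself."
matcher := '.{{5}}' asRegex.
Transcript show: (matcher matches: 'a{{{{{}') printString; cr.
```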


Best regards, Steffen
