Regardless of whether things like case-insensitive matching should be supported, the real problem is that full Unicode support is required. APL is one of those languages where you just can't get away without it. PCRE does support Unicode; unfortunately, I don't think POSIX regexp does.

Are there any alternatives?

Regards,
Elias

On 20 September 2017 at 18:27, Giuseppe Cocomazzi <sbude...@gmail.com> wrote:

> Hi,
> I also think that adding the support would be very useful. However, I
> would definitely avoid PCRE and backreference support. I think the
> best solution would be to just add a basic and efficient NFA-based
> implementation (the de facto original implementation for Unix). For
> more information about the correct way to implement RE:
> https://swtch.com/~rsc/regexp/
>
> As for the API itself, I agree with Elias that maybe a simple
> interface is the way to go. I would also prefer not to have any
> support for modifiers (not even IGNORECASE) and definitely avoid the
> MULTILINE horror. If we opt for the NFA implementation, then the
> builtin ⎕Regex (or ⎕RE) could be universally used not only for strings
> but for numeric data as well. That, in conjunction with APL arrays,
> would ultimately be a killer feature (I am not aware of such a feature
> in other languages).
>
> Best,
>
> Giuseppe Cocomazzi
> http://sbudella.altervista.org
>
>
> On Wed, Sep 20, 2017 at 5:59 AM, Elias Mårtenson <loke...@gmail.com>
> wrote:
> > On several occasions, I have felt that built-in regex support in GNU APL
> > would be very helpful.
> >
> > Implementing it should be rather simple, but I'd like to discuss how such an
> > API should look in order for it to be as useful as possible.
> >
> > I was thinking of the following form:
> >
> >       regex ⎕Regex string
> >
> > The way I envision this to work is to have the function return ⍬ if there
> > is no match, or a string containing the match, if there is one:
> >
> >       'f..' ⎕Regex 'xzooy'
> > ┏⊖┓
> > ┃0┃
> > ┗━┛
> >       'f..' ⎕Regex 'xfooy'
> > 'foo'
> >
> > If the regex has subexpressions, those matches should be returned as
> > individual strings:
> >
> >       '([0-9]+)-([0-9]+)-([0-9]+)' ⎕Regex '2017-01-02'
> > ┏→━━━━━━━━━━━━━━━┓
> > ┃"2017" "01" "02"┃
> > ┗∊━━━━━━━━━━━━━━━┛
> >
> > This would be a very useful API, and reasonably easy to implement by simply
> > calling into the standard regcomp() call:
> > http://pubs.opengroup.org/onlinepubs/009695399/functions/regcomp.html
> >
> > What do you think? Is this a reasonable way to implement it? Any suggestions
> > about alternative APIs?
> >
> > Regards,
> > Elias
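
For reference, the regcomp()-based approach proposed in the original message might look roughly like the C sketch below. It is only an illustration of the proposed return convention (nothing on no match, the whole match when there are no subexpressions, otherwise one string per subexpression). The helper name regex_match, the MAX_GROUPS limit, and the -1 error return are assumptions made for this example; none of this is GNU APL code.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

#define MAX_GROUPS 10 /* assumed cap: whole match plus up to 9 subexpressions */

/* Hypothetical helper, not part of GNU APL.
 * Returns the number of strings stored in out[]:
 *    0  no match (the proposal's empty vector)
 *    1  the whole match, when the pattern has no subexpressions
 *    n  one string per subexpression otherwise
 *   -1  the pattern failed to compile
 * Every out[i] is allocated with malloc() and owned by the caller. */
int regex_match(const char *pattern, const char *text, char *out[MAX_GROUPS])
{
    regex_t re;
    regmatch_t m[MAX_GROUPS];
    int count = 0;

    assert(pattern != NULL && text != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, text, MAX_GROUPS, m, 0) == 0) {
        /* re_nsub is the number of parenthesised subexpressions. */
        size_t first = re.re_nsub > 0 ? 1 : 0;
        size_t last = re.re_nsub;
        if (last >= MAX_GROUPS)
            last = MAX_GROUPS - 1;

        for (size_t i = first; i <= last; i++) {
            /* A group that did not participate in the match has rm_so == -1;
             * it is reported as an empty string here. */
            size_t len = 0;
            if (m[i].rm_so != -1)
                len = (size_t)(m[i].rm_eo - m[i].rm_so);

            char *s = malloc(len + 1);
            if (s == NULL)
                break; /* out of memory: return what was collected so far */
            if (len > 0)
                memcpy(s, text + m[i].rm_so, len);
            s[len] = '\0';
            out[count++] = s;
        }
    }

    regfree(&re);
    return count;
}
```

With this sketch, matching "f.." against "xzooy" yields 0 strings, against "xfooy" yields the single string "foo", and the date pattern against "2017-01-02" yields "2017", "01" and "02". Note that regexec() works on C strings interpreted according to the current locale, so how well this handles non-ASCII text depends on the C library and locale in use, which is exactly the Unicode concern raised at the top of this message.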