Some progress: The behaviour I described earlier still works, but the function can now also work on N-dimensional arrays of strings, compiling the regex only once and then applying it to all the cells.
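To illustrate the compile-once idea outside of APL, here is a rough Python sketch. It is not the GNU APL implementation (which is C++); the name apply_regex, the use of nested lists to stand in for an N-dimensional array, and the choice to return the first matched substring per cell are all assumptions made purely for illustration.

```python
import re


def apply_regex(pattern, cells):
    """Compile `pattern` once, then search every string in a nested list.

    Hypothetical helper: nested lists stand in for an N-dimensional array,
    and each cell is replaced by its first match (or None if no match).
    """
    compiled = re.compile(pattern)  # compiled a single time, before visiting cells

    def visit(cell):
        if isinstance(cell, str):
            match = compiled.search(cell)
            return match.group(0) if match else None
        # Not a string: treat it as one more array dimension and recurse.
        return [visit(item) for item in cell]

    return visit(cells)


grid = [["the cat", "a dog"], ["sat", "mat"]]
result = apply_regex(r".at", grid)
print(result)
```

The shape of the result follows the shape of the input, and the pattern is compiled exactly once regardless of how many cells there are.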
In addition to this, I have now also added a flag "B" (meaning "bitmap")
that creates a bitmap of all matches and can be used in conjunction with ⊂
to split strings by regex. Here's an example:

      " +" ⎕RE["B"] "this is   a     test"
┏→━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃0 0 0 0 1 0 0 2 2 2 0 3 3 3 3 3 0 0 0 0┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

This matches any sequence of spaces, and we can easily use ⊂ to split the
string:

      {⍵ ⊂⍨ 0=" +" ⎕RE["B"] ⍵} "this is   a     test"
┏→━━━━━━━━━━━━━━━━━━━━━┓
┃"this" "is" "a" "test"┃
┗∊━━━━━━━━━━━━━━━━━━━━━┛

However, I'm not sure if the values returned from the function are ideal.
The idea of the increasing numbers is to be able to differentiate between
the result of:

      " " ⎕RE["B"] "    "
┏→━━━━━━┓
┃1 2 3 4┃
┗━━━━━━━┛

vs:

      " +" ⎕RE["B"] "    "
┏→━━━━━━┓
┃1 1 1 1┃
┗━━━━━━━┛

Should it be left like this, or should it be done in some other way?

Regards,
Elias

On 25 September 2017 at 20:10, Juergen Sauermann
<juergen.sauerm...@t-online.de> wrote:

> Hi Elias,
>
> making a quad function an operator is simple if the function argument(s)
> is/are primitive functions and a little more complicated if not.
>
> First of all you have to implement (read: overload) some of the
> eval_XXX() functions that have function arguments. For monadic operators
> these eval_XXX() functions are:
>
>     virtual Token eval_ALB(Value_P A, Token & LO, Value_P B)
>     virtual Token eval_ALXB(Value_P A, Token & LO, Value_P X, Value_P B)
>     virtual Token eval_LB(Token & LO, Value_P B)
>     virtual Token eval_LXB(Token & LO, Value_P X, Value_P B)
>
> where L resp. LO stands for the left function argument. For dyadic
> operators they are:
>
>     virtual Token eval_ALRB(Value_P A, Token & LO, Token & RO, Value_P B)
>     virtual Token eval_ALRXB(Value_P A, Token & LO, Token & RO, Value_P X, Value_P B)
>     virtual Token eval_LRB(Token & LO, Token & RO, Value_P B)
>     virtual Token eval_LRXB(Token & LO, Token & RO, Value_P X, Value_P B)
>
> where L resp. LO and R resp. RO stand for the left and right function
> argument(s), A and B are the value arguments, and X the axis.
>
> Not all of them need to be implemented, only those that have function
> signatures that are supported by the operator (mainly in terms of
> allowing an axis argument X or a left value argument A).
>
> If an operator supports defined functions (as opposed to primitive
> functions) then it will typically implement the operator itself as a
> macro, which means that the implementation is written in APL rather than
> in C++ (similar to "magic functions" in NARS). This is needed because
> primitive functions are atomic (they either succeed or fail, but cannot
> be continued after a failure) while defined functions (and operators)
> can continue at the point of interruption after having fixed the values
> that have caused the fault.
>
> Some of the built-in operators in GNU APL have both a primitive
> implementation (which is used when the function arguments are primitive)
> and a macro-based implementation if not. This is for performance reasons,
> so that the ability to take defined functions as arguments does not
> performance-wise harm the cases where the function arguments are
> primitive.
>
> The macro definitions are contained in Macro.def
>
> Please note that in GNU APL functions cannot return functions, which may
> or may not be a problem in your case, depending on whether the function
> argument(s) of the ⎕-operator is/are primitive or not. In standard APL
> you cannot assign a function to a name. The usual work-around is to
> return a string and ⍎ it.
>
> My gut feeling is that if you need function arguments for implementing
> regular expressions then something has gone in the wrong direction
> somewhere else.
>
> Best Regards,
> /// Jürgen
>
>
> On 09/25/2017 05:18 AM, Elias Mårtenson wrote:
>
>> Dyalog's implementation is much more expressive than what I had proposed.
>>
>> There are technical reasons why we have no hope of replicating their
>> functionality (in particular, GNU APL does not have support for
>> namespaces).
>>
>> Their function takes arguments and returns a function, which is a
>> matcher function that can be reused, which is useful since you'd only
>> compile the regexp once. Jürgen, how can I make a quad-function behave
>> like below? It seems to be similar in behaviour to ⍤ and ⍣.
>>
>>       ('.at' ⎕R '\u0') 'The cat sat on the mat'
>> The CAT SAT on the MAT
>>
>> It can also accept a function, in which case the function is called for
>> each match, to return a replacement string. Can you explain how to make
>> a quad-function an operator?
>>
>>       ('\w+' ⎕R {⌽⍵.Match}) 'The cat sat on the mat'
>> ehT tac tas no eht tam
>>
>> As you can see, they leverage namespaces in order to pass a lot of
>> different fields to the replace-function. If we want to do something
>> similar, ⍵ would probably have to be the match string, and we'll have to
>> live without the remaining fields.
>>
>> Regards,
>> Elias
>>
>>
>> On 23 September 2017 at 00:08, Juergen Sauermann
>> <juergen.sauerm...@t-online.de> wrote:
>>
>>     Hi,
>>
>>     I have not looked into Dyalog's implementation myself, but if they
>>     have it then we should aim at being as compatible as makes sense.
>>     No problem if some of their capabilities are not supported (please
>>     avoid going over the top in the GNU APL implementation).
>>
>>     Unfortunately ⎕R is already occupied in GNU APL (inherited from
>>     IBM APL2), so some other name(s) are needed.
>>
>>     Before implementing too much in advance, it would be good to
>>     present the intended syntax and semantics on bug-apl and solicit
>>     opinions.
>>
>>     /// Jürgen
>>
>>
>>     On 09/22/2017 04:59 PM, Elias Mårtenson wrote:
>>
>>>     I did not know this. I took a look at Dyalog's API and it's not
>>>     possible to implement it fully, as it relies on their
>>>     object-oriented features. However, the basic functionality wouldn't
>>>     be hard to replicate, if that is something that is desired.
>>>
>>>     Jürgen, what is your opinion on this?
>>>
>>>     On 22 September 2017 at 20:21, Jay Foad <jay.f...@gmail.com> wrote:
>>>
>>>         FYI Dyalog has operators ⎕S (search) and ⎕R (replace) which
>>>         are implemented with PCRE:
>>>
>>>               ('[Aa]..'⎕S'&')'Dyalog APL'
>>>         ┌───┬───┐
>>>         │alo│APL│
>>>         └───┴───┘
>>>               ('red' 'green'⎕R'green' 'blue')'red orange yellow green blue'
>>>         green orange yellow blue blue
>>>
>>>         http://help.dyalog.com/16.0/Content/Language/System%20Functions/r.htm
>>>
>>>         Jay.