In playing around with this, I realise that the "B" mode is quite useful. So much so, in fact, that I'm wondering if it's warranted to have a dedicated quad-function for this specific behaviour.
Here's an example of extracting sequences of 4 characters:

      {⍵ ⊂⍨ "[a-z]{4}" ⎕RE['B'] ⍵} 'abcdef45abchello9'
┏→━━━━━━━━━━━━━━━━━━━┓
┃"abcd" "abch" "ello"┃
┗∊━━━━━━━━━━━━━━━━━━━┛

Regards,
Elias

On 2 October 2017 at 16:27, Elias Mårtenson <loke...@gmail.com> wrote:

> Some progress:
>
> The behaviour I described earlier still works, but now has the ability to
> work on N-dimensional arrays of strings, compiling the regex only once and
> then applying it on all the cells.
>
> In addition to this, I have now also added a flag "B" (meaning "bitmap")
> that creates a bitmap of all matches and can be used in conjunction with ⊂
> to split strings by regex.
>
> Here's an example:
>
>       " +" ⎕RE["B"] "this is   a     test"
> ┏→━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
> ┃0 0 0 0 1 0 0 2 2 2 0 3 3 3 3 3 0 0 0 0┃
> ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
>
> This matches any sequence of spaces, and we can easily use ⊂ to split the
> string:
>
>       {⍵ ⊂⍨ 0=" +" ⎕RE["B"] ⍵} "this is a test"
> ┏→━━━━━━━━━━━━━━━━━━━━━┓
> ┃"this" "is" "a" "test"┃
> ┗∊━━━━━━━━━━━━━━━━━━━━━┛
>
> However, I'm not sure if the values returned from the function are ideal.
> The idea of the increasing numbers is to be able to differentiate between
> the result of:
>
>       " " ⎕RE["B"] "    "
> ┏→━━━━━━┓
> ┃1 2 3 4┃
> ┗━━━━━━━┛
>
> vs:
>
>       " +" ⎕RE["B"] "    "
> ┏→━━━━━━┓
> ┃1 1 1 1┃
> ┗━━━━━━━┛
>
> Should it be left like this, or should it be done in some other way?
>
> Regards,
> Elias
>
> On 25 September 2017 at 20:10, Juergen Sauermann
> <juergen.sauerm...@t-online.de> wrote:
>
>> Hi Elias,
>>
>> making a quad function an operator is simple if the function argument(s)
>> is/are primitive functions and a little more complicated if not.
>>
>> First of all you have to implement (read: overload) some of the
>> eval_XXX() functions that have function arguments.
>> For monadic operators these eval_XXX() functions are:
>>
>>    virtual Token eval_ALB(Value_P A, Token & LO, Value_P B)
>>    virtual Token eval_ALXB(Value_P A, Token & LO, Value_P X, Value_P B)
>>    virtual Token eval_LB(Token & LO, Value_P B)
>>    virtual Token eval_LXB(Token & LO, Value_P X, Value_P B)
>>
>> where L (LO in the parameter lists) stands for the left function
>> argument. For dyadic operators they are:
>>
>>    virtual Token eval_ALRB(Value_P A, Token & LO, Token & RO, Value_P B)
>>    virtual Token eval_ALRXB(Value_P A, Token & LO, Token & RO, Value_P X,
>>                             Value_P B)
>>    virtual Token eval_LRB(Token & LO, Token & RO, Value_P B)
>>    virtual Token eval_LRXB(Token & LO, Token & RO, Value_P X, Value_P B)
>>
>> where L (LO) and R (RO) stand for the left and right function
>> argument(s), A and B are the value arguments, and X is the axis.
>>
>> Not all of them need to be implemented, only those whose signatures are
>> supported by the operator (mainly in terms of allowing an axis argument X
>> or a left value argument A).
>>
>> If an operator supports defined functions (as opposed to primitive
>> functions) then it will typically implement the operator itself as a
>> macro, which means that the implementation is written in APL rather than
>> in C++ (similar to "magic functions" in NARS). This is needed because
>> primitive functions are atomic (they either succeed or fail, but cannot
>> be continued after a failure) while defined functions (and operators) can
>> continue at the point of interruption after having fixed the values that
>> have caused the fault.
>>
>> Some of the built-in operators in GNU APL have both a primitive
>> implementation (which is used when the function arguments are primitive)
>> and a macro-based implementation that is used if they are not. This is
>> for performance reasons, so that the ability to take defined functions as
>> arguments does not slow down the cases where the function arguments are
>> primitive.
>>
>> The macro definitions are contained in Macro.def
>>
>> Please note that in GNU APL functions cannot return functions, which may
>> or may not be a problem in your case, depending on whether the function
>> argument(s) of the ⎕-operator is/are primitive or not. In standard APL
>> you cannot assign a function to a name. The usual work-around is to
>> return a string and ⍎ it.
>>
>> My gut feeling is that if you need function arguments for implementing
>> regular expressions then something has gone in the wrong direction
>> somewhere else.
>>
>> Best Regards,
>> /// Jürgen
>>
>>
>> On 09/25/2017 05:18 AM, Elias Mårtenson wrote:
>>
>>> Dyalog's implementation is much more expressive than what I had proposed.
>>>
>>> There are technical reasons why we have no hope of replicating their
>>> functionality (in particular, GNU APL does not have support for
>>> namespaces).
>>>
>>> Their function takes arguments and returns a function, which is a
>>> matcher function that can be reused, which is useful since you'd only
>>> compile the regexp once. Jürgen, how can I make a quad-function behave
>>> like below? It seems to be similar in behaviour to ⍤ and ⍣.
>>>
>>>       ('.at' ⎕R '\u0') 'The cat sat on the mat'
>>> The CAT SAT on the MAT
>>>
>>> It can also accept a function, in which case the function is called for
>>> each match, to return a replacement string. Can you explain how to make
>>> a quad-function an operator?
>>>
>>>       ('\w+' ⎕R {⌽⍵.Match}) 'The cat sat on the mat'
>>> ehT tac tas no eht tam
>>>
>>> As you can see, they leverage namespaces in order to pass a lot of
>>> different fields to the replace-function. If we want to do something
>>> similar, ⍵ would probably have to be the match string, and we'll have to
>>> live without the remaining fields.
>>>
>>> Regards,
>>> Elias
>>>
>>>
>>> On 23 September 2017 at 00:08, Juergen Sauermann
>>> <juergen.sauerm...@t-online.de> wrote:
>>>
>>>     Hi,
>>>
>>>     I have not looked into Dyalog's implementation myself, but if they
>>>     have it then we should aim at being as compatible as it makes sense.
>>>     No problem if some of their capabilities are not supported (please
>>>     avoid going over the top in the GNU APL implementation).
>>>
>>>     Unfortunately ⎕R is already occupied in GNU APL (inherited from
>>>     IBM APL2), so some other name(s) are needed.
>>>
>>>     Before implementing too much in advance, it would be good to
>>>     present the intended syntax and semantics on bug-apl and solicit
>>>     opinions.
>>>
>>>     /// Jürgen
>>>
>>>
>>>     On 09/22/2017 04:59 PM, Elias Mårtenson wrote:
>>>
>>>>     I did not know this. I took a look at Dyalog's API and it's not
>>>>     possible to implement it fully, as it relies on their object
>>>>     oriented features. However, the basic functionality wouldn't be
>>>>     hard to replicate, if that is something that is desired.
>>>>
>>>>     Jürgen, what is your opinion on this?
>>>>
>>>>     On 22 September 2017 at 20:21, Jay Foad <jay.f...@gmail.com> wrote:
>>>>
>>>>         FYI Dyalog has operators ⎕S (search) and ⎕R (replace) which
>>>>         are implemented with PCRE:
>>>>
>>>>               ('[Aa]..'⎕S'&')'Dyalog APL'
>>>>         ┌───┬───┐
>>>>         │alo│APL│
>>>>         └───┴───┘
>>>>               ('red' 'green'⎕R'green' 'blue')'red orange yellow green blue'
>>>>         green orange yellow blue blue
>>>>
>>>>         http://help.dyalog.com/16.0/Content/Language/System%20Functions/r.htm
>>>>
>>>>         Jay.
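
The "B" (bitmap) semantics discussed in this thread can also be sketched
outside APL. What follows is a minimal C++ illustration under stated
assumptions, not the GNU APL implementation: the names match_bitmap and
partition are hypothetical, the regex engine is std::regex (ECMAScript
syntax) rather than PCRE, and strings are treated as ASCII bytes. Each
position covered by the k-th match gets the value k, everything else gets 0,
and partition mimics how ⊂ starts a new group whenever the key increases
and drops positions whose key is 0.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper emulating the "B" flag described in the thread:
// position i receives k if it lies inside the k-th match (1-based), else 0.
// Assumption: byte-wise indexing, so only meaningful for ASCII input.
std::vector<int> match_bitmap(const std::string& pattern,
                              const std::string& text) {
    std::vector<int> bitmap(text.size(), 0);
    const std::regex re(pattern);  // compiled once; ECMAScript syntax, not PCRE
    int match_no = 0;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        if (it->length() == 0) continue;  // empty matches cover no position
        ++match_no;
        const std::size_t start = static_cast<std::size_t>(it->position());
        const std::size_t len = static_cast<std::size_t>(it->length());
        for (std::size_t i = start; i < start + len; ++i) bitmap[i] = match_no;
    }
    return bitmap;
}

// Hypothetical helper mimicking APL2-style partition (keys ⊂ text):
// a new group starts whenever the key is larger than the previous key,
// and positions whose key is 0 are dropped. Keys must be non-negative.
std::vector<std::string> partition(const std::vector<int>& keys,
                                   const std::string& text) {
    assert(keys.size() == text.size());
    std::vector<std::string> parts;
    int prev = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (keys[i] != 0) {
            if (keys[i] > prev) parts.emplace_back();  // key increased: new group
            parts.back().push_back(text[i]);
        }
        prev = keys[i];
    }
    return parts;
}
```

With these helpers, partition(match_bitmap("[a-z]{4}", s), s) corresponds to
the 4-character example at the top of the thread. Splitting on spaces needs
the bitmap inverted first (the 0= step in the APL version), because there the
unmatched positions are the ones to keep. The increasing match numbers are
what let adjacent matches such as "abch" and "ello" end up in separate groups.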