On Sun, Oct 28, 2018 at 09:43:27PM +0100, Karsten Hilbert wrote:

> Let me try to explain the expression I am actually after
> (assuming .compile with re.VERBOSE):
> 
> rx_works = r'''
>       \$<             # start of match is literal '$<' anywhere inside string
>       [^<:]+?::       # followed by at least one "character", except '<' or ':',
>                       # until the next '::' (this is the placeholder "name")
>       .*?::           # followed by any number of any "character", until the
>                       # next '::' (this is the placeholder "options")
>       \d*?            # followed by any number of digits
>                       # (the max length of placeholder output)
>       >\$             # followed by '>$'
>       |               # -- OR (in *either* order) --
>       \$<             # start of match is literal '$<' anywhere inside string
>       [^<:]+?::       # followed by at least one "character", except '<' or ':',
>                       # until the next '::' (this is the placeholder "name")
>       .*?::           # followed by any number of any "character", until the
>                       # next '::' (this is the placeholder "options")
>                       # now the difference:
>       \d+-\d+         # followed by one-or-many digits, a '-', and one-or-many
>                       # digits (this is the *range* from within placeholder output)
>       >\$             # followed by '>$'
> '''
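
For reference, a minimal sketch of how the expression quoted above could be
compiled and tried out. The sample line and its placeholder names are made up
for illustration; the two alternatives are written compactly but are meant to
be the same as in the quote.

```python
import re

# Same two alternatives as in the quoted expression, compiled with re.VERBOSE.
rx_works = re.compile(r'''
    \$<  [^<:]+?::  .*?::  \d*?     >\$   # name::options::length
    |
    \$<  [^<:]+?::  .*?::  \d+-\d+  >\$   # name::options::from-until
''', re.VERBOSE)

# Hypothetical sample line with one placeholder of each kind.
line = 'a $<name::some options::10>$ b $<other::opts::3-7>$ c'

# No capturing groups, so findall() returns the full text of each match.
matches = rx_works.findall(line)
for match in matches:
    print(match)
```

Note that '.*?::' does not stop at a '>$', so on other input the "options"
part may run across a placeholder boundary to a later '::'.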

Another try:

- lines can contain several placeholders

- placeholders start and end with '$'

- placeholders are parsed in three passes

- the pass in which a placeholder is parsed is denoted by the number of '<' and
  '>' next to the '$':

        $<...>$ / $<<...>>$ / $<<<...>>>$

- placeholders for different parsing passes must be nestable:

        $<<<...$<...>$...>>>$
        ....
        (lower, i.e. earlier, parsing passes will be nested inside)

- the internal structure is "name::options::range"

        $<name::options::range>$

- name will *not* contain '$' '<' '>' ':'

- range can be either a length or a "from-until"

- a length will be a positive integer (no bounds checking)

- "from-until" is: a positive integer, a '-', and a positive integer (no sanity
  checking)

- options needs to be able to contain nearly anything, except '::'
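
To make the list above concrete, here is a rough sketch of one way it might
translate into code. It is only an illustration under a few assumptions: the
range is required, passes are run lowest first so that a nested placeholder
has already been replaced when its enclosing one is parsed, and the names
placeholder_rx, expand and demo_resolve are invented for this example.

```python
import re

def placeholder_rx(parse_pass):
    """Build a regex for placeholders of one parsing pass (1, 2 or 3)."""
    n = parse_pass
    return re.compile(r'''
        \$<{%d}(?!<)                 # '$' followed by exactly n '<'
        (?P<name>[^$<>:]+)::         # name: no '$', '<', '>' or ':'
        (?P<options>(?:(?!::).)*)::  # options: anything except '::'
        (?P<range>\d+(?:-\d+)?)      # a length, or "from-until"
        (?<!>)>{%d}\$                # exactly n '>' followed by '$'
    ''' % (n, n), re.VERBOSE)

def expand(line, resolve):
    """Run passes 1, 2, 3 in order so inner placeholders are replaced first."""
    for parse_pass in (1, 2, 3):
        rx = placeholder_rx(parse_pass)
        line = rx.sub(
            lambda m: resolve(m.group('name'), m.group('options'), m.group('range')),
            line,
        )
    return line

def demo_resolve(name, options, rng):
    # Stand-in for real placeholder handling: just show the parsed parts.
    return '[%s|%s|%s]' % (name, options, rng)

text = 'x $<<<outer::$<inner::a b::5>$::1-3>>>$ y'
result = expand(text, demo_resolve)
print(result)
```

The lookahead and lookbehind keep the pass-1 pattern from matching inside
'$<<...>>$' or '$<<<...>>>$'. This only works as long as the output of an
inner placeholder does not itself contain '::'.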


Is that sufficiently defined and helpful to design the regular expression?

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list
