Hi Daniel! Very glad to see your reply.
1. I also think the argument order (regexp str) is strange, but it follows
the Python version.
'string-match' also puts the regexp before the string. Anyway, that's
an easy fix.
2. I think it's a little difficult to implement a flag the way the Python
version does, since the "ignorecase" flag must
be passed to make-regexp. So it can't simply be handed to fold-matches.
Hmm...let me see what I can do...
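
For reference while I work on it, here is a small Python sketch of the
behaviour I understand we want to match. The sample string "aXbxc" and the
variable names are only made up for illustration. It also shows that in
Python the ignore-case flag can be given either to re.split or at compile
time; how that maps onto make-regexp is still an open question on my side.

```python
import re

# Python's argument order is (pattern, string), as in the proposed regexp-split.
TEXT = "Words, words, words."

# No capturing group: the delimiters are dropped from the result.
plain = re.split(r"\W+", TEXT)
print(plain)        # ['Words', 'words', 'words', '']

# Capturing group: the matched delimiters are kept in the result.
with_delims = re.split(r"(\W+)", TEXT)
print(with_delims)  # ['Words', ', ', 'words', ', ', 'words', '.', '']

# The ignore-case flag can be passed to re.split directly ...
folded = re.split(r"x", "aXbxc", flags=re.IGNORECASE)
print(folded)       # ['a', 'b', 'c']

# ... or baked into the pattern when it is compiled.
compiled = re.compile(r"x", re.IGNORECASE).split("aXbxc")
print(compiled)     # ['a', 'b', 'c']
```

The two flag variants give the same result, so accepting an already
compiled regexp might be one way to get "ignorecase" without an extra
argument.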

On Fri, Dec 30, 2011 at 1:34 PM, Daniel Hartwig <mand...@gmail.com> wrote:

> Hello
>
> >>> On Thu, Dec 29, 2011 at 5:32 PM, Nala Ginrut <nalagin...@gmail.com>
> >>> wrote:
> >>>>
> >>>> hi guilers!
> >>>> It seems like there's no "regexp-split" procedure in Guile.
> >>>> What we have is "string-split" which accepted Char only.
> >>>> So I wrote one for myself.
> >>>>
> >>>> ------python code-----
> >>>> >>> import re
> >>>> >>> re.split("([^0-9])", "123+456*/")
> >>>> ['123', '+', '456', '*', '', '/', '']
> >>>> --------code end-------
> >>>>
> >>>> The Guile version:
> >>>>
> >>>> ----------guile code-------
> >>>> (regexp-split "([^0-9])"  "123+456*/")
> >>>> ==>("123" "+" "456" "*" "" "/" "")
> >>>> ----------code end--------
> >>>>
> >>>> Anyone interested in it?
> >>>>
>
> Nice work!  I have a couple of comments :-)
>
>
> The matched pattern/delimiter is included in the output:
>
> scheme@(guile-user)> (regexp-split "(\\W+)" "Words, words, words.")
> $21 = ("Words" ", " "words" ", " "words" "." "")
> scheme@(guile-user)> (regexp-split "\\W+" "Words, words, words.")
> $22 = ("Words" ", " "words" ", " "words" "." "")
>
> However, a user is not always interested in the delimiter.  Consider
> the example given for string-split:
>
> scheme@(guile-user)> (string-split "root:x:0:0:root:/root:/bin/bash" #\:)
> $23 = ("root" "x" "0" "0" "root" "/root" "/bin/bash")
>
> This behaviour can be obtained with list-matches on the complement of
> REGEXP.
>
> scheme@(guile-user)> (map match:substring
>                          (list-matches "\\w+" "Words, words, words."))
> $24 = ("Words" "words" "words")
>
> I would like to see your version support the Python semantics [1]:
>
> > If capturing parentheses are used in pattern, then the text of
> > all groups in the pattern are also returned as part of the resulting
> > list.
> [...]
> > >>> re.split('\W+', 'Words, words, words.')
> > ['Words', 'words', 'words', '']
> > >>> re.split('(\W+)', 'Words, words, words.')
> > ['Words', ', ', 'words', ', ', 'words', '.', '']
>
> >>> re.split('((,)?\W+?)', 'Words, words, words.')
> ['Words', ', ', ',', 'words', ', ', ',', 'words', '.', None, '']
>
>
> For the sake of consistency with the rest of the module perhaps
> support the `flags' option (just pass it to fold-matches) and use the
> same variable names, etc.:
>
> (define* (regexp-split regexp string #:optional (flags 0))
>  ...
>
> instead of:
>
> (define regexp-split
>  (lambda (regex str)
>  ...
>
>
> Also, to me the name seems unintuitive -- it is STR being split, not
> RE -- perhaps this can be folded into the existing string-split
> function.
>
>
> A nice patch nonetheless!
>
>
> [1] http://docs.python.org/library/re.html#re.split
>
