Well, I realized I was mistaken: we can use fold-matches after all.

On Fri, Dec 30, 2011 at 4:46 PM, Nala Ginrut <nalagin...@gmail.com> wrote:
> hi Daniel! Very glad to see your reply.
> 1. I also think the order (regexp str) is strange, but it follows the
> Python version.
> And I think 'string-match' also puts regexp before str. Anyway, that's
> an easy mend.
> 2. I think implementing a flag the way the Python version does is a
> little different, since the "ignorecase" flag must
> be passed to make-regexp. So we can't use fold-matches.
> Hmm...let me see what I can do...
>
> On Fri, Dec 30, 2011 at 1:34 PM, Daniel Hartwig <mand...@gmail.com> wrote:
>
>> Hello
>>
>> >>> On Thu, Dec 29, 2011 at 5:32 PM, Nala Ginrut <nalagin...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> hi guilers!
>> >>>> It seems like there's no "regexp-split" procedure in Guile.
>> >>>> What we have is "string-split", which accepts a char only.
>> >>>> So I wrote one for myself.
>> >>>>
>> >>>> ------python code-----
>> >>>> >>> import re
>> >>>> >>> re.split("([^0-9])", "123+456*/")
>> >>>> ['123', '+', '456', '*', '', '/', '']
>> >>>> --------code end-------
>> >>>>
>> >>>> The Guile version:
>> >>>>
>> >>>> ----------guile code-------
>> >>>> (regexp-split "([^0-9])" "123+456*/")
>> >>>> ==>("123" "+" "456" "*" "" "/" "")
>> >>>> ----------code end--------
>> >>>>
>> >>>> Anyone interested in it?
>> >>>>
>>
>> Nice work! I have a couple of comments :-)
>>
>>
>> The matched pattern/delimiter is included in the output:
>>
>> scheme@(guile-user)> (regexp-split "(\\W+)" "Words, words, words.")
>> $21 = ("Words" ", " "words" ", " "words" "." "")
>> scheme@(guile-user)> (regexp-split "\\W+" "Words, words, words.")
>> $22 = ("Words" ", " "words" ", " "words" "." "")
>>
>> However, a user is not always interested in the delimiter. Consider
>> the example given for string-split:
>>
>> scheme@(guile-user)> (string-split "root:x:0:0:root:/root:/bin/bash" #\:)
>> $23 = ("root" "x" "0" "0" "root" "/root" "/bin/bash")
>>
>> This behaviour can be obtained with list-matches on the complement of
>> REGEXP.
>>
>> scheme@(guile-user)> (map match:substring
>>                           (list-matches "\\w+" "Words, words, words."))
>> $24 = ("Words" "words" "words")
>>
>> I would like to see your version support the Python semantics [1]:
>>
>> > If capturing parentheses are used in pattern, then the text of
>> > all groups in the pattern are also returned as part of the resulting
>> > list.
>> [...]
>> > >>> re.split('\W+', 'Words, words, words.')
>> > ['Words', 'words', 'words', '']
>> > >>> re.split('(\W+)', 'Words, words, words.')
>> > ['Words', ', ', 'words', ', ', 'words', '.', '']
>>
>> >>> re.split('((,)?\W+?)', 'Words, words, words.')
>> ['Words', ', ', ',', 'words', ', ', ',', 'words', '.', None, '']
>>
>>
>> For the sake of consistency with the rest of the module, perhaps
>> support the `flags' option (just pass it to fold-matches) and use the
>> same variable names, etc.:
>>
>>   (define* (regexp-split regexp string #:optional (flags 0))
>>     ...
>>
>> instead of:
>>
>>   (define regexp-split
>>     (lambda (regex str)
>>       ...
>>
>>
>> Also, to me the name seems unintuitive (it is STR being split, not
>> RE); perhaps this can be folded into the existing string-split
>> function.
>>
>>
>> A nice patch nonetheless!
>>
>>
>> [1] http://docs.python.org/library/re.html#re.split
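
For reference, here is a minimal sketch of how fold-matches could drive the split with the Python semantics quoted above. It is only an illustration, not the patch from this thread: the accumulator layout is my own choice, and FLAGS is assumed to hold match-time flags that are handed straight to fold-matches. It also relies on fold-matches accepting an already compiled regexp, so a case-insensitive split would be done by compiling the pattern with make-regexp and regexp/icase first.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical sketch: split STRING around every match of REGEXP.
;; As in Python's re.split, the text of each capture group is
;; included in the result (#f for a group that did not take part).
(define* (regexp-split regexp string #:optional (flags 0))
  ;; The accumulator is (POS . PIECES): POS is the end of the previous
  ;; match, PIECES is the list of pieces collected so far, reversed.
  (let* ((result
          (fold-matches
           regexp string (cons 0 '())
           (lambda (m acc)
             (let* ((before (substring string (car acc) (match:start m)))
                    ;; Capture groups 1..N in order; group 0 is the
                    ;; whole match and is left out.
                    (groups (let loop ((n (- (match:count m) 1))
                                       (out '()))
                              (if (< n 1)
                                  out
                                  (loop (- n 1)
                                        (cons (match:substring m n) out))))))
               (cons (match:end m)
                     (append (reverse groups)
                             (cons before (cdr acc))))))
           flags))
         ;; Whatever follows the last match (possibly the empty string).
         (tail (substring string (car result))))
    (reverse (cons tail (cdr result)))))

(regexp-split "([^0-9])" "123+456*/")
;; => ("123" "+" "456" "*" "" "/" "")
(regexp-split "\\W+" "Words, words, words.")
;; => ("Words" "words" "words" "")
```

Since the compile-time flags live in the regexp object, something like (regexp-split (make-regexp "x" regexp/icase) "aXbxc") should cover the "ignorecase" case without changing the procedure.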