Re: [PATCH] add regexp-split

Nala Ginrut Fri, 30 Dec 2011 03:40:19 -0800

Great! It's better now.
Here's the brand new patch~
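
For reference, here is a small self-contained sketch of the behaviour I am aiming for, following the Python examples quoted below. `regexp-split*` is a hypothetical stand-in name so that the snippet runs on a Guile without the patch. It assumes that capture groups should be numbered from 1 (group 0 is the whole match) and that a string with no match should come back as a one-element list. The pattern uses a POSIX bracket expression instead of `\W`, since I am not sure `\W` is portable.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical stand-in for the patched procedure (not part of ice-9 regex).
(define* (regexp-split* regex str #:optional (flags 0))
  (let ((ret (fold-matches
              regex str (list '() 0 str)   ; pieces so far, next start, tail
              (lambda (m prev)
                (let* ((pieces (car prev))
                       (start  (cadr prev))
                       ;; Text between the previous match and this one.
                       (piece  (substring str start (match:start m)))
                       ;; Groups are numbered from 1; 0 is the whole match.
                       (groups (map (lambda (n) (match:substring m (1+ n)))
                                    (iota (1- (match:count m))))))
                  (list `(,@pieces ,piece ,@groups)
                        (match:end m)
                        (match:suffix m))))
              flags)))
    ;; Append whatever follows the last match (the whole string if none).
    `(,@(car ret) ,(caddr ret))))

(regexp-split* "[^A-Za-z]+" "Words, words, words.")
;; => ("Words" "words" "words" "")
(regexp-split* "([^A-Za-z]+)" "Words, words, words.")
;; => ("Words" ", " "words" ", " "words" "." "")
(regexp-split* "x" "abc")
;; => ("abc")
```

Building the result with quasiquote splicing copies the list on every match, which is fine for a sketch; consing and reversing at the end would be the usual alternative.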

On Fri, Dec 30, 2011 at 5:42 PM, Daniel Hartwig <mand...@gmail.com> wrote:


> On 30 December 2011 16:46, Nala Ginrut <nalagin...@gmail.com> wrote:
> > hi Daniel! Very glad to see your reply.
> > 1. I also think the order: (regexp str) is strange. But it's according to
> > python version.
> > And I think the 'string-match' also put regexp before str. Anyway, that's an
> > easy mend.
>
> `regexp string' is also the same order as `list-matches' and
> `fold-matches'.  Probably best to keep it that way if this is in the
> regex module.
>
>
> >> I would like to see your version support the Python semantics [1]:
> >>
> >> > If capturing parentheses are used in pattern, then the text of
> >> > all groups in the pattern are also returned as part of the resulting
> >> > list.
> >> [...]
> >> > >>> re.split('\W+', 'Words, words, words.')
> >> > ['Words', 'words', 'words', '']
> >> > >>> re.split('(\W+)', 'Words, words, words.')
> >> > ['Words', ', ', 'words', ', ', 'words', '.', '']
> >>
> >> >>> re.split('((,)?\W+?)', 'Words, words, words.')
> >> ['Words', ', ', ',', 'words', ', ', ',', 'words', '.', None, '']
>
> FYI this can be achieved by changing the inner part to:
>
>    (let* ...
>           (s (substring string start end))
>           (groups (map (lambda (n) (match:substring m n))
>                        (iota (1- (match:count m)) 1))))
>      (list `(,@ll ,s ,@groups) (match:end m) tail)))
>
> Note: using srfi-1 iota
>
>

From b738a8b890f41bf684c0556ca79af2d7c14b6df5 Mon Sep 17 00:00:00 2001
From: NalaGinrut <nalagin...@gmail.com>
Date: Fri, 30 Dec 2011 19:38:38 +0800
Subject: [PATCH] ADD regexp-split

---
 module/ice-9/regex.scm |   18 +++++++++++++++++-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/module/ice-9/regex.scm b/module/ice-9/regex.scm
index f7b94b7..b5f6149 100644
--- a/module/ice-9/regex.scm
+++ b/module/ice-9/regex.scm
@@ -41,7 +41,7 @@
   #:export (match:count match:string match:prefix match:suffix
            regexp-match? regexp-quote match:start match:end match:substring
            string-match regexp-substitute fold-matches list-matches
-           regexp-substitute/global))
+           regexp-substitute/global regexp-split))
 
 ;; References:
 ;;
@@ -226,3 +226,19 @@
                         (begin
                           (do-item (car items)) ; This is not.
                           (next-item (cdr items)))))))))))
+                          
+(define* (regexp-split regex str #:optional (flags 0))
+  (let ((ret (fold-matches 
+	      regex str (list '() 0 str)
+	      (lambda (m prev)
+		(let* ((ll (car prev))
+		       (start (cadr prev))
+		       (tail (match:suffix m))
+		       (end (match:start m))
+		       (s (substring/shared str start end))
+		       (groups (map (lambda (n) (match:substring m (1+ n)))
+				    (iota (1- (match:count m))))))
+		  (list `(,@ll ,s ,@groups) (match:end m) tail)))
+	      flags)))
+    `(,@(car ret) ,(caddr ret))))
+
-- 
1.7.0.4
