On 30 December 2011 21:23, Marijn <hk...@gentoo.org> wrote:
> Group capturing is useful, but the question is whether it is useful in
> the context of regexp-split. Maybe it is, maybe it isn't. Racket seems
> to be doing it differently than python, so I think that constitutes
> reason to look more closely. Certainly guile should follow racket over
> python, everything else being equal, but usually everything isn't
> equal if only one has a look and I'm saying that we should look at
> least at other schemes for inspiration.
>
> If you're so convinced that python is doing it right here and should
> be followed, then perhaps you can give some examples of how capturing
> groups are useful in a function that is supposed to split strings at
> regexps.

Having the *option* to return the captured groups in `regexp-split' is
certainly useful -- consider implementing a parser [1]. If the captured
groups are not desired, then simply omit the grouping parens from the
expression.

[1] http://80.68.89.23/2003/Oct/26/reSplit/

> Another data point:
>
> [14:17] <hkBst> what does chicken return for (irregex-split "([^0-9])"
> "123+456*/") ?
> [14:18] <sjamaan> ("123" "456")
>
> Looks like chicken doesn't do capturing groups in their version, but
> they don't have the empty matches either. How about that...

For tokenizing I think you want to keep any empty strings, otherwise you
lose track of which `field' you are in (consider /etc/passwd entries).
This also matches the existing behaviour of `string-split'.
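
To make both points concrete, here is a small sketch in Python, since its
`re.split' is the behaviour being compared against. It is only an
illustration, not a proposal for the Guile API; the passwd-style entry is
made up for the example.

```python
import re

# With a capturing group, the separators (here: operators) are returned
# alongside the pieces, which is what a tokenizer needs.
tokens = re.split(r"([^0-9])", "123+456*/")
print(tokens)       # ['123', '+', '456', '*', '', '/', '']

# Without the group the separators are dropped, but the empty strings
# between adjacent separators are still kept.
pieces = re.split(r"[^0-9]", "123+456*/")
print(pieces)       # ['123', '456', '', '']

# Hypothetical passwd-style entry whose fifth field is empty.
entry = "daemon:x:1:1::/usr/sbin:/usr/sbin/nologin"
fields = entry.split(":")
print(fields[5])    # '/usr/sbin' (the home directory)

# Dropping empty strings shifts every later field by one position.
compact = [f for f in fields if f]
print(compact[5])   # '/usr/sbin/nologin' (no longer the home directory)
```

With the empty strings discarded there is no way to tell which field was
missing, which is the problem with the chicken result quoted above.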