Below is a patch that adds a few examples of using the regular expression functions. I hope it is useful.
(guile contribution papers already signed).

Ian.

* slib.texi: Fixed double `the' in sentence.
* scheme-data.texi: Added some examples for regular expression usage.

Index: slib.texi
===================================================================
RCS file: /cvsroot/guile/guile/guile-core/doc/ref/slib.texi,v
retrieving revision 1.3
diff -u -r1.3 slib.texi
--- slib.texi 8 Jan 2002 08:29:00 -0000 1.3
+++ slib.texi 22 Sep 2002 20:55:07 -0000
@@ -2,7 +2,7 @@
 @node SLIB
 @chapter SLIB
 
-Before the the SLIB facilities can be used, the following Scheme
+Before the SLIB facilities can be used, the following Scheme
 expression must be executed:
 
 @smalllisp
Index: scheme-data.texi
===================================================================
RCS file: /cvsroot/guile/guile/guile-core/doc/ref/scheme-data.texi,v
retrieving revision 1.22
diff -u -r1.22 scheme-data.texi
--- scheme-data.texi 16 Sep 2002 20:01:34 -0000 1.22
+++ scheme-data.texi 22 Sep 2002 20:55:22 -0000
@@ -1901,6 +1901,11 @@
 @deffnx {C Function} scm_string_append (args)
 Return a newly allocated string whose characters form the
 concatenation of the given strings, @var{args}.
+@lisp
+(define h "hello ")
+(string-append h "world")
+@result{} "hello world"
+@end lisp
 @end deffn
 
 
@@ -1947,6 +1952,13 @@
 implemented by SCSH, the Scheme Shell. It is intended to be upwardly
 compatible with SCSH regular expressions.
 
+Before the regular expression facilities can be used in a script,
+the following expression must be executed:
+
+@lisp
+(use-modules (ice-9 regex))
+@end lisp
+
 @c begin (scm-doc-string "regex.scm" "string-match")
 @deffn {Scheme Procedure} string-match pattern str [start]
 Compile the string @var{pattern} into a regular expression and compare
@@ -1959,6 +1971,18 @@
 @var{pattern} at all, @code{string-match} returns @code{#f}.
 @end deffn
 
+Two examples of a match are given below.
+The first example matches the four digits in the string,
+and the second example matches nothing.
+
+@lisp
+(string-match "[0-9][0-9][0-9][0-9]" "blah2002")
+@result{} #("blah2002" (4 . 8))
+
+(string-match "[A-Za-z]" "123456")
+@result{} #f
+@end lisp
+
 Each time @code{string-match} is called, it must compile its
 @var{pattern} argument into a regular expression structure. This
 operation is expensive, which makes @code{string-match} inefficient if
@@ -2030,6 +2054,23 @@
 @end table
 @end deffn
 
+@lisp
+;; Regexp to match uppercase letters
+(define r (make-regexp "[A-Z]*"))
+
+;; Regexp to match letters, ignoring case
+(define ri (make-regexp "[A-Z]*" regexp/icase))
+
+;; Search for bob using regexp r
+(match:substring (regexp-exec r "bob"))
+@result{} "" (empty match: "bob" has no uppercase letters)
+
+;; Search for Bob using regexp ri
+(match:substring (regexp-exec ri "Bob"))
+@result{} "Bob" (matched case-insensitively)
+@end lisp
+
+
 @deffn {Scheme Procedure} regexp? obj
 @deffnx {C Function} scm_regexp_p (obj)
 Return @code{#t} if @var{obj} is a compiled regular expression,
@@ -2061,9 +2102,25 @@
 the regexp match is written.
 @end itemize
 
-@var{port} may be @code{#f}, in which case nothing is written; instead,
+The @var{port} argument may be @code{#f}, in which case
+nothing is written; instead,
 @code{regexp-substitute} constructs a string from the specified
 @var{item}s and returns that.
+
+The following example takes a regular expression
+that matches a standard YYYYMMDD date such as 20020828.
+The @code{regexp-substitute} then returns a string from the
+match structure containing the fields and text from
+the original string re-ordered and split out.
+
+@lisp
+(define datere "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
+(define s "Date 20020429 12am.")
+(define sm (string-match datere s))
+(regexp-substitute #f sm 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
+@result{} "Date 04-29-2002 12am. (20020429)"
+@end lisp
+
 @end deffn
 
 @c begin (scm-doc-string "regex.scm" "regexp-substitute")
@@ -2090,6 +2147,17 @@
 present among the @var{item}s, then @code{regexp-substitute/global}
 will return after processing a single match.
 @end itemize
+
+So, the example above for @code{regexp-substitute} could be re-written
+to remove the @code{string-match} stage.
+
+@lisp
+(define datere "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
+(define s "Date 20020429 12am.")
+(regexp-substitute/global #f datere s
+                          'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
+@result{} "Date 04-29-2002 12am. (20020429)"
+@end lisp
 @end deffn
 
 @node Match Structures
@@ -2124,26 +2192,68 @@
 @var{n}. Submatch 0 (the default) represents the entire regexp match.
 If the regular expression as a whole matched, but the subexpression
 number @var{n} did not match, return @code{#f}.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:substring s)
+@result{} "2002"
+
+;; match starting at offset 6 in the string
+(match:substring
+  (string-match "[0-9][0-9][0-9][0-9]" "blah987654" 6))
+@result{} "7654"
+@end lisp
+
 @end deffn
 
 @c begin (scm-doc-string "regex.scm" "match:start")
 @deffn {Scheme Procedure} match:start match [n]
 Return the starting position of submatch number @var{n}.
+
+In the following example, the result is four since the
+match started at character index four.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:start s)
+@result{} 4
+@end lisp
 @end deffn
 
 @c begin (scm-doc-string "regex.scm" "match:end")
 @deffn {Scheme Procedure} match:end match [n]
 Return the ending position of submatch number @var{n}.
+
+In the following example, the result is eight since the match
+is between characters four and eight (i.e., the 2002).
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:end s)
+@result{} 8
+@end lisp
 @end deffn
 
 @c begin (scm-doc-string "regex.scm" "match:prefix")
 @deffn {Scheme Procedure} match:prefix match
 Return the unmatched portion of @var{target} preceding the regexp match.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:prefix s)
+@result{} "blah"
+@end lisp
 @end deffn
 
 @c begin (scm-doc-string "regex.scm" "match:suffix")
 @deffn {Scheme Procedure} match:suffix match
 Return the unmatched portion of @var{target} following the regexp match.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:suffix s)
+@result{} "foo"
+@end lisp
 @end deffn
 
 @c begin (scm-doc-string "regex.scm" "match:count")
@@ -2156,6 +2266,12 @@
 @c begin (scm-doc-string "regex.scm" "match:string")
 @deffn {Scheme Procedure} match:string match
 Return the original @var{target} string.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:string s)
+@result{} "blah2002foo"
+@end lisp
 @end deffn
 
 @node Backslash Escapes
_______________________________________________
Bug-guile mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-guile
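
P.S. For anyone who wants to check the examples before applying the patch, here is a rough sketch that can be saved as a Guile script. It is not part of the patch. It assumes a Guile build with regular expression support and uses only procedures from (ice-9 regex). The helper `show` and the variable names `datere`, `s`, and `m` are made up for illustration. It also exercises `match:count`, for which the patch adds no example.

```scheme
(use-modules (ice-9 regex))

;; Pattern with three groups: year, month, day (YYYYMMDD).
(define datere "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
(define s "Date 20020429 12am.")

;; string-match returns a match structure, or #f if nothing matches.
(define m (string-match datere s))

;; Hypothetical helper: print a label followed by a value.
(define (show label value)
  (display label)
  (display ": ")
  (write value)
  (newline))

(show "whole match" (match:substring m))    ; "20020429"
(show "year"        (match:substring m 1))  ; "2002"
(show "start"       (match:start m))        ; 5
(show "end"         (match:end m))          ; 13
(show "prefix"      (match:prefix m))       ; "Date "
(show "suffix"      (match:suffix m))       ; " 12am."

;; match:count counts the whole match plus each parenthesized group.
(show "count"       (match:count m))        ; 4

;; Rebuild the string with the date reordered as MM-DD-YYYY.
(show "reordered"
      (regexp-substitute #f m 'pre 2 "-" 3 "-" 1 'post))
```

If `string-match` had returned `#f`, the `match:` accessors would signal an error, so a real script should test `m` before using it.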