I love it!

On Tue, Oct 7, 2008 at 4:37 PM, Chouser <[EMAIL PROTECTED]> wrote:

> Ok, I know we've been over this before, but nothing was actually done.
>
> For the record:
>
> http://groups.google.com/group/clojure/browse_thread/thread/81b361a4e82602b7/0313c224a480a161
>
> So here is my attempt to formalize a simple proposal.
>
> The reader should take the literal contents of #"..." and pass them to
> Pattern.compile as a raw string, making no changes to the contents.
> That means all backslashes (\) and double quotes (") would be passed
> right in.  The only other thing the reader need concern itself with
> is that when it sees a \" it should not treat that double-quote as the
> end of the pattern, but rather keep on going until it sees a
> double-quote that is not preceded by a backslash.  Nevertheless it
> would pass both the quoting \ and the following " to Pattern.compile.
>
> That's it. Simple. It works because Java's Pattern itself understands
> backslash quoting, including literal chars like backslash and double
> quote, hex and octal patterns, as well as other regex patterns.
>
> Some examples:
>
> 1. Simple text
> (re-find #"foo" "foo") --> "foo"
>
> 2. Pre-defined character class
> (re-find #"\w*" "[EMAIL PROTECTED]") --> "foo"
>
> 3. Special character (regex and string)
> (re-find #"\t" "\t") --> "\t"
>
> 4. Scary special character (regex only)
> Note that the escape sequences available inside #"" are Java Pattern
> escape sequences, and therefore by definition different from Clojure
> String escape sequences.  Of course this is what you need for \w and
> such to work:
> (re-find #"\a" "\u0007") --> beep ""
>
> 5. Special character (string only)
> The reverse of the previous example -- Clojure strings understand "\b"
> as (str \backspace), but Java patterns do not, so this example uses
> hex instead:
> (re-find #"\x08" "\b") --> "\b"
>
> 6. Hex
> (re-find #"\x31" "1") --> "1"
>
> 7. Octal
> (re-find #"\061" "1") --> "1"
>
> 8. Word boundary:
> (re-find #"\bfoo" "foo") --> "foo"
>
> 9. Quoting fun -- double quote, a single character:
> (re-find #"\"" "\"") --> "\""
>
> 10. Quoting fun -- backslash, a single character:
> (re-find #"\\" "\\") --> "\\"
>
> 11. Open paren
> (re-find #"\(" "(") --> "("
>
> I think this demonstrates you can create any pattern you might need.
> For reference, here are the above patterns expressed in the current
> (not the proposed) reader syntax:
>
> 1. #"foo"
> 2. #"\\w*"
> 3. #"\t" or #"\\t"
> 4. #"\\a" (but #"\a" makes the reader blow up)
> 5. #"\\x08"
> 6. #"\\x31"
> 7. #"\061" or #"\\061"
> 8. #"\\bfoo" (note #"\bfoo" is legal, but doesn't do what you want)
> 9. #"\"" or #"\\\"" (but #"\\"" blows up the reader)
> 10. #"\\\\" (but #"\\" is illegal)
> 11. #"\\(" (but #"\(" is illegal)
>
> Somehow I'm not sure that communicates how much I dislike the current
> syntax.  Oh well, maybe others can chime in on that point. I
> implemented this to provide the examples above, not because I think
> this is a done deal or anything.  Please comment!
>
> Here is a new print method to match the attached patch to LispReader:
>
> (defmethod print-method java.util.regex.Pattern [p w]
>  (.write w "#\"")
>  (.write w (.pattern p))
>  (.write w "\""))
>
> That print method will take a bit more work to properly quote some
> Patterns that could be created by means other than the Clojure
> literal.
>
> --Chouser
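
For anyone following along, the reader rule in the proposal can be sketched in Clojure roughly as below. This is only an illustration, not the attached LispReader patch: `read-pattern-text` is a hypothetical helper name, and it works on a plain string standing in for the reader's character stream. It treats a backslash as always taking the next character with it, which is what example 10 (#"\\") needs in order to stop at the right quote.

```clojure
(import 'java.util.regex.Pattern)

(defn read-pattern-text
  "Hypothetical helper, not the real LispReader code. `s` holds the
  characters that follow the opening #\" of a regex literal. Returns the
  raw pattern text up to (not including) the first double quote that is
  not escaped by a backslash. Backslashes are copied through unchanged."
  [^String s]
  (loop [i 0
         sb (StringBuilder.)]
    (when (>= i (count s))
      (throw (IllegalArgumentException. "EOF while reading regex")))
    (let [c (.charAt s i)]
      (cond
        ;; Unescaped double quote: end of the literal.
        (= c \") (str sb)

        ;; Backslash: keep it AND the character after it, untouched.
        (= c \\) (do
                   (when (>= (inc i) (count s))
                     (throw (IllegalArgumentException.
                             "EOF while reading regex")))
                   (.append sb c)
                   (.append sb (.charAt s (inc i)))
                   (recur (+ i 2) sb))

        ;; Anything else is copied as-is.
        :else (do (.append sb c)
                  (recur (inc i) sb))))))

;; Example 2 from the proposal: the stream holds  \w*"  followed by the
;; rest of the form. (The Clojure string below needs its own escaping.)
(def pattern-text (read-pattern-text "\\w*\" rest-of-form"))

(re-find (Pattern/compile pattern-text) "foo bar") ; => "foo"
```

The text handed to Pattern.compile is exactly what sat between the quotes, so all escape handling is left to Java's Pattern, as the proposal intends.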
>
