On Sat, May 24, 2014 at 3:14 PM, Benjamin R. Haskell <cloj...@benizi.com> wrote:
> On Sat, May 24, 2014 at 3:09 PM, Gregg Reynolds <d...@mobileink.com> wrote:
>>
>> Hi,
>>
>> In working on an ANTLR grammar for Clojure I came across this regex in
>> clojure.lang.LispReader which is used in matchSymbol:
>>
>> symbolPat == [:]?([\\D&&[^/]].*/)?(/|[\\D&&[^/]][^/]*)
>>
>> Look at the first part of the second group:
>>
>> /|[\\D&&[^/]]
>>
>> Am I missing something or is that equal to \\D?
>
>
> That would be equal to \\D, but you're missing that /|[\\D&&[^/]][^/]* is
> the alternation of / and [\\D&&[^/]][^/]* rather than ( the alternation of /
> and [\\D&&[^/]] ) concatenated with [^/]*
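To make the precedence point concrete, here is a minimal sketch using java.util.regex. The class and method names (AlternationDemo, matchesAsWritten, matchesMisread) are made up for illustration and do not come from LispReader; the second pattern simply adds an explicit group to model the reading in which '|' would bind more tightly than concatenation.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Clojure.
class AlternationDemo {
    // Second group of symbolPat as written:
    // either a lone '/', or a non-digit, non-slash char followed by non-slash chars.
    static final Pattern AS_WRITTEN = Pattern.compile("/|[\\D&&[^/]][^/]*");

    // The other reading, forced with an explicit group:
    // ('/' or one non-digit, non-slash char), then any run of non-slash chars.
    static final Pattern MISREAD = Pattern.compile("(?:/|[\\D&&[^/]])[^/]*");

    static boolean matchesAsWritten(String s) {
        return AS_WRITTEN.matcher(s).matches();
    }

    static boolean matchesMisread(String s) {
        return MISREAD.matcher(s).matches();
    }

    public static void main(String[] args) {
        // "/foo" is the case that tells the two readings apart.
        for (String s : new String[] {"/", "foo", "/foo", "1foo"}) {
            System.out.println(s + "  as-written=" + matchesAsWritten(s)
                    + "  misread=" + matchesMisread(s));
        }
    }
}
```

Under the pattern as written, "/foo" is rejected because the '/' alternative stands alone; only the explicitly grouped variant accepts it. Both reject "1foo", since the leading character must be a non-digit.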
Aha. So concatenation of [] binds more tightly than '|'? Or maybe it follows
from greedy matching on the second alternative. In any case I made the
opposite assumption.

>
> '/' is special-cased as a symbol. It can only be used (with an optional
> namespace) if it's the only character in the name.

Something doesn't look right.

user=> :a/b/c/d
:a/b/c/d
user=> (namespace :a/b/c/d)
"a"
user=> (name :a/b/c/d)
"b/c/d"
user=> (symbol "x/y/z" "foo")
x/y/z/foo
user=> (type 'x/y/z/foo)
clojure.lang.Symbol
user=> (namespace (symbol "x/y/z" "foo"))
"x/y/z"
user=> (namespace 'x/y/z/foo)
"x"
user=> (name 'x/y/z/foo)
"y/z/foo"
user=> (name (symbol "x/y/z" "foo"))
"foo"
...etc...

The sym regex gets it right - '/' chars are part of the ns string:

user> (def longrgx (re-pattern "[:]?([\\D&&[^/]].*/)?(/|[\\D&&[^/]][^/]*)"))
user> (re-find longrgx "x/y/z/foo")
["x/y/z/foo" "x/y/z/" "foo"]

But it doesn't match a final '/' preceded by a namespace string (re-find
only returns a partial match that drops the trailing '/'):

user> (re-find longrgx "x/y/z/")
["x/y/z" "x/y/" "z"]

unless it's doubled:

user> (re-find longrgx "x/y/z//")
["x/y/z//" "x/y/z/" "/"]

So if the name portion of a symbol cannot contain '/', then why do these
work?

user=> (name 'x/y/z/foo)
"y/z/foo"
user=> (name (symbol "x" "y/z/foo"))
"y/z/foo"

Ok, clojure.core/name calls clojure.lang.Named/getName, implemented by
clojure.lang.Symbol.

Conclusion: it looks like there is an inconsistency between the Symbol regex
and matchSymbol processing, on the one hand, and Symbol.intern on the other.
Symbol.intern is called by matchSymbol and splits the passed string at the
first '/' to set its name and namespace fields:

    /* in clojure.lang.Symbol */
    static public Symbol intern(String nsname){
        int i = nsname.indexOf('/');
        if(i == -1 || nsname.equals("/"))
            return new Symbol(null, nsname.intern());
        else
            return new Symbol(nsname.substring(0, i).intern(),
                              nsname.substring(i + 1).intern());
    }

I guess this isn't a big problem, since programming continues apace, but it
is confusing and sure looks like a bug from here.
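The mismatch can be seen side by side in a small standalone sketch. The names here (SymbolSplitDemo, splitAtFirstSlash, splitByRegex) are hypothetical; splitAtFirstSlash copies only the indexOf logic from the intern method quoted above, and splitByRegex assumes the namespace is group 1 minus its trailing '/' and that the whole string must match (Matcher.matches), which is stricter than the re-find calls above.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Clojure.
class SymbolSplitDemo {
    // The symbolPat regex quoted from LispReader.
    static final Pattern SYMBOL_PAT =
            Pattern.compile("[:]?([\\D&&[^/]].*/)?(/|[\\D&&[^/]][^/]*)");

    // Same split as the quoted Symbol.intern: everything before the
    // FIRST '/' is the namespace, everything after it is the name.
    static String[] splitAtFirstSlash(String s) {
        int i = s.indexOf('/');
        if (i == -1 || s.equals("/")) {
            return new String[] {null, s};
        }
        return new String[] {s.substring(0, i), s.substring(i + 1)};
    }

    // Split implied by the regex groups (an assumption for illustration):
    // group 1 is the namespace plus its trailing '/', group 2 is the name.
    // Returns null when the whole string does not match.
    static String[] splitByRegex(String s) {
        Matcher m = SYMBOL_PAT.matcher(s);
        if (!m.matches()) {
            return null;
        }
        String ns = m.group(1);
        if (ns != null) {
            ns = ns.substring(0, ns.length() - 1); // drop the trailing '/'
        }
        return new String[] {ns, m.group(2)};
    }

    public static void main(String[] args) {
        String s = "x/y/z/foo";
        System.out.println(Arrays.toString(splitAtFirstSlash(s))); // [x, y/z/foo]
        System.out.println(Arrays.toString(splitByRegex(s)));      // [x/y/z, foo]
    }
}
```

For "x/y/z/foo" the regex groups put every '/' in the namespace, while the first-slash split leaves them in the name, which is the same disagreement the REPL session shows.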
By now I would guess lots of code depends on (name foo) returning a string
with embedded '/'. Or maybe not.

Thanks,

Gregg

-- 
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en