Let's say I want to make a small parser for a small language. The
language is made up of values of two kinds: arrays and bits. An array
can only be of the form "[x,y]", where x and y are values. A bit can
only be of the form "0" or "1". In other words, the language's strings
can look like this:
- 0
- 1
- [0,0]
- [1,0]
- [1,1]
- [[0,0],1]

To build the parser, I define three metafunctions that create rules.
A rule is a function that takes a sequence of tokens (in this case,
characters) and returns a vector of two items: the structure parsed so
far and the remaining tokens. If a rule receives tokens that are
invalid for it, it returns nil instead, and that nil propagates back
up the stack of function calls.
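
As a minimal sketch of that contract, here is a hypothetical rule
(any-token is not part of my parser, it is only an illustration) that
accepts any single token and fails on empty input:

```clojure
;; Hypothetical rule, for illustration only: consumes exactly one token.
;; On success it returns [product remaining-tokens]; on failure, nil.
(defn any-token [tokens]
  (when-let [s (seq tokens)]
    [(first s) (next s)]))

(any-token "ab") ; => [\a (\b)]
(any-token "")   ; => nil
```
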

Clojure
user=> (defn conc [& subrules]
         (fn [tokens]
           (loop [subrule-queue (seq subrules)
                  remaining-tokens (seq tokens)
                  products []]
             (if (nil? subrule-queue)
               [products remaining-tokens]
               (let [[subrule-products subrule-remainder :as subrule-result]
                     ((first subrule-queue) remaining-tokens)]
                 (when-not (nil? subrule-result)
                   (recur (rest subrule-queue)
                          subrule-remainder
                          (conj products subrule-products))))))))
#'user/conc
user=> (defn alt [& subrules]
         (fn [tokens]
           (some #(% tokens) subrules)))
#'user/alt
user=> (defn literal [literal-token]
         (fn [tokens]
           (let [first-token (first tokens), remainder (rest tokens)]
             (when (= first-token literal-token)
               [first-token remainder]))))
#'user/literal
user=> (def on (literal \1))
#'user/on
user=> (def off (literal \0))
#'user/off
user=> (def bit (alt on off))
#'user/bit
user=> (bit (seq "1, 0"))
[\1 (\, \space \0)]
user=> (bit (seq "starst"))
nil
user=> (def array-start (literal \[))
#'user/array-start
user=> (array-start (seq "[1, 2, 3]"))
[\[ (\1 \, \space \2 \, \space \3 \])]
user=> (def array-end (literal \]))
#'user/array-end
user=> (def array-sep (literal \,))
#'user/array-sep

Now, value and array are mutually recursive. Because Clojure evaluates
symbols immediately when they appear outside a function body (barring
macros), I have to put either the value rule or the array rule into a
wrapper function. This does not behave as I expect, though:

user=> (declare value)
#'user/value
user=> (defn array []
         ; I'm defining it as a function because otherwise I'd get an
         ; unbound variable exception.
         (conc array-start value array-sep value array-end))
#'user/array
user=> (def value (conc (array) bit))
#'user/value
user=> (def value (alt (array) bit))
#'user/value
user=> (value (seq "0"))
[\0 nil]
user=> (value (seq "[1,0]"))
nil
user=> (value (seq "[0,0]")) ; It should accept it, because (array) accepts it. But it doesn't:
nil
user=> ((array) (seq "[0,0]")) ; This works as intended:
[[\[ \0 \, \0 \]] nil]
user=> (value (seq "[0,3]")) ; This should return nil, but a weird argument exception is raised instead:
java.lang.IllegalArgumentException: Key must be integer (NO_SOURCE_FILE:0)
user=> ((array) (seq "[0,3]")) ; This is what I want:
nil

Can anybody shed light on why (value (seq "[0,0]")) and (value (seq
"[0,3]")) do not work as intended? This is a big problem with my
parser library, since dealing with two or more rules that refer to
each other is a huge pain. I'm considering switching my parser library
to macros if that could fix it more easily somehow.
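
For comparison, here is a minimal sketch of a non-macro approach I
could try. It assumes the trouble comes from (array) capturing whatever
value happens to be bound to at the moment (array) is called, so it
looks the rule up through its var each time tokens arrive instead. The
helper name deferred is hypothetical, array is a plain rule here rather
than a function, and next replaces rest on the assumption of a Clojure
version where rest returns an empty sequence instead of nil:

```clojure
;; Sketch only: conc, alt, and literal restate the rules from the
;; transcript, using `next` so an exhausted sequence is nil.
(defn conc [& subrules]
  (fn [tokens]
    (loop [subrule-queue (seq subrules)
           remaining-tokens (seq tokens)
           products []]
      (if (nil? subrule-queue)
        [products remaining-tokens]
        (when-let [[subrule-products subrule-remainder]
                   ((first subrule-queue) remaining-tokens)]
          (recur (next subrule-queue)
                 subrule-remainder
                 (conj products subrule-products)))))))

(defn alt [& subrules]
  (fn [tokens]
    (some #(% tokens) subrules)))

(defn literal [literal-token]
  (fn [tokens]
    (when (= (first tokens) literal-token)
      [(first tokens) (next tokens)])))

(def bit (alt (literal \1) (literal \0)))

(declare value)

;; Hypothetical helper: dereferences the var only when the rule runs,
;; so it always sees the current definition of the referenced rule.
(defn deferred [rule-var]
  (fn [tokens]
    (@rule-var tokens)))

(def array
  (conc (literal \[) (deferred #'value) (literal \,)
        (deferred #'value) (literal \])))

(def value (alt array bit))

(value (seq "0"))     ; => [\0 nil]
(value (seq "[0,0]")) ; => [[\[ \0 \, \0 \]] nil]
(value (seq "[0,3]")) ; => nil
```

With this arrangement, redefining value later would not leave array
pointing at a stale rule, since the lookup happens on every call.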

