Hi folks,

First let me say 'Thank you very much' for Clojure - it has enabled me to 
finally take the plunge into learning a Lisp without feeling like I'm 
abandoning a ten year investment in the Java platform and its libraries. 
I bought 'Practical Common Lisp' about eighteen months ago and read it 
half-heartedly, never making it to a REPL; however I now have the luxury 
of working on a project of my own and decided the time was right to 
revisit the promised land of Lisp. I have been aware of Clojure for 
about six months, and, given my experience of Java, it seemed like the 
obvious place to start. I've spent the past couple of weeks devouring 
lots of general Lisp-related documentation, Rich's Clojure webcasts and 
Stuart's 'Programming Clojure' book. I am pleased to report that my Lisp 
epiphany occurred yesterday when I found I could understand this macro 
pasted to lisp.org by Rich:

(defmacro defnk [sym args & body]
 (let [[ps keys] (split-with (complement keyword?) args)
       ks (apply array-map keys)
       gkeys (gensym "gkeys__")
       letk (fn [[k v]]
              (let [kname (symbol (name k))]
                `(~kname (~gkeys ~k ~v))))]
   `(defn ~sym [~@ps & k#]
      (let [~gkeys (apply hash-map k#)
            ~@(mapcat letk ks)]
        ~@body))))

I am now spoiled forever, and although my powers are weak, I realise I 
have become one of those smug Lisp types condemned to look down on all 
other programming languages for the rest of their lives. Thank-yous and 
cliched tales of enlightenment dispensed with, I can now move on to the 
plea for aid :)

My current project involves semantic web technologies, at this point 
specifically SPARQL queries over RDF triple stores (for those unfamiliar 
with SPARQL it will suffice to say that it's a straightforward query 
language modelled after SQL, but operating over graphs rather than 
tables). Like their SQL counterpart, the Java APIs for performing these 
queries are verbose and cumbersome, and I am hoping that they can be 
hidden behind some cunning Lisp macros to provide an expressive and 
tightly integrated semantic query capability (similar to Microsoft's 
LINQ). Unfortunately my ambitions far exceed my skills at this point, 
and I am hoping to garner some gentle mentoring to steer me in the right 
direction from the outset.

My first inclination is to start with the simplest thing which will 
work, which is to create a function which takes a SPARQL query string as 
an argument and returns a list of results:

(sparql "
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE
{
   ?person foaf:mbox \"mailto:adam-cloj...@antispin.org\" .
   ?person foaf:name ?name
}")

However this style is poor for several reasons: it looks ugly, quotation 
marks have to be escaped manually, and interpolation of variables into 
the query string is a chore which further decreases readability. What I 
really want is a nice DSL:

(let [mbox "mailto:adam-cloj...@antispin.org"]
 (sparql
   (with-prefixes [foaf "http://xmlns.com/foaf/0.1/"]
     (select ?name
       (where
         (?person foaf:mbox mbox)
         (?person foaf:name ?name))))))

Clearly this is going to require a macro, because for a start I don't 
want the symbols representing SPARQL capture variables (the ones 
starting with '?') to be evaluated - I want to take the name of the 
symbol, '?' and all, and embed it into the query string which this DSL 
will ultimately generate before calling into the Java API. On the other 
hand, I do want some things evaluated - I want to embed the value 
('mailto:adam-cloj...@antispin.org') bound to the symbol 'mbox' in the 
query string, not the name of the symbol.

From my position of total ignorance, I can see two broad approaches to 
tackling this. The first is to implement (sparql ...) as a macro which 
is responsible for interpreting the entire subtree of forms below it, 
building a query string by selectively evaluating some forms whilst 
using others as navigational markers which give context. It would honour 
the grammar which defines SPARQL queries, and either signal an error or 
be guaranteed to generate syntactically correct queries. The macro would 
also have insight into the format of the data which would be returned 
(gleaned from the 'select' part) and so could return something useful 
like a list of maps where the keys are the names of the capture 
variables that appear in the select clause. I have no idea how to do 
this, but it feels like the 'right' way.
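One small piece of this first approach can be sketched without committing to a design: reading the capture variables out of the 'select' form and using them as keys for the result rows. The helper names below (capture-var?, select-vars, rows->maps) are hypothetical, and the assumption that the Java API's results can be turned into vectors of values is exactly that, an assumption.

```clojure
;; Hypothetical helpers; none of these names come from a real SPARQL library.

(defn capture-var?
  "True for symbols such as ?name."
  [x]
  (and (symbol? x) (= \? (first (name x)))))

(defn select-vars
  "Capture variables listed in a (select ?a ?b ... (where ...)) form."
  [select-form]
  (filter capture-var? (rest select-form)))

(defn rows->maps
  "Pairs each row of values with the selected variables, keyed by the
  variable name without its leading '?'."
  [vars rows]
  (let [ks (map #(keyword (subs (name %) 1)) vars)]
    (map #(zipmap ks %) rows)))

;; The query is held as quoted data, so nothing in it is evaluated.
(def query-form
  '(select ?name ?mbox
     (where (?person foaf:name ?name)
            (?person foaf:mbox ?mbox))))

;; Rows are assumed to arrive as vectors in select order.
(rows->maps (select-vars query-form)
            [["Adam" "mailto:a@example.org"]])
```

A 'sparql' macro could apply the same kind of walk to its whole argument, since it receives the nested forms as unevaluated data.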

The other approach, which is IMO a bit hacky, but within my reach, is to 
define 'with-prefixes', 'select' and 'where' as individual macros whose 
first arguments are expanded into the relevant subcomponent of the query 
string and whose final argument is a string to be appended to the end. 
You then compose these together into the right order to get the complete 
query string:

(with-prefixes [foaf "http://xmlns.com/foaf/0.1/"] "...the select 
statement") would evaluate to "PREFIX foaf: <http://xmlns.com/foaf/0.1/> 
...the select statement"

(select ?name "... the where clause") would evaluate to "SELECT ?name 
WHERE ...the where clause"

and so on. In this case 'sparql' would remain a function which takes a 
string argument, and you build that query string recursively by nesting 
calls to with-prefixes, select & where in the correct order. Whilst this is 
much easier to implement, it has some serious drawbacks - it is up to 
the user to compose these things in the correct order (since everything 
gets flattened to a string at each level of recursion, we can't check 
for errors), and it's no longer possible for the 'sparql' function to 
glean the structure of its return value without reparsing the query 
string it is passed.
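A minimal sketch of this second approach, assuming each macro takes its own arguments followed by a string for the remainder of the query. These definitions are illustrative only, and the let-style vector of prefix/URI pairs is an assumption about how 'with-prefixes' might accept more than one prefix.

```clojure
;; Illustrative definitions only, not a definitive implementation.

(defmacro with-prefixes
  "Expands to code that prepends one PREFIX declaration per
  prefix/URI pair in `bindings` to `body`."
  [bindings body]
  (let [decls (mapcat (fn [[prefix uri]]
                        ;; The prefix symbol is used by name, never evaluated;
                        ;; the URI is left in place as code.
                        ["PREFIX " (name prefix) ": <" uri "> "])
                      (partition 2 bindings))]
    `(str ~@decls ~body)))

(defmacro select
  "Expands to code that prepends SELECT <variable> WHERE to `body`."
  [v body]
  `(str "SELECT " ~(name v) " WHERE " ~body))

(with-prefixes [foaf "http://xmlns.com/foaf/0.1/"]
  (select ?name "{ ?person foaf:name ?name }"))
;; => "PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name WHERE { ?person foaf:name ?name }"
```

Nesting the calls the other way round, (select ?name (with-prefixes ...)), also compiles and quietly produces a malformed query, which illustrates the ordering drawback.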

These problems aside, I decided to take a stab at implementing the 
'where' macro from the second approach because I have to start somewhere 
:) Here is what I have so far:

(defmacro where [& triples]
  `(let [encode# (fn [x#]
                   (cond (and (symbol? x#) (= (first (name x#)) \?)) (name x#)
                         (integer? x#) (str "\"" x# "\"^^xsd:integer")
                         (float? x#) (str "\"" x# "\"^^xsd:decimal")
                         (string? x#) (str "\"" x# "\"")))]
     (apply str
       (interpose " .\n"
         (for [triple# '~triples]
           (apply str
             (interpose " "
               (map encode# triple#))))))))

As you can see, so far it correctly encodes SPARQL capture variables, 
and literal strings, integers and floats:

user=> (print (where (?a ?b 1) (?a ?b 2.0) (?a ?b "string")))
?a ?b "1"^^xsd:integer .
?a ?b "2.0"^^xsd:decimal .
?a ?b "string"

I tried adding '(list? x#) (eval x#)' to the encode cond to make it cope 
with expressions like this:

(where (?a ?b (+ 1 2)))

Unfortunately that results in an unencoded literal '3' in the query 
string instead of the '"3"^^xsd:integer' I was looking for. I tried 
calling encode recursively (despite the obvious infinite recursion issue 
if the eval returns a list) '(list? x#) (encode# (eval x#))' but I got an 
error at runtime:

user=> (where ((+ 1 2)))
java.lang.Exception: Unable to resolve symbol: encode__2605 in this context
clojure.lang.Compiler$CompilerException: NO_SOURCE_FILE:153: Unable to 
resolve symbol: encode__2605 in this context

Clearly this is due to encode# not being bound until after the function 
definition completes, but I have no idea how to fix it yet (is there a 
way to refer to a function during its definition?). I am also struggling 
to get access to variables bound outside the macro:

(let [v "a value"]
 (where (?a ?b v)))

I tried adding '(symbol? x#) (eval x#)' to the encode cond but that gets 
me a complaint about 'v' being unresolvable in this context.
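One possible way around both problems, sketched under assumptions rather than offered as the definitive fix: keep 'encode' as an ordinary named function (a defn, or an anonymous function written as (fn encode [x] ...), can refer to itself by name), and let the macro decide at expansion time which terms are capture variables, leaving every other term in place as code. Because that code ends up inside the caller's own form, locals such as 'v' resolve normally and no eval is needed. The helper name 'term-form' is hypothetical, and prefixed names such as foaf:mbox are not handled here.

```clojure
(defn encode
  "Runtime encoding of a Clojure value as a SPARQL literal."
  [x]
  (cond
    (integer? x) (str "\"" x "\"^^xsd:integer")
    (float? x)   (str "\"" x "\"^^xsd:decimal")
    (string? x)  (str "\"" x "\"")
    :else (throw (IllegalArgumentException.
                   (str "Cannot encode: " (pr-str x))))))

(defn capture-var?
  "True for symbols such as ?name."
  [x]
  (and (symbol? x) (= \? (first (name x)))))

(defn term-form
  "Expansion time: a capture variable becomes its name as a string;
  any other term is left as code wrapped in a call to encode."
  [term]
  (if (capture-var? term)
    (name term)
    (list `encode term)))

(defmacro where [& triples]
  (let [triple-forms (for [t triples]
                       `(apply str (interpose " " [~@(map term-form t)])))]
    `(apply str (interpose " .\n" [~@triple-forms]))))

(let [v "a value"]
  (print (where (?a ?b v)
                (?a ?b (+ 1 2)))))
;; prints:
;; ?a ?b "a value" .
;; ?a ?b "3"^^xsd:integer
```

Here (+ 1 2) is evaluated by the surrounding code before 'encode' sees it, so the result is encoded as an integer like any other value.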

Another problem I face is that there is no enforcement that triples are 
passed - the macro just maps all the values through encode and 
interposes them with spaces, so no error is raised if you have too many 
or too few values to create a valid where clause. I have no idea what 
the proper way of dealing with things like this is in a functional language.
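One common option, shown here only as a sketch, is to validate the shape of the arguments and throw an exception. The function name 'check-triples' is hypothetical. Called from inside a macro body, a check like this would raise its error when the offending form is compiled rather than when the query runs.

```clojure
(defn check-triples
  "Returns triples unchanged, or throws if any element is not a
  three-element sequence."
  [triples]
  (doseq [t triples]
    (when-not (and (sequential? t) (= 3 (count t)))
      (throw (IllegalArgumentException.
               (str "Expected (subject predicate object), got: "
                    (pr-str t))))))
  triples)

;; Well-formed input passes through untouched.
(check-triples '[(?a ?b 1) (?a ?b "string")])

;; (check-triples '[(?a ?b)]) would throw IllegalArgumentException.
```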

So as you can see, if you have made it heroically to the end of this 
email, I am keen but facing a steep learning curve :) I am aware that 
most of my troubles are well-trodden issues that no doubt have thirty-
year-old idiomatic Lisp solutions, and for bothering you with them on a 
Clojure-specific mailing list I apologise. On the other hand, if you 
feel like sharing some wisdom with me, either directly or through 
pointers to resources I would be most grateful. I would also be most 
appreciative to receive comments on the design of the DSL itself, and 
suggestions for the most Lispish/Clojurish implementation thereof. In 
the meantime, it's back to the REPL for me :)

Best Regards,

Adam Harrison

