> ; First thing to learn is XML parsing with Clojure.

This part is basically solved. Use the xml library in core
(clojure.xml) if you just need to load XML into a map data structure.
Use the zip library (clojure.zip) if you need to navigate it. Use the
xml library in contrib if you need XPath-style navigation.
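
In case it helps, here is a minimal sketch of the first two options.
The sample document and the helper names (parse-str, child-text,
categories, first-pattern) are made up for illustration; only
clojure.xml/parse and the clojure.zip functions are library calls.
Note that clojure.xml/parse treats a string argument as a URI, so the
sketch wraps the text in an input stream instead.

```clojure
(require '[clojure.xml :as xml]
         '[clojure.zip :as zip])
(import 'java.io.ByteArrayInputStream)

;; Hypothetical two-category AIML document, inlined so the sketch is
;; self-contained. A real loader would read the files named in the config.
(def sample-aiml
  (str "<aiml>"
       "<category><pattern>HELLO *</pattern>"
       "<template>Hi there!</template></category>"
       "<category><pattern>BYE</pattern>"
       "<template>See you.</template></category>"
       "</aiml>"))

(defn parse-str
  "Parses an XML string into clojure.xml's nested map form:
  {:tag ..., :attrs ..., :content [...]}."
  [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn child-text
  "Returns the text of the first child of `element` tagged `tag`.
  Assumes that child holds plain text only (no nested elements)."
  [element tag]
  (->> (:content element)
       (filter #(= tag (:tag %)))
       first
       :content
       (apply str)))

(defn categories
  "Returns one {:pattern ... :template ...} map per <category> element."
  [root]
  (for [c (:content root)
        :when (= :category (:tag c))]
    {:pattern  (child-text c :pattern)
     :template (child-text c :template)}))

(defn first-pattern
  "Zipper navigation: root -> first <category> -> its <pattern> -> text."
  [root]
  (-> (zip/xml-zip root) zip/down zip/down zip/down zip/node))

;; Usage
(def root (parse-str sample-aiml))
(categories root)     ; seq of pattern/template maps, in file order
(first-pattern root)  ; the text of the first <pattern>
```

The plain-map version is enough for walking categories one by one; the
zipper only starts to pay off once you need to move up and sideways in
the document.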

For the rest of it... it looks very well thought out. I'm not familiar
enough with the problem domain to comment specifically, but obviously
you're putting a lot of thought into it, so I'm pretty sure you won't
have any problems.

-Luke

On May 8, 12:50 pm, dhs827 <scheur...@gmail.com> wrote:
> I'm stealing knowledge left and right (just ask me :-) to design me an
> AIML pattern matcher. I've compiled a draft list of objects and
> behaviors, which I would like to see reviewed for plausibility:
>
> startup
>         - opens configuration file (e.g. startup.xml)
>         - passes configuration file to bot-loader object
> bot-loader
>         - loads general input substitutions (spelling, person)
>         - loads sentence splitters (.;!?)
>         - looks for enabled bot, takes first one it finds
>         - reads bot-id; uses it as key (for saving/loading variables and
> chatlogs)
>         - loads bot properties (global constants, e.g. name)
>         - passes control to aiml-loader object
> aiml-loader
>         - loads list of AIML files to load, and for each file
>                 - opens file
>                 - reads AIML categories (XML) one by one as they appear in the file
>                         - parses and stores the content of the match path (e.g. "BOTID * INPUTPATTERN * CONTEXT1 * CONTEXT2 *")
>                         - when it reaches the end of the category - the template, or leaf of this branch of the tree
>                                 - calls a method to store the elements of the match path, together with the template, in the pattern-matcher-tree
>
> ; First thing to learn is XML parsing with Clojure.
>
> ; Though it is probably the easiest thing to do, it is not necessary
> for the templates to be stored along with the paths in the tree. They
> might as well be left on disc or in a database.
>
> ; A function like parser/scan must advance the parse to the next part
> of the document (element - element content - processing
> instruction...) and tokenize it. I can then use case/switch/if (must
> look at what Clojure offers) to make decisions/set variables/call
> methods.
>
> ; The whole path, with all components, gets created at load time. The
loader combines all elements of the path (e.g. INPUTPATTERN * CONTEXT1
* CONTEXT2 *) into one string, separating the components using special
context-id strings (e.g. <input>, <context1>, <context2>)
>
> ; The idea of the AIML graphmaster is: take this string, separate it
into words, then store these words as nodes in a tree.
>
> ; A variation of this idea: instead of keying the nodes by their
> values, key them first by context, then by value.
>
> ; Now that the bot is up and running, the user types something into
> the input box and hits Enter. The
>
> pre-processor
>         - protects sentences
>         - blocks common attack vectors, e.g. code injection, flooding
>         - eliminates common spelling mistakes
>                 - for each loaded substitution
>                         - finds and replaces it in the input string
>                 - alternatively, uses a tree to search for them
>         - removes redundant whitespace
>         - splits input into sentences (everything that follows is for each
> sentence)
> pattern-matcher
>         - combines INPUTPATTERN * CONTEXT1 * CONTEXT2 * into one string
>         - tokenizes the "path to be matched" into the individual words
> (nodes)
>         - traverses the tree from the root; first
>                 - tries matching underscore (_) wildcards
>                         - matching of wildcards is recursive
>                                 - match one word of the current path component
>                                 - try remainder against child node
>                                 - if the whole remaining input matches
>                                 - and if the last node is a leaf
>                                         - return the template
>                                 - else try 2 words, then 3
>                                 - if all words in the string are used up and the current node is a leaf
>                                         - return the template
>                                 - else stop matching underscores, and
>                 - tries matching exact words in alphabetical order
>                         - if there is a child node that equals the input word, recurse a level deeper
>                                 - if at the next level there is a leaf, return the template
>                                 - else
>                 - tries matching the star (*) wildcard
>         - when a complete path was matched, creates a
> match-object
>         - holds information about the match
>                 - the input (sentence)
>                 - the template
>                 - the strings matched to the wildcards
>
> This first project should end there, with the template just returning
> the values in the match-object. From there, the non-AIML aspects - the
> new stuff - of the concept would be foregrounded.
>
> Does this make sense to the casual observer?
>
> Which known Clojure libraries should I be learning first?
>
> Other comments, tips, disses?
>
> Dirk
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---
