> ; First thing to learn is XML parsing with Clojure.

This is basically done. Use the xml library in core (clojure.xml) if you just need to load XML into a map data structure. Use the zip library (clojure.zip) if you need to navigate it. Use the xml library in contrib if you need to do XPath-style navigation.
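
To make the first two options concrete, here is a minimal sketch. The sample document and the names `sample-aiml`, `parse-xml-string`, `doc` and `pattern-loc` are made up for illustration; only `clojure.xml/parse` and the `clojure.zip` functions come from the core libraries.

```clojure
(require '[clojure.xml :as xml]
         '[clojure.zip :as zip])
(import 'java.io.ByteArrayInputStream)

;; Hypothetical sample document; a real bot would read this from a file.
(def sample-aiml
  (str "<aiml><category>"
       "<pattern>HELLO *</pattern><template>Hi there</template>"
       "</category></aiml>"))

(defn parse-xml-string
  "Parses an XML string into nested maps with :tag, :attrs and :content keys."
  [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

(def doc (parse-xml-string sample-aiml))

;; Plain map access: the root tag and the tags of its children.
(:tag doc)                  ; :aiml
(map :tag (:content doc))   ; (:category)

;; Zipper navigation: root -> first category -> its first child (the pattern).
(def pattern-loc (-> doc zip/xml-zip zip/down zip/down))
(:tag (zip/node pattern-loc))               ; :pattern
(first (:content (zip/node pattern-loc)))   ; "HELLO *"
```

The map form is enough if you only walk the document top-down once; the zipper pays off when you need to move sideways or back up from a node.
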
For the rest of it... it looks very well thought out. I'm not familiar enough with the problem domain to comment specifically, but obviously you're putting a lot of thought into it, so I'm pretty sure you won't have any problems.

-Luke

On May 8, 12:50 pm, dhs827 <scheur...@gmail.com> wrote:
> I'm stealing knowledge left and right (just ask me :-) to design me an
> AIML pattern matcher. I've compiled a draft list of objects and
> behaviors, which I would like to see reviewed for plausibility:
>
> startup
>   - opens configuration file (e.g. startup.xml)
>   - passes configuration file to bot-loader object
> bot-loader
>   - loads general input substitutions (spelling, person)
>   - loads sentence splitters (.;!?)
>   - looks for enabled bot, takes first one it finds
>   - reads bot-id; uses it as key (for saving/loading variables and chatlogs)
>   - loads bot properties (global constants, e.g. name)
>   - passes control to aiml-loader object
> aiml-loader
>   - loads list of AIML files to load, and for each file
>     - opens file
>     - reads AIML categories (XML) one by one as they appear in the file
>       - parses and stores the content of the match path (e.g. "BOTID * INPUTPATTERN * CONTEXT1 * CONTEXT2 *")
>       - when it reaches the end of the category - the template, or leaf of this branch of the tree
>         - calls a method to store the elements of the match path, together with the template, in the pattern-matcher-tree
>
> ; First thing to learn is XML parsing with Clojure.
>
> ; Though it is probably the easiest thing to do, it is not necessary
> for the templates to be stored along with the paths in the tree. They
> might as well be left on disc or in a database.
>
> ; A function like parser/scan must advance the parse to the next part
> of the document (element - element content - processing
> instruction...) and tokenize it. I can then use case/switch/if (must
> look at what Clojure offers) to make decisions/set variables/call
> methods.
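
The loader step quoted above (read the categories one by one, then branch on what was read) might look roughly like the sketch below. Note that it swaps the scanner idea for something simpler: `clojure.xml/parse` builds the whole tree up front, and `case` then dispatches on each child's tag. The helper names (`text-of`, `read-category`, `read-categories`), the sample data, and the `"*"` default for a missing `<that>` are assumptions for illustration, not anything from the original design.

```clojure
(require '[clojure.xml :as xml])
(import 'java.io.ByteArrayInputStream)

;; Hypothetical two-category file; the element names follow AIML.
(def sample-aiml
  (str "<aiml>"
       "<category><pattern>HELLO *</pattern>"
       "<template>Hi there</template></category>"
       "<category><pattern>YES</pattern><that>DO YOU LIKE TEA</that>"
       "<template>Me too</template></category>"
       "</aiml>"))

(defn text-of
  "Concatenates the string children of an element, ignoring nested tags."
  [element]
  (apply str (filter string? (:content element))))

(defn read-category
  "Turns one parsed <category> element into a map, dispatching on child tags."
  [category]
  (reduce (fn [acc child]
            (case (:tag child)
              :pattern  (assoc acc :pattern (text-of child))
              :that     (assoc acc :that (text-of child))
              :template (assoc acc :template (text-of child))
              acc))                ; anything else is skipped
          {:that "*"}              ; assumed default when no <that> is given
          (:content category)))

(defn read-categories
  "Parses an AIML string and returns one map per <category>."
  [xml-string]
  (let [root (xml/parse (ByteArrayInputStream. (.getBytes xml-string "UTF-8")))]
    (map read-category
         (filter #(= :category (:tag %)) (:content root)))))

(read-categories sample-aiml)
;; ({:that "*", :pattern "HELLO *", :template "Hi there"}
;;  {:that "DO YOU LIKE TEA", :pattern "YES", :template "Me too"})
```

`case` covers the switch-style branching asked about; `cond` and `condp` are the usual alternatives when the tests are not compile-time constants.
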
>
> ; The whole path, with all components, gets created at load time. The
> loader combines all elements of the path (e.g. INPUTPATTERN * CONTEXT1
> * CONTEXT2 *) into one string, separating the components using special
> context-id strings (e.g. <input>, <context1>, <context2>)
>
> ; The idea of the AIML graphmaster is: take this string, separate it
> into words, then store these words as nodes in a tree.
>
> ; A variation of this idea: instead of keying the nodes by their
> values, key them first by context, then by value.
>
> ; Now that the bot is up and running, the user types something into
> the input box and hits Enter. The
>
> pre-processor
>   - protects sentences
>   - blocks common attack vectors, e.g. code injection, flooding
>   - eliminates common spelling mistakes
>     - for each loaded substitution
>       - finds and replaces it in the input string
>     - alternatively, uses a tree to search for them
>   - removes redundant whitespace
>   - splits input into sentences (everything that follows is for each sentence)
> pattern-matcher
>   - combines INPUTPATTERN * CONTEXT1 * CONTEXT2 * into one string
>   - tokenizes the "path to be matched" into the individual words (nodes)
>   - traverses the tree from the root; first
>     - tries matching underscore (_) wildcards
>       - matching of wildcards is recursive
>         - match one word of the current path component
>         - try remainder against child node
>           - if the whole remaining input matches
>             - and if the last node is a leaf
>               - return the template
>           - else try 2 words, then 3
>         - if all words in the string are used up and the current node is a leaf
>           - return the template
>       - else stop matching underscores, and
>     - tries matching exact words in alphabetical order
>       - if there is a child node that equals the input word, recurse a level deeper
>         - if at the next level there is a leaf, return the template
>       - else
>     - tries matching the star (*) wildcard
>   - when a complete path was matched, creates a match-object
>     - holds information about the match
>       - the input (sentence)
>       - the template
>       - the strings matched to the wildcards
>
> This first project should end there, with the template just returning
> the values in the match-object. From there, the non-AIML aspects - the
> new stuff - of the concept would be foregrounded.
>
> Does this make sense to the casual observer?
>
> Which known Clojure libraries should I be learning first?
>
> Other comments, tips, disses?
>
> Dirk
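
For what it's worth, the tree and the matching order described in the quoted outline (underscore first, then the exact word, then star) fit naturally into nested Clojure maps. The following is only a sketch under assumptions: every name in it (`tokenize`, `add-path`, `match-node`, `match-sentence`, the `::template` leaf key) is hypothetical, templates are plain strings, and the context markers are left out since they would just be extra words in the path.

```clojure
(require '[clojure.string :as str])

(defn tokenize
  "Splits a sentence or pattern into words."
  [s]
  (str/split (str/trim s) #"\s+"))

(defn add-path
  "Stores template at the end of the path, one nested map per word."
  [tree path template]
  (assoc-in tree (conj (tokenize path) ::template) template))

(declare match-node)

(defn- match-wildcard
  "Lets a wildcard absorb 1, 2, 3... words and matches the rest below it."
  [child words stars]
  (some (fn [n]
          (match-node child
                      (drop n words)
                      (conj stars (str/join " " (take n words)))))
        (range 1 (inc (count words)))))

(defn match-node
  "Tries _ first, then the exact word, then *. Returns a match map or nil."
  [node words stars]
  (if (empty? words)
    ;; All input used up: succeed only if this node is a leaf.
    (when-let [template (::template node)]
      {:template template :stars stars})
    (or (when-let [child (get node "_")]
          (match-wildcard child words stars))
        (when-let [child (get node (first words))]
          (match-node child (rest words) stars))
        (when-let [child (get node "*")]
          (match-wildcard child words stars)))))

(defn match-sentence
  "Builds the match-object: input, template, and wildcard strings."
  [tree sentence]
  (when-let [m (match-node tree (tokenize sentence) [])]
    (assoc m :input sentence)))

(def tree
  (-> {}
      (add-path "HELLO *" "Hi there")
      (add-path "HELLO WORLD" "Hello to you too")
      (add-path "_ PLEASE" "So polite")))

(match-sentence tree "HELLO WORLD")
;; {:template "Hello to you too", :stars [], :input "HELLO WORLD"}
(match-sentence tree "HELLO DEAR BOT")
;; {:template "Hi there", :stars ["DEAR BOT"], :input "HELLO DEAR BOT"}
(match-sentence tree "HELLO PLEASE")
;; {:template "So polite", :stars ["HELLO"], :input "HELLO PLEASE"}
```

Because each node is a map keyed by word, the exact-word step is a single lookup, so the alphabetical ordering in the outline does not come into play here. An input word that is literally `_` or `*` would hit the wildcard child in the exact-word step; a real implementation would probably want distinct keys for the wildcards.
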