We're maintaining a large database of tagged images and needed to do "fuzzy search" over it. The existing search tool takes exact queries only, so we hacked up a little tool that sits between the query source and the engine and transforms each query into a "fuzzy query". You can think of it like this: the input query is something like x AND y and the output is something like (x AND (y OR y2 OR y3)) OR (y AND (x OR x2 OR x3)), where y2 and y3 are in some sense "near" y, and x2 and x3 are "near" x. The query transformation is simple enough, but it was really handy to be able to test it at a REPL, and even to hot-swap modifications and do live testing and tweaking of e.g. the neighborhood graphs used to "fuzzify" query terms.
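For anyone curious, here is a minimal sketch of that kind of transformation. It is not our actual code: the names neighbors, or-group and fuzzify are made up for illustration, the neighborhood graph is a toy map, and it assumes the query is just plain terms separated by AND. The clauses come out in a different order than in the example above, but the result is logically equivalent.

```clojure
(require '[clojure.string :as str])

;; Hypothetical neighborhood graph: each term maps to the terms "near" it.
(def neighbors
  {"x" ["x2" "x3"]
   "y" ["y2" "y3"]})

(defn or-group
  "Returns \"(t OR n1 OR n2)\" for a term and its neighbors,
  or the bare term when it has no neighbors."
  [term]
  (let [alts (cons term (get neighbors term))]
    (if (next alts)
      (str "(" (str/join " " (butlast (interleave alts (repeat "OR")))) ")")
      term)))

(defn fuzzify
  "Builds one clause per term in which only that term may slip to a
  neighbor, then ORs the clauses together (distance-1 matching)."
  [query]
  (let [terms (str/split (str/trim query) #"\s+AND\s+")]
    (->> (range (count terms))
         (map (fn [i]
                ;; Term i becomes an OR group; every other term stays exact.
                (let [clause (assoc terms i (or-group (nth terms i)))]
                  (str "(" (str/join " AND " clause) ")"))))
         (str/join " OR "))))

(println (fuzzify "x AND y"))
;; prints ((x OR x2 OR x3) AND y) OR (x AND (y OR y2 OR y3))
```

Since neighbors is an ordinary var, it can be redefined at the REPL and the next call to fuzzify picks up the change, which is the sort of live tweaking mentioned above.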
(Right now it only allows "slop" in one query term at a time, which makes it a sort of Hamming-distance-1 matcher. Extending it further would invite a combinatorial explosion and lead to noisier, less useful results. Distance 1 seems to be the "sweet spot" for our app.)

The code takes the input query, splits it into terms, and generates seqs of alternatives, with some (butlast (interleave foo (repeat "OR"))) type stuff here and there, then cobbles everything together again using seq functions, str, and java.lang.String methods. (Some query terms need to be parsed first, e.g. a hyphenated entity whose first part should be fuzzy-matchable while the second part stays constant.)

Rather boring? Little things like this incrementally improve services. Two guys hacking like this in their garage went on to start Google. :) We don't have such lofty aspirations, but it's still something to keep in mind.