Re: Idea for personal Clojure project

2010-07-31 Thread Martin DeMello
On Sat, Jul 31, 2010 at 5:41 AM, Gregg Williams wrote: > > I've begun work on a visual front-end to display such infocards, using > Clojure and the Piccolo graphics library (http://piccolo2d.org/). If > you (or anybody else reading this) find this larger project > interesting, please contact me by

Re: Idea for personal Clojure project

2010-07-30 Thread Gregg Williams
Daniel (and anyone else reading this), I would like to correspond with you because I'm working on a project of which your "word graphing" is a subset. I invented a "standardized" electronic notecard (see http://infoml.org), with the idea that writers and others could dump "chunks" of information (

Re: Idea for personal Clojure project

2010-07-29 Thread Savanni D'Gerinel
On Thu, 2010-07-29 at 10:11 -0400, rob levy wrote: > Also, most of NLTK works in Jython*, and by extension in Jython > running in Clojure ( which is why I started writing a convenience > wrapper to make it easier to use python libraries: > http://code.google.com/p/clojure-python/ ). > > *Actuall

Re: Idea for personal Clojure project

2010-07-29 Thread rob levy
I think that a big part of the problem is that most approaches to word similarity (especially thesaurus-based approaches like Wordnet, but also the significantly better distributional approaches) use very impoverished representations of knowledge. As such, they are unable to make useful inferences

Re: Idea for personal Clojure project

2010-07-29 Thread Michael Harrison (goodmike)
As others have said, this is a difficult problem, but a fascinating one too. I'm currently nibbling on building some grouping-by-similarity algorithms for Clojure, although I'm sticking to numerical criteria for similarity or "distance". New developments in text analysis and the Learning by Readin

Re: Idea for personal Clojure project

2010-07-29 Thread bOR_
I think there were some talks about this at the conference I went to recently. Keywords might be "natural language processing". Linked are the abstracts of the conference, which you might find useful. http://www.insna.org/PDF/Sunbelt/4_ProgramPDF.pdf One alternative I briefly considered is to

Re: Idea for personal Clojure project

2010-07-29 Thread lance bradley
I've done quite a lot of work in this area, although not in Clojure. As Mark mentioned, WordNet is definitely a good place to start, but it's short on proper nouns, which reduces its utility when analyzing natural language. I ended up extending WordNet by data mining Wikipedia dumps. The re

Re: Idea for personal Clojure project

2010-07-29 Thread Lee Hinman
On Wed, Jul 28, 2010 at 2:58 PM, Daniel wrote: > I want to write a clojure program that searches for similarities of > words in the english language and places them in a graph, where the > distance between nodes indicates their similarity.  I don't mean > syntactical similarity.  Related contextua

Re: Idea for personal Clojure project

2010-07-29 Thread Savanni D'Gerinel
What you describe is not Clojure-specific, so... Check out the NLTK project. It is all in Python, and the whole book is written for learning to use the tools in Python. However, it also contains a lot of discussion of Natural Language Processing in general. http://www.nltk.org/book I, mysel

Re: Idea for personal Clojure project

2010-07-29 Thread bOR_
Just went to a conference where some people were working on that, if I remember correctly. Keywords like "natural language processing" are handy to know :-). http://www.insna.org/PDF/Sunbelt/4_ProgramPDF.pdf Anyway, for the practical part. I found using java processing library in combination with t

Re: Idea for personal Clojure project

2010-07-29 Thread Jonah Benton
As others have said, there isn't an algorithm that does this. Useful results depend on precise definitions of "context" and "similarity." The waters get deep quickly. As a clojure exercise, though, there are lots of good starting points. For instance: get a set of words, create all pairs from the

Re: Idea for personal Clojure project

2010-07-28 Thread Cameron Pulsford
A very good starting point for edit distances between words, and some related material, is Peter Norvig's site: http://norvig.com/spell-correct.html Also, try to find some Wikipedia articles about the BM25 ranking algorithm. I used Clojure for an assignment at school that

Re: Idea for personal Clojure project

2010-07-28 Thread Daniel E. Renfer
On 7/28/10 5:34 PM, Mark Engelberg wrote: > Wordnet is the main existing thing that comes to mind as related to your > idea. > You might also want to look into Freebase. Here's a Clojure client you can use to query their data. http://github.com/rnewman/clj-mql

Re: Idea for personal Clojure project

2010-07-28 Thread Luke VanderHart
This is a hard problem. If you go by degrees and shades of synonymity, it can be (and has been) done manually - see Visual Thesaurus (http://www.visualthesaurus.com/). But for grouping based on the same semantic topics - that's pretty difficult. You could do it based on co-location in a corpus, but

Re: Idea for personal Clojure project

2010-07-28 Thread Mark Engelberg
Wordnet is the main existing thing that comes to mind as related to your idea.

Idea for personal Clojure project

2010-07-28 Thread Daniel
I want to write a Clojure program that searches for similarities of words in the English language and places them in a graph, where the distance between nodes indicates their similarity. I don't mean syntactical similarity. Related contextual meaning is closer to the mark. For instance: "fish" a