Re: Clustering text-documents in bundles

2007-09-25 Thread exhuma.twn
On Sep 25, 7:52 pm, Paul Rubin wrote:
> "exhuma.twn" <[EMAIL PROTECTED]> writes:
> > Is it possible to calculate a distance between two chunks of text? I
> > suppose one could simply do a simple word-count on the chunks
> > (removing common noise words of course). And the

Re: Clustering text-documents in bundles

2007-09-25 Thread Paul Rubin
"exhuma.twn" <[EMAIL PROTECTED]> writes:
> Is it possible to calculate a distance between two chunks of text? I
> suppose one could simply do a simple word-count on the chunks
> (removing common noise words of course). And then go from there. Maybe
> even assigning different weighting to words.
But

Re: Clustering text-documents in bundles

2007-09-25 Thread Paul Hankin
On Sep 25, 4:11 pm, "exhuma.twn" <[EMAIL PROTECTED]> wrote:
> Is it possible to calculate a distance between two chunks of text? I
> suppose one could simply do a simple word-count on the chunks
> (removing common noise words of course). And then go from there. Maybe
> even assigning different weig

Clustering text-documents in bundles

2007-09-25 Thread exhuma.twn
Hi,
This *is* off-topic but with Python being a language with a somewhat scientific audience, I might get lucky ;)
I have a set of documents (helpdesk tickets in fact) and I would like to automatically collect them in bundles so I can visualise some statistics depending on content. A while ago I
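
For readers skimming this thread, the word-count idea quoted in the replies above (count words per chunk, drop noise words, optionally weight some words) could be sketched roughly as below. This is only an illustration, not something taken from the thread: the cosine measure, the tiny `NOISE_WORDS` set, and the names `word_counts` and `cosine_distance` are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny noise-word list; a real one would be much longer.
NOISE_WORDS = {"a", "an", "and", "the", "is", "to", "of", "in", "it", "my", "i"}


def word_counts(text, weights=None):
    """Count words in text, skipping noise words.

    weights is an optional {word: factor} mapping (an assumption of this
    sketch); words not listed there count as 1.0 per occurrence.
    """
    weights = weights or {}
    counts = Counter()
    for word in re.findall(r"[a-z0-9']+", text.lower()):
        if word not in NOISE_WORDS:
            counts[word] += weights.get(word, 1.0)
    return counts


def cosine_distance(text_a, text_b, weights=None):
    """Return 1 - cosine similarity of the two word-count vectors.

    Close to 0.0 means the same word distribution; 1.0 means no shared words.
    """
    a = word_counts(text_a, weights)
    b = word_counts(text_b, weights)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # At least one chunk has no countable words; treat as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_distance("The printer is broken", "printer jammed"))
    print(cosine_distance("The printer is broken", "password reset"))
```

A pairwise distance like this is the kind of input a clustering step would work from; passing a `weights` mapping pulls together chunks that share the words you consider important.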