On Jun 24, 2011, at 11:46 PM, Peter Alcibiades wrote:

> It can be done statistically. Various methods have been proposed and used.
> One general kind of measure is the probability of another word coming, as a
> function of the past n words. Another is to measure the length of gap
> between occurrences of pairs of a given word. There is technical literature
> on it, and I guess LC would permit writing something to do it. Not that its
> the best thing to do it in, that seems to be R, but its what I know.
>
> But it would be nice if someone had already done it, in any language. Save
> a huge lot of work.
> Peter
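
For concreteness, here is a rough sketch of the two measures described above: the gaps between successive occurrences of a given word, and the probability of the next word given the past n words. It is written in Python rather than LC or R, purely for illustration. The function names (tokenize, word_gaps, next_word_probabilities), the simple lowercase tokenizer, and the sample sentence are all assumptions made for this sketch and are not taken from any published method. Whether these numbers actually separate one author from another is a separate question.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_gaps(words, target):
    """Return the distances, in words, between successive occurrences of target."""
    positions = [i for i, w in enumerate(words) if w == target]
    return [b - a for a, b in zip(positions, positions[1:])]


def next_word_probabilities(words, n=1):
    """Estimate P(next word | previous n words) from raw counts, with no smoothing."""
    counts = defaultdict(Counter)
    for i in range(n, len(words)):
        context = tuple(words[i - n:i])
        counts[context][words[i]] += 1
    return {
        context: {w: c / sum(followers.values()) for w, c in followers.items()}
        for context, followers in counts.items()
    }


# Hypothetical sample text, just to exercise the two functions.
sample = "the cat sat on the mat and the dog sat on the cat"
words = tokenize(sample)

gaps = word_gaps(words, "the")                # [4, 3, 4]
probs = next_word_probabilities(words, n=1)   # probs[("the",)]["cat"] is 0.5
```

Both functions make a single pass over the word list, so the running time should grow roughly linearly with the length of the text. For a real comparison you would run them on each candidate text and compare the resulting gap distributions or probability tables.
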
Don't know if anyone has already tackled this kind of thing in LC, but it should be fairly easy to do. (Whether the algorithms actually work to distinguish different authors is something I know nothing about.) The gap between pairs of a given word, in particular, is nearly trivial. The question would be speed, and since LC is blindingly fast at processing text strings, I'd be optimistic about that, unless you're talking really huge texts.

-- Peter

Peter M. Brigham
[email protected]
http://home.comcast.net/~pmbrig

_______________________________________________
use-livecode mailing list
[email protected]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
