On 01/07/2011 05:27 PM, "Peter Alcibiades" <palcibiades-fi...@yahoo.co.uk> wrote:
>
> The case which I'm looking to apply this to is a bit more like the literary
> case. There are a number of texts of which the authorship is definitely known
> and not subject to dispute. There is then one text whose authorship is
> unknown. The question is whether it is probably by one of the known
> authors.
>
> We do also have a case like the Biblical case - where there are texts under
> one signature that we suspect to have come from more than one author, and
> perhaps from the author of the text of primary interest. It would be nice
> to be able to discriminate between authors in this body of work as well.
>

One fairly simple approach that you could certainly implement in LiveCode
involves compressing (zipping) chunks of text separately and combined, then
comparing their lengths. If two chunks of text have a relatively high degree
of similarity, their combined compressed length will be less than that of two
equivalent but dissimilar chunks.

So, in the case of authorship, if you have text from 3 known authors and one
unknown author, you combine the unknown text with each of the known ones and
compare the zipped length of each combined text to the zipped length of the
corresponding individual known text. The combined text that shows the
smallest increase in length relative to the individual length of its known
text is then the one most likely to have both texts authored by the same
person (I hope that makes sense).

Terry...

--
Dr Terry Judd | Senior Lecturer in Medical Education
Medical Education Unit
Melbourne Medical School
The University of Melbourne

_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your
subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
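
The compression comparison described in the reply above can be sketched roughly as follows. This is only an illustration, written in Python with the standard zlib module rather than in LiveCode; the function names (compressed_size, size_increase, rank_candidates) and the sample texts are hypothetical and not taken from the post. It assumes the known texts are of broadly similar length, and note that zlib's DEFLATE only looks back 32 KiB, so very long texts would need to be sampled or chunked.

```python
import zlib


def compressed_size(text: str) -> int:
    """Return the length in bytes of the zlib-compressed UTF-8 text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def size_increase(known: str, unknown: str) -> float:
    """Relative growth in compressed size when `unknown` is appended to `known`.

    A smaller value means the unknown text added little new information,
    i.e. it resembles the known text more closely.
    """
    base = compressed_size(known)
    combined = compressed_size(known + unknown)
    return (combined - base) / base


def rank_candidates(known_texts: dict, unknown: str) -> list:
    """Return (author, relative increase) pairs, most similar author first."""
    scores = {
        author: size_increase(text, unknown)
        for author, text in known_texts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical sample data, purely for illustration.
samples = {
    "author_a": "the quick brown fox jumps over the lazy dog by the river. " * 5,
    "author_b": "colourless green ideas sleep furiously while numbers dance. " * 5,
}
mystery = "the quick brown fox jumps over the lazy dog by the river. "

for author, score in rank_candidates(samples, mystery):
    print(author, round(score, 3))
```

The first author in the ranking is the one whose text grew least when the unknown text was appended. With real documents the differences between authors are likely to be much smaller than in this toy data, so the result is better read as a hint than as proof.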