Adam, if you'd like to try these out I'd be very happy ;)
masses/bayes-testing/README in the SA svn repository
describes how we test new tokenization strategies, in order to
pick the ones that actually _work_. (What really helps can be
quite counterintuitive at times.)
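
To make "tokenization strategy" concrete, here is a rough sketch in Python.
It is not SpamAssassin's actual (Perl) tokenizer, and it is not the
procedure from the README; the function names are made up for illustration.
It just shows two candidate tokenizers producing different tokens from the
same obfuscated text:

```python
import re

def tokenize_plain(text):
    """Baseline (hypothetical): lowercase and split on whitespace."""
    return text.lower().split()

def tokenize_depunct(text):
    """Candidate (hypothetical): also strip punctuation from each token."""
    tokens = []
    for raw in text.lower().split():
        # Drop everything except ASCII letters and digits.
        cleaned = re.sub(r"[^a-z0-9]", "", raw)
        if cleaned:  # skip tokens that were pure punctuation
            tokens.append(cleaned)
    return tokens

sample = "pr,ofe'ssio,nal matters very much"
print(tokenize_plain(sample))    # keeps the obfuscated form as one token
print(tokenize_depunct(sample))  # collapses it to a plain word
```

Whether the second variant actually improves accuracy is the kind of thing
that has to be measured on a real corpus rather than guessed.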
also, there's experime
Adam Katz wrote:
>> vi'aqra pr,ofe'ssio,nal matters very much to your s.e,x
>> be self-satisfied - use vi'aqra sper act,i've
>> vi'aqra prfessional - never forget about your s'e.x
>> test s p a c e d words t w i c e in a line
>> this is an act--i've shown it 5 x, a record!
Ignore the missing /^