On Mon, 21 Dec 2015 08:56 pm, Christian Gollwitzer wrote:

> Apfelkiste:Tests chris$ python score_my.py
> -8.74 baby lions at play
> -7.63 saturday_morning12
> -6.38 Fukushima
> -5.72 ImpossibleFork
> -10.6 xy39mGWbosjY
> -12.9 9sjz7s8198ghwt
> -12.1 rz4sdko-28dbRW00u
> Apfelkiste:Tests chris$ python score_my.py 'bnsip atl ayba loy'
> -9.43 bnsip atl ayba loy
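For readers without the script to hand: the source of score_my.py is not shown in this message, so the following is only a rough sketch of the general idea, namely scoring a string by how likely its character pairs are under a model of English. Everything in it is an assumption made for illustration: the names train_bigrams and score, the tiny SAMPLE_TEXT training string, the add-one smoothing, and the choice to average over the bigrams rather than sum them.

```python
import math
from collections import Counter

# Tiny stand-in training text (an assumption for illustration only;
# a usable model would need a large English corpus).
SAMPLE_TEXT = (
    "the quick brown fox jumps over the lazy dog while baby lions play "
    "in the morning sun and the children are at play on saturday"
)


def train_bigrams(text):
    """Count character bigrams, and how often each character starts one."""
    text = text.lower()
    pairs = Counter(zip(text, text[1:]))   # (a, b) -> occurrences of "ab"
    firsts = Counter(text[:-1])            # a -> occurrences of a as first char
    alphabet = set(text)
    return pairs, firsts, alphabet


def score(s, model):
    """Mean natural-log probability per bigram, with add-one smoothing.

    Closer to zero means "more like the training text".
    """
    pairs, firsts, alphabet = model
    s = s.lower()
    if len(s) < 2:
        return 0.0
    # One extra slot, shared by all characters never seen in training.
    vocab = len(alphabet) + 1
    total = 0.0
    for a, b in zip(s, s[1:]):
        total += math.log((pairs[(a, b)] + 1) / (firsts[a] + vocab))
    # Dividing by the bigram count keeps long strings from scoring
    # worse than short ones merely because they are longer.
    return total / (len(s) - 1)


if __name__ == "__main__":
    model = train_bigrams(SAMPLE_TEXT)
    for candidate in ("baby lions at play", "Fukushima", "xy39mGWbosjY"):
        print("%6.2f  %s" % (score(candidate, model), candidate))
```

Two design choices here are worth checking against whatever the real script does, since either could compress the gap between English and gibberish: whether the log-probabilities are summed or averaged over the string length, and how much probability the smoothing hands to bigrams that were never seen in training.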
Thanks Christian and Peter for the suggestion, I'll certainly investigate
this further.

But the scoring doesn't seem very good. "baby lions at play" is 100% English
words, and ought to have a radically different score from (say)
xy39mGWbosjY, which is extremely non-English-like. (How many English words do
you know of with W, X, two Y, and J?) And yet they are only two units apart.
"baby lions..." has a score almost as negative as the authentic gibberish,
while Fukushima (a Japanese word) has a much less negative score.

Using trigraphs doesn't change that:

> -11.5 baby lions at play
> -9.85 Fukushima
> -13.4 xy39mGWbosjY

So this test appears to find that English-like words are nearly as "random"
as actual random strings.

But it's certainly worth looking into.

-- 
Steven