On Tue, Nov 19, 2013 at 7:53 PM, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> Aoilegpos for aidnoptg a cdocianorttry vwpiienot but, ttoheliacrley
> spkeaing, lgitehnneng the words can mnartafucue an iocnuurgons
> samenttet that is vlrtiauly isbpilechmoenrne.

isbpilechmoenrne. I totally want to find an excuse to use that word somewhere. It just looks awesome.

Paradoxically, it's actually more likely that a computer can figure out what you're saying here. In fact, I could easily write a little script that reads /usr/share/dict/words (or equivalent) and attempts to decode your paragraph. Hmm. You know what, I think I will. It's now 0958 UTC, let's see how long this takes me.

Meh. I did something stupid and decided to use a regular expression. It's now 1020 UTC, so that's 21 minutes of figuring out what I was doing wrong with the regex and 1 minute solving the original problem. But here's your translated paragraph:

-- cut --
Interestingly I'm studying this controversial phenomenon at the Department of Linguistics at Absytrytewh University and my extraordinary discoveries wholeheartedly contradict the picsbeliud findings regarding the relative difficulty of instantly translating sentences. My researchers developed a convenient contraption at hnasoa/tw.nartswdbvweos/utrtek:p./il that demonstrates that the hypothesis uniquely warrants credibility if the assumption that the preponderance of your words is not extended is unquestionable. Apologies for adopting a contradictory viewpoint but, theoretically speaking, lengthening the words can manufacture an incongruous statement that is virtually incomprehensible.
-- cut --

It couldn't figure out "Absytrytewh", "picsbeliud", or "hnasoa/tw.nartswdbvweos/utrtek:p./il". That's not a bad result. (And as a human, I'm guessing that the second one isn't an English word - maybe it's Scots?)
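
The trick the decoder leans on can be shown in isolation: a scrambled word keeps its first letter, its last letter, and the same set of middle letters, so sorting the middle gives the scrambled form and the real word an identical lookup key. What follows is only a rough sketch of that idea. The names `scramble_key`, `build_index` and `SAMPLE_WORDS` are made up for illustration, and the tiny word list is an assumed stand-in for /usr/share/dict/words.

```python
def scramble_key(word):
    """Return a key shared by a word and any interior-scrambled version of it."""
    if len(word) < 3:
        return word  # nothing between the first and last letter to shuffle
    return word[0] + "".join(sorted(word[1:-1])) + word[-1]


def build_index(wordlist):
    """Map each key to the set of dictionary words that produce it."""
    index = {}
    for word in wordlist:
        index.setdefault(scramble_key(word), set()).add(word)
    return index


# Tiny stand-in for /usr/share/dict/words (assumption, for illustration only).
SAMPLE_WORDS = ["apologies", "words", "sword", "virtually", "form", "from"]
index = build_index(SAMPLE_WORDS)

print(index.get(scramble_key("vlrtiauly")))  # a single candidate
print(index.get(scramble_key("form")))       # ambiguous: two words share this key
print(index.get(scramble_key("xyzzy")))      # None, since it is not in the sample list
```

A key that maps to more than one word (like "form" and "from") is a genuine ambiguity that only context can resolve, and a key that maps to nothing is a word the dictionary doesn't know.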

Here's the code:

import re

# Map each "first letter + sorted middle + last letter" key to the set of
# dictionary words that produce it, in lowercase and capitalized form.
words = {}
for word in open("/usr/share/dict/words"):
    word = word.strip().lower()
    transformed = word if len(word) == 1 else word[0] + ''.join(sorted(word[1:-1])) + word[-1]
    words.setdefault(transformed, set()).add(word)
    words.setdefault(transformed.capitalize(), set()).add(word.capitalize())

for line in open("input"):
    line = line.strip()
    # The capturing group makes re.split return the separators too;
    # they simply fail the lookup below and are left alone.
    for word in re.split(r"(\W+)", line):
        try:
            transformed = word if len(word) == 1 else word[0] + ''.join(sorted(word[1:-1])) + word[-1]
            realword = words[transformed]
            if len(realword) > 1:
                realword = repr(realword)  # ambiguous: show every candidate
            else:
                realword = next(iter(realword))
            line = line.replace(word, realword)
        except LookupError:
            # catches three errors, all of which mean we shouldn't translate anything
            pass
    print(line)

Yeah, it's not the greatest code, but it works :)

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list