> ISTM that because the output of strings is not a discrete list of
> potential words, but is instead a long list of concatenated
> characters, this problem is really rather daunting. The output should
> probably be first broken up into something resembling words by perhaps
> breaking on non-alphabetic characters. That should do two things: 1)
> get you something that resembles words to actually test and 2) give you
> a somewhat smaller set of "stuff" to check.
>
> This won't necessarily handle "compound" words though, where two
> word-like things are jammed together, or an actual word is embedded
> within a string of nonsense.
>
> I think this problem is potentially rather harder than I thought when
> I saw OP's original question.
>
It does not need to be comprehensive. Would it be possible to show only
lines that contain "words" (continuous strings of alphabetic characters)
that are all lowercase except for the first character? That would handle
about 90% of the work by eliminating lines like these:

pDuf
#k0H}g)
GoV5
rLeY1
TMlq,*

-- 
Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
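
For what it's worth, here is a rough sketch of the filter described above, in Python. It is only one possible reading of the idea, not a tested solution. The names has_wordlike_run and filter_lines are made up for illustration, and the minimum length of 3 letters is my own assumption: without some minimum, single letters such as the "k" in "#k0H}g)" would count as words.

```python
import re

# Assumption (not from the original post): a "word" must be at least
# MIN_LEN letters long, otherwise stray single letters would qualify.
MIN_LEN = 3

# A maximal run of ASCII letters whose first letter may be upper- or
# lowercase and whose remaining letters are all lowercase. The lookbehind
# and lookahead make sure the run is not part of a longer run of letters.
WORD_RE = re.compile(
    r"(?<![A-Za-z])[A-Za-z][a-z]{%d,}(?![A-Za-z])" % (MIN_LEN - 1)
)


def has_wordlike_run(line):
    """Return True if the line contains at least one word-shaped run."""
    return WORD_RE.search(line) is not None


def filter_lines(lines):
    """Keep only the lines that contain at least one word-shaped run."""
    return [line for line in lines if has_wordlike_run(line)]
```

Passing the five sample lines above to filter_lines drops all of them, while a line such as "GNU C Library" is kept because of "Library". To use it on real output, the lines produced by strings could be read from sys.stdin and passed to filter_lines. Compound words and real words embedded in nonsense are still not handled.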