On Wed, 2 Jul 2014 12:22:02 -0400 Steve Litt <sl...@troubleshooters.com> wrote:
> Another thing to remember is that the wordlist is no longer ASCII,

An excellent thing in the age of UTF-N.

> cat /usr/share/dict/words | grep -i "$1"

Simplify it:

    grep -i "$1" /usr/share/dict/words

> If you look up ^smor.*rd$, you get nothing. But if you look up
> ^sm.*rd$ you get smörgåsbord. What I'd like to do is get grep to
> think "å" is a hit for "a" and report it, but report it as "å".
> I'll let you know when I figure out how to do that, or do some
> other thing that produces the same result. Prepending LC_ALL=
> either C, C.UTF-8, en_US.utf8, or POSIX, to the grep command,
> didn't do it either.

You can't do it with a plain pattern, because "å" and "a" do not have
the same code in any of these encodings. (But your case is interesting;
maybe a rewritten grep, including conversions, would be of interest.)

> If worst comes to worst and I can't find a way to get grep to do
> this, I'll just put together a substitution table,
> convert /usr/share/dict/words to words.ascii, line for line, search
> words.ascii, get the line number, and pull that line out of words.
> Crude, but effective.

AFAIK, this is the only way to do what you want.

--
To be is to do. -- I. Kant
To do is to be. -- A. Sartre
Do be a Do Bee! -- Miss Connie, Romper Room
Do be do be do! -- F. Sinatra
Yabba-Dabba-Doo! -- F. Flintstone
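
P.S. For what it's worth, the substitution-table idea quoted above could be sketched roughly like this. It is only a sketch under assumptions: the function name lookup_folded is made up for illustration, the table covers just a handful of letters, and it assumes the wordlist stores each accented letter as a single precomposed UTF-8 character.

```shell
#!/bin/sh
# lookup_folded PATTERN [WORDLIST]
# Hypothetical helper (not an existing tool): search an accent-folded
# copy of the wordlist, then print the matching lines from the original
# file, so the accents are still there in the output.
lookup_folded() {
    pattern=$1
    words=${2:-/usr/share/dict/words}
    folded=$(mktemp) || return 1

    # Illustrative substitution table; extend it for the letters you need.
    # One s command per letter, so no multibyte bracket expressions are used.
    sed -e 's/å/a/g' -e 's/ä/a/g' -e 's/á/a/g' \
        -e 's/ö/o/g' -e 's/ó/o/g' \
        -e 's/é/e/g' -e 's/è/e/g' \
        -e 's/ü/u/g' -e 's/ñ/n/g' \
        "$words" > "$folded"

    # grep -n prints "LINE:text"; keep only the line numbers, then let awk
    # print those same line numbers from the original wordlist.
    grep -n -i -- "$pattern" "$folded" | cut -d: -f1 |
        awk 'NR == FNR { want[$1]; next } FNR in want' - "$words"

    rm -f "$folded"
}
```

With that in place, lookup_folded '^smor.*rd$' should report smörgåsbord, provided the wordlist contains it. The folded copy is rebuilt on every call, which is wasteful; keeping a words.ascii file around, as suggested in the quoted text, would avoid that.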