On Mon, Jul 11, 2016 at 7:30 PM, Thomas Klausner <t...@giga.or.at> wrote:
> On Mon, Jul 11, 2016 at 06:59:25PM +0530, Abhinav Upadhyay wrote:
>> 1. If a word is not in /usr/share/dict/words, don't stem.
>> 2. Look for .Tn macros (and probably other similar macros) and don't stem
>> those.
>
> 3. Don't stem the file names?
> cd /usr/share/man && find . -name *.[0-9] | sed -e "s,.*/,," -e "s/.[0-9]$//" | sort -u |less
>

Thanks, that would be a good starting point too. I guess we will still
have to add a few words to the list manually later, but it should be good
to begin with.

- Abhinav
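
For reference, the quoted pipeline could be wrapped up roughly as below to produce a "do not stem" word list. This is only a sketch under a few assumptions: the function name list_man_names and the output file nostem.txt are made up for illustration, the glob is quoted so the shell hands it to find unexpanded, and the dot in the sed expression is escaped so only a literal ".N" section suffix is stripped.

```sh
#!/bin/sh
# Hypothetical helper: print the unique page names found under a man
# directory (default /usr/share/man), one per line, with the directory
# part and the trailing ".N" section suffix removed.
list_man_names() {
    mandir=${1:-/usr/share/man}
    # Quote the pattern so find sees the glob itself, not a shell expansion.
    find "$mandir" -name '*.[0-9]' |
        sed -e 's,.*/,,' -e 's/\.[0-9]$//' |
        sort -u
}
```

Something like `list_man_names > nostem.txt` would then give a starting list, to which a few words could still be added by hand.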