I had an idea today. I wanted to know which functions are used most frequently in Emacs Lisp. I thought I could write a quick script that goes through all the elisp library locations and produces the word-frequency report I want.
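
A script of this kind might be sketched in Python as follows. This is only an illustration, not the actual script linked in this post: the function name `count_words`, the `.el` suffix filter, and the token pattern are all hypothetical choices. It makes no attempt to tell code apart from strings or comments.

```python
import os
import re
from collections import Counter

# Hypothetical token pattern: runs of letters, digits, and punctuation
# commonly found in Lisp symbol names (e.g. "save-excursion", "1+").
TOKEN_RE = re.compile(r"[A-Za-z0-9_+*/<=>!?-]+")


def count_words(root, suffix=".el"):
    """Count token frequencies in every file under `root` ending in `suffix`."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the walk going on oddly encoded files.
            with open(path, encoding="utf-8", errors="replace") as f:
                for line in f:
                    counts.update(TOKEN_RE.findall(line))
    return counts
```

Calling `count_words(some_directory).most_common(20)` would then give the twenty most frequent tokens with their counts, where `some_directory` stands for whichever elisp library location is being scanned.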
I started with a simple program: http://xahlee.org/p/titus/count_word_frequency.py and applied it to a Shakespeare text. Here's a sample result: http://xahlee.org/p/titus/word_frequency.html

Then I wrote a more elaborate one that recurses through directories to work on the elisp code treasury. The code is here: http://xahlee.org/x/count_word_frequency.py and I got a strange result: the word “the” appeared at the top, along with many other English words. I quickly realized that these come from the Lisp functions' doc strings (not comments).

At this point it dawned on me that there's no easy way to work around this unless I write the script in elisp, which has functions that read Lisp code and can easily filter out doc strings.

Originally I planned to use the word-frequency script on Perl, Python, and Java as well as elisp. However, it now seems to me that this task is nigh impossible. Each of these languages has its own doc string syntax. It would be a heavy undertaking to make the word-frequency script work with all of them, since that amounts to writing a parser for each language.

Alternatively, one can write multiple word-frequency scripts, one in each language in question, since most languages have facilities to deal with their own syntax. However, this is still not trivial, and amounts to several programming efforts.

Would anyone be interested in this problem?

PS: bpalmer on #emacs (irc.freenode) wrote an elisp quickie to deal with Lisp, but that program is currently not fully working. See the bottom of http://paste.lisp.org/display/28840

Xah
[EMAIL PROTECTED]
∑ http://xahlee.org/

--
http://mail.python.org/mailman/listinfo/python-list