"[EMAIL PROTECTED]" <[EMAIL PROTECTED]> writes:

> So I need to recursively grep a bunch of gzipped files. This can't be
> easily done with grep, rgrep or zgrep. (I'm sure given the right
> pipeline including using the find command it could be done....but
> seems like a hassle).
>
> So I figured I'd find a fancy next generation grep tool. Thirty
> minutes of searching later I find a bunch in Perl, and even one in
> Ruby. But I can't find anything that interesting or up to date for
> Python. Does anyone know of something?
>
> Thanks
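
For reference, the recursive search over gzipped files asked about above can be sketched in a few lines of Python using only the standard library. This is a rough sketch, not any existing tool: the function name grep_gz, the ".gz" suffix test and the UTF-8 decoding are all assumptions made for the example.

```python
import gzip
import os
import re


def grep_gz(root, pattern):
    """Yield (path, line_number, line) for matching lines in *.gz files under root.

    grep_gz is a hypothetical helper for illustration only.
    """
    regex = re.compile(pattern)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Assumption: gzipped files are recognised by their suffix.
            if not name.endswith(".gz"):
                continue
            path = os.path.join(dirpath, name)
            # Text mode; errors="replace" keeps undecodable bytes from
            # aborting the whole scan (assumes roughly UTF-8 content).
            with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, lineno, line.rstrip("\n")
```

Looping over grep_gz("logs", r"timeout") and printing path, line number and line separated by colons gives grep-like output; the directory name and pattern here are made up.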
There must be a million of these scripts out there, maybe one per programmer :-) Here's mine:

http://codespeak.net/svn/user/jjlee/trunk/pygrep/

It doesn't do zip files. It has the usual file / dir blacklisting feature (for avoiding backup files, etc.). Oddities of this particular script are support for searching for Python tokens in .py files, doctests, doctest files, and preppy 2 .prep template files. It also outputs in a format that allows you to click on matches in emacs. A few years back I was going to release it in the hope that other people would write plugins for other templating systems, but then I stopped doing lots of web stuff.

Actually, tokenizing based on a simple fixed "word boundary" rule seems to work just as well in many cases (pygrep doesn't do that) -- though sometimes proper tokenization can be quite handy -- searching for a particular Python name, Python string or number can be just what's needed (pygrep does support that -- e.g. <no options>, -sep, -sebp, -nep). Most of the time I just use the -t option though, which is plain substring match, because it's fast and good enough for most cases (most search strings are longish and so don't give lots of false positives).

The default is tokenized search for files it knows how to tokenize (.py, .prep, etc.) and substring match for every other file that's not blacklisted -- I find this good for small projects, but too slow (there's no caching) for large projects.

Somebody at work has a nice little web-based tool that you can run as a local server, and that turns tokens (e.g. Python names -- but it's based on some fast simple tokenizer that doesn't know about Python) into links you can click on. The CSS is written so the link styling doesn't show up until you hover the mouse over a token, IIRC. It seems very efficient for exploring/reading and navigating source code -- the only reason I don't use it is that it's not integrated with emacs.
It would be great if somebody could do the same in emacs, with back / forward buttons :-)


John
--
http://mail.python.org/mailman/listinfo/python-list
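
To illustrate the difference between tokenized and substring search discussed above, here is a minimal sketch using the standard library's tokenize module. It is not pygrep's code and says nothing about how pygrep is implemented; find_name and find_substring are hypothetical names made up for the example.

```python
import io
import tokenize


def find_name(source, name):
    """Return the (row, col) start of each NAME token equal to name."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only identifier/keyword tokens count; text inside strings and
        # comments arrives as STRING and COMMENT tokens and is skipped.
        if tok.type == tokenize.NAME and tok.string == name:
            hits.append(tok.start)
    return hits


def find_substring(source, text):
    """Return (row, col) of every occurrence of text; rows are 1-based."""
    hits = []
    for row, line in enumerate(source.splitlines(), start=1):
        col = line.find(text)
        while col != -1:
            hits.append((row, col))
            col = line.find(text, col + 1)
    return hits


# Made-up sample: "count" also appears inside another name, a comment
# and a string literal.
sample = 'count = 1\nrecount = count + 1  # count again\nmsg = "count"\n'
name_hits = find_name(sample, "count")
substring_hits = find_substring(sample, "count")
```

On this sample the token search reports only the two real uses of the name, while the substring search also hits "recount", the comment and the string literal. With longer search strings those extra hits mostly disappear, which is why plain substring matching is often good enough.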