On 20101109_071001, ~Stack~ wrote:
> Hello everyone!
>
> I ran into a strange issue with grep and I was hoping someone could
> explain what I feel is an oddity.
>
> I was trying to match a word that starts with either a _ or a letter
> followed by any number of _, letters, or numbers. (eg: Good = Asdf1,
> _aSD1. Bad: 9_asD ). My test text file is just those three examples,
> each on a new line.
>
> I first tested with this:
> [_a-zA-Z][_a-zA-Z0-9]
>
> But that would match against 9_asD which begins with a number (not what
> I wanted). So I tried:
> [_a-zA-Z][_a-zA-Z0-9]*
>
> I realize that the expression won't do what I mistakenly thought I
> wanted it to do. What is puzzling to me is that my hard disk usage
> peaked, my cpu jumped, and grep took almost two minutes to return an
> exit code of 1 (no match). :-/
>
> At first I thought it may be an issue with Debian Squeeze (current box)
> so I tried it on Debian Lenny with similar results. Same for an Ubuntu
> Lucid and Fedora 10. So I am pretty sure it is something with grep and
> not just the version of grep.
>
> I was hoping someone might know why grep behaves so oddly with that
> expression. If it was a monster file or something I could understand
> the system utilization peak, but it is just three lines in a text file.
>
> Just so you know, I have a working solution. In my case, every instance
> is on a new line so I have a working expression using:
> ^[_a-zA-Z][_a-zA-Z0-9]*$

This last expression anchors the match to the beginning of a line. To
anchor it to the beginning of a word instead, you need:

\<[_a-zA-Z][_a-zA-Z0-9]*$

but this will only work if you agree with the implementers of grep about
what defines the beginning of a word. What is your definition? Look in
'man grep' for clues as to where you can find the grep implementers'
official definition. I found '\<' in 'man grep' under 'The Backslash
Character and Special Expressions'.

HTH
-- 
Paul E Condon
pecon...@mesanetworks.net
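
For anyone who wants to try the two anchored expressions side by side, here is a minimal sketch. It assumes GNU grep (for '\<') and uses a file name, test.txt, that is only a placeholder for the three-line test file described in the original message.

```shell
# Hypothetical test file holding the three examples from the original post.
printf '%s\n' 'Asdf1' '_aSD1' '9_asD' > test.txt

# Single-quote the pattern so the shell hands it to grep unchanged
# (unquoted brackets and * are subject to shell filename expansion).

# Line-anchored form from the original post:
grep '^[_a-zA-Z][_a-zA-Z0-9]*$' test.txt

# Word-anchored form; \< matches the empty string at the start of a word:
grep '\<[_a-zA-Z][_a-zA-Z0-9]*$' test.txt
```

With this file, both commands should print Asdf1 and _aSD1 and skip 9_asD. In the second command that is because grep treats the digit 9 as part of the word, so no word begins at the underscore.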