Thanks. I took a look at your site and book and found the chapter on look-aheads. I realized how much I was underutilizing them; they could have saved me a lot of headaches!
> -----Original Message-----
> From: Jeff 'japhy' Pinyan [mailto:[EMAIL PROTECTED]]
> Sent: Friday, October 04, 2002 11:20 AM
> To: Kipp, James
> Cc: [EMAIL PROTECTED]
> Subject: RE: Reg Exp
>
> >> $dna =~ m{
> >>   (?=
> >>     tag
> >>     (?:
> >>       .*? tag
> >>       # the substr(...) is there to avoid using $&
> >>       (?{ push @matches, substr($dna, $-[0], $+[0] - $-[0]) })
> >>     )+
> >>   )
> >>   (?!)
> >> }x;
>
> First of all, I haven't benchmarked, and I had thought of doing the
> index() and substr() approach that J. Krahn demonstrated.
>
> The regex uses (?= ... ) to look ahead, so it can match stuff without
> consuming it.  Here's an example of what I mean:  if I have a string
> "ABCADEFA", and I want all chunks of "A...A", if the regex actually
> CONSUMES the "ABCADEFA", then it will have to start after the last A,
> meaning I've missed the embedded "ADEFA" chunk.  By using a look-ahead,
> I can match text while staying where I am in the string.  Compare:
>
>   print "japhy" =~ /(..)/g;
>
> with
>
>   print "japhy" =~ /(?=(..))/g;
>
> Next, to get all the "tag...tag" chunks of varying lengths, I use
>
>   /tag(?:.*?tag)+/
>
> which matches "tagAtag", "tagAtagBtag", "tagAtagBtagCtag", and so on.
>
> The real magic is the code block (?{ ... }) that does the dirty work.
> First of all, substr($dna, $-[0], $+[0] - $-[0]) is just a way of
> accessing $& without incurring the penalties associated with it.  So
> let's just use $& for now.  The code (push @matches, $&) is executed
> after every point that the regex has matched up to an occurrence of
> "tag", so in
>
>   tagTHIStagTHATtagTHOSEtag
>
> it'll happen at:
>
>   tagTHIStag X
>   tagTHIStagTHATtag X
>   tagTHIStagTHATtagTHOSEtag X
>   tagTHATtag X
>   tagTHATtagTHOSEtag X
>   tagTHOSEtag X
>
> those six locations.  The last thing in the regex is the (?!), which is
> a negative look-ahead for nothing, which ALWAYS fails.  This forces the
> regex to backtrack, so I get all the matches.
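
For anyone who wants to try the two ideas from the message above, here is a small self-contained sketch. It only restates the quoted examples as a runnable script; the variable names @consumed and @overlap are my own, and the sample string is the one used in the explanation. It assumes a reasonably recent perl, and the expected results in the comments follow from the explanation above, so treat it as an illustration and not as a benchmarked solution.

```perl
use strict;
use warnings;

# Sample string taken from the quoted explanation.
our $dna = "tagTHIStagTHATtagTHOSEtag";

# Consuming match: each match starts where the previous one ended.
our @consumed = "japhy" =~ /(..)/g;        # ja, ph

# Look-ahead: nothing is consumed, so a match is tried at every position.
our @overlap = "japhy" =~ /(?=(..))/g;     # ja, ap, ph, hy

# Collect every "tag...tag" chunk, including the overlapping ones.
our @matches;
$dna =~ m{
    (?=
        tag
        (?:
            .*? tag
            # record the text matched so far from the offset arrays
            (?{ push @matches, substr($dna, $-[0], $+[0] - $-[0]) })
        )+
    )
    (?!)    # always fails, so the engine retries at the next start position
}x;

print "@consumed\n";
print "@overlap\n";
print "$_\n" for @matches;    # the six chunks listed in the explanation
```

The overall match never succeeds because of the trailing (?!); the useful work happens as a side effect in the code block, which is why the results are read from @matches and not from the return value of the match.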
>
> --
> Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
> RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
> ** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
> <stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
> [  I'm looking for programming work.  If you like my work, let me know.  ]