On Sep 25, 4:33 pm, [EMAIL PROTECTED] (Rob Dixon) wrote:
> Jonathan Lang wrote:
> > Rob Dixon wrote:
> >> Jonathan Lang wrote:
> >>> I'm trying to devise a regex that matches from the first double-quote
> >>> character found to the next double-quote character that isn't part of
> >>> a pair; but for some reason, I'm having no luck.  Here's what I tried:
> >>>
> >>>   /"(.*?)"(?!")/
> >>>
> >>> Sample text:
> >>>
> >>>   author: "Jonathan ""Dataweaver"" Lang" key=val
> >>>
> >>> What I'm getting for $1 in the first match:
> >>>
> >>>   Jonathan "
> >>>
> >>> What I'm looking for:
> >>>
> >>>   Jonathan ""Dataweaver"" Lang
> >>>
> >>> What did I miss, and how can I most efficiently perform the desired match?
> >>
> >> Your regex looks for the first double-quote and then captures everything
> >> after that up to the first subsequent double-quote that isn't followed
> >> immediately by another one. The second quote of the pair before
> >> 'Dataweaver' matches this criterion so your regex captures up to the
> >> character before it.
> >>
> >> This:
> >>
> >>   $str =~ /"((?:.*?"")*.*?)"/;
> >>
> >> should do what you want. After finding the first double-quote it captures
> >> all following sequences ending in a pair of double quotes, plus anything
> >> after those up to the closing quote.
> >
> > Ah.  I had tried /"((.*?"")*.*?)"/ and hadn't gotten it to work; it
> > never occurred to me to try the non-capturing group instead.
>
> That also works! (But is performing unnecessary and wasteful captures.)
>
> Rob
>
> use strict;
> use warnings;
>
> my $str = q(author: "Jonathan ""Dataweaver"" Lang" key=val);
>
> $str =~ /"((.*?"")*.*?)"/;
> print $1, "\n";
>
> **OUTPUT**
>
> Jonathan ""Dataweaver"" Lang

use strict;
use warnings;

my $str = q(author: "Jonathan ""Dataweaver"" Lang" key=val fly-in-ointment: "Brian ""Nobull"" McCauley");

$str =~ /"((.*?"")*.*?)"/;
print $1, "\n";
__END__

**OUTPUT**

Jonathan ""Dataweaver"" Lang" key=val fly-in-ointment: "Brian ""Nobull"" McCauley

An alternative pattern would be

  /"((?:[^"]*"")*.*?)"/

although the behaviour of that may be counter-intuitive if presented
with bad input in which there's no closing quote.

My preferred pattern would be much closer to Jonathan's original:

  /"((?:[^"]|"")*)"(?!")/

This has the advantage that, from a given opening quote, it fails to
match if the input lacks a closing quote, instead of capturing a
truncated string.
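
To make the difference concrete, here is a minimal sketch that runs the
three patterns from this thread against the sample text and against a
copy with the closing quote removed. The labels lazy, negclass and
strict, the field_value helper and the ^author: anchor are assumptions
made for this illustration; they are not from the thread.

```perl
use strict;
use warnings;

# The three patterns discussed in this thread. The hash keys are
# labels invented for this sketch.
my %pattern = (
    lazy     => qr/"((?:.*?"")*.*?)"/,
    negclass => qr/"((?:[^"]*"")*.*?)"/,
    strict   => qr/"((?:[^"]|"")*)"(?!")/,
);

# Hypothetical helper: anchor the pattern at a known field name so the
# match has to start at that field's opening quote.
sub field_value {
    my ($re, $text) = @_;
    return $text =~ /^author: $re/ ? $1 : undef;
}

my $good = q(author: "Jonathan ""Dataweaver"" Lang" key=val);
my $bad  = q(author: "Jonathan ""Dataweaver"" Lang key=val);   # closing quote missing

for my $name (qw(lazy negclass strict)) {
    for my $text ($good, $bad) {
        my $got = field_value($pattern{$name}, $text);
        print "$name: ", defined $got ? "<$got>" : 'no match', "\n";
    }
}
```

All three return Jonathan ""Dataweaver"" Lang for the good input. For
the broken input, lazy and negclass both return Jonathan ""Dataweaver,
while strict reports no match. The anchor matters here: unanchored, the
strict pattern can still match further along the string, for instance
at the doubled quote before Dataweaver, where it captures an empty
string.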