Owen, on Monday, 12 December 2005, 22:10:
> Xavier Noria wrote:
> > On Dec 12, 2005, at 11:10, Alexandre Checinski wrote:
> >> I have a string that looks like this:
> >> <counter id="183268" since="SDOPERFV16" aggr="Sum"
> >> name="pcmTcuFaultOutOfService"/>
> >
> > m// in list context may help:
> >
> > my ($id, $name) = $xml =~ m{id="([^"]*)".*name="([^"]*)"/>};
>
> I despair of ever understanding REs

You will, just play around with it. For example, make a "quick and dirty"
small script along these lines:

=start=
use strict;
use warnings;

my $teststring = 'something to test';
my $ok = $teststring =~ m~sts~; # <<< play around here
print $ok ? 'yes, I got it!!!' : 'I despair of ever understanding REs';
=end=

If you think a regex does something, test that something with the above
script. Keep some manuals open:

perldoc perlre
perldoc perlretut
perldoc perlrequick

>
> How does the above work
>
> m Match
> { inside these braces (as the delimiter?)

No; the {} are used in place of the usual //. That's why the 'm' after '=~'
is mandatory here. The same choice of delimiters is available for
substitution (s{...}{...}). Sometimes a regex is more readable if something
other than '//' is used, for example when matching (unix) paths.

> id=" the characters id="
> ( Start the capture for $id
> [^"] The list of characters beginning with "

Not exactly: it is the set of characters *not* matching '"'. That is what
the caret just after '[' means.

> But wasn't that done on line 3 where we
> looked for a "
> * any number of characters (including none)
> ) end of capture for $id
> " the end " for the data element captured
> .* anything until

More precisely: nothing or anything, in "greedy" mode, until

> name=" etc do it all again till
>
> } ending delimiter
>
> So I have trouble with [^"]*

This means: zero or more characters that are not a '"'.

>
> What words describe that expression please

my ($id, $name) = $xml =~ m{id="([^"]*)".*name="([^"]*)"/>};

Extract two values, $id and $name, from the string $xml. Do that by
searching for the literal string 'id='; then look for something between two
double quotes, where the thing in between must not contain a double quote,
and capture it into $1; then skip everything up to the literal string
'name='; then look for something between two double quotes, where the thing
in between must not contain a double quote, and capture it into $2; then
match a directly following literal string '/>'. Finally, assign ($1, $2) to
the list ($id, $name).
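
To see the description above in action, here is a minimal sketch that runs Xavier's regex on the sample string from the original question (joined onto one line, which is an assumption about how the real data looks). The second part uses a made-up path, purely to illustrate why delimiters other than '//' can be easier to read.

```perl
use strict;
use warnings;

# Sample string from the original question, assumed to be on a single line.
my $xml = '<counter id="183268" since="SDOPERFV16" aggr="Sum" name="pcmTcuFaultOutOfService"/>';

# In list context, m// returns the captured groups, i.e. ($1, $2).
my ($id, $name) = $xml =~ m{id="([^"]*)".*name="([^"]*)"/>};

print "id=$id name=$name\n";    # prints: id=183268 name=pcmTcuFaultOutOfService

# Alternative delimiters avoid escaping slashes when matching unix paths.
my $path = '/usr/local/bin/perl';              # hypothetical path for the demo
my $with_braces  = $path =~ m{^/usr/local/};   # readable
my $with_slashes = $path =~ /^\/usr\/local\//; # same pattern, harder to read

print "both forms match\n" if $with_braces && $with_slashes;
```

If the match fails, $id and $name are left undefined, so real code would want to check for that before using them.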

The regex could be improved a bit, I think:

1. It would be less restrictive to allow spaces around '=' and before '/>'.
2. There is a problem with the '.*' in the middle: if the string contains
   several tags with a name attribute, it will match the 'name=' of the
   last such tag. This is because '.*' is greedy.
3. I'm not sure, but I think there must be a space between an attribute
   value and the next attribute name.

This leads to

m{id\s*=\s*"([^"]*)".+?name\s*=\s*"([^"]*)"\s*/>};

But even this version could be improved. For example, it can't handle
escaped double quotes (\") within the attribute values. I'm not sure, but I
think this is not allowed; still, it could be used to trick the regex into
doing the wrong thing.

Somebody please correct me if I'm wrong, thanks. I'm overworked (besides
not being a guru).

hth, joe
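
P.S. A small sketch to see the greedy problem from point 2 for yourself. The input with two tags on one line is made up for the demo; it is not from the original question.

```perl
use strict;
use warnings;

# Two tags on one line; the ids and names are invented for this demo.
my $xml = '<counter id="1" name="first"/><counter id="2" name="second"/>';

# Original regex: the greedy '.*' runs to the end of the string and
# backtracks, so it finds the LAST 'name=' in the string.
my ($id_g, $name_g) = $xml =~ m{id="([^"]*)".*name="([^"]*)"/>};

# Revised regex: the non-greedy '.+?' stops at the FIRST 'name=' it can reach.
my ($id_l, $name_l) = $xml =~ m{id\s*=\s*"([^"]*)".+?name\s*=\s*"([^"]*)"\s*/>};

print "greedy:     id=$id_g name=$name_g\n";   # prints: greedy:     id=1 name=second
print "non-greedy: id=$id_l name=$name_l\n";   # prints: non-greedy: id=1 name=first
```

The non-greedy version is still only a sketch: '.+?' can cross a tag boundary if the first tag has no name attribute, so for anything beyond quick extraction an XML parser is probably the safer choice.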