Canol Gökel wrote:
> My problem is to match HTML tags with RegExp.
What is your personal definition of a tag? To match a tag, you could use
"/<[^>]*>/", but that would also match "<>". Maybe you are just looking
for "/<[A-Za-z]+>/"?

> I managed to match
> something like this, properly:
>
> la la la <p>a paragraph</p> bla bla bla <p>another paragraph</p> ya
> ya ya
>
> But when nested, there arises problems:

And that is what you should expect when you use the wrong tool, or use a
tool in the wrong way. You are not matching tags there, but constructs
delimited by tags. That is much easier to do with multiple passes. Use a
parser. It is often simple to create one.

> <p>a paragraph <p>bla bla bla</p> la la la</p>
>
> It matches
>
> <p>A paragraph <p>bla bla bla</p>
>
> instead of matching the most inner part:
>
> <p>bla bla bla</p>
>
> How can one write an expression to match always the most inner part?

Don't focus on "an expression"; your toolbox is bigger than that.

  my $atom = qr~<([a-z]+)>[^<>]+</\1>~;  # meant to evolve ;-)

> I couldn't write an expression like "match a non-greedy <p>.*</p>
> which does not have a <p> inside".
>
> Note: Most probably there is a module for this but:
> - I want to learn the logic,
> - I don't use Perl in this project,
> - Actually my problem is different than matching HTML tags but I
>   choose them to explain my problem, easily.

That is very stupid.

-- 
Affijn, Ruud

"Gewoon is een tijger."
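
A minimal sketch of the multiple-pass idea, built on the $atom pattern from the reply. The helper name innermost_first and the "[n]" placeholder format are assumptions made for illustration; they are not part of the original post. Each pass takes one innermost construct, records it, and replaces it with a placeholder that contains no angle brackets, so the enclosing construct becomes innermost on a later pass.

```perl
use strict;
use warnings;

# Innermost construct: an opening tag, text without angle brackets,
# and the matching closing tag (pattern taken from the reply).
my $atom = qr~<([a-z]+)>[^<>]+</\1>~;

# Hypothetical helper: return all constructs, innermost first.
sub innermost_first {
    my ($text) = @_;
    my @found;
    while ($text =~ $atom) {
        # @- and @+ hold the start and end offsets of the last match.
        my ($start, $len) = ($-[0], $+[0] - $-[0]);
        push @found, substr($text, $start, $len);
        # Swap the construct for a placeholder such as "[0]" so that the
        # enclosing construct can match on a later pass.
        substr($text, $start, $len, '[' . $#found . ']');
    }
    return @found;
}

print "$_\n" for innermost_first(
    '<p>a paragraph <p>bla bla bla</p> la la la</p>');
```

For the nested example above, the first element returned is "<p>bla bla bla</p>"; the outer paragraph comes second, with "[0]" standing in for the inner one. The pattern is still only a starting point: it ignores attributes, uppercase tag names and empty elements, which is where a real parser earns its keep.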