On Wed, 22 Jan 2003, Rob Dixon wrote:

> Hi George. I think you'd have had an answer by now if there was
> one. I can't think of anything but I wasn't willing to post and say
> 'it can't be done' without waiting for others' ideas.
>
> George P. wrote:
> > But now, I need to check for all classes other than "text";
> >
> > This has me stumped!!
> >
> > For eg:
> > $str = '<TD class="text1">';
> >
> > $class = 'text[0-9]+'
> > if ($str =~ /class="$class"/)
> > {
> >     print "TAG has this class\n";
> > }
>
> I still don't understand exactly why you cant use
>
>   if ($str !~ /class="$class"/) {
>       print "TAG doesn't have this class\n";
>   }
>
> or even something ugly like
>
>   if ($str =~ /class="$class"/) { }
>   else {
>       print "TAG doesn't have this class\n";
>   }
>
> If you can describe to us a circumstance where you 'need' this
> functionality I'm sure we'll come up with an answer.

I'll explain what I'm trying to do.

I'm writing a program that will parse an HTML file. This HTML file contains a text article that has been placed between certain tags (like <TD>) which have a specific class name. So you can have something like:

    <TABLE>
    <TR>
    <TD class="articletext">
    This article is just an example.
    </TD>
    </TR>
    </TABLE>

And I have to pick "This article is just an example." from that file.

Which class name to pick differs from file to file. So, although I have to pick all text within a TD tag having class name "articletext" in the previous example, I might have to pick all text within a SPAN tag having class name "anotherarticletext" in another HTML file. Which class name to pick is decided by the file I'm parsing.

So, what did I do? I created a map file. This map file contains the filename and the tag-class combination which I have to pick. I then read the HTML file and check whether it has that tag-class combination. If it does, I get the text that falls within that tag.

Assume $str contains a tag specification:

    $str = "<TD class='articletext'>";

In order to check if that tag-class combination exists, I simply do:

    if ($str =~ /<$tag class='$class'>/i)
    {
        # Take the text
    }
    else
    {
        # Don't take the text
    }

This code helps a lot when I want to pick up a specific type of class, like all those classes which start with the word "text" and have a number following it. In that case the class name given in my map file will be "text[0-9]+".

Other than this, I also wanted to remove a few tag-class combinations that come in between the tags that I want to pick up. Eg:

    <TD class="articletext">
    This text has to be picked up
    <SPAN class="removetext">
    This text has to be ignored
    </SPAN>
    This text has to also be picked up.
    </TD>

So I wrote similar code to find those tags that I want to remove, and if that tag-class combination matches, I ignore them.

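To make the check described above concrete, here is a minimal sketch of it. The hash layout and the name wants_tag are made up for illustration (the real map-file format is not shown in this mail), and accepting either quote style around the class value is an added assumption.

```perl
use strict;
use warnings;

# Hypothetical map entry: the tag and the class pattern to pick up.
my %entry = ( tag => 'TD', class => 'text[0-9]+' );

# Build the pattern once. The class is interpolated as a regex on purpose,
# so map-file values such as "text[0-9]+" keep working; \Q...\E quotes the
# tag name literally. (["']) ... \1 requires matching quotes around the class.
my $pick_re = qr/<\Q$entry{tag}\E\s+class=(["'])(?:$entry{class})\1\s*>/i;

# Returns 1 if the tag specification matches the map entry, else 0.
sub wants_tag {
    my ($str) = @_;
    return $str =~ $pick_re ? 1 : 0;
}

for my $str ('<TD class="text1">', "<td class='text42'>",
             '<TD class="text">',  '<SPAN class="text1">') {
    printf "%-22s %s\n", $str,
        wants_tag($str) ? 'take the text' : "don't take the text";
}
```

Wrapping the class in (?:...) keeps alternations in a map-file pattern (for example "text|body") from leaking outside the quotes.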
This code works fine when you give proper class names, and it also works for regex class names like "text[0-9]+".

But now one more situation has arisen. I want to remove all classes other than the pick-up class. So, if I'm picking up text from class "articletext", I want to remove all classes other than "articletext".

I wanted to keep the current code setup and just change the removing class name to something like "[^(articletext)]", expecting it to remove all classes other than "articletext". But this cannot work: [^...] is a character class, so it matches a single character that is not in the listed set rather than "any name except articletext".

I think I'll just add one more parameter in the map file, which will tell me when to use "=~" and when to use "!~".

Thanks for your help.

bye,
George.
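
A minimal sketch of the extra map-file parameter mentioned above, assuming a hypothetical "negate" field next to the tag and class (the field names and rule_applies are invented for this example). It pulls the tag and class out of the tag specification first, so that the negation applies only to the class and not to the tag.

```perl
use strict;
use warnings;

# Hypothetical rule: remove every SPAN whose class is NOT "articletext".
my $rule = { tag => 'SPAN', class => 'articletext', negate => 1 };

# Returns 1 if the rule says this tag should be removed, else 0.
sub rule_applies {
    my ($rule, $str) = @_;

    # Extract the tag name and class value; anything else is left alone.
    my ($tag, $class) = $str =~ /^<(\w+)\s+class=["']([^"']*)["']\s*>\z/
        or return 0;
    return 0 unless lc($tag) eq lc($rule->{tag});

    # The class from the map file is still treated as a regex.
    my $hit = $class =~ /^(?:$rule->{class})\z/i ? 1 : 0;

    # negate => 1 plays the role of "!~", negate => 0 the role of "=~".
    return ($hit xor $rule->{negate}) ? 1 : 0;
}

for my $str ('<SPAN class="removetext">', '<SPAN class="articletext">',
             '<TD class="removetext">') {
    print "$str: ", (rule_applies($rule, $str) ? 'remove' : 'keep'), "\n";
}
```

An alternative that would keep the single "=~" test is a negative lookahead in the map file, for example (?!articletext')[^']* as the class pattern inside /<$tag class='$class'>/i. Perl's regex engine supports (?!...), and that pattern should match any class value other than articletext.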