> "Sara" == Sara <[EMAIL PROTECTED]> writes:
Sara> I have a couple of text files with html code in them.. e.g.
Sara> -- Text File --
Sara>
Sara>
Sara> This is Test File
Sara>
Sara>
Sara> This is the test file contents
Sara>
Sara> blah blah blah
On Thursday, Sep 4, 2003, at 17:55 US/Pacific, Hanson, Rob wrote:
$text =~ s|(<head>).*?<title>.*?</title>.*?(</head>)|$1$2$3|s;
actually that should be:
$text =~ s|(<head>).*?(<title>.*?</title>).*?(</head>)|$1$2$3|s;
way stylish! I actually like.
But assumes that there will be a title element - otherwise it
will fail and not clear out the other s
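To make the caveat above concrete, here is a minimal sketch of one way to empty the head section while keeping the title only when there is one. The sub name strip_head_keep_title and the sample strings are made up for illustration and are not from the thread; it also assumes a single, reasonably well-formed head section.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): empty out <head>...</head>,
# keeping <title>...</title> only if one is present.
sub strip_head_keep_title {
    my ($text) = @_;
    $text =~ s{(<head\b[^>]*>)(.*?)(</head>)}{
        # Copy the captures first; the inner match below resets $1..$3.
        my ($open, $body, $close) = ($1, $2, $3);
        my ($title) = $body =~ m|(<title\b[^>]*>.*?</title>)|is;
        $open . (defined $title ? $title : '') . $close;
    }eis;
    return $text;
}

# Made-up sample input, one with a title and one without.
my $with    = "<html><head><style>p{}</style><title>Test</title><script>x()</script></head><body>hi</body></html>";
my $without = "<html><head><style>p{}</style></head><body>hi</body></html>";

print strip_head_keep_title($with), "\n";
# prints <html><head><title>Test</title></head><body>hi</body></html>
print strip_head_keep_title($without), "\n";
# prints <html><head></head><body>hi</body></html>
```

Doing the title lookup inside the replacement (the /e flag) means the outer match no longer depends on a title being there, which is the failure mode described above.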
drieux wrote:
It could just be my OCD, but if I could have hammered
flat every FROOOTLOOP who wanted merely a 'quick and dirty'
one time only fix, 'honest, it's just this one time', rather
than actually cure the root cause problem, WE would be on a
flat earth from all the pounding
That or we
'Anconia'" <[EMAIL PROTECTED]>; "'Sara'"
<[EMAIL PROTECTED]>
Cc: "beginperl" <[EMAIL PROTECTED]>
Sent: Friday, September 05, 2003 5:55 AM
Subject: RE: Stripping HTML from a text file.
: > Or maybe I misunderstood the question
:
: Or maybe I d
On Thursday, Sep 4, 2003, at 17:55 US/Pacific, Hanson, Rob wrote:
[..]
I agree... but only if you are looking for a strong permanent solution.
The regex way is good for quick and dirty HTML work.
[..]
technically we agree right up to the 'quick and dirty' part...
I mean, how many times have we wa
04, 2003 8:48 PM
To: 'Sara'
Cc: beginperl
Subject: Re: Stripping HTML from a text file.
Won't this remove *everything* between the given tags? Or maybe I
misunderstood the question, I thought she wanted to remove the "code"
from all of the contents between two tags?
Becaus
On Wednesday, Sep 3, 2003, at 03:32 US/Pacific, Sara wrote:
[..]
What I want to do is to remove/delete HTML code from the text file
from a certain tag up to a certain tag.
For example; I want to delete the code completely that comes in
between <head> and </head> (including any style tags and embedded
javascrip
Won't this remove *everything* between the given tags? Or maybe I
misunderstood the question, I thought she wanted to remove the "code"
from all of the contents between two tags?
Because of the complexity and variety of HTML code, the number of
different tags, etc. I would suggest using an HTML
A simple regex will do the trick...
# untested
$text = "...";
$text =~ s|<head>.*?</head>||s;
Or something more generic...
# untested
$tag = "head";
$text =~ s|<$tag[^>]*?>.*?</$tag>||s;
This second one also allows for possible attributes in the start tag. You
may need more than this if the HTML isn't well formed