What is generating the source file? If you are sure that all the HTML is
standardized, then you can be more strict with the pattern. Could you send a
section (20 lines or so) so we can see what you are working with?
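
In the meantime, here is a rough sketch of a more forgiving match, in case
the markup turns out not to be perfectly uniform. It assumes the lines still
look roughly like the <font class="fontclassz"><a href="...">text</a></font>
example from the first message; extract_links is just a name I made up for
illustration, and the pattern will likely need adjusting once we see the
real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one rewritten <a ...> tag per matching link.
# Assumed input markup: <font class="fontclassz"><a href="URL">TEXT</a></font>
sub extract_links {
    my ($html) = @_;
    my @out;
    # /i ignores case, /x lets the pattern be spaced out for reading,
    # /g finds every occurrence; \s* and \s+ tolerate extra spaces and
    # line breaks between the pieces of the tags.
    while ($html =~ m{
            <font \s+ class \s* = \s* "fontclassz" \s* > \s*
            <a \s+ href \s* = \s* "([^"]+)" [^>]* > \s*
            ([^<]+?) \s*
            </a> \s* </font>
        }gix) {
        push @out, qq(<a href="$1" target="_blank" value="$2">);
    }
    return @out;
}

# Only runs when a file name is given, e.g. perl script.pl file.shtml
if (@ARGV) {
    local $/;                          # slurp mode: read the whole file at once
    my $html = <>;
    $html = '' unless defined $html;   # an empty file yields undef
    print "$_\n" for extract_links($html);
}
```

Reading the whole file at once means a tag that wraps onto the next line
still matches, which a line-by-line loop would miss.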

-----Original Message-----
From: James Edward Gray II [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, September 17, 2002 12:09 PM
To: Ian
Cc: [EMAIL PROTECTED]
Subject: Re: Extracting text from a phrase


That's because my match isn't matching anything.  It's not very 
forgiving and anything so much as a space or case change in the wrong 
place could throw it off.  Can you alter the match a little so it will 
catch the actual lines?

James

On Tuesday, September 17, 2002, at 10:58  AM, Ian wrote:

> Hi,
>
> Thank you but when I try and run that by doing a
>
> Perl script.pl file.shtml >newfile.txt
>
> I am getting a blank output.
>
> Sorry if I did not explain myself correctly.
>
> There are multiple instances of this line in the one page, and I need to
> generate a simple text file to use for another application, which
> happens to be a news ticker.
>
> Thanks again for your help.
>
> Ian
>
>
>> Why not try grabbing all the important stuff right out of the pattern,
>> like my example below?  Note:  The pattern may need changes if I
>> assumed too much from your examples.
>>
>> #!/usr/bin/perl
>> while (<>) {
>>     # $1 is the URL inside href="...", $2 is the link text
>>     if (m!<font class="fontclassz"><a href="([^"]+)">([^<]+)</a></font>!) {
>>         print qq(<a href="$1" target="_blank" value="$2">\n);
>>     }
>> }
>>
>> On Tuesday, September 17, 2002, at 10:05  AM, Ian wrote:
>>
>>> Hi,
>>>
>>> Please excuse this newbie question, but I am getting confused :(
>>>
>>> I need to have a small script that will extract selected words from a
>>> phrase and then insert them into a new string. I have an html page
>>> that I need to extract both urls & keywords from and put them into a
>>> new file. Should be fairly simple stuff - but it is beating me!!
>>>
>>> A typical example is
>>> <font class="fontclassz"><a href="http://domain.name">text</a></font>
>>>
>>> The output needs to be as
>>> <a href="http://domain.name" target="_blank" value="text">
>>>
>>> The first part and the last bit appear to be working (just), but it is
>>> the middle that I am stuck on.
>>>
>>> What I have put together so far is....
>>>
>>> #!/usr/bin/perl
>>> while ($line=<>) {
>>> if ($line =~ m|<font class="fontclassz">(.*?)</font>|) {
>>> $headline = $line; $url = $line;
>>> $line = $1;
>>> $headline =~ s|<[^>]*>||g;
>>> $url =~ s|<[^>]*>||g;
>>> $line = <<END;
>>> <a href="$url" target="_blank" value="$headline">
>>> END
>>> print "$line\n";
>>> }
>>> }
>>>
>>> I realise that both $headline and $url are the same at the moment, but
>>> having tried various alternatives, I am just getting more and more
>>> confused. Any assistance would be gratefully received.
>>>
>>> Ian
>>>
>>>
>>>
>>> --
>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>> For additional commands, e-mail: [EMAIL PROTECTED]
>>>