Hello everybody,

I need some help. I am working on a search and extraction script, where I
need to extract information from HTML and XML files.

It's simple: I need to find some known tags and get the information enclosed
between them. I ran into a problem when there are multiple occurrences of
the same tag.

I am only getting the last occurrence of the pattern, not all of them.
For example, if the lines in the file are

<product>Product 1</product>
<product>Product 2</product>
<product>Product 3</product>
<product>Product 4</product>

What I am getting is only the last match.

Kindly let me know how I can get the first match.

Once that works, I can use a loop to extract the other occurrences.

The code I am using is as follows.

# read the whole file into one string
open(FILE, "test.html") or die "Cannot open test.html: $!";
while ($line = <FILE>)
     {$lines .= $line;}
close FILE;

# replace newline, carriage return and tab characters in the file text with spaces
$lines =~ s/[\n\r\t]/ /g;

# get the title...
  $title = (   $lines =~ m/<product>(.{0,100})<\/product>/
    ) ? $1 : 'No Title';

print $title;
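
For reference, here is a minimal sketch of what I am trying to achieve, not a tested solution for real files. It assumes the tags are never nested and that the file text is already in one string; the sample string and the @products array are made up for illustration. The idea is a non-greedy match with the /g modifier in list context, which should return every captured value in order:

```perl
use strict;
use warnings;

# Sample input built from the example lines above (assumption: in the
# real script this string comes from the file that was read in).
my $lines = join ' ',
    '<product>Product 1</product>',
    '<product>Product 2</product>',
    '<product>Product 3</product>',
    '<product>Product 4</product>';

# Non-greedy (.*?) stops at the nearest closing tag; with /g in list
# context the match returns the captured text of every occurrence.
my @products = $lines =~ m{<product>(.*?)</product>}g;

# First occurrence, or the fallback when nothing matched.
my $title = @products ? $products[0] : 'No Title';

print "$title\n";
print "$_\n" for @products;
```

A regular expression like this is only a rough tool for HTML and XML; a real parser module would probably be more robust for anything beyond simple, flat tags.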



I am working with ActivePerl on a Win2K machine.

Thanks in advance

Rajeev Rumale

****************************************************************************
**********
"The human race has one really effective weapon, and that is laughter."
****************************************************************************
**********

----- Original Message -----
From: "Bob Showalter" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, July 31, 2001 8:28 PM
Subject: RE: substitution


> > -----Original Message-----
> > From: COLLINEAU Franck FTRD/DMI/TAM
> > [mailto:[EMAIL PROTECTED]]
> > Sent: Tuesday, July 31, 2001 8:15 AM
> > To: Perl (E-mail)
> > Subject: substitution
> >
> >
> > Hi!
> >
> > I have a file in which one line begins with the
> > string "<CENTER>". I would like to replace the string
> > "<CENTER>" with nothing. How can I do that?
>
>    perl -pe 's/^<CENTER>//' myfile
>