Steve Tattersall wrote:
>
> Please help. I am trying to extract the line beginning with GB and also the
> title between HTML tags from multiple HTML files.
>
> For example, I want to extract the line (see the HTML code below):
> GB 0152 MSS.126/NUDL
>
> and also the title, which is:
>
> National Union of Dock, Riverside and General Workers in Great Britain and Ireland
>
> Does anyone know how to go about this, please? I would be extremely grateful.
>
> --------------------------------------------------------
> <br><b>Reference</b>:
> <a target = "new" title = "Repository contact details from ARCHON - opens new window" href = "http://www.hmc.gov.uk/archon/searches/locresult.asp?LR=152">
> GB 0152 MSS.126/NUDL
> </a>
> <br><b>Title</b>:
>
> National Union of Dock, Riverside and General Workers in Great Britain and Ireland
>
> <br><b>Dates of creation</b>:
> -------------------------------------------------------------

I assume the HTML text is in the variable $html. Then

    my ($repository) = $html =~ /<br><b>Reference<\/b>:\n<a.*?>\n(.*?)\n/s;
    my ($title)      = $html =~ /<br><b>Title<\/b>:\n\n(.*?)\n/s;

should extract what you need, provided the line breaks in your files match the sample.

Best Wishes,
Andrea
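
Since the question mentions multiple HTML files, here is a minimal sketch of how the same idea might be wrapped in a loop. The helper name extract_fields and the command-line usage are my own assumptions, not anything from the original files. The patterns use \s* instead of literal \n so they tolerate small differences in line breaks, and they assume every file has the same Reference / Title layout as the sample.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): takes one HTML string and
# returns (reference, title); a field that is not found comes back as undef.
sub extract_fields {
    my ($html) = @_;

    # Text inside the first <a ...>...</a> that follows the "Reference" label.
    my ($reference) = $html =~ m{<b>Reference</b>:\s*<a\b[^>]*>\s*(.*?)\s*</a>}s;

    # Text between the "Title" label and the next <br> tag.
    my ($title) = $html =~ m{<b>Title</b>:\s*(.*?)\s*<br>}s;

    # Collapse any internal runs of whitespace (e.g. a title wrapped over lines).
    for ($reference, $title) {
        next unless defined;
        s/\s+/ /g;
    }

    return ($reference, $title);
}

# Assumed usage: perl extract.pl file1.html file2.html ...
for my $file (@ARGV) {
    open my $fh, '<', $file or do { warn "Cannot open $file: $!\n"; next };
    my $html = do { local $/; <$fh> };    # slurp the whole file
    close $fh;

    my ($reference, $title) = extract_fields($html);
    print join("\t",
        $file,
        $reference // '(no reference found)',
        $title     // '(no title found)'), "\n";
}
```

For pages that are less regular than the sample, an HTML parser from CPAN would likely be more robust than regular expressions, but for a fixed layout like this one the patterns above should be enough.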