Rather than removing <solar> and </solar> in separate steps, it might be easier to use them to 'anchor' the start and end of the text you want to extract.
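As a rough sketch of that anchoring idea (untested against your real data): a global match in list context returns just the text captured between each pair of tags. The input below is made up for illustration; it is based on the sample line from this thread, with some hypothetical trailing rubbish and a second record appended.

```perl
use strict;
use warnings;

# Hypothetical input: one record, some trailing rubbish, and a second
# record that has been appended to the same line.
my $solar_info = '<solar>8,27.31,28.68,28.81,0.00,0.00,0</solar>junk'
               . '<solar>9,27.40,28.70,28.90,0.00,0.00,0</solar>';

# In list context, a /g match with one capture group returns every
# captured string, i.e. only the text between each <solar>...</solar> pair.
my @records = $solar_info =~ /<solar>([^<]*)<\/solar>/g;

# Print one record per line.
print "$_\n" for @records;
```

Anything outside the tags, such as the rubbish on the end of a line, is simply never captured, so it does not need to be removed in a separate step.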
Try:

    $solar_info =~ s/<solar>([^<]*)<\/solar>/$1\n/;

This matches the whole <solar> ... </solar> pair and replaces it with just the text in between. It also sticks an extra newline on the end, to split lines that have been 'joined'; it should be easy to remove any blank lines later.

I'm a bit rusty: you might want to stick a 'g' on the very end (after the replacement expression) to make it match more than once on the same line.

    $solar_info =~ s/<solar>([^<]*)<\/solar>/$1\n/g;

Let us know how you get on.

-- 
Matthew Bassett <hewb...@gmail.com>
Sorry about the top posting, am replying from my phone.

-----Original Message-----
From: LeeGroups
Sent: 12/07/2010 22:55:38
Subject: Re: [ubuntu-uk] [OT] Quick Perl question...

>> $solar_info =~ s/<\/solar>.*/,/;
>>
>> From my tinkerings, this should find the string </solar> in the string
>> $solar_info, and then remove it and any number of following characters
>> (the .*) and then replace them with a ",".
>> Except that it doesn't. It hacks out the </solar> and replaces it with a
>> , but leaves the rest of the string intact... Much to my annoyance... :|
>>
> What's the input string? The following code simply prints "," for me
> not ",abcdef" as you suggest it would:
> $test = "</solar>abcdef";
> $test =~ s/<\/solar>.*/,/;
> print $test;

This input:
<solar>8,27.31,28.68,28.81,0.00,0.00,0</solar>
It needs to be:
8,27.31,28.68,28.81,0.00,0.00,0

Another line chops off the <solar>.
The problem is that occasionally there is rubbish on the end of the line, or even another line appended to the end of the first...

-- 
ubuntu-uk@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-uk
https://wiki.ubuntu.com/UKTeam/