Hi, What about this solution:

use warnings;
use strict;

my $str = '
chr1 ucsc exon 226488874 226488906 0.000000 - . gene_id "NM_173083"; transcript_id "NM_173083";
chr1 ucsc exon 226496810 226497198 0.000000 - . gene_id "NM_173083"; transcript_id "NM_173083";
chr1 ucsc exon 2005086 2005368 0.000000 + . gene_id "NM_001033581"; transcript_id "NM_001033581";
chr1 ucsc exon 2066701 2066786 0.000000 + . gene_id "NM_001033581"; transcript_id "NM_001033581";';

# Pull one NM_ ID out of every row that has one (repeated IDs are fine here).
my @patterns = map { /(NM_\d+)"/; $1 } grep(/NM_\d+"/, split(/\n+/, $str));

my $additional = 12345;
foreach (@patterns) {
    # s///g is false when nothing matched, i.e. when this ID was already
    # tagged on an earlier pass, so the counter advances once per distinct ID.
    $str =~ s/($_)\"/$1:$additional\"/g and $additional++;
}
print "$str\n";

Regards,
Katya

-----Original Message-----
From: Richard Green [mailto:gree...@uw.edu]
Sent: Saturday, February 26, 2011 10:07 PM
To: beginners@perl.org
Subject: string substitution command question

Hi Perl users,

Quick question: I have one long string of tab-delimited values, with rows
separated by a newline character. Here is a snippet of the string:

chr1 ucsc exon 226488874 226488906 0.000000 - . gene_id "NM_173083"; transcript_id "NM_173083";
chr1 ucsc exon 226496810 226497198 0.000000 - . gene_id "NM_173083"; transcript_id "NM_173083";
chr1 ucsc exon 2005086 2005368 0.000000 + . gene_id "NM_001033581"; transcript_id "NM_001033581";
chr1 ucsc exon 2066701 2066786 0.000000 + . gene_id "NM_001033581"; transcript_id "NM_001033581";

I am trying to perform a substitution on some values at the end of each row.
For example, I'm trying to replace the above string with the following:

chr1 ucsc exon 226488874 226488906 0.000000 - . gene_id "NM_173083:12345"; transcript_id "NM_173083:12345";
chr1 ucsc exon 226496810 226497198 0.000000 - . gene_id "NM_173083:12345"; transcript_id "NM_173083:12345";
chr1 ucsc exon 2005086 2005368 0.000000 + . gene_id "NM_001033581:12346"; transcript_id "NM_001033581:12346";
chr1 ucsc exon 2066701 2066786 0.000000 + . gene_id "NM_001033581:12346"; transcript_id "NM_001033581:12346";

Here is the substitution command I am trying to use:

$data_string=~ s/$gene_id\"NM_173083\"\; transcript_id \"NM_173083\"\;/\"NM_173083:12345\"\; \"NM_173083:12345\"\;/g;
$data_string=~ s/$gene_id\"NM_001033581\"\; transcript_id \"NM_001033581\"\;/\"NM_001033581:12346\"\; \"NM_001033581:12346\"\;/g;

I don't know why I am not able to substitute at the end of each row in the
string. Any suggestions folks have are much appreciated.

Thanks
-Rich
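
A note on the original substitution, for what it is worth: inside s///, $gene_id is interpolated as a Perl variable rather than matched as the literal text gene_id, and the replacement side leaves out the gene_id and transcript_id labels. If the real data is tab-delimited, the literal space in the pattern would also fail to match a tab. Below is a minimal single-pass sketch of an alternative, not a definitive fix. The names tag_ids and %number_for are made up for illustration. It assumes every ID is "NM_" followed by digits inside double quotes, and that numbers are handed out in order of first appearance.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): append ":<number>" to every
# quoted NM_ ID, giving each distinct ID the next number in order of
# first appearance, starting at $start.
sub tag_ids {
    my ($text, $start) = @_;
    my %number_for;    # ID => number assigned to it
    my $next = $start;

    # /e evaluates the replacement as code; /g visits every quoted ID.
    $text =~ s{"(NM_\d+)"}{
        $number_for{$1} = $next++ unless exists $number_for{$1};
        sprintf '"%s:%d"', $1, $number_for{$1};
    }ge;

    return $text;
}

# Two rows taken from the snippet in the question (space-separated here).
my $data_string = join "\n",
    'chr1 ucsc exon 226488874 226488906 0.000000 - . gene_id "NM_173083"; transcript_id "NM_173083";',
    'chr1 ucsc exon 2005086 2005368 0.000000 + . gene_id "NM_001033581"; transcript_id "NM_001033581";';

print tag_ids($data_string, 12345), "\n";
```

Because the pattern only looks at the quoted ID, it does not care whether the columns are separated by tabs or spaces, and the gene_id and transcript_id labels are left untouched.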