Wagner, David --- Senior Programmer Analyst --- WGO wrote on Tuesday, 21 February 2006, 23:55:
> Here is a small snippet of code (LABEL1) which appears to remove a comma
> which lies between two double quotes. I run it and display the output, and
> the one line of code which does have the comma is cleaned up. In LABEL2
> is a snippet of code which does not work, but to all appearances is the
> same as my small snippet of code. The working code is AS 5.8.3 on Windows
> XP, while the failing code is on Sun and is also 5.8.3.
>
> I am receiving some data and need to clean it up and also split it. I
> prefer not to have to load any type of csv handler, and this works for the
> most part.
>
> I don't see the difference in the code other than the two different systems.
> Note: Moved this same code over (didn't occur to me to try it, but head must
> be stuck). It runs and removes the , from within the double quotes.
>
> It has to be something simple that I am missing. Though I have been doing
> Perl for quite a while, I have never really been good at regex processing.
>
> Thanks.
>
> Wags ;)
> ===========================================================================
>
> LABEL1:
> #!perl
> use strict;
> use warnings;
>
> my $MyIn = 0;
> my $MyOut = 0;
>
> my $MyHldData;
> my $MyWrkFld;
> my $MyWrkFldUpd;
>
> while ( <DATA> ) {
>     chomp;
>     s/\r//g;
>     next if ( /^\s*$/ );
>     my $MyHldData = $_;
>
>     if ( /"/ ) {
>         printf "*1a* Looking at line with quotes\n";
>         while ( /("[^"]+")/ ) {
>             $MyWrkFld = $1;
>             printf "*1* <%s>",
>                 $1;
>
>             $MyWrkFldUpd = $MyWrkFld;
>
>             if ( $MyWrkFld =~ /,/ ) {
>                 printf "<--Comma hit!!";
>                 $MyWrkFldUpd =~ s/[,"]//g;
>                 s/$MyWrkFld/$MyWrkFldUpd/g;
>             }
>             else {
>                 $MyWrkFldUpd =~ s/"//g;
>                 s/$MyWrkFld/$MyWrkFldUpd/g;
>             }
>             printf "\n";
>         }
>     }
>     else {
>         printf "No quotes in line %d\n",
>             $.;
>         next;
>     }
>     printf "ln:<%5d>\nor:<%s>\nmd:<%s>\n",
>         $.,
>         $MyHldData,
>         $_
> }
> __DATA__
> 2006-02-18 12:00EE,"TBUTHGHN - GGTT","TEN DIGGB","TEN DIGGB","FHEZGG PEINT TD","7077 CBNTBLIDETGD GEY",2006-02-14 12:00EE,15,"10:05","0152785","2737526",1,1250,10,"892913494",1,25
> 2006-02-18 12:00EE,"TBUTHGHN - GGTT","TEN DIGGB","TEN DIGGB","EETGH EGCHENICEL","7840 BELBBE EVG",2006-02-15 12:00EE,16,"11:27","0107405","2846954",1,1167,3,"916708540",1,25
> 2006-02-18 12:00EE,"TBUTHGHN - GGTT","TEN DIGGB","TEN DIGGB","EETGH EGCHENICEL","7840 BELBBE EVG",2006-02-15 12:00EE,17,"13:47","0107405","2846954",1,456,1,"916708557",1,25
> 2006-02-18 12:00EE,"TBUTHGHN - GGTT","TEN DIGGB","TEN DIGGB","NEVEL DGPBT LGVGL HGPEIH EGGNT","N46433 FLGGT TT, BLDG 661-3",2006-02-16 12:00EE,18,"11:40","0164109","2500058",1,529,1,"1078754644",1,25
>
> ==================================================================
>
> LABEL2:
> ....
>
> INPUTTP: while (<MYFILEIN>) {
>     chomp;
>
>     $in++;
>     s/\r//g;
>
>     next if ( /^\s*$/ ); # bypass blank lines
>
>     if ( ! /,(\d+)$/ ) {
>         printf "Expecting a csv line ending with the total number of times associated with\n";
>         printf "a terminal, but did not get a hit!\n";
>         printf "Data(%d):\<%-s>\n",
>             $.,
>             $_;
>         diet(5, $MyFileIn);
>     }
>
>     $MyDtlCnt = $1;
>     undef @MyWorka;
>     undef @MyUnSortedData;
>
>     if ( /"/ ) {
>         printf "*1a* Looking at line with quotes\n";
>         while ( /("[^"]+")/ ) {
>             $MyWrkFld = $1;
>             $MyWrkFldUpd = $MyWrkFld;
>             if ( $MyWrkFld =~ /,/ ) {
>                 $MyWrkFldUpd =~ s/[,"]//g;
>                 s/$MyWrkFld/$MyWrkFldUpd/g;
>             }
>             else {
>                 $MyWrkFldUpd =~ s/"//g;
>                 s/$MyWrkFld/$MyWrkFldUpd/g;
>             }
>         }
>     }
>
> .....
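The quoted loop finds each quoted field and then writes it back with s/$MyWrkFld/$MyWrkFldUpd/g, which interpolates the field into a pattern without \Q, so any regex metacharacters in a field would be treated as regex syntax. Whether that explains the failure on the Sun box is not clear from the post. As a minimal sketch, the same cleanup can be done in one substitution that never re-interpolates the data. The helper name strip_quoted_commas and the shortened sample line are illustrative assumptions, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove the double quotes
# around each quoted field and delete any commas found inside it.
sub strip_quoted_commas {
    my ($line) = @_;
    # /e evaluates the replacement as code: copy the captured field,
    # delete its commas with tr///d, and return the cleaned copy.
    $line =~ s{"([^"]*)"}{ (my $field = $1) =~ tr/,//d; $field }ge;
    return $line;
}

# Shortened sample based on the last record in the post.
print strip_quoted_commas('"EGGNT","N46433 FLGGT TT, BLDG 661-3",2006-02-16'), "\n";
# prints: EGGNT,N46433 FLGGT TT BLDG 661-3,2006-02-16
```

Because the field is only ever captured and returned, never used as a pattern, its contents cannot change how the match behaves.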
For fun I played around a bit with regexes, but I think using a CSV module is easier :-)

while (<DATA>) {
    chomp;

    # extract fields
    #
    my @fields=$_=~/((?:".*?")|(?:(?<=,).*?(?=,))|(?:(?<=,).*?$)|(?:^.*?(?=,)))/g;

    # remove quotes
    #
    $_=~s/"(.*?)"/$1/ for @fields;

    # print the fields separated with *
    #
    print join '*', @fields;
    print "\n";
}
__DATA__
2006-02-18 12:00EE,"TBUTHGHN - GGTT","TEN DIGGB","TEN DIGGB","FHEZGG PEINT TD","7077 CBNTBLIDETGD GEY",2006-02-14 12:00EE,15,"10:05","0152785","2737526",1,1250,10,"892913494",1,25
2006-02-18 12:00EE,"TBUTHGHN - GGTT","TEN DIGGB","TEN DIGGB","EETGH EGCHENICEL","7840 BELBBE EVG",2006-02-15 12:00EE,16,"11:27","0107405","2846954",1,1167,3,"916708540",1,25
2006-02-18 12:00EE,"TBUTHGHN - GGTT","TEN DIGGB","TEN DIGGB","EETGH EGCHENICEL","7840 BELBBE EVG",2006-02-15 12:00EE,17,"13:47","0107405","2846954",1,456,1,"916708557",1,25
2006-02-18 12:00EE,"TBUTHGHN - GGTT","TEN DIGGB","TEN DIGGB","NEVEL DGPBT LGVGL HGPEIH EGGNT","N46433 FLGGT TT, BLDG 661-3",2006-02-16 12:00EE,18,"11:40","0164109","2500058",1,529,1,"1078754644",1,25
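For anyone who would rather not install a CSV module, here is a rough sketch of the module route using Text::ParseWords, which ships with Perl. The helper name clean_fields and the shortened sample line are assumptions for illustration. One caveat: parse_line also treats single quotes and backslashes as special, so this only suits data like the sample, which contains neither.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Hypothetical helper (not from the thread): split one line on commas
# while respecting double quotes, then drop commas inside each field.
sub clean_fields {
    my ($line) = @_;
    $line =~ s/\r//g;                        # same CR cleanup as the original post
    # Second argument 0 means "do not keep the quotes" in the returned fields.
    my @fields = parse_line(',', 0, $line);
    for (@fields) {
        tr/,//d if defined;                  # empty fields may come back undef
    }
    return @fields;
}

# Shortened sample based on the last record in the post.
my $sample = '2006-02-18 12:00EE,"NEVEL DGPBT","N46433 FLGGT TT, BLDG 661-3",18,"11:40",1,25';
print join('*', map { defined $_ ? $_ : '' } clean_fields($sample)), "\n";
# prints: 2006-02-18 12:00EE*NEVEL DGPBT*N46433 FLGGT TT BLDG 661-3*18*11:40*1*25
```

The split and the quote removal happen in one call, so the embedded comma never has to be edited out of the raw line before splitting.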