On 16/05/2011 23:44, Owen wrote:
I am trying to get all the 6 letter names in the second field in DATA below, eg

BARTON
DARWIN
DARWIN

But the script below gives me all 6 letter and more entries.

What I read says {6} means exactly 6.

What is the correct RE?

I have solved the problem by using if (length($data[1]) == 6 ) but would love to know the correct syntax for the RE

=================================================================

#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
    my $line = $_;
    my @line = split /,/;
    $line[1] =~ s /\"//g;
    print "$line[1]\n" if $line[1] =~ /\S{6}/;
}

__DATA__
"0200","AUSTRALIAN NATIONAL UNIVERSITY","ACT","PO Boxes"
"0221","BARTON","ACT","LVR Special Mailing"
"0800","DARWIN","NT",,"DARWIN DELIVERY CENTRE"
"0801","DARWIN","NT","GPO Boxes","DARWIN GPO DELIVERY ANNEXE"
"0804","PARAP","NT","PO Boxes","PARAP LPO"
"0810","ALAWA","NT",,"DARWIN DELIVERY CENTRE"
"0810","BRINKIN","NT",,"DARWIN DELIVERY CENTRE"
"0810","CASUARINA","NT",,"DARWIN DELIVERY CENTRE"
"0810","COCONUT GROVE","NT",,"DARWIN DELIVERY CENTRE"

===============================================================

Hi Owen.

Your test establishes only whether the pattern can be found somewhere within the object string. The quantifier {6} does mean exactly six repetitions, but nothing stops the match from starting or ending in the middle of a longer string, so a test like

    "CASUARINA" =~ /\S{6}/;

finds the six non-space characters "CASUAR" and then returns success, as the criterion has been satisfied. To get it to match /only/ six-character non-space strings you can add anchors at the beginning and end of the regex:

    "CASUARINA" =~ /^\S{6}$/;

will fail because the sequence "beginning of string, six non-space characters, end of string" doesn't appear in "CASUARINA".

But the proper way to do this is to forget about regular expressions and treat the data as comma-separated fields. The module Text::CSV will do this for you, as per the program below.

HTH,

Rob

use strict;
use warnings;

use Text::CSV;

my $csv = Text::CSV->new;

# getline parses one CSV record per call and returns an array reference,
# with the surrounding quotes already removed from each field
while (my $fields = $csv->getline(*DATA)) {
    my $suburb = $fields->[1];
    # skip records with an empty second field or one that is not six characters long
    next unless $suburb and length $suburb == 6;
    print $suburb, "\n";
}

__DATA__
"0200","AUSTRALIAN NATIONAL UNIVERSITY","ACT","PO Boxes"
"0221","BARTON","ACT","LVR Special Mailing"
"0800","DARWIN","NT",,"DARWIN DELIVERY CENTRE"
"0801","DARWIN","NT","GPO Boxes","DARWIN GPO DELIVERY ANNEXE"
"0804","PARAP","NT","PO Boxes","PARAP LPO"
"0810","ALAWA","NT",,"DARWIN DELIVERY CENTRE"
"0810","BRINKIN","NT",,"DARWIN DELIVERY CENTRE"
"0810","CASUARINA","NT",,"DARWIN DELIVERY CENTRE"
"0810","COCONUT GROVE","NT",,"DARWIN DELIVERY CENTRE"

**OUTPUT**

BARTON
DARWIN
DARWIN
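
For anyone who wants to see the anchored and unanchored patterns side by side, here is a minimal sketch. It assumes the suburb names have already been pulled out of the records and unquoted; the @suburbs list is a hypothetical stand-in built from some of the DATA rows above, not part of either script in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample list: second-field values taken from the DATA rows in the thread.
my @suburbs = ('BARTON', 'DARWIN', 'PARAP', 'BRINKIN', 'CASUARINA', 'COCONUT GROVE');

# Unanchored: succeeds if six consecutive non-space characters occur
# anywhere in the string, so longer names match too.
my @unanchored = grep { /\S{6}/ } @suburbs;

# Anchored: the whole string must consist of exactly six non-space characters.
my @anchored = grep { /^\S{6}$/ } @suburbs;

print "unanchored: @unanchored\n";
print "anchored:   @anchored\n";
```

Note that without the /m modifier, $ also matches just before a trailing newline, so a value that still carries its line ending would pass the anchored test as well; \z can be used in place of $ if that matters.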