[EMAIL PROTECTED] (Debbie Cooper) writes:

> I'm pretty new to Perl but with the help of this list I've been able to come
> up with a few helpful scripts.  This time I'm trying to read through a
> tab-delimited text file with the first row containing headers.  I want to
> print out any field/column name where the entire field is null (meaning
> there is no value for that field for any record in the file). 

Well, here is something that does what I think you want (note that
there are embedded tab characters down in the data section that may
get lost in transmission).  

Just to be sure I understand your problem space: if there are NO
values for a given column in any of the rows of the data set, then
you want to print the name of that column, where the name is defined
as the word that is in the first header row (also tab-separated).

To do that, I used a hash (%seen), indexed by the column name,
although this could just as easily have been an array indexed by the
column number.  Inside the part of the loop that reads the lines of
data, I do a quick scan across the fields, incrementing the $seen{}
entry for each column that holds a non-empty string in that row.

Note that I don't actually KEEP any of the row data, since that did
not seem to be a requirement: you only want to know the list of
empty columns.

#!/usr/bin/perl
use warnings;
use strict;

our @header; 
our %seen; 

while (<main::DATA>) {
    chomp;
    if ($. == 1) {
        @header = split /\t/ ;
    } else { 
        # get each of the fields; the -1 limit keeps trailing empty ones
        my @field = split /\t/, $_, -1;

        # now, for each possible column in the file ... 
        foreach my $index ( 0 .. $#header ) {
            # make $seen{header_name} nonzero if there was some value
            # in this row (a short row leaves later fields undefined)
            $seen{$header[$index]}++
                if defined $field[$index] && length $field[$index];
        }
    }
}


# now, for each possible column name...

foreach my $header (@header) {
    # print yourself unless there was a nonzero number of rows
    # that had values in that column
    print "$header\n" unless $seen{$header};
}

#
# with this test data, the program should print "fuzz"
# as all of the other columns have at least one value 
#

__DATA__
name    city    age     fuzz    tax     job
Lars    Tijuana 38                      perl hacker
Marco   Tecate  28              123     artist



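For comparison, here is a minimal sketch of the array-indexed variant
mentioned earlier, where the counters are kept per column number
instead of per column name.  The empty_columns() helper and the
inline sample rows are assumptions made for illustration only; they
are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): takes the lines of
# a tab-delimited file, header line first, and returns the names of
# the columns that never hold a non-empty value.
sub empty_columns {
    my @lines = @_;
    return unless @lines;          # no header line, nothing to report
    chomp @lines;

    my @header = split /\t/, shift @lines;
    my @seen   = (0) x @header;    # one counter per column number

    foreach my $line (@lines) {
        # a limit of -1 keeps trailing empty fields
        my @field = split /\t/, $line, -1;
        foreach my $index ( 0 .. $#header ) {
            $seen[$index]++
                if defined $field[$index] && length $field[$index];
        }
    }

    # keep the header names whose counter stayed at zero
    return map { $header[$_] } grep { !$seen[$_] } 0 .. $#header;
}

# same sample rows as the test data above, with explicit \t escapes
print "$_\n" for empty_columns(
    "name\tcity\tage\tfuzz\ttax\tjob\n",
    "Lars\tTijuana\t38\t\t\tperl hacker\n",
    "Marco\tTecate\t28\t\t123\tartist\n",
);
```

Writing the tabs as \t escapes sidesteps the problem of literal tab
characters getting lost in transmission.  One difference from the
hash version: if two columns happened to share the same header name,
the array keeps their counters separate, while the hash would merge
them.
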
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
        Lawrence Statton - [EMAIL PROTECTED] s/aba/c/g
Computer  software  consists of  only  two  components: ones  and
zeros, in roughly equal proportions.   All that is required is to
sort them into the correct order.


