On Thu, Jun 26, 2008 at 9:42 AM, yitzle <[EMAIL PROTECTED]>
wrote:

> On Thu, Jun 26, 2008 at 8:18 AM, vikingy <[EMAIL PROTECTED]> wrote:
> > Hi all,
> >
> >   I have two files: one is a label file, the other is a thickness file.
> > They are in one-to-one correspondence, for example:
> >   the label file is:      2    2    3    2    1    3    4    5    2    5    1    4    ......
> >   the thickness file is:  0.3  0.8  0.2  0.1  2.4  0.9  3.2  0.2  0.1  0.3  2.1  2.3  ......
> >  Now I want to calculate the sum and mean thickness for each label,
> > like this:
> >   label 1 : (2.4+2.1)/2
> >   label 2 : (0.3+0.8+0.1+0.1)/4
> >   label 3 : (0.2+0.9)/2
> >   label 4 : (3.2+2.3)/2
> >   label 5 : (0.2+0.3)/2
> >   .......
> > There is also an index [3 4] to select labels, so in the end
> > I want to get the sum and mean of (0.2+0.9)/2 and (3.2+2.3)/2.
> >
> > I'm a beginner at Perl and don't know how to implement this.
> > Could you give me some suggestions? Thanks in advance!
> >
> > bin lv
>
> For the first part of your question:
> If the info in the file is all on one line, you can either (1) read in
> the entire line and use the split() function to split it up into an
> array or (2), if there is a fixed number of spaces between records,
> set
> local $/ = " ";
> so that each read returns one record (see perldoc perlvar).
> For a large amount of data, the second method avoids holding the
> whole line in memory at once.
>
> You can then use a hash to store the information for each label.
>
> __CODE__
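> my %data;
> open my $thicknessFH, '<', 'thickness.txt' or die "Cannot open thickness.txt: $!";
> open my $labelFH, '<', 'labels.txt' or die "Cannot open labels.txt: $!";
> local $/ = " "; # Read one space-terminated record at a time
>
> # Process the files
> while ( my $label = <$labelFH> ) {
>    my $thickness = <$thicknessFH>;
>    # Strip the record separator and any trailing newline
>    s/\s+//g for $label, $thickness;
>    $data{ $label }{ 'count' }++;
>    $data{ $label }{ 'thickness' } += $thickness;
> }
>
> # Display results (mean thickness per label, labels in numeric order)
> for my $label ( sort { $a <=> $b } keys %data ) {
>    local $\ = "\n"; # Auto-append newlines to prints
>    print "Label $label: " . ( $data{ $label }{ 'thickness' } / $data{ $label }{ 'count' } );
> }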
> __END__
>
> I don't fully understand how you'd like to deal with indexes like [3 4]...

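Here is a minimal sketch of option (1) from the quoted reply: read the
whole line and split() it. To keep it self-contained, the two lines are
hard-coded from the sample values in the original post instead of being
read from files; the variable names and the "sum" key are my own choices.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data copied from the original post (first 12 values only).
# In real use these two strings would be read from the two files.
my $label_line = "2 2 3 2 1 3 4 5 2 5 1 4";
my $thick_line = "0.3 0.8 0.2 0.1 2.4 0.9 3.2 0.2 0.1 0.3 2.1 2.3";

# split ' ' skips leading whitespace and splits on runs of whitespace,
# so the number of spaces between values does not matter.
my @labels = split ' ', $label_line;
my @thicks = split ' ', $thick_line;
die "label/thickness counts differ\n" unless @labels == @thicks;

# Accumulate a count and a running sum for each label.
my %data;
for my $i ( 0 .. $#labels ) {
    $data{ $labels[$i] }{count}++;
    $data{ $labels[$i] }{sum} += $thicks[$i];
}

# Report sum and mean per label, labels in numeric order.
for my $label ( sort { $a <=> $b } keys %data ) {
    my ( $sum, $count ) = @{ $data{$label} }{qw(sum count)};
    printf "label %d: sum %.2f, mean %.3f\n", $label, $sum, $sum / $count;
}
```

Unlike setting $/ to a single space, this does not care whether the
values are separated by one space or several.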

Another way to handle the one-line input is to read the whole line at once
and then use a regex to extract each column.


__CODE__
#!/usr/bin/perl
use strict;
use warnings;
my $label_file = "label.in";
my $thickness_file = "thickness.in";

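open my $fp_l, "<", $label_file or die "Cannot open $label_file: $!";
my $label = <$fp_l>;
close $fp_l;

open my $fp_t, "<", $thickness_file or die "Cannot open $thickness_file: $!";
my $thick = <$fp_t>;
close $fp_t;

# Match every number on the line; no trailing whitespace is required,
# and thickness values may be integers ("2") as well as decimals ("0.3").
my @labels = ($label =~ /(\d+)/g);
my @thicks = ($thick =~ /(\d+(?:\.\d+)?|\.\d+)/g);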

my %data;
for my $i( 0 .. $#labels) {
  $data{$labels[$i]}{count}++;
  $data{$labels[$i]}{thick} += $thicks[$i];
}

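for my $key (sort {$a <=> $b} keys %data) {
  # The mean is the summed thickness divided by the count, not by the label.
  print "label: ", $key, "\tcount: ", $data{$key}{count},
    "\tsum: ", $data{$key}{thick},
    "\tmean: ", $data{$key}{thick} / $data{$key}{count}, "\n";
}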

__END__
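
For the index [3 4] part of the question, here is one possible reading,
sketched under an assumption: the index picks out labels 3 and 4, and
the wanted result is the sum and the mean of those labels' mean
thicknesses. If something else was meant, only the last few lines need
to change. The %data hash is hard-coded here with the per-label totals
for the 12 sample values in the original post, in the same shape the
script above builds; @index and @means are names I made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Per-label totals for the 12 sample values in the original post,
# in the same shape (count, thick) that the script above builds.
my %data = (
    1 => { count => 2, thick => 4.5 },
    2 => { count => 4, thick => 1.3 },
    3 => { count => 2, thick => 1.1 },
    4 => { count => 2, thick => 5.5 },
    5 => { count => 2, thick => 0.5 },
);

# Assumption: "[3 4]" means "use only labels 3 and 4".
my @index = ( 3, 4 );

# Collect the mean thickness of each selected label.
my @means;
for my $label (@index) {
    # Skip labels that never occurred, to avoid dividing by zero.
    next unless exists $data{$label};
    push @means, $data{$label}{thick} / $data{$label}{count};
}

# Sum and mean of the selected per-label means.
my $sum = 0;
$sum += $_ for @means;
my $mean = @means ? $sum / @means : 0;

printf "selected labels: %s\n", join ' ', @index;
printf "sum of means: %.2f, mean of means: %.2f\n", $sum, $mean;
```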
