Pedro Soto wrote:
> Hi,
> I am trying to write script to retrieve info from a file that looks like:
> 
> col1   col2    col3
> A        5         10
> A        5         10
> A        5         11
> A        6          8
> A        7          9
> B        5          8
> B        6          9
> what i need is to get for each (non redundant) value from column 1, the
> corresponding non redundant values from column 2 and 3. e.g:
> For A (col 1), I want 5 -10, 5-11 and 6-8. For B: 5-8 and  6-9.
> I wrote a script to get rid of the redundant values using hashes and
> subroutines and it worked. However I still need to compare the elements from
> col2 and col3 with other values. To do this I want to sort the data, but I
> am struggling to sort the hash. It prints what I want but only if ask it to
> print within the subroutine (line 29). I do not know how to return a hash
> with the sorted values. I hope someone could help me out with this. The code
> is below:
> 
> 
> #! usr/local/bin/perl
> 
>  use warnings;
>  use strict;
>  my %db_del;
>  my %std_dup;
>  open(IN,"file.csv") || die;
>      while (<IN>) {
>      my @temp=split/,/;
>      push (@{$db_del{$temp[0]}}, $temp[1]."\t".$temp[2]);
>                       }
>      &NONRE(%db_del,%std_dup);
> 
> foreach my $e(%db_dup) {
> foreach my $l (@{$db_dup{$e}}) {
> print "$e,$l,$std_dup{$l}\n"; #does not print $std_dup{$l}
> }}
> 
>  ########sub##############
> sub NONRE {
> my %hash;
> my %seen;
> my @uniq;
> my %st;
> %hash = @_;
> foreach my $k (sort keys%hash) {
>        foreach my $item(@{$hash{$k}}) {
>               push(@uniq,$item) unless $seen{$item}++;
>               }
>        foreach my $item(@uniq) {
>         my @stend =split/\t/,$item;
>         $st{$stend[0]}= $stend[1];
>                }
>         @{$hash{$k}}= sort {$a <=> $b} keys%st;
>        foreach my $f(keys%hash){
>        foreach my $l(@{$hash{$f}}) {
>        print "$f,$l,$st{$l} ok\n";# it prints OK
>                      }
>               }
> }
> @uniq =();
> %seen =();
> return(%hash,%st);
> }

I don't think this does what you want, because the hash %st is keyed by the
values from column 2, so pairs like (5,10) and (5,11) cannot both exist in %st:
the second simply overwrites the first. But you do pass in a hash called
%std_dup, so perhaps you intended to use something like that.

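You can pass a single hash to or from a subroutine as a simple list, because
Perl flattens all the arguments into the one list @_. So you successfully
passed in %db_del, for instance (%std_dup was still empty at that point and
added nothing to the list), but if you need to keep two or more hashes
separate you must pass them by reference.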
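
To make the difference concrete, here is a minimal sketch. The subroutine
names count_keys and count_flattened and the sample data are made up for the
illustration and are not part of your script.

```perl
use strict;
use warnings;

# Hypothetical helper: receives two hash references and counts the keys of each.
sub count_keys {
    my ($first, $second) = @_;
    return (scalar(keys %$first), scalar(keys %$second));
}

# Hypothetical helper: receives hashes passed directly, which arrive in @_
# as one flat list, so they can only be read back as a single merged hash.
sub count_flattened {
    my %merged = @_;
    return scalar(keys %merged);
}

# Made-up sample data
my %db_del  = (A => ["5\t10"], B => ["5\t8"]);
my %std_dup = (5 => 10);

my ($n1, $n2) = count_keys(\%db_del, \%std_dup);
print "by reference: $n1 and $n2 keys\n";
print "flattened: ", count_flattened(%db_del, %std_dup), " keys in one hash\n";
```

With references the subroutine can still tell the two hashes apart; with the
flattened list it cannot know where one hash ends and the next begins.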

Having said that, I don't see any reason to pass in %std_dup, as it seems to be
only a return value. Remember that Perl doesn't hand its return values back
through the parameters like this. It is possible to modify the elements of the
@_ array, which are aliases for the arguments that were passed, but that isn't
recommended unless you know what you're doing. Collect the return values from
a subroutine with a simple assignment, like this:

  my $return = subroutine($p1, $p2);

and if you need to pass back two hashes, you could write

  return \%hash, \%st;

and then make the call like this:

  {
    my ($r1, $r2) = NONRE(%db_del);
    %db_del = %$r1;
    %std_dup = %$r2;
  }

Finally, the program below does what I think you want (it removes duplicate
records and prints the rest in sorted order), but you haven't said enough for
me to be sure.

HTH,

Rob



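use strict;
use warnings;

my %db_del;

open my $in, '<', 'file.csv' or die "Cannot open file.csv: $!";
while (<$in>) {
  chomp;
  next unless /\S/;    # skip blank lines
  my ($key, $f1, $f2) = split /,/;
  # Keying on the pair means a repeated record just overwrites itself
  $db_del{$key}{$f1,$f2} = [$f1, $f2];
}
close $in;

foreach my $key (sort keys %db_del) {

  # Sort numerically by column 2, then by column 3
  my @vals = sort {
    $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1]
  } values %{$db_del{$key}};

  foreach my $val (@vals) {
    print join(',', $key, @$val), "\n";
  }
}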
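
In case the hash key {$f1,$f2} in the program above looks odd: Perl treats a
comma-separated list in a hash subscript as join($;, $f1, $f2), so each
distinct pair gets its own key and a duplicate pair lands on the same key.
Here is a small sketch of just that step. The name unique_pairs is made up for
the illustration, and the rows are the column 2 and 3 values for A from your
sample.

```perl
use strict;
use warnings;

# Hypothetical helper: takes [col2, col3] pairs and returns the distinct
# ones, sorted numerically by col2 and then by col3.
sub unique_pairs {
    my @pairs = @_;
    my %seen;
    for my $pair (@pairs) {
        my ($f1, $f2) = @$pair;
        # Equivalent to $seen{ join($;, $f1, $f2) }
        $seen{$f1, $f2} = [$f1, $f2];
    }
    my @sorted = sort {
        $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1]
    } values %seen;
    return @sorted;
}

# Rows for key A from the sample data in the question
my @rows = ([5, 10], [5, 10], [5, 11], [6, 8], [7, 9]);
print join('-', @$_), "\n" for unique_pairs(@rows);
```

The duplicate 5-10 row is printed only once, and 5-10 and 5-11 both survive
because they have different keys.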
