On Jan 21, 2008 1:27 PM, Kevin Viel <[EMAIL PROTECTED]> wrote:
snip
> > while (<DATA>) {
> > my ($snp, $genotype) = split;
> > $data{$snp}{$genotype}++
> > }
snip
> > $data{$snp}{$genotype}++
>
> Is the semicolon unnecessary for this line?
snip
The semicolon in Perl is a statement separator, not a statement
terminator (i.e. Pascal-like rather than ANSI C-like). So the last
statement in a block does not require a semicolon; however, the lack
of a semicolon on that line was an oversight on my part (I am from an
ANSI C background so I tend to treat it like a terminator) even though
the code is fine. The comma is similarly lenient: a trailing comma in
a list is ignored. Saying
my @a = (1,2,3,4,5,);
is the same as saying
my @a = (1,2,3,4,5);
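A minimal sketch of both points. The names @with_comma, @without_comma and $last are made up for illustration and are not from the thread:

```perl
use strict;
use warnings;

# A trailing comma in a list is ignored, so both arrays hold five elements.
my @with_comma    = (1, 2, 3, 4, 5,);
my @without_comma = (1, 2, 3, 4, 5);

my $last;
{
    my $x = 2;
    $last = $x * 21    # last statement in the block: no semicolon needed
}

print scalar(@with_comma), " ", scalar(@without_comma), " $last\n";
```

Leaving the semicolon in place anyway is the safer habit, since adding a statement after that line later would otherwise break the block.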
snip
> So, if I understand correctly $data{$snp} is a value in a hash. That value
> is a scalar that happens, in this case, to be a reference to an anonymous
> hash? The key of this anonymous hash is $genotype? It does not seem like
> the simplicity of this lines relates the complexity of the object:
>
> $data{ $snp }
> $data{ $snp }{ $genotype }
snip
I think I understand what you are saying, and it seems correct. I am
going to restate everything about $data{$snp}{$genotype} just to make
sure we are on the same page.
%data is a hash.
$snp is a scalar
$data{$snp} yields a hash ref
$data{$snp}->{$genotype} (aka $data{$snp}{$genotype}) yields a scalar value
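To check that breakdown yourself, here is a small sketch. The sample values 'rs1' and 'CT' are placeholders I picked, not data from your file:

```perl
use strict;
use warnings;

my %data;
my ($snp, $genotype) = ('rs1', 'CT');   # hypothetical sample values

$data{$snp}{$genotype}++;               # Perl creates the inner hash on demand

print ref($data{$snp}), "\n";           # the outer value is a HASH reference
print $data{$snp}->{$genotype}, "\n";   # arrow form
print $data{$snp}{$genotype}, "\n";     # same element, arrow omitted
```

The inner hash is never declared; Perl creates it the first time the two-level element is assigned to (autovivification), which is why the one-line increment is enough.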
>
>
> ####
>
> Consider the code below:
>
> #! /usr/bin/perl
>
> use strict ;
> use warnings ;
>
> use Data::Dumper ;
>
> my %outer ;
>
> while ( <DATA> ) {
> my ( $snp , $genotype ) = split , /\s+/ ;
snip
The default pattern for split is ' ' (a single space), which behaves
like /\s+/ except that leading whitespace is skipped first. It is
generally preferable to say
my ($snp, $genotype) = split;
unless you need the empty leading field that /\s+/ produces for lines
that start with whitespace.
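A short sketch of the difference, using a made-up input line with leading spaces:

```perl
use strict;
use warnings;

my $line = "  1 CT\n";   # hypothetical input with leading spaces

my @default = split ' ', $line;     # what a bare split does, applied to $line
my @regex   = split /\s+/, $line;   # keeps an empty leading field

print scalar(@default), "\n";   # 2 fields: "1", "CT"
print scalar(@regex), "\n";     # 3 fields: "", "1", "CT"
```

With /\s+/ the first variable in the list assignment would receive the empty string, which is rarely what you want.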
snip
> My goal is to print all of the SNP (keys of outer) that have more than two
> alleles (keys of the second anonymous hash). How can I achieve this?
>
> for my $SNP_keys ( sort { $a cmp $b ) keys %outer ){
>
> my $num_alleles = keys _______ ;
>
> }
>
> The values of outer, for instance $outer{ $num_alleles }, are scalar
> (references to anonymous hashes). However, I cannot seem to dereference
> them.
snip
I am not sure which count you wanted, so I gave you both. If you want
to know how many keys are in a given hash you can call keys in a
scalar context:
my $count = keys %hash;
For a hashref, you need to tell Perl to treat it as a hash by
dereferencing it with %{} (the braces can be dropped when the
reference is a simple scalar variable, as in %$hashref):
my $count = keys %{$hashref};
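The three forms side by side, as a sketch with a made-up hash:

```perl
use strict;
use warnings;

my %hash    = (CC => 1, CT => 1, TT => 1);   # hypothetical contents
my $hashref = \%hash;

my $count_hash = keys %hash;         # scalar context: number of keys
my $count_ref  = keys %{$hashref};   # explicit dereference
my $count_bare = keys %$hashref;     # braces dropped for a simple scalar

print "$count_hash $count_ref $count_bare\n";
```

The braces are required once the reference is an expression rather than a plain variable, for example keys %{$outer{$snp}}.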
When you want to iterate over all of the keys in a hash you can use
the keys in a list context:
for my $key (keys %hash) {
}
If it is a hash of hashes you can just keep nesting (remembering to use %{}):
for my $k1 (keys %hash) {
    for my $k2 (keys %{$hash{$k1}}) {
        for my $k3 (keys %{$hash{$k1}{$k2}}) {
            print "the value at $k1, $k2, $k3 is $hash{$k1}{$k2}{$k3}\n";
        }
    }
}
Since hash keys are unordered, it is often useful to sort them before
iterating over them.
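One thing to watch: a plain sort compares keys as strings. If your SNP identifiers are numbers you may prefer a numeric sort. A sketch with made-up keys:

```perl
use strict;
use warnings;

my %hash = (10 => 'x', 2 => 'y', 1 => 'z');   # hypothetical keys

my @as_strings = sort keys %hash;                 # string order:  1, 10, 2
my @as_numbers = sort { $a <=> $b } keys %hash;   # numeric order: 1, 2, 10

print "@as_strings\n";
print "@as_numbers\n";
```

The script that follows uses the plain string sort, which is fine for the two SNPs in the sample data.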
#! /usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
# This needs a better, more descriptive name,
# maybe %alleles_by_snp; I don't know your
# domain well enough to make a better suggestion.
my %outer;
while ( <DATA> ) {
    my ($snp, $genotype) = split;
    # count each character of the genotype as one allele
    $outer{$snp}{$_}++ for $genotype =~ /(.)/g;
}
for my $snp (sort keys %outer) {
    # number of distinct alleles seen for this SNP
    my $count = keys %{$outer{$snp}};
    # total number of allele observations for this SNP
    my $total = 0;
    $total += $outer{$snp}{$_} for keys %{$outer{$snp}};
    print "snp $snp has $count different alleles and a total of $total\n";
}
__DATA__
1 CC
1 CT
1 TT
1 NN
2 CC
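For the original question (only the SNPs with more than two alleles), a sketch that filters on the key count. It rebuilds %outer from an in-code copy of the sample data instead of reading DATA, and @lines and @multi are names I made up. Whether N should count as an allele is a domain decision I am assuming here, not something I know:

```perl
use strict;
use warnings;

# Same shape as %outer in the script: SNP => { allele => count }.
my %outer;
my @lines = ("1 CC", "1 CT", "1 TT", "1 NN", "2 CC");   # copy of the sample data
for my $line (@lines) {
    my ($snp, $genotype) = split ' ', $line;
    $outer{$snp}{$_}++ for split //, $genotype;   # one entry per character
}

# Keep only the SNPs whose inner hash has more than two keys.
my @multi = grep { scalar(keys %{ $outer{$_} }) > 2 } sort keys %outer;

print "snp $_ has more than two alleles\n" for @multi;
```

With the sample data only SNP 1 is printed (alleles C, T and N); SNP 2 has a single allele.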