On Jan 21, 2008 1:27 PM, Kevin Viel <[EMAIL PROTECTED]> wrote:
snip
> > while (<DATA>) {
> >     my ($snp, $genotype) = split;
> >     $data{$snp}{$genotype}++
> > }
snip
> > $data{$snp}{$genotype}++
>
> Is the semicolon unnecessary for this line?
snip

The semicolon in Perl is a statement separator, not a statement
terminator (i.e. Pascal-like rather than ANSI C-like), so the last
statement in a block does not require one. The code is fine as written,
but leaving the semicolon off that line was an oversight on my part (I
come from an ANSI C background, so I tend to treat it like a
terminator). The comma is similarly forgiving in Perl: a trailing comma
in a list is ignored. Saying

    my @a = (1,2,3,4,5,);

is the same as saying

    my @a = (1,2,3,4,5);

snip
> So, if I understand correctly $data{$snp} is a value in a hash.  That value
> is a scalar that happens, in this case, to be a reference to an anonymous
> hash?  The key of this anonymous hash is $genotype?  It does not seem like
> the simplicity of this lines relates the complexity of the object:
>
> $data{ $snp }
> $data{ $snp }{ $genotype }
snip

I think I understand what you are saying, and it seems correct. I am
going to restate everything about $data{$snp}{$genotype} just to make
sure we are on the same page.

%data is a hash
$snp is a scalar
$data{$snp} yields a hash ref
$data{$snp}->{$genotype} (aka $data{$snp}{$genotype}) yields a scalar value

>
>
> ####
>
> Consider the code below:
>
> #! /usr/bin/perl
>
> use strict ;
> use warnings ;
>
> use Data::Dumper ;
>
> my %outer ;
>
> while ( <DATA> ) {
>   my ( $snp , $genotype ) = split , /\s+/ ;
snip

The default pattern for split is ' ' (a single space), which behaves
like /\s+/ plus some magic to make whitespace at the start of the line
disappear. It is generally preferable to say

    my ($snp, $genotype) = split;

unless you actually want the empty leading field that /\s+/ produces
when a line starts with whitespace.

snip
> My goal is to print all of the SNP (keys of outer) that have more than two
> alleles (keys of the second anonymous hash).  How can I achieve this?
>
> for my $SNP_keys ( sort { $a cmp $b ) keys %outer ){
>
>   my $num_alleles = keys _______ ;
>
> }
>
> The values of outer, for instance $outer{ $num_alleles }, are scalar
> (references to anonymous hashes).  However, I cannot seem to dereference
> them.
snip

I am not sure which count you wanted, so I gave you both.

If you want to know how many keys are in a given hash, you can call
keys in scalar context:

    my $count = keys %hash;

For a hashref, you need to tell Perl to use the hashref like a hash
with %{} (in some cases you don't need the {}):

    my $count = keys %{$hashref};

When you want to iterate over all of the keys in a hash, you can use
keys in list context:

    for my $key (keys %hash) {
    }

If it is a hash of hashes, you can just keep nesting (remembering to
use %{}):

    for my $k1 (keys %hash) {
        for my $k2 (keys %{$hash{$k1}}) {
            for my $k3 (keys %{$hash{$k1}{$k2}}) {
                print "the value at $k1, $k2, $k3 is $hash{$k1}{$k2}{$k3}\n";
            }
        }
    }

Since hash keys are unordered, it is often useful to sort them before
iterating over them.

#! /usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

#this needs a better more descriptive name
#maybe %alleles_by_snp, I don't know your
#domain well enough to make a better suggestion
my %outer;

while ( <DATA> ) {
    my ($snp, $genotype) = split;
    #count every single-character allele in the genotype
    $outer{$snp}{$_}++ for $genotype =~ /(.)/g;
}

for my $snp (sort keys %outer) {
    #number of distinct alleles seen for this snp
    my $count = keys %{$outer{$snp}};
    #total number of alleles seen for this snp
    my $total = 0;
    $total += $outer{$snp}{$_} for keys %{$outer{$snp}};
    print "snp $snp has $count different alleles and a total of $total\n";
}

__DATA__
1 CC
1 CT
1 TT
1 NN
2 CC
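
The script above reports counts for every SNP. If what you want is only
the SNPs with more than two alleles, a minimal sketch along these lines
might do. The helper name snps_with_more_than and the @lines array are
made up for illustration (the array just stands in for the DATA section
so the example is self-contained), and I am assuming that N counts as
an allele, which you will have to judge for your own data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the original script: returns the
# sorted keys of a hash of hashes whose inner hash has more than $min keys.
sub snps_with_more_than {
    my ($by_snp, $min) = @_;
    return grep { scalar(keys %{ $by_snp->{$_} }) > $min }
           sort keys %{$by_snp};
}

# Same sample data as the DATA section, held in an array (an assumption
# made only to keep this sketch self-contained).
my @lines = ('1 CC', '1 CT', '1 TT', '1 NN', '2 CC');

my %outer;
for my $line (@lines) {
    my ($snp, $genotype) = split ' ', $line;
    # Count each single-character allele, N included.
    $outer{$snp}{$_}++ for $genotype =~ /(.)/g;
}

# Report only the SNPs with more than two distinct alleles.
for my $snp (snps_with_more_than(\%outer, 2)) {
    my $count = keys %{ $outer{$snp} };
    print "snp $snp has $count different alleles\n";
}
```

With the sample data this prints one line, "snp 1 has 3 different
alleles" (C, T and N). If N is a no-call rather than an allele, skip it
when filling %outer and SNP 1 drops to two.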