Hi,
I need some help with a hash of array of array.
this is my input data structure:
gene a al1 data1 data2 data9
gene b al2 data3 data4 data10
gene b al3 data5 data6 data12
gene b al4 data7 data8 data12
I take each data variable, see above, from a sql query and parse the data
to build a new data structure: a hash of arrays of arrays.
In the input data presented here, the first column will be the key of the
hash and the other 4 columns should compose 4 arrays.
example :Each gene (gene a gene b ..) should be the keys, the al column
should be the first array, data1,3,5,7 should be in the second array and
so on for the third and fourth array. further more the data in the arrays
should be unique. Data should be grouped depending on their keys, I am
expecting this structure below, furthermore each variable in an array
should be unique. ex: for gene b key in the 4th array data12 should appear
once.
Thanks for any tips.
Nat
this is what I would like to achieve (dataDumper format)
$VAR1 = {
gene b => [
[
'al4','al2','al3'
],
[
'data7','data3','data5'
],
[
'data8','data4','data6'
],
[
'data12','data10'
]
],
gene a => [
['al1'],
['data1'],
['data2'],
['data9']
]
};
using the script below and the data dumper, I create a structure which is
not correct some of the data is erased and other grouped wrongly, 'gene a'
data goes into 'gene b' . gene a is empty.
$VAR1 = {
gene b => [
[
'al1',
'al2',
'al3'
],
[
'data1',
'data3',
'data5'
],
[
'data2',
'data4',
'data6'
],
[
'data9',
'data10',
'data12'
]
],
gene a => [
[],
[],
[],
[]
]
};
here is my code
#!/usr/local/bin/perl
use strict;
use warnings;
use DBI;
use Data::Dumper;
use List::MoreUtils qw(uniq);
my $subrow_hash;
my $row_hash;
my %hasharray;
my $geno_dbh = DBI->connect( credential...} ) || die "Database
connection not made: $DBI::errstr";
print STDERR "Connection...\n";
my $subsql = "SELECT * FROM table_2015";
#this is the table structure
#gene a al1 data1 data2 data9
#gene b al2 data3 data4 data10
#gene b al3 data5 data6 data12
#gene b al4 data7 data8 data12
my $subresult = $geno_dbh->prepare($subsql);
$subresult->execute() or die "SQL Error: $DBI::errstr\n";
my @gene_name_list;
my @allele_list;
my @mp_list;
my @mp_list_def;
my @unique_mp_def;
my $Xref;
my @unique_gene_name_list ; # unique gene only
my @unique_allele_list;# unique allele only
my @unique_mp_list;# unique MP terms only
my $list;
while ( $subrow_hash = $subresult->fetchrow_hashref) {
my $symbol_id=$subrow_hash->{symbol};#this is the first set of data for
the first array
my $allele_id=$subrow_hash->{allele_symbol};#this is the 2nd set of data
for the 2nd array
my $mp_id=$subrow_hash->{phenotype_acc};#this is the 3rd set of data for
the 3rd array
my $mp_def=$subrow_hash->{name};#this is the fourth set of data for the
fourth array
$Xref=$subrow_hash->{xref_acc}; #this is the key of the hash (gene a and
gene b)
$list=[[@gene_name_list], [@allele_list], [@mp_list],
[@mp_list_def]];#create arrays of arrays
if ($Xref){
$hasharray{$Xref} = $list; #create a hash of arrays of arrays
for a
specific key
@gene_name_list =@{$list->[0]}; #maybe not necessary to declare
this.
@allele_list =@{$list->[1]};
@mp_list =@{$list->[2]};
@mp_list_def =@{$list->[3]};
if ($symbol_id){
push (@gene_name_list, $symbol_id); #fill arrays with data
}
if ($allele_id){
push (@allele_list, $allele_id);
}
if ($mp_id){
push (@mp_list, $mp_id);
}
if ($mp_def){
push (@mp_list_def, $mp_def);
}
}
}
print Dumper(\%hasharray);
__END__
--
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
http://learn.perl.org/