----- Original Message -----
From: "Charles K. Clarkson" <[EMAIL PROTECTED]>
To: "'Öznur Tastan'" <[EMAIL PROTECTED]>; "'Perl Lists'" <[EMAIL PROTECTED]>
Sent: Tuesday, February 24, 2004 10:56 PM
Subject: RE: how to push a double dimensional array

> Öznur Tastan <[EMAIL PROTECTED]> wrote:
> :
> : ----- Original Message -----
> : From: "Charles K. Clarkson" <[EMAIL PROTECTED]>
> : To: "'Öznur Tastan'" <[EMAIL PROTECTED]>; "'Perl Lists'"
> : <[EMAIL PROTECTED]>
> : Sent: Tuesday, February 24, 2004 8:59 PM
> : Subject: RE: how to push a double dimensional array
> :
> :
> : > Öznur Tastan <[EMAIL PROTECTED]> wrote:
> : > :
> : > :
> : > : Ok: I am starting from the beginning.
> : >
> : > I thought I had accidentally discouraged you.
> : >
> : >
> : > : The alignment is an alignment of two strings so that similar
> : > : characters according to a scoring system (this is out of scope
> : > : I think) should be in the same register and there can be gaps
> : > : in one of the strings, denoted by a dash.
> : >
> : > What do you mean by "should be in the same register"? What
> : > is a register?
> :
> : Like in the alignment below, A is in the same 'register' (bad
> : English, hmm :) with - but seq1 and seq2 have this information
> : already, so when I know seq1 and seq2 I don't need any further
> : information to visualize the alignment.
> : Three features are enough to define my alignment
> :
> : AADALLL
> : - - EVLLL
> : the alignment is there (this is pairwise alignment of protein
> : sequences that can be calculated by dynamic programming of two
> : strings)
>
> You are aware that this is the first time you have actually come
> out and said this was biology related, right?

Yes I know, but I didn't want to bother you all with details.
After all, a protein sequence is a string of characters :)

> Have you checked out
> http://bio.perl.org/? I don't know enough about your field to know
> if your current application is already solved there.

Yes, I have already checked there.

> Okay, so we have established that each alignment has three
> characteristics and we are trying to see how best to arrange
> the alignments in a data structure that will ease your manipulation
> of those alignments.
>
>
> : > : So my alignments have three features (seq1, seq2 and the score
> : > : of the alignment)
> : > : seq1: AADALLL
> : > : seq2: - - EVLLL
> : > : score: 12
> : > :
> : > : There are groups of alignments that I want to keep separated.
> : > : Say I have 15 alignments (the numbers and distributions can
> : > : change with different input strings)
> : > : 3 of the alignments belong to the 1st group
> : > : 4 of them belong to the 2nd group
> : > : 5 of them belong to the 3rd
> : > : 2 of them belong to the 4th
> : > : 1 of them belongs to the 5th
> : >
> : > What is a group of alignments grouped by? Is it by score
> : > or some other quality?
> :
> : No, not score. Some other quality related to the origins of the
> : sequences (again related to biology).
> : So the alignments of certain sequence pairs should belong to one
> : group and the others to another. For the scope of the problem,
> : believe me, it is not important.
>
> So, in this example we have 15 alignments and they can be
> grouped by their origin.
>
> Are the group names alphanumeric or numbered? If numbered, are
> they in sequence? Assuming they are not a numbered sequence, I
> would think the top level of your structure would be a hash.
>
>
> (
>     group_name_1 => ( '3 alignments' ),
>     group_name_2 => ( '4 alignments' ),
>     group_name_3 => ( '5 alignments' ),
>     group_name_4 => ( '2 alignments' ),
>     group_name_5 => ( '1 alignments' ),
> )
>
>
> What goes inside each hash value depends on what you need
> to do with those values.
> I am still not clear what that is, but
> you have two basic choices:
>
> ( 'AADALLL', '- - EVLLL', 12 )
>
> Or:
>
> (
>     seq1  => 'AADALLL',
>     seq2  => '- - EVLLL',
>     score => 12
> )
>
> A hash of an array of arrays might look like this:
> (
>     group_name_1 => [
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>     ],
>     group_name_2 => [
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>     ],
>     group_name_3 => [
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>     ],
>     group_name_4 => [
>         [ 'AADALLL', '- - EVLLL', 12 ],
>         [ 'AADALLL', '- - EVLLL', 12 ],
>     ],
>     group_name_5 => [
>         [ 'AADALLL', '- - EVLLL', 12 ],
>     ],
> )
>
> Of course, the alignments would all be different and
> the group names would probably be more topical, but I don't know
> enough biology to make up other values.
>
> Assuming this data structure is named %groups, we access
> each 'seq1' of a group named $group with:
>
> my $group = 'group_name_1';
> foreach my $sequence ( @{ $groups{ $group } } ) {
>     print "$sequence->[0]\n";
>     # do something with each sequence
> }
>
>
> A hash of array of hashes might look like this:
>
> (
>     group_name_1 => [
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>     ],
>     group_name_2 => [
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>     ],
>     group_name_3 => [
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>     ],
>     group_name_4 => [
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>     ],
>     group_name_5 => [
>         {
>             seq1  => 'AADALLL',
>             seq2  => '- - EVLLL',
>             score => 12
>         },
>     ],
> )
>
>
> Assuming this data structure is named %groups, we access
> each 'seq1' of a group named $group with:
>
> my $group = 'group_name_1';
> foreach my $sequence ( @{ $groups{ $group } } ) {
>     print "$sequence->{seq1}\n";
>     # do something with each sequence
> }
>
>
> Unfortunately, I have no idea if I have helped or not.

No, you helped me exactly, but now let me ask one more question.
When using a hash of array of hashes or a hash of an array of arrays,
how can I insert a new element if I know which group it belongs to?
For example, how can I insert a new alignment with
seq1 AAAK, seq2 VVV- and score 10
into group_name_1 without keeping track of how many elements
group_name_1 contains? So something like "push"?

Thanks a lot,
oznur

> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> <http://learn.perl.org/> <http://learn.perl.org/first-response>
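
A minimal sketch of the kind of push asked about above, not a reply from the thread. It assumes the hash of an array of arrays is stored in %groups, as in the quoted examples; the second hash, %named, is a hypothetical name used here only to show the hash of array of hashes variant next to it. The sample values are the ones from the question.

```perl
use strict;
use warnings;

# Hash of an array of arrays, with one placeholder alignment already stored.
my %groups = (
    group_name_1 => [
        [ 'AADALLL', '- - EVLLL', 12 ],
    ],
);

# Dereference the group's array and push an anonymous array onto it.
# No element count or index is needed.
push @{ $groups{group_name_1} }, [ 'AAAK', 'VVV-', 10 ];

# Pushing to a group that does not exist yet creates it (autovivification).
push @{ $groups{group_name_2} }, [ 'AAAK', 'VVV-', 10 ];

# Hash of array of hashes (%named is a hypothetical name for this variant):
# the same push, with an anonymous hash instead of an anonymous array.
my %named = (
    group_name_1 => [
        { seq1 => 'AADALLL', seq2 => '- - EVLLL', score => 12 },
    ],
);
push @{ $named{group_name_1} }, { seq1 => 'AAAK', seq2 => 'VVV-', score => 10 };

# The count is still available whenever it is wanted.
my $count = scalar @{ $groups{group_name_1} };
print "group_name_1 now holds $count alignments\n";
print "last seq1: $named{group_name_1}[-1]{seq1}\n";
```

The only difference between the two variants is the kind of anonymous reference being pushed: square brackets for an array of fields, curly braces for named fields.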