Re: Pattern matching problem

Henry Todd Thu, 26 Feb 2004 03:36:15 -0800

On 2004-02-26 00:43:21 +0000, [EMAIL PROTECTED] (Wolf Blaum) said:

As I understand biology, there are 4 nucleotides, which gives 4**2 = 16 combinations for doublets, so you need 16 vars to count the occurrences of all doublets. Worse for triplets (4**3 = 64). As I understand genetics, triplets are what matter, since the RNA transcriptase reads triplets as code for amino acids. You might give me updates on my biol. knowledge :-)

Wolf -

It's been a while since my A-Level biology days, but I believe you're correct. However, this particular coursework was to create two programs for a different purpose than I think you're imagining:

transition.pl: returns tables of transition probabilities for plus and minus models (exon and non-exon regions) as well as beta values (log-odds ratios) to compare the two models.
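As a rough illustration of the beta values, here is a minimal sketch in Python (not the Perl from the coursework). It takes the two transition tables as given and assumes the log-odds ratio is log2(plus / minus) for each dinucleotide; the base of the logarithm, the function name log_odds and the dictionary layout are all assumptions, since none of them is specified here.

```python
import math


def log_odds(plus, minus):
    """Return {pair: log2(plus[pair] / minus[pair])}.

    `plus` and `minus` are hypothetical dicts mapping a dinucleotide
    such as "AT" to its transition probability under each model.
    Pairs with a zero probability in either model are skipped, because
    the ratio (or its logarithm) is undefined there.
    """
    beta = {}
    for pair, p_plus in plus.items():
        p_minus = minus.get(pair, 0.0)
        if p_plus > 0 and p_minus > 0:
            beta[pair] = math.log2(p_plus / p_minus)
    return beta
```

A positive value means the transition is more likely under the plus model, a negative value that it is more likely under the minus model.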

The transition probability for AT for example (the probability that adenine will be followed by thymine) is calculated thus:

tp(AT) = |AT| / |A_|

That is, the total number of occurrences of "AT" divided by the total number of occurrences of "A" followed by anything.
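The calculation can be sketched in a few lines of Python (again, not the original Perl). This is one possible way to do the counting, assuming overlapping dinucleotides and a sequence made up only of A, C, G and T; the function name transition_probabilities is made up for the example.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def transition_probabilities(sequence):
    """Return {"XY": tp(XY)} estimated from overlapping dinucleotides."""
    seq = sequence.upper()
    # Count every overlapping pair, e.g. "ATAT" gives AT, TA, AT.
    pair_counts = Counter(zip(seq, seq[1:]))
    table = {}
    for x in NUCLEOTIDES:
        # |X_|: occurrences of X followed by any nucleotide.
        row_total = sum(pair_counts[(x, y)] for y in NUCLEOTIDES)
        for y in NUCLEOTIDES:
            # A nucleotide that is never followed by anything gets 0.0
            # across its row (an assumption made to avoid dividing by zero).
            table[x + y] = pair_counts[(x, y)] / row_total if row_total else 0.0
    return table
```

For the sequence "ATATAC" the pairs are AT, TA, AT, TA, AC, so tp(AT) = 2/3, tp(AC) = 1/3 and tp(TA) = 1.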

The program can also write the transition probabilities to a file to be used as input for the other program...

simulation.pl: which asks the user to specify the length of the sequence they want, then generates it according to the model file used as input (by simulating a Markov chain). So if you supply a file containing the transition probabilities of a typical exon (coding) region, the simulation will use them to generate a typical exon sequence.
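Simulating the Markov chain might look something like the sketch below, in Python rather than Perl. It assumes the model is a dict mapping each dinucleotide "XY" to tp(XY), and that the first nucleotide is supplied by the caller; how simulation.pl actually picks its starting nucleotide and reads the model file is not described here, so the function name simulate and its start parameter are hypothetical.

```python
import random


def simulate(length, table, start="A", rng=random):
    """Generate `length` nucleotides from a first-order Markov chain.

    `table` maps "XY" to the probability that X is followed by Y.
    `start` is the (assumed) first nucleotide of the sequence.
    """
    if length <= 0:
        return ""
    sequence = [start]
    while len(sequence) < length:
        current = sequence[-1]
        # Row of the transition table for the current nucleotide.
        weights = [table[current + y] for y in "ACGT"]
        # random.choices draws one nucleotide in proportion to the weights;
        # it raises ValueError if the whole row is zero.
        sequence.append(rng.choices("ACGT", weights=weights)[0])
    return "".join(sequence)
```

Each new nucleotide depends only on the one before it, which is what makes the output resemble the region the probabilities were taken from.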

Thanks very much to everyone who's offered further advice on this problem. I know now that my method of counting the dinucleotides in the input sequence is a little brain-dead. However, it works, and I've learnt from it. I'm looking forward to my next foray into the world of Perl.

Regards,

Henry.

