Dear all,


I am an R beginner, and I am looking for a way to apply the same calculation
to every level of a column in a table.



Basically, I have a bunch of protein sequences composed of different amino
acid residues, and each residue is represented by an uppercase letter. I
want to calculate the ratio of different amino acid residues at each
position of the proteins. Here is an example table:

Proteins  Time_zero  1  2  3  4  5  6  7  8
p1        0.0050723  L  E  Y  I  I  P  D  A
p2        0.0002731  T  E  N  L  V  P  G  A
p3        9.757E-05  L  M  Y  Q  I  P  E  C
p4        0.0002077  R  E  Y  L  I  S  E  A



If I save this table as myfile.txt and read it into R as myfile, I can use
the following script to calculate the ratio of one amino acid residue (here
L) at position 1, weighted by Time_zero:

# show the 3rd column, i.e. the residue types at position 1
> myfile[, 3]

# calculate the ratio of L
# (rows_L replaces the original name "list", which masks a base R function)
> rows_L <- which(myfile[, 3] == "L")
> time0total <- sum(myfile[, 2])
> AA_L <- 0
> for (i in seq_along(rows_L)) { AA_L <- AA_L + myfile[rows_L[i], 2] }
> ratio_L <- AA_L / time0total



So how can I write a script to do the same thing for the other two levels
(T and R) in column 3, and also do this for every column that contains
amino acid residues?
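
To make the goal concrete, here is a minimal, self-contained sketch of one way this might be done. It is only an illustration under assumptions: the data frame is rebuilt by hand from the example table, the column names pos1 to pos8 and the helper residue_ratio are hypothetical, and tapply() and lapply() are just one of several possible approaches.

```r
# Example data rebuilt from the table above; pos1..pos8 are hypothetical
# labels for the eight residue positions.
myfile <- data.frame(
  Proteins  = c("p1", "p2", "p3", "p4"),
  Time_zero = c(0.0050723, 0.0002731, 9.757e-05, 0.0002077),
  pos1 = c("L", "T", "L", "R"),
  pos2 = c("E", "E", "M", "E"),
  pos3 = c("Y", "N", "Y", "Y"),
  pos4 = c("I", "L", "Q", "L"),
  pos5 = c("I", "V", "I", "I"),
  pos6 = c("P", "P", "P", "S"),
  pos7 = c("D", "G", "E", "E"),
  pos8 = c("A", "A", "C", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one position, sum the weights (Time_zero) within
# each residue type and divide by the total weight.
residue_ratio <- function(residues, weights) {
  tapply(weights, residues, sum) / sum(weights)
}

# Apply the helper to every residue column (everything except columns 1-2).
# The result is a named list with one vector of ratios per position.
ratios <- lapply(myfile[, -(1:2)], residue_ratio, weights = myfile$Time_zero)

ratios$pos1   # ratios of L, R and T at position 1
```

With this layout, ratios$pos1[["L"]] should correspond to ratio_L from the script above, and the ratios within each position sum to 1.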



Many thanks for any help you could give me on this topic! :)



Regards,

Zhao
-- 
Zhao JIN
Ph.D. Candidate
Ruth Ley Lab
467 Biotech
Field of Microbiology, Cornell University
Lab: 607.255.4954
Cell: 412.889.3675
