Re: [R] for loop and if problem

Philipp Pagel Tue, 06 Jan 2009 08:38:43 -0800

On Tue, Jan 06, 2009 at 07:21:48AM -0800, Sake wrote:
> I'm having difficulties with a dataset containing gene names and positions
> of those genes.
> Not such a big problem, but each gene has multiple exons so it's hard to say
> where the gene starts and where it ends. I want the starting and ending
> position of each gene in my dataset.
> Attached is the dataset:
> http://www.nabble.com/file/p21312449/genlistchrompos.csv
> Column 'B' is the gene name, 'G' is the starting position and 'H' is the
> stop position.


I don't really see how 'if' and 'for loops' are involved in the
question. You may want to give us a little more detail on what
exactly you need and what you tried unsuccessfully.  (By the way
-- there are no columns labeled 'B', 'G' or 'H' in the file).

Anyway - I believe this is what you are after:

# 'dat' is assumed to hold the attached CSV, e.g. read with read.csv()
# get minimum start position by gene
aggregate(dat[, 'Exon_Start.Chr.'], by=list(Gene=dat$Gene), FUN=min)
# get maximum stop position by gene
aggregate(dat[, 'Exon_Stop.Chr.'], by=list(Gene=dat$Gene), FUN=max)

Of course, these will only reflect the real start and stop
coordinates of the gene if ALL exons are given in the file.
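
If you would rather have start and stop side by side in one table,
something along these lines might do. This is only a sketch on made-up
toy data: the gene names and positions are invented, and the result
names (starts, stops, gene_range, Start, Stop) are my own choice. Only
the three input column names are taken from the calls above.

```r
# Toy data; column names mirror those used above, values are invented
dat <- data.frame(
  Gene            = c("geneA", "geneA", "geneB", "geneB", "geneB"),
  Exon_Start.Chr. = c(100, 250, 900, 1200, 1500),
  Exon_Stop.Chr.  = c(180, 400, 1000, 1300, 1650)
)

# Smallest exon start and largest exon stop per gene
starts <- aggregate(Exon_Start.Chr. ~ Gene, data = dat, FUN = min)
stops  <- aggregate(Exon_Stop.Chr.  ~ Gene, data = dat, FUN = max)

# Join the two results into one row per gene
gene_range <- merge(starts, stops, by = "Gene")
names(gene_range) <- c("Gene", "Start", "Stop")
print(gene_range)
```

For the toy data this gives 100-400 for geneA and 900-1650 for geneB.
The same caveat applies: the range is only as complete as the exon list.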

cu
        Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
