Dear all,
 
Probably a very basic question but I need some help.
I have a data frame (made by read.table from a text file) of microarray data, 
of which the first column is a factor and the rest of the columns are numeric. 
The factor column contains chromosome names, so values 1 through 22 plus X, Y 
and XY. The numeric columns contain positions or intensity measurements.
What I need to do is change the X's in the first column to a value of 23. 
 
This is what I thought I would do:
 
BAF_temp <- read.table("BAF_all.txt", sep="\t", header=TRUE)  # read in the table
# "in rows where the first column of BAF_temp is X, change the first column
# of BAF_temp to 23"
BAF_temp[,1][BAF_temp[,1]=="X"] <- 23
 
However with this last line I get an error: "Invalid factor level, NAs 
generated in '[<-.factor'('*tmp*', BAF_temp[,1]=="X", value=23)"
 
(I tested if my syntax for selecting the rows of chromosome X was correct by 
trying
BAF_X <- BAF_temp[BAF_temp[,1]=="X",]
which worked to give me a data frame with only the rows of the X chromosome.)
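
For reference, a minimal sketch that should reproduce the situation without my file. The data frame `toy` and its column names and values are invented for illustration; only the first column being a factor matters here.

```r
# Toy stand-in for BAF_temp; column names and values are invented.
toy <- data.frame(
  Chr      = factor(c("1", "2", "10", "X", "Y", "XY", "X")),
  Position = c(100, 200, 300, 400, 500, 600, 700),
  BAF      = c(0.1, 0.5, 0.9, 0.4, 0.0, 1.0, 0.6)
)

# Selecting the X rows works: comparing a factor with a string is fine.
toy_X <- toy[toy[, 1] == "X", ]

# Assigning 23 fails because "23" is not one of the existing factor levels.
broken <- toy
broken[, 1][broken[, 1] == "X"] <- 23   # gives the "invalid factor level" warning
n_missing <- sum(is.na(broken[, 1]))    # the X rows have become NA
```

So the row selection itself is fine; the NAs appear only at the assignment step.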
 
I then thought it might work better if I changed the data frame to a matrix.
When I change the BAF_temp data frame into a matrix (by
BAF_matrix <- as.matrix(BAF_temp)), the command I used above, applied to the
matrix:
BAF_matrix[,1][BAF_matrix[,1]=="X"] <- 23
works fine and the end result is as I meant it to be, with all the X's changed
into 23's.
However, by using as.matrix all columns are changed to 'character', including
the numeric measurements (I understand this is because one of the columns of
the data frame is a factor).
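
To illustrate what I mean, a small sketch on invented toy data (the names `toy` and `toy_matrix` are made up, this is not my real file). It shows the replacement working on the matrix, at the cost of every column becoming character.

```r
# Toy stand-in for BAF_temp; names and values are invented.
toy <- data.frame(
  Chr = factor(c("1", "2", "X")),
  BAF = c(0.1, 0.5, 0.9)
)

toy_matrix <- as.matrix(toy)

# The replacement now works, because the column is plain character.
toy_matrix[, 1][toy_matrix[, 1] == "X"] <- 23

typeof(toy_matrix)     # "character": a matrix can hold only one type
toy_matrix[, "Chr"]    # "1" "2" "23", stored as strings
sapply(toy, class)     # the data frame itself still has factor and numeric
```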
 
I would like some help on the best way to solve this. I have thought of a few
options myself and would like your comments:
1. Is there another syntax I can use on the data frame to change the X's to 
23's, so I don't have to change the data frame into a matrix first?
 
2. I could change the data frame into a matrix and run the syntax as I 
described, resulting in all columns becoming 'character'; is there then an easy 
way to turn the columns with measurements (columns 2 and further) back into 
'numeric' while leaving the first column with the chromosome numbers as 
'character'?
 
3. I thought of using data.matrix(BAF_temp) and making use of the fact that the
first column of factors would be changed to the underlying numbers (because X,
being the 23rd level in the list, would automatically be changed to 23). However,
because the levels (chromosome names) of the factor column are ordered as "1",
"10", "11", "12",...,"19", "2", "20", "21", "22", "3", "4", etc. (I see this when
using str(BAF_temp)), this results in chromosome 10 being changed into a value
of 2, chromosome 11 into 3, chromosome 2 into 12, etc. For info: the chromosome
names in the text file that is imported are ordered just 1, 2, 3, etc.
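
To make the three options concrete, here is a rough sketch of each on invented toy data. The object names (`toy`, `opt1`, `opt2`, `opt3`, `chr_levels`) are made up, and the level order used for option 3 (Y as 24, XY as 25) is only an assumption; none of this has been run on the real file.

```r
# Toy stand-in for BAF_temp; names and values are invented.
toy <- data.frame(
  Chr = factor(c("1", "2", "10", "X", "Y", "XY")),
  BAF = c(0.1, 0.5, 0.9, 0.4, 0.0, 1.0)
)

# Option 1: stay in the data frame and rename the factor level itself.
opt1 <- toy
levels(opt1$Chr)[levels(opt1$Chr) == "X"] <- "23"

# Option 2: go through a character matrix, then rebuild the column types.
m <- as.matrix(toy)
m[, 1][m[, 1] == "X"] <- "23"
opt2 <- data.frame(Chr = m[, 1], stringsAsFactors = FALSE)
opt2$BAF <- as.numeric(m[, "BAF"])   # measurements back to numeric

# Option 3: give the factor an explicit level order, so that the
# underlying integer codes match the chromosome numbers.
chr_levels <- c(as.character(1:22), "X", "Y", "XY")   # assumed ordering
opt3 <- toy
opt3$Chr <- as.integer(factor(as.character(toy$Chr), levels = chr_levels))
```

In this sketch option 1 leaves the first column as a factor (still containing Y and XY), option 2 leaves it as character, and option 3 turns it into integers, with X becoming 23 only because of the level order I chose.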
 
If anyone has some tips for me I would greatly appreciate it.
 
Best wishes,
Marije
 
 

