On Apr 11, 2012, at 2:01 PM, Jean V Adams wrote:

Alison,

Your code works fine on the first six lines of the data that you provided.

Rumino_Reps_agreeWalign <- data.frame(
       geneid = c("657313.locus_tag:RTO_08940",
               "457412.251848018",
               "657314.locus_tag:CK5_20630",
               "657323.locus_tag:CK1_33060",
               "657313.locus_tag:RTO_09690",
               "471875.197297106"),
       count_Conser = c(7, 1, 2, 1, 3, 0),
       count_NonCons = c(5, 4, 4, 0, 0, 2),
       count_ConsSubst = c(5, 3, 1, 1, 3, 1),
       count_NCSubst = c(1, 0, 0, 0, 1, 1))
gene.list <- strsplit(as.character(Rumino_Reps_agreeWalign$geneid), "\\.")
Rumino_Reps_agreeWalignTR <- transform(Rumino_Reps_agreeWalign,
       taxid=do.call(rbind, gene.list))

Perhaps in later rows of the data there are cases where there is no "."
in geneid? If not, can you provide a subset of your data that results in
the warning? Use the dput() function to share it.
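
One way to look for such rows is to count the pieces that strsplit() returns for each geneid. This is only a sketch: the last two geneid values below are made up for illustration, and lengths() needs R 3.2.0 or later (sapply(gene.list, length) does the same job in older versions).

```r
# Hypothetical geneid values: the third has no "." and the fourth has two.
geneid <- c("657313.locus_tag:RTO_08940", "457412.251848018",
            "nodot", "1.2.3")
gene.list <- strsplit(geneid, "\\.")

# Number of pieces per row; rbind() only gives a clean matrix if all are equal.
n.pieces <- lengths(gene.list)
bad <- which(n.pieces != 2)

# The offending values, ready to inspect or to pass to dput().
geneid[bad]
```

If bad comes back empty on the full data, the split is probably not the source of the warning.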

It's not a good idea to create an object named "strsplit". That will only
mask the function strsplit() in later runs.

Masking is not a problem unless the new object is itself a function (which wasn't the case here). The potential confusion is in the minds of users. When R evaluates a call such as strsplit(x, "\\."), it looks the name up as a function and skips any non-function objects it finds, so you can have a data object named 'strsplit' and it will not mask the function 'strsplit'.

--
David.
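
A small illustration of the point about masking, as a sketch with made-up values (the replacement function is hypothetical and exists only to show the contrast):

```r
# A character vector that happens to share the function's name.
strsplit <- c("a.b", "c.d")

# In call position R skips the vector and still finds base::strsplit();
# in argument position the same name refers to the vector.
parts <- strsplit(strsplit, "\\.")

# Only a function with the same name masks the original.
strsplit <- function(x, split) "masked"   # hypothetical replacement
masked.result <- strsplit("a.b", "\\.")

# Removing the user-defined object makes base::strsplit() visible again.
rm(strsplit)
restored <- strsplit("a.b", "\\.")
```

So the call works either way, but a line like strsplit(strsplit, ...) is hard to read, which is a good enough reason to pick another name.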

If time is an issue, a slightly faster way to do this, after the call to
strsplit(), is:
Rumino_Reps_agreeWalign$geneid.prefix <- sapply(gene.list, "[", 1)
Rumino_Reps_agreeWalign$geneid.suffix <- sapply(gene.list, "[", 2)
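
A possible variation on the two lines above uses vapply(), which checks that every result is a single character string. This sketch uses a hypothetical second geneid with no "." to show that indexing a missing piece gives NA instead of the ragged rows that trip up rbind().

```r
# The second value is hypothetical and has no "." to split on.
geneid <- c("657313.locus_tag:RTO_08940", "nodot")
gene.list <- strsplit(geneid, "\\.")

# Extract the first and second piece of each split; character(1) is the
# type and length that vapply() requires from every call to "[".
geneid.prefix <- vapply(gene.list, "[", character(1), 1)
geneid.suffix <- vapply(gene.list, "[", character(1), 2)

# Rows without a second piece show up as NA in the suffix.
which(is.na(geneid.suffix))
```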

Jean


alison waller wrote on 04/11/2012 08:23:29 AM:

Dear all,

I want to use string split to parse column names, however, I am having
some errors that I don't understand.
I see a problem when I try to rbind the output from strsplit.

please let me know if I'm missing something obvious,

thanks,
alison

here are my commands:
strsplit<-strsplit(as.character(Rumino_Reps_agreeWalign$geneid),"\\.")

Rumino_Reps_agreeWalignTR<-transform(Rumino_Reps_agreeWalign,
       taxid=do.call(rbind, strsplit))
Warning message:
In function (..., deparse.level = 1)  :
  number of columns of result is not a multiple of vector length (arg 1)


here is my data:

head(Rumino_Reps_agreeWalign)
                      geneid count_Conser count_NonCons count_ConsSubst
1 657313.locus_tag:RTO_08940            7             5               5
2           457412.251848018            1             4               3
3 657314.locus_tag:CK5_20630            2             4               1
4 657323.locus_tag:CK1_33060            1             0               1
5 657313.locus_tag:RTO_09690            3             0               3
6           471875.197297106            0             2               1
  count_NCSubst
1             1
2             0
3             0
4             0
5             1
6             1

here are the results from strsplit:
head(strsplit)
[[1]]
[1] "657313"              "locus_tag:RTO_08940"

[[2]]
[1] "457412"    "251848018"

[[3]]
[1] "657314"              "locus_tag:CK5_20630"

[[4]]
[1] "657323"              "locus_tag:CK1_33060"

[[5]]
[1] "657313"              "locus_tag:RTO_09690"

[[6]]
[1] "471875"    "197297106"

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
