Alison,

Your code works fine on the first six lines of the data that you provided.

Rumino_Reps_agreeWalign <- data.frame(
        geneid = c("657313.locus_tag:RTO_08940", 
                "457412.251848018", 
                "657314.locus_tag:CK5_20630", 
                "657323.locus_tag:CK1_33060", 
                "657313.locus_tag:RTO_09690", 
                "471875.197297106"), 
        count_Conser = c(7, 1, 2, 1, 3, 0),
        count_NonCons = c(5, 4, 4, 0, 0, 2), 
        count_ConsSubst = c(5, 3, 1, 1, 3, 1), 
        count_NCSubst = c(1, 0, 0, 0, 1, 1))
gene.list <- strsplit(as.character(Rumino_Reps_agreeWalign$geneid), "\\.")
Rumino_Reps_agreeWalignTR <- transform(Rumino_Reps_agreeWalign, 
        taxid=do.call(rbind, gene.list))

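Perhaps in later rows of the data there are cases where geneid has no "." 
or more than one?  rbind() gives that warning when the pieces it is 
stacking have different lengths.  If that is not it, can you provide a 
subset of your data that results in the warning?  Use the dput() function.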
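
One way to look for such rows, as a rough sketch: count the pieces that 
strsplit() returns for each geneid and list any id that does not split 
into exactly two.  The ids below are made up for illustration (one has no 
".", one has two); substitute your own Rumino_Reps_agreeWalign$geneid.

```r
# Hypothetical ids for illustration only:
# the second has no ".", the third has two
geneid <- c("657313.locus_tag:RTO_08940", "457412", "471875.197297106.1")
gene.list <- strsplit(as.character(geneid), "\\.")

# Number of pieces each id was split into
n.pieces <- sapply(gene.list, length)

# Positions of the ids that did not split into exactly two pieces
bad.rows <- which(n.pieces != 2)
geneid[bad.rows]
```

With your real data, something like dput(Rumino_Reps_agreeWalign[bad.rows, ]) 
should then give a small subset that you can paste into a reply.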

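It's not a good idea to create an object named "strsplit".  R will still 
find the function when you call strsplit(), but anywhere the bare name is 
used, as in do.call(rbind, strsplit), you get your list instead, which 
makes the code hard to follow and easy to break.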

If time is an issue, a slightly faster way to do this, once you have the 
strsplit() result, is:
Rumino_Reps_agreeWalign$geneid.prefix <- sapply(gene.list, "[", 1)
Rumino_Reps_agreeWalign$geneid.suffix <- sapply(gene.list, "[", 2)
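
If it turns out that some ids contain more than one ".", a possible 
alternative (only a sketch, and it assumes you want to split at the first 
"." and keep the rest together) is to use sub() instead of strsplit().  
The ids below are again made up for illustration.

```r
# Hypothetical ids: the last one contains two "." characters
geneid <- c("657313.locus_tag:RTO_08940", "457412.251848018",
        "471875.197297106.1")

# Everything before the first "."
geneid.prefix <- sub("\\..*$", "", geneid)
# Everything after the first "."
geneid.suffix <- sub("^[^.]*\\.", "", geneid)
```

Both results have one element per id, so they can be assigned straight to 
new columns.  Note that an id with no "." at all comes back unchanged 
from both calls, so it is still worth checking for those.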

Jean


alison waller wrote on 04/11/2012 08:23:29 AM:

> Dear all,
> 
> I want to use string split to parse column names, however, I am having 
> some errors that I don't understand.
> I see a problem when I try to rbind the output from strsplit.
> 
> please let me know if I'm missing something obvious,
> 
> thanks,
> alison
> 
> here are my commands:
>  > strsplit<-strsplit(as.character(Rumino_Reps_agreeWalign$geneid),"\\.")
>  > Rumino_Reps_agreeWalignTR<-transform(Rumino_Reps_agreeWalign,
>        taxid=do.call(rbind, strsplit))
> Warning message:
> In function (..., deparse.level = 1)  :
>    number of columns of result is not a multiple of vector length (arg 1)
> 
> 
> here is my data:
> 
>  > head(Rumino_Reps_agreeWalign)
>                        geneid count_Conser count_NonCons count_ConsSubst
> 1 657313.locus_tag:RTO_08940            7             5               5
> 2           457412.251848018            1             4               3
> 3 657314.locus_tag:CK5_20630            2             4               1
> 4 657323.locus_tag:CK1_33060            1             0               1
> 5 657313.locus_tag:RTO_09690            3             0               3
> 6           471875.197297106            0             2               1
>    count_NCSubst
> 1             1
> 2             0
> 3             0
> 4             0
> 5             1
> 6             1
> 
> here are the results from strsplit:
>  > head(strsplit)
> [[1]]
> [1] "657313"              "locus_tag:RTO_08940"
> 
> [[2]]
> [1] "457412"    "251848018"
> 
> [[3]]
> [1] "657314"              "locus_tag:CK5_20630"
> 
> [[4]]
> [1] "657323"              "locus_tag:CK1_33060"
> 
> [[5]]
> [1] "657313"              "locus_tag:RTO_09690"
> 
> [[6]]
> [1] "471875"    "197297106"


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
