Dear David,

It is showing this error:

data.frame(A = unlist(lapply( lapply( sapply(mydf[,5], strsplit,
+ split="a|A"), length) , "-", 1)),C = unlist(lapply( lapply( sapply((mydf[,5], strsplit, split="c|C"),
Error: unexpected ',' in:
"data.frame(A = unlist(lapply( lapply( sapply(mydf[,5], strsplit, split="a|A"), length) , "-", 1)),C = unlist(lapply( lapply( sapply((mydf[,5],"
> length) , "-", 1)),G = unlist(lapply( lapply( sapply((mydf[,5], strsplit,
> split="g|G"),
Error: unexpected ')' in "length)"
> length) , "-", 1)),T = unlist(lapply( lapply( sapply(mydf[,5], strsplit,
> split="t|T"),
Error: unexpected ')' in "length)"

What should I do?

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
________________________________________
From: David Winsemius [dwinsem...@comcast.net]
Sent: Saturday, July 02, 2011 2:07 AM
To: Bansal, Vikas
Subject: Re: [R] For help in R coding

On Jul 1, 2011, at 8:01 PM, Bansal, Vikas wrote:

> Dear David,
>
> Thanks for your reply. I tried your code and it is running, but as I
> mentioned in my mail, I am working on a pileup file. So I used a command-
> mydf=read.table(
> to read the pileup file into a data frame, i.e. mydf. Now the problem is
> that it has 10 columns and I have to count the number of A, C, G and T
> in the 9th column.
> In your mail we input data like this
>> txt <- " .a,g,,
> + .t,t,,
> + .,c,c,
> + .,a,,,
> + .,t,t,t
> + .c,,g,^!.
> + .g,ggg.^!,
> + .$,,,,,.,
> + a,g,,t,
> + ,,,,,.,^!.
> + ,$,,,,.,."
>
> but how should I input my data from the data frame mydf using the txt
> command, because there are thousands of rows?

Just send mydf[ , 9] as the argument in place of txtvec.
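
A note on the error at the top of this message: the "unexpected ','" comes from the doubled opening parenthesis in `sapply((mydf[,5], ...`, since a comma is not allowed inside a plain parenthesised expression; the later "unexpected ')'" errors are just the leftover pieces of the same broken call. Below is a minimal sketch of what passing the data frame column could look like. The small `mydf`, the helper name `count_split`, and the use of column 9 (as described in the thread, although the pasted call uses column 5) are assumptions for illustration, not the real pileup data.

```r
# Hypothetical stand-in for the real pileup data frame (10 columns);
# only column 9 carries the base strings in this sketch.
mydf <- as.data.frame(matrix("x", nrow = 3, ncol = 10), stringsAsFactors = FALSE)
mydf[, 9] <- c(".a,g,,", ".t,t,,", ".,c,c,")

# read.table() may have stored the column as a factor (the default before
# R 4.0.0), and strsplit() only accepts character input, so convert first.
basevec <- as.character(mydf[, 9])

# Same idea as in the thread: split on the letter and subtract 1 from the
# number of fragments. The helper name is made up for this example.
count_split <- function(x, pattern) {
  vapply(strsplit(x, pattern), length, integer(1)) - 1L
}

counts <- data.frame(
  A = count_split(basevec, "a|A"),
  C = count_split(basevec, "c|C"),
  G = count_split(basevec, "g|G"),
  T = count_split(basevec, "t|T")
)
print(counts)
```

Wrapping the repeated expression in one function means each parenthesis only has to be balanced once, which makes this kind of typo much harder to introduce.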
>
> Thanking you,
> Warm Regards
> Vikas Bansal
> Msc Bioinformatics
> Kings College London
> ________________________________________
> From: David Winsemius [dwinsem...@comcast.net]
> Sent: Friday, July 01, 2011 11:25 PM
> To: Bansal, Vikas
> Cc: r-help@r-project.org
> Subject: Re: [R] For help in R coding
>
> On Jul 1, 2011, at 12:47 PM, Bansal, Vikas wrote:
>
>> Dear all,
>>
>> I am doing a project on variant calling using R. I am working on a
>> pileup file. There are 10 columns in my data frame and I want to
>> count the number of A, C, G and T in each row for column 9. An example
>> of column 9 is given below-
>>
>> .a,g,,
>> .t,t,,
>> .,c,c,
>> .,a,,,
>> .,t,t,t
>> .c,,g,^!.
>> .g,ggg.^!,
>> .$,,,,,.,
>> a,g,,t,
>> ,,,,,.,^!.
>> ,$,,,,.,.
>>
>> This is a bit confusing for me as these characters are in one column
>> and how can we scan them for each row to print the number of A, C, G
>> and T for each row.
>
> Seems a bit clunky but this does the job (first the data):
>> txt <- " .a,g,,
> + .t,t,,
> + .,c,c,
> + .,a,,,
> + .,t,t,t
> + .c,,g,^!.
> + .g,ggg.^!,
> + .$,,,,,.,
> + a,g,,t,
> + ,,,,,.,^!.
> + ,$,,,,.,."
>
>> txtvec <- readLines(textConnection(txt))
>
> Now the clunky solution. Basically subtracts 1 from the counts of
> "fragments" that result from splitting on each letter in turn. Could
> be made prettier with a function that did the job.
>
>> data.frame(A = unlist(lapply( lapply( sapply(txtvec, strsplit, split="a"), length) , "-", 1)),
> + C = unlist(lapply( lapply( sapply(txtvec, strsplit, split="c"), length) , "-", 1)),
> + G = unlist(lapply( lapply( sapply(txtvec, strsplit, split="g"), length) , "-", 1)),
> + T = unlist(lapply( lapply( sapply(txtvec, strsplit, split="t"), length) , "-", 1)) )
>            A C G T
> .a,g,,     1 0 1 0
> .t,t,,     0 0 0 2
> .,c,c,     0 2 0 0
> .,a,,,     1 0 0 0
> .,t,t,t    0 0 0 2
> .c,,g,^!.  0 1 1 0
> .g,ggg.^!, 0 0 4 0
> .$,,,,,.,  0 0 0 0
> a,g,,t,    1 0 1 1
> ,,,,,.,^!. 0 0 0 0
> ,$,,,,.,.  0 0 0 0
>
> Has the advantage that the input data ends up as rownames, which was a
> surprise.
>
> If you wanted to count "A" and "a" as equivalent, then the split
> argument should be "a|A"
>
>
>> Most of the rows have . and , and other symbols
>> but we will ignore them. I just want to run a loop with a counter
>> which will count the number of A, C, G and T for each row and will
>> give output something like this-
>>
>>
>> A C G T
>> 1 0 1 0
>> 0 0 0 2
>> 0 2 0 0
>> 1 0 0 0
>> 0 0 0 3
>>
>> This output is for the first 5 rows from the example given above.
>>
>> I am new to R, can you please help me. I will be very thankful to you.
>>
>>
>>
>> Thanking you,
>> Warm Regards
>> Vikas Bansal
>> Msc Bioinformatics
>> Kings College London
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>

David Winsemius, MD
West Hartford, CT
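
One caveat with the fragment-counting approach quoted above: strsplit() does not return a trailing empty fragment when the match falls at the very end of the string, which is why the row `.,t,t,t` is reported with 2 T's in the quoted output while the original question expected 3. A minimal sketch of an alternative, under the assumption that each base is a single letter and that upper and lower case should be counted together; the helper name `count_base` is made up for this example.

```r
# First five example rows from the thread.
txtvec <- c(".a,g,,", ".t,t,,", ".,c,c,", ".,a,,,", ".,t,t,t")

# Hypothetical helper: count occurrences of one letter per string,
# ignoring case, by measuring how much shorter each string becomes
# once every occurrence of that letter is deleted.
count_base <- function(x, base) {
  x <- as.character(x)  # also handles a factor column from read.table()
  nchar(x) - nchar(gsub(base, "", x, ignore.case = TRUE))
}

counts <- data.frame(
  A = count_base(txtvec, "a"),
  C = count_base(txtvec, "c"),
  G = count_base(txtvec, "g"),
  T = count_base(txtvec, "t")
)
print(counts)
```

For these five rows the T column comes out as 0, 2, 0, 0, 3, which matches the table requested in the original question. For a real data frame, `txtvec` would be replaced by the relevant column, for example `mydf[, 9]`.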