[R] how to find end of a FASTA file
Hi:

I am trying to find the end of a FASTA file:

    library(ShortRead)
    fastadata <- readFasta("fastafolder", "fa$")
    file <- tempfile()
    writeFasta(fastadata, file)
    var1 <- readLines(file)
    while(countlength(tmp <- readLines(file, n = -1)) > 0) {
      #do something
    }

I want the while loop to run until the end of the file is reached, but the while statement doesn't work.

Thanks for help.

Regards
Jac

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
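A possible sketch of reading until end of file, using base R only. When readLines() is given a file name it reopens the file and starts from the top on every call, so the loop condition never changes; reading from an open connection instead lets each call continue where the previous one stopped. I could not place countlength() in base R, so length() is assumed here. The tiny FASTA file and the names fasta_path, n_lines and n_headers are made up for illustration.

```r
# Write a tiny two-record FASTA file so the example is self-contained.
fasta_path <- tempfile(fileext = ".fa")
writeLines(c(">seq1", "ACGT", ">seq2", "GGCC"), fasta_path)

# Open the connection once; each readLines() call then continues
# from the current position instead of restarting at the top.
con <- file(fasta_path, open = "r")
n_lines <- 0
n_headers <- 0
while (length(line <- readLines(con, n = 1)) > 0) {
  n_lines <- n_lines + 1
  # FASTA header lines start with ">"
  if (startsWith(line, ">")) n_headers <- n_headers + 1
}
close(con)
```

readLines() returns a zero-length character vector at end of file, which is what ends the loop. If the file fits in memory, reading it once with readLines(fasta_path) and looping over the result is simpler.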
[R] PCA on high dimensional data
Hi:

I have a large dataset, mydata, of 1000 rows and 1000 columns. The rows have gene names and the columns have condition names (cond1, cond2, cond3, etc).

    mydata <- read.table(file="c:/file1.mtx", header=TRUE, sep="")

I applied PCA as follows:

    data_after_pca <- prcomp(mydata, retx=TRUE, center=TRUE, scale.=TRUE);

Now I get 1000 PCs, and I choose the first three PCs and make a new data frame:

    new_data_frame <- cbind(data_after_pca$x[,1], data_after_pca$x[,2], data_after_pca$x[,3]);

After the PCA, in new_data_frame, I lose the previous cond1, cond2, cond3 labels, and instead have PC1, PC2, PC3 as column names. My question is: is there any way I can map PC1, PC2, PC3 to the original conditions, so that I can still have a reference to the original condition labels after PCA?

Thanks:
deb
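One way to look at this: each principal component is a weighted combination of all the conditions, so there is no one-to-one mapping from PC1 to a single condition. The weights are stored in the rotation element of the prcomp() result, whose rows carry the original column names. A small sketch on simulated data (the 20 x 5 matrix and the names scores, loadings and top_condition are assumptions for illustration, not the poster's data):

```r
set.seed(1)
# Simulated stand-in for mydata: 20 genes (rows) by 5 conditions (columns).
mydata <- matrix(rnorm(20 * 5), nrow = 20,
                 dimnames = list(paste0("gene", 1:20), paste0("cond", 1:5)))

pca <- prcomp(mydata, retx = TRUE, center = TRUE, scale. = TRUE)

# Scores: one row per gene, one column per retained component.
scores <- pca$x[, 1:3]

# Loadings: one row per original condition, one column per component.
loadings <- pca$rotation[, 1:3]

# For each component, the condition with the largest absolute weight.
top_condition <- rownames(loadings)[apply(abs(loadings), 2, which.max)]
```

Printing loadings shows how strongly each of cond1, cond2, ... contributes to PC1, PC2 and PC3, which is probably the closest thing to keeping a reference to the original labels.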
[R] data generation
Hi:

I am trying to generate data from a simple regression model. For the training data T = {(x1, y1), ..., (xn, yn)}, I want to sample x uniformly from the range [0,1], find the uncorrupted response y = x^2, and generate random noise "e" from the normal distribution N(0, 1). Any idea how to do this in simple steps?

Thanks in advance.
deb
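A minimal sketch of the three steps, assuming the observed response is meant to be y = x^2 + e. The sample size n = 100, the seed, and the names y_clean and training are arbitrary choices for illustration.

```r
set.seed(42)   # only so the draw can be repeated
n <- 100       # assumed sample size

x <- runif(n, min = 0, max = 1)    # x sampled uniformly from [0, 1]
y_clean <- x^2                     # uncorrupted response
e <- rnorm(n, mean = 0, sd = 1)    # noise from N(0, 1)
y <- y_clean + e                   # observed response

training <- data.frame(x = x, y = y)
```

Note that rnorm() takes the standard deviation, not the variance; for N(0, 1) the two coincide.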
[R] word frequency count
Hi:

I have a dataframe containing comma separated groups of words such as

    milk,bread
    bread,butter
    beer,diaper
    beer,diaper
    milk,bread
    beer,diaper

I want to output the frequency of occurrence of the comma separated words for each row and collapse duplicate rows, to make the output as shown in the following dataframe:

    milk,bread 2
    bread,butter 1
    beer,diaper 3
    milk,bread 2

Thanks for help!

deb
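If each row really is a single string, a sketch along these lines may be enough. table() counts each distinct string once and sorts the labels alphabetically; indexing the result by unique(words) is one way to get the counts back in order of first appearance. The vector name words is an assumption standing in for the relevant data frame column.

```r
# Stand-in for the data frame column holding the comma separated strings.
words <- c("milk,bread", "bread,butter", "beer,diaper",
           "beer,diaper", "milk,bread", "beer,diaper")

# One count per distinct string (labels sorted alphabetically).
counts <- table(words)

# The same counts, reordered by first appearance in the data.
counts_in_order <- counts[unique(words)]
```

Duplicate rows are collapsed by construction, so each string appears once in the result.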
Re: [R] word frequency count
Hi:

Suppose I create the dataframe df using the following code:

    df <- data.frame(
      item1 = c('milk', 'bread', 'beer', 'beer', 'milk', 'beer'),
      item2 = c('bread', 'butter', 'diaper', 'diaper', 'bread', 'diaper'),
      stringsAsFactors = FALSE);

    df
      item1  item2
    1  milk  bread
    2 bread butter
    3  beer diaper
    4  beer diaper
    5  milk  bread
    6  beer diaper

And now I want the following output:

    milk,bread 2
    bread,butter 1
    beer,diaper 3
    milk,bread 2

and "milk,bread" is a single datum. I hope this clarifies the problem!

Thanks!

On 3/18/12, John Kane wrote:
> ? table
>
> First however confirm "that milk,bread" is a single datum. str() should do
> this
>
> Can you post a sample of the data here using dput()?
>
> John Kane
> Kingston ON Canada
Re: [R] word frequency count
Hi:

Thanks for the reply. I am using the following statement

    res <- with(df, table(paste(item1, item2, sep=', ')) )

to get the frequency counts of the rows, which gives the following output:

    milk,bread 2
    bread,butter 1
    beer,diaper 3
    milk,bread 2

But I need to extract from the above result two vectors or dataframes (such as DF1 and DF2) to make the final output as below:

DF1
    milk,bread
    bread,butter
    beer,diaper
    milk,bread

DF2
    2
    1
    3
    2

Can anyone help? Thanks in advance!

On Sun, Mar 18, 2012 at 4:22 PM, S Ellison wrote:
> You could try
>     with(df, table(item1:item2) )
> or
>     with(df, table(paste(item1, item2, sep=', ')) )
>
> If the order is immaterial, so that (milk, bread) is the same as (bread,
> milk), there's a bit more work to do. Maybe
>
>     table( apply(df, 1, function(x) paste(sort(x))) )
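A sketch of pulling the labels and the counts out of the table, using the df from this thread. names() on a one-dimensional table gives the labels and as.vector() gives the counts; as.data.frame() returns both at once in columns Var1 and Freq. Two caveats: table() reports each distinct pair once, sorted alphabetically, so "milk,bread" will not appear twice; and sep = "," is used here without the space so the labels match the requested form. The column names pair and count, and the name combined, are made up for illustration. The last step is an order-insensitive variant, with collapse added so that each row yields a single string.

```r
df <- data.frame(
  item1 = c("milk", "bread", "beer", "beer", "milk", "beer"),
  item2 = c("bread", "butter", "diaper", "diaper", "bread", "diaper"),
  stringsAsFactors = FALSE
)

# Frequency of each (item1, item2) pair.
res <- with(df, table(paste(item1, item2, sep = ",")))

# Labels and counts as two separate data frames.
DF1 <- data.frame(pair = names(res), stringsAsFactors = FALSE)
DF2 <- data.frame(count = as.vector(res))

# Or both together: columns are named Var1 and Freq.
combined <- as.data.frame(res, stringsAsFactors = FALSE)

# Variant where (milk, bread) and (bread, milk) count as the same pair.
unordered <- table(apply(df, 1, function(x) paste(sort(x), collapse = ",")))
```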
[R] writing data to file
Hi:

I created a data frame

    df <- data.frame(person = c('John', 'Bob', 'Mary'), team = c('a', 'b', 'c'), stringsAsFactors = FALSE);

and obtained the expected output

    df
      person team
    1   John    a
    2    Bob    b
    3   Mary    c

Now I want to save the whole content of df, preserving its row and column order, to a file on disk with the following command:

    write(df, file = "testfile", append=FALSE, sep=" ");

and I get the error message

    Error in cat(list(...), file, sep, fill, labels, append) :
      argument 1 (type 'list') cannot be handled by 'cat'

Can you help to solve the problem?

Thanks in advance.
deb
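As the error message shows, write() hands its argument to cat(), which cannot handle a list, and a data frame is a list of columns. write.table() is the usual function for data frames. A minimal sketch, writing to a temporary file rather than "testfile" (out_path and df_back are names made up for illustration):

```r
df <- data.frame(person = c("John", "Bob", "Mary"),
                 team = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

out_path <- tempfile(fileext = ".txt")

# Space separated, header line kept, no quotes and no row names.
write.table(df, file = out_path, sep = " ",
            quote = FALSE, row.names = FALSE, col.names = TRUE)

# Read it back to check that rows and columns keep their order.
df_back <- read.table(out_path, header = TRUE, sep = " ",
                      stringsAsFactors = FALSE)
```

write.csv() is a thin wrapper around the same machinery if a comma separated file is preferred.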
[R] how to cluster rows of words in a text file
Hi:

I am trying to cluster the rows of a text file with kmeans. I load the data as follows

    file1 <- read.csv("somefile.csv")

and the file can be viewed as having the following lines of words

    > file1
    1 word1 word3 word4 word1
    2 word1 word4 word3 word1
    3 word4 word2 word4 word3
    4 word4 word2 word1 word3
    5 word2 word2 word4 word2

    file_as_matrix <- as.matrix(file1);

Now I want to apply some clustering algorithm such as kmeans to cluster the rows in the file to get the following output:

Cluster1
    word1 word3 word4 word1
    word1 word4 word3 word1

Cluster2
    word4 word2 word4 word3
    word4 word2 word1 word3
    word2 word2 word4 word2

But as kmeans takes a numeric matrix of data as input, it cannot be used to cluster the rows in this case. Is there any simple way to cluster the rows of such a text file? Example code would be really useful.

Thanks and regards:
debb
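One common approach, offered here as a sketch rather than the only option, is to turn each row into a vector of word counts (a bag-of-words encoding) and run kmeans() on that numeric matrix. The rows are typed in directly instead of being read from somefile.csv, and the names rows, vocab, counts and fit are assumptions for illustration.

```r
# The five rows from the post, each as a vector of words.
rows <- list(
  c("word1", "word3", "word4", "word1"),
  c("word1", "word4", "word3", "word1"),
  c("word4", "word2", "word4", "word3"),
  c("word4", "word2", "word1", "word3"),
  c("word2", "word2", "word4", "word2")
)

# Vocabulary: every distinct word, in a fixed order.
vocab <- sort(unique(unlist(rows)))

# One row per text line, one column per word, entries are word counts.
counts <- t(sapply(rows, function(r) table(factor(r, levels = vocab))))
storage.mode(counts) <- "double"

set.seed(1)  # kmeans starts from random centers
fit <- kmeans(counts, centers = 2, nstart = 10)
fit$cluster  # cluster label for each of the five rows
```

This encoding ignores word order, so rows 1 and 2 become identical and end up in the same cluster. The grouping of the remaining rows depends on the encoding and the distance it implies, so it need not match the grouping wished for in the post; hclust() on dist(counts) is an alternative to try.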
[R] create waveform sawtooth
Hi:

I am trying to create a sawtooth waveform. I used the following

    x <- runif(500, min = -2, max = 2)
    y <- (1 - abs(x)) * (x <= 1)
    combined <- data.frame(x = x, y = y)
    plot(combined)

and I get a triangular waveform, not a sawtooth. Can someone give a solution to create a sawtooth waveform?

Thanks
deb
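A sawtooth rises linearly and then drops back at the end of each period, which is what the fractional part of x / period does. A minimal sketch, assuming a period of 1 and an amplitude running from 0 to 1 (both arbitrary choices), and using an evenly spaced x instead of runif() so the points can be joined by a line:

```r
period <- 1                           # assumed period
x <- seq(-2, 2, length.out = 500)     # evenly spaced, already sorted

# Fractional part of x / period: ramps from 0 towards 1, then resets.
y <- (x / period) - floor(x / period)

# Draw only in an interactive session.
if (interactive()) plot(x, y, type = "l")
```

Multiplying y by a constant changes the amplitude, and 2 * y - 1 gives a version centred on zero.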