It is perhaps worth noting that (assuming I understand correctly) this can easily be done in one go, without any overt looping, as a nice application of Reduce() once all your files have been read into your global environment.
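A rough sketch of the same idea that keeps the data frames in a list instead of the global environment may also help. Everything named here is made up for illustration: raw, to_long(), frames, and the sample names GTEX_A and GTEX_B are hypothetical, and the two character vectors merely stand in for the single column of two .out files (marker id and genotype alternating). It assumes every file has an even number of entries.

```r
# Hypothetical stand-ins for the single column read from two .out files:
# marker id and genotype alternate down the column.
raw <- list(
  GTEX_A = c("4", "25/48", "201", "2/2", "648589", "None"),
  GTEX_B = c("201", "3/4", "648589", "1/2")
)

# Turn one alternating vector into a two-column data frame, no inner loop.
# Assumes length(v) is even.
to_long <- function(v, sample_id) {
  m <- matrix(v, ncol = 2, byrow = TRUE)  # odd entries -> col 1, even -> col 2
  out <- data.frame(VNTRid = m[, 1], value = m[, 2], stringsAsFactors = FALSE)
  names(out)[2] <- sample_id              # genotype column is named after the sample
  out
}

frames <- Map(to_long, raw, names(raw))

# Fold merge() over the list; all = TRUE keeps markers missing from some samples.
merged <- Reduce(function(x, y) merge(x, y, by = "VNTRid", all = TRUE), frames)
print(merged)
```

With real files, the list would presumably be built with lapply() over the file names rather than typed in by hand.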
Example:

> a.out <- data.frame(x = 1:3, y1 = 11:13)
> b.out <- data.frame(x = c(1,3), y2 = 21:22)
> d.out <- data.frame(x = c(2:3), y3 = c(.5,.6))
> nm <- ls(pat = ".*out$")
> f <- function(dat, y) merge(dat, get(y), all = TRUE)
> allofthem <- Reduce(f, nm[-1], init = get(nm[1]))
> allofthem
  x y1 y2  y3
1 1 11 21  NA
2 2 12 NA 0.5
3 3 13 22 0.6

## note the change to "all = TRUE" in the merge() call

Cheers,
Bert

On Fri, Dec 20, 2019 at 9:37 AM Bert Gunter <bgunter.4...@gmail.com> wrote:

> ?merge  ## note the all.x option
> Example:
>
> > a <- data.frame(x = 1:3, y1 = 11:13)
> > b <- data.frame(x = c(1,3), y2 = 21:22)
>
> > merge(a,b, all.x = TRUE)
>   x y1 y2
> 1 1 11 21
> 2 2 12 NA
> 3 3 13 22
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Fri, Dec 20, 2019 at 9:00 AM Yuan Chun Ding <ycd...@coh.org> wrote:
>
>> Hi Bert,
>>
>> Sorry that I was in a hurry going home yesterday afternoon and just
>> posted my question and hoped to get some advice.
>>
>> Here is what I got yesterday before going home.
>> ---------------------------------------------------------------
>> setwd("C:/Awork/VNTR/GETXdata/GTEx_genotypes")
>>
>> file_list <- list.files(pattern="*.out")
>>
>> #to read all 652 files into Rstudio and found that NOT all files have same number of rows
>> for (i in 1:length(file_list)){
>>   assign( substr(file_list[i], 1, nchar(file_list[i]) -4) ,
>>           read.delim(file_list[i], head=F))
>> }
>>
>> #the first file, GTEX_1117F, in the following format, one column and 19482 rows
>> #4 is marker id, 25/48 is its marker value;
>> # V1
>> # 4
>> # 25/48
>> # 201
>> # 2/2
>> # ...
>> # 648589
>> # None
>>
>> #to make this one-column file into a two-column file as below
>> # so first column is marker id, second is corresponding marker values for the sample GTEX_1117F
>> # VNTRid GTEX_1117F
>> # 4      25/48
>> # 201    2/2
>> # ...    ...
>> # 648589 None
>>
>> for (i in 1:length(file_list)){
>>   temp <- read.delim(file_list[i], head=F)
>>   even <-seq(2, length(temp$V1),2)
>>   odd <-seq(1, length(temp$V1)-1, 2)
>>   output <-matrix(0, ncol=2, nrow=length(temp$V1)/2)
>>   colnames(output)<- c("VNTRid",substr(file_list[i], 1, nchar(file_list[i]) -4))
>>   for (j in 1:(length(temp$V1)/2)){
>>     output[j,1]<- as.character(temp$V1)[odd[j]]
>>     output[j,2]<- as.character(temp$V1)[even[j]]}
>>   assign(gsub("-","_", substr(file_list[i], 1, nchar(file_list[i])-4)), as.data.frame(output))
>> }
>>
>> Yesterday, I intended to reshape the output file above from long to wide
>> using VNTRid as key.
>>
>> Since not all files have the same number of rows, after reshaping, those
>> files would not bind correctly using the rbind function.
>>
>> On my way to work this morning, I changed my intention; I will not
>> reshape to wide format and actually like the long format I generated. I
>> will read in a VNTR marker annotation file including VNTRid in the first
>> column and marker locations in human chromosomes in the second column; this
>> annotation file should include all the VNTR markers. I know the VNTRid in
>> the annotation file are the same as the VNTRid in the 652 files I read in.
>>
>> Do you know a good way to merge all those 652 files (with two columns)?
>>
>> Thank you,
>>
>> Ding
>>
>> #merge all 652 files into one file with VNTRid as first column, 2nd to
>> 653rd column are genotype with header
>> #as sample ID, so
>>
>> *From:* Bert Gunter [mailto:bgunter.4...@gmail.com]
>> *Sent:* Thursday, December 19, 2019 6:52 PM
>> *To:* Yuan Chun Ding
>> *Cc:* r-help@r-project.org
>> *Subject:* Re: [R] data reshape
>>
>> Did you even make an attempt to do this? -- or would you like us to do all
>> your work for you?
>>
>> If you made an attempt, show us your code and errors.
>>
>> If not, we usually expect you to try on your own first.
>>
>> If you have no idea where to start, perhaps you need to spend some more
>> time with tutorials to learn basic R functionality before proceeding.
>>
>> Bert
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Thu, Dec 19, 2019 at 6:01 PM Yuan Chun Ding <ycd...@coh.org> wrote:
>>
>> Hi R users,
>>
>> I have a folder (called genotype) with 652 files; the file names are
>> GTEX-1A3MV.out, GTEX-1A3MX.out, GTEX-1B8SF.out, etc; in each file, only
>> one column of data without a header as below
>> 201
>> 2/2
>> 238
>> 3/4
>> 245
>> 1/2
>> .....
>> 983255
>> 3/3
>> 983766
>> None
>>
>> A total of 20528 rows;
>>
>> I need to read all those 652 files in the genotype folder and then
>> reshape the one column in each file as:
>> SampleID     201   238   245   ....   983255   983766
>> GTEX-1A3MV   2/2   3/4   1/2          3/3      None
>>
>> There are 10264 data columns plus the sample ID column, so 10265 columns
>> in total after data reshaping.
>>
>> After reading those 652 files and reshaping the one column in each file, I
>> will stack them with the rbind function, so that I have a file with a
>> dimension of 653 rows by 10265 columns.
>>
>> Thank you,
>>
>> Ding
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.