I think the last should be:

output <- cast(i.melt, Item_name ~ label, sum)
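For anyone reading this without the earlier message to hand, here is a rough base R sketch of the kind of long-to-wide reshaping that a cast() call like the one above is aiming at. It is only an illustration under stated assumptions, not the code from the earlier reply: the three small data frames in `datasets` are made-up stand-ins for the loaded .csv files, and `long`, `wide`, and `value` are names chosen here. It uses tapply() rather than the reshape package so that it runs without extra packages.

```r
# Hypothetical stand-ins for the data frames read from June01.csv, June02.csv, ...
# (made-up values; the real files have more rows and columns).
datasets <- list(
  June01 = data.frame(Item_Name = c("Birds", "Turtles"), Occurance = c(30, 50)),
  June02 = data.frame(Item_Name = c("Birds", "Fish"),    Occurance = c(10, 20)),
  June03 = data.frame(Item_Name = c("Fish", "Turtles"),  Occurance = c(5, 15))
)

# Stack everything into one long data frame, tagging each row with the
# label of the dataset it came from.
long <- do.call(rbind, lapply(names(datasets), function(label) {
  data.frame(
    Item_Name = as.character(datasets[[label]]$Item_Name),
    label     = label,
    value     = datasets[[label]]$Occurance,
    stringsAsFactors = FALSE
  )
}))

# One row per item, one column per label. sum() collapses any duplicate
# item/label pairs; combinations that never occur are left as NA.
wide <- tapply(long$value, list(long$Item_Name, long$label), sum)

# Turn the matrix into a data frame with the item names as a regular column.
output <- data.frame(Item_Name = rownames(wide), wide, row.names = NULL)
print(output)
```

Because the column names come from names(datasets), adding a new file only means adding one more element to that list; nothing has to be typed in per column. Items missing from a given dataset stay NA in this sketch, which matches the NA preallocation in the quoted post.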
On Fri, Jun 12, 2009 at 9:27 AM, Jon Loehrke <jloeh...@umassd.edu> wrote:

> Hi R list,
>
> I would like to automate, or speed up, the process by which I take
> several separate datasets, stored in .csv format, import them, and merge
> them by a common variable. So far I have greatly sped up the loading
> process but cannot think of a way to automate the merger of all
> datasets into a common data.frame.
>
> My apologies if this has been covered; any R search suggestions are
> appreciated.
>
> # All scripts function out of the base directory
> rm(list=ls())
> setwd('/Users/myuser/Documents/workfolder/')
>
> # Check files and list all .csv in directory
> files<-list.files()
> files<-files[grep('\\.csv$', files)]
> # Create labels for each file (ex. June08.csv becomes June08)
> labels<-gsub('\\.csv$', '', files)
>
> # Load all .csv datasets and assign name
>
> item<-vector() # preallocate an index of all items in datasets
> for(i in 1:length(files)){
>   X<-read.csv(files[i])
>   item<-union(item, X$Item_Name)
>   assign(labels[i], X)
> }
> # What is loaded
> ls()
> # [1] "files" "i" "item" "June01" "June02" "June03" "labels"
>
> # What does everything look like?
> str(June03)
> #'data.frame': 992 obs. of 8 variables:
> # $ Item_Name : Factor w/ 992 levels "Birds","Fish",..: 1 2 3 4 5 6 7 8 9 10 ...
> # $ Occurance : int 30 30 50 450 75 550 100 500 250 75 ...
>
> str(June01)
> #'data.frame': 819 obs. of 8 variables:
> # $ Item_Name : Factor w/ 819 levels "Birds","Turtles",..: 1 2 3 4 5 6 7 8 9 10 ...
> # $ Occurance : int 30 50 450 750 550 100 500 250 275 450 ...
>
> # Here is where I'm stuck...
> # I would like to:
> # Create a data.frame with an index column composed of the union of all items
> # Create columns in the frame by merging the 'Occurance' column of each
> # loaded dataset, each labeled by its dataset name (e.g. June01)
> # Automate this procedure so that I do not have to manually type in
> # each column addition when I have a new dataset.
>
> # This is my current strategy, but when I have new datasets I have to
> # manually set up the preallocation and merger
>
> allData<-data.frame(Item=item, June01=NA, June02=NA, June03=NA)
> allData[match(June01$Item_Name, allData$Item),]$June01 <- June01$Occurance
> allData[match(June02$Item_Name, allData$Item),]$June02 <- June02$Occurance
> allData[match(June03$Item_Name, allData$Item),]$June03 <- June03$Occurance
>
> # Any help to automate this process is greatly appreciated!!!
>
> sessionInfo()
> #R version 2.9.0 (2009-04-17)
> #i386-apple-darwin8.11.1
> #
> #locale:
> #en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> #
> #attached base packages:
> #[1] stats graphics grDevices utils datasets methods base
>
>
> Jon Loehrke
> Graduate Research Assistant
> Department of Fisheries Oceanography
> School for Marine Science and Technology
> University of Massachusetts
> 200 Mill Road, Suite 325
> Fairhaven, MA 02719
> jloeh...@umassd.edu
> T 508-910-6393
> F 508-910-6396
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?