The correct code is: for (i in 1:length(SeparatedGroupsofmealsCombs)) { ...
I had mentioned that this is untested, but the error is so obvious ... B. > On Nov 18, 2017, at 4:40 PM, Allaisone 1 <allaiso...@hotmail.com> wrote: > > > The loop : > > AllMAFs <- list() > > for (i in length(SeparatedGroupsofmealsCombs) { > AllMAFs[[i]] <- apply( SeparatedGroupsofmealsCombs[[i]], 2, function(x)maf( > tabulate( x+1) )) > } > > gives these errors (I tried this many times and I'm sure I copied it > entirely) :- > Error in apply(SeparatedGroupsofmealsCombs[[i]], 2, function(x) > maf(tabulate(x + : > object 'i' not found > > } > Error: unexpected '}' in " }" > > > The lapply function : > results<-lapply(SeparatedGroupsofmealsCombs , function(x)maf(tabulate(x+1))) > gives this error :- > Error in FUN(left, right) : non-numeric argument to binary operator > > I have been trying since yesterday but but until now I'm not able to identify > the correct syntax. > > > > > From: David Winsemius <dwinsem...@comcast.net> > Sent: 18 November 2017 20:06:56 > To: Allaisone 1 > Cc: Boris Steipe; R-help > Subject: Re: [R] Complicated analysis for huge databases > > > > On Nov 18, 2017, at 1:52 AM, Allaisone 1 <allaiso...@hotmail.com> wrote: > > > > Although the loop seems to be formulated correctly I wonder why > > it gives me these errors : > > > > -object 'i' not found > > - unexpected '}' in "}" > > You probably did not copy the entire code offered. But we cannot know since > you did not "show your code", not=r did you post complete error messages. > Both of these practices are strongly recommended by the Posting Guide. Please > read it (again?). > > -- > David. > > > > > > the desired output is expected to be very large as for each dataframe in > > the list of dataframes I expect to see maf value for each of the 600 > > columns! and this is only for > > > > for one dataframe in the list .. I have around 150-200 dataframes.. not > > sure how R will store these results.. but first I need the analysis to be > > done correctly. The final output has to be something like this :- > > > > > >> mafsforeachcolumns(I,II,...600)foreachcombination > > > > MealsCombinations Cust.ID I II III > > IV ...... 600 > > 1 33-55 1 0.124 0.10 > > 0.65 0.467 > > 3 > > 5 > > > > 2 44-66 7 0.134 0.43 > > 0.64 0.479 > > 4 > > 9 > > > > . > > > > . > > > > ~180 dataframes > > > > > > ________________________________ > > From: Boris Steipe <boris.ste...@utoronto.ca> > > Sent: 18 November 2017 00:35:16 > > To: Allaisone 1; R-help > > Subject: Re: [R] Complicated analysis for huge databases > > > > Something like the following? > > > > AllMAFs <- list() > > > > for (i in length(SeparatedGroupsofmealsCombs) { > > AllMAFs[[i]] <- apply( SeparatedGroupsofmealsCombs[[i]], 2, > > function(x)maf( tabulate( x+1) )) > > } > > > > > > (untested, of course) > > Also the solution is a bit generic since I don't know what the output of > > maf() looks like in your case, and I don't understand why you use tabulate > > because I would have assumed that's what maf() does - but that's not for me > > to worry about :-) > > > > > > > > B. > > > > > > > >> On Nov 17, 2017, at 7:15 PM, Allaisone 1 <allaiso...@hotmail.com> wrote: > >> > >> > >> Thanks Boris , this was very helpful but I'm struggling with the last part. > >> > >> 1) I combined the first 2 columns :- > >> > >> > >> library(tidyr) > >> SingleMealsCode <-unite(MyData, MealsCombinations, c(MealA, MealB), > >> remove=FALSE) > >> SingleMealsCode <- SingleMealsCode[,-2] > >> > >> 2) I separated this dataframe into different dataframes based on > >> "MealsCombination" > >> column so R will recognize each meal combination separately : > >> > >> SeparatedGroupsofmealsCombs <- > >> split(SingleMealCode,SingleMealCode$MealsCombinations) > >> > >> after investigating the structure of "SeparatedGroupsofmealsCombs" , I can > >> see > >> a list of different databases, each of which represents a different Meal > >> combinations which is great. > >> > >> No, I'm struggling with the last part, how can I run the maf code for all > >> dataframes? > >> > >> when I run this code as before :- > >> > >> maf <- apply(SeparatedGroupsofmealsCombs, 2, function(x)maf(tabulate(x+1))) > >> > >> an error message says : dim(X) must have a positive length . I'm not sure > >> which length > >> I need to specify.. any suggestions to correct this syntax ? > >> > >> Regards > >> Allaisone > >> From: Boris Steipe <boris.ste...@utoronto.ca> > >> Sent: 17 November 2017 21:12:06 > >> To: Allaisone 1 > >> Cc: R-help > >> Subject: Re: [R] Complicated analysis for huge databases > >> > >> Combine columns 1 and 2 into a column with a single ID like "33.55", > >> "44.66" and use split() on these IDs to break up your dataset. Iterate > >> over the list of data frames split() returns. > >> > >> > >> B. > >> > >>> On Nov 17, 2017, at 12:59 PM, Allaisone 1 <allaiso...@hotmail.com> wrote: > >>> > >>> > >>> Hi all .., > >>> > >>> > >>> I have a large dataset of around 600,000 rows and 600 columns. The first > >>> col is codes for Meal A, the second columns is codes for Meal B. The > >>> third column is customers IDs where each customer had a combination of > >>> meals. Each column of the rest columns contains values 0,1,or 2. The > >>> dataset is organised in a way so that the first group of customers had > >>> similar meals combinations, this is followed by another group of > >>> customers with similar meals combinations but different from the first > >>> group and so on. The dataset looks like this :- > >>> > >>> > >>>> MyData > >>> > >>> Meal A Meal B Cust.ID I II III IV > >>> ...... 600 > >>> > >>> 1 33 55 1 0 1 > >>> 2 0 > >>> > >>> 2 33 55 3 1 0 > >>> 2 2 > >>> > >>> 3 33 55 5 2 1 > >>> 1 2 > >>> > >>> 4 44 66 7 0 2 > >>> 2 2 > >>> > >>> 5 44 66 4 1 1 > >>> 0 1 > >>> > >>> 6 44 66 9 2 0 > >>> 1 2 > >>> > >>> . > >>> > >>> . > >>> > >>> 600,000 > >>> > >>> > >>> > >>> I wanted to find maf() for each column(from 4 to 600) after calculating > >>> the frequency of the 3 values (0,1,2) but this should be done group by > >>> group (i.e. group(33-55) : rows 1:3 then group(44-66) :rows 4:6 and so > >>> on). > >>> > >>> > >>> I can do the analysis for the entire column but not group by group like > >>> this : > >>> > >>> > >>> MAF <- apply(MyData[,4:600], 2, function(x)maf(tabulate(x+1))) > >>> > >>> How can I modify this code to tell R to do the analysis group by group > >>> for each column so I get maf value for 33-55 group of clolumn I, then maf > >>> value for group 44-66 in the same column I,then the rest of groups in > >>> this column and do the same for the remaining columns. > >>> > >>> In fact, I'm interested in doing this analysis for only 300 columns but > >>> all of the 600 columns. > >>> I have another sheet contains names of columns of interest like this : > >>> > >>>> ColOfinterest > >>> > >>> Col > >>> I > >>> IV > >>> V > >>> . > >>> . > >>> 300 > >>> > >>> Any one would help with the best combination of syntax to perform this > >>> complex analysis? > >>> > >>> Regards > >>> Allaisone > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> [[alternative HTML version deleted]] > >>> > >>> ______________________________________________ > >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > >>> https://stat.ethz.ch/mailman/listinfo/r-help > >>> PLEASE do read the posting guide > >>> http://www.R-project.org/posting-guide.html > >>> and provide commented, minimal, self-contained, reproducible code. > > > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > David Winsemius > Alameda, CA, USA > > 'Any technology distinguishable from magic is insufficiently advanced.' > -Gehm's Corollary to Clarke's Third Law ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.