Thank you very much! Just what I needed. Too bad I never got to understand what was wrong with my original code...
Thanks again!

George Vega Yon
+56 9 7 647 2552
http://ggvega.cl

2013/11/7 Romain Francois <rom...@r-enthusiasts.com>:
> On 07/11/2013 14:43, Romain Francois wrote:
>
>> On 07/11/2013 14:30, George Vega Yon wrote:
>>>
>>> Romain,
>>>
>>> Thanks for your quick response. I've already received that suggestion,
>>> but, besides never having used C++, I wanted to understand first
>>> what I am doing wrong.
>>
>> For that type of code, it is actually quite a bit simpler to learn C++ than it
>> is to learn the macros and loose typing of the R interface.
>>
>>> Still, would you give me a small example, in R C++, of:
>>>
>>> - Creating a generic vector "L1" of size N
>>> - Creating a data.frame "D" and increasing its number of rows
>>> - Storing the data.frame "D" in the first element of "L1"
>>>
>>> I would be very grateful if you can do that.
>>
>> #include <Rcpp.h>
>> using namespace Rcpp ;
>>
>> // [[Rcpp::export]]
>> List example(int N){
>>     List out(N) ;
>>
>>     // let's first accumulate data in these two std::vector
>>     std::vector<double> x ;
>>     std::vector<int> y ;
>>     for( int i=0; i<30; i++){
>>         x.push_back( sqrt( i ) ) ;
>>         y.push_back( i ) ;
>>     }
>>
>>     // Now let's create a data frame
>>     DataFrame df = DataFrame::create(
>>         _["x"] = x,
>>         _["y"] = y
>>     ) ;
>>
>>     // storing df as the first element of out
>>     out[0] = df ;
>>
>>     return out ;
>> }
>
> Forgot to mention. You would just put the code above in a .cpp file and call
> sourceCpp on it.
>
> sourceCpp( "file.cpp" )
> example( 3 )
>
>> You can also do it like this, acknowledging what a data frame really is
>> (just a list of vectors):
>>
>> List df = List::create(
>>     _["x"] = x,
>>     _["y"] = y
>> ) ;
>> df.attr( "class" ) = "data.frame" ;
>> df.attr( "row.names") = IntegerVector::create(
>>     IntegerVector::get_na(), -30 ) ;
>>
>> The key thing here is that we accumulate data into std::vector<double>
>> and std::vector<int> which know how to grow efficiently.
>> Looping around with SET_LENGTH will allocate and copy data at each
>> iteration of the loop, which will lead to disastrous performance.
>>
>> Romain
>>
>>> Thanks again!
>>>
>>> George Vega Yon
>>> +56 9 7 647 2552
>>> http://ggvega.cl
>>>
>>> 2013/11/7 Romain Francois <rom...@r-enthusiasts.com>:
>>>>
>>>> Hello,
>>>>
>>>> Any particular reason you're not using Rcpp? You would have access to
>>>> nice abstractions instead of these MACROS all over the place.
>>>>
>>>> The cost of these abstractions is close to 0.
>>>>
>>>> Looping around and SET_LENGTH is going to be quite expensive. I would
>>>> urge you to accumulate data in data structures that know how to grow
>>>> efficiently, i.e. a std::vector, and then convert that to an R vector
>>>> when you're done with them.
>>>>
>>>> Romain
>>>>
>>>> On 07/11/2013 14:03, George Vega Yon wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> I didn't want to do this but I think that this is the easiest way
>>>>> for you to understand my problem (thanks again for all the comments
>>>>> that you have made). Here is a copy of the function that I'm working
>>>>> on. This may be tedious to analyze, so I understand if you don't feel
>>>>> keen to give it time. Having dedicated many hours to this (as a new
>>>>> user of both C and the R C API), I would be very pleased to know what
>>>>> I am doing wrong here.
>>>>>
>>>>> G0 is an Nx2 matrix. The first column is a group id (can be shared by
>>>>> several observations) and the second tells how many individuals are in
>>>>> that group. This matrix can look something like this:
>>>>>
>>>>> id_group nreps
>>>>>        1     3
>>>>>        1     3
>>>>>        1     3
>>>>>        2     1
>>>>>        3     1
>>>>>        4     2
>>>>>        5     1
>>>>>        6     1
>>>>>        4     2
>>>>> ...
>>>>>
>>>>> L0 is a list of two-column data.frames with different sizes. The first
>>>>> column (id) holds row indexes (with values 1 to N) and the second column
>>>>> holds real numbers.
>>>>> L0 can look something like this
>>>>> [[1]]
>>>>>   id lambda
>>>>>    3    0.5
>>>>>   15    0.3
>>>>>   25    0.2
>>>>> [[2]]
>>>>>   id lambda
>>>>>   15    0.8
>>>>>   40    0.2
>>>>> ...
>>>>> [[N]]
>>>>>   id lambda
>>>>>   80      1
>>>>>
>>>>> TE0 is an int scalar in {0,1,2}
>>>>>
>>>>> T0 is a dichotomous vector of length N that can look something like
>>>>> this
>>>>> [1] 0 1 0 1 1 1 0 ...
>>>>> [N] 1
>>>>>
>>>>> L1 (the expected output) is a modified version of L0 that, for
>>>>> instance, can look something like this (note the rows marked with "*")
>>>>>
>>>>> [[1]]
>>>>>    id lambda
>>>>>     3    0.5
>>>>>   *15   0.15  (15 was in the same group as 50, so I added this new row
>>>>>                and divided the value of lambda by two)
>>>>>    25    0.2
>>>>>   *50   0.15
>>>>> [[2]]
>>>>>    id lambda
>>>>>    15    0.8
>>>>>    40    0.2
>>>>> ...
>>>>> [[N]]
>>>>>    id lambda
>>>>>   *80  0.333  (80 shared group id with 30 and 100, so lambda is divided
>>>>>                by 3)
>>>>>   *30  0.333
>>>>>  *100  0.333
>>>>>
>>>>> That said, the function is as follows
>>>>>
>>>>> SEXP distribute_lambdas(
>>>>>     SEXP G0,  // Groups ids (matrix of Nx2).
>>>>>               // First column = Group Id, second column: Elements in the group
>>>>>     SEXP L0,  // List of N two-column dataframes with different number of rows
>>>>>     SEXP TE0, // Treatment effect (int scalar): ATE(0) ATT(1) ATC(2)
>>>>>     SEXP T0   // Treat var (bool vector, 0/1, of size N)
>>>>> )
>>>>> {
>>>>>
>>>>>   int i, j, l, m;
>>>>>   const int *G = INTEGER_POINTER(PROTECT(G0 = AS_INTEGER(G0 )));
>>>>>   const int *T = INTEGER_POINTER(PROTECT(T0 = AS_INTEGER(T0 )));
>>>>>   const int *TE= INTEGER_POINTER(PROTECT(TE0= AS_INTEGER(TE0)));
>>>>>   double *L, val;
>>>>>   int *I, nlambdas, nreps;
>>>>>
>>>>>   const int n = length(T0);
>>>>>
>>>>>   PROTECT_INDEX pin0, pin1;
>>>>>   SEXP L1;
>>>>>   PROTECT(L1 = allocVector(VECSXP,n));
>>>>>   SEXP id, lambda;
>>>>>
>>>>>   // Fixing size
>>>>>   for(i=0;i<n;i++)
>>>>>   {
>>>>>     SET_VECTOR_ELT(L1, i, allocVector(VECSXP, 2));
>>>>>     // SET_VECTOR_ELT(VECTOR_ELT(L1,i), 0, NEW_INTEGER(100));
>>>>>     // SET_VECTOR_ELT(VECTOR_ELT(L1,i), 1, NEW_NUMERIC(100));
>>>>>   }
>>>>>
>>>>>   // For over the list, i.e observations
>>>>>   for(i=0;i<n;i++)
>>>>>   {
>>>>>
>>>>>     R_CheckUserInterrupt();
>>>>>
>>>>>     // Checking if has to be analyzed.
>>>>>     if (
>>>>>       ((TE[0] == 1 & !T[i]) | (TE[0] == 2 & T[i])) |
>>>>>       (length(VECTOR_ELT(L0,i)) != 2)
>>>>>     )
>>>>>     {
>>>>>       SET_VECTOR_ELT(L1,i,R_NilValue);
>>>>>       continue;
>>>>>     }
>>>>>
>>>>>     // Checking how many rows the i-th data.frame has
>>>>>     nlambdas = length(VECTOR_ELT(VECTOR_ELT(L0,i),0));
>>>>>
>>>>>     // Pointing to the data.frame's original values
>>>>>     I = INTEGER_POINTER(AS_INTEGER(PROTECT(VECTOR_ELT(VECTOR_ELT(L0,i),0))));
>>>>>     L = NUMERIC_POINTER(AS_NUMERIC(PROTECT(VECTOR_ELT(VECTOR_ELT(L0,i),1))));
>>>>>
>>>>>     // Creating a copy of the pointed values
>>>>>     PROTECT_WITH_INDEX(id = duplicate(VECTOR_ELT(VECTOR_ELT(L0,i),0)), &pin0);
>>>>>     PROTECT_WITH_INDEX(lambda=duplicate(VECTOR_ELT(VECTOR_ELT(L0,i),1)), &pin1);
>>>>>
>>>>>     // Over the rows of the i-th data.frame
>>>>>     nreps=0;
>>>>>     for(l=0;l<nlambdas;l++)
>>>>>     {
>>>>>       // If the current lambda id is repeated, i.e. there are more individuals
>>>>>       // with the same covariates, then enter.
>>>>>       if (G[n+I[l]-1] > 1)
>>>>>       {
>>>>>         /* Changing the length of the object */
>>>>>         REPROTECT(SET_LENGTH(id, length(lambda) + G[n+I[l]-1] -1), pin0);
>>>>>         REPROTECT(SET_LENGTH(lambda,length(lambda) + G[n+I[l]-1] -1), pin1);
>>>>>
>>>>>         // Getting the new value
>>>>>         val = L[l]/G[n+I[l] - 1];
>>>>>         REAL(lambda)[l] = val;
>>>>>
>>>>>         // Looping over the full set of groups
>>>>>         m = -1,j = -1;
>>>>>         while(m < (G[n+I[l]-1] - 1))
>>>>>         {
>>>>>           // Looking for individuals in the same group
>>>>>           if (G[++j] != G[I[l]-1]) continue;
>>>>>
>>>>>           // If it is the current lambda, then do not assign it
>>>>>           if (j == (I[l] - 1)) continue;
>>>>>
>>>>>           INTEGER(id)[length(id) - (G[n+I[l]-1] - 1) + ++m] = j+1;
>>>>>           REAL(lambda)[length(id) - (G[n+I[l]-1] - 1) + m] = val;
>>>>>         }
>>>>>
>>>>>         nreps+=1;
>>>>>       }
>>>>>     }
>>>>>
>>>>>     if (nreps)
>>>>>     {
>>>>>       // Replacing elements of the list (modified)
>>>>>       SET_VECTOR_ELT(VECTOR_ELT(L1, i), 0, duplicate(id));
>>>>>       SET_VECTOR_ELT(VECTOR_ELT(L1, i), 1, duplicate(lambda));
>>>>>     }
>>>>>     else {
>>>>>       // Setting the list with the old elements
>>>>>       SET_VECTOR_ELT(VECTOR_ELT(L1, i), 0,
>>>>>         duplicate(VECTOR_ELT(VECTOR_ELT(L0,i),0)));
>>>>>       SET_VECTOR_ELT(VECTOR_ELT(L1, i), 1,
>>>>>         duplicate(VECTOR_ELT(VECTOR_ELT(L0,i),1)));
>>>>>     }
>>>>>
>>>>>     // Unprotecting elements
>>>>>     UNPROTECT(4);
>>>>>   }
>>>>>
>>>>>   Rprintf("Exito\n") ;
>>>>>   UNPROTECT(4);
>>>>>
>>>>>   return L1;
>>>>> }
>>>>>
>>>>> Thanks again in advance.
>>>>>
>>>>> George Vega Yon
>>>>> +56 9 7 647 2552
>>>>> http://ggvega.cl
>>>>>
>>>>> 2013/11/5 George Vega Yon <g.vega...@gmail.com>:
>>>>>>
>>>>>> Either way, understanding that it may not be the best way to do it, is
>>>>>> there anything wrong in what I'm doing??
>>>>>> George Vega Yon
>>>>>> +56 9 7 647 2552
>>>>>> http://ggvega.cl
>>>>>>
>>>>>> 2013/11/5 Gabriel Becker <gmbec...@ucdavis.edu>:
>>>>>>>
>>>>>>> George,
>>>>>>>
>>>>>>> My point is you don't need to create them and then grow them....
>>>>>>>
>>>>>>> for(i=0;i<n;i++)
>>>>>>> {
>>>>>>>   // Creating the "id" and "lambda" vectors. I do this in every
>>>>>>>   // repetition of the loop.
>>>>>>>
>>>>>>>   // ... Some other instructions where I set the value of an integer
>>>>>>>   // z, which tells how much the vectors have to grow ...
>>>>>>>
>>>>>>>   PROTECT(id=allocVector(INTSXP, 4 +z));
>>>>>>>   PROTECT(lambda=allocVector(REALSXP, 4 +z));
>>>>>>>
>>>>>>>   // ... some lines where I fill the vectors ...
>>>>>>>
>>>>>>>   // Storing the new vectors at the i-th element of the list
>>>>>>>   SET_VECTOR_ELT(VECTOR_ELT(L1, i), 0, duplicate(id));
>>>>>>>   SET_VECTOR_ELT(VECTOR_ELT(L1, i), 1, duplicate(lambda));
>>>>>>>
>>>>>>>   // Unprotecting the "id" and "lambda" vectors
>>>>>>>   UNPROTECT(2);
>>>>>>> }
>>>>>>>
>>>>>>> ~G
>>>>>>>
>>>>>>> On Tue, Nov 5, 2013 at 1:56 PM, George Vega Yon <g.vega...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Gabriel,
>>>>>>>>
>>>>>>>> While the length (in terms of number of SEXP elements it stores) of L1
>>>>>>>> doesn't change, the vectors within L1 do (sorry if I didn't explain
>>>>>>>> it well before).
>>>>>>>>
>>>>>>>> The post was about a SEXP object that grows; in my case, every pair of
>>>>>>>> vectors in L1 (id and lambda) can change lengths, which is why I need
>>>>>>>> to reprotect them. I populate the i-th element of L1 by creating the
>>>>>>>> vectors "id" and "lambda", setting the length of these according to
>>>>>>>> some rule (that's the part where lengths change)...
>>>>>>>> here is a reduced form of my code:
>>>>>>>>
>>>>>>>> //////////////////////////////////////// C ////////////////////////////////////////
>>>>>>>> const int n = length(L0);
>>>>>>>> SEXP L1;
>>>>>>>> PROTECT(L1 = allocVector(VECSXP,n));
>>>>>>>> SEXP id, lambda;
>>>>>>>>
>>>>>>>> // Fixing size
>>>>>>>> for(i=0;i<n;i++)
>>>>>>>>   SET_VECTOR_ELT(L1, i, allocVector(VECSXP, 2));
>>>>>>>>
>>>>>>>> for(i=0;i<n;i++)
>>>>>>>> {
>>>>>>>>   // Creating the "id" and "lambda" vectors. I do this in every
>>>>>>>>   // repetition of the loop.
>>>>>>>>   PROTECT_WITH_INDEX(id=allocVector(INTSXP, 4), &ipx0);
>>>>>>>>   PROTECT_WITH_INDEX(lambda=allocVector(REALSXP, 4), &ipx1);
>>>>>>>>
>>>>>>>>   // ... Some other instructions where I set the value of an integer
>>>>>>>>   // z, which tells how much the vectors have to grow ...
>>>>>>>>
>>>>>>>>   REPROTECT(SET_LENGTH(id, length(lambda) + z), ipx0);
>>>>>>>>   REPROTECT(SET_LENGTH(lambda,length(lambda) + z), ipx1);
>>>>>>>>
>>>>>>>>   // ... some lines where I fill the vectors ...
>>>>>>>>
>>>>>>>>   // Storing the new vectors at the i-th element of the list
>>>>>>>>   SET_VECTOR_ELT(VECTOR_ELT(L1, i), 0, duplicate(id));
>>>>>>>>   SET_VECTOR_ELT(VECTOR_ELT(L1, i), 1, duplicate(lambda));
>>>>>>>>
>>>>>>>>   // Unprotecting the "id" and "lambda" vectors
>>>>>>>>   UNPROTECT(2);
>>>>>>>> }
>>>>>>>>
>>>>>>>> UNPROTECT(1);
>>>>>>>>
>>>>>>>> return L1;
>>>>>>>> //////////////////////////////////////// C ////////////////////////////////////////
>>>>>>>>
>>>>>>>> I can't set the length from the start because every pair of vectors in
>>>>>>>> L1 has a different length, lengths that I cannot tell before starting
>>>>>>>> the loop.
>>>>>>>>
>>>>>>>> Thanks for your help,
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> George Vega Yon
>>>>>>>> +56 9 7 647 2552
>>>>>>>> http://ggvega.cl
>>>>>>>>
>>>>>>>> 2013/11/5 Gabriel Becker <gmbec...@ucdavis.edu>:
>>>>>>>>>
>>>>>>>>> George,
>>>>>>>>>
>>>>>>>>> I don't see the relevance of the stackoverflow post you linked. In the
>>>>>>>>> post, the author wanted to change the length of an existing "mother
>>>>>>>>> list" (matrix, etc), while you specifically state that the length of
>>>>>>>>> L1 will not change.
>>>>>>>>>
>>>>>>>>> You say that the child lists (vectors if they are INTSXP/REALSXP) are
>>>>>>>>> variable, but that is not what the linked post was about unless I am
>>>>>>>>> completely missing something.
>>>>>>>>>
>>>>>>>>> I can't really say more without knowing the details of how the vectors
>>>>>>>>> are being created and why they cannot just have the right length from
>>>>>>>>> the start.
>>>>>>>>>
>>>>>>>>> As for the error, that is a weird one. I imagine it means that a SEXP
>>>>>>>>> thinks that it has a type other than ones defined in Rinternals. I
>>>>>>>>> can't speak to how that could have happened from what you posted though.
>>>>>>>>>
>>>>>>>>> Sorry I can't be of more help,
>>>>>>>>> ~G
>>>>>>>>>
>>>>>>>>> On Mon, Nov 4, 2013 at 8:00 PM, George Vega Yon <g.vega...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Dear R-devel,
>>>>>>>>>>
>>>>>>>>>> A couple of weeks ago I started to use the R C API for package
>>>>>>>>>> development. Without knowing much about C, I've been able to write
>>>>>>>>>> some routines successfully... until now.
>>>>>>>>>>
>>>>>>>>>> My problem consists of dynamically creating a list ("L1") of lists
>>>>>>>>>> using .Call; the tricky part is that each element of the "mother list"
>>>>>>>>>> contains two vectors (INTSXP and REALSXP types) with varying sizes,
>>>>>>>>>> sizes that I set while I'm looping over the elements of another list
>>>>>>>>>> ("L0", the input list). The steps I've followed are:
>>>>>>>>>>
>>>>>>>>>> FIRST: Create the "mother list" of size "n=length(L0)" (doesn't
>>>>>>>>>> change) and protect it as
>>>>>>>>>> PROTECT(L1=allocVector(VECSXP, length(L0)))
>>>>>>>>>> and fill it with vectors of length two:
>>>>>>>>>> for(i=0;i<n;i++) SET_VECTOR_ELT(L1,i, allocVector(VECSXP, 2));
>>>>>>>>>>
>>>>>>>>>> then, for each element of the mother list:
>>>>>>>>>>
>>>>>>>>>> for(i=0;i<n;i++) {
>>>>>>>>>>
>>>>>>>>>> SECOND: By reading this post on Stack Overflow
>>>>>>>>>>
>>>>>>>>>> http://stackoverflow.com/questions/7458364/growing-an-r-matrix-inside-a-c-loop/7458516#7458516
>>>>>>>>>>
>>>>>>>>>> I understood that it was necessary to (1) create the "child lists" and
>>>>>>>>>> protect them with PROTECT_WITH_INDEX, and (2) change their size
>>>>>>>>>> using SET_LENGTH (Rf_lengthgets) and REPROTECT the lists in order
>>>>>>>>>> to tell the GC that the vectors had changed.
>>>>>>>>>>
>>>>>>>>>> THIRD: Once my two vectors are done ("id" and "lambda"), assign them
>>>>>>>>>> to the i-th element of the "mother list" L1 using
>>>>>>>>>> SET_VECTOR_ELT(VECTOR_ELT(L1,i), 0, duplicate(id));
>>>>>>>>>> SET_VECTOR_ELT(VECTOR_ELT(L1,i), 1, duplicate(lambda));
>>>>>>>>>>
>>>>>>>>>> and unprotect the elements protected with index: UNPROTECT(2);
>>>>>>>>>>
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> FOURTH: Unprotect the "mother list" (L1) and return it to R
>>>>>>>>>>
>>>>>>>>>> With small datasets this works fine, but after trying with bigger ones
>>>>>>>>>> R (my code) keeps failing and returning a strange error that I haven't
>>>>>>>>>> been able to identify (or find on the web)
>>>>>>>>>>
>>>>>>>>>> "unimplemented type (29) in 'duplicate'"
>>>>>>>>>>
>>>>>>>>>> This happens right after I try to use the returned list from my
>>>>>>>>>> routine (trying to print it or building a data-frame).
>>>>>>>>>>
>>>>>>>>>> Does anyone have an idea of what I am doing wrong?
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>> PS: I didn't want to copy the entire function... but if you need it
>>>>>>>>>> I can do it.
>>>>>>>>>>
>>>>>>>>>> George Vega Yon
>>>>>>>>>> +56 9 7 647 2552
>>>>>>>>>> http://ggvega.cl
>>>>>>>>>>
>>>>>>>>>> ______________________________________________
>>>>>>>>>> R-devel@r-project.org mailing list
>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Gabriel Becker
>>>>>>>>> Graduate Student
>>>>>>>>> Statistics Department
>>>>>>>>> University of California, Davis
>>>>
>>>> --
>>>> Romain Francois
>>>> Professional R Enthusiast
>>>> +33(0) 6 28 91 30 30
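
A minimal sketch of the two pieces of advice given in this thread, written in plain C++ with no R headers so that it compiles on its own: work out the output size before filling it (Gabriel's point) and let std::vector handle any growth (Romain's point). The `Frame` struct and the `distribute` function are hypothetical names made up for illustration, not part of Rcpp or the R API; `group_id` and `group_size` stand in for the two columns of G0. The scan over group members is bounded by the number of observations. From reading the quoted C function, the `while(m < (G[n+I[l]-1] - 1))` loop seems to look for one more match than there are other members in the group, which would let `j` run past the end of `G`; the thread never confirms this, so treat it as a guess rather than a diagnosis.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one two-column data.frame from L0:
// 1-based row ids and their weights.
struct Frame {
    std::vector<int> id;
    std::vector<double> lambda;
};

// Hypothetical helper: split each row's lambda evenly across all members
// of that row's group. group_id[k] and group_size[k] describe observation
// k + 1 (the two columns of G0).
Frame distribute(const Frame& in,
                 const std::vector<int>& group_id,
                 const std::vector<int>& group_size) {
    const int n = static_cast<int>(group_id.size());
    assert(group_size.size() == group_id.size());
    assert(in.id.size() == in.lambda.size());

    // The final length is known up front: one output row per group member.
    std::size_t total = 0;
    for (int id : in.id) {
        assert(id >= 1 && id <= n);
        assert(group_size[id - 1] >= 1);
        total += static_cast<std::size_t>(group_size[id - 1]);
    }

    Frame out;
    out.id.reserve(total);      // a hint only; push_back still grows if needed
    out.lambda.reserve(total);

    // Original rows keep their position, with lambda divided by group size.
    for (std::size_t r = 0; r < in.id.size(); ++r) {
        out.id.push_back(in.id[r]);
        out.lambda.push_back(in.lambda[r] / group_size[in.id[r] - 1]);
    }

    // The other members of each group are appended after the original rows.
    for (std::size_t r = 0; r < in.id.size(); ++r) {
        const int self = in.id[r] - 1;
        if (group_size[self] <= 1) continue;
        const double val = in.lambda[r] / group_size[self];
        for (int k = 0; k < n; ++k) {   // the scan never goes past n
            if (k != self && group_id[k] == group_id[self]) {
                out.id.push_back(k + 1);
                out.lambda.push_back(val);
            }
        }
    }
    return out;
}
```

Inside an Rcpp function, `out.id` and `out.lambda` could then be passed to `DataFrame::create` in the same way as `x` and `y` in Romain's example.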