I don't think memcpy works well for VECSXP. The elements being overwritten need to have their reference counts decreased and the new elements need to have theirs increased.
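To make the reference-count point concrete, here is a toy model in plain C. It is only a sketch: ToyObj, ToyList, toy_set_elt, toy_copy_elts and toy_memcpy_elts are made-up names, and nothing here is R's actual implementation. It just shows why a setter that adjusts counts and a raw memcpy leave the bookkeeping in different states. In real package code the safe copy would be a loop over SET_VECTOR_ELT(dst, i, VECTOR_ELT(src, i)).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an R object: only the reference count is modelled. */
typedef struct { int refcnt; } ToyObj;

/* Toy stand-in for a VECSXP: a small fixed-size array of element pointers. */
typedef struct { ToyObj *elt[4]; size_t n; } ToyList;

/* Analogue of SET_VECTOR_ELT: retain the new element, release the old one. */
static void toy_set_elt(ToyList *x, size_t i, ToyObj *value)
{
    assert(i < x->n);
    if (value)
        value->refcnt++;            /* retain first, so self-assignment is safe */
    if (x->elt[i]) {
        assert(x->elt[i]->refcnt > 0);
        x->elt[i]->refcnt--;        /* the overwritten element loses a reference */
    }
    x->elt[i] = value;
}

/* Copy n elements through the setter, so every count stays correct. */
static void toy_copy_elts(ToyList *dst, const ToyList *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        toy_set_elt(dst, i, src->elt[i]);
}

/* Copy n elements with memcpy: the pointers move, the counts do not. */
static void toy_memcpy_elts(ToyList *dst, const ToyList *src, size_t n)
{
    memcpy(dst->elt, src->elt, n * sizeof dst->elt[0]);
}
```

After toy_copy_elts the copied element is counted once per list that holds it and the overwritten element has lost a reference. After toy_memcpy_elts both counts are unchanged, so the copied element is under-counted and the overwritten one is over-counted.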
Also, I'm not entirely sure how accurate what I'm about to say is, but I think you need to use SET_TRUELENGTH and SET_GROWABLE_BIT along with SETLENGTH. There's an example here:

https://github.com/wch/r-source/blob/744b5d34e1b8eb839e5d49d91ab21c1fe6800856/src/main/subassign.c#L257

The example uses SET_STDVEC_LENGTH, which shouldn't be used; just replace it with SETLENGTH. So in your code, I'd replace:

    SETLENGTH(modelspace, nUnique);

with

    SET_GROWABLE_BIT(modelspace);
    SET_TRUELENGTH(modelspace, nModels);
    SETLENGTH(modelspace, nUnique);

On Wed, Jan 15, 2025, 10:30 Merlise Clyde, Ph.D. <cl...@duke.edu> wrote:

> Thanks for the added explanation Iris and Tomas!
>
> So looking at the code for xlengthgets, it does appear that I may take a
> memory hit for multiple large objects due to the second allocation before
> the old objects are possibly garbage collected. There are about 12 such
> instances per function that are returned (I do use a counter to keep
> track of the number of PROTECTed objects to UNPROTECT, for bookkeeping :-).
> For memory-limited machines, the alloc/copy was a problem for memory usage
> - and if I recall, it was one of the reasons I switched to SETLENGTH in 2008,
> which doesn't seem to do an allocation? If there is going to be an
> absolute ban on SETLENGTH in packages I'll probably need to address memory
> management differently for those cases.
>
> I did see a note before the function def'n of xlengthgets:
>
> /* (if it is vectorizable). We could probably be fairly */
> /* clever with memory here if we wanted to. */
>
> It would seem that memcpy would be more efficient for at least some of the
> types (REALSXP, INTSXP) unless I am missing something - but is there any
> way to be more clever with VECSXP?
>
> best,
> Merlise
>
> Merlise Clyde (she/her/hers)
> Professor of Statistical Science and Director of Graduate Studies
> Duke University
>
> ________________________________________
> From: Iris Simmons <ikwsi...@gmail.com>
> Sent: Wednesday, January 15, 2025 1:00 AM
> To: Merlise Clyde, Ph.D. <cl...@duke.edu>
> Cc: r-package-devel@r-project.org <r-package-devel@r-project.org>
> Subject: Re: [R-pkg-devel] Replacement for SETLENGTH
>
> Hi Merlise!
>
> Referring to here:
>
> https://github.com/wch/r-source/blob/bb5a829466f77a3e1d03541747d149d65e900f2b/src/main/builtin.c#L834
>
> It seems as though the object is only re-used if the new length is
> equal to the old length.
>
> If you use Rf_lengthgets, you will need to protect the return value.
> The code you wrote that uses protect indexes looks correct, and the
> reprotect is good because you no longer need the old object.
>
> 2 is the correct amount to unprotect. PROTECT and PROTECT_WITH_INDEX
> (as far as I know) are the only functions that increase the size of
> the protect stack, and so the only calls that need to be unprotected.
> Typically, people define `int nprotect = 0;` at the start of their
> functions, add `nprotect++;` after each PROTECT and PROTECT_WITH_INDEX
> call, and add `UNPROTECT(nprotect);` immediately before each return or
> function end. That makes it easier to keep track.
>
> I typically use R_PreserveObject and R_ReleaseObject to protect
> objects without a need to bind them somewhere in my package's
> namespace. For example, .onLoad() uses R_PreserveObject to
> protect some objects, and .onUnload() uses R_ReleaseObject to release
> the protected objects. I probably would not use that for what you're
> describing.
>
> Regards,
> Iris
>
> On Tue, Jan 14, 2025 at 11:26 PM Merlise Clyde, Ph.D.
> <cl...@duke.edu> wrote:
> >
> > I am trying to determine the best way to eliminate the use of
> > SETLENGTH to truncate over-allocated vectors in my package BAS, to
> > eliminate the NOTEs about non-API calls in anticipation of R 4.5.0.
> >
> > From WRE: "At times it can be useful to allocate a larger initial
> > result vector and resize it to a shorter length if that is
> > sufficient. The functions Rf_lengthgets and Rf_xlengthgets
> > accomplish this; they are analogous to using length(x) <- n in R.
> > Typically these functions return a freshly allocated object, but in
> > some cases they may re-use the supplied object."
> >
> > It looks like using
> >
> >     x = Rf_lengthgets(x, newsize);
> >     SET_VECTOR_ELT(result, 0, x);
> >
> > before returning works to resize without the performance hit
> > incurred by a copy. (Will this always re-use the supplied object if
> > newsize < old size?)
> >
> > There is no mention in section 5.9.2 about the need for
> > re-protection of the object, but it seems to be mentioned in some
> > packages as well as a really old thread about SET_LENGTH, which
> > looks like a non-API macro for lengthgets.
> >
> > Indeed, if I call gc() and then rerun my test I have had some
> > non-reproducible aborts in RStudio on my M3 Mac (caught once in
> > R -d lldb).
> >
> > Do I need to do something more like
> >
> >     PROTECT_INDEX ipx0;
> >     PROTECT_WITH_INDEX(x0 = allocVector(REALSXP, old_size), &ipx0);
> >
> >     PROTECT_INDEX ipx1;
> >     PROTECT_WITH_INDEX(x1 = allocVector(REALSXP, old_size), &ipx1);
> >
> >     /* fill in values in x0 and x1 up to new_size (random) < old_size */
> >     ...
> >     REPROTECT(x0 = Rf_lengthgets(x0, new_size), ipx0);
> >     REPROTECT(x1 = Rf_lengthgets(x1, new_size), ipx1);
> >
> >     SET_VECTOR_ELT(result, 0, x0);
> >     SET_VECTOR_ELT(result, 1, x1);
> >     ...
> >     UNPROTECT(2); /* or is this 4? */
> >     return(result);
> >
> > There is also a mention in WRE of R_PreserveObject and
> > R_ReleaseObject.
> >
> > I'm looking for advice on whether this is needed, or which approach
> > is better/more stable to replace SETLENGTH. (I have many, many
> > instances that need to be updated, so I'm trying to get some clarity
> > here before updating and running code through valgrind or other
> > sanitizers to catch any memory issues before submitting an update
> > to CRAN.)
> >
> > best,
> > Merlise

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel