It appears that gunclone_deep() and friends aren't documented in any proper way (and this is a root cause of the problem; the same applies to the memory leaks we have in the libsingular interface). Can you post a link to the docs you are reading?
I can only find an email exchange from 2018 where similar issues are discussed. It seems that gunclone_deep should be used sparingly. The last message in that thread, from Jeroen Demeyer, says:

> On 2018-08-20 18:14, Karim Belabas wrote:
>> In libpari code, it's usually easy to explicitly call a destructor [e.g. obj_free] any time a function such as 'ellinit' was used.
> It's not easy if the list of functions "such as 'ellinit'" is not known.

<https://pari.math.u-bordeaux.fr/archives/pari-dev-1808/msg00012.html>

On 2 September 2024 03:30:14 BST, Marc Culler <marc.cul...@gmail.com> wrote:

> Unsurprisingly, changing gunclone to gunclone_deep in Gen.__dealloc__ is not a full fix. It stops this leak, but it also causes a crash when running the doctests. On macOS the crash error is "freeing a pointer which was not allocated". On Linux the error is "corrupted size vs. prev_size in fastbins".
>
> I won't force this list to participate in the full debugging experience. The basic problem has been identified. The remedy is far from clear.
>
> - Marc
>
> On Sun, Sep 1, 2024, 5:37 PM Marc Culler <marc.cul...@gmail.com> wrote:
>
>> This is now fixed in cypari. The fix consisted of replacing gunclone by gunclone_deep in the Gen.__dealloc__ method. The same change would probably help in cypari2, although I would not expect it to fix the other memory leaks reported in issue #112.
>>
>> Here is the explanation. The Pari GEN returned by pari.ellinit is of type t_VEC, with entries of various types including t_INT and t_VECSMALL. The function pari.ellrootno returns a Python int, but it has side effects: it also inserts items into the t_VEC GEN representing the elliptic curve. That t_VEC GEN is managed by a Python Gen object, and that Python Gen object would pass the t_VEC GEN to gunclone in its __dealloc__ method. However, that was not sufficient to cause the new items which were inserted into the t_VEC GEN by pari.ellrootno to also be uncloned. Consequently, those extra GENs remained on the Pari heap forever. Once the Python object was gone, there was no event which would cause them to be destroyed.
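>>
>> One can watch this happening at the Python level with pari.getheap(). A minimal sketch (assuming cypari as above; exact counts can differ slightly, since getheap itself may touch the heap):
>>
>>     from cypari import pari
>>
>>     n0 = pari.getheap()[0]    # number of objects on the Pari heap now
>>     e = pari.ellinit([1, 0])
>>     pari.ellrootno(e)         # side effect: splices new GENs into e
>>     del e                     # Gen.__dealloc__ unclones the t_VEC
>>     n1 = pari.getheap()[0]
>>     # With plain gunclone, n1 stays above n0: the spliced-in components
>>     # outlive their container. With gunclone_deep they are freed too.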
>>
>> If you read the Pari documentation you will "learn" that gunclone_deep is only useful in the context of the peculiar memory model used by gp. Well, apparently it is also useful in the context of the peculiar memory model used by cypari and cypari2.
>>
>> I tested the cypari fix on linux ubuntu using the test function described earlier in this thread. With 10**8 iterations it never used over 8.4 MB of memory.
>>
>> - Marc
>>
>> On Sunday, September 1, 2024 at 12:07:26 PM UTC-6 Marc Culler wrote:
>>
>>> My experiments with cypari so far indicate the opposite of what I expected. (This is with one small but significant bug fix that I found along the way.) The main observations are:
>>>
>>> (1) Without the call to pari.ellrootno there is no memory leak. The call to pari.getheap returns the same value before and after creating many elliptic curves in a for loop.
>>>
>>> (2) Calling pari.ellrootno(e) in a loop, with the same elliptic curve e, also does not increase the Pari heap size.
>>>
>>> (3) With both the call to pari.ellinit and the call to pari.ellrootno in the loop, every Python Gen object created in the loop gets deallocated, and its Pari GEN gets uncloned with reference count 1, meaning it is removed from the heap. Nonetheless, the heap grows considerably during the loop.
>>>
>>> I conclude that Pari is creating and cloning its own GENs (i.e. GENs not managed by any Python object) during the call to ellrootno and is not uncloning all of those GENs. But it only does that with a newly created curve.
>>>
>>> - Marc
>>>
>>> On Sunday, September 1, 2024 at 10:24:33 AM UTC-6 Marc Culler wrote:
>>>
>>>> I would say that the getheap behavior is a symptom of the same memory management bug(s). Also, it is not as simple as you suggest. It is not always true that every call to getheap leaves something on the heap:
>>>>
>>>> >>> from cypari import pari
>>>> >>> pari.getheap()
>>>> [4, 41]
>>>> >>> pari.getheap()
>>>> [5, 58]
>>>> >>> pari.getheap()
>>>> [5, 58]
>>>> >>> pari.getheap()
>>>> [5, 58]
>>>>
>>>> It is also not true that the cypari and cypari2 behaviors are equivalent, although they are similar. (Running the test loop with 10**5 iterations does not use anything close to 28 GB with cypari.) I believe that some, but definitely not all, of the memory management issues -- namely the most glaring ones described in issue #112 -- were improved in cypari by removing the code which tries to keep wrapped Pari GENs on the stack instead of moving them to the heap.
>>>>
>>>> The behavior where Pari GENs go on the heap and never get removed is intertwined with Python memory management. Python and the Pari heap use independent reference counting schemes. The __dealloc__ method of a Python Gen object calls gunclone to reduce the reference count of the Pari heap GEN which is referenced by the Python Gen object. However, as demonstrated in issue #112, cypari2 was creating other internal Python objects which held references to a Python Gen wrapping a vector or matrix entry. That prevented those Python Gens from being destroyed, and therefore prevented the Pari GENs wrapped by those Python Gens from being removed from the Pari heap.
>>>>
>>>> Apart from the initial call to gclone when a Python Gen is created, I haven't found any code within cypari which could increase the reference count of a Pari GEN on the Pari heap. Unless Pari sometimes increases that reference count, this would suggest that the core issue lies with Python reference counts. Evidently Python Gen objects are not being dealloc'ed because other Python objects hold references to them, and this is preventing Pari from removing GENs from its heap.
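>>>>
>>>> The two-level reference counting is easy to demonstrate (a sketch, assuming cypari; the `hidden` list is a stand-in for the internal bookkeeping objects of issue #112):
>>>>
>>>>     import gc
>>>>     from cypari import pari
>>>>
>>>>     g = pari.ellinit([1, 0])    # Python Gen; its GEN is cloned to the heap
>>>>     hidden = [g]                # stand-in for an internal reference holder
>>>>     del g
>>>>     gc.collect()
>>>>     # The Gen is still reachable via `hidden`, so __dealloc__ has not run,
>>>>     # gunclone has not been called, and the t_VEC stays on the Pari heap --
>>>>     # the issue #112 failure mode.
>>>>     del hidden
>>>>     gc.collect()                # now the wrapper dies and its GEN is uncloned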
>>>>
>>>> On Saturday, August 31, 2024 at 1:11:45 PM UTC-5 dim...@gmail.com wrote:
>>>>
>>>>> On Sat, Aug 31, 2024 at 4:35 AM Marc Culler <marc....@gmail.com> wrote:
>>>>> >
>>>>> > As Dima says, and as the issue he mentions supports, the current cypari2 code which attempts to keep Pari GENs on the Pari stack as much as possible is badly broken. There are many situations where Python Gen objects cannot be garbage-collected after being destroyed. I am sure that is a big part of this problem. But I don't think it is the whole story.
>>>>> >
>>>>> > CyPari has returned to the older design, which moves the Pari GEN wrapped by a Python Gen to the Pari heap when the Python object is created. This eliminates the leaks reported in cypari2 issue #112. But in this context, I am seeing 12 GB of memory (including several gigabytes of swap) in use after I do the following in ipython:
>>>>> >
>>>>> > In [1]: from cypari import *
>>>>> > In [2]: def test(N):
>>>>> >    ...:     for a in range(1, N):
>>>>> >    ...:         e = pari.ellinit([a, 0])
>>>>> >    ...:         m = pari.ellrootno(e)
>>>>> > In [3]: %time test(10**5)
>>>>> > CPU times: user 699 ms, sys: 38.3 ms, total: 737 ms
>>>>> > Wall time: 757 ms
>>>>> > In [4]: %time test(10**6)
>>>>> > CPU times: user 7.47 s, sys: 392 ms, total: 7.86 s
>>>>> > Wall time: 7.93 s
>>>>> > In [5]: %time test(10**7)
>>>>> > CPU times: user 1min 41s, sys: 6.62 s, total: 1min 47s
>>>>> > Wall time: 1min 49s
>>>>>
>>>>> The picture is very similar to cypari2. (With cypari2, one has to run
>>>>>
>>>>>     pari = Pari()
>>>>>
>>>>> after the
>>>>>
>>>>>     from cypari2 import *
>>>>>
>>>>> but otherwise it's the same code.)
>>>>>
>>>>> You can inspect the Pari heap by calling pari.getheap(). Each iteration of the loop in test() leaves about 3 new objects on the heap, so after a test(10**5) run you get around 300000 objects there (and a test(10**6) run adds around 3 million). The following is with cypari:
>>>>>
>>>>> In [4]: %time test(10**5)
>>>>> CPU times: user 1.46 s, sys: 85.8 ms, total: 1.55 s
>>>>> Wall time: 1.55 s
>>>>> In [5]: pari.getheap()
>>>>> Out[5]: [300001, 14394782]
>>>>> In [6]: %time test(10**6)
>>>>> CPU times: user 14.9 s, sys: 756 ms, total: 15.7 s
>>>>> Wall time: 15.7 s
>>>>> In [7]: pari.getheap()
>>>>> Out[7]: [3299999, 163655656]
>>>>>
>>>>> With cypari2, similar:
>>>>>
>>>>> In [9]: pari.getheap()   # 10^5
>>>>> Out[9]: [299969, 14392931]
>>>>> In [12]: pari.getheap()  # 10^6
>>>>> Out[12]: [3299662, 163635286]
>>>>>
>>>>> And gc.collect() does not help: in either case the Pari heap stays this big.
>>>>>
>>>>> As well, with cypari, each call to pari.getheap() itself adds one object to the heap -- a bug, I guess (this does not happen with cypari2):
>>>>>
>>>>> In [14]: pari.getheap()
>>>>> Out[14]: [3300004, 163655741]
>>>>> In [15]: pari.getheap()
>>>>> Out[15]: [3300005, 163655758]
>>>>> In [16]: pari.getheap()
>>>>> Out[16]: [3300006, 163655775]
>>>>> In [17]: pari.getheap()
>>>>> Out[17]: [3300007, 163655792]
>>>>> In [18]: pari.getheap()
>>>>> Out[18]: [3300008, 163655809]
>>>>>
>>>>> Looks like a memory management bug in both cypari and cypari2.
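>>>>>
>>>>> For anyone who wants to reproduce this outside ipython, here is a self-contained sketch of the measurement (assuming cypari; swap in the two cypari2 lines as noted):
>>>>>
>>>>>     from cypari import pari
>>>>>     # For cypari2 instead:
>>>>>     #   from cypari2 import Pari
>>>>>     #   pari = Pari()
>>>>>
>>>>>     def test(N):
>>>>>         for a in range(1, N):
>>>>>             e = pari.ellinit([a, 0])
>>>>>             pari.ellrootno(e)
>>>>>
>>>>>     print(pari.getheap())  # [objects, words] before the loop
>>>>>     test(10**5)
>>>>>     print(pari.getheap())  # roughly 3 extra objects per iteration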
>>>>>
>>>>> Dima
>>>>>
>>>>> > - Marc
>>>>> >
>>>>> > On Thursday, August 29, 2024 at 1:19:05 PM UTC-5 dim...@gmail.com wrote:
>>>>> >
>>>>> >> It would be good to reproduce this with cypari2 alone. cypari2 is known to have similar kinds (?) of problems:
>>>>> >> https://github.com/sagemath/cypari2/issues/112
>>>>> >>
>>>>> >> On Thu, Aug 29, 2024 at 6:47 PM Nils Bruin <nbr...@sfu.ca> wrote:
>>>>> >> >
>>>>> >> > On Thursday 29 August 2024 at 09:51:04 UTC-7 Georgi Guninski wrote:
>>>>> >> >
>>>>> >> > I observe that the following does not leak:
>>>>> >> >
>>>>> >> >     E = EllipticCurve([5*13, 0])  # no leak
>>>>> >> >     rn = E.root_number()
>>>>> >> >
>>>>> >> > How do you know that doesn't leak? Do you mean that repeated execution of those commands in the same session does not swell memory use?
>>>>> >> >
>>>>> >> > The size of the leak is suspiciously close to a power of two.
>>>>> >> >
>>>>> >> > I don't think you can draw conclusions from that. Processes generally request memory in large blocks from the operating system, to amortize the high overhead of the operation. It may even be the case that 128 MB is the chunk size involved here! The memory allocated to a process by the operating system isn't a fully accurate measure of memory use in the process either: a heap manager can decide it's cheaper to request some new pages from the operating system than to reorganize its heap and reuse the fragmented space on it. I think for this loop, memory allocation consistently swells with repeated execution, so there probably really is something leaking. But given that it's not in GC-tracked objects on the Python heap, one would probably need valgrind information or a keen look at the code involved to locate where it's coming from.
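
One way to tell allocator chunking apart from a genuine leak, along the lines of Nils's remark, is to watch the process's peak RSS across repeated runs of the loop. A rough sketch, using only the standard library (assuming cypari; note that ru_maxrss units differ by platform):

    import resource
    import sys
    from cypari import pari

    def test(N):
        for a in range(1, N):
            e = pari.ellinit([a, 0])
            pari.ellrootno(e)

    def peak_rss_mb():
        rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # ru_maxrss is reported in kilobytes on Linux, bytes on macOS
        return rss / 1024 if sys.platform != "darwin" else rss / (1024 * 1024)

    # A one-time jump can be a single big allocator chunk; if the peak
    # keeps climbing run after run, something really is leaking.
    for _ in range(5):
        test(10**5)
        print(peak_rss_mb())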