Brian Granger wrote: > On Thu, Apr 10, 2008 at 10:27 AM, Michael.Abshoff > <[EMAIL PROTECTED]> wrote: >> Brian Granger wrote: >> >>> Hi, >>> >> Hi Brian, >> >>
Hi Brian, >> >>> (dual posted to sage and cython) >>> >>> A few of us (ipython and mpi4py devs) are wondering what the >>> best/safest way of allocating dynamic memory in a local scope >>> (method/function) is when using cython. An example would be if you >>> need an array of c ints that is locally scoped. >>> >>> The big question is how to make sure that the memory gets freed - even >>> if something goes wrong in the function/method. That is, you want to >>> prevent memory leaks. It looks like in sage, the >>> sage_malloc/sage_free functions are used for this purpose: >>> >> They generally aren't used in most of the code. The idea for those >> functions is that in the future we can wrap other allocators like slab >> allocators. > > Maybe so, but I am am not very familiar with the sage codebase and I > quickly found numerous examples of sage_malloc :) Also, from Williams > reponse it sounds like the idea is that sage code _would_ use these > functions. Also, how else would you allocate dynamic memory? Well, in the end you end up using sbrk() anyway, but I don't see what is wrong with malloc itself? sage_malloc was introduced a while back to make it possible to switch to a slab allocator like omalloc potentially to see if there is any benefit from it. >> >>> from sage/graphs/graph_isom.pyx: >>> >>> 176 def incorporate_permutation(self, gamma): >>> 202 cdef int *_gamma = <int *> sage_malloc( n * sizeof(int) ) >>> 203 if not _gamma: >>> 204 raise MemoryError("Error allocating memory.") >>> 205 for k from 0 <= k < n: >>> 206 _gamma[k] = gamma[k] >>> 207 self._incorporate_permutation(_gamma, n) >>> 208 sage_free(_gamma) >>> >>> Because sage_malloc is #defined to malloc in stdsage.h, I think there >>> is a significant potential for memory leaks in code like this. Are we >>> thinking correctly on this issue? Isn't this a huge problem? >>> >> Well, I don't see an advantage in using Python's allocator there. It is >> likely slower for large allocations and make debugging memory issues much >> more complicated since issues like pointer corruption is significantly >> harder to debug. > > I am not concerned about performance here, but rather memory leaks. I > don't think it is a good idea to trade memory leaks for performance. Absolutely not, but if you want to write exception safe extensions just write them in exception safe C++ using autoptr & friends. While you might not be too concerned about performance here the issue is still debuggability. If you ran standard Python under valgrind (after disabling pymalloc) you will get [EMAIL PROTECTED]:/scratch/mabshoff/release-cycle/sage-3.0.alpha2/local/bin$ valgrind --tool=memcheck --leak-resolution=high ./python ==12347== Memcheck, a memory error detector. ==12347== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al. ==12347== Using LibVEX rev 1812, a library for dynamic binary translation. ==12347== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP. ==12347== Using valgrind-3.4.0.SVN, a dynamic binary instrumentation framework. ==12347== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al. ==12347== For more details, rerun with: -v ==12347== Python 2.5.1 (r251:54863, Apr 6 2008, 21:59:15) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> ==12347== ==12347== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 9 from 2) ==12347== malloc/free: in use at exit: 599,274 bytes in 2,441 blocks. ==12347== malloc/free: 12,890 allocs, 10,449 frees, 2,420,712 bytes allocated. ==12347== For counts of detected errors, rerun with: -v ==12347== searching for pointers to 2,441 not-freed blocks. ==12347== checked 998,864 bytes. ==12347== ==12347== LEAK SUMMARY: ==12347== definitely lost: 0 bytes in 0 blocks. ==12347== possibly lost: 15,736 bytes in 54 blocks. ==12347== still reachable: 583,538 bytes in 2,387 blocks. ==12347== suppressed: 0 bytes in 0 blocks. So we have roughly 2,400 blocks of memory that Python itself cannot properly deallocate due to problems with their own garbage collection. I will spare you the numbers for Sage (they are much worst due to still to be solved problems with Cython and extensions in general) and starting to poke for leaks in that pile of crap is not something I find appealing. Sure, once all mallocs in Sage are converted to sage_malloc one could switch and see what happens, but I can guarantee you that debugging some problem stemming from us doing something stupid with malloc compared to tracking it down inside Python is not a contest. Check out #1337 in our track to see such a case. And by the way: The python interpreter itself does leak some small bits of memory while running. So the above while it looks like a really good idea is far from a fool proof solution. >> >> >>> Lisandro Dalcin (author of mpi4py) came up with the following trick >>> that, while more complicated, prevents memory leaks: >>> >>> cdef extern from "Python.h": >>> object PyString_FromStringAndSize(char*,Py_ssize_t) >>> char* PyString_AS_STRING(object) >>> >>> cdef inline object pyalloc_i(int size, int **i): >>> if size < 0: size = 0 >>> cdef Py_ssize_t n = size * sizeof(int) >>> cdef object ob = PyString_FromStringAndSize(NULL, n) >>> i[0] = <int*> PyString_AS_STRING(ob) >>> return ob >>> >>> and now >>> >>> def foo(sequence): >>> cdef int size = len(sequence), >>> cdef int *buf = NULL >>> cdef object tmp = pyalloc_i(size, &buf) >>> >>> This could probably be adapted into a malloc-like function. What do >>> people think? >>> > >> We valgrind the complete test suite at least weekly and with ever >> increasing doctest coverage I don't see that we have a problem. Python's >> memory management has also some serious issues and I doubt it will offer any >> advantage for anything but loads of small allocs. And by switching to a slab >> allocator on our end will fix that problem. > > But, test test suite doesn't test for all of the odd input that users > will feed to sage. These are the cases that will leak memory and > there is not possible way to test for all of them. Also debugging > memory leaks is super nasty. Compared to that pain, having a slightly > slower memory allocator is not a big deal. Well, as long as you write code in C memory leaks when something goes wrong is something you have to live with. And python is far from perfect regarding memory management too IMHO as I point out above. Another issue is that once we have extensions that do work with threads we no longer can use Python's allocation since it isn't thread safe. Adding some more checks to the Sage codebase around allocations is something that ought to be done, but on the list of things to fix potential memory leaks from garbage input is low on the list of my personal priority as long as we have real leaks to deal with. Feel free to try out the above and report back if it fixes issues and how much of an impact on performance it has. > Brian Cheers, Michael >>> Thanks, >>> >>> Brian >>> >> Cheers, >> >> Michael >> >> >>> _______________________________________________ >>> Cython-dev mailing list >>> [EMAIL PROTECTED] >>> http://codespeak.net/mailman/listinfo/cython-dev >>> >>> >> > --~--~---------~--~----~------------~-------~--~----~ To post to this group, send email to sage-devel@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sage-devel URLs: http://www.sagemath.org -~----------~----~----~----~------~----~------~--~---