Haha! I was expecting you or @carnival (sorry, don't know the name) to respond. Been following your issues/PRs as they are quite interesting to read!
Those results are quite surprising! I was expecting the GC to perform far worse than that in comparison. You have successfully scratched my itch on wondering if it would have been an improvement! Thanks for the numbers and the explanation, it is much appreciated. The mind does tend to wonder when your code has been running for 12 hours. On Saturday, 1 August 2015 23:59:25 UTC+2, Yichao Yu wrote: > On Sat, Aug 1, 2015 at 5:10 PM, Michael Louwrens > <[email protected] <javascript:>> wrote: > > I haven't tested the new generational GC so this may all be moot but: > > > > I had a recent thought. Wouldn't a manual free method that frees the > object > > and tells the GC that it this object doesn't exist anymore be useful in > some > > cases? > > I can imagine anywhere where you have temporary objects allocated it > would > > be helpful or when you have large arrays/arrays of objects which you > know > > after a certain point will never be referenced again. > > > > This would probably end up being abused as a premature optimization and > > potentially cause crashes if freed memory is accessed I do realize. > There > > also will not be too many valid uses of it. I do feel that it could have > > some useful functionality however. Those that use gc_disable would then > be > > able to free memory. > > > > Just a thought! Even if it is shot down the explanation will be here and > > others with a similar thought will hopefully be able to find it (If a > > similar one exists - I could not find it. My Google-fu is a bit lacking > > though, just link me to the discussion if so thanks!) > > Havening looked at memory management in julia recently, my impression > is that these manually memory management techniques cannot improve the > performance on top of a GC. > > I'll first show the evidence with numbers. > > ``` > julia> f_gc(n) = for i in 1:n > Ref(1) > end > f_gc (generic function with 1 method) > > julia> f_malloc(n) = for i in 1:n > ptr = ccall(:malloc, Ptr{Void}, (Csize_t,), 16) > ccall(:free, Void, (Ptr{Void},), ptr) > end > f_malloc (generic function with 1 method) > > julia> @time f_gc(1) > 0.002025 seconds (154 allocations: 10.496 KB) > > julia> @time f_malloc(1) > 0.002245 seconds (4 allocations: 160 bytes) > > julia> gc(); @time f_malloc(100_000_000) > 2.179464 seconds (4 allocations: 160 bytes) > > julia> gc(); @time f_gc(100_000_000) > 0.527786 seconds (100.00 M allocations: 1.490 GB, 10.38% gc time) > ``` > > I suppose `f_malloc` is what people would want to transform `f_gc` > into (the size 16 includes the tag on 64bit platform, changing to > smaller number doesn't change the time by a lot) > > The timing above clearly shows that the gc is faster than manual > memory management. You can even check the code_llvm output to make > sure julia didn't generate bad code for the malloc version. (Note that > if you write the malloc in c, it will likely be much faster but that's > because the compiler can remove the call to malloc and free if it can > figure out the result pointer doesn't escape, and it should in this > simple case. This optimization is also possible to do in julia and is > an on going effort e.g. https://github.com/JuliaLang/julia/pull/12205) > > As surprising as it seems for people from c/c++, there is good reasons > behind it. I'm going to describe my understanding of it but others > with more experience on memory management should correct me if I make > any mistakes (likely ;-p ). > > To understand this issue, it is important to know that freeing memory > can be as expensive as allocating memory. You may expect that the > garbage collector don't need to do anything we you free a piece of > memory but that is actually not the case. By definition, a free memory > block means that the program should be able to allocate in it, which > means that the allocation should be able to find it. Therefore, what > needs to happen when you manually free a piece of memory is that the > underlying system will push it in some data structure and pop it out > on allocation. Unless your program is using more and more memory, none > of these needs any system call and the push and pop operation are > likely similarly expensive. > > With this in mind, the reason a GC is faster is that it can do these > expensive operations in a more effecient way. Building a free list (as > in julia GC) is expensive partly because of the memory access pattern > and since the GC can build the free list in one go, it can do this in > a linear order that is more cache efficient than what would happen for > manual memory management. It usually also have more knowledge about > the object layout etc which can enable further optimization. >
