@Brian thanks a lot for confirmation!!!

On Saturday, November 25, 2023 at 12:09:25 AM UTC+8 Brian Candler wrote:

> Yes, it says it will not scan the *entries of the map* (i.e. the keys and 
> the values held *within* the map).
>
> But if the map points to (say) 100,000 separate hash buckets, then the 
> memory allocated for those buckets clearly must be marked as not eligible 
> for garbage collection.
>
> On Friday, 24 November 2023 at 15:43:25 UTC Zhao Weng wrote:
>
>> So in this case, if I understand it correctly, statement of this doc: 
>> https://go101.org/optimizations/6-map.html
>>
>> If the key type and element type of a map both don't contain pointers, 
>> then in the scan phase of a GC cycle, the garbage collector will not scan 
>> the entries of the map. This could save much time.
>> just mean we can save further garbage tracking, but buckets will be 
>> tracked no matter what, Am I right?
>>
>> On Friday, November 24, 2023 at 9:20:34 PM UTC+8 Brian Candler wrote:
>>
>>> A map is a dynamic data structure, which allocates more pieces (buckets) 
>>> as the map grows. Google "golang map internals" to find articles and videos 
>>> which explain the data structure in more detail.
>>>
>>> Clearly, all parts of that structure must be tracked by the GC, 
>>> otherwise they would get garbage collected.
>>>
>>> If the keys and values are ints, then they're stored directly in hash 
>>> slots and no further garbage tracking is required. If they're pointers, 
>>> then additionally the GC would have to track the memory used by the objects 
>>> pointed to.
>>>
>>> On Friday, 24 November 2023 at 10:44:20 UTC Zhao Weng wrote:
>>>
>>>> with a new program I seems to reproduce the problem:
>>>>
>>>> ```go
>>>> package main
>>>>
>>>> import (
>>>> "fmt"
>>>> "runtime"
>>>> "time"
>>>> )
>>>>
>>>> func main() {
>>>> m := make(map[int]int)
>>>> for i := 0; i < 10e6; i++ {
>>>> m[i] = 2 * i
>>>> }
>>>>
>>>> st := time.Now()
>>>> runtime.GC()
>>>> fmt.Printf("GC in %f seconds, len(m):%d\n", time.Since(st).Seconds(), 
>>>> len(m))
>>>>
>>>> // delete old data and add new data, while keeping len(m) == 10e6
>>>> for run := 1; run < 11; run++ {
>>>> for i := 0; i < 10e6; i++ {
>>>> delete(m, i*run)
>>>> }
>>>> for i := 0; i < 10e6; i++ {
>>>> m[i*(run+1)] = i * (run + 1)
>>>> }
>>>> }
>>>>
>>>> st = time.Now()
>>>> runtime.GC()
>>>> fmt.Printf("after delete and refill 10 runs, GC in %f seconds, len(m):
>>>> %d\n", time.Since(st).Seconds(), len(m))
>>>>
>>>> // delete old data and add new data, while keeping len(m) == 10e6
>>>> for run := 11; run < 101; run++ {
>>>> for i := 0; i < 10e6; i++ {
>>>> delete(m, i*run)
>>>> }
>>>> for i := 0; i < 10e6; i++ {
>>>> m[i*(run+1)] = i * (run + 1)
>>>> }
>>>> }
>>>> st = time.Now()
>>>> runtime.GC()
>>>> fmt.Printf("after delete and refill 100 runs, GC in %f seconds, len(m):
>>>> %d\n", time.Since(st).Seconds(), len(m))
>>>> }
>>>> ```
>>>>
>>>> And the output is:
>>>> ```bash
>>>> GC in 0.003220 seconds, len(m):10000000
>>>> after delete and refill 10 runs, GC in 0.023603 seconds, len(m):10000000
>>>> after delete and refill 100 runs, GC in 0.053591 seconds, 
>>>> len(m):10000000
>>>> ```
>>>>
>>>> for each run I empty the map[int]int and refill it with 10 million 
>>>> item, and as it goes GC duration seems to increase.
>>>>
>>>> On Friday, November 24, 2023 at 6:07:56 PM UTC+8 Zhao Weng wrote:
>>>>
>>>>> And sorry for long monologue missing the whole point. It all boil down 
>>>>> to one simple question:
>>>>>
>>>>> Does larger map[int]int do increase GC scan overhead by having more 
>>>>> buckets and old_buckets (which is reference by pointer) ?
>>>>>
>>>>> The trivial program above seems to suggest positive since it shows a 
>>>>> huge map[int]int can have many inuse_objects.
>>>>>
>>>>> For context, I have a long running service which has a long-lived 
>>>>> map[int]int in memory, we constantly add new data to it and maintain its 
>>>>> size by calling delete() when oversize,  we assume there should not 
>>>>> result 
>>>>> in GC overhead growth as time goes by. 
>>>>>
>>>>> Thanks
>>>>> On Friday, November 24, 2023 at 5:54:24 PM UTC+8 Zhao Weng wrote:
>>>>>
>>>>>> Sorry about the redundant spam like mail I send to the mail box, I do 
>>>>>> delete the conversation on the google groups, I send later emails for 
>>>>>> correcting typos and replace screenshot to text. I'm so sorry for that.
>>>>>>
>>>>>> On Friday, November 24, 2023 at 3:13:04 PM UTC+8 Kurtis Rader wrote:
>>>>>>
>>>>>>> You only need to ask for help once. You sent essentially the same 
>>>>>>> message twice. Also, your example program doesn't provide any time for 
>>>>>>> the 
>>>>>>> GC to run in any meaningful manner as far as I can tell. So I am 
>>>>>>> confused 
>>>>>>> how your trivial program relates to a real world program. That is, 
>>>>>>> while 
>>>>>>> simple reproductions of a problem are always a good idea that does not 
>>>>>>> seem 
>>>>>>> to be the case with your example reproduction. No real program would 
>>>>>>> instantiate a huge map then immediately terminate. I think you need to 
>>>>>>> provide more context and a more realistic program to reproduce the 
>>>>>>> problem.
>>>>>>>
>>>>>>> On Thu, Nov 23, 2023 at 10:10 PM 'Zhao Weng' via golang-nuts <
>>>>>>> golan...@googlegroups.com> wrote:
>>>>>>>
>>>>>>>> Dear Gophers:
>>>>>>>>
>>>>>>>> I have some questions about GC and map[int]int, please help.
>>>>>>>>
>>>>>>>> consider the following program:
>>>>>>>>
>>>>>>>> ```go
>>>>>>>> package main
>>>>>>>>
>>>>>>>> import (
>>>>>>>> "fmt"
>>>>>>>> "os"
>>>>>>>> "runtime/pprof"
>>>>>>>> )
>>>>>>>>
>>>>>>>> func main() {
>>>>>>>> f, err := os.OpenFile("memory.pb.gz", 
>>>>>>>> os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0666)
>>>>>>>> if err != nil {
>>>>>>>> panic(err)
>>>>>>>> }
>>>>>>>>
>>>>>>>> m := make(map[int]int)
>>>>>>>> for i := 0; i < 10e6; i++ {
>>>>>>>> m[i] = 2 * i
>>>>>>>> }
>>>>>>>>
>>>>>>>> if err := pprof.WriteHeapProfile(f); err != nil {
>>>>>>>> panic(err)
>>>>>>>> }
>>>>>>>> }
>>>>>>>> ```
>>>>>>>>
>>>>>>>> after inserting 10 million key/value pair, the result of this 
>>>>>>>> memory profile show inuse_objects count ~ 100k
>>>>>>>>
>>>>>>>> ```bash
>>>>>>>> go tool pprof -inuse_objects memory.pb.gz 
>>>>>>>> Type: inuse_objects
>>>>>>>> Time: Nov 23, 2023 at 3:40pm (CST)
>>>>>>>> Entering interactive mode (type "help" for commands, "o" for 
>>>>>>>> options)
>>>>>>>> (pprof) top
>>>>>>>> Showing nodes accounting for 127450, 100% of 127450 total
>>>>>>>>       flat  flat%   sum%        cum   cum%
>>>>>>>>     127450   100%   100%     127450   100%  main.main
>>>>>>>>          
>>>>>>>> ```
>>>>>>>>
>>>>>>>> in my view this inuse_objects is found by GC by following the 
>>>>>>>> bucket and old_buckets pointer of hmap
>>>>>>>>
>>>>>>>> ```go
>>>>>>>> // A header for a Go map.
>>>>>>>> type hmap struct {
>>>>>>>> // Note: the format of the hmap is also encoded in 
>>>>>>>> cmd/compile/internal/reflectdata/reflect.go.
>>>>>>>> // Make sure this stays in sync with the compiler's definition.
>>>>>>>> count int // # live cells == size of map. Must be first (used by 
>>>>>>>> len() builtin)
>>>>>>>> flags uint8
>>>>>>>> B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B 
>>>>>>>> items)
>>>>>>>> noverflow uint16 // approximate number of overflow buckets; see 
>>>>>>>> incrnoverflow for details
>>>>>>>> hash0 uint32 // hash seed
>>>>>>>>
>>>>>>>> buckets unsafe.Pointer // array of 2^B Buckets. may be nil if 
>>>>>>>> count==0.
>>>>>>>> oldbuckets unsafe.Pointer // previous bucket array of half the 
>>>>>>>> size, non-nil only when growing
>>>>>>>> nevacuate uintptr // progress counter for evacuation (buckets less 
>>>>>>>> than this have been evacuated)
>>>>>>>>
>>>>>>>> extra *mapextra // optional fields
>>>>>>>> }
>>>>>>>> ```
>>>>>>>>
>>>>>>>> but according to this doc: 
>>>>>>>> https://go101.org/optimizations/6-map.html
>>>>>>>>
>>>>>>>> If the key type and element type of a map both don't contain 
>>>>>>>> pointers, then in the scan phase of a GC cycle, the garbage collector 
>>>>>>>> will 
>>>>>>>> not scan the entries of the map. This could save much time.
>>>>>>>>
>>>>>>>> My question is:
>>>>>>>> maybe GC should not follow the bucket and old_buckets pointer of 
>>>>>>>> hmap?
>>>>>>>> or the doc above just forbid GC from scan the entries not the 
>>>>>>>> bucket and old_buckets pointer of hmap, and larger map[int]int do 
>>>>>>>> increase 
>>>>>>>> GC scan overhead by having more buckets and old_buckets?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "golang-nuts" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to golang-nuts...@googlegroups.com.
>>>>>>>> To view this discussion on the web visit 
>>>>>>>> https://groups.google.com/d/msgid/golang-nuts/b7049ce2-d5a8-49bc-ae44-7a73777494f4n%40googlegroups.com
>>>>>>>>  
>>>>>>>> <https://groups.google.com/d/msgid/golang-nuts/b7049ce2-d5a8-49bc-ae44-7a73777494f4n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> Kurtis Rader
>>>>>>> Caretaker of the exceptional canines Junior and Hank
>>>>>>>
>>>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/b0d0feef-fbe2-49b5-94d6-1a2e3fc216f7n%40googlegroups.com.

Reply via email to