@Brian thanks a lot for confirmation!!! On Saturday, November 25, 2023 at 12:09:25 AM UTC+8 Brian Candler wrote:
> Yes, it says it will not scan the *entries of the map* (i.e. the keys and > the values held *within* the map). > > But if the map points to (say) 100,000 separate hash buckets, then the > memory allocated for those buckets clearly must be marked as not eligible > for garbage collection. > > On Friday, 24 November 2023 at 15:43:25 UTC Zhao Weng wrote: > >> So in this case, if I understand it correctly, statement of this doc: >> https://go101.org/optimizations/6-map.html >> >> If the key type and element type of a map both don't contain pointers, >> then in the scan phase of a GC cycle, the garbage collector will not scan >> the entries of the map. This could save much time. >> just mean we can save further garbage tracking, but buckets will be >> tracked no matter what, Am I right? >> >> On Friday, November 24, 2023 at 9:20:34 PM UTC+8 Brian Candler wrote: >> >>> A map is a dynamic data structure, which allocates more pieces (buckets) >>> as the map grows. Google "golang map internals" to find articles and videos >>> which explain the data structure in more detail. >>> >>> Clearly, all parts of that structure must be tracked by the GC, >>> otherwise they would get garbage collected. >>> >>> If the keys and values are ints, then they're stored directly in hash >>> slots and no further garbage tracking is required. If they're pointers, >>> then additionally the GC would have to track the memory used by the objects >>> pointed to. >>> >>> On Friday, 24 November 2023 at 10:44:20 UTC Zhao Weng wrote: >>> >>>> with a new program I seems to reproduce the problem: >>>> >>>> ```go >>>> package main >>>> >>>> import ( >>>> "fmt" >>>> "runtime" >>>> "time" >>>> ) >>>> >>>> func main() { >>>> m := make(map[int]int) >>>> for i := 0; i < 10e6; i++ { >>>> m[i] = 2 * i >>>> } >>>> >>>> st := time.Now() >>>> runtime.GC() >>>> fmt.Printf("GC in %f seconds, len(m):%d\n", time.Since(st).Seconds(), >>>> len(m)) >>>> >>>> // delete old data and add new data, while keeping len(m) == 10e6 >>>> for run := 1; run < 11; run++ { >>>> for i := 0; i < 10e6; i++ { >>>> delete(m, i*run) >>>> } >>>> for i := 0; i < 10e6; i++ { >>>> m[i*(run+1)] = i * (run + 1) >>>> } >>>> } >>>> >>>> st = time.Now() >>>> runtime.GC() >>>> fmt.Printf("after delete and refill 10 runs, GC in %f seconds, len(m): >>>> %d\n", time.Since(st).Seconds(), len(m)) >>>> >>>> // delete old data and add new data, while keeping len(m) == 10e6 >>>> for run := 11; run < 101; run++ { >>>> for i := 0; i < 10e6; i++ { >>>> delete(m, i*run) >>>> } >>>> for i := 0; i < 10e6; i++ { >>>> m[i*(run+1)] = i * (run + 1) >>>> } >>>> } >>>> st = time.Now() >>>> runtime.GC() >>>> fmt.Printf("after delete and refill 100 runs, GC in %f seconds, len(m): >>>> %d\n", time.Since(st).Seconds(), len(m)) >>>> } >>>> ``` >>>> >>>> And the output is: >>>> ```bash >>>> GC in 0.003220 seconds, len(m):10000000 >>>> after delete and refill 10 runs, GC in 0.023603 seconds, len(m):10000000 >>>> after delete and refill 100 runs, GC in 0.053591 seconds, >>>> len(m):10000000 >>>> ``` >>>> >>>> for each run I empty the map[int]int and refill it with 10 million >>>> item, and as it goes GC duration seems to increase. >>>> >>>> On Friday, November 24, 2023 at 6:07:56 PM UTC+8 Zhao Weng wrote: >>>> >>>>> And sorry for long monologue missing the whole point. It all boil down >>>>> to one simple question: >>>>> >>>>> Does larger map[int]int do increase GC scan overhead by having more >>>>> buckets and old_buckets (which is reference by pointer) ? >>>>> >>>>> The trivial program above seems to suggest positive since it shows a >>>>> huge map[int]int can have many inuse_objects. >>>>> >>>>> For context, I have a long running service which has a long-lived >>>>> map[int]int in memory, we constantly add new data to it and maintain its >>>>> size by calling delete() when oversize, we assume there should not >>>>> result >>>>> in GC overhead growth as time goes by. >>>>> >>>>> Thanks >>>>> On Friday, November 24, 2023 at 5:54:24 PM UTC+8 Zhao Weng wrote: >>>>> >>>>>> Sorry about the redundant spam like mail I send to the mail box, I do >>>>>> delete the conversation on the google groups, I send later emails for >>>>>> correcting typos and replace screenshot to text. I'm so sorry for that. >>>>>> >>>>>> On Friday, November 24, 2023 at 3:13:04 PM UTC+8 Kurtis Rader wrote: >>>>>> >>>>>>> You only need to ask for help once. You sent essentially the same >>>>>>> message twice. Also, your example program doesn't provide any time for >>>>>>> the >>>>>>> GC to run in any meaningful manner as far as I can tell. So I am >>>>>>> confused >>>>>>> how your trivial program relates to a real world program. That is, >>>>>>> while >>>>>>> simple reproductions of a problem are always a good idea that does not >>>>>>> seem >>>>>>> to be the case with your example reproduction. No real program would >>>>>>> instantiate a huge map then immediately terminate. I think you need to >>>>>>> provide more context and a more realistic program to reproduce the >>>>>>> problem. >>>>>>> >>>>>>> On Thu, Nov 23, 2023 at 10:10 PM 'Zhao Weng' via golang-nuts < >>>>>>> golan...@googlegroups.com> wrote: >>>>>>> >>>>>>>> Dear Gophers: >>>>>>>> >>>>>>>> I have some questions about GC and map[int]int, please help. >>>>>>>> >>>>>>>> consider the following program: >>>>>>>> >>>>>>>> ```go >>>>>>>> package main >>>>>>>> >>>>>>>> import ( >>>>>>>> "fmt" >>>>>>>> "os" >>>>>>>> "runtime/pprof" >>>>>>>> ) >>>>>>>> >>>>>>>> func main() { >>>>>>>> f, err := os.OpenFile("memory.pb.gz", >>>>>>>> os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0666) >>>>>>>> if err != nil { >>>>>>>> panic(err) >>>>>>>> } >>>>>>>> >>>>>>>> m := make(map[int]int) >>>>>>>> for i := 0; i < 10e6; i++ { >>>>>>>> m[i] = 2 * i >>>>>>>> } >>>>>>>> >>>>>>>> if err := pprof.WriteHeapProfile(f); err != nil { >>>>>>>> panic(err) >>>>>>>> } >>>>>>>> } >>>>>>>> ``` >>>>>>>> >>>>>>>> after inserting 10 million key/value pair, the result of this >>>>>>>> memory profile show inuse_objects count ~ 100k >>>>>>>> >>>>>>>> ```bash >>>>>>>> go tool pprof -inuse_objects memory.pb.gz >>>>>>>> Type: inuse_objects >>>>>>>> Time: Nov 23, 2023 at 3:40pm (CST) >>>>>>>> Entering interactive mode (type "help" for commands, "o" for >>>>>>>> options) >>>>>>>> (pprof) top >>>>>>>> Showing nodes accounting for 127450, 100% of 127450 total >>>>>>>> flat flat% sum% cum cum% >>>>>>>> 127450 100% 100% 127450 100% main.main >>>>>>>> >>>>>>>> ``` >>>>>>>> >>>>>>>> in my view this inuse_objects is found by GC by following the >>>>>>>> bucket and old_buckets pointer of hmap >>>>>>>> >>>>>>>> ```go >>>>>>>> // A header for a Go map. >>>>>>>> type hmap struct { >>>>>>>> // Note: the format of the hmap is also encoded in >>>>>>>> cmd/compile/internal/reflectdata/reflect.go. >>>>>>>> // Make sure this stays in sync with the compiler's definition. >>>>>>>> count int // # live cells == size of map. Must be first (used by >>>>>>>> len() builtin) >>>>>>>> flags uint8 >>>>>>>> B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B >>>>>>>> items) >>>>>>>> noverflow uint16 // approximate number of overflow buckets; see >>>>>>>> incrnoverflow for details >>>>>>>> hash0 uint32 // hash seed >>>>>>>> >>>>>>>> buckets unsafe.Pointer // array of 2^B Buckets. may be nil if >>>>>>>> count==0. >>>>>>>> oldbuckets unsafe.Pointer // previous bucket array of half the >>>>>>>> size, non-nil only when growing >>>>>>>> nevacuate uintptr // progress counter for evacuation (buckets less >>>>>>>> than this have been evacuated) >>>>>>>> >>>>>>>> extra *mapextra // optional fields >>>>>>>> } >>>>>>>> ``` >>>>>>>> >>>>>>>> but according to this doc: >>>>>>>> https://go101.org/optimizations/6-map.html >>>>>>>> >>>>>>>> If the key type and element type of a map both don't contain >>>>>>>> pointers, then in the scan phase of a GC cycle, the garbage collector >>>>>>>> will >>>>>>>> not scan the entries of the map. This could save much time. >>>>>>>> >>>>>>>> My question is: >>>>>>>> maybe GC should not follow the bucket and old_buckets pointer of >>>>>>>> hmap? >>>>>>>> or the doc above just forbid GC from scan the entries not the >>>>>>>> bucket and old_buckets pointer of hmap, and larger map[int]int do >>>>>>>> increase >>>>>>>> GC scan overhead by having more buckets and old_buckets? >>>>>>>> >>>>>>>> Thanks >>>>>>>> >>>>>>>> -- >>>>>>>> You received this message because you are subscribed to the Google >>>>>>>> Groups "golang-nuts" group. >>>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>>> send an email to golang-nuts...@googlegroups.com. >>>>>>>> To view this discussion on the web visit >>>>>>>> https://groups.google.com/d/msgid/golang-nuts/b7049ce2-d5a8-49bc-ae44-7a73777494f4n%40googlegroups.com >>>>>>>> >>>>>>>> <https://groups.google.com/d/msgid/golang-nuts/b7049ce2-d5a8-49bc-ae44-7a73777494f4n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>>> . >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Kurtis Rader >>>>>>> Caretaker of the exceptional canines Junior and Hank >>>>>>> >>>>>> -- You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/b0d0feef-fbe2-49b5-94d6-1a2e3fc216f7n%40googlegroups.com.