Hey, we finally figured it all out. The cause of the leak is a bug in a finalizer that is unrelated to sqlite3. That finalizer blocks the runtime's finalizer goroutine, so objects with finalizers pile up and leak. We find it very surprising that the runtime does nothing to warn about this, especially since Go's finalizer goroutine (runfinq, in runtime/mfinal.go) is not visible in the goroutine profile of a normal build. We intend to bring this up for discussion with the community. IMO, the runtime should expose an API that reports how much time has passed since the current finalizer function started. In addition, the documentation should mention this edge case explicitly: it only takes one rogue package that blocks, and every package that uses finalizers is exposed to a leak.
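For anyone who hits this later, here is a minimal standalone sketch of the failure mode (the names, sizes, and timings are made up for illustration, not our real code): one finalizer that never returns stalls the runtime's single finalizer goroutine, and every object whose finalizer is queued behind it, plus everything it references, is retained indefinitely.

package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

type tracked struct{ buf []byte }

func main() {
	// The "rogue" finalizer: it never returns, so the runtime's single
	// finalizer goroutine is stuck from here on.
	blocker := new(int)
	runtime.SetFinalizer(blocker, func(*int) { select {} })
	blocker = nil
	runtime.GC()
	time.Sleep(100 * time.Millisecond) // give the blocking finalizer a chance to start

	// Well-behaved finalizers registered by any other package now never run,
	// so their objects (and whatever they reference) are never freed.
	var ran atomic.Int64
	for i := 0; i < 1000; i++ {
		obj := &tracked{buf: make([]byte, 1<<20)}
		runtime.SetFinalizer(obj, func(*tracked) { ran.Add(1) })
	}

	runtime.GC()
	time.Sleep(time.Second)
	fmt.Println("finalizers run:", ran.Load()) // should print 0; ~1 GiB stays retained
}

The retained allocations show up in the heap profile at the allocation site inside the finalized package, with nothing in the application holding them, which matches the pattern we originally saw around SQLiteRows.Columns.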
Please let me know your thoughts. Thanks.

On Monday, 10 March 2025 at 09:41:37 UTC+2 Robert Engels wrote:

> Hi. You may have to provide a small reproducible test case…
>
> The graph you provided seems to show a memory leak. I would use goref on
> the rows instances to find the roots. If the memory is off heap, that means
> either the driver has a bug or the driver is being used incorrectly.
>
> cloudwego/goref: Go heap object reference analysis tool
> <https://github.com/cloudwego/goref>
>
> On Mar 10, 2025, at 1:07 AM, Elad Gavra <gav...@gmail.com> wrote:
>
> Thank you for the replies.
> Robert, I forgot to mention we verified there are no locks or any sqlite
> routine that is stuck (using the goroutine profile).
> Michael, thanks, I'll have a look. We shall see how the debug deployment
> goes. I'll also check out your suggestion (the problem is that this
> reproduces only in the customer's environment). By the way, the author of
> the sqlite3 package removed the finalizer for rows, but the stated reason
> was "redundant call".
> Thanks!
>
> On Mon, Mar 10, 2025, 06:14 Robert Engels <ren...@ix.netcom.com> wrote:
>
>> I suspect that they added the SetFinalizer calls to help those using that
>> driver who weren't properly managing the driver resources (connections,
>> queries, etc.).
>>
>> On Mar 9, 2025, at 10:52 PM, 'Michael Knyszek' via golang-nuts
>> <golan...@googlegroups.com> wrote:
>>
>> I suspect this fact is going to be the most relevant thing to your
>> investigation:
>>
>> > Upgrading the sqlite3 driver which had one non-negligible change:
>> > adding SetFinalizer to all these objects.
>>
>> See https://go.dev/doc/gc-guide#Common_finalizer_issues for a variety of
>> ways finalizers can cause leaks (on both the Go and C side). Go 1.24's
>> cleanups (runtime.AddCleanup) might work better, provided a deterministic
>> execution order isn't required. (This point might be moot since it's in
>> go-sqlite3, which sounds like it's something you don't have control over.)
>>
>> I have a patch that provides a finalizer/cleanup leak detector by setting
>> GODEBUG=detectcleanupleaks=1, if you want to try it. It's
>> https://go.dev/cl/634599. Happy to explain how to patch and build the Go
>> toolchain if you're up for it.
>>
>> On Sunday, March 9, 2025 at 5:49:03 PM UTC-4 robert engels wrote:
>>
>>> Looks to me like you are reading a lot of rows under a lock and never
>>> releasing the lock, so the rows remain in memory.
>>>
>>> I don't know the internals of the SQLite driver very well, but my
>>> understanding is that it is not really a "driver" in the traditional
>>> sense that communicates with a db, but rather is the implementation as
>>> well. Since it is the implementation, it seems reasonable that holding
>>> the lock also holds the rows.
>>>
>>> On Mar 9, 2025, at 2:59 PM, Gavra <gav...@gmail.com> wrote:
>>>
>>> Hi,
>>> So we have been trying really hard to understand a major leak in our
>>> product.
>>> We are using this sqlite3 driver: https://github.com/mattn/go-sqlite3
>>> The pprof heap profile indicates the source is the call to the
>>> SQLiteRows.Columns func and two additional calls, one on SQLiteStmt and
>>> the other on SQLiteConn.
>>> According to the code:
>>> 1. SQLiteRows references SQLiteStmt, which in turn references SQLiteConn.
>>> 2. The SQLiteRows instance is the sole object holding a ref to the
>>> allocation made by Columns().
>>> This is a strong indication that refs to SQLiteRows are leaked.
>>> We can confirm the leak is increasing over time and does not appear to
>>> reflect a burst or a large data set.
>>> We thought we forgot to close rows, or somehow appended refs to rows,
>>> etc., but we have since ruled that out completely. Note the heap profile
>>> only shows SQLite allocations, no app objects allocated. We verified that
>>> forgetting to call SQLiteRows.Close leaks only memory in CGO, which means
>>> it is not visible in the heap profile.
>>> So on one hand we are convinced someone is holding a reference to
>>> SQLiteRows, but on the other hand it is not our application and not the
>>> SQLite driver.
>>>
>>> We tracked back our source code changes and noticed two that correlate
>>> with the appearance of this issue:
>>> 1. Upgrading Go from 1.22.6 to 1.22.9
>>> 2. Upgrading the sqlite3 driver, which had one non-negligible change:
>>> adding SetFinalizer to all these objects.
>>>
>>> This is a very weird thing to suggest, but we think this could somehow be
>>> caused by the Go runtime.
>>> I attached a screenshot of the heap profile, focused on the major leak
>>> around the sqlite3 driver.
>>> (Our current plan is to use goref or dlv on a core dump to understand who
>>> is holding these references, but that could take a few more days.)
>>>
>>> Thank you.
>>>
>>> <Screenshot 2025-03-09 at 21.40.22.png>

--
You received this message because you are subscribed to the Google Groups
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion visit
https://groups.google.com/d/msgid/golang-nuts/7529397c-3744-40a1-ad1e-765b3262664an%40googlegroups.com.