Kleber,

I finally had some time today to narrow down which commit fixes this issue for us. The commit is below:

commit 7514c0362ffdd9af953ae94334018e7356b31313
Merge: 9322c47b21b9 428fc0aff4e5
Author: Linus Torvalds <torva...@linux-foundation.org>
Date:   Sat Sep 5 13:28:40 2020 -0700

    Merge branch 'akpm' (patches from Andrew)

    Merge misc fixes from Andrew Morton:
     "19 patches.

      Subsystems affected by this patch series: MAINTAINERS, ipc, fork,
      checkpatch, lib, and mm (memcg, slub, pagemap, madvise, migration,
      hugetlb)"

    * emailed patches from Andrew Morton <a...@linux-foundation.org>:
      include/linux/log2.h: add missing () around n in roundup_pow_of_two()
      mm/khugepaged.c: fix khugepaged's request size in collapse_file
      mm/hugetlb: fix a race between hugetlb sysctl handlers
      mm/hugetlb: try preferred node first when alloc gigantic page from cma
      mm/migrate: preserve soft dirty in remove_migration_pte()
      mm/migrate: remove unnecessary is_zone_device_page() check
      mm/rmap: fixup copying of soft dirty and uffd ptes
      mm/migrate: fixup setting UFFD_WP flag
      mm: madvise: fix vma user-after-free
      checkpatch: fix the usage of capture group ( ... )
      fork: adjust sysctl_max_threads definition to match prototype
      ipc: adjust proc_ipc_sem_dointvec definition to match prototype
      mm: track page table modifications in __apply_to_page_range()
      MAINTAINERS: IA64: mark Status as Odd Fixes only
      MAINTAINERS: add LLVM maintainers
      MAINTAINERS: update Cavium/Marvell entries
      mm: slub: fix conversion of freelist_corrupted()
      mm: memcg: fix memcg reclaim soft lockup
      memcg: fix use-after-free in uncharge_batch

I also verified that the latest available Ubuntu 18.04 kernel as of today (5.4.0-1054.57~18.04.1) still hits this memory leak issue for us.

Please let me know if you need any further information from us to hopefully get this fix pulled into the supported Ubuntu 18.04 AWS kernel.

Thanks,
Paul

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1925261

Title:
  memory leak on AWS kernels when using docker

Status in linux-aws package in Ubuntu:
  New
Status in linux-aws source package in Focal:
  New

Bug description:
  Ever since the "ubuntu-bionic-18.04-amd64-server-20200729" EC2 Ubuntu AMI was released which has the "5.3.0-1032-aws" kernel we have been hitting a 100% repro memory leak that causes our app that is running under docker to be OOM killed.

  The scenario is that we have an app running in a docker container and it occasionally catches a crash happening within itself and when that happens it creates another process which triggers a gdb dump of that parent app. Normally this works fine but under these specific kernels it causes the memory usage to grow and grow until it hits the maximum allowed memory for the container at which point the container is killed.

  I have tested using several of the latest available Ubuntu AMIs including the latest "ubuntu-bionic-18.04-amd64-server-20210415" which has the "5.4.0-1045-aws" kernel and the bug still exists.

  I also tested a bunch of the mainline kernels and found the fix was introduced for this memory leak in the v5.9-rc4 kernel (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.9-rc4/CHANGES).

  Do you all have any idea if or when that set of changes will be backported into a supported kernel for Ubuntu 18.04 or 20.04?

  Release we are running:
  root@<redacted>:~# lsb_release -rd
  Description:    Ubuntu 18.04.5 LTS
  Release:        18.04

  Docker / containerd.io versions:
  - containerd.io: 1.4.4-1
  - docker-ce: 5:20.10.5~3-0~ubuntu-bionic

  Latest supported kernel I tried which still sees the memory leak:
  root@hostname:~# apt-cache policy linux-aws
  linux-aws:
    Installed: 5.4.0.1045.27
    Candidate: 5.4.0.1045.27
    Version table:
   *** 5.4.0.1045.27 500
          500 http://us-east-1.ec2.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
          500 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages
          100 /var/lib/dpkg/status
       4.15.0.1007.7 500
          500 http://us-east-1.ec2.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

  Thanks,
  Paul

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1925261/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp