apport information

** Attachment added: "UdevDb.txt"
   https://bugs.launchpad.net/bugs/2023143/+attachment/5678294/+files/UdevDb.txt

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2023143

Title: Memory leak on large server

Status in linux package in Ubuntu: New

Bug description:

Hi,

We are trying to diagnose a kernel memory leak on a production Ubuntu 22.04.2 LTS server. We have tried several official Ubuntu kernels, 5.15aws, 5.19aws and now even 6.2.0-1004-aws (all Ubuntu signed):

```
# cat /proc/version_signature
Ubuntu 6.2.0-1004.4-aws 6.2.6
```

This is a production server, so we'll appreciate any and all help diagnosing and solving this issue!

The server is a u-12tb1.112xlarge instance with 12TB of RAM, and it is losing 1TB+ of memory a day to a kernel leak.

For example, currently, with an uptime of 3.5 days, we have only 1.8Ti available, yet RSS plus slabs adds up to only about 4.1TB. All active processes together take about 4TB of RAM (`ps -eo rss | awk 'BEGIN {x=0} {x = x + $1} END {print x}'` gives 4088636708, in kB).

From slabtop we see that about 100GB is consumed by slabs (`slabtop -o -s t | head`):

```
 Active / Total Objects (% used)    : 303580174 / 332642344 (91.3%)
 Active / Total Slabs (% used)      : 6697552 / 6697552 (100.0%)
 Active / Total Caches (% used)     : 158 / 215 (73.5%)
 Active / Total Size (% used)       : 112801663.93K / 121442845.45K (92.9%)
 Minimum / Average / Maximum Object : 0.01K / 0.36K / 16.00K

    OBJS   ACTIVE  USE OBJ SIZE   SLABS OBJ/SLAB CACHE SIZE NAME
67537280 59696907  88%    0.03K  527635      128   2110540K kmalloc-32
65247564 65241398  99%    0.31K 1279364       51  20469824K arc_buf_hdr_t_full
58270446 58040685  99%    0.10K  747057       78   5976456K abd_t
16697268 13731405  82%    0.38K  397554       42   6360864K dmu_buf_impl_t
15982912 10366686  64%    0.50K  249733       64   7991456K kmalloc-512
14975616 11605380  77%    0.06K  233994       64    935976K kmalloc-64
```

In /proc/meminfo:

```
MemTotal: 12656421408 kB
MemFree: 1975976204 kB
MemAvailable: 1968415088 kB
Buffers: 1087956 kB
Cached: 101168004 kB
SwapCached: 17912340 kB
Active: 101022084 kB
Inactive: 4129984264 kB
Active(anon): 94623216 kB
Inactive(anon): 4104673512 kB
Active(file): 6398868 kB
Inactive(file): 25310752 kB
Unevictable: 338908 kB
Mlocked: 332132 kB
SwapTotal: 4294967292 kB
SwapFree: 3500705532 kB
Zswap: 0 kB
Zswapped: 0 kB
Dirty: 2908 kB
Writeback: 0 kB
AnonPages: 4123489132 kB
Mapped: 3761620 kB
Shmem: 70756156 kB
KReclaimable: 10319220 kB
Slab: 122355620 kB
SReclaimable: 10319220 kB
SUnreclaim: 112036400 kB
KernelStack: 1793296 kB
PageTables: 21748556 kB
SecPageTables: 0 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 10623177996 kB
Committed_AS: 6775476544 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 296984480 kB
VmallocChunk: 0 kB
Percpu: 1326080 kB
HardwareCorrupted: 0 kB
AnonHugePages: 1630980096 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 2056036 kB
DirectMap2M: 40935424 kB
DirectMap1G: 12814647296 kB
```

It's not a tmpfs/shm filesystem issue either:

```
df -h | grep -E 'tmpfs|shm'
tmpfs  256G   70G  187G  27% /dev/shm
tmpfs  256G  3.4M  256G   1% /run
tmpfs  5.0M     0  5.0M   0% /run/lock
tmpfs  8.0G   24K  8.0G   1% /run/user/10102
tmpfs  8.0G   24K  8.0G   1% /run/user/1002
tmpfs  8.0G   24K  8.0G   1% /run/user/10030
tmpfs  8.0G   24K  8.0G   1% /run/user/10194
tmpfs  8.0G   24K  8.0G   1% /run/user/10200
tmpfs  8.0G   24K  8.0G   1% /run/user/10136
tmpfs  8.0G   24K  8.0G   1% /run/user/10198
tmpfs  8.0G   24K  8.0G   1% /run/user/10143
tmpfs  8.0G   24K  8.0G   1% /run/user/10188
tmpfs  8.0G   24K  8.0G   1% /run/user/10124
tmpfs  8.0G   24K  8.0G   1% /run/user/10174
tmpfs  8.0G   24K  8.0G   1% /run/user/10165
tmpfs  8.0G   24K  8.0G   1% /run/user/10197
tmpfs  8.0G   24K  8.0G   1% /run/user/10183
tmpfs  8.0G   24K  8.0G   1% /run/user/10033
tmpfs  8.0G   24K  8.0G   1% /run/user/10023
tmpfs  8.0G   24K  8.0G   1% /run/user/10133
tmpfs  8.0G   24K  8.0G   1% /run/user/10185
tmpfs  8.0G   24K  8.0G   1% /run/user/10201
tmpfs  8.0G   24K  8.0G   1% /run/user/1004
tmpfs  8.0G   24K  8.0G   1% /run/user/10014
```

---
ProblemType: Bug
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
CRDA: N/A
CasperMD5CheckResult: unknown
DistroRelease: Ubuntu 22.04
Ec2AMI: ami-08c40ec9ead489470
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1d
Ec2InstanceType: u-12tb1.112xlarge
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lspci: Error: [Errno 2] No such file or directory: 'lspci'
Lspci-vt: Error: [Errno 2] No such file or directory: 'lspci'
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Lsusb-t:
Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
MachineType: Amazon EC2 u-12tb1.112xlarge
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 LC_CTYPE=C.UTF-8
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.2.0-1004-aws root=PARTUUID=cbb5015f-ca94-467b-91ae-cce97828a042 ro quiet mitigations=off console=tty1 console=ttyS0 nvme_core.io_timeout=4294967295 panic=-1
ProcVersionSignature: Ubuntu 6.2.0-1004.4-aws 6.2.6
RelatedPackageVersions:
 linux-restricted-modules-6.2.0-1004-aws N/A
 linux-backports-modules-6.2.0-1004-aws N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
Tags: jammy ec2-images
Uname: Linux 6.2.0-1004-aws x86_64
UnreportableReason: This report is about a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: False
dmi.bios.date: 10/16/2017
dmi.bios.release: 1.0
dmi.bios.vendor: Amazon EC2
dmi.bios.version: 1.0
dmi.board.asset.tag: i-0b8914fe51e3d7555
dmi.board.vendor: Amazon EC2
dmi.chassis.asset.tag: Amazon EC2
dmi.chassis.type: 1
dmi.chassis.vendor: Amazon EC2
dmi.modalias: dmi:bvnAmazonEC2:bvr1.0:bd10/16/2017:br1.0:svnAmazonEC2:pnu-12tb1.112xlarge:pvr:rvnAmazonEC2:rn:rvr:cvnAmazonEC2:ct1:cvr:sku:
dmi.product.name: u-12tb1.112xlarge
dmi.sys.vendor: Amazon EC2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2023143/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
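
As a rough cross-check of the memory accounting in the bug description above, the sketch below subtracts the /proc/meminfo fields that can be attributed to a known consumer from MemTotal. The helper names (`parse_meminfo`, `unaccounted_kb`) and the choice of "accounted" fields are assumptions made for illustration, not part of the original report; some fields may overlap slightly, and kernel allocations such as vmalloc are deliberately left out, so treat the result as an estimate only.

```python
# Sketch: estimate how much RAM /proc/meminfo leaves unexplained.
# ACCOUNTED_FIELDS is an assumption chosen for illustration; adjust as needed.

ACCOUNTED_FIELDS = (
    "MemFree", "Buffers", "Cached", "SwapCached", "AnonPages",
    "Slab", "KernelStack", "PageTables", "Percpu",
)


def parse_meminfo(text):
    """Return {field: value} from /proc/meminfo-style text (values in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.split():
            continue  # skip lines without "name: value"
        fields[name.strip()] = int(rest.split()[0])
    return fields


def unaccounted_kb(fields):
    """MemTotal minus the fields attributed to a known consumer."""
    return fields["MemTotal"] - sum(fields.get(k, 0) for k in ACCOUNTED_FIELDS)


# Values copied from the /proc/meminfo output in the bug description.
REPORTED = """\
MemTotal: 12656421408 kB
MemFree: 1975976204 kB
Buffers: 1087956 kB
Cached: 101168004 kB
SwapCached: 17912340 kB
AnonPages: 4123489132 kB
Slab: 122355620 kB
KernelStack: 1793296 kB
PageTables: 21748556 kB
Percpu: 1326080 kB
"""

if __name__ == "__main__":
    gap = unaccounted_kb(parse_meminfo(REPORTED))
    print(f"unaccounted: {gap} kB (~{gap / 2**30:.2f} TiB)")
```

With the reported values this leaves 6289564220 kB (about 5.86 TiB) unattributed, which over 3.5 days of uptime is in line with the "1TB+ a day" figure. On a live system the same functions could be fed the contents of /proc/meminfo at intervals to track how the gap grows.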