Hi,

On Thu, Jul 02, 2020 at 11:52:56AM -0700, Cong Wang wrote:
> When we clone a socket in sk_clone_lock(), its sk_cgrp_data is
> copied, so the cgroup refcnt must be taken too. And, unlike the
> sk_alloc() path, sock_update_netprioidx() is not called here.
> Therefore, it is safe and necessary to grab the cgroup refcnt
> even when cgroup_sk_alloc is disabled.
> 
> sk_clone_lock() is in BH context anyway, the in_interrupt()
> would terminate this function if called there. And for sk_alloc()
> skcd->val is always zero. So it's safe to factor out the code
> to make it more readable.
> 
> The global variable 'cgroup_sk_alloc_disabled' is used to determine
> whether to take these reference counts. It is impossible to make
> the reference counting correct unless we save this bit of information
> in skcd->val. So, add a new bit there to record whether the socket
> has already taken the reference counts. This obviously relies on
> kmalloc() to align cgroup pointers to at least 4 bytes,
> ARCH_KMALLOC_MINALIGN is certainly larger than that.
> 
> This bug seems to be introduced since the beginning, commit
> d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
> tried to fix it but not completely. It seems not easy to trigger until
> the recent commit 090e28b229af
> ("netprio_cgroup: Fix unlimited memory leak of v2 cgroups") was merged.
> 

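The flag-in-low-bits scheme described in the quoted commit message can be
illustrated with a minimal userspace sketch. It assumes pointers are aligned
to at least 4 bytes, which leaves the two low bits free for flags. All names
here (tagged_ref, TAG_NO_REFCNT, TAG_MASK) are hypothetical and are not the
kernel's actual sock_cgroup_data layout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is NOT the kernel's sock_cgroup_data. */
#define TAG_NO_REFCNT ((uintptr_t)0x1) /* bit 0: no reference was taken */
#define TAG_MASK      ((uintptr_t)0x3) /* low two bits, free if pointers are 4-byte aligned */

struct tagged_ref {
	uintptr_t val; /* pointer and flag bits packed into one word */
};

/* Store a pointer together with a "no reference taken" flag. */
static void tagged_ref_set(struct tagged_ref *r, void *ptr, bool no_refcnt)
{
	/* The scheme only works if the low bits of the pointer are zero. */
	assert(((uintptr_t)ptr & TAG_MASK) == 0);
	r->val = (uintptr_t)ptr | (no_refcnt ? TAG_NO_REFCNT : 0);
}

/* Recover the original pointer by masking off the flag bits. */
static void *tagged_ref_ptr(const struct tagged_ref *r)
{
	return (void *)(r->val & ~TAG_MASK);
}

/* Report whether the "no reference taken" flag is set. */
static bool tagged_ref_no_refcnt(const struct tagged_ref *r)
{
	return (r->val & TAG_NO_REFCNT) != 0;
}
```

A release path built on this would check tagged_ref_no_refcnt() first and
drop a reference only when the flag is clear, and would always go through
tagged_ref_ptr() rather than dereferencing val directly.
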
This patch causes all my s390 boot tests to crash. Reverting it fixes
the problem. Please see the bisect results and crash log below.

Guenter

---
bisect results (from pending-fixes branch in -next repository):

# bad: [1432f824c2db44ef35b26caa9f81dd05211a75fc] Merge remote-tracking branch 'drm-misc-fixes/for-linux-next-fixes'
# good: [dcb7fd82c75ee2d6e6f9d8cc71c52519ed52e258] Linux 5.8-rc4
git bisect start 'HEAD' 'v5.8-rc4'
# bad: [fe12f8184e7265e2d24e5ed5b255275dfe4c1c04] Merge remote-tracking branch 'net/master'
git bisect bad fe12f8184e7265e2d24e5ed5b255275dfe4c1c04
# good: [474112d57c70520ebd81a5ca578fee1d93fafd07] Documentation: networking: ipvs-sysctl: drop doubled word
git bisect good 474112d57c70520ebd81a5ca578fee1d93fafd07
# good: [6d12075ddeedc38d25c5b74e929e686158da728c] Merge tag 'mtd/fixes-for-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
git bisect good 6d12075ddeedc38d25c5b74e929e686158da728c
# good: [74478ea4ded519db35cb1f059948b1e713bb4abf] net: ipa: fix QMI structure definition bugs
git bisect good 74478ea4ded519db35cb1f059948b1e713bb4abf
# bad: [9c29e36152748fd623fcff6cc8f538550f9eeafc] mptcp: fix DSS map generation on fin retransmission
git bisect bad 9c29e36152748fd623fcff6cc8f538550f9eeafc
# good: [aea23c323d89836bcdcee67e49def997ffca043b] ipv6: Fix use of anycast address with loopback
git bisect good aea23c323d89836bcdcee67e49def997ffca043b
# bad: [28b18e4eb515af7c6661c3995c6e3c34412c2874] net: sky2: initialize return of gm_phy_read
git bisect bad 28b18e4eb515af7c6661c3995c6e3c34412c2874
# bad: [ad0f75e5f57ccbceec13274e1e242f2b5a6397ed] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()
git bisect bad ad0f75e5f57ccbceec13274e1e242f2b5a6397ed
# first bad commit: [ad0f75e5f57ccbceec13274e1e242f2b5a6397ed] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

---
Crash log:

[   22.390674] Run /sbin/init as init process
[   22.497551] Unable to handle kernel pointer dereference in virtual kernel address space
[   22.497738] Failing address: 5010f0b45fa93000 TEID: 5010f0b45fa93803
[   22.497813] Fault in home space mode while using kernel ASCE.
[   22.497958] AS:0000000001774007 R3:0000000000000024
[   22.498300] Oops: 0038 ilc:3 [#1] SMP
[   22.498405] Modules linked in:
[   22.499027] CPU: 0 PID: 153 Comm: init Not tainted 5.8.0-rc4-00328-g1432f824c2db4 #1
[   22.499112] Hardware name: QEMU 2964 QEMU (KVM/Linux)
[   22.499261] Krnl PSW : 0704e00180000000 0000000000259be0 (cgroup_sk_free+0xa8/0x1e8)
[   22.499405]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[   22.499506] Krnl GPRS: 0000000048a38585 5010f0b45fa93094 0000000000000002 000000001c228bd8
[   22.499585]            000000001c228bb0 0000000000000000 0000000000000000 00000000011c2eda
[   22.499665]            fffffffffffff000 00000000011c1f72 00000000011deef0 0000000000014040
[   22.499744]            000000001c228100 0000000000e76bf0 0000000000259c82 000003e0002c3c00
[   22.500270] Krnl Code: 0000000000259bd2: a72a0001            ahi     %r2,1
[   22.500270]            0000000000259bd6: 502003a8            st      %r2,936
[   22.500270]           #0000000000259bda: e31003b80008        ag      %r1,952
[   22.500270]           >0000000000259be0: e32010000004        lg      %r2,0(%r1)
[   22.500270]            0000000000259be6: a7f40004            brc     15,0000000000259bee
[   22.500270]            0000000000259bea: b9040023            lgr     %r2,%r3
[   22.500270]            0000000000259bee: b9040032            lgr     %r3,%r2
[   22.500270]            0000000000259bf2: b9040042            lgr     %r4,%r2
[   22.500635] Call Trace:
[   22.500748]  [<0000000000259be0>] cgroup_sk_free+0xa8/0x1e8
[   22.500835] ([<0000000000259bb4>] cgroup_sk_free+0x7c/0x1e8)
[   22.500914]  [<0000000000b24e16>] __sk_destruct+0x196/0x260
[   22.500999]  [<0000000000cadc18>] unix_release_sock+0x358/0x460
[   22.501073]  [<0000000000cadd5a>] unix_release+0x3a/0x60
[   22.501149]  [<0000000000b1a63a>] __sock_release+0x62/0xf8
[   22.501223]  [<0000000000b1a6f8>] sock_close+0x28/0x38
[   22.501299]  [<000000000045101e>] __fput+0x126/0x2a8
[   22.501374]  [<000000000017e088>] task_work_run+0x78/0xc8
[   22.501449]  [<000000000010a596>] do_notify_resume+0x9e/0xa8
[   22.501526]  [<0000000000de555a>] system_call+0xe6/0x2d4
[   22.501657] INFO: lockdep is turned off.
[   22.501736] Last Breaking-Event-Address:
[   22.501814]  [<0000000000259c86>] cgroup_sk_free+0x14e/0x1e8
[   22.502169] Kernel panic - not syncing: Fatal exception: panic_on_oops
