[PATCH 5.10 45/54] bpf: Fix 32 bit src register truncation on div/mod
From: Daniel Borkmann

commit e88b2c6e5a4d9ce30d75391e4d950da74bb2bd90 upstream.

While reviewing a different fix, John and I noticed an oddity in one of the
BPF program dumps that stood out, for example:

  # bpftool p d x i 13
   0: (b7) r0 = 808464450
   1: (b4) w4 = 808464432
   2: (bc) w0 = w0
   3: (15) if r0 == 0x0 goto pc+1
   4: (9c) w4 %= w0
  [...]

In line 2 we noticed that the mov32 would 32 bit truncate the original src
register for the div/mod operation. While for the two operations the dst
register is typically marked unknown e.g. from adjust_scalar_min_max_vals()
the src register is not, and thus verifier keeps tracking original bounds,
simplified:

  0: R1=ctx(id=0,off=0,imm=0) R10=fp0
  0: (b7) r0 = -1
  1: R0_w=invP-1 R1=ctx(id=0,off=0,imm=0) R10=fp0
  1: (b7) r1 = -1
  2: R0_w=invP-1 R1_w=invP-1 R10=fp0
  2: (3c) w0 /= w1
  3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0x)) R1_w=invP-1 R10=fp0
  3: (77) r1 >>= 32
  4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0x)) R1_w=invP4294967295 R10=fp0
  4: (bf) r0 = r1
  5: R0_w=invP4294967295 R1_w=invP4294967295 R10=fp0
  5: (95) exit
  processed 6 insns (limit 100) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

Runtime result of r0 at exit is 0 instead of expected -1. Remove the
verifier mov32 src rewrite in div/mod and replace it with a jmp32 test
instead. After the fix, we result in the following code generation when
having dividend r1 and divisor r6:

  div, 64 bit:                             div, 32 bit:

   0: (b7) r6 = 8                           0: (b7) r6 = 8
   1: (b7) r1 = 8                           1: (b7) r1 = 8
   2: (55) if r6 != 0x0 goto pc+2           2: (56) if w6 != 0x0 goto pc+2
   3: (ac) w1 ^= w1                         3: (ac) w1 ^= w1
   4: (05) goto pc+1                        4: (05) goto pc+1
   5: (3f) r1 /= r6                         5: (3c) w1 /= w6
   6: (b7) r0 = 0                           6: (b7) r0 = 0
   7: (95) exit                             7: (95) exit

  mod, 64 bit:                             mod, 32 bit:

   0: (b7) r6 = 8                           0: (b7) r6 = 8
   1: (b7) r1 = 8                           1: (b7) r1 = 8
   2: (15) if r6 == 0x0 goto pc+1           2: (16) if w6 == 0x0 goto pc+1
   3: (9f) r1 %= r6                         3: (9c) w1 %= w6
   4: (b7) r0 = 0                           4: (b7) r0 = 0
   5: (95) exit                             5: (95) exit

x86 in particular can throw a 'divide error' exception for div
instruction not only for divisor being zero, but also for the case
when the quotient is too large for the designated register. For the
edx:eax and rdx:rax dividend pair it is not an issue in x86 BPF JIT
since we always zero edx (rdx). Hence really the only protection
needed is against divisor being zero.

Fixes: 68fda450a7df ("bpf: fix 32-bit divide by zero")
Co-developed-by: John Fastabend
Signed-off-by: John Fastabend
Signed-off-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
Signed-off-by: Greg Kroah-Hartman
---
 kernel/bpf/verifier.c | 28 +---
 1 file changed, 13 insertions(+), 15 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10866,30 +10866,28 @@ static int fixup_bpf_calls(struct bpf_ve
 		    insn->code == (BPF_ALU | BPF_MOD | BPF_X) ||
 		    insn->code == (BPF_ALU | BPF_DIV | BPF_X)) {
 			bool is64 = BPF_CLASS(insn->code) == BPF_ALU64;
-			struct bpf_insn mask_and_div[] = {
-				BPF_MOV32_REG(insn->src_reg, insn->src_reg),
+			bool isdiv = BPF_OP(insn->code) == BPF_DIV;
+			struct bpf_insn *patchlet;
+			struct bpf_insn chk_and_div[] = {
 				/* Rx div 0 -> 0 */
-				BPF_JMP_IMM(BPF_JNE, insn->src_reg, 0, 2),
+				BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
+					     BPF_JNE | BPF_K, insn->src_reg,
+					     0, 2, 0),
 				BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg),
 				BPF_JMP_IMM(BPF_JA, 0, 0, 1),
 				*insn,
 			};
-			struct bpf_insn mask_and_mod[] = {
-				BPF_MOV32_REG(insn->src_reg, insn->src_reg),
+			struct bpf_insn chk_and_mod[] = {
 				/* Rx mod 0 -> Rx */
-				BPF_JMP_IMM(BPF_JEQ, insn->src_reg, 0, 1),
+				BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
+					     BPF_JEQ | BPF_K, insn->src_reg,
+					     0, 1, 0),
 				*insn,
 			};
-			struct bpf_i
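Reviewer's note: the effect of the old mov32 rewrite versus the new jmp32 test can be modelled in plain userspace C. The sketch below is only an illustration of the semantics described in the commit message (div by zero yields 0, mod by zero leaves dst alone, and the divisor register must not be modified). The struct and function names are hypothetical and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of two 64-bit BPF registers; not a kernel type. */
struct regs {
	uint64_t dst;
	uint64_t src;
};

/*
 * Old sequence for a 32-bit div: "w_src = w_src" first, then a 64-bit
 * zero test. The mov32 truncates src as a side effect that the verifier
 * did not account for.
 */
static void div32_old(struct regs *r)
{
	r->src = (uint32_t)r->src;		/* mov32 clobbers upper 32 bits */
	if (r->src != 0)
		r->dst = (uint32_t)r->dst / (uint32_t)r->src;
	else
		r->dst = 0;			/* Rx div 0 -> 0 */
}

/* New sequence: jmp32 test on the low half of src, src left untouched. */
static void div32_fixed(struct regs *r)
{
	uint32_t divisor = (uint32_t)r->src;

	if (divisor != 0)
		r->dst = (uint32_t)r->dst / divisor;
	else
		r->dst = 0;			/* Rx div 0 -> 0 */
}

/* New sequence for mod: on a zero divisor the insn is skipped entirely. */
static void mod32_fixed(struct regs *r)
{
	uint32_t divisor = (uint32_t)r->src;

	if (divisor != 0)
		r->dst = (uint32_t)r->dst % divisor;
	/* Rx mod 0 -> Rx: dst is left as it was */
}
```

With both registers set to -1 (all ones), div32_old() leaves the upper half of src zeroed, so a following "r1 >>= 32" yields 0 at runtime, while div32_fixed() preserves src and the shift yields 0xffffffff, matching what the verifier tracked in the log above.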
[PATCH 5.10 44/54] bpf: Fix verifier jmp32 pruning decision logic
From: Daniel Borkmann

commit fd675184fc7abfd1e1c52d23e8e900676b5a1c1a upstream.

Anatoly has been fuzzing with kBdysch harness and reported a hang in one
of the outcomes:

  func#0 @0
  0: R1=ctx(id=0,off=0,imm=0) R10=fp0
  0: (b7) r0 = 808464450
  1: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R10=fp0
  1: (b4) w4 = 808464432
  2: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP808464432 R10=fp0
  2: (9c) w4 %= w0
  3: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0x)) R10=fp0
  3: (66) if w4 s> 0x30303030 goto pc+0
   R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0x),s32_max_value=808464432) R10=fp0
  4: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0x),s32_max_value=808464432) R10=fp0
  4: (7f) r0 >>= r0
  5: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0x),s32_max_value=808464432) R10=fp0
  5: (9c) w4 %= w0
  6: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
  6: (66) if w0 s> 0x3030 goto pc+0
   R0_w=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
  7: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0
  7: (d6) if w0 s<= 0x303030 goto pc+1
  9: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0
  9: (95) exit
  propagating r0

  from 6 to 7: safe
  4: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umin_value=808464433,umax_value=2147483647,var_off=(0x0; 0x7fff)) R10=fp0
  4: (7f) r0 >>= r0
  5: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umin_value=808464433,umax_value=2147483647,var_off=(0x0; 0x7fff)) R10=fp0
  5: (9c) w4 %= w0
  6: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
  6: (66) if w0 s> 0x3030 goto pc+0
   R0_w=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
  propagating r0
  7: safe
  propagating r0

  from 6 to 7: safe
  processed 15 insns (limit 100) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1

The underlying program was xlated as follows:

  # bpftool p d x i 10
   0: (b7) r0 = 808464450
   1: (b4) w4 = 808464432
   2: (bc) w0 = w0
   3: (15) if r0 == 0x0 goto pc+1
   4: (9c) w4 %= w0
   5: (66) if w4 s> 0x30303030 goto pc+0
   6: (7f) r0 >>= r0
   7: (bc) w0 = w0
   8: (15) if r0 == 0x0 goto pc+1
   9: (9c) w4 %= w0
  10: (66) if w0 s> 0x3030 goto pc+0
  11: (d6) if w0 s<= 0x303030 goto pc+1
  12: (05) goto pc-1
  13: (95) exit

The verifier rewrote original instructions it recognized as dead code with
'goto pc-1', but reality differs from verifier simulation in that we are
actually able to trigger a hang due to hitting the 'goto pc-1' instructions.

Taking a closer look at the verifier analysis, the reason is that it misjudges
its pruning decision at the first 'from 6 to 7: safe' occasion. What happens
is that while both old/cur registers are marked as precise, they get misjudged
for the jmp32 case as range_within() yields true, meaning that the prior
verification path with a wider register bound could be verified successfully
and therefore the current path with a narrower register bound is deemed safe
as well whereas in reality it's not. R0 old/cur path's bounds compare as
follows:

  old: smin_value=0x8000,smax_value=0x7fff,umin_value=0x0,umax_value=0x,var_off=(0x0; 0x)
  cur: smin_value=0x8000,smax_value=0x7fff7fff,umin_value=0x0,umax_value=0x7fff,var_off=(0x0; 0x7fff)

  old: s32_min_value=0x8000,s32_max_value=0x3030,u32_min_value=0x,u32_max_value=0x
  cur: s32_min_value=0x3031,s32_max_value=0x7fff,u32_min_value=0x3031,u32_max_value=0x7fff

The 64 bit bounds generally look okay and while the information that got
propagated from 32 to 64 bit looks correct as well, it's not precise enough
for judging a conditional jmp32. Given the latter only operates on subregisters
we also need to take these into account as well for a range_within() probe
in order to be able to prune paths. Extending the range_within() constraint
to both bounds will be able to tell us that the old signed 32 bit bounds are
not wider than the cur signed 32 bit bounds.

With the fix in place, the program will now verify the 'goto' branch case as
it should have been:

  [...]
  6: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
  6: (66) if w0 s> 0x3030 goto pc+0
   R0_w=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
  7: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0
  7: (d6) if w0 s<= 0x303030 goto pc+1
  9: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0
  9: (95) exit

  7: R0_w=invP(id=0,smax
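Reviewer's note: the pruning argument above can be reproduced with a small containment check. The following is a sketch under stated assumptions, not the patch itself (the diff is cut off in this archive): struct bounds and both helpers are hypothetical names, and the old/cur values are illustrative numbers loosely modelled on the R0 listing, since several hex constants there did not survive intact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the verifier's scalar bounds. */
struct bounds {
	int64_t  smin, smax;
	uint64_t umin, umax;
	int32_t  s32_min, s32_max;
	uint32_t u32_min, u32_max;
};

/* 64-bit-only containment: does old's range fully cover cur's range? */
static bool within64(const struct bounds *old, const struct bounds *cur)
{
	return old->umin <= cur->umin && old->umax >= cur->umax &&
	       old->smin <= cur->smin && old->smax >= cur->smax;
}

/* Containment that additionally requires the 32-bit subregister bounds to nest. */
static bool within64_and_32(const struct bounds *old, const struct bounds *cur)
{
	return within64(old, cur) &&
	       old->u32_min <= cur->u32_min && old->u32_max >= cur->u32_max &&
	       old->s32_min <= cur->s32_min && old->s32_max >= cur->s32_max;
}

/* Illustrative values only: old took the fall-through of "w0 s> 0x3030". */
static const struct bounds old_r0 = {
	.smin = INT64_MIN, .smax = INT64_MAX,
	.umin = 0, .umax = UINT64_MAX,
	.s32_min = INT32_MIN, .s32_max = 0x3030,
	.u32_min = 0, .u32_max = UINT32_MAX,
};

/* Illustrative values only: cur took the branch, so w0 is above 0x3030. */
static const struct bounds cur_r0 = {
	.smin = 0, .smax = INT32_MAX,
	.umin = 0, .umax = INT32_MAX,
	.s32_min = 0x3031, .s32_max = INT32_MAX,
	.u32_min = 0x3031, .u32_max = INT32_MAX,
};
```

For these values within64() reports that old covers cur, which is the misjudgement that allowed pruning, while within64_and_32() rejects it because old's s32_max of 0x3030 is below cur's s32_max.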
[PATCH 5.10 46/54] bpf: Fix verifier jsgt branch analysis on max bound
From: Daniel Borkmann

commit ee114dd64c0071500345439fc79dd5e0f9d106ed upstream.

Fix incorrect is_branch{32,64}_taken() analysis for the jsgt case. The return
code for both will tell the caller whether a given conditional jump is taken
or not, e.g. 1 means branch will be taken [for the involved registers] and the
goto target will be executed, 0 means branch will not be taken and instead we
fall-through to the next insn, and last but not least a -1 denotes that it is
not known at verification time whether a branch will be taken or not. Now while
the jsgt has the branch-taken case correct with reg->s32_min_value > sval, the
branch-not-taken case is off-by-one when testing for reg->s32_max_value < sval
since the branch will also not be taken for reg->s32_max_value == sval. The jgt
branch analysis, for example, gets this right.

Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Fixes: 4f7b3e82589e ("bpf: improve verifier branch analysis")
Signed-off-by: Daniel Borkmann
Reviewed-by: John Fastabend
Acked-by: Alexei Starovoitov
Signed-off-by: Greg Kroah-Hartman
---
 kernel/bpf/verifier.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6822,7 +6822,7 @@ static int is_branch32_taken(struct bpf_
 	case BPF_JSGT:
 		if (reg->s32_min_value > sval)
 			return 1;
-		else if (reg->s32_max_value < sval)
+		else if (reg->s32_max_value <= sval)
 			return 0;
 		break;
 	case BPF_JLT:
@@ -6895,7 +6895,7 @@ static int is_branch64_taken(struct bpf_
 	case BPF_JSGT:
 		if (reg->smin_value > sval)
 			return 1;
-		else if (reg->smax_value < sval)
+		else if (reg->smax_value <= sval)
 			return 0;
 		break;
 	case BPF_JLT:
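Reviewer's note: the 1 / 0 / -1 return convention and the off-by-one are easy to check in isolation. This is a minimal userspace sketch that mirrors only the BPF_JSGT case from the hunks above; the function names and the flat (min, max, sval) parameters are assumptions made for illustration, not the kernel's signatures.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Tri-state result for "if reg s> sval goto ...":
 *   1  branch is always taken
 *   0  branch is never taken
 *  -1  not known at verification time
 */
static int jsgt_taken_old(int32_t smin, int32_t smax, int32_t sval)
{
	if (smin > sval)
		return 1;
	else if (smax < sval)	/* misses the smax == sval case */
		return 0;
	return -1;
}

static int jsgt_taken_fixed(int32_t smin, int32_t smax, int32_t sval)
{
	if (smin > sval)
		return 1;
	else if (smax <= sval)	/* reg can never exceed sval */
		return 0;
	return -1;
}
```

For a register known to lie in [0, 5] compared against 5, the old test answers "unknown" although no value in the range is greater than 5; the fixed test answers "never taken".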
[PATCH 5.10 50/54] Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"
From: Johannes Weiner

commit e82553c10b084153f9bf0af333c0a1550fd7 upstream.

This reverts commit 536d3bf261a2fc3b05b3e91e7eef7383443015cf, as it can
cause writers to memory.high to get stuck in the kernel forever,
performing page reclaim and consuming excessive amounts of CPU cycles.

Before the patch, a write to memory.high would first put the new limit
in place for the workload, and then reclaim the requested delta. After
the patch, the kernel tries to reclaim the delta before putting the new
limit into place, in order to not overwhelm the workload with a sudden,
large excess over the limit. However, if reclaim is actively racing
with new allocations from the uncurbed workload, it can keep the write()
working inside the kernel indefinitely.

This is causing problems in Facebook production. A privileged
system-level daemon that adjusts memory.high for various workloads
running on a host can get unexpectedly stuck in the kernel and
essentially turn into a sort of involuntary kswapd for one of the
workloads. We've observed that daemon busy-spin in a write() for
minutes at a time, neglecting its other duties on the system, and
expending privileged system resources on behalf of a workload.

To remedy this, we have first considered changing the reclaim logic to
break out after a couple of loops - whether the workload has converged
to the new limit or not - and bound the write() call this way. However,
the root cause that inspired the sequence change in the first place has
been fixed through other means, and so a revert back to the proven
limit-setting sequence, also used by memory.max, is preferable.

The sequence was changed to avoid extreme latencies in the workload when
the limit was lowered: the sudden, large excess created by the limit
lowering would erroneously trigger the penalty sleeping code that is
meant to throttle excessive growth from below. Allocating threads could
end up sleeping long after the write() had already reclaimed the delta
for which they were being punished.

However, erroneous throttling also caused problems in other scenarios at
around the same time. This resulted in commit b3ff92916af3 ("mm, memcg:
reclaim more aggressively before high allocator throttling"), included
in the same release as the offending commit. When allocating threads
now encounter large excess caused by a racing write() to memory.high,
instead of entering punitive sleeps, they will simply be tasked with
helping reclaim down the excess, and will be held no longer than it
takes to accomplish that. This is in line with regular limit
enforcement - i.e. if the workload allocates up against or over an
otherwise unchanged limit from below.

With the patch breaking userspace, and the root cause addressed by other
means already, revert it again.

Link: https://lkml.kernel.org/r/20210122184341.292461-1-han...@cmpxchg.org
Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
Signed-off-by: Johannes Weiner
Reported-by: Tejun Heo
Acked-by: Chris Down
Acked-by: Michal Hocko
Cc: Roman Gushchin
Cc: Shakeel Butt
Cc: Michal Koutný
Cc: [5.8+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 mm/memcontrol.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6320,6 +6320,8 @@ static ssize_t memory_high_write(struct
 	if (err)
 		return err;

+	page_counter_set_high(&memcg->memory, high);
+
 	for (;;) {
 		unsigned long nr_pages = page_counter_read(&memcg->memory);
 		unsigned long reclaimed;
@@ -6343,10 +6345,7 @@ static ssize_t memory_high_write(struct
 			break;
 	}

-	page_counter_set_high(&memcg->memory, high);
-
 	memcg_wb_domain_size_changed(memcg);
-
 	return nbytes;
 }
[PATCH 5.10 00/54] 5.10.16-rc1 review
This is the start of the stable review cycle for the 5.10.16 release.
There are 54 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sat, 13 Feb 2021 15:01:39 +.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.16-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman
    Linux 5.10.16-rc1

Phillip Lougher
    squashfs: add more sanity checks in xattr id lookup

Phillip Lougher
    squashfs: add more sanity checks in inode lookup

Phillip Lougher
    squashfs: add more sanity checks in id lookup

Phillip Lougher
    squashfs: avoid out of bounds writes in decompressors

Johannes Weiner
    Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"

Joachim Henke
    nilfs2: make splice write available again

Ville Syrjälä
    drm/i915: Skip vswing programming for TBT

Ville Syrjälä
    drm/i915: Fix ICL MG PHY vswing handling

Daniel Borkmann
    bpf: Fix verifier jsgt branch analysis on max bound

Daniel Borkmann
    bpf: Fix 32 bit src register truncation on div/mod

Daniel Borkmann
    bpf: Fix verifier jmp32 pruning decision logic

Mark Brown
    regulator: Fix lockdep warning resolving supplies

Baolin Wang
    blk-cgroup: Use cond_resched() when destroy blkgs

Qii Wang
    i2c: mediatek: Move suspend and resume handling to NOIRQ phase

Dave Wysochanski
    SUNRPC: Handle 0 length opaque XDR object data properly

Dave Wysochanski
    SUNRPC: Move simple_get_bytes and simple_get_netobj into private header

Johannes Berg
    iwlwifi: queue: bail out on invalid freeing

Johannes Berg
    iwlwifi: mvm: guard against device removal in reprobe

Luca Coelho
    iwlwifi: pcie: add rules to match Qu with Hr2

Gregory Greenman
    iwlwifi: mvm: invalidate IDs of internal stations at mvm start

Johannes Berg
    iwlwifi: pcie: fix context info memory leak

Emmanuel Grumbach
    iwlwifi: pcie: add a NULL check in iwl_pcie_txq_unmap

Johannes Berg
    iwlwifi: mvm: take mutex for calling iwl_mvm_get_sync_time()

Sara Sharon
    iwlwifi: mvm: skip power command when unbinding vif during CSA

Libin Yang
    ASoC: Intel: sof_sdw: set proper flags for Dell TGL-H SKU 0A5E

Eliot Blennerhassett
    ASoC: ak4458: correct reset polarity

Bard Liao
    ALSA: hda: intel-dsp-config: add PCI id for TGL-H

Trond Myklebust
    pNFS/NFSv4: Improve rejection of out-of-order layouts

Trond Myklebust
    pNFS/NFSv4: Try to return invalid layout in pnfs_layout_process()

Pan Bian
    chtls: Fix potential resource leak

Ricardo Ribalda
    ASoC: Intel: Skylake: Zero snd_ctl_elem_value

Shay Bar
    mac80211: 160MHz with extended NSS BW in CSA

Ben Skeggs
    drm/nouveau/nvif: fix method count when pushing an array

James Schulman
    ASoC: wm_adsp: Fix control name parsing for multi-fw

David Collins
    regulator: core: avoid regulator_resolve_supply() race condition

Cong Wang
    af_key: relax availability checks for skb size calculation

Raoni Fassina Firmino
    powerpc/64/signal: Fix regression in __kernel_sigtramp_rt64() semantics

Kent Gibson
    gpiolib: cdev: clear debounce period if line set to output

Pavel Begunkov
    io_uring: drop mm/files between task_work_submit

Pavel Begunkov
    io_uring: reinforce cancel on flush during exit

Pavel Begunkov
    io_uring: fix sqo ownership false positive warning

Pavel Begunkov
    io_uring: fix list corruption for splice file_get

Hao Xu
    io_uring: fix flush cqring overflow list while TASK_INTERRUPTIBLE

Pavel Begunkov
    io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE

Pavel Begunkov
    io_uring: replace inflight_wait with tctx->wait

Pavel Begunkov
    io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE

Jens Axboe
    io_uring: if we see flush on exit, cancel related tasks

Jens Axboe
    io_uring: account io_uring internal files as REQ_F_INFLIGHT

Pavel Begunkov
    io_uring: fix files cancellation

Pavel Begunkov
    io_uring: always batch cancel in *cancel_files()

Pavel Begunkov
    io_uring: pass files into kill timeouts/poll

Pavel Begunkov
    io_uring: don't iterate io_uring_cancel_files()

Pavel Begunkov
    io_uring: add a {task,files} pair matching helper

Pavel Begunkov
    io_uring: simplify io_task_match()


-------------

Diffstat:

 Makefile                                    |  4 +-
 arch/powerpc/kernel/vdso.c                  |  2 +-
 arch/powerpc/kernel/vdso64/sigtramp.S       | 11 +-
 arch/powerpc/kernel/vdso64/vdso64.lds.S     |  1 +
 block/blk-cgroup.c
[PATCH 5.10 05/54] io_uring: always batch cancel in *cancel_files()
From: Pavel Begunkov

[ Upstream commit f6edbabb8359798c541b0776616c5eab3a840d3d ]

Instead of iterating over each request and cancelling it individually in
io_uring_cancel_files(), try to cancel all matching requests and use
->inflight_list only to check if there is anything left.

In many cases it should be faster, and we can reuse a lot of code from
task cancellation.

Signed-off-by: Pavel Begunkov
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman
---
 fs/io-wq.c    |   10 
 fs/io-wq.h    |    1 
 fs/io_uring.c |  139 --
 3 files changed, 20 insertions(+), 130 deletions(-)

--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -1078,16 +1078,6 @@ enum io_wq_cancel io_wq_cancel_cb(struct
 	return IO_WQ_CANCEL_NOTFOUND;
 }

-static bool io_wq_io_cb_cancel_data(struct io_wq_work *work, void *data)
-{
-	return work == data;
-}
-
-enum io_wq_cancel io_wq_cancel_work(struct io_wq *wq, struct io_wq_work *cwork)
-{
-	return io_wq_cancel_cb(wq, io_wq_io_cb_cancel_data, (void *)cwork, false);
-}
-
 struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 {
 	int ret = -ENOMEM, node;
--- a/fs/io-wq.h
+++ b/fs/io-wq.h
@@ -130,7 +130,6 @@ static inline bool io_wq_is_hashed(struc
 }

 void io_wq_cancel_all(struct io_wq *wq);
-enum io_wq_cancel io_wq_cancel_work(struct io_wq *wq, struct io_wq_work *cwork);

 typedef bool (work_cancel_fn)(struct io_wq_work *, void *);

--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1496,15 +1496,6 @@ static void io_kill_timeout(struct io_ki
 	}
 }

-static bool io_task_match(struct io_kiocb *req, struct task_struct *tsk)
-{
-	struct io_ring_ctx *ctx = req->ctx;
-
-	if (!tsk || req->task == tsk)
-		return true;
-	return (ctx->flags & IORING_SETUP_SQPOLL);
-}
-
 /*
  * Returns true if we found and killed one or more timeouts
  */
@@ -8524,112 +8515,31 @@ static int io_uring_release(struct inode
 	return 0;
 }

-/*
- * Returns true if 'preq' is the link parent of 'req'
- */
-static bool io_match_link(struct io_kiocb *preq, struct io_kiocb *req)
-{
-	struct io_kiocb *link;
-
-	if (!(preq->flags & REQ_F_LINK_HEAD))
-		return false;
-
-	list_for_each_entry(link, &preq->link_list, link_list) {
-		if (link == req)
-			return true;
-	}
-
-	return false;
-}
-
-/*
- * We're looking to cancel 'req' because it's holding on to our files, but
- * 'req' could be a link to another request. See if it is, and cancel that
- * parent request if so.
- */
-static bool io_poll_remove_link(struct io_ring_ctx *ctx, struct io_kiocb *req)
-{
-	struct hlist_node *tmp;
-	struct io_kiocb *preq;
-	bool found = false;
-	int i;
-
-	spin_lock_irq(&ctx->completion_lock);
-	for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
-		struct hlist_head *list;
-
-		list = &ctx->cancel_hash[i];
-		hlist_for_each_entry_safe(preq, tmp, list, hash_node) {
-			found = io_match_link(preq, req);
-			if (found) {
-				io_poll_remove_one(preq);
-				break;
-			}
-		}
-	}
-	spin_unlock_irq(&ctx->completion_lock);
-	return found;
-}
-
-static bool io_timeout_remove_link(struct io_ring_ctx *ctx,
-				   struct io_kiocb *req)
-{
-	struct io_kiocb *preq;
-	bool found = false;
-
-	spin_lock_irq(&ctx->completion_lock);
-	list_for_each_entry(preq, &ctx->timeout_list, timeout.list) {
-		found = io_match_link(preq, req);
-		if (found) {
-			__io_timeout_cancel(preq);
-			break;
-		}
-	}
-	spin_unlock_irq(&ctx->completion_lock);
-	return found;
-}
+struct io_task_cancel {
+	struct task_struct *task;
+	struct files_struct *files;
+};

-static bool io_cancel_link_cb(struct io_wq_work *work, void *data)
+static bool io_cancel_task_cb(struct io_wq_work *work, void *data)
 {
 	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+	struct io_task_cancel *cancel = data;
 	bool ret;

-	if (req->flags & REQ_F_LINK_TIMEOUT) {
+	if (cancel->files && (req->flags & REQ_F_LINK_TIMEOUT)) {
 		unsigned long flags;
 		struct io_ring_ctx *ctx = req->ctx;

 		/* protect against races with linked timeouts */
 		spin_lock_irqsave(&ctx->completion_lock, flags);
-		ret = io_match_link(req, data);
+		ret = io_match_task(req, cancel->task, cancel->files);
 		spin_unlock_irqrestore(&ctx->completion_lock, flags);
 	} else {
-		ret = io_match_link(req, data);
+		ret = io_match_task(req, cancel->task, cancel->files);
[PATCH 5.10 42/54] blk-cgroup: Use cond_resched() when destroy blkgs
From: Baolin Wang

[ Upstream commit 6c635caef410aa757befbd8857c1eadde5cc22ed ]

On !PREEMPT kernel, we can get below softlockup when doing stress
testing with creating and destroying block cgroup repeatedly. The
reason is it may take a long time to acquire the queue's lock in
the loop of blkcg_destroy_blkgs(), or the system can accumulate a
huge number of blkgs in pathological cases. We can add a need_resched()
check on each loop and release locks and do cond_resched() if true
to avoid this issue, since the blkcg_destroy_blkgs() is not called
from atomic contexts.

[ 4757.010308] watchdog: BUG: soft lockup - CPU#11 stuck for 94s!
[ 4757.010698] Call trace:
[ 4757.010700]  blkcg_destroy_blkgs+0x68/0x150
[ 4757.010701]  cgwb_release_workfn+0x104/0x158
[ 4757.010702]  process_one_work+0x1bc/0x3f0
[ 4757.010704]  worker_thread+0x164/0x468
[ 4757.010705]  kthread+0x108/0x138

Suggested-by: Tejun Heo
Signed-off-by: Baolin Wang
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
---
 block/blk-cgroup.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 54fbe1e80cc41..f13688c4b9317 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1017,6 +1017,8 @@ static void blkcg_css_offline(struct cgroup_subsys_state *css)
  */
 void blkcg_destroy_blkgs(struct blkcg *blkcg)
 {
+	might_sleep();
+
 	spin_lock_irq(&blkcg->lock);

 	while (!hlist_empty(&blkcg->blkg_list)) {
@@ -1024,14 +1026,20 @@ void blkcg_destroy_blkgs(struct blkcg *blkcg)
 						struct blkcg_gq, blkcg_node);
 		struct request_queue *q = blkg->q;

-		if (spin_trylock(&q->queue_lock)) {
-			blkg_destroy(blkg);
-			spin_unlock(&q->queue_lock);
-		} else {
+		if (need_resched() || !spin_trylock(&q->queue_lock)) {
+			/*
+			 * Given that the system can accumulate a huge number
+			 * of blkgs in pathological cases, check to see if we
+			 * need to rescheduling to avoid softlockup.
+			 */
 			spin_unlock_irq(&blkcg->lock);
-			cpu_relax();
+			cond_resched();
 			spin_lock_irq(&blkcg->lock);
+			continue;
 		}
+
+		blkg_destroy(blkg);
+		spin_unlock(&q->queue_lock);
 	}

 	spin_unlock_irq(&blkcg->lock);
-- 
2.27.0
[PATCH 5.10 47/54] drm/i915: Fix ICL MG PHY vswing handling
From: Ville Syrjälä

commit a2a5f5628e5494ca9353f761f7fe783dfa82fb9a upstream.

The MG PHY vswing table does have all the entries these days. Get
rid of the old hacks in the code which claim otherwise.

This hack was totally bogus anyway. The correct way to handle the
lack of those two entries would have been to declare our max
vswing and pre-emph to both be level 2.

Cc: José Roberto de Souza
Cc: Clinton Taylor
Fixes: 9f7ffa297978 ("drm/i915/tc/icl: Update TC vswing tables")
Signed-off-by: Ville Syrjälä
Link: https://patchwork.freedesktop.org/patch/msgid/20201207203512.1718-1-ville.syrj...@linux.intel.com
Reviewed-by: Imre Deak
Reviewed-by: José Roberto de Souza
(cherry picked from commit 5ec346476e795089b7dac8ab9dcee30c8d80ad84)
Signed-off-by: Jani Nikula
Signed-off-by: Greg Kroah-Hartman
---
 drivers/gpu/drm/i915/display/intel_ddi.c |    7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2605,12 +2605,11 @@ static void icl_mg_phy_ddi_vswing_sequen

 	ddi_translations = icl_get_mg_buf_trans(encoder, type, rate,
 						&n_entries);
-	/* The table does not have values for level 3 and level 9. */
-	if (level >= n_entries || level == 3 || level == 9) {
+	if (level >= n_entries) {
 		drm_dbg_kms(&dev_priv->drm,
 			    "DDI translation not found for level %d. Using %d instead.",
-			    level, n_entries - 2);
-		level = n_entries - 2;
+			    level, n_entries - 1);
+		level = n_entries - 1;
 	}

 	/* Set MG_TX_LINK_PARAMS cri_use_fs32 to 0. */
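Reviewer's note: the behavioural change in this hunk is just a different clamp on the table index. The sketch below restates both versions as standalone helpers so the difference can be checked by hand. The helper names are hypothetical, and the table size of 10 used in the usage note is an assumption inferred from the removed "level 9" comment, not something the patch states.

```c
#include <assert.h>

/* Old behaviour: levels 3 and 9 were treated as missing table entries. */
static int vswing_level_old(int level, int n_entries)
{
	if (level >= n_entries || level == 3 || level == 9)
		level = n_entries - 2;
	return level;
}

/* New behaviour: only out-of-range levels are clamped, to the last entry. */
static int vswing_level_fixed(int level, int n_entries)
{
	if (level >= n_entries)
		level = n_entries - 1;
	return level;
}
```

Assuming a 10-entry table, a request for level 3 used to be silently redirected to entry 8, whereas the fixed helper returns 3 unchanged; an out-of-range request such as 12 now falls back to entry 9 rather than 8.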
[PATCH 5.10 06/54] io_uring: fix files cancellation
From: Pavel Begunkov

[ Upstream commit bee749b187ac57d1faf00b2ab356ff322230fce8 ]

io_uring_cancel_files()'s task check condition mistakenly got flipped.

1. There can't be a request in the inflight list without
IO_WQ_WORK_FILES, kill this check to keep the whole condition simpler.
2. Also, don't call the function for files==NULL to not do such a check,
all that stuff is already handled well by its counterpart,
__io_uring_cancel_task_requests().

With that just flip the task check.

Also, it iowq-cancels all requests of the current task there, don't
forget to set the right ->files into struct io_task_cancel.

Fixes: c1973b38bf639 ("io_uring: cancel only requests of current task")
Reported-by: syzbot+c0d52d0b3c0c3ffb9...@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman
---
 fs/io_uring.c |    8 
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8571,15 +8571,14 @@ static void io_uring_cancel_files(struct
 				  struct files_struct *files)
 {
 	while (!list_empty_careful(&ctx->inflight_list)) {
-		struct io_task_cancel cancel = { .task = task, .files = NULL, };
+		struct io_task_cancel cancel = { .task = task, .files = files };
 		struct io_kiocb *req;
 		DEFINE_WAIT(wait);
 		bool found = false;

 		spin_lock_irq(&ctx->inflight_lock);
 		list_for_each_entry(req, &ctx->inflight_list, inflight_entry) {
-			if (req->task == task &&
-			    (req->work.flags & IO_WQ_WORK_FILES) &&
+			if (req->task != task ||
 			    req->work.identity->files != files)
 				continue;
 			found = true;
@@ -8665,10 +8664,11 @@ static void io_uring_cancel_task_request

 	io_cancel_defer_files(ctx, task, files);
 	io_cqring_overflow_flush(ctx, true, task, files);
-	io_uring_cancel_files(ctx, task, files);

 	if (!files)
 		__io_uring_cancel_task_requests(ctx, task);
+	else
+		io_uring_cancel_files(ctx, task, files);

 	if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) {
 		atomic_dec(&task->io_uring->in_idle);
[PATCH 5.10 48/54] drm/i915: Skip vswing programming for TBT
From: Ville Syrjälä

commit eaf5bfe37db871031232d2bf2535b6ca92afbad8 upstream.

In thunderbolt mode the PHY is owned by the thunderbolt controller.
We are not supposed to touch it. So skip the vswing programming
as well (we already skipped the other steps not applicable to TBT).

Touching this stuff could supposedly interfere with the PHY
programming done by the thunderbolt controller.

Cc: sta...@vger.kernel.org
Signed-off-by: Ville Syrjälä
Link: https://patchwork.freedesktop.org/patch/msgid/20210128155948.13678-1-ville.syrj...@linux.intel.com
Reviewed-by: Imre Deak
(cherry picked from commit f8c6b615b921d8a1bcd74870f9105e62b0bceff3)
Signed-off-by: Jani Nikula
Signed-off-by: Greg Kroah-Hartman
---
 drivers/gpu/drm/i915/display/intel_ddi.c |    6 ++
 1 file changed, 6 insertions(+)

--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2597,6 +2597,9 @@ static void icl_mg_phy_ddi_vswing_sequen
 	u32 n_entries, val;
 	int ln, rate = 0;

+	if (enc_to_dig_port(encoder)->tc_mode == TC_PORT_TBT_ALT)
+		return;
+
 	if (type != INTEL_OUTPUT_HDMI) {
 		struct intel_dp *intel_dp = enc_to_intel_dp(encoder);

@@ -2741,6 +2744,9 @@ tgl_dkl_phy_ddi_vswing_sequence(struct i
 	u32 n_entries, val, ln, dpcnt_mask, dpcnt_val;
 	int rate = 0;

+	if (enc_to_dig_port(encoder)->tc_mode == TC_PORT_TBT_ALT)
+		return;
+
 	if (type != INTEL_OUTPUT_HDMI) {
 		struct intel_dp *intel_dp = enc_to_intel_dp(encoder);
[PATCH 5.10 31/54] iwlwifi: mvm: skip power command when unbinding vif during CSA
From: Sara Sharon

[ Upstream commit bf544e9aa570034e094a8a40d5f9e1e2c4916d18 ]

In the new CSA flow, we remain associated during CSA, but
still do a unbind-bind to the vif. However, sending the power
command right after when vif is unbound but still associated
causes FW to assert (0x3400) since it cannot tell the LMAC id.

Just skip this command, we will send it again in a bit, when
assigning the new context.

Signed-off-by: Sara Sharon
Signed-off-by: Luca Coelho
Signed-off-by: Kalle Valo
Link: https://lore.kernel.org/r/iwlwifi.20210115130252.64a2254ac5c3.Iaa3a9050bf3d7c9cd5beaf561e932e6defc12ec3@changeid
Signed-off-by: Sasha Levin
---
 drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
index b627e7da7ac9d..d42165559df6e 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
@@ -4249,6 +4249,9 @@ static void __iwl_mvm_unassign_vif_chanctx(struct iwl_mvm *mvm,
 	iwl_mvm_binding_remove_vif(mvm, vif);

 out:
+	if (fw_has_capa(&mvm->fw->ucode_capa, IWL_UCODE_TLV_CAPA_CHANNEL_SWITCH_CMD) &&
+	    switching_chanctx)
+		return;
 	mvmvif->phy_ctxt = NULL;
 	iwl_mvm_power_update_mac(mvm);
 }
-- 
2.27.0
[PATCH 5.10 49/54] nilfs2: make splice write available again
From: Joachim Henke

commit a35d8f016e0b68634035217d06d1c53863456b50 upstream.

Since 5.10, splice() or sendfile() to NILFS2 return EINVAL. This was
caused by commit 36e2c7421f02 ("fs: don't allow splice read/write
without explicit ops").

This patch initializes the splice_write field in file_operations, like
most file systems do, to restore the functionality.

Link: https://lkml.kernel.org/r/1612784101-14353-1-git-send-email-konishi.ryus...@gmail.com
Signed-off-by: Joachim Henke
Signed-off-by: Ryusuke Konishi
Tested-by: Ryusuke Konishi
Cc: [5.10+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 fs/nilfs2/file.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/nilfs2/file.c
+++ b/fs/nilfs2/file.c
@@ -141,6 +141,7 @@ const struct file_operations nilfs_file_
 	/* .release	= nilfs_release_file, */
 	.fsync		= nilfs_sync_file,
 	.splice_read	= generic_file_splice_read,
+	.splice_write	= iter_file_splice_write,
 };

 const struct inode_operations nilfs_file_inode_operations = {
[PATCH 5.10 09/54] io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE
From: Pavel Begunkov [ Upstream commit a1bb3cd58913338e1b627ea6b8c03c2ae82d293f ] If the tctx inflight number hasn't changed because of cancellation, __io_uring_task_cancel() will continue leaving the task in TASK_UNINTERRUPTIBLE state, which is not expected by __io_uring_files_cancel(). Ensure we always call finish_wait() before retrying. Cc: sta...@vger.kernel.org # 5.9+ Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -8829,15 +8829,15 @@ void __io_uring_task_cancel(void) prepare_to_wait(&tctx->wait, &wait, TASK_UNINTERRUPTIBLE); /* -* If we've seen completions, retry. This avoids a race where -* a completion comes in before we did prepare_to_wait(). +* If we've seen completions, retry without waiting. This +* avoids a race where a completion comes in before we did +* prepare_to_wait(). */ - if (inflight != tctx_inflight(tctx)) - continue; - schedule(); + if (inflight == tctx_inflight(tctx)) + schedule(); + finish_wait(&tctx->wait, &wait); } while (1); - finish_wait(&tctx->wait, &wait); atomic_dec(&tctx->in_idle); io_uring_remove_task_files(tctx);
[PATCH 5.10 08/54] io_uring: if we see flush on exit, cancel related tasks
From: Jens Axboe [ Upstream commit 84965ff8a84f0368b154c9b367b62e59c1193f30 ] Ensure we match tasks that belong to a dead or dying task as well, as we need to reap those in addition to those belonging to the exiting task. Cc: sta...@vger.kernel.org # 5.9+ Reported-by: Josef Grieb Signed-off-by: Jens Axboe Signed-off-by: Pavel Begunkov Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c |9 - 1 file changed, 8 insertions(+), 1 deletion(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1014,8 +1014,12 @@ static bool io_match_task(struct io_kioc { struct io_kiocb *link; - if (task && head->task != task) + if (task && head->task != task) { + /* in terms of cancelation, always match if req task is dead */ + if (head->task->flags & PF_EXITING) + return true; return false; + } if (!files) return true; if (__io_match_files(head, files)) @@ -8844,6 +8848,9 @@ static int io_uring_flush(struct file *f struct io_uring_task *tctx = current->io_uring; struct io_ring_ctx *ctx = file->private_data; + if (fatal_signal_pending(current) || (current->flags & PF_EXITING)) + io_uring_cancel_task_requests(ctx, NULL); + if (!tctx) return 0;
[PATCH 5.10 41/54] i2c: mediatek: Move suspend and resume handling to NOIRQ phase
From: Qii Wang [ Upstream commit de96c3943f591018727b862f51953c1b6c55bcc3 ] Some i2c device driver indirectly uses I2C driver when it is now being suspended. The i2c devices driver is suspended during the NOIRQ phase and this cannot be changed due to other dependencies. Therefore, we also need to move the suspend handling for the I2C controller driver to the NOIRQ phase as well. Signed-off-by: Qii Wang Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin --- drivers/i2c/busses/i2c-mt65xx.c | 19 --- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c index 0818d3e507347..2ffd2f354d0ae 100644 --- a/drivers/i2c/busses/i2c-mt65xx.c +++ b/drivers/i2c/busses/i2c-mt65xx.c @@ -1275,7 +1275,8 @@ static int mtk_i2c_probe(struct platform_device *pdev) mtk_i2c_clock_disable(i2c); ret = devm_request_irq(&pdev->dev, irq, mtk_i2c_irq, - IRQF_TRIGGER_NONE, I2C_DRV_NAME, i2c); + IRQF_NO_SUSPEND | IRQF_TRIGGER_NONE, + I2C_DRV_NAME, i2c); if (ret < 0) { dev_err(&pdev->dev, "Request I2C IRQ %d fail\n", irq); @@ -1302,7 +1303,16 @@ static int mtk_i2c_remove(struct platform_device *pdev) } #ifdef CONFIG_PM_SLEEP -static int mtk_i2c_resume(struct device *dev) +static int mtk_i2c_suspend_noirq(struct device *dev) +{ + struct mtk_i2c *i2c = dev_get_drvdata(dev); + + i2c_mark_adapter_suspended(&i2c->adap); + + return 0; +} + +static int mtk_i2c_resume_noirq(struct device *dev) { int ret; struct mtk_i2c *i2c = dev_get_drvdata(dev); @@ -1317,12 +1327,15 @@ static int mtk_i2c_resume(struct device *dev) mtk_i2c_clock_disable(i2c); + i2c_mark_adapter_resumed(&i2c->adap); + return 0; } #endif static const struct dev_pm_ops mtk_i2c_pm = { - SET_SYSTEM_SLEEP_PM_OPS(NULL, mtk_i2c_resume) + SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(mtk_i2c_suspend_noirq, + mtk_i2c_resume_noirq) }; static struct platform_driver mtk_i2c_driver = { -- 2.27.0
[PATCH 5.10 07/54] io_uring: account io_uring internal files as REQ_F_INFLIGHT
From: Jens Axboe [ Upstream commit 02a13674fa0e8dd326de8b9f4514b41b03d99003 ] We need to actively cancel anything that introduces a potential circular loop, where io_uring holds a reference to itself. If the file in question is an io_uring file, then add the request to the inflight list. Cc: sta...@vger.kernel.org # 5.9+ Signed-off-by: Jens Axboe Signed-off-by: Pavel Begunkov Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c | 32 1 file changed, 24 insertions(+), 8 deletions(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1000,6 +1000,9 @@ static inline void io_clean_op(struct io static inline bool __io_match_files(struct io_kiocb *req, struct files_struct *files) { + if (req->file && req->file->f_op == &io_uring_fops) + return true; + return ((req->flags & REQ_F_WORK_INITIALIZED) && (req->work.flags & IO_WQ_WORK_FILES)) && req->work.identity->files == files; @@ -1398,11 +1401,14 @@ static bool io_grab_identity(struct io_k return false; atomic_inc(&id->files->count); get_nsproxy(id->nsproxy); - req->flags |= REQ_F_INFLIGHT; - spin_lock_irq(&ctx->inflight_lock); - list_add(&req->inflight_entry, &ctx->inflight_list); - spin_unlock_irq(&ctx->inflight_lock); + if (!(req->flags & REQ_F_INFLIGHT)) { + req->flags |= REQ_F_INFLIGHT; + + spin_lock_irq(&ctx->inflight_lock); + list_add(&req->inflight_entry, &ctx->inflight_list); + spin_unlock_irq(&ctx->inflight_lock); + } req->work.flags |= IO_WQ_WORK_FILES; } if (!(req->work.flags & IO_WQ_WORK_MM) && @@ -5886,8 +5892,10 @@ static void io_req_drop_files(struct io_ struct io_ring_ctx *ctx = req->ctx; unsigned long flags; - put_files_struct(req->work.identity->files); - put_nsproxy(req->work.identity->nsproxy); + if (req->work.flags & IO_WQ_WORK_FILES) { + put_files_struct(req->work.identity->files); + put_nsproxy(req->work.identity->nsproxy); + } spin_lock_irqsave(&ctx->inflight_lock, flags); list_del(&req->inflight_entry); spin_unlock_irqrestore(&ctx->inflight_lock, flags); @@ -6159,6 +6167,15 @@ static struct file 
*io_file_get(struct i file = __io_file_get(state, fd); } + if (file && file->f_op == &io_uring_fops) { + io_req_init_async(req); + req->flags |= REQ_F_INFLIGHT; + + spin_lock_irq(&ctx->inflight_lock); + list_add(&req->inflight_entry, &ctx->inflight_list); + spin_unlock_irq(&ctx->inflight_lock); + } + return file; } @@ -8578,8 +8595,7 @@ static void io_uring_cancel_files(struct spin_lock_irq(&ctx->inflight_lock); list_for_each_entry(req, &ctx->inflight_list, inflight_entry) { - if (req->task != task || - req->work.identity->files != files) + if (!io_match_task(req, task, files)) continue; found = true; break;
[PATCH 5.10 51/54] squashfs: avoid out of bounds writes in decompressors
From: Phillip Lougher commit e812cbb15adbbbee176baa1e8bda53059bf0 upstream. Patch series "Squashfs: fix BIO migration regression and add sanity checks". Patch [1/4] fixes a regression introduced by the "migrate from ll_rw_block usage to BIO" patch, which has produced a number of Sysbot/Syzkaller reports. Patches [2/4], [3/4], and [4/4] fix a number of filesystem corruption issues which have produced Sysbot reports in the id, inode and xattr lookup code. Each patch has been tested against the Sysbot reproducers using the given kernel configuration. They have the appropriate "Reported-by:" lines added. Additionally, all of the reproducer filesystems are indirectly fixed by patch [4/4] due to the fact they all have xattr corruption which is now detected there. Additional testing with other configurations and architectures (32bit, big endian), and normal filesystems has also been done to trap any inadvertent regressions caused by the additional sanity checks. This patch (of 4): This is a regression introduced by the patch "migrate from ll_rw_block usage to BIO". Sysbot/Syzkaller has reported a number of "out of bounds writes" and "unable to handle kernel paging request in squashfs_decompress" errors which have been identified as a regression introduced by the above patch. Specifically, the patch removed the following sanity check: if (length < 0 || length > output->length || (index + length) > msblk->bytes_used) This check did two things: 1. It ensured any reads were not beyond the end of the filesystem. 2. It ensured that the "length" field read from the filesystem was within the expected maximum length. Without this any corrupted values can over-run allocated buffers.
Link: https://lkml.kernel.org/r/20210204130249.4495-1-phil...@squashfs.org.uk Link: https://lkml.kernel.org/r/20210204130249.4495-2-phil...@squashfs.org.uk Fixes: 93e72b3c612adc ("squashfs: migrate from ll_rw_block usage to BIO") Reported-by: syzbot+6fba78f99b9afd4b5...@syzkaller.appspotmail.com Signed-off-by: Phillip Lougher Cc: Philippe Liard Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- fs/squashfs/block.c |8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) --- a/fs/squashfs/block.c +++ b/fs/squashfs/block.c @@ -196,9 +196,15 @@ int squashfs_read_data(struct super_bloc length = SQUASHFS_COMPRESSED_SIZE(length); index += 2; - TRACE("Block @ 0x%llx, %scompressed size %d\n", index, + TRACE("Block @ 0x%llx, %scompressed size %d\n", index - 2, compressed ? "" : "un", length); } + if (length < 0 || length > output->length || + (index + length) > msblk->bytes_used) { + res = -EIO; + goto out; + } + if (next_index) *next_index = index + length;
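The restored check can be tried outside the kernel. What follows is a minimal user-space sketch, not the kernel function: the helper name squashfs_block_in_bounds() is hypothetical, and its plain integer parameters stand in for output->length and msblk->bytes_used from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone form of the sanity check the patch restores.
 *   length        - block length read from the (possibly corrupt) filesystem
 *   output_length - size of the buffer the block will be decompressed into
 *   index         - byte offset of the block data within the filesystem
 *   bytes_used    - total size of the filesystem image
 */
static inline bool squashfs_block_in_bounds(int length, int output_length,
                                            uint64_t index, uint64_t bytes_used)
{
    /* The length field itself must fit the output buffer. */
    if (length < 0 || length > output_length)
        return false;

    /* The read must not run past the end of the filesystem. */
    if (index + (uint64_t)length > bytes_used)
        return false;

    return true;
}

/* A few illustrative cases; the numbers are arbitrary. */
static inline void squashfs_block_in_bounds_selfcheck(void)
{
    assert(squashfs_block_in_bounds(100, 8192, 0, 4096));
    assert(!squashfs_block_in_bounds(-1, 8192, 0, 4096));      /* negative */
    assert(!squashfs_block_in_bounds(9000, 8192, 0, 100000));  /* overruns buffer */
    assert(!squashfs_block_in_bounds(100, 8192, 4000, 4096));  /* past end of fs */
}
```

In the patch a failing check sets res = -EIO and jumps to out. The sketch returns a boolean only to keep it self-contained.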
[PATCH 5.10 32/54] iwlwifi: mvm: take mutex for calling iwl_mvm_get_sync_time()
From: Johannes Berg [ Upstream commit 5c56d862c749669d45c256f581eac4244be00d4d ] We need to take the mutex to call iwl_mvm_get_sync_time(), do it. Signed-off-by: Johannes Berg Signed-off-by: Luca Coelho Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/iwlwifi.20210115130252.4bb5ccf881a6.I62973cbb081e80aa5b0447a5c3b9c3251a65cf6b@changeid Signed-off-by: Sasha Levin --- drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c b/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c index f043eefabb4ec..7b1d2dac6ceb8 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c @@ -514,7 +514,10 @@ static ssize_t iwl_dbgfs_os_device_timediff_read(struct file *file, const size_t bufsz = sizeof(buf); int pos = 0; + mutex_lock(&mvm->mutex); iwl_mvm_get_sync_time(mvm, &curr_gp2, &curr_os); + mutex_unlock(&mvm->mutex); + do_div(curr_os, NSEC_PER_USEC); diff = curr_os - curr_gp2; pos += scnprintf(buf + pos, bufsz - pos, "diff=%lld\n", diff); -- 2.27.0
[PATCH 5.10 43/54] regulator: Fix lockdep warning resolving supplies
From: Mark Brown [ Upstream commit 14a71d509ac809dcf56d7e3ca376b15d17bd0ddd ] With commit eaa7995c529b54 (regulator: core: avoid regulator_resolve_supply() race condition) we started holding the rdev lock while resolving supplies, an operation that requires holding the regulator_list_mutex. This results in lockdep warnings since in other places we take the list mutex then the mutex on an individual rdev. Since the goal is to make sure that we don't call set_supply() twice rather than a concern about the cost of resolution pull the rdev lock and check for duplicate resolution down to immediately before we do the set_supply() and drop it again once the allocation is done. Fixes: eaa7995c529b54 (regulator: core: avoid regulator_resolve_supply() race condition) Reported-by: Marek Szyprowski Tested-by: Marek Szyprowski Signed-off-by: Mark Brown Link: https://lore.kernel.org/r/20210122132042.10306-1-broo...@kernel.org Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/regulator/core.c | 29 + 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 2c31f04ff950f..35098dbd32a3c 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1823,17 +1823,6 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) if (rdev->supply) return 0; - /* -* Recheck rdev->supply with rdev->mutex lock held to avoid a race -* between rdev->supply null check and setting rdev->supply in -* set_supply() from concurrent tasks. -*/ - regulator_lock(rdev); - - /* Supply just resolved by a concurrent task? */ - if (rdev->supply) - goto out; - r = regulator_dev_lookup(dev, rdev->supply_name); if (IS_ERR(r)) { ret = PTR_ERR(r); @@ -1885,12 +1874,29 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) goto out; } + /* +* Recheck rdev->supply with rdev->mutex lock held to avoid a race +* between rdev->supply null check and setting rdev->supply in +* set_supply() from concurrent tasks. 
+*/ + regulator_lock(rdev); + + /* Supply just resolved by a concurrent task? */ + if (rdev->supply) { + regulator_unlock(rdev); + put_device(&r->dev); + goto out; + } + ret = set_supply(rdev, r); if (ret < 0) { + regulator_unlock(rdev); put_device(&r->dev); goto out; } + regulator_unlock(rdev); + /* * In set_machine_constraints() we may have turned this regulator on * but we couldn't propagate to the supply if it hadn't been resolved @@ -1906,7 +1912,6 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) } out: - regulator_unlock(rdev); return ret; } -- 2.27.0
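The locking change in this patch follows a common pattern: do the slow lookup without the per-object lock, then take that lock only to recheck and publish the result. The sketch below is a user-space illustration under stated assumptions. struct dev, lookup_supply() and resolve_supply() are hypothetical names, and a pthread mutex stands in for regulator_lock(); it is not the regulator core code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct dev {
    pthread_mutex_t lock;  /* stands in for the per-rdev lock */
    const char *supply;    /* NULL until a supply has been resolved */
    int set_calls;         /* counts how many times the supply was set */
};

/* Hypothetical slow lookup; it runs without dev->lock held. */
static inline const char *lookup_supply(void)
{
    return "vdd";
}

/* Resolve the supply at most once, holding the lock only briefly. */
static inline int resolve_supply(struct dev *d)
{
    const char *r;

    if (d->supply)              /* early unlocked check, as in the patch */
        return 0;

    r = lookup_supply();        /* may take other locks; d->lock not held */

    pthread_mutex_lock(&d->lock);
    if (d->supply) {            /* resolved by a concurrent task meanwhile */
        pthread_mutex_unlock(&d->lock);
        return 0;
    }
    d->supply = r;              /* corresponds to set_supply() */
    d->set_calls++;
    pthread_mutex_unlock(&d->lock);

    return 0;
}
```

Because the lock is never held across the lookup, it cannot nest around whatever locks the lookup takes, which is the ordering problem lockdep reported. The cost is that a lookup may occasionally be done twice and its result dropped, which the patch handles with put_device().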
[PATCH 5.4 17/24] i2c: mediatek: Move suspend and resume handling to NOIRQ phase
From: Qii Wang [ Upstream commit de96c3943f591018727b862f51953c1b6c55bcc3 ] Some i2c device driver indirectly uses I2C driver when it is now being suspended. The i2c devices driver is suspended during the NOIRQ phase and this cannot be changed due to other dependencies. Therefore, we also need to move the suspend handling for the I2C controller driver to the NOIRQ phase as well. Signed-off-by: Qii Wang Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin --- drivers/i2c/busses/i2c-mt65xx.c | 19 --- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c index 5a9f0d17f52c8..e1ef0122ef759 100644 --- a/drivers/i2c/busses/i2c-mt65xx.c +++ b/drivers/i2c/busses/i2c-mt65xx.c @@ -1008,7 +1008,8 @@ static int mtk_i2c_probe(struct platform_device *pdev) mtk_i2c_clock_disable(i2c); ret = devm_request_irq(&pdev->dev, irq, mtk_i2c_irq, - IRQF_TRIGGER_NONE, I2C_DRV_NAME, i2c); + IRQF_NO_SUSPEND | IRQF_TRIGGER_NONE, + I2C_DRV_NAME, i2c); if (ret < 0) { dev_err(&pdev->dev, "Request I2C IRQ %d fail\n", irq); @@ -1035,7 +1036,16 @@ static int mtk_i2c_remove(struct platform_device *pdev) } #ifdef CONFIG_PM_SLEEP -static int mtk_i2c_resume(struct device *dev) +static int mtk_i2c_suspend_noirq(struct device *dev) +{ + struct mtk_i2c *i2c = dev_get_drvdata(dev); + + i2c_mark_adapter_suspended(&i2c->adap); + + return 0; +} + +static int mtk_i2c_resume_noirq(struct device *dev) { int ret; struct mtk_i2c *i2c = dev_get_drvdata(dev); @@ -1050,12 +1060,15 @@ static int mtk_i2c_resume(struct device *dev) mtk_i2c_clock_disable(i2c); + i2c_mark_adapter_resumed(&i2c->adap); + return 0; } #endif static const struct dev_pm_ops mtk_i2c_pm = { - SET_SYSTEM_SLEEP_PM_OPS(NULL, mtk_i2c_resume) + SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(mtk_i2c_suspend_noirq, + mtk_i2c_resume_noirq) }; static struct platform_driver mtk_i2c_driver = { -- 2.27.0
[PATCH 5.4 21/24] Fix unsynchronized access to sev members through svm_register_enc_region
From: Peter Gonda commit 19a23da53932bc8011220bd8c410cb76012de004 upstream. Grab kvm->lock before pinning memory when registering an encrypted region; sev_pin_memory() relies on kvm->lock being held to ensure correctness when checking and updating the number of pinned pages. Add a lockdep assertion to help prevent future regressions. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Paolo Bonzini Cc: Joerg Roedel Cc: Tom Lendacky Cc: Brijesh Singh Cc: Sean Christopherson Cc: x...@kernel.org Cc: k...@vger.kernel.org Cc: sta...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Fixes: 1e80fdc09d12 ("KVM: SVM: Pin guest memory when SEV is active") Signed-off-by: Peter Gonda V2 - Fix up patch description - Correct file paths svm.c -> sev.c - Add unlock of kvm->lock on sev_pin_memory error V1 - https://lore.kernel.org/kvm/20210126185431.1824530-1-pgo...@google.com/ Message-Id: <20210127161524.2832400-1-pgo...@google.com> Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman --- arch/x86/kvm/svm.c | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1835,6 +1835,8 @@ static struct page **sev_pin_memory(stru struct page **pages; unsigned long first, last; + lockdep_assert_held(&kvm->lock); + if (ulen == 0 || uaddr + ulen < uaddr) return NULL; @@ -7091,12 +7093,21 @@ static int svm_register_enc_region(struc if (!region) return -ENOMEM; + mutex_lock(&kvm->lock); region->pages = sev_pin_memory(kvm, range->addr, range->size, &region->npages, 1); if (!region->pages) { ret = -ENOMEM; + mutex_unlock(&kvm->lock); goto e_free; } + region->uaddr = range->addr; + region->size = range->size; + + mutex_lock(&kvm->lock); + list_add_tail(&region->list, &sev->regions_list); + mutex_unlock(&kvm->lock); + /* * The guest may change the memory encryption attribute from C=0 -> C=1 * or vice versa for this memory range.
Lets make sure caches are @@ -7105,13 +7116,6 @@ static int svm_register_enc_region(struc */ sev_clflush_pages(region->pages, region->npages); - region->uaddr = range->addr; - region->size = range->size; - - mutex_lock(&kvm->lock); - list_add_tail(&region->list, &sev->regions_list); - mutex_unlock(&kvm->lock); - return ret; e_free:
[PATCH 5.4 18/24] blk-cgroup: Use cond_resched() when destroy blkgs
From: Baolin Wang [ Upstream commit 6c635caef410aa757befbd8857c1eadde5cc22ed ] On a !PREEMPT kernel, we can get the softlockup below when stress testing by repeatedly creating and destroying block cgroups. The reason is that it may take a long time to acquire the queue's lock in the loop of blkcg_destroy_blkgs(), or the system can accumulate a huge number of blkgs in pathological cases. We can add a need_resched() check on each iteration, and release the locks and call cond_resched() if it is true, to avoid this issue, since blkcg_destroy_blkgs() is not called from atomic contexts. [ 4757.010308] watchdog: BUG: soft lockup - CPU#11 stuck for 94s! [ 4757.010698] Call trace: [ 4757.010700] blkcg_destroy_blkgs+0x68/0x150 [ 4757.010701] cgwb_release_workfn+0x104/0x158 [ 4757.010702] process_one_work+0x1bc/0x3f0 [ 4757.010704] worker_thread+0x164/0x468 [ 4757.010705] kthread+0x108/0x138 Suggested-by: Tejun Heo Signed-off-by: Baolin Wang Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin --- block/blk-cgroup.c | 18 +- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 3d34ac02d76ef..cb3d44d200055 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -1089,6 +1089,8 @@ static void blkcg_css_offline(struct cgroup_subsys_state *css) */ void blkcg_destroy_blkgs(struct blkcg *blkcg) { + might_sleep(); + spin_lock_irq(&blkcg->lock); while (!hlist_empty(&blkcg->blkg_list)) { @@ -1096,14 +1098,20 @@ void blkcg_destroy_blkgs(struct blkcg *blkcg) struct blkcg_gq, blkcg_node); struct request_queue *q = blkg->q; - if (spin_trylock(&q->queue_lock)) { - blkg_destroy(blkg); - spin_unlock(&q->queue_lock); - } else { + if (need_resched() || !spin_trylock(&q->queue_lock)) { + /* +* Given that the system can accumulate a huge number +* of blkgs in pathological cases, check to see if we +* need to rescheduling to avoid softlockup.
+*/ spin_unlock_irq(&blkcg->lock); - cpu_relax(); + cond_resched(); spin_lock_irq(&blkcg->lock); + continue; } + + blkg_destroy(blkg); + spin_unlock(&q->queue_lock); } spin_unlock_irq(&blkcg->lock); -- 2.27.0
Re: [PATCH] arm64: Fix warning in mte_get_random_tag()
On 2/11/21 1:35 PM, Ard Biesheuvel wrote:
> On Thu, 11 Feb 2021 at 13:57, Vincenzo Frascino wrote:
>>
>> The simplification of mte_get_random_tag() caused the introduction of the
>> warning below:
>>
>> In file included from arch/arm64/include/asm/kasan.h:9,
>>                  from include/linux/kasan.h:16,
>>                  from mm/kasan/common.c:14:
>> mm/kasan/common.c: In function ‘mte_get_random_tag’:
>> arch/arm64/include/asm/mte-kasan.h:45:9: warning: ‘addr’ is used
>> uninitialized [-Wuninitialized]
>>    45 | asm(__MTE_PREAMBLE "irg %0, %0"
>>       |
>>
>> Fix the warning initializing the address to NULL.
>>
>> Note: mte_get_random_tag() returns a tag and it never dereferences the
>> address,
>> hence 'addr' can be safely initialized to NULL.
>>
>> Fixes: c8f8de4c0887 ("arm64: kasan: simplify and inline MTE functions")
>> Cc: Catalin Marinas
>> Cc: Will Deacon
>> Cc: Andrey Konovalov
>> Cc: Andrew Morton
>> Signed-off-by: Vincenzo Frascino
>> ---
>>
>> This patch is based on linux-next/akpm
>>
>> arch/arm64/include/asm/mte-kasan.h | 7 ++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/include/asm/mte-kasan.h
>> b/arch/arm64/include/asm/mte-kasan.h
>> index 3d58489228c0..b2850b750726 100644
>> --- a/arch/arm64/include/asm/mte-kasan.h
>> +++ b/arch/arm64/include/asm/mte-kasan.h
>> @@ -40,7 +40,12 @@ static inline u8 mte_get_mem_tag(void *addr)
>> /* Generate a random tag. */
>> static inline u8 mte_get_random_tag(void)
>> {
>> - void *addr;
>> + /*
>> +* mte_get_random_tag() returns a tag and it
>> +* never dereferences the address, hence addr
>> +* can be safely initialized to NULL.
>> +*/
>> + void *addr = NULL;
>>
>> asm(__MTE_PREAMBLE "irg %0, %0"
>> : "+r" (addr));
>> --
>> 2.30.0
>>
>
> Might it be better to simply change the asm constraint to "=r" ?
>

Indeed, did not notice the "+r". I will change it accordingly and post v2.

Thanks!

--
Regards,
Vincenzo
[PATCH 5.4 24/24] squashfs: add more sanity checks in xattr id lookup
From: Phillip Lougher commit 506220d2ba21791314af569211ffd8870b8208fa upstream. Sysbot has reported a warning where a kmalloc() attempt exceeds the maximum limit. This has been identified as corruption of the xattr_ids count when reading the xattr id lookup table. This patch adds a number of additional sanity checks to detect this corruption and others. 1. It checks for a corrupted xattr index read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This would cause an out of bounds read. 2. It checks against corruption of the xattr_ids count. This can either lead to the above kmalloc failure, or a smaller than expected table to be read. 3. It checks the contents of the index table for corruption. [phil...@squashfs.org.uk: fix checkpatch issue] Link: https://lkml.kernel.org/r/270245655.754655.1612770082...@webmail.123-reg.co.uk Link: https://lkml.kernel.org/r/20210204130249.4495-5-phil...@squashfs.org.uk Signed-off-by: Phillip Lougher Reported-by: syzbot+2ccea6339d3683608...@syzkaller.appspotmail.com Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- fs/squashfs/xattr_id.c | 66 ++--- 1 file changed, 57 insertions(+), 9 deletions(-) --- a/fs/squashfs/xattr_id.c +++ b/fs/squashfs/xattr_id.c @@ -31,10 +31,15 @@ int squashfs_xattr_lookup(struct super_b struct squashfs_sb_info *msblk = sb->s_fs_info; int block = SQUASHFS_XATTR_BLOCK(index); int offset = SQUASHFS_XATTR_BLOCK_OFFSET(index); - u64 start_block = le64_to_cpu(msblk->xattr_id_table[block]); + u64 start_block; struct squashfs_xattr_id id; int err; + if (index >= msblk->xattr_ids) + return -EINVAL; + + start_block = le64_to_cpu(msblk->xattr_id_table[block]); + err = squashfs_read_metadata(sb, &id, &start_block, &offset, sizeof(id)); if (err < 0) @@ -50,13 +55,17 @@ int squashfs_xattr_lookup(struct super_b /* * Read uncompressed xattr 
id lookup table indexes from disk into memory */ -__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start, +__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start, u64 *xattr_table_start, int *xattr_ids) { - unsigned int len; + struct squashfs_sb_info *msblk = sb->s_fs_info; + unsigned int len, indexes; struct squashfs_xattr_id_table *id_table; + __le64 *table; + u64 start, end; + int n; - id_table = squashfs_read_table(sb, start, sizeof(*id_table)); + id_table = squashfs_read_table(sb, table_start, sizeof(*id_table)); if (IS_ERR(id_table)) return (__le64 *) id_table; @@ -70,13 +79,52 @@ __le64 *squashfs_read_xattr_id_table(str if (*xattr_ids == 0) return ERR_PTR(-EINVAL); - /* xattr_table should be less than start */ - if (*xattr_table_start >= start) + len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids); + indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids); + + /* +* The computed size of the index table (len bytes) should exactly +* match the table start and end points +*/ + start = table_start + sizeof(*id_table); + end = msblk->bytes_used; + + if (len != (end - start)) return ERR_PTR(-EINVAL); - len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids); + table = squashfs_read_table(sb, start, len); + if (IS_ERR(table)) + return table; + + /* table[0], table[1], ... table[indexes - 1] store the locations +* of the compressed xattr id blocks. Each entry should be less than +* the next (i.e. table[0] < table[1]), and the difference between them +* should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1] +* should be less than table_start, and again the difference +* shouls be SQUASHFS_METADATA_SIZE or less. +* +* Finally xattr_table_start should be less than table[0]. 
+*/ + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } - TRACE("In read_xattr_index_table, length %d\n", len); + if (*xattr_table_start >= le64_to_cpu(table[0])) { + kfree(table); +
[PATCH 5.4 19/24] regulator: Fix lockdep warning resolving supplies
From: Mark Brown [ Upstream commit 14a71d509ac809dcf56d7e3ca376b15d17bd0ddd ] With commit eaa7995c529b54 (regulator: core: avoid regulator_resolve_supply() race condition) we started holding the rdev lock while resolving supplies, an operation that requires holding the regulator_list_mutex. This results in lockdep warnings since in other places we take the list mutex then the mutex on an individual rdev. Since the goal is to make sure that we don't call set_supply() twice rather than a concern about the cost of resolution pull the rdev lock and check for duplicate resolution down to immediately before we do the set_supply() and drop it again once the allocation is done. Fixes: eaa7995c529b54 (regulator: core: avoid regulator_resolve_supply() race condition) Reported-by: Marek Szyprowski Tested-by: Marek Szyprowski Signed-off-by: Mark Brown Link: https://lore.kernel.org/r/20210122132042.10306-1-broo...@kernel.org Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/regulator/core.c | 29 + 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 5e0490e18b46a..5b9d570df85cc 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1782,17 +1782,6 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) if (rdev->supply) return 0; - /* -* Recheck rdev->supply with rdev->mutex lock held to avoid a race -* between rdev->supply null check and setting rdev->supply in -* set_supply() from concurrent tasks. -*/ - regulator_lock(rdev); - - /* Supply just resolved by a concurrent task? */ - if (rdev->supply) - goto out; - r = regulator_dev_lookup(dev, rdev->supply_name); if (IS_ERR(r)) { ret = PTR_ERR(r); @@ -1844,12 +1833,29 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) goto out; } + /* +* Recheck rdev->supply with rdev->mutex lock held to avoid a race +* between rdev->supply null check and setting rdev->supply in +* set_supply() from concurrent tasks. 
+*/ + regulator_lock(rdev); + + /* Supply just resolved by a concurrent task? */ + if (rdev->supply) { + regulator_unlock(rdev); + put_device(&r->dev); + goto out; + } + ret = set_supply(rdev, r); if (ret < 0) { + regulator_unlock(rdev); put_device(&r->dev); goto out; } + regulator_unlock(rdev); + /* * In set_machine_constraints() we may have turned this regulator on * but we couldn't propagate to the supply if it hadn't been resolved @@ -1865,7 +1871,6 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) } out: - regulator_unlock(rdev); return ret; } -- 2.27.0
[PATCH 5.4 04/24] mac80211: 160MHz with extended NSS BW in CSA
From: Shay Bar [ Upstream commit dcf3c8fb32ddbfa3b8227db38aa6746405bd4527 ] Upon receiving CSA with 160MHz extended NSS BW from associated AP, STA should set the HT operation_mode based on new_center_freq_seg1 because it is later used as ccfs2 in ieee80211_chandef_vht_oper(). Signed-off-by: Aviad Brikman Signed-off-by: Shay Bar Link: https://lore.kernel.org/r/20201222064714.24888-1-shay@celeno.com Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin --- net/mac80211/spectmgmt.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/net/mac80211/spectmgmt.c b/net/mac80211/spectmgmt.c index 5fe2b645912f6..132f8423addaa 100644 --- a/net/mac80211/spectmgmt.c +++ b/net/mac80211/spectmgmt.c @@ -132,16 +132,20 @@ int ieee80211_parse_ch_switch_ie(struct ieee80211_sub_if_data *sdata, } if (wide_bw_chansw_ie) { + u8 new_seg1 = wide_bw_chansw_ie->new_center_freq_seg1; struct ieee80211_vht_operation vht_oper = { .chan_width = wide_bw_chansw_ie->new_channel_width, .center_freq_seg0_idx = wide_bw_chansw_ie->new_center_freq_seg0, - .center_freq_seg1_idx = - wide_bw_chansw_ie->new_center_freq_seg1, + .center_freq_seg1_idx = new_seg1, /* .basic_mcs_set doesn't matter */ }; - struct ieee80211_ht_operation ht_oper = {}; + struct ieee80211_ht_operation ht_oper = { + .operation_mode = + cpu_to_le16(new_seg1 << + IEEE80211_HT_OP_MODE_CCFS2_SHIFT), + }; /* default, for the case of IEEE80211_VHT_CHANWIDTH_USE_HT, * to the previously parsed chandef -- 2.27.0
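The new ht_oper initializer places new_center_freq_seg1 in the CCFS2 bit field of the HT operation mode. Here is a small sketch of that packing and of how it would be read back. The shift and mask values are assumptions believed to match the kernel's IEEE80211_HT_OP_MODE_CCFS2_SHIFT and IEEE80211_HT_OP_MODE_CCFS2_MASK, the helper names are hypothetical, and the cpu_to_le16() conversion is left out, so the values are in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the kernel's IEEE80211_HT_OP_MODE_CCFS2_* constants. */
#define HT_OP_MODE_CCFS2_SHIFT 5
#define HT_OP_MODE_CCFS2_MASK  0x1fe0

/* Pack the seg1 channel index into an otherwise empty operation mode. */
static inline uint16_t ht_op_mode_from_seg1(uint8_t new_seg1)
{
    return (uint16_t)((uint16_t)new_seg1 << HT_OP_MODE_CCFS2_SHIFT);
}

/* Extract CCFS2 again, as a consumer of the operation mode would. */
static inline uint8_t ccfs2_from_ht_op_mode(uint16_t op_mode)
{
    return (uint8_t)((op_mode & HT_OP_MODE_CCFS2_MASK) >> HT_OP_MODE_CCFS2_SHIFT);
}

/* Round trip for an arbitrary example channel index. */
static inline void ccfs2_selfcheck(void)
{
    assert(ccfs2_from_ht_op_mode(ht_op_mode_from_seg1(50)) == 50);
}
```

With an empty ht_oper, as before the patch, the extracted CCFS2 value would always be 0, whatever the CSA announced.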
[PATCH 5.4 23/24] squashfs: add more sanity checks in inode lookup
From: Phillip Lougher

commit eabac19e40c095543def79cb6ffeb3a8588aaff4 upstream.

Sysbot has reported an "slab-out-of-bounds read" error which has been
identified as being caused by a corrupted "ino_num" value read from the
inode.  This could be because the metadata block is uncompressed, or
because the "compression" bit has been corrupted (turning a compressed
block into an uncompressed block).

This patch adds additional sanity checks to detect this, and the
following corruption.

1. It checks against corruption of the inodes count.  This can either
   lead to a larger table to be read, or a smaller than expected table to
   be read.

   In the case of a too large inodes count, this would often have been
   trapped by the existing sanity checks, but this patch introduces a
   more exact check, which can identify too small values.

2. It checks the contents of the index table for corruption.

[phil...@squashfs.org.uk: fix checkpatch issue]
  Link: https://lkml.kernel.org/r/527909353.754618.1612769948...@webmail.123-reg.co.uk
Link: https://lkml.kernel.org/r/20210204130249.4495-4-phil...@squashfs.org.uk
Signed-off-by: Phillip Lougher
Reported-by: syzbot+04419e3ff19d2970e...@syzkaller.appspotmail.com
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 fs/squashfs/export.c | 41 +
 1 file changed, 33 insertions(+), 8 deletions(-)

--- a/fs/squashfs/export.c
+++ b/fs/squashfs/export.c
@@ -41,12 +41,17 @@ static long long squashfs_inode_lookup(s
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
 	int blk = SQUASHFS_LOOKUP_BLOCK(ino_num - 1);
 	int offset = SQUASHFS_LOOKUP_BLOCK_OFFSET(ino_num - 1);
-	u64 start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+	u64 start;
 	__le64 ino;
 	int err;
 
 	TRACE("Entered squashfs_inode_lookup, inode_number = %d\n", ino_num);
 
+	if (ino_num == 0 || (ino_num - 1) >= msblk->inodes)
+		return -EINVAL;
+
+	start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+
 	err = squashfs_read_metadata(sb, &ino, &start, &offset, sizeof(ino));
 	if (err < 0)
 		return err;
@@ -111,7 +116,10 @@ __le64 *squashfs_read_inode_lookup_table
 		u64 lookup_table_start, u64 next_table, unsigned int inodes)
 {
 	unsigned int length = SQUASHFS_LOOKUP_BLOCK_BYTES(inodes);
+	unsigned int indexes = SQUASHFS_LOOKUP_BLOCKS(inodes);
+	int n;
 	__le64 *table;
+	u64 start, end;
 
 	TRACE("In read_inode_lookup_table, length %d\n", length);
 
@@ -121,20 +129,37 @@ __le64 *squashfs_read_inode_lookup_table
 	if (inodes == 0)
 		return ERR_PTR(-EINVAL);
 
-	/* length bytes should not extend into the next table - this check
-	 * also traps instances where lookup_table_start is incorrectly larger
-	 * than the next table start
+	/*
+	 * The computed size of the lookup table (length bytes) should exactly
+	 * match the table start and end points
 	 */
-	if (lookup_table_start + length > next_table)
+	if (length != (next_table - lookup_table_start))
 		return ERR_PTR(-EINVAL);
 
 	table = squashfs_read_table(sb, lookup_table_start, length);
+	if (IS_ERR(table))
+		return table;
 
 	/*
-	 * table[0] points to the first inode lookup table metadata block,
-	 * this should be less than lookup_table_start
+	 * table[0], table[1], ... table[indexes - 1] store the locations
+	 * of the compressed inode lookup blocks.  Each entry should be
+	 * less than the next (i.e. table[0] < table[1]), and the difference
+	 * between them should be SQUASHFS_METADATA_SIZE or less.
+	 * table[indexes - 1] should be less than lookup_table_start, and
+	 * again the difference should be SQUASHFS_METADATA_SIZE or less
 	 */
-	if (!IS_ERR(table) && le64_to_cpu(table[0]) >= lookup_table_start) {
+	for (n = 0; n < (indexes - 1); n++) {
+		start = le64_to_cpu(table[n]);
+		end = le64_to_cpu(table[n + 1]);
+
+		if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+			kfree(table);
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	start = le64_to_cpu(table[indexes - 1]);
+	if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) {
 		kfree(table);
 		return ERR_PTR(-EINVAL);
 	}
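For readers who want to try the index table rule outside the kernel, the check described in the comment above can be sketched as a small userspace function. This is only an illustration, not the kernel code: index_table_is_sane() and METADATA_SIZE are names made up here, the table is assumed to be already converted to host byte order, and 8192 is used as a stand-in for SQUASHFS_METADATA_SIZE.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for the kernel's SQUASHFS_METADATA_SIZE. */
#define METADATA_SIZE 8192u

/*
 * Hypothetical helper: returns true when the index table is plausible.
 *  - entries are strictly increasing,
 *  - neighbouring entries are at most METADATA_SIZE apart,
 *  - the last entry lies below table_start and within METADATA_SIZE of it.
 * 'table' holds block locations already converted to host byte order.
 */
static bool index_table_is_sane(const uint64_t *table, size_t indexes,
				uint64_t table_start)
{
	if (indexes == 0)
		return false;

	for (size_t n = 0; n + 1 < indexes; n++) {
		uint64_t start = table[n];
		uint64_t end = table[n + 1];

		if (start >= end || end - start > METADATA_SIZE)
			return false;
	}

	uint64_t last = table[indexes - 1];

	return last < table_start && table_start - last <= METADATA_SIZE;
}
```

Comparing start >= end before subtracting keeps the unsigned difference from wrapping, which is the same ordering the patch uses.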
[PATCH 5.4 22/24] squashfs: add more sanity checks in id lookup
From: Phillip Lougher commit f37aa4c7366e23f91b81d00bafd6a7ab54e4a381 upstream. Sysbot has reported a number of "slab-out-of-bounds reads" and "use-after-free read" errors which has been identified as being caused by a corrupted index value read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This patch adds additional sanity checks to detect this, and the following corruption. 1. It checks against corruption of the ids count. This can either lead to a larger table to be read, or a smaller than expected table to be read. In the case of a too large ids count, this would often have been trapped by the existing sanity checks, but this patch introduces a more exact check, which can identify too small values. 2. It checks the contents of the index table for corruption. Link: https://lkml.kernel.org/r/20210204130249.4495-3-phil...@squashfs.org.uk Signed-off-by: Phillip Lougher Reported-by: syzbot+b06d57ba83f604522...@syzkaller.appspotmail.com Reported-by: syzbot+c021ba012da41ee98...@syzkaller.appspotmail.com Reported-by: syzbot+5024636e8b5fd19f0...@syzkaller.appspotmail.com Reported-by: syzbot+bcbc661df46657d0f...@syzkaller.appspotmail.com Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- fs/squashfs/id.c | 40 fs/squashfs/squashfs_fs_sb.h |1 + fs/squashfs/super.c |6 +++--- fs/squashfs/xattr.h | 10 +- 4 files changed, 45 insertions(+), 12 deletions(-) --- a/fs/squashfs/id.c +++ b/fs/squashfs/id.c @@ -35,10 +35,15 @@ int squashfs_get_id(struct super_block * struct squashfs_sb_info *msblk = sb->s_fs_info; int block = SQUASHFS_ID_BLOCK(index); int offset = SQUASHFS_ID_BLOCK_OFFSET(index); - u64 start_block = le64_to_cpu(msblk->id_table[block]); + u64 start_block; __le32 disk_id; int err; + if (index >= msblk->ids) + return -EINVAL; + + start_block = le64_to_cpu(msblk->id_table[block]); + err 
= squashfs_read_metadata(sb, &disk_id, &start_block, &offset, sizeof(disk_id)); if (err < 0) @@ -56,7 +61,10 @@ __le64 *squashfs_read_id_index_table(str u64 id_table_start, u64 next_table, unsigned short no_ids) { unsigned int length = SQUASHFS_ID_BLOCK_BYTES(no_ids); + unsigned int indexes = SQUASHFS_ID_BLOCKS(no_ids); + int n; __le64 *table; + u64 start, end; TRACE("In read_id_index_table, length %d\n", length); @@ -67,20 +75,36 @@ __le64 *squashfs_read_id_index_table(str return ERR_PTR(-EINVAL); /* -* length bytes should not extend into the next table - this check -* also traps instances where id_table_start is incorrectly larger -* than the next table start +* The computed size of the index table (length bytes) should exactly +* match the table start and end points */ - if (id_table_start + length > next_table) + if (length != (next_table - id_table_start)) return ERR_PTR(-EINVAL); table = squashfs_read_table(sb, id_table_start, length); + if (IS_ERR(table)) + return table; /* -* table[0] points to the first id lookup table metadata block, this -* should be less than id_table_start +* table[0], table[1], ... table[indexes - 1] store the locations +* of the compressed id blocks. Each entry should be less than +* the next (i.e. table[0] < table[1]), and the difference between them +* should be SQUASHFS_METADATA_SIZE or less. 
table[indexes - 1] +* should be less than id_table_start, and again the difference +* should be SQUASHFS_METADATA_SIZE or less */ - if (!IS_ERR(table) && le64_to_cpu(table[0]) >= id_table_start) { + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= id_table_start || (id_table_start - start) > SQUASHFS_METADATA_SIZE) { kfree(table); return ERR_PTR(-EINVAL); } --- a/fs/squashfs/squashfs_fs_sb.h +++ b/fs/squashfs/squashfs_fs_sb.h @@ -64,5 +64,6 @@ struct squashfs_sb_info { unsigned intinodes; unsigned intfragments; int xattr_ids; + unsigned int
[PATCH 5.4 05/24] ASoC: Intel: Skylake: Zero snd_ctl_elem_value
From: Ricardo Ribalda

[ Upstream commit 1d8fe0648e118fd495a2cb393a34eb8d428e7808 ]

Clear struct snd_ctl_elem_value before calling ->put() to avoid any data
leak.

Signed-off-by: Ricardo Ribalda
Reviewed-by: Cezary Rojewski
Reviewed-by: Andy Shevchenko
Link: https://lore.kernel.org/r/20210121171644.131059-2-riba...@chromium.org
Signed-off-by: Mark Brown
Signed-off-by: Sasha Levin
---
 sound/soc/intel/skylake/skl-topology.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/skylake/skl-topology.c b/sound/soc/intel/skylake/skl-topology.c
index 2cb719893324a..1940b17f27efa 100644
--- a/sound/soc/intel/skylake/skl-topology.c
+++ b/sound/soc/intel/skylake/skl-topology.c
@@ -3632,7 +3632,7 @@ static void skl_tplg_complete(struct snd_soc_component *component)
 		sprintf(chan_text, "c%d", mach->mach_params.dmic_num);
 
 		for (i = 0; i < se->items; i++) {
-			struct snd_ctl_elem_value val;
+			struct snd_ctl_elem_value val = {};
 
 			if (strstr(texts[i], chan_text)) {
 				val.value.enumerated.item[0] = i;
-- 
2.27.0
Re: [PATCH ghak124 v3] audit: log nftables configuration change events
Hi,

On Thu, Jun 04, 2020 at 09:20:49AM -0400, Richard Guy Briggs wrote:
> iptables, ip6tables, arptables and ebtables table registration,
> replacement and unregistration configuration events are logged for the
> native (legacy) iptables setsockopt api, but not for the
> nftables netlink api which is used by the nft-variant of iptables in
> addition to nftables itself.
>
> Add calls to log the configuration actions in the nftables netlink api.

As discussed offline already, these audit notifications are pretty hefty
performance-wise. In an internal report, 300% restore time of a ruleset
containing 70k set elements is measured.

If I'm not mistaken, iptables emits a single audit log per table, ipset
doesn't support audit at all. So I wonder how much audit logging is
required at all (for certification or whatever reason). How much
granularity is desired?

I personally would notify once per transaction. This is easy and quick.
Once per table or chain should be acceptable, as well. At the very
least, we should not have to notify once per each element. This is the
last resort of fast ruleset adjustments. If we lose it, people are
better off with ipset IMHO.

Unlike nft monitor, auditd is not designed to be disabled "at will". So
turning it off for performance-critical workloads is no option.

Cheers, Phil
[PATCH 4.19 12/24] iwlwifi: pcie: fix context info memory leak
From: Johannes Berg

[ Upstream commit 2d6bc752cc2806366d9a4fd577b3f6c1f7a7e04e ]

If the image loader allocation fails, we leak all the previously
allocated memory. Fix this.

Signed-off-by: Johannes Berg
Signed-off-by: Luca Coelho
Signed-off-by: Kalle Valo
Link: https://lore.kernel.org/r/iwlwifi.20210115130252.97172cbaa67c.I3473233d0ad01a71aa9400832fb2b9f494d88a11@changeid
Signed-off-by: Sasha Levin
---
 .../net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c
index 6783b20d9681b..a1cecf4a0e820 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c
@@ -159,8 +159,10 @@ int iwl_pcie_ctxt_info_gen3_init(struct iwl_trans *trans,
 	/* Allocate IML */
 	iml_img = dma_alloc_coherent(trans->dev, trans->iml_len,
 				     &trans_pcie->iml_dma_addr, GFP_KERNEL);
-	if (!iml_img)
-		return -ENOMEM;
+	if (!iml_img) {
+		ret = -ENOMEM;
+		goto err_free_ctxt_info;
+	}
 
 	memcpy(iml_img, trans->iml, trans->iml_len);
 
@@ -177,6 +179,11 @@ int iwl_pcie_ctxt_info_gen3_init(struct iwl_trans *trans,
 
 	return 0;
 
+err_free_ctxt_info:
+	dma_free_coherent(trans->dev, sizeof(*trans_pcie->ctxt_info_gen3),
+			  trans_pcie->ctxt_info_gen3,
+			  trans_pcie->ctxt_info_dma_addr);
+	trans_pcie->ctxt_info_gen3 = NULL;
 err_free_prph_info:
 	dma_free_coherent(trans->dev,
 			  sizeof(*prph_info),
-- 
2.27.0
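The shape of this fix, where a late allocation failure jumps to a label that unwinds the earlier allocation instead of returning directly, can be tried in userspace with a minimal sketch. Everything below is illustrative: struct ctx, ctx_init(), alloc_fn and failing_alloc() are hypothetical names, and malloc()/free() stand in for the DMA allocation calls used by the driver.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context with two allocations made in sequence. */
struct ctx {
	void *info;	/* first allocation */
	void *image;	/* second allocation, may fail */
};

/* Hypothetical allocator hook so that a failure can be injected. */
typedef void *(*alloc_fn)(size_t);

static int ctx_init(struct ctx *c, alloc_fn alloc_image, size_t image_len)
{
	int ret;

	c->info = malloc(64);
	if (!c->info)
		return -ENOMEM;

	c->image = alloc_image(image_len);
	if (!c->image) {
		/* Do not return directly: the first allocation would leak. */
		ret = -ENOMEM;
		goto err_free_info;
	}
	memset(c->image, 0, image_len);

	return 0;

err_free_info:
	free(c->info);
	c->info = NULL;
	return ret;
}

/* Always fails, to exercise the error path. */
static void *failing_alloc(size_t len)
{
	(void)len;
	return NULL;
}
```

Clearing the pointer after freeing mirrors the "ctxt_info_gen3 = NULL" line in the patch, so later teardown code does not free it a second time.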
[PATCH 5.4 20/24] bpf: Fix 32 bit src register truncation on div/mod
From: Daniel Borkmann commit e88b2c6e5a4d9ce30d75391e4d950da74bb2bd90 upstream. While reviewing a different fix, John and I noticed an oddity in one of the BPF program dumps that stood out, for example: # bpftool p d x i 13 0: (b7) r0 = 808464450 1: (b4) w4 = 808464432 2: (bc) w0 = w0 3: (15) if r0 == 0x0 goto pc+1 4: (9c) w4 %= w0 [...] In line 2 we noticed that the mov32 would 32 bit truncate the original src register for the div/mod operation. While for the two operations the dst register is typically marked unknown e.g. from adjust_scalar_min_max_vals() the src register is not, and thus verifier keeps tracking original bounds, simplified: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r0 = -1 1: R0_w=invP-1 R1=ctx(id=0,off=0,imm=0) R10=fp0 1: (b7) r1 = -1 2: R0_w=invP-1 R1_w=invP-1 R10=fp0 2: (3c) w0 /= w1 3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0x)) R1_w=invP-1 R10=fp0 3: (77) r1 >>= 32 4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0x)) R1_w=invP4294967295 R10=fp0 4: (bf) r0 = r1 5: R0_w=invP4294967295 R1_w=invP4294967295 R10=fp0 5: (95) exit processed 6 insns (limit 100) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 Runtime result of r0 at exit is 0 instead of expected -1. Remove the verifier mov32 src rewrite in div/mod and replace it with a jmp32 test instead. 
After the fix, we result in the following code generation when having dividend r1 and divisor r6: div, 64 bit: div, 32 bit: 0: (b7) r6 = 8 0: (b7) r6 = 8 1: (b7) r1 = 8 1: (b7) r1 = 8 2: (55) if r6 != 0x0 goto pc+2 2: (56) if w6 != 0x0 goto pc+2 3: (ac) w1 ^= w1 3: (ac) w1 ^= w1 4: (05) goto pc+14: (05) goto pc+1 5: (3f) r1 /= r6 5: (3c) w1 /= w6 6: (b7) r0 = 0 6: (b7) r0 = 0 7: (95) exit 7: (95) exit mod, 64 bit: mod, 32 bit: 0: (b7) r6 = 8 0: (b7) r6 = 8 1: (b7) r1 = 8 1: (b7) r1 = 8 2: (15) if r6 == 0x0 goto pc+1 2: (16) if w6 == 0x0 goto pc+1 3: (9f) r1 %= r6 3: (9c) w1 %= w6 4: (b7) r0 = 0 4: (b7) r0 = 0 5: (95) exit 5: (95) exit x86 in particular can throw a 'divide error' exception for div instruction not only for divisor being zero, but also for the case when the quotient is too large for the designated register. For the edx:eax and rdx:rax dividend pair it is not an issue in x86 BPF JIT since we always zero edx (rdx). Hence really the only protection needed is against divisor being zero. Fixes: 68fda450a7df ("bpf: fix 32-bit divide by zero") Co-developed-by: John Fastabend Signed-off-by: John Fastabend Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman --- kernel/bpf/verifier.c | 28 +--- 1 file changed, 13 insertions(+), 15 deletions(-) --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9002,30 +9002,28 @@ static int fixup_bpf_calls(struct bpf_ve insn->code == (BPF_ALU | BPF_MOD | BPF_X) || insn->code == (BPF_ALU | BPF_DIV | BPF_X)) { bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; - struct bpf_insn mask_and_div[] = { - BPF_MOV32_REG(insn->src_reg, insn->src_reg), + bool isdiv = BPF_OP(insn->code) == BPF_DIV; + struct bpf_insn *patchlet; + struct bpf_insn chk_and_div[] = { /* Rx div 0 -> 0 */ - BPF_JMP_IMM(BPF_JNE, insn->src_reg, 0, 2), + BPF_RAW_INSN((is64 ? 
BPF_JMP : BPF_JMP32) | +BPF_JNE | BPF_K, insn->src_reg, +0, 2, 0), BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), BPF_JMP_IMM(BPF_JA, 0, 0, 1), *insn, }; - struct bpf_insn mask_and_mod[] = { - BPF_MOV32_REG(insn->src_reg, insn->src_reg), + struct bpf_insn chk_and_mod[] = { /* Rx mod 0 -> Rx */ - BPF_JMP_IMM(BPF_JEQ, insn->src_reg, 0, 1), + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | +BPF_JEQ | BPF_K, insn->src_reg, +0, 1, 0), *insn, }; - struct bpf_ins
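The register-level effect described in the commit message can be replayed in plain C. The sketch below is an assumption-laden model, not verifier or JIT code: alu32_div() and run_example() are hypothetical helpers, registers are modelled as uint64_t, and the old mov32 rewrite is modelled as a cast to uint32_t.

```c
#include <stdint.h>

/*
 * Hypothetical model of a 32-bit BPF division with the "Rx div 0 -> 0"
 * rule. Only the low 32 bits of each register take part, which is why
 * the zero test has to be a 32-bit (jmp32) test as well.
 */
static uint64_t alu32_div(uint64_t dst, uint64_t src)
{
	uint32_t d = (uint32_t)dst;
	uint32_t s = (uint32_t)src;

	return s ? d / s : 0;
}

/*
 * Replays the example from the commit message:
 *   r0 = -1; r1 = -1; w0 /= w1; r1 >>= 32; r0 = r1; exit
 * With old_rewrite set, the source register is first overwritten by
 * "w1 = w1", which zeroes its upper 32 bits behind the verifier's back.
 */
static uint64_t run_example(int old_rewrite)
{
	uint64_t r0 = UINT64_MAX;
	uint64_t r1 = UINT64_MAX;

	if (old_rewrite)
		r1 = (uint32_t)r1;	/* mov32 truncation of the src register */

	r0 = alu32_div(r0, r1);
	r1 >>= 32;
	r0 = r1;

	return r0;
}
```

In this model run_example(1) yields 0 while run_example(0) yields 0xffffffff, matching the mismatch the commit message reports. A source value such as 0x100000000 also shows why a 64-bit zero test is not enough for the 32-bit case: its low half is zero even though the full register is not.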
[PATCH 4.19 00/24] 4.19.176-rc1 review
This is the start of the stable review cycle for the 4.19.176 release.
There are 24 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sat, 13 Feb 2021 15:01:39 +.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.176-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman
    Linux 4.19.176-rc1

Phillip Lougher
    squashfs: add more sanity checks in xattr id lookup

Phillip Lougher
    squashfs: add more sanity checks in inode lookup

Phillip Lougher
    squashfs: add more sanity checks in id lookup

Ming Lei
    blk-mq: don't hold q->sysfs_lock in blk_mq_map_swqueue

Ming Lei
    block: don't hold q->sysfs_lock in elevator_init_mq

Peter Gonda
    Fix unsynchronized access to sev members through svm_register_enc_region

Theodore Ts'o
    memcg: fix a crash in wb_workfn when a device disappears

Qian Cai
    include/trace/events/writeback.h: fix -Wstringop-truncation warnings

Tobin C. Harding
    lib/string: Add strscpy_pad() function

Dave Wysochanski
    SUNRPC: Handle 0 length opaque XDR object data properly

Dave Wysochanski
    SUNRPC: Move simple_get_bytes and simple_get_netobj into private header

Johannes Berg
    iwlwifi: mvm: guard against device removal in reprobe

Johannes Berg
    iwlwifi: pcie: fix context info memory leak

Emmanuel Grumbach
    iwlwifi: pcie: add a NULL check in iwl_pcie_txq_unmap

Johannes Berg
    iwlwifi: mvm: take mutex for calling iwl_mvm_get_sync_time()

Trond Myklebust
    pNFS/NFSv4: Try to return invalid layout in pnfs_layout_process()

Pan Bian
    chtls: Fix potential resource leak

David Collins
    regulator: core: avoid regulator_resolve_supply() race condition

Cong Wang
    af_key: relax availability checks for skb size calculation

Sibi Sankar
    remoteproc: qcom_q6v5_mss: Validate MBA firmware size before load

Sibi Sankar
    remoteproc: qcom_q6v5_mss: Validate modem blob firmware size before load

Steven Rostedt (VMware)
    fgraph: Initialize tracing_graph_pause at task creation

zhengbin
    block: fix NULL pointer dereference in register_disk

Masami Hiramatsu
    tracing/kprobe: Fix to support kretprobe events on unloaded modules

-------------

Diffstat:

 Makefile                                           |  4 +-
 arch/x86/kvm/svm.c                                 | 18 +++---
 block/blk-mq.c                                     |  7 ---
 block/elevator.c                                   | 14 ++---
 block/genhd.c                                      | 10 ++--
 drivers/crypto/chelsio/chtls/chtls_cm.c            |  7 +--
 .../net/wireless/intel/iwlwifi/mvm/debugfs-vif.c   |  3 +
 drivers/net/wireless/intel/iwlwifi/mvm/ops.c       |  3 +-
 .../wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c   | 11 +++-
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c       |  5 ++
 drivers/regulator/core.c                           | 39 +
 drivers/remoteproc/qcom_q6v5_pil.c                 | 11 +++-
 fs/fs-writeback.c                                  |  2 +-
 fs/nfs/pnfs.c                                      |  8 ++-
 fs/squashfs/export.c                               | 41 +++---
 fs/squashfs/id.c                                   | 40 ++---
 fs/squashfs/squashfs_fs_sb.h                       |  1 +
 fs/squashfs/super.c                                |  6 +-
 fs/squashfs/xattr.h                                | 10 +++-
 fs/squashfs/xattr_id.c                             | 66 +++---
 include/linux/backing-dev.h                        | 10
 include/linux/kprobes.h                            |  2 +-
 include/linux/string.h                             |  4 ++
 include/linux/sunrpc/xdr.h                         |  3 +-
 include/trace/events/writeback.h                   | 35 ++--
 init/init_task.c                                   |  3 +-
 kernel/kprobes.c                                   | 34 ---
 kernel/trace/ftrace.c                              |  2 -
 kernel/trace/trace_kprobe.c                        |  4 +-
 lib/string.c                                       | 47 ---
 mm/backing-dev.c                                   |  1 +
 net/key/af_key.c                                   |  6 +-
 net/sunrpc/auth_gss/auth_gss.c                     | 30 +-
 net/sunrpc/auth_gss/auth_gss_internal.h            | 45 +++
 net/sunrpc/auth_gss/gss_krb5_mech.c                | 31 +-
 35 files changed, 379 insertions(+), 184 deletions(-)
[PATCH 4.19 02/24] block: fix NULL pointer dereference in register_disk
From: zhengbin

commit 4d7c1d3fd7c7eda7dea351f071945e843a46c145 upstream.

If __device_add_disk-->bdi_register_owner-->bdi_register-->
bdi_register_va-->device_create_vargs fails, bdi->dev is still
NULL, __device_add_disk-->register_disk will visit bdi->dev->kobj.
This patch fixes that.

Signed-off-by: zhengbin
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Jack Wang
---
 block/genhd.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

--- a/block/genhd.c
+++ b/block/genhd.c
@@ -652,10 +652,12 @@ exit:
 		kobject_uevent(&part_to_dev(part)->kobj, KOBJ_ADD);
 	disk_part_iter_exit(&piter);
 
-	err = sysfs_create_link(&ddev->kobj,
-				&disk->queue->backing_dev_info->dev->kobj,
-				"bdi");
-	WARN_ON(err);
+	if (disk->queue->backing_dev_info->dev) {
+		err = sysfs_create_link(&ddev->kobj,
+			  &disk->queue->backing_dev_info->dev->kobj,
+			  "bdi");
+		WARN_ON(err);
+	}
 }
 
 /**
[PATCH 5.4 02/24] af_key: relax availability checks for skb size calculation
From: Cong Wang

[ Upstream commit afbc293add6466f8f3f0c3d944d85f53709c170f ]

xfrm_probe_algs() probes kernel crypto modules and changes the
availability of struct xfrm_algo_desc. But there is a small window
where ealg->available and aalg->available get changed between
count_ah_combs()/count_esp_combs() and dump_ah_combs()/dump_esp_combs(),
in this case we may allocate a smaller skb but later put a larger
amount of data and trigger the panic in skb_put().

Fix this by relaxing the checks when counting the size, that is,
skipping the test of ->available. We may waste some memory for a few
of sizeof(struct sadb_comb), but it is still much better than a panic.

Reported-by: syzbot+b2bf2652983d23734...@syzkaller.appspotmail.com
Cc: Steffen Klassert
Cc: Herbert Xu
Signed-off-by: Cong Wang
Signed-off-by: Steffen Klassert
Signed-off-by: Sasha Levin
---
 net/key/af_key.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index a915bc86620af..907d04a474597 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -2902,7 +2902,7 @@ static int count_ah_combs(const struct xfrm_tmpl *t)
 			break;
 		if (!aalg->pfkey_supported)
 			continue;
-		if (aalg_tmpl_set(t, aalg) && aalg->available)
+		if (aalg_tmpl_set(t, aalg))
 			sz += sizeof(struct sadb_comb);
 	}
 	return sz + sizeof(struct sadb_prop);
@@ -2920,7 +2920,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t)
 		if (!ealg->pfkey_supported)
 			continue;
 
-		if (!(ealg_tmpl_set(t, ealg) && ealg->available))
+		if (!(ealg_tmpl_set(t, ealg)))
 			continue;
 
 		for (k = 1; ; k++) {
@@ -2931,7 +2931,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t)
 			if (!aalg->pfkey_supported)
 				continue;
 
-			if (aalg_tmpl_set(t, aalg) && aalg->available)
+			if (aalg_tmpl_set(t, aalg))
 				sz += sizeof(struct sadb_comb);
 		}
 	}
-- 
2.27.0
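The sizing argument in this commit message (the counting pass must never come out smaller than what the dump pass later writes) can be illustrated with a small model. This is a sketch under stated assumptions: struct algo, count_combs(), dump_combs() and COMB_SIZE are hypothetical, in_template stands in for aalg_tmpl_set()/ealg_tmpl_set(), and 72 is just a placeholder for sizeof(struct sadb_comb).

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder for sizeof(struct sadb_comb); the real value is not assumed. */
#define COMB_SIZE 72u

/* Hypothetical, reduced view of an algorithm descriptor. */
struct algo {
	bool in_template;	/* stands in for aalg_tmpl_set(t, aalg) */
	bool available;		/* may change between the two passes */
};

/*
 * Counting pass. With test_available set this models the old behaviour,
 * without it the relaxed behaviour introduced by the patch.
 */
static size_t count_combs(const struct algo *algs, size_t n,
			  bool test_available)
{
	size_t sz = 0;

	for (size_t i = 0; i < n; i++) {
		if (!algs[i].in_template)
			continue;
		if (test_available && !algs[i].available)
			continue;
		sz += COMB_SIZE;
	}
	return sz;
}

/* Dump pass: number of bytes written, always honouring ->available. */
static size_t dump_combs(const struct algo *algs, size_t n)
{
	return count_combs(algs, n, true);
}
```

If an algorithm becomes available after the strict count but before the dump, the dump needs more bytes than were reserved. The relaxed count is an upper bound for every possible setting of available, so that cannot happen, at the cost of sometimes reserving a little too much.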
[PATCH 4.19 11/24] iwlwifi: pcie: add a NULL check in iwl_pcie_txq_unmap
From: Emmanuel Grumbach

[ Upstream commit 98c7d21f957b10d9c07a3a60a3a5a8f326a197e5 ]

I hit a NULL pointer exception in this function when the
init flow went really bad.

Signed-off-by: Emmanuel Grumbach
Signed-off-by: Luca Coelho
Signed-off-by: Kalle Valo
Link: https://lore.kernel.org/r/iwlwifi.20210115130252.2e8da9f2c132.I0234d4b8ddaf70aaa5028a20c863255e05bc1f84@changeid
Signed-off-by: Sasha Levin
---
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
index b73582ec03a08..b1a71539ca3e5 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
@@ -631,6 +631,11 @@ static void iwl_pcie_txq_unmap(struct iwl_trans *trans, int txq_id)
 	struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
 	struct iwl_txq *txq = trans_pcie->txq[txq_id];
 
+	if (!txq) {
+		IWL_ERR(trans, "Trying to free a queue that wasn't allocated?\n");
+		return;
+	}
+
 	spin_lock_bh(&txq->lock);
 	while (txq->write_ptr != txq->read_ptr) {
 		IWL_DEBUG_TX_REPLY(trans, "Q %d Free %d\n",
-- 
2.27.0
[PATCH 5.4 07/24] pNFS/NFSv4: Try to return invalid layout in pnfs_layout_process()
From: Trond Myklebust

[ Upstream commit 08bd8dbe88825760e953759d7ec212903a026c75 ]

If the server returns a new stateid that does not match the one in our
cache, then try to return the one we hold instead of just invalidating
it on the client side. This ensures that both client and server will
agree that the stateid is invalid.

Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 fs/nfs/pnfs.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index ca1d98f274d12..e3a79e6958124 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2369,7 +2369,13 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
 			 * We got an entirely new state ID.  Mark all segments for the
 			 * inode invalid, and retry the layoutget
 			 */
-			pnfs_mark_layout_stateid_invalid(lo, &free_me);
+			struct pnfs_layout_range range = {
+				.iomode = IOMODE_ANY,
+				.length = NFS4_MAX_UINT64,
+			};
+			pnfs_set_plh_return_info(lo, IOMODE_ANY, 0);
+			pnfs_mark_matching_lsegs_return(lo, &lo->plh_return_segs,
+						&range, 0);
 			goto out_forget;
 		}
 
-- 
2.27.0
[PATCH 5.4 08/24] ASoC: ak4458: correct reset polarity
From: Eliot Blennerhassett [ Upstream commit e953daeb68b1abd8a7d44902786349fdeef5c297 ] Reset (aka power off) happens when the reset gpio is made active. Change function name to ak4458_reset to match devicetree property "reset-gpios" Signed-off-by: Eliot Blennerhassett Reviewed-by: Linus Walleij Link: https://lore.kernel.org/r/ce650f47-4ff6-e486-7846-cc3d033f3...@blennerhassett.gen.nz Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/codecs/ak4458.c | 22 +++--- 1 file changed, 7 insertions(+), 15 deletions(-) diff --git a/sound/soc/codecs/ak4458.c b/sound/soc/codecs/ak4458.c index 71562154c0b1e..217e8ce9a4ba4 100644 --- a/sound/soc/codecs/ak4458.c +++ b/sound/soc/codecs/ak4458.c @@ -523,18 +523,10 @@ static struct snd_soc_dai_driver ak4497_dai = { .ops = &ak4458_dai_ops, }; -static void ak4458_power_off(struct ak4458_priv *ak4458) +static void ak4458_reset(struct ak4458_priv *ak4458, bool active) { if (ak4458->reset_gpiod) { - gpiod_set_value_cansleep(ak4458->reset_gpiod, 0); - usleep_range(1000, 2000); - } -} - -static void ak4458_power_on(struct ak4458_priv *ak4458) -{ - if (ak4458->reset_gpiod) { - gpiod_set_value_cansleep(ak4458->reset_gpiod, 1); + gpiod_set_value_cansleep(ak4458->reset_gpiod, active); usleep_range(1000, 2000); } } @@ -548,7 +540,7 @@ static int ak4458_init(struct snd_soc_component *component) if (ak4458->mute_gpiod) gpiod_set_value_cansleep(ak4458->mute_gpiod, 1); - ak4458_power_on(ak4458); + ak4458_reset(ak4458, false); ret = snd_soc_component_update_bits(component, AK4458_00_CONTROL1, 0x80, 0x80); /* ACKS bit = 1; 1000 */ @@ -571,7 +563,7 @@ static void ak4458_remove(struct snd_soc_component *component) { struct ak4458_priv *ak4458 = snd_soc_component_get_drvdata(component); - ak4458_power_off(ak4458); + ak4458_reset(ak4458, true); } #ifdef CONFIG_PM @@ -581,7 +573,7 @@ static int __maybe_unused ak4458_runtime_suspend(struct device *dev) regcache_cache_only(ak4458->regmap, true); - ak4458_power_off(ak4458); + 
ak4458_reset(ak4458, true); if (ak4458->mute_gpiod) gpiod_set_value_cansleep(ak4458->mute_gpiod, 0); @@ -596,8 +588,8 @@ static int __maybe_unused ak4458_runtime_resume(struct device *dev) if (ak4458->mute_gpiod) gpiod_set_value_cansleep(ak4458->mute_gpiod, 1); - ak4458_power_off(ak4458); - ak4458_power_on(ak4458); + ak4458_reset(ak4458, true); + ak4458_reset(ak4458, false); regcache_cache_only(ak4458->regmap, false); regcache_mark_dirty(ak4458->regmap); -- 2.27.0
[PATCH 5.4 11/24] iwlwifi: pcie: add a NULL check in iwl_pcie_txq_unmap
From: Emmanuel Grumbach

[ Upstream commit 98c7d21f957b10d9c07a3a60a3a5a8f326a197e5 ]

I hit a NULL pointer exception in this function when the
init flow went really bad.

Signed-off-by: Emmanuel Grumbach
Signed-off-by: Luca Coelho
Signed-off-by: Kalle Valo
Link: https://lore.kernel.org/r/iwlwifi.20210115130252.2e8da9f2c132.I0234d4b8ddaf70aaa5028a20c863255e05bc1f84@changeid
Signed-off-by: Sasha Levin
---
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
index d3b58334e13ea..e7dcf8bc99b7c 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
@@ -657,6 +657,11 @@ static void iwl_pcie_txq_unmap(struct iwl_trans *trans, int txq_id)
 	struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
 	struct iwl_txq *txq = trans_pcie->txq[txq_id];
 
+	if (!txq) {
+		IWL_ERR(trans, "Trying to free a queue that wasn't allocated?\n");
+		return;
+	}
+
 	spin_lock_bh(&txq->lock);
 	while (txq->write_ptr != txq->read_ptr) {
 		IWL_DEBUG_TX_REPLY(trans, "Q %d Free %d\n",
-- 
2.27.0
[PATCH 5.4 10/24] iwlwifi: mvm: take mutex for calling iwl_mvm_get_sync_time()
From: Johannes Berg

[ Upstream commit 5c56d862c749669d45c256f581eac4244be00d4d ]

We need to take the mutex to call iwl_mvm_get_sync_time(), do it.

Signed-off-by: Johannes Berg
Signed-off-by: Luca Coelho
Signed-off-by: Kalle Valo
Link: https://lore.kernel.org/r/iwlwifi.20210115130252.4bb5ccf881a6.I62973cbb081e80aa5b0447a5c3b9c3251a65cf6b@changeid
Signed-off-by: Sasha Levin
---
 drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c b/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c
index f043eefabb4ec..7b1d2dac6ceb8 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c
@@ -514,7 +514,10 @@ static ssize_t iwl_dbgfs_os_device_timediff_read(struct file *file,
 	const size_t bufsz = sizeof(buf);
 	int pos = 0;
 
+	mutex_lock(&mvm->mutex);
 	iwl_mvm_get_sync_time(mvm, &curr_gp2, &curr_os);
+	mutex_unlock(&mvm->mutex);
+
 	do_div(curr_os, NSEC_PER_USEC);
 	diff = curr_os - curr_gp2;
 	pos += scnprintf(buf + pos, bufsz - pos, "diff=%lld\n", diff);
-- 
2.27.0
[PATCH 4.19 17/24] include/trace/events/writeback.h: fix -Wstringop-truncation warnings
From: Qian Cai [ Upstream commit d1a445d3b86c9341ce7a0954c23be0edb5c9bec5 ] There are many of those warnings. In file included from ./arch/powerpc/include/asm/paca.h:15, from ./arch/powerpc/include/asm/current.h:13, from ./include/linux/thread_info.h:21, from ./include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from ./include/linux/preempt.h:78, from ./include/linux/spinlock.h:51, from fs/fs-writeback.c:19: In function 'strncpy', inlined from 'perf_trace_writeback_page_template' at ./include/trace/events/writeback.h:56:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^ Fix it by using the new strscpy_pad() which was introduced in "lib/string: Add strscpy_pad() function" and will always be NUL-terminated instead of strncpy(). Also, change strlcpy() to use strscpy_pad() in this file for consistency. Link: http://lkml.kernel.org/r/1564075099-27750-1-git-send-email-...@lca.pw Fixes: 455b2864686d ("writeback: Initial tracing support") Fixes: 028c2dd184c0 ("writeback: Add tracing to balance_dirty_pages") Fixes: e84d0a4f8e39 ("writeback: trace event writeback_queue_io") Fixes: b48c104d2211 ("writeback: trace event bdi_dirty_ratelimit") Fixes: cc1676d917f3 ("writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()") Fixes: 9fb0a7da0c52 ("writeback: add more tracepoints") Signed-off-by: Qian Cai Reviewed-by: Jan Kara Cc: Tobin C. 
Harding Cc: Steven Rostedt (VMware) Cc: Ingo Molnar Cc: Tejun Heo Cc: Dave Chinner Cc: Fengguang Wu Cc: Jens Axboe Cc: Joe Perches Cc: Kees Cook Cc: Jann Horn Cc: Jonathan Corbet Cc: Nitin Gote Cc: Rasmus Villemoes Cc: Stephen Kitt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- include/trace/events/writeback.h | 38 +--- 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h index 146e7b3faa856..b463e2575117e 100644 --- a/include/trace/events/writeback.h +++ b/include/trace/events/writeback.h @@ -65,8 +65,9 @@ TRACE_EVENT(writeback_dirty_page, ), TP_fast_assign( - strncpy(__entry->name, - mapping ? dev_name(inode_to_bdi(mapping->host)->dev) : "(unknown)", 32); + strscpy_pad(__entry->name, + mapping ? dev_name(inode_to_bdi(mapping->host)->dev) : "(unknown)", + 32); __entry->ino = mapping ? mapping->host->i_ino : 0; __entry->index = page->index; ), @@ -95,8 +96,8 @@ DECLARE_EVENT_CLASS(writeback_dirty_inode_template, struct backing_dev_info *bdi = inode_to_bdi(inode); /* may be called for files on pseudo FSes w/ unregistered bdi */ - strncpy(__entry->name, - bdi->dev ? dev_name(bdi->dev) : "(unknown)", 32); + strscpy_pad(__entry->name, + bdi->dev ? dev_name(bdi->dev) : "(unknown)", 32); __entry->ino= inode->i_ino; __entry->state = inode->i_state; __entry->flags = flags; @@ -175,8 +176,8 @@ DECLARE_EVENT_CLASS(writeback_write_inode_template, ), TP_fast_assign( - strncpy(__entry->name, - dev_name(inode_to_bdi(inode)->dev), 32); + strscpy_pad(__entry->name, + dev_name(inode_to_bdi(inode)->dev), 32); __entry->ino= inode->i_ino; __entry->sync_mode = wbc->sync_mode; __entry->cgroup_ino = __trace_wbc_assign_cgroup(wbc); @@ -219,8 +220,9 @@ DECLARE_EVENT_CLASS(writeback_work_class, __field(unsigned int, cgroup_ino) ), TP_fast_assign( - strncpy(__entry->name, - wb->bdi->dev ? 
dev_name(wb->bdi->dev) : "(unknown)", 32); + strscpy_pad(__entry->name, + wb->bdi->dev ? dev_name(wb->bdi->dev) : + "(unknown)", 32); __entry->nr_pages = work->nr_pages; __entry->sb_dev = work->sb ? work->sb->s_dev : 0; __entry->sync_mode = work->sync_mode; @@ -273,7 +275,7 @@ DECLARE_EVENT_CLASS(writeback_class, __field(unsigned int, cgroup_ino) ), TP_fast_assign( - strncpy(__entry->name, dev_name(wb->bdi->dev), 32); + strscpy_pad(__entry->name, dev_name(wb->bdi->dev), 32); __entry->cgroup_ino = __trace_wb_assign_cgroup(wb); ), TP_printk("bdi %s: cgroup_ino=%
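The behaviour this patch relies on, always NUL-terminating and zero-filling the rest of the fixed-size name field, can be approximated in userspace. The function below is a sketch written for illustration only, not the kernel's strscpy_pad(): sketch_strscpy_pad() is a hypothetical name, and the return convention (copied length, or -E2BIG on truncation) follows the description of strscpy_pad() as understood here.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace approximation: copy at most count - 1 bytes of
 * src into dest, always terminate with NUL and zero the unused tail.
 * Returns the number of bytes copied, or -E2BIG if src was truncated
 * or count is 0.
 */
static long sketch_strscpy_pad(char *dest, const char *src, size_t count)
{
	size_t len = 0;
	int truncated;

	if (count == 0)
		return -E2BIG;

	while (len < count - 1 && src[len] != '\0') {
		dest[len] = src[len];
		len++;
	}
	truncated = src[len] != '\0';

	/* Writes the terminating NUL and any padding in one go. */
	memset(dest + len, 0, count - len);

	return truncated ? -E2BIG : (long)len;
}
```

Unlike strncpy() with a bound equal to the destination size, this form cannot leave the buffer unterminated, which is the situation the compiler warning quoted in the commit message points at.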
[PATCH 4.19 18/24] memcg: fix a crash in wb_workfn when a device disappears
From: Theodore Ts'o [ Upstream commit 68f23b89067fdf187763e75a56087550624fdbee ] Without memcg, there is a one-to-one mapping between the bdi and bdi_writeback structures. In this world, things are fairly straightforward; the first thing bdi_unregister() does is to shutdown the bdi_writeback structure (or wb), and part of that writeback ensures that no other work queued against the wb, and that the wb is fully drained. With memcg, however, there is a one-to-many relationship between the bdi and bdi_writeback structures; that is, there are multiple wb objects which can all point to a single bdi. There is a refcount which prevents the bdi object from being released (and hence, unregistered). So in theory, the bdi_unregister() *should* only get called once its refcount goes to zero (bdi_put will drop the refcount, and when it is zero, release_bdi gets called, which calls bdi_unregister). Unfortunately, del_gendisk() in block/gen_hd.c never got the memo about the Brave New memcg World, and calls bdi_unregister directly. It does this without informing the file system, or the memcg code, or anything else. This causes the root wb associated with the bdi to be unregistered, but none of the memcg-specific wb's are shutdown. So when one of these wb's are woken up to do delayed work, they try to dereference their wb->bdi->dev to fetch the device name, but unfortunately bdi->dev is now NULL, thanks to the bdi_unregister() called by del_gendisk(). As a result, *boom*. Fortunately, it looks like the rest of the writeback path is perfectly happy with bdi->dev and bdi->owner being NULL, so the simplest fix is to create a bdi_dev_name() function which can handle bdi->dev being NULL. This also allows us to bulletproof the writeback tracepoints to prevent them from dereferencing a NULL pointer and crashing the kernel if one is tracing with memcg's enabled, and an iSCSI device dies or a USB storage stick is pulled. 
The most common way of triggering this will be hotremoval of a device while writeback with memcg enabled is going on. It was triggering several times a day in a heavily loaded production environment. Google Bug Id: 145475544 Link: https://lore.kernel.org/r/20191227194829.150110-1-ty...@mit.edu Link: http://lkml.kernel.org/r/20191228005211.163952-1-ty...@mit.edu Signed-off-by: Theodore Ts'o Cc: Chris Mason Cc: Tejun Heo Cc: Jens Axboe Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- fs/fs-writeback.c| 2 +- include/linux/backing-dev.h | 10 ++ include/trace/events/writeback.h | 29 + mm/backing-dev.c | 1 + 4 files changed, 25 insertions(+), 17 deletions(-) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index f2d0c4acb3cbb..a247cb4b00e2d 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -1986,7 +1986,7 @@ void wb_workfn(struct work_struct *work) struct bdi_writeback, dwork); long pages_written; - set_worker_desc("flush-%s", dev_name(wb->bdi->dev)); + set_worker_desc("flush-%s", bdi_dev_name(wb->bdi)); current->flags |= PF_SWAPWRITE; if (likely(!current_is_workqueue_rescuer() || diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index c28a47cbe355e..1ef4aca7b953f 100644 --- a/include/linux/backing-dev.h +++ b/include/linux/backing-dev.h @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -498,4 +499,13 @@ static inline int bdi_rw_congested(struct backing_dev_info *bdi) (1 << WB_async_congested)); } +extern const char *bdi_unknown_name; + +static inline const char *bdi_dev_name(struct backing_dev_info *bdi) +{ + if (!bdi || !bdi->dev) + return bdi_unknown_name; + return dev_name(bdi->dev); +} + #endif /* _LINUX_BACKING_DEV_H */ diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h index b463e2575117e..300afa559f467 100644 --- a/include/trace/events/writeback.h +++ b/include/trace/events/writeback.h @@ -66,8 +66,8 @@ 
TRACE_EVENT(writeback_dirty_page, TP_fast_assign( strscpy_pad(__entry->name, - mapping ? dev_name(inode_to_bdi(mapping->host)->dev) : "(unknown)", - 32); + bdi_dev_name(mapping ? inode_to_bdi(mapping->host) : +NULL), 32); __entry->ino = mapping ? mapping->host->i_ino : 0; __entry->index = page->index; ), @@ -96,8 +96,7 @@ DECLARE_EVENT_CLASS(writeback_dirty_inode_template, struct backing_dev_info *bdi = inode_to_bdi(inode); /* may be called for files on pseudo FSes w/ unregistered bdi */ - strscpy_pad(__entry->name, - bdi->dev ? dev_name(bdi->dev) : "(unk
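The NULL-tolerant accessor that this patch introduces can be tried outside the kernel with a small userspace sketch. Everything prefixed fake_ below is a hypothetical stand-in and not kernel API, and the "(unknown)" placeholder string is an assumption taken from the tracepoint code being removed, since the definition of bdi_unknown_name is not visible in the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct device and struct backing_dev_info. */
struct fake_device {
	const char *name;
};

struct fake_bdi {
	struct fake_device *dev;	/* NULL once the device has been unregistered */
};

/* Assumed placeholder text; the real definition is not shown in the diff. */
static const char *fake_unknown_name = "(unknown)";

/* Returns a printable name without ever dereferencing a NULL bdi or bdi->dev. */
static const char *fake_bdi_dev_name(const struct fake_bdi *bdi)
{
	if (!bdi || !bdi->dev)
		return fake_unknown_name;
	assert(bdi->dev->name != NULL);
	return bdi->dev->name;
}
```

Callers such as a worker-description string or a tracepoint can then pass whatever bdi pointer they hold, including NULL, and always get back a valid C string.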
[PATCH 4.19 19/24] Fix unsynchronized access to sev members through svm_register_enc_region
From: Peter Gonda

commit 19a23da53932bc8011220bd8c410cb76012de004 upstream.

Grab kvm->lock before pinning memory when registering an encrypted region; sev_pin_memory() relies on kvm->lock being held to ensure correctness when checking and updating the number of pinned pages.

Add a lockdep assertion to help prevent future regressions.

Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Paolo Bonzini
Cc: Joerg Roedel
Cc: Tom Lendacky
Cc: Brijesh Singh
Cc: Sean Christopherson
Cc: x...@kernel.org
Cc: k...@vger.kernel.org
Cc: sta...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Fixes: 1e80fdc09d12 ("KVM: SVM: Pin guest memory when SEV is active")
Signed-off-by: Peter Gonda

V2
 - Fix up patch description
 - Correct file paths svm.c -> sev.c
 - Add unlock of kvm->lock on sev_pin_memory error

V1
 - https://lore.kernel.org/kvm/20210126185431.1824530-1-pgo...@google.com/

Message-Id: <20210127161524.2832400-1-pgo...@google.com>
Signed-off-by: Paolo Bonzini
Signed-off-by: Greg Kroah-Hartman
---
 arch/x86/kvm/svm.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1832,6 +1832,8 @@ static struct page **sev_pin_memory(stru
 	struct page **pages;
 	unsigned long first, last;
 
+	lockdep_assert_held(&kvm->lock);
+
 	if (ulen == 0 || uaddr + ulen < uaddr)
 		return NULL;
 
@@ -7084,12 +7086,21 @@ static int svm_register_enc_region(struc
 	if (!region)
 		return -ENOMEM;
 
+	mutex_lock(&kvm->lock);
 	region->pages = sev_pin_memory(kvm, range->addr, range->size, &region->npages, 1);
 	if (!region->pages) {
 		ret = -ENOMEM;
+		mutex_unlock(&kvm->lock);
 		goto e_free;
 	}
 
+	region->uaddr = range->addr;
+	region->size = range->size;
+
+	mutex_lock(&kvm->lock);
+	list_add_tail(&region->list, &sev->regions_list);
+	mutex_unlock(&kvm->lock);
+
 	/*
 	 * The guest may change the memory encryption attribute from C=0 -> C=1
 	 * or vice versa for this memory range. Lets make sure caches are
@@ -7098,13 +7109,6 @@ static int svm_register_enc_region(struc
 	 */
 	sev_clflush_pages(region->pages, region->npages);
 
-	region->uaddr = range->addr;
-	region->size = range->size;
-
-	mutex_lock(&kvm->lock);
-	list_add_tail(&region->list, &sev->regions_list);
-	mutex_unlock(&kvm->lock);
-
 	return ret;
 
 e_free:
[PATCH 4.19 20/24] block: dont hold q->sysfs_lock in elevator_init_mq
From: Ming Lei

commit c48dac137a62a5d6fa1ef3fa445cbd9c43655a76 upstream.

The original comment says:

	q->sysfs_lock must be held to provide mutual exclusion between
	elevator_switch() and here.

Which is simply wrong. elevator_init_mq() is only called from blk_mq_init_allocated_queue(), which is always called before the request queue is registered via blk_register_queue(), for dm-rq or normal rq based drivers. However, the queue's kobject is only exposed and added to sysfs in blk_register_queue(). So there is no such race between elevator_switch() and elevator_init_mq().

So avoid holding q->sysfs_lock in elevator_init_mq().

Cc: Christoph Hellwig
Cc: Hannes Reinecke
Cc: Greg KH
Cc: Mike Snitzer
Cc: Bart Van Assche
Cc: Damien Le Moal
Reviewed-by: Bart Van Assche
Signed-off-by: Ming Lei
Signed-off-by: Jens Axboe
Signed-off-by: Jack Wang
Signed-off-by: Greg Kroah-Hartman
---
 block/elevator.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

--- a/block/elevator.c
+++ b/block/elevator.c
@@ -980,23 +980,19 @@ int elevator_init_mq(struct request_queu
 	if (q->nr_hw_queues != 1)
 		return 0;
 
-	/*
-	 * q->sysfs_lock must be held to provide mutual exclusion between
-	 * elevator_switch() and here.
-	 */
-	mutex_lock(&q->sysfs_lock);
+	WARN_ON_ONCE(test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags));
+
 	if (unlikely(q->elevator))
-		goto out_unlock;
+		goto out;
 
 	e = elevator_get(q, "mq-deadline", false);
 	if (!e)
-		goto out_unlock;
+		goto out;
 
 	err = blk_mq_init_sched(q, e);
 	if (err)
 		elevator_put(e);
-out_unlock:
-	mutex_unlock(&q->sysfs_lock);
+out:
 	return err;
 }
 
[PATCH 4.19 14/24] SUNRPC: Move simple_get_bytes and simple_get_netobj into private header
From: Dave Wysochanski [ Upstream commit ba6dfce47c4d002d96cd02a304132fca76981172 ] Remove duplicated helper functions to parse opaque XDR objects and place inside new file net/sunrpc/auth_gss/auth_gss_internal.h. In the new file carry the license and copyright from the source file net/sunrpc/auth_gss/auth_gss.c. Finally, update the comment inside include/linux/sunrpc/xdr.h since lockd is not the only user of struct xdr_netobj. Signed-off-by: Dave Wysochanski Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin --- include/linux/sunrpc/xdr.h | 3 +- net/sunrpc/auth_gss/auth_gss.c | 30 +- net/sunrpc/auth_gss/auth_gss_internal.h | 42 + net/sunrpc/auth_gss/gss_krb5_mech.c | 31 ++ 4 files changed, 46 insertions(+), 60 deletions(-) create mode 100644 net/sunrpc/auth_gss/auth_gss_internal.h diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h index 2bd68177a442e..33580cc72a43d 100644 --- a/include/linux/sunrpc/xdr.h +++ b/include/linux/sunrpc/xdr.h @@ -26,8 +26,7 @@ struct rpc_rqst; #define XDR_QUADLEN(l) (((l) + 3) >> 2) /* - * Generic opaque `network object.' At the kernel level, this type - * is used only by lockd. + * Generic opaque `network object.' 
*/ #define XDR_MAX_NETOBJ 1024 struct xdr_netobj { diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c index 8cb7d812ccb82..e61c48c1b37d6 100644 --- a/net/sunrpc/auth_gss/auth_gss.c +++ b/net/sunrpc/auth_gss/auth_gss.c @@ -53,6 +53,7 @@ #include #include +#include "auth_gss_internal.h" #include "../netns.h" static const struct rpc_authops authgss_ops; @@ -147,35 +148,6 @@ gss_cred_set_ctx(struct rpc_cred *cred, struct gss_cl_ctx *ctx) clear_bit(RPCAUTH_CRED_NEW, &cred->cr_flags); } -static const void * -simple_get_bytes(const void *p, const void *end, void *res, size_t len) -{ - const void *q = (const void *)((const char *)p + len); - if (unlikely(q > end || q < p)) - return ERR_PTR(-EFAULT); - memcpy(res, p, len); - return q; -} - -static inline const void * -simple_get_netobj(const void *p, const void *end, struct xdr_netobj *dest) -{ - const void *q; - unsigned int len; - - p = simple_get_bytes(p, end, &len, sizeof(len)); - if (IS_ERR(p)) - return p; - q = (const void *)((const char *)p + len); - if (unlikely(q > end || q < p)) - return ERR_PTR(-EFAULT); - dest->data = kmemdup(p, len, GFP_NOFS); - if (unlikely(dest->data == NULL)) - return ERR_PTR(-ENOMEM); - dest->len = len; - return q; -} - static struct gss_cl_ctx * gss_cred_get_ctx(struct rpc_cred *cred) { diff --git a/net/sunrpc/auth_gss/auth_gss_internal.h b/net/sunrpc/auth_gss/auth_gss_internal.h new file mode 100644 index 0..c5603242b54bf --- /dev/null +++ b/net/sunrpc/auth_gss/auth_gss_internal.h @@ -0,0 +1,42 @@ +// SPDX-License-Identifier: BSD-3-Clause +/* + * linux/net/sunrpc/auth_gss/auth_gss_internal.h + * + * Internal definitions for RPCSEC_GSS client authentication + * + * Copyright (c) 2000 The Regents of the University of Michigan. + * All rights reserved. 
+ * + */ +#include +#include +#include + +static inline const void * +simple_get_bytes(const void *p, const void *end, void *res, size_t len) +{ + const void *q = (const void *)((const char *)p + len); + if (unlikely(q > end || q < p)) + return ERR_PTR(-EFAULT); + memcpy(res, p, len); + return q; +} + +static inline const void * +simple_get_netobj(const void *p, const void *end, struct xdr_netobj *dest) +{ + const void *q; + unsigned int len; + + p = simple_get_bytes(p, end, &len, sizeof(len)); + if (IS_ERR(p)) + return p; + q = (const void *)((const char *)p + len); + if (unlikely(q > end || q < p)) + return ERR_PTR(-EFAULT); + dest->data = kmemdup(p, len, GFP_NOFS); + if (unlikely(dest->data == NULL)) + return ERR_PTR(-ENOMEM); + dest->len = len; + return q; +} diff --git a/net/sunrpc/auth_gss/gss_krb5_mech.c b/net/sunrpc/auth_gss/gss_krb5_mech.c index 7bb2514aadd9d..14f2823ad6c20 100644 --- a/net/sunrpc/auth_gss/gss_krb5_mech.c +++ b/net/sunrpc/auth_gss/gss_krb5_mech.c @@ -46,6 +46,8 @@ #include #include +#include "auth_gss_internal.h" + #if IS_ENABLED(CONFIG_SUNRPC_DEBUG) # define RPCDBG_FACILITY RPCDBG_AUTH #endif @@ -187,35 +189,6 @@ get_gss_krb5_enctype(int etype) return NULL; } -static const void * -simple_get_bytes(const void *p, const void *end, void *res, int len) -{ - const void *q = (const void *)((const char *)p + len); - if (unlikely(q > end || q < p)) - return ERR_PTR(-EFAULT); - memcpy(res, p, len); - return q; -} - -static const void * -simple_get_netobj(const void *p, const void
[PATCH 4.19 03/24] fgraph: Initialize tracing_graph_pause at task creation
From: Steven Rostedt (VMware)

commit 7e0a9220467dbcfdc5bc62825724f3e52e50ab31 upstream.

On some archs, the idle task can call into cpu_suspend(). The cpu_suspend() will disable or pause function graph tracing, as there are some paths in bringing down the CPU that can have issues with its return address being modified. The task_struct structure has a "tracing_graph_pause" atomic counter, that when set to something other than zero, the function graph tracer will not modify the return address.

The problem is that the tracing_graph_pause counter is initialized when the function graph tracer is enabled. This can corrupt the counter for the idle task if it is suspended in these architectures.

   CPU 1				CPU 2
   -----				-----
  do_idle()
    cpu_suspend()
      pause_graph_tracing()
          task_struct->tracing_graph_pause++ (0 -> 1)

				start_graph_tracing()
				  for_each_online_cpu(cpu) {
				    ftrace_graph_init_idle_task(cpu)
				      task_struct->tracing_graph_pause = 0 (1 -> 0)

      unpause_graph_tracing()
          task_struct->tracing_graph_pause-- (0 -> -1)

The above should have gone from 1 to zero, and enabled function graph tracing again. But instead, it is set to -1, which keeps it disabled.

There's no reason that the field tracing_graph_pause on the task_struct can not be initialized at boot up.
Cc: sta...@vger.kernel.org Fixes: 380c4b1411ccd ("tracing/function-graph-tracer: append the tracing_graph_flag") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339 Reported-by: pierre.gond...@arm.com Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman --- init/init_task.c |3 ++- kernel/trace/ftrace.c |2 -- 2 files changed, 2 insertions(+), 3 deletions(-) --- a/init/init_task.c +++ b/init/init_task.c @@ -168,7 +168,8 @@ struct task_struct init_task .lockdep_recursion = 0, #endif #ifdef CONFIG_FUNCTION_GRAPH_TRACER - .ret_stack = NULL, + .ret_stack = NULL, + .tracing_graph_pause= ATOMIC_INIT(0), #endif #if defined(CONFIG_TRACING) && defined(CONFIG_PREEMPT) .trace_recursion = 0, --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -6875,7 +6875,6 @@ static int alloc_retstack_tasklist(struc } if (t->ret_stack == NULL) { - atomic_set(&t->tracing_graph_pause, 0); atomic_set(&t->trace_overrun, 0); t->curr_ret_stack = -1; t->curr_ret_depth = -1; @@ -7088,7 +7087,6 @@ static DEFINE_PER_CPU(struct ftrace_ret_ static void graph_init_task(struct task_struct *t, struct ftrace_ret_stack *ret_stack) { - atomic_set(&t->tracing_graph_pause, 0); atomic_set(&t->trace_overrun, 0); t->ftrace_timestamp = 0; /* make curr_ret_stack visible before we add the ret_stack */
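The counter corruption described in this commit message can be replayed deterministically in userspace. The sketch below is single-threaded and uses a plain int instead of an atomic_t; struct fake_task and the fake_ helpers are hypothetical names for illustration only.

```c
#include <assert.h>

/* Hypothetical stand-in for task_struct->tracing_graph_pause. */
struct fake_task {
	int tracing_graph_pause;
};

static void fake_pause(struct fake_task *t)   { t->tracing_graph_pause++; }
static void fake_unpause(struct fake_task *t) { t->tracing_graph_pause--; }

/*
 * Replays the interleaving from the commit message on one thread.
 * reset_on_enable != 0 models the old behaviour, where enabling the
 * tracer zeroed the counter of an idle task that was already paused.
 */
static int fake_replay(int reset_on_enable)
{
	struct fake_task idle = { 0 };		/* initialised once, at creation */

	fake_pause(&idle);			/* CPU 1: cpu_suspend(), 0 -> 1 */
	if (reset_on_enable)
		idle.tracing_graph_pause = 0;	/* CPU 2: tracer enabled, 1 -> 0 */
	fake_unpause(&idle);			/* CPU 1: resume path */

	assert(idle.tracing_graph_pause >= -1);
	return idle.tracing_graph_pause;
}
```

With the reset in place the function returns -1, so tracing stays paused for that task; without it the counter returns to 0, which is the state the patch aims for by initialising the field only at task creation.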
[PATCH 4.19 15/24] SUNRPC: Handle 0 length opaque XDR object data properly
From: Dave Wysochanski [ Upstream commit e4a7d1f7707eb44fd953a31dd59eff82009d879c ] When handling an auth_gss downcall, it's possible to get 0-length opaque object for the acceptor. In the case of a 0-length XDR object, make sure simple_get_netobj() fills in dest->data = NULL, and does not continue to kmemdup() which will set dest->data = ZERO_SIZE_PTR for the acceptor. The trace event code can handle NULL but not ZERO_SIZE_PTR for a string, and so without this patch the rpcgss_context trace event will crash the kernel as follows: [ 162.887992] BUG: kernel NULL pointer dereference, address: 0010 [ 162.898693] #PF: supervisor read access in kernel mode [ 162.900830] #PF: error_code(0x) - not-present page [ 162.902940] PGD 0 P4D 0 [ 162.904027] Oops: [#1] SMP PTI [ 162.905493] CPU: 4 PID: 4321 Comm: rpc.gssd Kdump: loaded Not tainted 5.10.0 #133 [ 162.908548] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 162.910978] RIP: 0010:strlen+0x0/0x20 [ 162.912505] Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31 [ 162.920101] RSP: 0018:aec900c77d90 EFLAGS: 00010202 [ 162.922263] RAX: RBX: RCX: fffde697 [ 162.925158] RDX: 002f RSI: 0080 RDI: 0010 [ 162.928073] RBP: 0010 R08: 0e10 R09: [ 162.930976] R10: 8e698a590cb8 R11: 0001 R12: 0e10 [ 162.933883] R13: fffde697 R14: 00010034d517 R15: 00070028 [ 162.936777] FS: 7f1e1eb93700() GS:8e6ab7d0() knlGS: [ 162.940067] CS: 0010 DS: ES: CR0: 80050033 [ 162.942417] CR2: 0010 CR3: 000104eba000 CR4: 000406e0 [ 162.945300] Call Trace: [ 162.946428] trace_event_raw_event_rpcgss_context+0x84/0x140 [auth_rpcgss] [ 162.949308] ? __kmalloc_track_caller+0x35/0x5a0 [ 162.951224] ? 
gss_pipe_downcall+0x3a3/0x6a0 [auth_rpcgss] [ 162.953484] gss_pipe_downcall+0x585/0x6a0 [auth_rpcgss] [ 162.955953] rpc_pipe_write+0x58/0x70 [sunrpc] [ 162.957849] vfs_write+0xcb/0x2c0 [ 162.959264] ksys_write+0x68/0xe0 [ 162.960706] do_syscall_64+0x33/0x40 [ 162.962238] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 162.964346] RIP: 0033:0x7f1e1f1e57df Signed-off-by: Dave Wysochanski Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin --- net/sunrpc/auth_gss/auth_gss_internal.h | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/net/sunrpc/auth_gss/auth_gss_internal.h b/net/sunrpc/auth_gss/auth_gss_internal.h index c5603242b54bf..f6d9631bd9d00 100644 --- a/net/sunrpc/auth_gss/auth_gss_internal.h +++ b/net/sunrpc/auth_gss/auth_gss_internal.h @@ -34,9 +34,12 @@ simple_get_netobj(const void *p, const void *end, struct xdr_netobj *dest) q = (const void *)((const char *)p + len); if (unlikely(q > end || q < p)) return ERR_PTR(-EFAULT); - dest->data = kmemdup(p, len, GFP_NOFS); - if (unlikely(dest->data == NULL)) - return ERR_PTR(-ENOMEM); + if (len) { + dest->data = kmemdup(p, len, GFP_NOFS); + if (unlikely(dest->data == NULL)) + return ERR_PTR(-ENOMEM); + } else + dest->data = NULL; dest->len = len; return q; } -- 2.27.0
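The zero-length handling added by this patch can be illustrated with a userspace analogue. This is only a sketch under stated assumptions: malloc() and memcpy() stand in for kmemdup(), errors are reported as NULL rather than ERR_PTR() values, the length is read in host byte order, and struct fake_netobj and fake_get_netobj() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace mirror of struct xdr_netobj. */
struct fake_netobj {
	unsigned int len;
	unsigned char *data;
};

/*
 * Parses a length word followed by that many bytes from [p, end).
 * Returns the position just past the object, or NULL on error.
 */
static const void *fake_get_netobj(const void *p, const void *end,
				   struct fake_netobj *dest)
{
	const char *cur = p;
	const char *limit = end;
	unsigned int len;

	assert(dest != NULL);
	if ((size_t)(limit - cur) < sizeof(len))
		return NULL;
	memcpy(&len, cur, sizeof(len));
	cur += sizeof(len);
	if ((size_t)(limit - cur) < len)
		return NULL;

	if (len) {
		dest->data = malloc(len);
		if (!dest->data)
			return NULL;
		memcpy(dest->data, cur, len);
	} else {
		dest->data = NULL;	/* never hand out a zero-size allocation */
	}
	dest->len = len;
	return cur + len;
}
```

The userspace parallel to ZERO_SIZE_PTR is that malloc(0) may return a non-NULL pointer that must not be dereferenced, so branching on len keeps "no data" unambiguous for later consumers that only check for NULL.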
[PATCH 4.19 13/24] iwlwifi: mvm: guard against device removal in reprobe
From: Johannes Berg

[ Upstream commit 7a21b1d4a728a483f07c638ccd8610d4b4f12684 ]

If we get into a problem severe enough to attempt a reprobe, we schedule a worker to do that. However, if the problem gets more severe and the device is actually destroyed before this worker has a chance to run, we use a freed device.

Bump up the reference count of the device until the worker runs to avoid this situation.

Signed-off-by: Johannes Berg
Signed-off-by: Luca Coelho
Signed-off-by: Kalle Valo
Link: https://lore.kernel.org/r/iwlwifi.20210122144849.871f0892e4b2.I94819e11afd68d875f3e242b98bef724b8236f1e@changeid
Signed-off-by: Sasha Levin
---
 drivers/net/wireless/intel/iwlwifi/mvm/ops.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/ops.c b/drivers/net/wireless/intel/iwlwifi/mvm/ops.c
index 0e26619fb330b..d932171617e6a 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/ops.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/ops.c
@@ -1192,6 +1192,7 @@ static void iwl_mvm_reprobe_wk(struct work_struct *wk)
 	reprobe = container_of(wk, struct iwl_mvm_reprobe, work);
 	if (device_reprobe(reprobe->dev))
 		dev_err(reprobe->dev, "reprobe failed!\n");
+	put_device(reprobe->dev);
 	kfree(reprobe);
 	module_put(THIS_MODULE);
 }
@@ -1242,7 +1243,7 @@ void iwl_mvm_nic_restart(struct iwl_mvm *mvm, bool fw_error)
 			module_put(THIS_MODULE);
 			return;
 		}
-		reprobe->dev = mvm->trans->dev;
+		reprobe->dev = get_device(mvm->trans->dev);
 		INIT_WORK(&reprobe->work, iwl_mvm_reprobe_wk);
 		schedule_work(&reprobe->work);
 	} else if (mvm->fwrt.cur_fw_img == IWL_UCODE_REGULAR &&
-- 
2.27.0
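The pattern in this patch, taking a reference when deferred work is queued and dropping it when the work has run, can be sketched in userspace. The fake_ types and helpers below are hypothetical and only mimic the counting behaviour of get_device() and put_device(); nothing here is the driver's actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for struct device. */
struct fake_device {
	int refcount;
	int released;	/* set when the last reference is dropped */
};

static struct fake_device *fake_get_device(struct fake_device *dev)
{
	dev->refcount++;
	return dev;
}

static void fake_put_device(struct fake_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->released = 1;
}

/* Deferred work item that pins the device it will operate on. */
struct fake_reprobe {
	struct fake_device *dev;
};

static struct fake_reprobe *fake_schedule_reprobe(struct fake_device *dev)
{
	struct fake_reprobe *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	r->dev = fake_get_device(dev);	/* held until the worker has run */
	return r;
}

static void fake_reprobe_work(struct fake_reprobe *r)
{
	/* r->dev is guaranteed to still be alive here */
	fake_put_device(r->dev);
	free(r);
}
```

If the original owner drops its reference between scheduling and execution, the object survives until fake_reprobe_work() releases the last reference, which is the window the patch closes.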
[PATCH 4.19 16/24] lib/string: Add strscpy_pad() function
From: Tobin C. Harding [ Upstream commit 458a3bf82df4fe1f951d0f52b1e0c1e9d5a88a3b ] We have a function to copy strings safely and we have a function to copy strings and zero the tail of the destination (if source string is shorter than destination buffer) but we do not have a function to do both at once. This means developers must write this themselves if they desire this functionality. This is a chore, and also leaves us open to off by one errors unnecessarily. Add a function that calls strscpy() then memset()s the tail to zero if the source string is shorter than the destination buffer. Acked-by: Kees Cook Signed-off-by: Tobin C. Harding Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin --- include/linux/string.h | 4 lib/string.c | 47 +++--- 2 files changed, 44 insertions(+), 7 deletions(-) diff --git a/include/linux/string.h b/include/linux/string.h index 4db285b83f44e..1e0c442b941e2 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -31,6 +31,10 @@ size_t strlcpy(char *, const char *, size_t); #ifndef __HAVE_ARCH_STRSCPY ssize_t strscpy(char *, const char *, size_t); #endif + +/* Wraps calls to strscpy()/memset(), no arch specific code required */ +ssize_t strscpy_pad(char *dest, const char *src, size_t count); + #ifndef __HAVE_ARCH_STRCAT extern char * strcat(char *, const char *); #endif diff --git a/lib/string.c b/lib/string.c index edf4907ec946f..f7f7770444bf5 100644 --- a/lib/string.c +++ b/lib/string.c @@ -158,11 +158,9 @@ EXPORT_SYMBOL(strlcpy); * @src: Where to copy the string from * @count: Size of destination buffer * - * Copy the string, or as much of it as fits, into the dest buffer. - * The routine returns the number of characters copied (not including - * the trailing NUL) or -E2BIG if the destination buffer wasn't big enough. - * The behavior is undefined if the string buffers overlap. - * The destination buffer is always NUL terminated, unless it's zero-sized. 
+ * Copy the string, or as much of it as fits, into the dest buffer. The + * behavior is undefined if the string buffers overlap. The destination + * buffer is always NUL terminated, unless it's zero-sized. * * Preferred to strlcpy() since the API doesn't require reading memory * from the src string beyond the specified "count" bytes, and since @@ -172,8 +170,10 @@ EXPORT_SYMBOL(strlcpy); * * Preferred to strncpy() since it always returns a valid string, and * doesn't unnecessarily force the tail of the destination buffer to be - * zeroed. If the zeroing is desired, it's likely cleaner to use strscpy() - * with an overflow test, then just memset() the tail of the dest buffer. + * zeroed. If zeroing is desired please use strscpy_pad(). + * + * Return: The number of characters copied (not including the trailing + * %NUL) or -E2BIG if the destination buffer wasn't big enough. */ ssize_t strscpy(char *dest, const char *src, size_t count) { @@ -260,6 +260,39 @@ char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) } EXPORT_SYMBOL(stpcpy); +/** + * strscpy_pad() - Copy a C-string into a sized buffer + * @dest: Where to copy the string to + * @src: Where to copy the string from + * @count: Size of destination buffer + * + * Copy the string, or as much of it as fits, into the dest buffer. The + * behavior is undefined if the string buffers overlap. The destination + * buffer is always %NUL terminated, unless it's zero-sized. + * + * If the source string is shorter than the destination buffer, zeros + * the tail of the destination buffer. + * + * For full explanation of why you may want to consider using the + * 'strscpy' functions please see the function docstring for strscpy(). + * + * Return: The number of characters copied (not including the trailing + * %NUL) or -E2BIG if the destination buffer wasn't big enough. 
+ */ +ssize_t strscpy_pad(char *dest, const char *src, size_t count) +{ + ssize_t written; + + written = strscpy(dest, src, count); + if (written < 0 || written == count - 1) + return written; + + memset(dest + written + 1, 0, count - written - 1); + + return written; +} +EXPORT_SYMBOL(strscpy_pad); + #ifndef __HAVE_ARCH_STRCAT /** * strcat - Append one %NUL-terminated string to another -- 2.27.0
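The copy-then-zero-the-tail behaviour of strscpy_pad() can be exercised in userspace with the sketch below. fake_strscpy() is a simplified byte-at-a-time stand-in written for this example, not the kernel's implementation, and both fake_ names are hypothetical; only the padding logic mirrors the function added in the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Simplified stand-in for strscpy(): bounded copy, always NUL-terminated. */
static ssize_t fake_strscpy(char *dest, const char *src, size_t count)
{
	size_t i;

	assert(dest != NULL && src != NULL);
	if (count == 0)
		return -E2BIG;
	for (i = 0; i < count - 1 && src[i] != '\0'; i++)
		dest[i] = src[i];
	dest[i] = '\0';
	/* Truncated if the source still has characters left. */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}

/* Same contract, plus zeroing of everything after the terminating NUL. */
static ssize_t fake_strscpy_pad(char *dest, const char *src, size_t count)
{
	ssize_t written = fake_strscpy(dest, src, count);

	/* Nothing to pad on error or when the string filled the buffer. */
	if (written < 0 || (size_t)written == count - 1)
		return written;

	memset(dest + written + 1, 0, count - (size_t)written - 1);
	return written;
}
```

Zeroing the tail matters when the destination is later copied out as a whole, for example a fixed-size name field in a trace record, because stale bytes after the terminator would otherwise travel with it.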
[PATCH 5.4 06/24] chtls: Fix potential resource leak
From: Pan Bian [ Upstream commit b6011966ac6f402847eb5326beee8da3a80405c7 ] The dst entry should be released if no neighbour is found. Goto label free_dst to fix the issue. Besides, the check of ndev against NULL is redundant. Signed-off-by: Pan Bian Link: https://lore.kernel.org/r/20210121145738.51091-1-bianpan2...@163.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/crypto/chelsio/chtls/chtls_cm.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/crypto/chelsio/chtls/chtls_cm.c b/drivers/crypto/chelsio/chtls/chtls_cm.c index eddc6d1bdb2d1..82b76df43ae57 100644 --- a/drivers/crypto/chelsio/chtls/chtls_cm.c +++ b/drivers/crypto/chelsio/chtls/chtls_cm.c @@ -1047,11 +1047,9 @@ static struct sock *chtls_recv_sock(struct sock *lsk, n = dst_neigh_lookup(dst, &iph->saddr); if (!n || !n->dev) - goto free_sk; + goto free_dst; ndev = n->dev; - if (!ndev) - goto free_dst; if (is_vlan_dev(ndev)) ndev = vlan_dev_real_dev(ndev); @@ -1117,7 +1115,8 @@ static struct sock *chtls_recv_sock(struct sock *lsk, free_csk: chtls_sock_release(&csk->kref); free_dst: - neigh_release(n); + if (n) + neigh_release(n); dst_release(dst); free_sk: inet_csk_prepare_forced_close(newsk); -- 2.27.0
Re: [PATCH] staging: vt6656: Fixed alignment with issue in rf.c
On 11/02/21 7:15 pm, Pritthijit Nath wrote:
> This change fixes a checkpatch CHECK style issue for "Alignment should match
> open parenthesis".
>
> Signed-off-by: Pritthijit Nath
> ---
>  drivers/staging/vt6656/rf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/vt6656/rf.c b/drivers/staging/vt6656/rf.c
> index 5b8da06e3916..bcd4d467e03a 100644
> --- a/drivers/staging/vt6656/rf.c
> +++ b/drivers/staging/vt6656/rf.c
> @@ -687,7 +687,7 @@ static int vnt_rf_set_txpower(struct vnt_private *priv, u8 power,
>
>  		if (hw_value < ARRAY_SIZE(vt3226d0_lo_current_table)) {
>  			ret = vnt_rf_write_embedded(priv,
> -				vt3226d0_lo_current_table[hw_value]);
> +						    vt3226d0_lo_current_table[hw_value]);
>  			if (ret)
>  				return ret;
>  		}
>

I am resubmitting this patch. Pardon the typo in the subject line.

thanks,
Pritthijit
[PATCH 5.4 09/24] iwlwifi: mvm: skip power command when unbinding vif during CSA
From: Sara Sharon [ Upstream commit bf544e9aa570034e094a8a40d5f9e1e2c4916d18 ] In the new CSA flow, we remain associated during CSA, but still do a unbind-bind to the vif. However, sending the power command right after when vif is unbound but still associated causes FW to assert (0x3400) since it cannot tell the LMAC id. Just skip this command, we will send it again in a bit, when assigning the new context. Signed-off-by: Sara Sharon Signed-off-by: Luca Coelho Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/iwlwifi.20210115130252.64a2254ac5c3.Iaa3a9050bf3d7c9cd5beaf561e932e6defc12ec3@changeid Signed-off-by: Sasha Levin --- drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c index daae86cd61140..fc6430edd1107 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c @@ -4169,6 +4169,9 @@ static void __iwl_mvm_unassign_vif_chanctx(struct iwl_mvm *mvm, iwl_mvm_binding_remove_vif(mvm, vif); out: + if (fw_has_capa(&mvm->fw->ucode_capa, IWL_UCODE_TLV_CAPA_CHANNEL_SWITCH_CMD) && + switching_chanctx) + return; mvmvif->phy_ctxt = NULL; iwl_mvm_power_update_mac(mvm); } -- 2.27.0
[PATCH 5.10 01/54] io_uring: simplify io_task_match()
From: Pavel Begunkov

[ Upstream commit 06de5f5973c641c7ae033f133ecfaaf64fe633a6 ]

If IORING_SETUP_SQPOLL is set all requests belong to the corresponding SQPOLL task, so skip task checking in that case and always match.

Signed-off-by: Pavel Begunkov
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman
---
 fs/io_uring.c |6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1472,11 +1472,7 @@ static bool io_task_match(struct io_kioc
 
 	if (!tsk || req->task == tsk)
 		return true;
-	if (ctx->flags & IORING_SETUP_SQPOLL) {
-		if (ctx->sq_data && req->task == ctx->sq_data->thread)
-			return true;
-	}
-	return false;
+	return (ctx->flags & IORING_SETUP_SQPOLL);
 }
 
 /*
[PATCH] arm: dts: sun5i: Add GPU node
sun5i has the same Mali 400 GPU as sun4i with the same interrupts, clocks and resets. Add node for it in dts. Signed-off-by: Yassine Oudjana --- arch/arm/boot/dts/sun5i.dtsi | 42 1 file changed, 42 insertions(+) diff --git a/arch/arm/boot/dts/sun5i.dtsi b/arch/arm/boot/dts/sun5i.dtsi index c2b4fbf552a3..81203f19b6ce 100644 --- a/arch/arm/boot/dts/sun5i.dtsi +++ b/arch/arm/boot/dts/sun5i.dtsi @@ -726,6 +726,27 @@ i2c2: i2c@1c2b400 { #size-cells = <0>; }; + mali: gpu@1c4 { + compatible = "allwinner,sun4i-a10-mali", "arm,mali-400"; + reg = <0x01c4 0x1>; + interrupts = <69>, +<70>, +<71>, +<72>, +<73>; + interrupt-names = "gp", + "gpmmu", + "pp0", + "ppmmu0", + "pmu"; + clocks = <&ccu CLK_AHB_GPU>, <&ccu CLK_GPU>; + clock-names = "bus", "core"; + resets = <&ccu RST_GPU>; + + assigned-clocks = <&ccu CLK_GPU>; + assigned-clock-rates = <38400>; + }; + timer@1c6 { compatible = "allwinner,sun5i-a13-hstimer"; reg = <0x01c6 0x1000>; @@ -733,6 +754,27 @@ timer@1c6 { clocks = <&ccu CLK_AHB_HSTIMER>; }; + mali: gpu@1c4 { + compatible = "allwinner,sun4i-a10-mali", "arm,mali-400"; + reg = <0x01c4 0x1>; + interrupts = <69>, +<70>, +<71>, +<72>, +<73>; + interrupt-names = "gp", + "gpmmu", + "pp0", + "ppmmu0", + "pmu"; + clocks = <&ccu CLK_AHB_GPU>, <&ccu CLK_GPU>; + clock-names = "bus", "core"; + resets = <&ccu RST_GPU>; + + assigned-clocks = <&ccu CLK_GPU>; + assigned-clock-rates = <38400>; + }; + fe0: display-frontend@1e0 { compatible = "allwinner,sun5i-a13-display-frontend"; reg = <0x01e0 0x2>; -- 2.30.0
[PATCH 4.19 21/24] blk-mq: dont hold q->sysfs_lock in blk_mq_map_swqueue
From: Ming Lei commit c6ba933358f0d7a6a042b894dba20cc70396a6d3 upstream. blk_mq_map_swqueue() is called from blk_mq_init_allocated_queue() and blk_mq_update_nr_hw_queues(). For the former caller, the kobject isn't exposed to userspace yet. For the latter caller, hctx sysfs entries and debugfs are un-registered before updating nr_hw_queues. On the other hand, commit 2f8f1336a48b ("blk-mq: always free hctx after request queue is freed") moves freeing hctx into queue's release handler, so there won't be race with queue release path too. So don't hold q->sysfs_lock in blk_mq_map_swqueue(). Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Cc: Bart Van Assche Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe Signed-off-by: Jack Wang Signed-off-by: Greg Kroah-Hartman --- block/blk-mq.c |7 --- 1 file changed, 7 deletions(-) --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2324,11 +2324,6 @@ static void blk_mq_map_swqueue(struct re struct blk_mq_ctx *ctx; struct blk_mq_tag_set *set = q->tag_set; - /* -* Avoid others reading imcomplete hctx->cpumask through sysfs -*/ - mutex_lock(&q->sysfs_lock); - queue_for_each_hw_ctx(q, hctx, i) { cpumask_clear(hctx->cpumask); hctx->nr_ctx = 0; @@ -2362,8 +2357,6 @@ static void blk_mq_map_swqueue(struct re hctx->ctxs[hctx->nr_ctx++] = ctx; } - mutex_unlock(&q->sysfs_lock); - queue_for_each_hw_ctx(q, hctx, i) { /* * If no software queues are mapped to this hardware queue,
[PATCH 4.19 07/24] regulator: core: avoid regulator_resolve_supply() race condition
From: David Collins [ Upstream commit eaa7995c529b54d68d97a30f6344cc6ca2f214a7 ] The final step in regulator_register() is to call regulator_resolve_supply() for each registered regulator (including the one in the process of being registered). The regulator_resolve_supply() function first checks if rdev->supply is NULL, then it performs various steps to try to find the supply. If successful, rdev->supply is set inside of set_supply(). This procedure can encounter a race condition if two concurrent tasks call regulator_register() near to each other on separate CPUs and one of the regulators has rdev->supply_name specified. There is currently nothing guaranteeing atomicity between the rdev->supply check and set steps. Thus, both tasks can observe rdev->supply==NULL in their regulator_resolve_supply() calls. This then results in both creating a struct regulator for the supply. One ends up actually stored in rdev->supply and the other is lost (though still present in the supply's consumer_list). Here is a kernel log snippet showing the issue: [ 12.421768] gpu_cc_gx_gdsc: supplied by pm8350_s5_level [ 12.425854] gpu_cc_gx_gdsc: supplied by pm8350_s5_level [ 12.429064] debugfs: Directory 'regulator.4-SUPPLY' with parent '17a0.rsc:rpmh-regulator-gfxlvl-pm8350_s5_level' already present! Avoid this race condition by holding the rdev->mutex lock inside of regulator_resolve_supply() while checking and setting rdev->supply. 
Signed-off-by: David Collins Link: https://lore.kernel.org/r/1610068562-4410-1-git-send-email-colli...@codeaurora.org Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/regulator/core.c | 39 --- 1 file changed, 28 insertions(+), 11 deletions(-) diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 8a6ca06d9c160..fa8f5fc04d8fd 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1567,23 +1567,34 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) { struct regulator_dev *r; struct device *dev = rdev->dev.parent; - int ret; + int ret = 0; /* No supply to resovle? */ if (!rdev->supply_name) return 0; - /* Supply already resolved? */ + /* Supply already resolved? (fast-path without locking contention) */ if (rdev->supply) return 0; + /* +* Recheck rdev->supply with rdev->mutex lock held to avoid a race +* between rdev->supply null check and setting rdev->supply in +* set_supply() from concurrent tasks. +*/ + regulator_lock(rdev); + + /* Supply just resolved by a concurrent task? */ + if (rdev->supply) + goto out; + r = regulator_dev_lookup(dev, rdev->supply_name); if (IS_ERR(r)) { ret = PTR_ERR(r); /* Did the lookup explicitly defer for us? 
*/ if (ret == -EPROBE_DEFER) - return ret; + goto out; if (have_full_constraints()) { r = dummy_regulator_rdev; @@ -1591,15 +1602,18 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) } else { dev_err(dev, "Failed to resolve %s-supply for %s\n", rdev->supply_name, rdev->desc->name); - return -EPROBE_DEFER; + ret = -EPROBE_DEFER; + goto out; } } if (r == rdev) { dev_err(dev, "Supply for %s (%s) resolved to itself\n", rdev->desc->name, rdev->supply_name); - if (!have_full_constraints()) - return -EINVAL; + if (!have_full_constraints()) { + ret = -EINVAL; + goto out; + } r = dummy_regulator_rdev; get_device(&r->dev); } @@ -1613,7 +1627,8 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) if (r->dev.parent && r->dev.parent != rdev->dev.parent) { if (!device_is_bound(r->dev.parent)) { put_device(&r->dev); - return -EPROBE_DEFER; + ret = -EPROBE_DEFER; + goto out; } } @@ -1621,13 +1636,13 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) ret = regulator_resolve_supply(r); if (ret < 0) { put_device(&r->dev); - return ret; + goto out; } ret = set_supply(rdev, r); if (ret < 0) { put_device(&r->dev); - return ret; + goto out; } /* Cascade always-on state to supply */ @@ -1636,11 +1651,13 @@ static int regulator_
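The check, lock, recheck shape described in the commit message can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct fake_rdev`, `fake_lock()`, `fake_unlock()` and `fake_resolve_supply()` are hypothetical names, a C11 `atomic_flag` spin lock stands in for `rdev->mutex`, and the supply lookup is reduced to storing a pointer.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct regulator_dev (not the kernel type). */
struct fake_rdev {
	atomic_flag lock;      /* plays the role of rdev->mutex */
	_Atomic(void *) supply; /* plays the role of rdev->supply */
	int supplies_created;  /* how many supply objects were built */
};

static void fake_rdev_init(struct fake_rdev *rdev)
{
	atomic_flag_clear(&rdev->lock);
	atomic_init(&rdev->supply, NULL);
	rdev->supplies_created = 0;
}

static void fake_lock(struct fake_rdev *rdev)
{
	/* Spin until the flag is free; the kernel sleeps on a mutex instead. */
	while (atomic_flag_test_and_set_explicit(&rdev->lock,
						 memory_order_acquire))
		;
}

static void fake_unlock(struct fake_rdev *rdev)
{
	atomic_flag_clear_explicit(&rdev->lock, memory_order_release);
}

/* Returns 0 on success, using the single-exit "goto out" shape. */
static int fake_resolve_supply(struct fake_rdev *rdev, void *candidate)
{
	int ret = 0;

	/* Fast path: supply already resolved, no lock taken. */
	if (atomic_load(&rdev->supply))
		return 0;

	fake_lock(rdev);

	/* Recheck under the lock: a concurrent task may have won the race. */
	if (atomic_load(&rdev->supply))
		goto out;

	atomic_store(&rdev->supply, candidate);
	rdev->supplies_created++;
out:
	fake_unlock(rdev);
	return ret;
}
```

Calling `fake_resolve_supply()` twice on the same object leaves `supplies_created` at 1, which is the property the kernel patch restores for concurrent regulator_register() callers.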
[PATCH 4.19 08/24] chtls: Fix potential resource leak
From: Pan Bian [ Upstream commit b6011966ac6f402847eb5326beee8da3a80405c7 ] The dst entry should be released if no neighbour is found. Goto label free_dst to fix the issue. Besides, the check of ndev against NULL is redundant. Signed-off-by: Pan Bian Link: https://lore.kernel.org/r/20210121145738.51091-1-bianpan2...@163.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/crypto/chelsio/chtls/chtls_cm.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/crypto/chelsio/chtls/chtls_cm.c b/drivers/crypto/chelsio/chtls/chtls_cm.c index fd3092a4378e4..08ed3ff8b255f 100644 --- a/drivers/crypto/chelsio/chtls/chtls_cm.c +++ b/drivers/crypto/chelsio/chtls/chtls_cm.c @@ -1051,11 +1051,9 @@ static struct sock *chtls_recv_sock(struct sock *lsk, tcph = (struct tcphdr *)(iph + 1); n = dst_neigh_lookup(dst, &iph->saddr); if (!n || !n->dev) - goto free_sk; + goto free_dst; ndev = n->dev; - if (!ndev) - goto free_dst; if (is_vlan_dev(ndev)) ndev = vlan_dev_real_dev(ndev); @@ -1117,7 +1115,8 @@ static struct sock *chtls_recv_sock(struct sock *lsk, free_csk: chtls_sock_release(&csk->kref); free_dst: - neigh_release(n); + if (n) + neigh_release(n); dst_release(dst); free_sk: inet_csk_prepare_forced_close(newsk); -- 2.27.0
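The leak comes from jumping to a cleanup label that sits below the release of a resource already held. A minimal sketch of that pattern, assuming hypothetical reference counters in place of the real dst, neighbour and socket objects (`struct res_counts` and `recv_sock_sketch()` are illustrative names, not chtls code):

```c
#include <stdbool.h>

/* Hypothetical counters standing in for dst, neighbour and socket refs. */
struct res_counts {
	int sk_refs;
	int dst_refs;
	int neigh_refs;
};

/*
 * Returns 0 on success (caller owns all three references) or -1 on
 * failure (every reference taken here has been dropped again).
 */
static int recv_sock_sketch(struct res_counts *c, bool neigh_found,
			    bool later_failure)
{
	bool have_neigh = false;

	c->sk_refs++;   /* new socket allocated */
	c->dst_refs++;  /* route lookup took a dst reference */

	/* dst is held here, so the error path must start at free_dst. */
	if (!neigh_found)
		goto free_dst;
	have_neigh = true;
	c->neigh_refs++;

	if (later_failure)
		goto free_dst;

	return 0;

free_dst:
	/* Mirrors the "if (n)" guard: only release what was acquired. */
	if (have_neigh)
		c->neigh_refs--;
	c->dst_refs--;
	/* The real function continues into its free_sk label here. */
	c->sk_refs--;
	return -1;
}
```

With the pre-patch target (the socket-only label), the first failure case would return with `dst_refs` still at 1.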
[PATCH 4.19 09/24] pNFS/NFSv4: Try to return invalid layout in pnfs_layout_process()
From: Trond Myklebust [ Upstream commit 08bd8dbe88825760e953759d7ec212903a026c75 ] If the server returns a new stateid that does not match the one in our cache, then try to return the one we hold instead of just invalidating it on the client side. This ensures that both client and server will agree that the stateid is invalid. Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin --- fs/nfs/pnfs.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index 4b165aa5a2561..55965e8e9a2ed 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -2301,7 +2301,13 @@ pnfs_layout_process(struct nfs4_layoutget *lgp) * We got an entirely new state ID. Mark all segments for the * inode invalid, and retry the layoutget */ - pnfs_mark_layout_stateid_invalid(lo, &free_me); + struct pnfs_layout_range range = { + .iomode = IOMODE_ANY, + .length = NFS4_MAX_UINT64, + }; + pnfs_set_plh_return_info(lo, IOMODE_ANY, 0); + pnfs_mark_matching_lsegs_return(lo, &lo->plh_return_segs, + &range, 0); goto out_forget; } -- 2.27.0
[PATCH 4.19 10/24] iwlwifi: mvm: take mutex for calling iwl_mvm_get_sync_time()
From: Johannes Berg [ Upstream commit 5c56d862c749669d45c256f581eac4244be00d4d ] We need to take the mutex to call iwl_mvm_get_sync_time(), do it. Signed-off-by: Johannes Berg Signed-off-by: Luca Coelho Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/iwlwifi.20210115130252.4bb5ccf881a6.I62973cbb081e80aa5b0447a5c3b9c3251a65cf6b@changeid Signed-off-by: Sasha Levin --- drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c b/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c index 798605c4f1227..5287f21d7ba63 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c @@ -520,7 +520,10 @@ static ssize_t iwl_dbgfs_os_device_timediff_read(struct file *file, const size_t bufsz = sizeof(buf); int pos = 0; + mutex_lock(&mvm->mutex); iwl_mvm_get_sync_time(mvm, &curr_gp2, &curr_os); + mutex_unlock(&mvm->mutex); + do_div(curr_os, NSEC_PER_USEC); diff = curr_os - curr_gp2; pos += scnprintf(buf + pos, bufsz - pos, "diff=%lld\n", diff); -- 2.27.0
[PATCH 4.19 06/24] af_key: relax availability checks for skb size calculation
From: Cong Wang [ Upstream commit afbc293add6466f8f3f0c3d944d85f53709c170f ] xfrm_probe_algs() probes kernel crypto modules and changes the availability of struct xfrm_algo_desc. But there is a small window where ealg->available and aalg->available can change between count_ah_combs()/count_esp_combs() and dump_ah_combs()/dump_esp_combs(). In this case we may allocate a smaller skb but later put a larger amount of data into it and trigger the panic in skb_put(). Fix this by relaxing the checks when counting the size, that is, skipping the test of ->available. We may waste memory for a few entries of sizeof(struct sadb_comb), but it is still much better than a panic. Reported-by: syzbot+b2bf2652983d23734...@syzkaller.appspotmail.com Cc: Steffen Klassert Cc: Herbert Xu Signed-off-by: Cong Wang Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin --- net/key/af_key.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/net/key/af_key.c b/net/key/af_key.c index e340e97224c3a..c7d5a6015389b 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -2908,7 +2908,7 @@ static int count_ah_combs(const struct xfrm_tmpl *t) break; if (!aalg->pfkey_supported) continue; - if (aalg_tmpl_set(t, aalg) && aalg->available) + if (aalg_tmpl_set(t, aalg)) sz += sizeof(struct sadb_comb); } return sz + sizeof(struct sadb_prop); @@ -2926,7 +2926,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t) if (!ealg->pfkey_supported) continue; - if (!(ealg_tmpl_set(t, ealg) && ealg->available)) + if (!(ealg_tmpl_set(t, ealg))) continue; for (k = 1; ; k++) { @@ -2937,7 +2937,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t) if (!aalg->pfkey_supported) continue; - if (aalg_tmpl_set(t, aalg) && aalg->available) + if (aalg_tmpl_set(t, aalg)) sz += sizeof(struct sadb_comb); } } -- 2.27.0
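The idea of sizing by an upper bound that cannot grow between the count and dump passes can be sketched like this. It is a simplified model, not af_key code: `struct sketch_alg`, `sketch_count()`, `sketch_dump()` and `SKETCH_COMB_SIZE` are hypothetical, and the template check done by aalg_tmpl_set() is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_COMB_SIZE 8u /* placeholder for sizeof(struct sadb_comb) */

/* Hypothetical reduced view of an algorithm descriptor. */
struct sketch_alg {
	bool pfkey_supported; /* fixed property of the algorithm */
	bool available;       /* may flip when crypto modules are probed */
};

/* Sizing pass: ignores ->available, so it yields an upper bound. */
static size_t sketch_count(const struct sketch_alg *algs, size_t n)
{
	size_t i, sz = 0;

	for (i = 0; i < n; i++) {
		if (!algs[i].pfkey_supported)
			continue;
		sz += SKETCH_COMB_SIZE;
	}
	return sz;
}

/* Fill pass: still honours ->available, so it never exceeds the bound. */
static size_t sketch_dump(const struct sketch_alg *algs, size_t n)
{
	size_t i, written = 0;

	for (i = 0; i < n; i++) {
		if (!algs[i].pfkey_supported)
			continue;
		if (algs[i].available)
			written += SKETCH_COMB_SIZE;
	}
	return written;
}
```

Even if an algorithm becomes available after `sketch_count()` has run, `sketch_dump()` cannot write more than was reserved; the cost is a few unused slots when some algorithms stay unavailable.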
[PATCH 5.10 14/54] io_uring: fix sqo ownership false positive warning
From: Pavel Begunkov [ Upstream commit 70b2c60d3797bffe182dddb9bb55975b9be5889a ] WARNING: CPU: 0 PID: 21359 at fs/io_uring.c:9042 io_uring_cancel_task_requests+0xe55/0x10c0 fs/io_uring.c:9042 Call Trace: io_uring_flush+0x47b/0x6e0 fs/io_uring.c:9227 filp_close+0xb4/0x170 fs/open.c:1295 close_files fs/file.c:403 [inline] put_files_struct fs/file.c:418 [inline] put_files_struct+0x1cc/0x350 fs/file.c:415 exit_files+0x7e/0xa0 fs/file.c:435 do_exit+0xc22/0x2ae0 kernel/exit.c:820 do_group_exit+0x125/0x310 kernel/exit.c:922 get_signal+0x427/0x20f0 kernel/signal.c:2773 arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811 handle_signal_work kernel/entry/common.c:147 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x148/0x250 kernel/entry/common.c:201 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline] syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:302 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Now io_uring_cancel_task_requests() can be called not through file notes but directly, remove a WARN_ONCE() there that give us false positives. That check is not very important and we catch it in other places. Fixes: 84965ff8a84f0 ("io_uring: if we see flush on exit, cancel related tasks") Cc: sta...@vger.kernel.org # 5.9+ Reported-by: syzbot+3e3d9bd0c6ce9efbc...@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c |2 -- 1 file changed, 2 deletions(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -8683,8 +8683,6 @@ static void io_uring_cancel_task_request struct task_struct *task = current; if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) { - /* for SQPOLL only sqo_task has task notes */ - WARN_ON_ONCE(ctx->sqo_task != current); io_disable_sqo_submit(ctx); task = ctx->sq_data->thread; atomic_inc(&task->io_uring->in_idle);
[PATCH 5.10 11/54] io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE
From: Pavel Begunkov [ Upstream commit ca70f00bed6cb255b7a9b91aa18a2717c9217f70 ] do not call blocking ops when !TASK_RUNNING; state=2 set at [] prepare_to_wait+0x1f4/0x3b0 kernel/sched/wait.c:262 WARNING: CPU: 1 PID: 19888 at kernel/sched/core.c:7853 __might_sleep+0xed/0x100 kernel/sched/core.c:7848 RIP: 0010:__might_sleep+0xed/0x100 kernel/sched/core.c:7848 Call Trace: __mutex_lock_common+0xc4/0x2ef0 kernel/locking/mutex.c:935 __mutex_lock kernel/locking/mutex.c:1103 [inline] mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1118 io_wq_submit_work+0x39a/0x720 fs/io_uring.c:6411 io_run_cancel fs/io-wq.c:856 [inline] io_wqe_cancel_pending_work fs/io-wq.c:990 [inline] io_wq_cancel_cb+0x614/0xcb0 fs/io-wq.c:1027 io_uring_cancel_files fs/io_uring.c:8874 [inline] io_uring_cancel_task_requests fs/io_uring.c:8952 [inline] __io_uring_files_cancel+0x115d/0x19e0 fs/io_uring.c:9038 io_uring_files_cancel include/linux/io_uring.h:51 [inline] do_exit+0x2e6/0x2490 kernel/exit.c:780 do_group_exit+0x168/0x2d0 kernel/exit.c:922 get_signal+0x16b5/0x2030 kernel/signal.c:2770 arch_do_signal_or_restart+0x8e/0x6a0 arch/x86/kernel/signal.c:811 handle_signal_work kernel/entry/common.c:147 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0xac/0x1e0 kernel/entry/common.c:201 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline] syscall_exit_to_user_mode+0x48/0x190 kernel/entry/common.c:302 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Rewrite io_uring_cancel_files() to mimic __io_uring_task_cancel()'s counting scheme, so it does all the heavy work before setting TASK_UNINTERRUPTIBLE. 
Cc: sta...@vger.kernel.org # 5.9+ Reported-by: syzbot+f655445043a26a7cf...@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov [axboe: fix inverted task check] Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c | 39 ++- 1 file changed, 22 insertions(+), 17 deletions(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -8586,30 +8586,31 @@ static void io_cancel_defer_files(struct } } +static int io_uring_count_inflight(struct io_ring_ctx *ctx, + struct task_struct *task, + struct files_struct *files) +{ + struct io_kiocb *req; + int cnt = 0; + + spin_lock_irq(&ctx->inflight_lock); + list_for_each_entry(req, &ctx->inflight_list, inflight_entry) + cnt += io_match_task(req, task, files); + spin_unlock_irq(&ctx->inflight_lock); + return cnt; +} + static void io_uring_cancel_files(struct io_ring_ctx *ctx, struct task_struct *task, struct files_struct *files) { while (!list_empty_careful(&ctx->inflight_list)) { struct io_task_cancel cancel = { .task = task, .files = files }; - struct io_kiocb *req; DEFINE_WAIT(wait); - bool found = false; + int inflight; - spin_lock_irq(&ctx->inflight_lock); - list_for_each_entry(req, &ctx->inflight_list, inflight_entry) { - if (!io_match_task(req, task, files)) - continue; - found = true; - break; - } - if (found) - prepare_to_wait(&task->io_uring->wait, &wait, - TASK_UNINTERRUPTIBLE); - spin_unlock_irq(&ctx->inflight_lock); - - /* We need to keep going until we don't find a matching req */ - if (!found) + inflight = io_uring_count_inflight(ctx, task, files); + if (!inflight) break; io_wq_cancel_cb(ctx->io_wq, io_cancel_task_cb, &cancel, true); @@ -8617,7 +8618,11 @@ static void io_uring_cancel_files(struct io_kill_timeouts(ctx, task, files); /* cancellations _may_ trigger task work */ io_run_task_work(); - schedule(); + + prepare_to_wait(&task->io_uring->wait, &wait, + TASK_UNINTERRUPTIBLE); + if (inflight == io_uring_count_inflight(ctx, task, files)) + schedule(); finish_wait(&task->io_uring->wait, &wait); } }
[PATCH 5.10 13/54] io_uring: fix list corruption for splice file_get
From: Pavel Begunkov [ Upstream commit f609cbb8911e40e15f9055e8f945f926ac906924 ] kernel BUG at lib/list_debug.c:29! Call Trace: __list_add include/linux/list.h:67 [inline] list_add include/linux/list.h:86 [inline] io_file_get+0x8cc/0xdb0 fs/io_uring.c:6466 __io_splice_prep+0x1bc/0x530 fs/io_uring.c:3866 io_splice_prep fs/io_uring.c:3920 [inline] io_req_prep+0x3546/0x4e80 fs/io_uring.c:6081 io_queue_sqe+0x609/0x10d0 fs/io_uring.c:6628 io_submit_sqe fs/io_uring.c:6705 [inline] io_submit_sqes+0x1495/0x2720 fs/io_uring.c:6953 __do_sys_io_uring_enter+0x107d/0x1f30 fs/io_uring.c:9353 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 io_file_get() may be called from splice, and so REQ_F_INFLIGHT may already be set. Fixes: 02a13674fa0e8 ("io_uring: account io_uring internal files as REQ_F_INFLIGHT") Cc: sta...@vger.kernel.org # 5.9+ Reported-by: syzbot+6879187cf57845801...@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -6170,7 +6170,8 @@ static struct file *io_file_get(struct i file = __io_file_get(state, fd); } - if (file && file->f_op == &io_uring_fops) { + if (file && file->f_op == &io_uring_fops && + !(req->flags & REQ_F_INFLIGHT)) { io_req_init_async(req); req->flags |= REQ_F_INFLIGHT;
[PATCH v2] arm64: Fix warning in mte_get_random_tag()
The simplification of mte_get_random_tag() caused the introduction of the warning below: In file included from arch/arm64/include/asm/kasan.h:9, from include/linux/kasan.h:16, from mm/kasan/common.c:14: mm/kasan/common.c: In function ‘mte_get_random_tag’: arch/arm64/include/asm/mte-kasan.h:45:9: warning: ‘addr’ is used uninitialized [-Wuninitialized] 45 | asm(__MTE_PREAMBLE "irg %0, %0" | Fix the warning using "=r" for the address in the asm inline. Fixes: c8f8de4c0887 ("arm64: kasan: simplify and inline MTE functions") Cc: Catalin Marinas Cc: Will Deacon Cc: Andrey Konovalov Cc: Andrew Morton Signed-off-by: Vincenzo Frascino --- This patch is based on linux-next/akpm arch/arm64/include/asm/mte-kasan.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/mte-kasan.h b/arch/arm64/include/asm/mte-kasan.h index 3d58489228c0..7ab500e2ad17 100644 --- a/arch/arm64/include/asm/mte-kasan.h +++ b/arch/arm64/include/asm/mte-kasan.h @@ -43,7 +43,7 @@ static inline u8 mte_get_random_tag(void) void *addr; asm(__MTE_PREAMBLE "irg %0, %0" - : "+r" (addr)); + : "=r" (addr)); return mte_get_ptr_tag(addr); } -- 2.30.0
[PATCH 5.10 12/54] io_uring: fix flush cqring overflow list while TASK_INTERRUPTIBLE
From: Hao Xu [ Upstream commit 6195ba09822c87cad09189bbf550d0fbe714687a ] Abaci reported the follow warning: [ 27.073425] do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_wait_exclusive+0x3a/0xc0 [ 27.075805] WARNING: CPU: 0 PID: 951 at kernel/sched/core.c:7853 __might_sleep+0x80/0xa0 [ 27.077604] Modules linked in: [ 27.078379] CPU: 0 PID: 951 Comm: a.out Not tainted 5.11.0-rc3+ #1 [ 27.079637] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 27.080852] RIP: 0010:__might_sleep+0x80/0xa0 [ 27.081835] Code: 65 48 8b 04 25 80 71 01 00 48 8b 90 c0 15 00 00 48 8b 70 18 48 c7 c7 08 39 95 82 c6 05 f9 5f de 08 01 48 89 d1 e8 00 c6 fa ff 0b eb bf 41 0f b6 f5 48 c7 c7 40 23 c9 82 e8 f3 48 ec 00 eb a7 [ 27.084521] RSP: 0018:c9fe3ce8 EFLAGS: 00010286 [ 27.085350] RAX: RBX: 82956083 RCX: [ 27.086348] RDX: 8881057a RSI: 8118cc9e RDI: 88813bc28570 [ 27.087598] RBP: 03a7 R08: 0001 R09: 0001 [ 27.088819] R10: c9fe3e00 R11: fffef9f0 R12: [ 27.089819] R13: R14: 88810576eb80 R15: 88810576e800 [ 27.091058] FS: 7f7b144cf740() GS:88813bc0() knlGS: [ 27.092775] CS: 0010 DS: ES: CR0: 80050033 [ 27.093796] CR2: 022da7b8 CR3: 00010b928002 CR4: 003706f0 [ 27.094778] DR0: DR1: DR2: [ 27.095780] DR3: DR6: fffe0ff0 DR7: 0400 [ 27.097011] Call Trace: [ 27.097685] __mutex_lock+0x5d/0xa30 [ 27.098565] ? prepare_to_wait_exclusive+0x71/0xc0 [ 27.099412] ? io_cqring_overflow_flush.part.101+0x6d/0x70 [ 27.100441] ? lockdep_hardirqs_on_prepare+0xe9/0x1c0 [ 27.101537] ? _raw_spin_unlock_irqrestore+0x2d/0x40 [ 27.102656] ? trace_hardirqs_on+0x46/0x110 [ 27.103459] ? io_cqring_overflow_flush.part.101+0x6d/0x70 [ 27.104317] io_cqring_overflow_flush.part.101+0x6d/0x70 [ 27.105113] io_cqring_wait+0x36e/0x4d0 [ 27.105770] ? find_held_lock+0x28/0xb0 [ 27.106370] ? io_uring_remove_task_files+0xa0/0xa0 [ 27.107076] __x64_sys_io_uring_enter+0x4fb/0x640 [ 27.107801] ? rcu_read_lock_sched_held+0x59/0xa0 [ 27.108562] ? lockdep_hardirqs_on_prepare+0xe9/0x1c0 [ 27.109684] ? 
syscall_enter_from_user_mode+0x26/0x70 [ 27.110731] do_syscall_64+0x2d/0x40 [ 27.111296] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 27.112056] RIP: 0033:0x7f7b13dc8239 [ 27.112663] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48 [ 27.115113] RSP: 002b:7ffd6d7f5c88 EFLAGS: 0286 ORIG_RAX: 01aa [ 27.116562] RAX: ffda RBX: RCX: 7f7b13dc8239 [ 27.117961] RDX: 478e RSI: RDI: 0003 [ 27.118925] RBP: 7ffd6d7f5cb0 R08: 2040 R09: 0008 [ 27.119773] R10: 0001 R11: 0286 R12: 00400480 [ 27.120614] R13: 7ffd6d7f5d90 R14: R15: [ 27.121490] irq event stamp: 5635 [ 27.121946] hardirqs last enabled at (5643): [] console_unlock+0x5c4/0x740 [ 27.123476] hardirqs last disabled at (5652): [] console_unlock+0x4e7/0x740 [ 27.125192] softirqs last enabled at (5272): [] __do_softirq+0x3c5/0x5aa [ 27.126430] softirqs last disabled at (5267): [] asm_call_irq_on_stack+0xf/0x20 [ 27.127634] ---[ end trace 289d7e28fa60f928 ]--- This is caused by calling io_cqring_overflow_flush() which may sleep after calling prepare_to_wait_exclusive() which set task state to TASK_INTERRUPTIBLE Reported-by: Abaci Fixes: 6c503150ae33 ("io_uring: patch up IOPOLL overflow_flush sync") Reviewed-by: Pavel Begunkov Signed-off-by: Hao Xu Signed-off-by: Jens Axboe Signed-off-by: Pavel Begunkov Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c |8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -7000,14 +7000,18 @@ static int io_cqring_wait(struct io_ring TASK_INTERRUPTIBLE); /* make sure we run task_work before checking for signals */ ret = io_run_task_work_sig(); - if (ret > 0) + if (ret > 0) { + finish_wait(&ctx->wait, &iowq.wq); continue; + } else if (ret < 0) break; if (io_should_wake(&iowq)) break; - if (test_bit(0, &ctx->cq_check_overflow)) + if (test_bit(0, &ctx->cq_check_overflow)) { + finish_wait(&ctx->wait, &iowq.wq); 
continue; +
[PATCH 4.19 01/24] tracing/kprobe: Fix to support kretprobe events on unloaded modules
From: Masami Hiramatsu

commit 97c753e62e6c31a404183898d950d8c08d752dbd upstream.

Fix kprobe_on_func_entry() to return an error code instead of false so that register_kretprobe() can return an appropriate error code.

append_trace_kprobe() expects kprobe registration to return -ENOENT when the target symbol is not found, and then checks whether the target module is unloaded. If the target module doesn't exist, it defers probing the target symbol until the module is loaded.

However, since register_kretprobe() returns -EINVAL instead of -ENOENT in that case, putting a kretprobe event on an unloaded module always fails. e.g.

Kprobe event:

  /sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
  [ 16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.

Kretprobe event: (p -> r)

  /sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
  sh: write error: Invalid argument
  /sys/kernel/debug/tracing # cat error_log
  [ 41.122514] trace_kprobe: error: Failed to register probe event
    Command: r xfs:xfs_end_io
               ^

To fix this bug, change kprobe_on_func_entry() to detect symbol lookup failure and return -ENOENT in that case. Otherwise it returns -EINVAL (the address is not a function entry) or 0 (success: the given address is a function entry).
Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@devnote2 Cc: sta...@vger.kernel.org Fixes: 59158ec4aef7 ("tracing/kprobes: Check the probe on unloaded module correctly") Reported-by: Jianlin Lv Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman --- include/linux/kprobes.h |2 +- kernel/kprobes.c| 34 +- kernel/trace/trace_kprobe.c |4 ++-- 3 files changed, 28 insertions(+), 12 deletions(-) --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -245,7 +245,7 @@ extern void kprobes_inc_nmissed_count(st extern bool arch_within_kprobe_blacklist(unsigned long addr); extern int arch_populate_kprobe_blacklist(void); extern bool arch_kprobe_on_func_entry(unsigned long offset); -extern bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset); +extern int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset); extern bool within_kprobe_blacklist(unsigned long addr); extern int kprobe_add_ksym_blacklist(unsigned long entry); --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1921,29 +1921,45 @@ bool __weak arch_kprobe_on_func_entry(un return !offset; } -bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset) +/** + * kprobe_on_func_entry() -- check whether given address is function entry + * @addr: Target address + * @sym: Target symbol name + * @offset: The offset from the symbol or the address + * + * This checks whether the given @addr+@offset or @sym+@offset is on the + * function entry address or not. + * This returns 0 if it is the function entry, or -EINVAL if it is not. + * And also it returns -ENOENT if it fails the symbol or address lookup. + * Caller must pass @addr or @sym (either one must be NULL), or this + * returns -EINVAL. 
+ */ +int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset) { kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset); if (IS_ERR(kp_addr)) - return false; + return PTR_ERR(kp_addr); - if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset) || - !arch_kprobe_on_func_entry(offset)) - return false; + if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset)) + return -ENOENT; - return true; + if (!arch_kprobe_on_func_entry(offset)) + return -EINVAL; + + return 0; } int register_kretprobe(struct kretprobe *rp) { - int ret = 0; + int ret; struct kretprobe_instance *inst; int i; void *addr; - if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset)) - return -EINVAL; + ret = kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset); + if (ret) + return ret; /* If only rp->kp.addr is specified, check reregistering kprobes */ if (rp->kp.addr && check_kprobe_rereg(&rp->kp)) --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -112,9 +112,9 @@ bool trace_kprobe_on_func_entry(struct t { struct trace_kprobe *tk = (struct trace_kprobe *)call->data; - return kprobe_on_func_entry(tk->rp.kp.addr, + return (kprobe_on_func_entry(tk->rp.kp.addr, tk->rp.kp.addr ? NULL : tk->rp.kp.symbol_name, - tk->rp.kp.addr ? 0 : tk->rp.kp.offset); + tk->rp.kp.addr ? 0 : tk->rp.kp.offset) == 0);
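The switch from a bool to a three-way int result can be illustrated with a small user-space model. Everything here is hypothetical: `sketch_loaded_syms` stands in for the kallsyms lookup, and `sketch_on_func_entry()`, `sketch_register_retprobe()` and `sketch_should_defer()` only mimic the return-code flow described in the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of symbols that are currently "loaded". */
static const char *const sketch_loaded_syms[] = { "vfs_read", "do_exit" };

/*
 * 0       : sym+offset is a function entry
 * -EINVAL : symbol exists but offset is not the entry
 * -ENOENT : symbol lookup failed (e.g. module not loaded yet)
 */
static int sketch_on_func_entry(const char *sym, unsigned long offset)
{
	size_t i;
	size_t n = sizeof(sketch_loaded_syms) / sizeof(sketch_loaded_syms[0]);

	for (i = 0; i < n; i++) {
		if (strcmp(sketch_loaded_syms[i], sym) == 0)
			return offset == 0 ? 0 : -EINVAL;
	}
	return -ENOENT;
}

/* Propagates the precise error instead of collapsing it to -EINVAL. */
static int sketch_register_retprobe(const char *sym, unsigned long offset)
{
	int ret = sketch_on_func_entry(sym, offset);

	if (ret)
		return ret;
	/* ... registration proper would happen here ... */
	return 0;
}

/* Caller-side policy: only a missing symbol is worth retrying later. */
static int sketch_should_defer(int ret)
{
	return ret == -ENOENT;
}
```

With a bool return the caller could not tell "wrong offset" from "symbol not present yet"; keeping the two codes apart is what lets the event on an unloaded module be deferred rather than rejected.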
[PATCH 5.10 17/54] gpiolib: cdev: clear debounce period if line set to output
From: Kent Gibson commit 03a58ea5905fdbd93ff9e52e670d802600ba38cd upstream. When set_config changes a line from input to output debounce is implicitly disabled, as debounce makes no sense for outputs, but the debounce period is not being cleared and is still reported in the line info. So clear the debounce period when the debouncer is stopped in edge_detector_stop(). Fixes: 65cff7046406 ("gpiolib: cdev: support setting debounce") Cc: sta...@vger.kernel.org Signed-off-by: Kent Gibson Reviewed-by: Linus Walleij Signed-off-by: Bartosz Golaszewski Signed-off-by: Greg Kroah-Hartman --- drivers/gpio/gpiolib-cdev.c |2 ++ 1 file changed, 2 insertions(+) --- a/drivers/gpio/gpiolib-cdev.c +++ b/drivers/gpio/gpiolib-cdev.c @@ -756,6 +756,8 @@ static void edge_detector_stop(struct li cancel_delayed_work_sync(&line->work); WRITE_ONCE(line->sw_debounced, 0); line->eflags = 0; + if (line->desc) + WRITE_ONCE(line->desc->debounce_period_us, 0); /* do not change line->level - see comment in debounced_value() */ }
[PATCH 4.19 24/24] squashfs: add more sanity checks in xattr id lookup
From: Phillip Lougher

commit 506220d2ba21791314af569211ffd8870b8208fa upstream.

Syzbot has reported a warning where a kmalloc() attempt exceeds the maximum limit. This has been identified as corruption of the xattr_ids count when reading the xattr id lookup table.

This patch adds a number of additional sanity checks to detect this corruption and others.

1. It checks for a corrupted xattr index read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This would cause an out of bounds read.

2. It checks against corruption of the xattr_ids count. This can lead either to the above kmalloc() failure or to a smaller than expected table being read.

3. It checks the contents of the index table for corruption.

[phil...@squashfs.org.uk: fix checkpatch issue] Link: https://lkml.kernel.org/r/270245655.754655.1612770082...@webmail.123-reg.co.uk Link: https://lkml.kernel.org/r/20210204130249.4495-5-phil...@squashfs.org.uk Signed-off-by: Phillip Lougher Reported-by: syzbot+2ccea6339d3683608...@syzkaller.appspotmail.com Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- fs/squashfs/xattr_id.c | 66 ++--- 1 file changed, 57 insertions(+), 9 deletions(-) --- a/fs/squashfs/xattr_id.c +++ b/fs/squashfs/xattr_id.c @@ -44,10 +44,15 @@ int squashfs_xattr_lookup(struct super_b struct squashfs_sb_info *msblk = sb->s_fs_info; int block = SQUASHFS_XATTR_BLOCK(index); int offset = SQUASHFS_XATTR_BLOCK_OFFSET(index); - u64 start_block = le64_to_cpu(msblk->xattr_id_table[block]); + u64 start_block; struct squashfs_xattr_id id; int err; + if (index >= msblk->xattr_ids) + return -EINVAL; + + start_block = le64_to_cpu(msblk->xattr_id_table[block]); + err = squashfs_read_metadata(sb, &id, &start_block, &offset, sizeof(id)); if (err < 0) @@ -63,13 +68,17 @@ int squashfs_xattr_lookup(struct super_b /* * Read uncompressed xattr
id lookup table indexes from disk into memory */ -__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start, +__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start, u64 *xattr_table_start, int *xattr_ids) { - unsigned int len; + struct squashfs_sb_info *msblk = sb->s_fs_info; + unsigned int len, indexes; struct squashfs_xattr_id_table *id_table; + __le64 *table; + u64 start, end; + int n; - id_table = squashfs_read_table(sb, start, sizeof(*id_table)); + id_table = squashfs_read_table(sb, table_start, sizeof(*id_table)); if (IS_ERR(id_table)) return (__le64 *) id_table; @@ -83,13 +92,52 @@ __le64 *squashfs_read_xattr_id_table(str if (*xattr_ids == 0) return ERR_PTR(-EINVAL); - /* xattr_table should be less than start */ - if (*xattr_table_start >= start) + len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids); + indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids); + + /* +* The computed size of the index table (len bytes) should exactly +* match the table start and end points +*/ + start = table_start + sizeof(*id_table); + end = msblk->bytes_used; + + if (len != (end - start)) return ERR_PTR(-EINVAL); - len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids); + table = squashfs_read_table(sb, start, len); + if (IS_ERR(table)) + return table; + + /* table[0], table[1], ... table[indexes - 1] store the locations +* of the compressed xattr id blocks. Each entry should be less than +* the next (i.e. table[0] < table[1]), and the difference between them +* should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1] +* should be less than table_start, and again the difference +* shouls be SQUASHFS_METADATA_SIZE or less. +* +* Finally xattr_table_start should be less than table[0]. 
+*/ + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } - TRACE("In read_xattr_index_table, length %d\n", len); + if (*xattr_table_start >= le64_to_cpu(table[0])) { + kfree(table); +
[PATCH 4.19 04/24] remoteproc: qcom_q6v5_mss: Validate modem blob firmware size before load
From: Sibi Sankar commit 135b9e8d1cd8ba5ac9ad9bcf24b464b7b052e5b8 upstream The following mem abort is observed when one of the modem blob firmware size exceeds the allocated mpss region. Fix this by restricting the copy size to segment size using request_firmware_into_buf before load. Err Logs: Unable to handle kernel paging request at virtual address Mem abort info: ... Call trace: __memcpy+0x110/0x180 rproc_start+0xd0/0x190 rproc_boot+0x404/0x550 state_store+0x54/0xf8 dev_attr_store+0x44/0x60 sysfs_kf_write+0x58/0x80 kernfs_fop_write+0x140/0x230 vfs_write+0xc4/0x208 ksys_write+0x74/0xf8 ... Reviewed-by: Bjorn Andersson Fixes: 051fb70fd4ea4 ("remoteproc: qcom: Driver for the self-authenticating Hexagon v5") Cc: sta...@vger.kernel.org Signed-off-by: Sibi Sankar Link: https://lore.kernel.org/r/20200722201047.12975-3-si...@codeaurora.org Signed-off-by: Bjorn Andersson [sudip: manual backport to old file path] Signed-off-by: Sudip Mukherjee Signed-off-by: Greg Kroah-Hartman --- drivers/remoteproc/qcom_q6v5_pil.c |5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) --- a/drivers/remoteproc/qcom_q6v5_pil.c +++ b/drivers/remoteproc/qcom_q6v5_pil.c @@ -739,14 +739,13 @@ static int q6v5_mpss_load(struct q6v5 *q if (phdr->p_filesz) { snprintf(seg_name, sizeof(seg_name), "modem.b%02d", i); - ret = request_firmware(&seg_fw, seg_name, qproc->dev); + ret = request_firmware_into_buf(&seg_fw, seg_name, qproc->dev, + ptr, phdr->p_filesz); if (ret) { dev_err(qproc->dev, "failed to load %s\n", seg_name); goto release_firmware; } - memcpy(ptr, seg_fw->data, seg_fw->size); - release_firmware(seg_fw); }
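The underlying rule, that the destination region's size rather than the blob's own size must bound the copy, can be sketched as below. `sketch_load_segment()` is a hypothetical helper; it does not reproduce the behaviour or error codes of request_firmware_into_buf(), which the patch text does not spell out.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a firmware blob into a fixed-size segment.
 * Returns 0 on success, -1 if the blob would overrun the segment.
 * (Hypothetical helper; the -1 value is an arbitrary choice.)
 */
static int sketch_load_segment(void *dst, size_t dst_size,
			       const void *blob, size_t blob_size)
{
	/* Trust the segment size, never the size the blob claims. */
	if (blob_size > dst_size)
		return -1;

	memcpy(dst, blob, blob_size);
	return 0;
}
```

The pre-patch code is the equivalent of calling memcpy() with `blob_size` unconditionally, which is what produced the abort in `__memcpy` shown in the log.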
[PATCH 4.19 23/24] squashfs: add more sanity checks in inode lookup
From: Phillip Lougher

commit eabac19e40c095543def79cb6ffeb3a8588aaff4 upstream.

Syzbot has reported a "slab-out-of-bounds read" error which has been identified as being caused by a corrupted "ino_num" value read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block).

This patch adds additional sanity checks to detect this, and the following corruption.

1. It checks against corruption of the inodes count. This can lead to either a larger or a smaller than expected table being read. In the case of a too large inodes count, this would often have been trapped by the existing sanity checks, but this patch introduces a more exact check, which can identify too small values.

2. It checks the contents of the index table for corruption.

[phil...@squashfs.org.uk: fix checkpatch issue] Link: https://lkml.kernel.org/r/527909353.754618.1612769948...@webmail.123-reg.co.uk Link: https://lkml.kernel.org/r/20210204130249.4495-4-phil...@squashfs.org.uk Signed-off-by: Phillip Lougher Reported-by: syzbot+04419e3ff19d2970e...@syzkaller.appspotmail.com Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- fs/squashfs/export.c | 41 + 1 file changed, 33 insertions(+), 8 deletions(-) --- a/fs/squashfs/export.c +++ b/fs/squashfs/export.c @@ -54,12 +54,17 @@ static long long squashfs_inode_lookup(s struct squashfs_sb_info *msblk = sb->s_fs_info; int blk = SQUASHFS_LOOKUP_BLOCK(ino_num - 1); int offset = SQUASHFS_LOOKUP_BLOCK_OFFSET(ino_num - 1); - u64 start = le64_to_cpu(msblk->inode_lookup_table[blk]); + u64 start; __le64 ino; int err; TRACE("Entered squashfs_inode_lookup, inode_number = %d\n", ino_num); + if (ino_num == 0 || (ino_num - 1) >= msblk->inodes) + return -EINVAL; + + start = le64_to_cpu(msblk->inode_lookup_table[blk]); + err = squashfs_read_metadata(sb, &ino, &start, &offset, sizeof(ino));
if (err < 0) return err; @@ -124,7 +129,10 @@ __le64 *squashfs_read_inode_lookup_table u64 lookup_table_start, u64 next_table, unsigned int inodes) { unsigned int length = SQUASHFS_LOOKUP_BLOCK_BYTES(inodes); + unsigned int indexes = SQUASHFS_LOOKUP_BLOCKS(inodes); + int n; __le64 *table; + u64 start, end; TRACE("In read_inode_lookup_table, length %d\n", length); @@ -134,20 +142,37 @@ __le64 *squashfs_read_inode_lookup_table if (inodes == 0) return ERR_PTR(-EINVAL); - /* length bytes should not extend into the next table - this check -* also traps instances where lookup_table_start is incorrectly larger -* than the next table start + /* +* The computed size of the lookup table (length bytes) should exactly +* match the table start and end points */ - if (lookup_table_start + length > next_table) + if (length != (next_table - lookup_table_start)) return ERR_PTR(-EINVAL); table = squashfs_read_table(sb, lookup_table_start, length); + if (IS_ERR(table)) + return table; /* -* table[0] points to the first inode lookup table metadata block, -* this should be less than lookup_table_start +* table[0], table[1], ... table[indexes - 1] store the locations +* of the compressed inode lookup blocks. Each entry should be +* less than the next (i.e. table[0] < table[1]), and the difference +* between them should be SQUASHFS_METADATA_SIZE or less. +* table[indexes - 1] should be less than lookup_table_start, and +* again the difference should be SQUASHFS_METADATA_SIZE or less */ - if (!IS_ERR(table) && le64_to_cpu(table[0]) >= lookup_table_start) { + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) { kfree(table); return ERR_PTR(-EINVAL); }
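The index table checks added by this patch can be illustrated outside the kernel. The following is only a minimal userspace sketch, not the squashfs code itself: the names METADATA_SIZE and index_table_is_sane() are hypothetical, the 8 KiB block size is an assumption standing in for SQUASHFS_METADATA_SIZE, and the entries are assumed to be already converted to CPU byte order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stands in for SQUASHFS_METADATA_SIZE (8 KiB). */
#define METADATA_SIZE 8192u

/*
 * Hypothetical helper: return true if table[0..indexes-1] is strictly
 * increasing, every gap between neighbours is at most METADATA_SIZE, and
 * the last entry lies at most METADATA_SIZE below table_start.
 */
static bool index_table_is_sane(const uint64_t *table, size_t indexes,
				uint64_t table_start)
{
	size_t n;

	if (indexes == 0)
		return false;

	/* Compare each entry with its successor. */
	for (n = 0; n + 1 < indexes; n++) {
		uint64_t start = table[n];
		uint64_t end = table[n + 1];

		if (start >= end || end - start > METADATA_SIZE)
			return false;
	}

	/* The final block must end where the index table itself begins. */
	return table[indexes - 1] < table_start &&
	       table_start - table[indexes - 1] <= METADATA_SIZE;
}
```

Writing the loop bound as n + 1 < indexes avoids the unsigned wrap that indexes - 1 would produce for an empty table; the kernel code rejects a zero count earlier instead.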
[PATCH 4.19 22/24] squashfs: add more sanity checks in id lookup
From: Phillip Lougher commit f37aa4c7366e23f91b81d00bafd6a7ab54e4a381 upstream. Sysbot has reported a number of "slab-out-of-bounds reads" and "use-after-free read" errors which has been identified as being caused by a corrupted index value read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This patch adds additional sanity checks to detect this, and the following corruption. 1. It checks against corruption of the ids count. This can either lead to a larger table to be read, or a smaller than expected table to be read. In the case of a too large ids count, this would often have been trapped by the existing sanity checks, but this patch introduces a more exact check, which can identify too small values. 2. It checks the contents of the index table for corruption. Link: https://lkml.kernel.org/r/20210204130249.4495-3-phil...@squashfs.org.uk Signed-off-by: Phillip Lougher Reported-by: syzbot+b06d57ba83f604522...@syzkaller.appspotmail.com Reported-by: syzbot+c021ba012da41ee98...@syzkaller.appspotmail.com Reported-by: syzbot+5024636e8b5fd19f0...@syzkaller.appspotmail.com Reported-by: syzbot+bcbc661df46657d0f...@syzkaller.appspotmail.com Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- fs/squashfs/id.c | 40 fs/squashfs/squashfs_fs_sb.h |1 + fs/squashfs/super.c |6 +++--- fs/squashfs/xattr.h | 10 +- 4 files changed, 45 insertions(+), 12 deletions(-) --- a/fs/squashfs/id.c +++ b/fs/squashfs/id.c @@ -48,10 +48,15 @@ int squashfs_get_id(struct super_block * struct squashfs_sb_info *msblk = sb->s_fs_info; int block = SQUASHFS_ID_BLOCK(index); int offset = SQUASHFS_ID_BLOCK_OFFSET(index); - u64 start_block = le64_to_cpu(msblk->id_table[block]); + u64 start_block; __le32 disk_id; int err; + if (index >= msblk->ids) + return -EINVAL; + + start_block = le64_to_cpu(msblk->id_table[block]); + err 
= squashfs_read_metadata(sb, &disk_id, &start_block, &offset, sizeof(disk_id)); if (err < 0) @@ -69,7 +74,10 @@ __le64 *squashfs_read_id_index_table(str u64 id_table_start, u64 next_table, unsigned short no_ids) { unsigned int length = SQUASHFS_ID_BLOCK_BYTES(no_ids); + unsigned int indexes = SQUASHFS_ID_BLOCKS(no_ids); + int n; __le64 *table; + u64 start, end; TRACE("In read_id_index_table, length %d\n", length); @@ -80,20 +88,36 @@ __le64 *squashfs_read_id_index_table(str return ERR_PTR(-EINVAL); /* -* length bytes should not extend into the next table - this check -* also traps instances where id_table_start is incorrectly larger -* than the next table start +* The computed size of the index table (length bytes) should exactly +* match the table start and end points */ - if (id_table_start + length > next_table) + if (length != (next_table - id_table_start)) return ERR_PTR(-EINVAL); table = squashfs_read_table(sb, id_table_start, length); + if (IS_ERR(table)) + return table; /* -* table[0] points to the first id lookup table metadata block, this -* should be less than id_table_start +* table[0], table[1], ... table[indexes - 1] store the locations +* of the compressed id blocks. Each entry should be less than +* the next (i.e. table[0] < table[1]), and the difference between them +* should be SQUASHFS_METADATA_SIZE or less. 
table[indexes - 1] +* should be less than id_table_start, and again the difference +* should be SQUASHFS_METADATA_SIZE or less */ - if (!IS_ERR(table) && le64_to_cpu(table[0]) >= id_table_start) { + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= id_table_start || (id_table_start - start) > SQUASHFS_METADATA_SIZE) { kfree(table); return ERR_PTR(-EINVAL); } --- a/fs/squashfs/squashfs_fs_sb.h +++ b/fs/squashfs/squashfs_fs_sb.h @@ -77,5 +77,6 @@ struct squashfs_sb_info { unsigned int inodes; unsigned int fragments; int xattr_ids; + unsigned int
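The "more exact check" on the ids count described in the changelog amounts to deriving the index table size from the count and requiring it to equal the distance between the two table start offsets. A rough userspace sketch follows; the helper names are hypothetical, the 8 KiB block size and 32-bit id width are assumptions, and the explicit next_table < table_start test is an addition of this sketch rather than something the patch contains.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: metadata block size of 8 KiB. */
#define METADATA_SIZE 8192u

/* Hypothetical: number of metadata blocks needed for `count` 32-bit ids. */
static unsigned int id_blocks(unsigned int count)
{
	uint64_t bytes = (uint64_t)count * sizeof(uint32_t);

	return (unsigned int)((bytes + METADATA_SIZE - 1) / METADATA_SIZE);
}

/*
 * Hypothetical: the index table holds one 64-bit offset per block, so its
 * size must match the gap between its start and the next table exactly.
 */
static bool id_table_length_matches(unsigned int count, uint64_t table_start,
				    uint64_t next_table)
{
	uint64_t length = (uint64_t)id_blocks(count) * sizeof(uint64_t);

	if (count == 0 || next_table < table_start)
		return false;

	return length == next_table - table_start;
}

/* Hypothetical: per-lookup bound check, done before indexing the table. */
static bool id_index_in_range(unsigned int index, unsigned int count)
{
	return index < count;
}
```

An equality test rejects both a count that is too large and one that is too small, which is the point the changelog makes about the older "does not extend into the next table" comparison.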
[PATCH 4.19 05/24] remoteproc: qcom_q6v5_mss: Validate MBA firmware size before load
From: Sibi Sankar commit e013f455d95add874f310dc47c608e8c70692ae5 upstream The following mem abort is observed when the mba firmware size exceeds the allocated mba region. MBA firmware size is restricted to a maximum size of 1M and remaining memory region is used by modem debug policy firmware when available. Hence verify whether the MBA firmware size lies within the allocated memory region and is not greater than 1M before loading. Err Logs: Unable to handle kernel paging request at virtual address Mem abort info: ... Call trace: __memcpy+0x110/0x180 rproc_start+0x40/0x218 rproc_boot+0x5b4/0x608 state_store+0x54/0xf8 dev_attr_store+0x44/0x60 sysfs_kf_write+0x58/0x80 kernfs_fop_write+0x140/0x230 vfs_write+0xc4/0x208 ksys_write+0x74/0xf8 __arm64_sys_write+0x24/0x30 ... Reviewed-by: Bjorn Andersson Fixes: 051fb70fd4ea4 ("remoteproc: qcom: Driver for the self-authenticating Hexagon v5") Cc: sta...@vger.kernel.org Signed-off-by: Sibi Sankar Link: https://lore.kernel.org/r/20200722201047.12975-2-si...@codeaurora.org Signed-off-by: Bjorn Andersson [sudip: manual backport to old file path] Signed-off-by: Sudip Mukherjee Signed-off-by: Greg Kroah-Hartman --- drivers/remoteproc/qcom_q6v5_pil.c |6 ++ 1 file changed, 6 insertions(+) --- a/drivers/remoteproc/qcom_q6v5_pil.c +++ b/drivers/remoteproc/qcom_q6v5_pil.c @@ -340,6 +340,12 @@ static int q6v5_load(struct rproc *rproc { struct q6v5 *qproc = rproc->priv; + /* MBA is restricted to a maximum size of 1M */ + if (fw->size > qproc->mba_size || fw->size > SZ_1M) { + dev_err(qproc->dev, "MBA firmware load failed\n"); + return -EINVAL; + } + memcpy(qproc->mba_region, fw->data, fw->size); return 0;
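The fix is the usual "validate the length before copying" pattern. As a minimal userspace sketch, assuming a hypothetical load_image() helper and a MAX_FW_SIZE constant standing in for the 1M limit (neither is part of the driver):

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stands in for the 1M cap on the MBA image. */
#define MAX_FW_SIZE (1024u * 1024u)

/*
 * Hypothetical helper: copy an image into a fixed region only if it fits
 * both the region and the global cap; otherwise refuse with -EINVAL.
 */
static int load_image(void *region, size_t region_size,
		      const void *data, size_t size)
{
	if (size > region_size || size > MAX_FW_SIZE)
		return -EINVAL;

	memcpy(region, data, size);
	return 0;
}
```

Both limits are checked because the region may be larger than the cap (the remainder being reserved for other firmware, as the changelog notes) or smaller than it.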
[PATCH 5.10 18/54] powerpc/64/signal: Fix regression in __kernel_sigtramp_rt64() semantics
From: Raoni Fassina Firmino commit 24321ac668e452a4942598533d267805f291fdc9 upstream. Commit 0138ba5783ae ("powerpc/64/signal: Balance return predictor stack in signal trampoline") changed __kernel_sigtramp_rt64() VDSO and trampoline code, and introduced a regression in the way glibc's backtrace()[1] detects the signal-handler stack frame. Apart from the practical implications, __kernel_sigtramp_rt64() was a VDSO function with the semantics that it is a function you can call from userspace to end signal handling. These semantics are no longer valid. I believe the aforementioned change affects all releases since 5.9. This patch tries to fix both the semantics and practical aspect of __kernel_sigtramp_rt64() returning it to the previous code, whilst keeping the intended behaviour of 0138ba5783ae by adding a new symbol to serve as the jump target from the kernel to the trampoline. Now the trampoline has two parts, a new entry point and the old return point. [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-January/223194.html Fixes: 0138ba5783ae ("powerpc/64/signal: Balance return predictor stack in signal trampoline") Cc: sta...@vger.kernel.org # v5.9+ Signed-off-by: Raoni Fassina Firmino Acked-by: Nicholas Piggin [mpe: Minor tweaks to change log formatting, add stable tag] Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210201200505.iz46ubcizipnkcxe@work-tp Signed-off-by: Greg Kroah-Hartman --- arch/powerpc/kernel/vdso.c |2 +- arch/powerpc/kernel/vdso64/sigtramp.S | 11 ++- arch/powerpc/kernel/vdso64/vdso64.lds.S |1 + 3 files changed, 12 insertions(+), 2 deletions(-) --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -475,7 +475,7 @@ static __init void vdso_setup_trampoline */ #ifdef CONFIG_PPC64 - vdso64_rt_sigtramp = find_function64(v64, "__kernel_sigtramp_rt64"); + vdso64_rt_sigtramp = find_function64(v64, "__kernel_start_sigtramp_rt64"); #endif vdso32_sigtramp = find_function32(v32, "__kernel_sigtramp32"); 
vdso32_rt_sigtramp = find_function32(v32, "__kernel_sigtramp_rt32"); --- a/arch/powerpc/kernel/vdso64/sigtramp.S +++ b/arch/powerpc/kernel/vdso64/sigtramp.S @@ -15,11 +15,20 @@ .text +/* + * __kernel_start_sigtramp_rt64 and __kernel_sigtramp_rt64 together + * are one function split in two parts. The kernel jumps to the former + * and the signal handler indirectly (by blr) returns to the latter. + * __kernel_sigtramp_rt64 needs to point to the return address so + * glibc can correctly identify the trampoline stack frame. + */ .balign 8 .balign IFETCH_ALIGN_BYTES -V_FUNCTION_BEGIN(__kernel_sigtramp_rt64) +V_FUNCTION_BEGIN(__kernel_start_sigtramp_rt64) .Lsigrt_start: bctrl /* call the handler */ +V_FUNCTION_END(__kernel_start_sigtramp_rt64) +V_FUNCTION_BEGIN(__kernel_sigtramp_rt64) addi r1, r1, __SIGNAL_FRAMESIZE li r0,__NR_rt_sigreturn sc --- a/arch/powerpc/kernel/vdso64/vdso64.lds.S +++ b/arch/powerpc/kernel/vdso64/vdso64.lds.S @@ -150,6 +150,7 @@ VERSION __kernel_get_tbfreq; __kernel_sync_dicache; __kernel_sync_dicache_p5; + __kernel_start_sigtramp_rt64; __kernel_sigtramp_rt64; __kernel_getcpu; __kernel_time;
Re: [PATCH v2 4/8] xen/netback: fix spurious event detection for common event case
On Thu, Feb 11, 2021 at 11:16:12AM +0100, Juergen Gross wrote: > In case of a common event for rx and tx queue the event should be > regarded to be spurious if no rx and no tx requests are pending. > > Unfortunately the condition for testing that is wrong causing to > decide a event being spurious if no rx OR no tx requests are > pending. > > Fix that plus using local variables for rx/tx pending indicators in > order to split function calls and if condition. > > Fixes: 23025393dbeb3b ("xen/netback: use lateeoi irq binding") > Signed-off-by: Juergen Gross Reviewed-by: Wei Liu
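The logic error described in the quoted changelog is the difference between "neither queue has work" and "at least one queue has no work". A tiny sketch with hypothetical function names (not the netback code) makes the two predicates explicit:

```c
#include <stdbool.h>

/* Intended rule: spurious only when neither rx nor tx has work pending. */
static bool event_is_spurious(bool rx_pending, bool tx_pending)
{
	return !rx_pending && !tx_pending;
}

/* The faulty rule from the changelog: spurious if either side is idle. */
static bool event_is_spurious_buggy(bool rx_pending, bool tx_pending)
{
	return !rx_pending || !tx_pending;
}
```

The two predicates disagree exactly when one queue has pending requests and the other does not, which is the case the fix is about.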
[PATCH 5.10 25/54] chtls: Fix potential resource leak
From: Pan Bian [ Upstream commit b6011966ac6f402847eb5326beee8da3a80405c7 ] The dst entry should be released if no neighbour is found. Goto label free_dst to fix the issue. Besides, the check of ndev against NULL is redundant. Signed-off-by: Pan Bian Link: https://lore.kernel.org/r/20210121145738.51091-1-bianpan2...@163.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- .../net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c| 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c index 5beec901713fb..a262c949ed76b 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c @@ -1158,11 +1158,9 @@ static struct sock *chtls_recv_sock(struct sock *lsk, #endif } if (!n || !n->dev) - goto free_sk; + goto free_dst; ndev = n->dev; - if (!ndev) - goto free_dst; if (is_vlan_dev(ndev)) ndev = vlan_dev_real_dev(ndev); @@ -1249,7 +1247,8 @@ static struct sock *chtls_recv_sock(struct sock *lsk, free_csk: chtls_sock_release(&csk->kref); free_dst: - neigh_release(n); + if (n) + neigh_release(n); dst_release(dst); free_sk: inet_csk_prepare_forced_close(newsk); -- 2.27.0
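The change relies on the common goto-based unwind pattern, where each label releases what was acquired before the jump and a possibly-NULL resource is guarded at the label. The sketch below is a userspace model under stated assumptions: struct res, res_put() and setup() are hypothetical, and plain counters stand in for the dst and neighbour reference counts.

```c
/* Hypothetical reference-counted resource. */
struct res {
	int refs;
};

static void res_put(struct res *r)
{
	r->refs--;
}

/*
 * Hypothetical setup path: takes a reference on dst, then on neigh if one
 * was found. On any failure, every reference taken so far is dropped.
 * Returns 0 on success and -1 on failure.
 */
static int setup(struct res *dst, struct res *neigh, int fail_late)
{
	dst->refs++;

	if (!neigh)
		goto free_dst;	/* no neighbour: dst must still be released */

	neigh->refs++;

	if (fail_late)
		goto free_dst;	/* later failure: release both */

	return 0;

free_dst:
	if (neigh)		/* neigh may be NULL on the first path */
		res_put(neigh);
	res_put(dst);
	return -1;
}
```

Jumping to a label further down the unwind chain than the resources actually held is what leaks the dst entry in the original code; guarding the neighbour release lets one label serve both failure paths.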
[PATCH 5.10 29/54] ASoC: ak4458: correct reset polarity
From: Eliot Blennerhassett [ Upstream commit e953daeb68b1abd8a7d44902786349fdeef5c297 ] Reset (aka power off) happens when the reset gpio is made active. Change function name to ak4458_reset to match devicetree property "reset-gpios" Signed-off-by: Eliot Blennerhassett Reviewed-by: Linus Walleij Link: https://lore.kernel.org/r/ce650f47-4ff6-e486-7846-cc3d033f3...@blennerhassett.gen.nz Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/codecs/ak4458.c | 22 +++--- 1 file changed, 7 insertions(+), 15 deletions(-) diff --git a/sound/soc/codecs/ak4458.c b/sound/soc/codecs/ak4458.c index 1010c9ee2e836..472caad17012e 100644 --- a/sound/soc/codecs/ak4458.c +++ b/sound/soc/codecs/ak4458.c @@ -595,18 +595,10 @@ static struct snd_soc_dai_driver ak4497_dai = { .ops = &ak4458_dai_ops, }; -static void ak4458_power_off(struct ak4458_priv *ak4458) +static void ak4458_reset(struct ak4458_priv *ak4458, bool active) { if (ak4458->reset_gpiod) { - gpiod_set_value_cansleep(ak4458->reset_gpiod, 0); - usleep_range(1000, 2000); - } -} - -static void ak4458_power_on(struct ak4458_priv *ak4458) -{ - if (ak4458->reset_gpiod) { - gpiod_set_value_cansleep(ak4458->reset_gpiod, 1); + gpiod_set_value_cansleep(ak4458->reset_gpiod, active); usleep_range(1000, 2000); } } @@ -620,7 +612,7 @@ static int ak4458_init(struct snd_soc_component *component) if (ak4458->mute_gpiod) gpiod_set_value_cansleep(ak4458->mute_gpiod, 1); - ak4458_power_on(ak4458); + ak4458_reset(ak4458, false); ret = snd_soc_component_update_bits(component, AK4458_00_CONTROL1, 0x80, 0x80); /* ACKS bit = 1; 1000 */ @@ -650,7 +642,7 @@ static void ak4458_remove(struct snd_soc_component *component) { struct ak4458_priv *ak4458 = snd_soc_component_get_drvdata(component); - ak4458_power_off(ak4458); + ak4458_reset(ak4458, true); } #ifdef CONFIG_PM @@ -660,7 +652,7 @@ static int __maybe_unused ak4458_runtime_suspend(struct device *dev) regcache_cache_only(ak4458->regmap, true); - ak4458_power_off(ak4458); + 
ak4458_reset(ak4458, true); if (ak4458->mute_gpiod) gpiod_set_value_cansleep(ak4458->mute_gpiod, 0); @@ -685,8 +677,8 @@ static int __maybe_unused ak4458_runtime_resume(struct device *dev) if (ak4458->mute_gpiod) gpiod_set_value_cansleep(ak4458->mute_gpiod, 1); - ak4458_power_off(ak4458); - ak4458_power_on(ak4458); + ak4458_reset(ak4458, true); + ak4458_reset(ak4458, false); regcache_cache_only(ak4458->regmap, false); regcache_mark_dirty(ak4458->regmap); -- 2.27.0
[PATCH 5.10 21/54] ASoC: wm_adsp: Fix control name parsing for multi-fw
From: James Schulman [ Upstream commit a8939f2e138e418c2b059056ff5b501eaf2eae54 ] When switching between firmware types, the wrong control can be selected when requesting control in kernel API. Use the currently selected DSP firmware type to select the proper mixer control. Signed-off-by: James Schulman Acked-by: Charles Keepax Link: https://lore.kernel.org/r/20210115201105.14075-1-james.schul...@cirrus.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/codecs/wm_adsp.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sound/soc/codecs/wm_adsp.c b/sound/soc/codecs/wm_adsp.c index dec8716aa8ef5..985b2dcecf138 100644 --- a/sound/soc/codecs/wm_adsp.c +++ b/sound/soc/codecs/wm_adsp.c @@ -2031,11 +2031,14 @@ static struct wm_coeff_ctl *wm_adsp_get_ctl(struct wm_adsp *dsp, unsigned int alg) { struct wm_coeff_ctl *pos, *rslt = NULL; + const char *fw_txt = wm_adsp_fw_text[dsp->fw]; list_for_each_entry(pos, &dsp->ctl_list, list) { if (!pos->subname) continue; if (strncmp(pos->subname, name, pos->subname_len) == 0 && + strncmp(pos->fw_name, fw_txt, + SNDRV_CTL_ELEM_ID_NAME_MAXLEN) == 0 && pos->alg_region.alg == alg && pos->alg_region.type == type) { rslt = pos; -- 2.27.0
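The effect of the extra comparison can be shown with a simplified lookup that keys on both the firmware name and the control subname. Everything here is hypothetical (struct ctl, find_ctl(), the firmware names "fw_a" and "fw_b"), and the sketch uses whole-string strcmp() where the driver uses length-bounded strncmp():

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical control descriptor. */
struct ctl {
	const char *fw_name;
	const char *subname;
};

/*
 * Hypothetical lookup: a control matches only if both its subname and the
 * firmware it belongs to match, so identically named controls from two
 * firmware types are not confused.
 */
static const struct ctl *find_ctl(const struct ctl *ctls, size_t n,
				  const char *fw_name, const char *subname)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(ctls[i].subname, subname) == 0 &&
		    strcmp(ctls[i].fw_name, fw_name) == 0)
			return &ctls[i];
	}

	return NULL;
}
```

Without the firmware comparison, the result for a shared subname would depend only on list order.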
[PATCH 5.10 24/54] ASoC: Intel: Skylake: Zero snd_ctl_elem_value
From: Ricardo Ribalda [ Upstream commit 1d8fe0648e118fd495a2cb393a34eb8d428e7808 ] Clear struct snd_ctl_elem_value before calling ->put() to avoid any data leak. Signed-off-by: Ricardo Ribalda Reviewed-by: Cezary Rojewski Reviewed-by: Andy Shevchenko Link: https://lore.kernel.org/r/20210121171644.131059-2-riba...@chromium.org Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/intel/skylake/skl-topology.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/intel/skylake/skl-topology.c b/sound/soc/intel/skylake/skl-topology.c index d699e61eca3d0..0955cbb4e9187 100644 --- a/sound/soc/intel/skylake/skl-topology.c +++ b/sound/soc/intel/skylake/skl-topology.c @@ -3632,7 +3632,7 @@ static void skl_tplg_complete(struct snd_soc_component *component) sprintf(chan_text, "c%d", mach->mach_params.dmic_num); for (i = 0; i < se->items; i++) { - struct snd_ctl_elem_value val; + struct snd_ctl_elem_value val = {}; if (strstr(texts[i], chan_text)) { val.value.enumerated.item[0] = i; -- 2.27.0
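The one-line fix zero-initialises a stack structure before only one of its fields is assigned, so the remaining bytes do not carry stale stack contents into the callee. A small sketch with a hypothetical struct elem_value (much smaller than the real snd_ctl_elem_value) is shown below; it uses the standard = {0} initialiser, whereas the patch uses the empty-brace form = {} accepted by the kernel's GNU C dialect.

```c
/* Hypothetical, simplified stand-in for a control value structure. */
struct elem_value {
	int item[4];
	unsigned char reserved[16];
};

/*
 * Build a value with only item[0] set. The initialiser guarantees every
 * other member is zero rather than whatever was on the stack.
 */
static struct elem_value make_value(int first_item)
{
	struct elem_value val = {0};

	val.item[0] = first_item;
	return val;
}
```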
[PATCH 5.10 16/54] io_uring: drop mm/files between task_work_submit
From: Pavel Begunkov [ Upstream commit aec18a57edad562d620f7d19016de1fc0cc2208c ] Since SQPOLL task can be shared and so task_work entries can be a mix of them, we need to drop mm and files before trying to issue next request. Cc: sta...@vger.kernel.org # 5.10+ Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c |3 +++ 1 file changed, 3 insertions(+) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2084,6 +2084,9 @@ static void __io_req_task_submit(struct else __io_req_task_cancel(req, -EFAULT); mutex_unlock(&ctx->uring_lock); + + if (ctx->flags & IORING_SETUP_SQPOLL) + io_sq_thread_drop_mm(); } static void io_req_task_submit(struct callback_head *cb)
[PATCH 5.10 04/54] io_uring: pass files into kill timeouts/poll
From: Pavel Begunkov [ Upstream commit 6b81928d4ca8668513251f9c04cdcb9d38ef51c7 ] Make io_poll_remove_all() and io_kill_timeouts() match against files as well. This is a preparation patch; the new argument is effectively unused for now. Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1508,14 +1508,15 @@ static bool io_task_match(struct io_kioc /* * Returns true if we found and killed one or more timeouts */ -static bool io_kill_timeouts(struct io_ring_ctx *ctx, struct task_struct *tsk) +static bool io_kill_timeouts(struct io_ring_ctx *ctx, struct task_struct *tsk, +struct files_struct *files) { struct io_kiocb *req, *tmp; int canceled = 0; spin_lock_irq(&ctx->completion_lock); list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) { - if (io_task_match(req, tsk)) { + if (io_match_task(req, tsk, files)) { io_kill_timeout(req); canceled++; } @@ -5312,7 +5313,8 @@ static bool io_poll_remove_one(struct io /* * Returns true if we found and killed one or more poll requests */ -static bool io_poll_remove_all(struct io_ring_ctx *ctx, struct task_struct *tsk) +static bool io_poll_remove_all(struct io_ring_ctx *ctx, struct task_struct *tsk, + struct files_struct *files) { struct hlist_node *tmp; struct io_kiocb *req; @@ -5324,7 +5326,7 @@ static bool io_poll_remove_all(struct io list = &ctx->cancel_hash[i]; hlist_for_each_entry_safe(req, tmp, list, hash_node) { - if (io_task_match(req, tsk)) + if (io_match_task(req, tsk, files)) posted += io_poll_remove_one(req); } } @@ -8485,8 +8487,8 @@ static void io_ring_ctx_wait_and_kill(st __io_cqring_overflow_flush(ctx, true, NULL, NULL); mutex_unlock(&ctx->uring_lock); - io_kill_timeouts(ctx, NULL); - io_poll_remove_all(ctx, NULL); + io_kill_timeouts(ctx, NULL, NULL); + io_poll_remove_all(ctx, NULL, NULL); if (ctx->io_wq) io_wq_cancel_cb(ctx->io_wq, io_cancel_ctx_cb, ctx, true); @@ 
-8721,8 +8723,8 @@ static void __io_uring_cancel_task_reque } } - ret |= io_poll_remove_all(ctx, task); - ret |= io_kill_timeouts(ctx, task); + ret |= io_poll_remove_all(ctx, task, NULL); + ret |= io_kill_timeouts(ctx, task, NULL); if (!ret) break; io_run_task_work();
[PATCH 5.10 02/54] io_uring: add a {task,files} pair matching helper
From: Pavel Begunkov [ Upstream commit 08d23634643c239ddae706758f54d3a8e0c24962 ] Add io_match_task() that matches both task and files. Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c | 63 +- 1 file changed, 32 insertions(+), 31 deletions(-) --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -997,6 +997,36 @@ static inline void io_clean_op(struct io __io_clean_op(req); } +static inline bool __io_match_files(struct io_kiocb *req, + struct files_struct *files) +{ + return ((req->flags & REQ_F_WORK_INITIALIZED) && + (req->work.flags & IO_WQ_WORK_FILES)) && + req->work.identity->files == files; +} + +static bool io_match_task(struct io_kiocb *head, + struct task_struct *task, + struct files_struct *files) +{ + struct io_kiocb *link; + + if (task && head->task != task) + return false; + if (!files) + return true; + if (__io_match_files(head, files)) + return true; + if (head->flags & REQ_F_LINK_HEAD) { + list_for_each_entry(link, &head->link_list, link_list) { + if (__io_match_files(link, files)) + return true; + } + } + return false; +} + + static void io_sq_thread_drop_mm(void) { struct mm_struct *mm = current->mm; @@ -1612,32 +1642,6 @@ static void io_cqring_mark_overflow(stru } } -static inline bool __io_match_files(struct io_kiocb *req, - struct files_struct *files) -{ - return ((req->flags & REQ_F_WORK_INITIALIZED) && - (req->work.flags & IO_WQ_WORK_FILES)) && - req->work.identity->files == files; -} - -static bool io_match_files(struct io_kiocb *req, - struct files_struct *files) -{ - struct io_kiocb *link; - - if (!files) - return true; - if (__io_match_files(req, files)) - return true; - if (req->flags & REQ_F_LINK_HEAD) { - list_for_each_entry(link, &req->link_list, link_list) { - if (__io_match_files(link, files)) - return true; - } - } - return false; -} - /* Returns true if there are no backlogged entries after the flush */ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force, struct 
task_struct *tsk, @@ -1659,9 +1663,7 @@ static bool __io_cqring_overflow_flush(s cqe = NULL; list_for_each_entry_safe(req, tmp, &ctx->cq_overflow_list, compl.list) { - if (tsk && req->task != tsk) - continue; - if (!io_match_files(req, files)) + if (!io_match_task(req, tsk, files)) continue; cqe = io_get_cqring(ctx); @@ -8635,8 +8637,7 @@ static void io_cancel_defer_files(struct spin_lock_irq(&ctx->completion_lock); list_for_each_entry_reverse(de, &ctx->defer_list, list) { - if (io_task_match(de->req, task) && - io_match_files(de->req, files)) { + if (io_match_task(de->req, task, files)) { list_cut_position(&list, &ctx->defer_list, &de->list); break; }
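The matching rules of the new helper can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the io_uring code: struct req, match_files() and match_task() are hypothetical, opaque pointers stand in for the task and files structures, a NULL files pointer stands in for "this request does not pin a file table", and a singly linked chain stands in for the link list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: owner task, pinned file table, next linked request. */
struct req {
	const void *task;
	const void *files;	/* NULL if the request pins no file table */
	struct req *next_link;	/* next request in the link chain, or NULL */
};

static bool match_files(const struct req *r, const void *files)
{
	return r->files != NULL && r->files == files;
}

/*
 * A NULL task matches any task and a NULL files matches any request.
 * Otherwise the head or any request linked to it must pin `files`.
 */
static bool match_task(const struct req *head, const void *task,
		       const void *files)
{
	const struct req *link;

	if (task && head->task != task)
		return false;
	if (!files)
		return true;
	if (match_files(head, files))
		return true;

	for (link = head->next_link; link; link = link->next_link) {
		if (match_files(link, files))
			return true;
	}

	return false;
}
```

Folding the task test into the same helper is what lets the callers in the diff replace a pair of checks with a single call.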
[PATCH 5.10 20/54] regulator: core: avoid regulator_resolve_supply() race condition
From: David Collins [ Upstream commit eaa7995c529b54d68d97a30f6344cc6ca2f214a7 ] The final step in regulator_register() is to call regulator_resolve_supply() for each registered regulator (including the one in the process of being registered). The regulator_resolve_supply() function first checks if rdev->supply is NULL, then it performs various steps to try to find the supply. If successful, rdev->supply is set inside of set_supply(). This procedure can encounter a race condition if two concurrent tasks call regulator_register() near to each other on separate CPUs and one of the regulators has rdev->supply_name specified. There is currently nothing guaranteeing atomicity between the rdev->supply check and set steps. Thus, both tasks can observe rdev->supply==NULL in their regulator_resolve_supply() calls. This then results in both creating a struct regulator for the supply. One ends up actually stored in rdev->supply and the other is lost (though still present in the supply's consumer_list). Here is a kernel log snippet showing the issue: [ 12.421768] gpu_cc_gx_gdsc: supplied by pm8350_s5_level [ 12.425854] gpu_cc_gx_gdsc: supplied by pm8350_s5_level [ 12.429064] debugfs: Directory 'regulator.4-SUPPLY' with parent '17a0.rsc:rpmh-regulator-gfxlvl-pm8350_s5_level' already present! Avoid this race condition by holding the rdev->mutex lock inside of regulator_resolve_supply() while checking and setting rdev->supply. 
Signed-off-by: David Collins Link: https://lore.kernel.org/r/1610068562-4410-1-git-send-email-colli...@codeaurora.org Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/regulator/core.c | 39 --- 1 file changed, 28 insertions(+), 11 deletions(-) diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 42bbd99a36acf..2c31f04ff950f 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1813,23 +1813,34 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) { struct regulator_dev *r; struct device *dev = rdev->dev.parent; - int ret; + int ret = 0; /* No supply to resolve? */ if (!rdev->supply_name) return 0; - /* Supply already resolved? */ + /* Supply already resolved? (fast-path without locking contention) */ if (rdev->supply) return 0; + /* +* Recheck rdev->supply with rdev->mutex lock held to avoid a race +* between rdev->supply null check and setting rdev->supply in +* set_supply() from concurrent tasks. +*/ + regulator_lock(rdev); + + /* Supply just resolved by a concurrent task? */ + if (rdev->supply) + goto out; + r = regulator_dev_lookup(dev, rdev->supply_name); if (IS_ERR(r)) { ret = PTR_ERR(r); /* Did the lookup explicitly defer for us? 
*/ if (ret == -EPROBE_DEFER) - return ret; + goto out; if (have_full_constraints()) { r = dummy_regulator_rdev; @@ -1837,15 +1848,18 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) } else { dev_err(dev, "Failed to resolve %s-supply for %s\n", rdev->supply_name, rdev->desc->name); - return -EPROBE_DEFER; + ret = -EPROBE_DEFER; + goto out; } } if (r == rdev) { dev_err(dev, "Supply for %s (%s) resolved to itself\n", rdev->desc->name, rdev->supply_name); - if (!have_full_constraints()) - return -EINVAL; + if (!have_full_constraints()) { + ret = -EINVAL; + goto out; + } r = dummy_regulator_rdev; get_device(&r->dev); } @@ -1859,7 +1873,8 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) if (r->dev.parent && r->dev.parent != rdev->dev.parent) { if (!device_is_bound(r->dev.parent)) { put_device(&r->dev); - return -EPROBE_DEFER; + ret = -EPROBE_DEFER; + goto out; } } @@ -1867,13 +1882,13 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) ret = regulator_resolve_supply(r); if (ret < 0) { put_device(&r->dev); - return ret; + goto out; } ret = set_supply(rdev, r); if (ret < 0) { put_device(&r->dev); - return ret; + goto out; } /* @@ -1886,11 +1901,13 @@ static int regulator_resolve_supply(struct regulator_dev
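The shape of the fix is a check, lock, recheck sequence with a single unlock point. The sketch below models it with a POSIX mutex; struct dev, resolve_supply() and the resolve_calls counter are hypothetical, and the unlocked fast-path read mirrors what the patch does in the kernel rather than being something strict C11 code should copy without atomics.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical device with a lazily resolved supply pointer. */
struct dev {
	pthread_mutex_t lock;
	void *supply;
	int resolve_calls;	/* counts how often the slow path set supply */
};

/*
 * Resolve the supply at most once. Returns 0 on success (including
 * "already resolved") and -1 if no candidate is available.
 */
static int resolve_supply(struct dev *d, void *candidate)
{
	int ret = 0;

	/* Fast path: already resolved, no need to take the lock. */
	if (d->supply)
		return 0;

	pthread_mutex_lock(&d->lock);

	/* Recheck under the lock: a concurrent caller may have won. */
	if (d->supply)
		goto out;

	if (!candidate) {
		ret = -1;
		goto out;
	}

	d->supply = candidate;
	d->resolve_calls++;

out:
	pthread_mutex_unlock(&d->lock);
	return ret;
}
```

Converting every early return into a jump to one exit label, as the diff does, is what guarantees the lock is released on each path.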
Re: [PATCH v1 0/9] x86/platform: Remove SFI framework and users
On Thu, Feb 11, 2021 at 2:50 PM Andy Shevchenko wrote: > > This is last part of Intel MID (SFI based) removal. We have no more users of > it > in the kernel and since SFI has been marked Obsolete for a few years already, > Remove all the stuff altogether. > > Note, the more recent platforms (Intel Merrifield and Moorefield) still work > as > long as they provide correct ACPI tables. > > The series requires two prerequisite branches to be pulled first, i.e. > - one form Rafael's PM tree (currently bleeding-edge) > - one form TIP tree (x86/platform), actually only one patch is needed from it > > Due to above it's convenient to proceed all of these via Rafael's PM tree, > > Note, atomisp change is tagged by Sakari on behalf of media tree maintainers. > > Andy Shevchenko (9): > media: atomisp: Remove unused header > cpufreq: sfi-cpufreq: Remove driver for deprecated firmware > sfi: Remove framework for deprecated firmware > x86/PCI: Get rid of custom x86 model comparison > x86/PCI: Describe @reg for type1_access_ok() > x86/platform/intel-mid: Get rid of intel_scu_ipc_legacy.h > x86/platform/intel-mid: Drop unused __intel_mid_cpu_chip and Co. 
> x86/platform/intel-mid: Remove unused header inclusion in intel-mid.h > x86/platform/intel-mid: Update Copyright year and drop file names > > Documentation/ABI/testing/sysfs-firmware-sfi | 15 - > Documentation/ABI/testing/sysfs-platform-kim | 2 +- > MAINTAINERS | 7 - > arch/x86/Kconfig | 7 +- > arch/x86/include/asm/intel-mid.h | 65 +-- > arch/x86/include/asm/intel_scu_ipc.h | 2 - > arch/x86/include/asm/intel_scu_ipc_legacy.h | 74 --- > arch/x86/include/asm/platform_sst_audio.h | 2 - > arch/x86/kernel/apic/io_apic.c| 4 +- > arch/x86/kernel/setup.c | 2 - > arch/x86/pci/intel_mid_pci.c | 18 +- > arch/x86/pci/mmconfig-shared.c| 6 +- > arch/x86/platform/Makefile| 1 - > arch/x86/platform/intel-mid/Makefile | 5 - > .../platform/intel-mid/device_libs/Makefile | 23 - > .../intel-mid/device_libs/platform_bcm43xx.c | 101 > .../intel-mid/device_libs/platform_bma023.c | 16 - > .../intel-mid/device_libs/platform_bt.c | 101 > .../intel-mid/device_libs/platform_emc1403.c | 39 -- > .../device_libs/platform_gpio_keys.c | 81 --- > .../intel-mid/device_libs/platform_lis331.c | 37 -- > .../intel-mid/device_libs/platform_max7315.c | 77 --- > .../intel-mid/device_libs/platform_mpu3050.c | 32 -- > .../device_libs/platform_mrfld_pinctrl.c | 39 -- > .../device_libs/platform_mrfld_rtc.c | 44 -- > .../intel-mid/device_libs/platform_mrfld_sd.c | 43 -- > .../device_libs/platform_mrfld_spidev.c | 50 -- > .../device_libs/platform_pcal9555a.c | 95 > .../intel-mid/device_libs/platform_tc35876x.c | 42 -- > .../intel-mid/device_libs/platform_tca6416.c | 53 -- > arch/x86/platform/intel-mid/intel-mid.c | 27 +- > arch/x86/platform/intel-mid/sfi.c | 419 -- > arch/x86/platform/sfi/Makefile| 2 - > arch/x86/platform/sfi/sfi.c | 100 > drivers/Makefile | 2 +- > drivers/cpufreq/Kconfig.x86 | 10 - > drivers/cpufreq/Makefile | 1 - > drivers/cpufreq/sfi-cpufreq.c | 127 - > drivers/platform/x86/intel_scu_pcidrv.c | 22 +- > drivers/sfi/Kconfig | 18 - > drivers/sfi/Makefile | 4 - > drivers/sfi/sfi_acpi.c| 214 --- 
> drivers/sfi/sfi_core.c| 522 -- > drivers/sfi/sfi_core.h| 81 --- > .../atomisp/include/linux/atomisp_platform.h | 1 - > include/linux/sfi.h | 210 --- > include/linux/sfi_acpi.h | 93 > init/main.c | 2 - > 48 files changed, 37 insertions(+), 2901 deletions(-) > delete mode 100644 Documentation/ABI/testing/sysfs-firmware-sfi > delete mode 100644 arch/x86/include/asm/intel_scu_ipc_legacy.h > delete mode 100644 arch/x86/platform/intel-mid/device_libs/Makefile > delete mode 100644 arch/x86/platform/intel-mid/device_libs/platform_bcm43xx.c > delete mode 100644 arch/x86/platform/intel-mid/device_libs/platform_bma023.c > delete mode 100644 arch/x86/platform/intel-mid/device_libs/platform_bt.c > delete mode 100644 arch/x86/platform/intel-mid/device_libs/platform_emc1403.c > delete mode 100644 > arch/x86/platform/intel-mid/device_libs/platform_gpio_keys.c > delete mode 100644 arch/x86/platform/intel-mid/device_libs/platform_lis331.c > delete mode 100644 arch/x86/platform/intel-mid/device_libs/platform_max7315.c > delete mode 100644 arch/x86
[PATCH 5.10 03/54] io_uring: dont iterate io_uring_cancel_files()
From: Pavel Begunkov

[ Upstream commit b52fda00dd9df8b4a6de5784df94f9617f6133a1 ]

io_uring_cancel_files() guarantees to cancel all matching requests, so
it's not necessary to call it in a loop. Move it up in the callchain into
io_uring_cancel_task_requests().

Signed-off-by: Pavel Begunkov
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman
---
 fs/io_uring.c | 34 --
 1 file changed, 12 insertions(+), 22 deletions(-)

--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8654,16 +8654,10 @@ static void io_cancel_defer_files(struct
 	}
 }
 
-/*
- * Returns true if we found and killed one or more files pinning requests
- */
-static bool io_uring_cancel_files(struct io_ring_ctx *ctx,
+static void io_uring_cancel_files(struct io_ring_ctx *ctx,
 				  struct task_struct *task,
 				  struct files_struct *files)
 {
-	if (list_empty_careful(&ctx->inflight_list))
-		return false;
-
 	while (!list_empty_careful(&ctx->inflight_list)) {
 		struct io_kiocb *cancel_req = NULL, *req;
 		DEFINE_WAIT(wait);
@@ -8698,8 +8692,6 @@ static bool io_uring_cancel_files(struct
 		schedule();
 		finish_wait(&ctx->inflight_wait, &wait);
 	}
-
-	return true;
 }
 
 static bool io_cancel_task_cb(struct io_wq_work *work, void *data)
@@ -8710,15 +8702,12 @@ static bool io_cancel_task_cb(struct io_
 	return io_task_match(req, task);
 }
 
-static bool __io_uring_cancel_task_requests(struct io_ring_ctx *ctx,
-					    struct task_struct *task,
-					    struct files_struct *files)
+static void __io_uring_cancel_task_requests(struct io_ring_ctx *ctx,
+					    struct task_struct *task)
 {
-	bool ret;
-
-	ret = io_uring_cancel_files(ctx, task, files);
-	if (!files) {
+	while (1) {
 		enum io_wq_cancel cret;
+		bool ret = false;
 
 		cret = io_wq_cancel_cb(ctx->io_wq, io_cancel_task_cb, task, true);
 		if (cret != IO_WQ_CANCEL_NOTFOUND)
@@ -8734,9 +8723,11 @@ static bool __io_uring_cancel_task_reque
 
 		ret |= io_poll_remove_all(ctx, task);
 		ret |= io_kill_timeouts(ctx, task);
+		if (!ret)
+			break;
+		io_run_task_work();
+		cond_resched();
 	}
-
-	return ret;
 }
 
 static void io_disable_sqo_submit(struct io_ring_ctx *ctx)
@@ -8771,11 +8762,10 @@ static void io_uring_cancel_task_request
 
 	io_cancel_defer_files(ctx, task, files);
 	io_cqring_overflow_flush(ctx, true, task, files);
+	io_uring_cancel_files(ctx, task, files);
 
-	while (__io_uring_cancel_task_requests(ctx, task, files)) {
-		io_run_task_work();
-		cond_resched();
-	}
+	if (!files)
+		__io_uring_cancel_task_requests(ctx, task);
 
 	if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) {
 		atomic_dec(&task->io_uring->in_idle);
[PATCH 5.10 28/54] ALSA: hda: intel-dsp-config: add PCI id for TGL-H
From: Bard Liao

[ Upstream commit c5b5ff607d6fe5f4284acabd07066f96ecf96ac4 ]

Adding PCI id for TGL-H. Like for other TGL platforms, SOF is used if
Soundwire codecs or PCH-DMIC is detected.

Signed-off-by: Bard Liao
Reviewed-by: Xiuli Pan
Reviewed-by: Libin Yang
Signed-off-by: Kai Vehmanen
Link: https://lore.kernel.org/r/20210125083051.828205-1-kai.vehma...@linux.intel.com
Signed-off-by: Takashi Iwai
Signed-off-by: Sasha Levin
---
 sound/hda/intel-dsp-config.c | 4
 1 file changed, 4 insertions(+)

diff --git a/sound/hda/intel-dsp-config.c b/sound/hda/intel-dsp-config.c
index 1c5114dedda92..fe49e9a97f0ec 100644
--- a/sound/hda/intel-dsp-config.c
+++ b/sound/hda/intel-dsp-config.c
@@ -306,6 +306,10 @@ static const struct config_entry config_table[] = {
 		.flags = FLAG_SOF | FLAG_SOF_ONLY_IF_DMIC_OR_SOUNDWIRE,
 		.device = 0xa0c8,
 	},
+	{
+		.flags = FLAG_SOF | FLAG_SOF_ONLY_IF_DMIC_OR_SOUNDWIRE,
+		.device = 0x43c8,
+	},
 #endif
 
 /* Elkhart Lake */
-- 
2.27.0
[PATCH] staging: vt6656: Fixed issue with alignment in rf.c
This change fixes a checkpatch CHECK style issue for "Alignment should match open parenthesis". Signed-off-by: Pritthijit Nath --- drivers/staging/vt6656/rf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/vt6656/rf.c b/drivers/staging/vt6656/rf.c index 5b8da06e3916..bcd4d467e03a 100644 --- a/drivers/staging/vt6656/rf.c +++ b/drivers/staging/vt6656/rf.c @@ -687,7 +687,7 @@ static int vnt_rf_set_txpower(struct vnt_private *priv, u8 power, if (hw_value < ARRAY_SIZE(vt3226d0_lo_current_table)) { ret = vnt_rf_write_embedded(priv, - vt3226d0_lo_current_table[hw_value]); + vt3226d0_lo_current_table[hw_value]); if (ret) return ret; } -- 2.25.1
[PATCH 5.10 15/54] io_uring: reinforce cancel on flush during exit
From: Pavel Begunkov

[ Upstream commit 3a7efd1ad269ccaf9c1423364d97c9661ba6dafa ]

What 84965ff8a84f0 ("io_uring: if we see flush on exit, cancel related
tasks") really wants is to cancel all relevant REQ_F_INFLIGHT requests
reliably. That can be achieved by io_uring_cancel_files(), but we'll miss
it calling io_uring_cancel_task_requests(files=NULL) from
io_uring_flush(), because it will go through
__io_uring_cancel_task_requests().

Just always call io_uring_cancel_files() during cancel, it's good enough
for now.

Cc: sta...@vger.kernel.org # 5.9+
Signed-off-by: Pavel Begunkov
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman
---
 fs/io_uring.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8692,10 +8692,9 @@ static void io_uring_cancel_task_request
 	io_cancel_defer_files(ctx, task, files);
 	io_cqring_overflow_flush(ctx, true, task, files);
 
+	io_uring_cancel_files(ctx, task, files);
 	if (!files)
 		__io_uring_cancel_task_requests(ctx, task);
-	else
-		io_uring_cancel_files(ctx, task, files);
 
 	if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) {
 		atomic_dec(&task->io_uring->in_idle);
[PATCH 5.10 19/54] af_key: relax availability checks for skb size calculation
From: Cong Wang

[ Upstream commit afbc293add6466f8f3f0c3d944d85f53709c170f ]

xfrm_probe_algs() probes kernel crypto modules and changes the
availability of struct xfrm_algo_desc. But there is a small window
where ealg->available and aalg->available get changed between
count_ah_combs()/count_esp_combs() and dump_ah_combs()/dump_esp_combs(),
in this case we may allocate a smaller skb but later put a larger
amount of data and trigger the panic in skb_put().

Fix this by relaxing the checks when counting the size, that is,
skipping the test of ->available. We may waste some memory for a few
of sizeof(struct sadb_comb), but it is still much better than a panic.

Reported-by: syzbot+b2bf2652983d23734...@syzkaller.appspotmail.com
Cc: Steffen Klassert
Cc: Herbert Xu
Signed-off-by: Cong Wang
Signed-off-by: Steffen Klassert
Signed-off-by: Sasha Levin
---
 net/key/af_key.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index c12dbc51ef5fe..ef9b4ac03e7b7 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -2902,7 +2902,7 @@ static int count_ah_combs(const struct xfrm_tmpl *t)
 			break;
 		if (!aalg->pfkey_supported)
 			continue;
-		if (aalg_tmpl_set(t, aalg) && aalg->available)
+		if (aalg_tmpl_set(t, aalg))
 			sz += sizeof(struct sadb_comb);
 	}
 	return sz + sizeof(struct sadb_prop);
@@ -2920,7 +2920,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t)
 		if (!ealg->pfkey_supported)
 			continue;
 
-		if (!(ealg_tmpl_set(t, ealg) && ealg->available))
+		if (!(ealg_tmpl_set(t, ealg)))
 			continue;
 
 		for (k = 1; ; k++) {
@@ -2931,7 +2931,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t)
 			if (!aalg->pfkey_supported)
 				continue;
 
-			if (aalg_tmpl_set(t, aalg) && aalg->available)
+			if (aalg_tmpl_set(t, aalg))
 				sz += sizeof(struct sadb_comb);
 		}
 	}
-- 
2.27.0
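The window described in this commit message can be modelled outside the kernel. What follows is a minimal userspace sketch, not code from af_key.c: struct algo, count_strict(), count_relaxed() and race_overflows() are hypothetical names, and the concurrent xfrm_probe_algs() call is simulated by flipping a flag between the sizing pass and the fill pass. The fill pass is assumed to keep testing ->available, as the commit message implies for the dump_*_combs() helpers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xfrm_algo_desc; not the kernel type. */
struct algo {
	bool pfkey_supported;	/* fixed property of the entry */
	bool available;		/* may be changed concurrently by a probe */
};

/* Pre-fix sizing: counts only entries that are available right now. */
static size_t count_strict(const struct algo *a, size_t n)
{
	size_t i, cnt = 0;

	for (i = 0; i < n; i++)
		if (a[i].pfkey_supported && a[i].available)
			cnt++;
	return cnt;
}

/* Post-fix sizing: ignores ->available, so the result is an upper bound. */
static size_t count_relaxed(const struct algo *a, size_t n)
{
	size_t i, cnt = 0;

	for (i = 0; i < n; i++)
		if (a[i].pfkey_supported)
			cnt++;
	return cnt;
}

/*
 * Size a buffer, let one entry become available, then check whether the
 * fill pass would need more slots than were allocated.
 */
static bool race_overflows(bool relaxed)
{
	struct algo table[] = {
		{ true,  true  },
		{ true,  false },	/* becomes available mid-race */
		{ false, false },
	};
	size_t n = sizeof(table) / sizeof(table[0]);
	size_t allocated = relaxed ? count_relaxed(table, n)
				   : count_strict(table, n);

	table[1].available = true;	/* simulated concurrent probe */

	return count_strict(table, n) > allocated;
}
```

Calling race_overflows(false) models the pre-fix sizing and race_overflows(true) the relaxed sizing; the relaxed variant trades a few unused slots for never under-allocating.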
include/linux/compiler_types.h:319:38: error: call to '__compiletime_assert_234' declared with attribute error: BUILD_BUG_ON failed: FIX_KMAP_SLOTS > PTRS_PER_PTE
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 291009f656e8eaebbdfd3a8d99f6b190a9ce9deb commit: 6e799cb69a70eedbb41561b750f7180c12cff280 mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL date: 3 months ago config: arc-randconfig-r032-20210209 (attached as .config) compiler: arceb-elf-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6e799cb69a70eedbb41561b750f7180c12cff280 git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git git fetch --no-tags linus master git checkout 6e799cb69a70eedbb41561b750f7180c12cff280 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): In file included from : arch/arc/mm/highmem.c: In function 'kmap_init': >> include/linux/compiler_types.h:319:38: error: call to >> '__compiletime_assert_234' declared with attribute error: BUILD_BUG_ON >> failed: FIX_KMAP_SLOTS > PTRS_PER_PTE 319 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^ include/linux/compiler_types.h:300:4: note: in definition of macro '__compiletime_assert' 300 |prefix ## suffix();\ |^~ include/linux/compiler_types.h:319:2: note: in expansion of macro '_compiletime_assert' 319 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~ include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert' 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) | ^~ include/linux/build_bug.h:50:2: note: in expansion of macro 'BUILD_BUG_ON_MSG' 50 | BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) | ^~~~ 
arch/arc/mm/highmem.c:69:2: note: in expansion of macro 'BUILD_BUG_ON' 69 | BUILD_BUG_ON(FIX_KMAP_SLOTS > PTRS_PER_PTE); | ^~~~ vim +/__compiletime_assert_234 +319 include/linux/compiler_types.h eb5c2d4b45e3d2 Will Deacon 2020-07-21 305 eb5c2d4b45e3d2 Will Deacon 2020-07-21 306 #define _compiletime_assert(condition, msg, prefix, suffix) \ eb5c2d4b45e3d2 Will Deacon 2020-07-21 307 __compiletime_assert(condition, msg, prefix, suffix) eb5c2d4b45e3d2 Will Deacon 2020-07-21 308 eb5c2d4b45e3d2 Will Deacon 2020-07-21 309 /** eb5c2d4b45e3d2 Will Deacon 2020-07-21 310 * compiletime_assert - break build and emit msg if condition is false eb5c2d4b45e3d2 Will Deacon 2020-07-21 311 * @condition: a compile-time constant condition to check eb5c2d4b45e3d2 Will Deacon 2020-07-21 312 * @msg: a message to emit if condition is false eb5c2d4b45e3d2 Will Deacon 2020-07-21 313 * eb5c2d4b45e3d2 Will Deacon 2020-07-21 314 * In tradition of POSIX assert, this macro will break the build if the eb5c2d4b45e3d2 Will Deacon 2020-07-21 315 * supplied condition is *false*, emitting the supplied error message if the eb5c2d4b45e3d2 Will Deacon 2020-07-21 316 * compiler has support to do so. eb5c2d4b45e3d2 Will Deacon 2020-07-21 317 */ eb5c2d4b45e3d2 Will Deacon 2020-07-21 318 #define compiletime_assert(condition, msg) \ eb5c2d4b45e3d2 Will Deacon 2020-07-21 @319 _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) eb5c2d4b45e3d2 Will Deacon 2020-07-21 320 :: The code at line 319 was first introduced by commit :: eb5c2d4b45e3d2d5d052ea6b8f1463976b1020d5 compiler.h: Move compiletime_assert() macros into compiler_types.h :: TO: Will Deacon :: CC: Will Deacon --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
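For readers unfamiliar with this class of failure, a compile-time assertion can be approximated in userspace C11 as below. This is only a sketch: MY_BUILD_BUG_ON and the three constants are hypothetical, the values are illustrative rather than the ARC configuration from the report, and the kernel's own BUILD_BUG_ON() is implemented differently (through the error-attributed function call visible in the log above).

```c
/*
 * Hypothetical userspace analogue of BUILD_BUG_ON(): the build fails
 * when cond is true. Uses C11 _Static_assert instead of the kernel's
 * __compiletime_assert machinery.
 */
#define MY_BUILD_BUG_ON(cond) \
	_Static_assert(!(cond), "BUILD_BUG_ON failed: " #cond)

/* Illustrative values only; not taken from the failing config. */
#define DEMO_SLOTS_PER_CPU	16
#define DEMO_NR_CPUS		4
#define DEMO_TABLE_ENTRIES	256

static int demo_kmap_slots(void)
{
	/* Raising DEMO_NR_CPUS above 16 would break the build here. */
	MY_BUILD_BUG_ON(DEMO_SLOTS_PER_CPU * DEMO_NR_CPUS > DEMO_TABLE_ENTRIES);

	return DEMO_SLOTS_PER_CPU * DEMO_NR_CPUS;
}
```

The point of the pattern is that a bad combination of configuration constants is rejected by the compiler instead of surfacing as a runtime fault, which is what a randconfig build such as the one above exercises.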
Re: [BUG REPORT] media: coda: mpeg4 decode corruption on i.MX6qp only
Hi Philipp, thank you so much for looking into this, I really appreciate it ! On Thu, Feb 11, 2021 at 9:32 AM Philipp Zabel wrote: > > Another thing that might help to identify who is writing where might be to > clear the whole OCRAM region and dump it after running only decode or only > PRE/PRG scanout, for example: Great idea, I will try that out. This might take a few days. I am also dealing with higher priority issues, > > Could you check /sys/kernel/debug/dri/?/state while running the error case? dri state in non-error case: # cat state plane[31]: plane-0 crtc=(null) fb=0 crtc-pos=0x0+0+0 src-pos=0.00x0.00+0.00+0.00 rotation=1 normalized-zpos=0 color-encoding=ITU-R BT.601 YCbCr color-range=YCbCr limited range plane[35]: plane-1 crtc=(null) fb=0 crtc-pos=0x0+0+0 src-pos=0.00x0.00+0.00+0.00 rotation=1 normalized-zpos=1 color-encoding=ITU-R BT.601 YCbCr color-range=YCbCr limited range plane[38]: plane-2 crtc=(null) fb=0 crtc-pos=0x0+0+0 src-pos=0.00x0.00+0.00+0.00 rotation=1 normalized-zpos=0 color-encoding=ITU-R BT.601 YCbCr color-range=YCbCr limited range plane[42]: plane-3 crtc=crtc-2 fb=59 allocated by = X refcount=2 format=XR24 little-endian (0x34325258) modifier=0x0 size=1280x1088 layers: size[0]=1280x1088 pitch[0]=5120 offset[0]=0 obj[0]: name=2 refcount=4 start=000105e4 size=5570560 imported=no paddr=0xee80 vaddr=78a02004 crtc-pos=1280x800+0+0 src-pos=1280.00x800.00+0.00+0.00 rotation=1 normalized-zpos=0 color-encoding=ITU-R BT.601 YCbCr color-range=YCbCr limited range plane[46]: plane-4 crtc=(null) fb=0 crtc-pos=0x0+0+0 src-pos=0.00x0.00+0.00+0.00 rotation=1 normalized-zpos=1 color-encoding=ITU-R BT.601 YCbCr color-range=YCbCr limited range plane[49]: plane-5 crtc=(null) fb=0 crtc-pos=0x0+0+0 src-pos=0.00x0.00+0.00+0.00 rotation=1 normalized-zpos=0 color-encoding=ITU-R BT.601 YCbCr color-range=YCbCr limited range crtc[34]: crtc-0 enable=0 active=0 self_refresh_active=0 planes_changed=0 mode_changed=0 active_changed=0 connectors_changed=0 
color_mgmt_changed=0 plane_mask=0 connector_mask=0 encoder_mask=0 mode: "": 0 0 0 0 0 0 0 0 0 0 0x0 0x0 crtc[41]: crtc-1 enable=0 active=0 self_refresh_active=0 planes_changed=0 mode_changed=0 active_changed=0 connectors_changed=0 color_mgmt_changed=0 plane_mask=0 connector_mask=0 encoder_mask=0 mode: "": 0 0 0 0 0 0 0 0 0 0 0x0 0x0 crtc[45]: crtc-2 enable=1 active=1 self_refresh_active=0 planes_changed=0 mode_changed=0 active_changed=0 connectors_changed=0 color_mgmt_changed=0 plane_mask=8 connector_mask=2 encoder_mask=2 mode: "": 60 67880 1280 1344 1345 1350 800 838 839 841 0x0 0x0 crtc[52]: crtc-3 enable=0 active=0 self_refresh_active=0 planes_changed=0 mode_changed=0 active_changed=0 connectors_changed=0 color_mgmt_changed=0 plane_mask=0 connector_mask=0 encoder_mask=0 mode: "": 0 0 0 0 0 0 0 0 0 0 0x0 0x0 connector[54]: HDMI-A-1 crtc=(null) self_refresh_aware=0 connector[57]: LVDS-1 crtc=crtc-2 self_refresh_aware=0 dri state in error case: # cat state plane[31]: plane-0 crtc=(null) fb=0 crtc-pos=0x0+0+0 src-pos=0.00x0.00+0.00+0.00 rotation=1 normalized-zpos=0 color-encoding=ITU-R BT.601 YCbCr color-range=YCbCr limited range plane[35]: plane-1 crtc=(null) fb=0 crtc-pos=0x0+0+0 src-pos=0.00x0.00+0.00+0.00 rotation=1 normalized-zpos=1 color-encoding=ITU-R BT.601 YCbCr color-range=YCbCr limited range plane[38]: plane-2 crtc=(null) fb=0 crtc-pos=0x0+0+0 src-pos=0.00x0.00+0.00+0.00 rotation=1
Re: [PATCH/v2] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH
On 2/10/21 3:50 PM, Willem de Bruijn wrote:
> On Wed, Feb 10, 2021 at 1:59 AM huangxuesen wrote:
>> From: huangxuesen
>>
>> bpf_skb_adjust_room sets the inner_protocol as skb->protocol for packets
>> encapsulation. But that is not appropriate when pushing Ethernet header.
>>
>> Add an option to further specify encap L2 type and set the inner_protocol
>> as ETH_P_TEB.
>>
>> Suggested-by: Willem de Bruijn
>> Signed-off-by: huangxuesen
>> Signed-off-by: chengzhiyong
>> Signed-off-by: wangli
>
> Thanks, this is exactly what I meant.
>
> Acked-by: Willem de Bruijn
>
> One small point regarding Signed-off-by: It is customary to capitalize
> family and given names.

+1, huangxuesen, would be great if you could resubmit with capitalized names
in your SoB as well as From (both seem affected).

Thanks,
Daniel
Re: [PATCH v4 net-next 09/11] skbuff: allow to optionally use NAPI cache from __alloc_skb()
From: Paolo Abeni Date: Thu, 11 Feb 2021 15:55:04 +0100 > On Thu, 2021-02-11 at 14:28 +, Alexander Lobakin wrote: > > From: Paolo Abeni on Thu, 11 Feb 2021 11:16:40 +0100 > > wrote: > > > What about changing __napi_alloc_skb() to always use > > > the __napi_build_skb(), for both kmalloc and page backed skbs? That is, > > > always doing the 'data' allocation in __napi_alloc_skb() - either via > > > page_frag or via kmalloc() - and than call __napi_build_skb(). > > > > > > I think that should avoid adding more checks in __alloc_skb() and > > > should probably reduce the number of conditional used > > > by __napi_alloc_skb(). > > > > I thought of this too. But this will introduce conditional branch > > to set or not skb->head_frag. So one branch less in __alloc_skb(), > > one branch more here, and we also lose the ability to __alloc_skb() > > with decached head. > > Just to try to be clear, I mean something alike the following (not even > build tested). In the fast path it has less branches than the current > code - for both kmalloc and page_frag allocation. > > --- > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index 785daff48030..a242fbe4730e 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -506,23 +506,12 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct > *napi, unsigned int len, >gfp_t gfp_mask) > { > struct napi_alloc_cache *nc; > + bool head_frag, pfmemalloc; > struct sk_buff *skb; > void *data; > > len += NET_SKB_PAD + NET_IP_ALIGN; > > - /* If requested length is either too small or too big, > - * we use kmalloc() for skb->head allocation. 
> - */ > - if (len <= SKB_WITH_OVERHEAD(1024) || > - len > SKB_WITH_OVERHEAD(PAGE_SIZE) || > - (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) { > - skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE); > - if (!skb) > - goto skb_fail; > - goto skb_success; > - } > - > nc = this_cpu_ptr(&napi_alloc_cache); > len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > len = SKB_DATA_ALIGN(len); > @@ -530,25 +519,34 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct > *napi, unsigned int len, > if (sk_memalloc_socks()) > gfp_mask |= __GFP_MEMALLOC; > > - data = page_frag_alloc(&nc->page, len, gfp_mask); > + if (len <= SKB_WITH_OVERHEAD(1024) || > +len > SKB_WITH_OVERHEAD(PAGE_SIZE) || > +(gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) { > + data = kmalloc_reserve(len, gfp_mask, NUMA_NO_NODE, > &pfmemalloc); > + head_frag = 0; > + len = 0; > + } else { > + data = page_frag_alloc(&nc->page, len, gfp_mask); > + pfmemalloc = nc->page.pfmemalloc; > + head_frag = 1; > + } > if (unlikely(!data)) > return NULL; Sure. I have a separate WIP series that reworks all three *alloc_skb() functions, as there's a nice room for optimization, especially after that tiny skbs now fall back to __alloc_skb(). It will likely hit mailing lists after the merge window and next net-next season, not now. And it's not really connected with NAPI cache reusing. > skb = __build_skb(data, len); > if (unlikely(!skb)) { > - skb_free_frag(data); > + if (head_frag) > + skb_free_frag(data); > + else > + kfree(data); > return NULL; > } > > - if (nc->page.pfmemalloc) > - skb->pfmemalloc = 1; > - skb->head_frag = 1; > + skb->pfmemalloc = pfmemalloc; > + skb->head_frag = head_frag; > > -skb_success: > skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN); > skb->dev = napi->dev; > - > -skb_fail: > return skb; > } > EXPORT_SYMBOL(__napi_alloc_skb); Al
[net-next] net: mvpp2: fix interrupt mask/unmask skip condition
From: Stefan Chulski

The mask/unmask operation should also be skipped when the CPU ID is
equal to nthreads.
The patch doesn't fix any actual issue since
nthreads = min_t(unsigned int, num_present_cpus(), MVPP2_MAX_THREADS).
On all current Armada platforms, the number of CPUs
is less than MVPP2_MAX_THREADS.

Fixes: e531f76757eb ("net: mvpp2: handle cases where more CPUs are available than s/w threads")
Reported-by: Russell King
Signed-off-by: Stefan Chulski
---
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index a07cf60..74613d3 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -1135,7 +1135,7 @@ static void mvpp2_interrupts_mask(void *arg)
 	struct mvpp2_port *port = arg;
 
 	/* If the thread isn't used, don't do anything */
-	if (smp_processor_id() > port->priv->nthreads)
+	if (smp_processor_id() >= port->priv->nthreads)
 		return;
 
 	mvpp2_thread_write(port->priv,
@@ -1153,7 +1153,7 @@ static void mvpp2_interrupts_unmask(void *arg)
 	u32 val;
 
 	/* If the thread isn't used, don't do anything */
-	if (smp_processor_id() > port->priv->nthreads)
+	if (smp_processor_id() >= port->priv->nthreads)
 		return;
 
 	val = MVPP2_CAUSE_MISC_SUM_MASK |
-- 
1.9.1
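The off-by-one in the skip condition is easy to see in isolation. The sketch below is not driver code: thread_unused_old() and thread_unused_new() are hypothetical helpers that only mirror the two comparisons from the diff, on the assumption that valid s/w thread IDs run from 0 to nthreads - 1.

```c
#include <stdbool.h>

/* Pre-patch comparison: a CPU whose ID equals nthreads is not skipped. */
static bool thread_unused_old(unsigned int cpu, unsigned int nthreads)
{
	return cpu > nthreads;
}

/* Post-patch comparison: every ID outside 0..nthreads-1 is skipped. */
static bool thread_unused_new(unsigned int cpu, unsigned int nthreads)
{
	return cpu >= nthreads;
}
```

With nthreads = 4, the old check lets CPU 4 through to the register write while the new check returns early for it; CPUs 0 to 3 are handled identically by both.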
Re: [PATCH 4.19 07/24] regulator: core: avoid regulator_resolve_supply() race condition
On Thu, Feb 11, 2021 at 04:02:41PM +0100, Greg Kroah-Hartman wrote:
> From: David Collins
>
> [ Upstream commit eaa7995c529b54d68d97a30f6344cc6ca2f214a7 ]
>
> The final step in regulator_register() is to call
> regulator_resolve_supply() for each registered regulator

This is buggy without a followup which doesn't seem to have been
backported here.
[PATCH 2/2] quota: wire up quotactl_path
Wire up the quotactl_path syscall added in the previous patch. Signed-off-by: Sascha Hauer --- arch/alpha/kernel/syscalls/syscall.tbl | 1 + arch/arm/tools/syscall.tbl | 1 + arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 2 ++ arch/ia64/kernel/syscalls/syscall.tbl | 1 + arch/m68k/kernel/syscalls/syscall.tbl | 1 + arch/microblaze/kernel/syscalls/syscall.tbl | 1 + arch/mips/kernel/syscalls/syscall_n32.tbl | 1 + arch/mips/kernel/syscalls/syscall_n64.tbl | 1 + arch/mips/kernel/syscalls/syscall_o32.tbl | 1 + arch/parisc/kernel/syscalls/syscall.tbl | 1 + arch/powerpc/kernel/syscalls/syscall.tbl| 1 + arch/s390/kernel/syscalls/syscall.tbl | 1 + arch/sh/kernel/syscalls/syscall.tbl | 1 + arch/sparc/kernel/syscalls/syscall.tbl | 1 + arch/x86/entry/syscalls/syscall_32.tbl | 1 + arch/x86/entry/syscalls/syscall_64.tbl | 1 + arch/xtensa/kernel/syscalls/syscall.tbl | 1 + include/linux/syscalls.h| 2 ++ include/uapi/asm-generic/unistd.h | 4 +++- kernel/sys_ni.c | 1 + 21 files changed, 25 insertions(+), 2 deletions(-) diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl index a6617067dbe6..3fe90880c821 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -481,3 +481,4 @@ 549common faccessat2 sys_faccessat2 550common process_madvise sys_process_madvise 551common epoll_pwait2sys_epoll_pwait2 +552common quotactl_path sys_quotactl_path diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index 20e1170e2e0a..a62509df217f 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -455,3 +455,4 @@ 439common faccessat2 sys_faccessat2 440common process_madvise sys_process_madvise 441common epoll_pwait2sys_epoll_pwait2 +442common quotactl_path sys_quotactl_path diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index 86a9d7b3eabe..949788f5ba40 100644 --- a/arch/arm64/include/asm/unistd.h +++ 
b/arch/arm64/include/asm/unistd.h @@ -38,7 +38,7 @@ #define __ARM_NR_compat_set_tls(__ARM_NR_COMPAT_BASE + 5) #define __ARM_NR_COMPAT_END(__ARM_NR_COMPAT_BASE + 0x800) -#define __NR_compat_syscalls 442 +#define __NR_compat_syscalls 443 #endif #define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index cccfbbefbf95..734c254ca1b6 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -891,6 +891,8 @@ __SYSCALL(__NR_faccessat2, sys_faccessat2) __SYSCALL(__NR_process_madvise, sys_process_madvise) #define __NR_epoll_pwait2 441 __SYSCALL(__NR_epoll_pwait2, compat_sys_epoll_pwait2) +#define __NR_quotactl_path 442 +__SYSCALL(__NR_quotactl_path, sys_quotactl_path) /* * Please add new compat syscalls above this comment and update diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl index bfc00f2bd437..4758a22a4d80 100644 --- a/arch/ia64/kernel/syscalls/syscall.tbl +++ b/arch/ia64/kernel/syscalls/syscall.tbl @@ -362,3 +362,4 @@ 439common faccessat2 sys_faccessat2 440common process_madvise sys_process_madvise 441common epoll_pwait2sys_epoll_pwait2 +442common quotactl_path sys_quotactl_path diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl index 7fe4e45c864c..b9072d2f1fdc 100644 --- a/arch/m68k/kernel/syscalls/syscall.tbl +++ b/arch/m68k/kernel/syscalls/syscall.tbl @@ -441,3 +441,4 @@ 439common faccessat2 sys_faccessat2 440common process_madvise sys_process_madvise 441common epoll_pwait2sys_epoll_pwait2 +442common quotactl_path sys_quotactl_path diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl index a522adf194ab..95e0cb59e8c1 100644 --- a/arch/microblaze/kernel/syscalls/syscall.tbl +++ b/arch/microblaze/kernel/syscalls/syscall.tbl @@ -447,3 +447,4 @@ 439common faccessat2 sys_faccessat2 440common process_madvise sys_process_madvise 441common 
epoll_pwait2sys_epoll_pwait2 +442common quotactl_path sys_quotactl_path diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl index 0f03ad223f33..027fe0351e66 100644 --- a/arch/mips/kernel/syscalls/syscall_n32.tbl +++ b/arch/mips/kernel/syscalls/syscall_n
[PATCH v2 0/2] quota: Add mountpath based quota support
Current quotactl syscall uses a path to a block device to specify the
filesystem to work on which makes it unsuitable for filesystems that
do not have a block device. This series adds a new syscall
quotactl_path() which replaces the path to the block device with a
mountpath, but otherwise behaves like original quotactl.

This is done to add quota support to UBIFS. UBIFS quota support has
been posted several times with different approaches to put the
mountpath into the existing quotactl() syscall until it has been
suggested to make it a new syscall instead, so here it is.

I'm not posting the full UBIFS quota series here as it remains
unchanged and I'd like to get feedback to the new syscall first. For
those interested the most recent series can be found here:
https://lwn.net/Articles/810463/

Changes since (implicit) v1:
- Ignore second path argument to Q_QUOTAON. With this quotactl_path()
  can only do the Q_QUOTAON operation on filesystems which use hidden
  inodes for quota metadata storage
- Drop unnecessary quotactl_cmd_onoff() check

Sascha Hauer (2):
  quota: Add mountpath based quota support
  quota: wire up quotactl_path

 arch/alpha/kernel/syscalls/syscall.tbl      |  1 +
 arch/arm/tools/syscall.tbl                  |  1 +
 arch/arm64/include/asm/unistd.h             |  2 +-
 arch/arm64/include/asm/unistd32.h           |  2 +
 arch/ia64/kernel/syscalls/syscall.tbl       |  1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |  1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |  1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |  1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |  1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |  1 +
 arch/s390/kernel/syscalls/syscall.tbl       |  1 +
 arch/sh/kernel/syscalls/syscall.tbl         |  1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |  1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |  1 +
 fs/quota/quota.c                            | 49
 include/linux/syscalls.h                    |  2 +
 include/uapi/asm-generic/unistd.h           |  4 +-
 kernel/sys_ni.c                             |  1 +
 22 files changed, 74 insertions(+), 2 deletions(-)

-- 
2.20.1
[PATCH] quotactl.2: Add documentation for quotactl_path()
Expand the quotactl.2 manpage with a description for quotactl_path() that takes a mountpoint path instead of a path to a block device. Signed-off-by: Sascha Hauer --- man2/quotactl.2 | 31 --- man2/quotactl_path.2 | 1 + 2 files changed, 29 insertions(+), 3 deletions(-) create mode 100644 man2/quotactl_path.2 diff --git a/man2/quotactl.2 b/man2/quotactl.2 index 7869c64ea..76505c668 100644 --- a/man2/quotactl.2 +++ b/man2/quotactl.2 @@ -34,6 +34,8 @@ quotactl \- manipulate disk quotas .PP .BI "int quotactl(int " cmd ", const char *" special ", int " id \ ", caddr_t " addr ); +.BI "int quotactl_path(int " cmd ", const char *" mountpoint ", int " id \ +", caddr_t " addr ); .fi .SH DESCRIPTION The quota system can be used to set per-user, per-group, and per-project limits @@ -48,7 +50,11 @@ after this, the soft limit counts as a hard limit. .PP The .BR quotactl () -call manipulates disk quotas. +and +.BR quotactl_path () +calls manipulate disk quotas. The difference between both functions is the way +how the filesystem being manipulated is specified, see description of the arguments +below. The .I cmd argument indicates a command to be applied to the user or @@ -75,10 +81,19 @@ value is described below. .PP The .I special -argument is a pointer to a null-terminated string containing the pathname +argument to +.BR quotactl () +is a pointer to a null-terminated string containing the pathname of the (mounted) block special device for the filesystem being manipulated. .PP The +.I mountpoint +argument to +.BR quotactl_path () +is a pointer to a null-terminated string containing the pathname +of the mountpoint for the filesystem being manipulated. +.PP +The .I addr argument is the address of an optional, command-specific, data structure that is copied in or out of the system. @@ -133,7 +148,17 @@ flag in the .I dqi_flags field returned by the .B Q_GETINFO -operation. +operation. 
The +.BR quotactl_path () +variant of this syscall generally ignores the +.IR addr +and +.IR id +arguments, so the +.B Q_QUOTAON +operation of +.BR quotactl_path () +is only suitable for work with hidden system inodes. .IP This operation requires privilege .RB ( CAP_SYS_ADMIN ). diff --git a/man2/quotactl_path.2 b/man2/quotactl_path.2 new file mode 100644 index 0..5f63187c6 --- /dev/null +++ b/man2/quotactl_path.2 @@ -0,0 +1 @@ +.so man2/quotactl.2 -- 2.20.1
[PATCH 1/2] quota: Add mountpath based quota support
Add syscall quotactl_path, a variant of quotactl which allows
specifying the mountpath instead of a path to a block device.

The quotactl syscall expects a path to the mounted block device to
specify the filesystem to work on. This limits usage to filesystems
which actually have a block device. quotactl_path replaces the path
to the block device with the path where the filesystem is mounted.

The global Q_SYNC command to sync all filesystems is not supported for
this new syscall, otherwise quotactl_path behaves like quotactl.

Signed-off-by: Sascha Hauer
---
 fs/quota/quota.c | 49
 1 file changed, 49 insertions(+)

diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 6d16b2be5ac4..6f1df32abeea 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include "compat.h"
@@ -968,3 +969,51 @@ SYSCALL_DEFINE4(quotactl, unsigned int, cmd, const char __user *, special,
 	path_put(pathp);
 	return ret;
 }
+
+SYSCALL_DEFINE4(quotactl_path, unsigned int, cmd, const char __user *,
+		mountpoint, qid_t, id, void __user *, addr)
+{
+	struct super_block *sb;
+	struct path mountpath;
+	unsigned int cmds = cmd >> SUBCMDSHIFT;
+	unsigned int type = cmd & SUBCMDMASK;
+	int ret;
+
+	if (type >= MAXQUOTAS)
+		return -EINVAL;
+
+	if (!mountpoint)
+		return -ENODEV;
+
+	ret = user_path_at(AT_FDCWD, mountpoint,
+			     LOOKUP_FOLLOW | LOOKUP_AUTOMOUNT, &mountpath);
+	if (ret)
+		return ret;
+
+	sb = mountpath.dentry->d_inode->i_sb;
+
+	if (quotactl_cmd_write(cmds)) {
+		ret = mnt_want_write(mountpath.mnt);
+		if (ret)
+			goto out;
+	}
+
+	if (quotactl_cmd_onoff(cmds))
+		down_write(&sb->s_umount);
+	else
+		down_read(&sb->s_umount);
+
+	ret = do_quotactl(sb, type, cmds, id, addr, ERR_PTR(-EINVAL));
+
+	if (quotactl_cmd_onoff(cmds))
+		up_write(&sb->s_umount);
+	else
+		up_read(&sb->s_umount);
+
+	if (quotactl_cmd_write(cmds))
+		mnt_drop_write(mountpath.mnt);
+out:
+	path_put(&mountpath);
+
+	return ret;
+}
-- 
2.20.1
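The first thing the new syscall does with cmd is split it into an operation and a quota type. The userspace sketch below illustrates that split only; it does not invoke the syscall. The constant values are redefined locally and are assumptions believed to match the kernel's uapi quota header, and qcmd_op(), qcmd_type() and qcmd_type_valid() are hypothetical helper names that do not appear in the patch.

```c
/* Assumed to mirror the uapi quota definitions; redefined here so the
 * sketch does not depend on any system header. */
#define SUBCMDMASK	0x00ff
#define SUBCMDSHIFT	8
#define MAXQUOTAS	3
#define QCMD(cmd, type)	(((cmd) << SUBCMDSHIFT) | ((type) & SUBCMDMASK))

/* Operation part of the command word: "cmds" in the patch. */
static unsigned int qcmd_op(unsigned int cmd)
{
	return cmd >> SUBCMDSHIFT;
}

/* Quota type part of the command word: "type" in the patch. */
static unsigned int qcmd_type(unsigned int cmd)
{
	return cmd & SUBCMDMASK;
}

/* Mirrors the "type >= MAXQUOTAS" check that yields -EINVAL. */
static int qcmd_type_valid(unsigned int cmd)
{
	return qcmd_type(cmd) < MAXQUOTAS;
}
```

A caller would build the command word with QCMD(op, type) exactly as for quotactl(); only the second argument changes meaning, from a block device path to a mountpoint path.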
[PATCH 5.4 03/24] regulator: core: avoid regulator_resolve_supply() race condition
From: David Collins

[ Upstream commit eaa7995c529b54d68d97a30f6344cc6ca2f214a7 ]

The final step in regulator_register() is to call
regulator_resolve_supply() for each registered regulator
(including the one in the process of being registered). The
regulator_resolve_supply() function first checks if rdev->supply
is NULL, then it performs various steps to try to find the supply.
If successful, rdev->supply is set inside of set_supply().

This procedure can encounter a race condition if two concurrent
tasks call regulator_register() near to each other on separate CPUs
and one of the regulators has rdev->supply_name specified. There
is currently nothing guaranteeing atomicity between the rdev->supply
check and set steps. Thus, both tasks can observe rdev->supply==NULL
in their regulator_resolve_supply() calls. This then results in
both creating a struct regulator for the supply. One ends up
actually stored in rdev->supply and the other is lost (though still
present in the supply's consumer_list).

Here is a kernel log snippet showing the issue:

[ 12.421768] gpu_cc_gx_gdsc: supplied by pm8350_s5_level
[ 12.425854] gpu_cc_gx_gdsc: supplied by pm8350_s5_level
[ 12.429064] debugfs: Directory 'regulator.4-SUPPLY' with parent '17a0.rsc:rpmh-regulator-gfxlvl-pm8350_s5_level' already present!

Avoid this race condition by holding the rdev->mutex lock inside
of regulator_resolve_supply() while checking and setting
rdev->supply.

Signed-off-by: David Collins
Link: https://lore.kernel.org/r/1610068562-4410-1-git-send-email-colli...@codeaurora.org
Signed-off-by: Mark Brown
Signed-off-by: Sasha Levin
---
 drivers/regulator/core.c | 39 ---
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index c9b8613e69db2..5e0490e18b46a 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -1772,23 +1772,34 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
 {
 	struct regulator_dev *r;
 	struct device *dev = rdev->dev.parent;
-	int ret;
+	int ret = 0;

 	/* No supply to resolve? */
 	if (!rdev->supply_name)
 		return 0;

-	/* Supply already resolved? */
+	/* Supply already resolved? (fast-path without locking contention) */
 	if (rdev->supply)
 		return 0;

+	/*
+	 * Recheck rdev->supply with rdev->mutex lock held to avoid a race
+	 * between rdev->supply null check and setting rdev->supply in
+	 * set_supply() from concurrent tasks.
+	 */
+	regulator_lock(rdev);
+
+	/* Supply just resolved by a concurrent task? */
+	if (rdev->supply)
+		goto out;
+
 	r = regulator_dev_lookup(dev, rdev->supply_name);
 	if (IS_ERR(r)) {
 		ret = PTR_ERR(r);

 		/* Did the lookup explicitly defer for us? */
 		if (ret == -EPROBE_DEFER)
-			return ret;
+			goto out;

 		if (have_full_constraints()) {
 			r = dummy_regulator_rdev;
@@ -1796,15 +1807,18 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
 		} else {
 			dev_err(dev, "Failed to resolve %s-supply for %s\n",
 				rdev->supply_name, rdev->desc->name);
-			return -EPROBE_DEFER;
+			ret = -EPROBE_DEFER;
+			goto out;
 		}
 	}

 	if (r == rdev) {
 		dev_err(dev, "Supply for %s (%s) resolved to itself\n",
 			rdev->desc->name, rdev->supply_name);
-		if (!have_full_constraints())
-			return -EINVAL;
+		if (!have_full_constraints()) {
+			ret = -EINVAL;
+			goto out;
+		}
 		r = dummy_regulator_rdev;
 		get_device(&r->dev);
 	}
@@ -1818,7 +1832,8 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
 	if (r->dev.parent && r->dev.parent != rdev->dev.parent) {
 		if (!device_is_bound(r->dev.parent)) {
 			put_device(&r->dev);
-			return -EPROBE_DEFER;
+			ret = -EPROBE_DEFER;
+			goto out;
 		}
 	}

@@ -1826,13 +1841,13 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
 	ret = regulator_resolve_supply(r);
 	if (ret < 0) {
 		put_device(&r->dev);
-		return ret;
+		goto out;
 	}

 	ret = set_supply(rdev, r);
 	if (ret < 0) {
 		put_device(&r->dev);
-		return ret;
+		goto out;
 	}

 	/*
@@ -1845,11 +1860,13 @@ static int regulator_resolve_supply(struct regulator_dev
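The check, lock, recheck sequence this patch introduces can be illustrated outside the kernel. The following is a minimal userspace sketch using pthreads and C11 atomics, not the regulator core itself: struct demo_rdev and every demo_ function are hypothetical stand-ins, with the mutex playing the role of rdev->mutex and the atomic pointer playing the role of rdev->supply.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct regulator_dev; not the kernel type. */
struct demo_rdev {
	pthread_mutex_t lock;   /* plays the role of rdev->mutex */
	_Atomic(int *) supply;  /* plays the role of rdev->supply */
	atomic_int creations;   /* counts how often a supply was created */
	int supply_storage;     /* object that supply points at once set */
};

/* Check, lock, recheck, then set: at most one caller creates the supply. */
int demo_resolve_supply(struct demo_rdev *rdev)
{
	/* Fast path: already resolved, no lock taken. */
	if (atomic_load(&rdev->supply) != NULL)
		return 0;

	pthread_mutex_lock(&rdev->lock);

	/* Recheck under the lock: a concurrent caller may have won. */
	if (atomic_load(&rdev->supply) == NULL) {
		atomic_fetch_add(&rdev->creations, 1);
		atomic_store(&rdev->supply, &rdev->supply_storage);
	}

	pthread_mutex_unlock(&rdev->lock);
	return 0;
}

static void *demo_worker(void *arg)
{
	demo_resolve_supply(arg);
	return NULL;
}

/* Run up to 16 concurrent resolvers; return how many created a supply. */
int demo_count_creations(int nthreads)
{
	struct demo_rdev rdev;
	pthread_t tids[16];
	int started = 0;

	if (nthreads > 16)
		nthreads = 16;

	pthread_mutex_init(&rdev.lock, NULL);
	atomic_init(&rdev.supply, NULL);
	atomic_init(&rdev.creations, 0);
	rdev.supply_storage = 0;

	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[started], NULL, demo_worker, &rdev) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	pthread_mutex_destroy(&rdev.lock);
	return atomic_load(&rdev.creations);
}
```

Without the recheck under the lock, two workers could both pass the first NULL test and both create a supply, which corresponds to the duplicated "supplied by" lines in the log. With it, demo_count_creations() reports a single creation regardless of how many threads race.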
[PATCH 5.4 13/24] iwlwifi: mvm: invalidate IDs of internal stations at mvm start
From: Gregory Greenman

[ Upstream commit e223e42aac30bf81f9302c676cdf58cf2bf36950 ]

Having sta_id not set for aux_sta and snif_sta can potentially lead to a
hard-to-debug issue if remove station is called without a prior add.
In that case sta_id 0, an unrelated regular station, will be removed.

In fact, we do have a FW assert that occurs rarely, and from the debug
data analysis it looks like sta_id 0 is removed by mistake, though it's
hard to pinpoint the exact flow. The WARN_ON in this patch should help
to find it.

Signed-off-by: Gregory Greenman
Signed-off-by: Luca Coelho
Signed-off-by: Kalle Valo
Link: https://lore.kernel.org/r/iwlwifi.20210122144849.5dc6dd9b22d5.I2add1b5ad24d0d0a221de79d439c09f88fcaf15d@changeid
Signed-off-by: Sasha Levin
---
 drivers/net/wireless/intel/iwlwifi/mvm/ops.c | 4
 drivers/net/wireless/intel/iwlwifi/mvm/sta.c | 6 ++
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/ops.c b/drivers/net/wireless/intel/iwlwifi/mvm/ops.c
index b04cc6214bac8..bc25a59807c34 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/ops.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/ops.c
@@ -838,6 +838,10 @@ iwl_op_mode_mvm_start(struct iwl_trans *trans, const struct iwl_cfg *cfg,
 	if (!mvm->scan_cmd)
 		goto out_free;

+	/* invalidate ids to prevent accidental removal of sta_id 0 */
+	mvm->aux_sta.sta_id = IWL_MVM_INVALID_STA;
+	mvm->snif_sta.sta_id = IWL_MVM_INVALID_STA;
+
 	/* Set EBS as successful as long as not stated otherwise by the FW. */
 	mvm->last_ebs_successful = true;

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
index a36aa9e85e0b3..40cafcf40ccf0 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
@@ -2070,6 +2070,9 @@ int iwl_mvm_rm_snif_sta(struct iwl_mvm *mvm, struct ieee80211_vif *vif)

 	lockdep_assert_held(&mvm->mutex);

+	if (WARN_ON_ONCE(mvm->snif_sta.sta_id == IWL_MVM_INVALID_STA))
+		return -EINVAL;
+
 	iwl_mvm_disable_txq(mvm, NULL, mvm->snif_queue, IWL_MAX_TID_COUNT, 0);
 	ret = iwl_mvm_rm_sta_common(mvm, mvm->snif_sta.sta_id);
 	if (ret)
@@ -2084,6 +2087,9 @@ int iwl_mvm_rm_aux_sta(struct iwl_mvm *mvm)

 	lockdep_assert_held(&mvm->mutex);

+	if (WARN_ON_ONCE(mvm->aux_sta.sta_id == IWL_MVM_INVALID_STA))
+		return -EINVAL;
+
 	iwl_mvm_disable_txq(mvm, NULL, mvm->aux_queue, IWL_MAX_TID_COUNT, 0);
 	ret = iwl_mvm_rm_sta_common(mvm, mvm->aux_sta.sta_id);
 	if (ret)
--
2.27.0
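The failure mode described in the commit message (a zero-initialized sta_id aliasing regular station 0) and the sentinel guard can be sketched in plain C. This is an illustration under stated assumptions, not driver code: the demo_ types and functions are hypothetical, and DEMO_INVALID_STA is an assumed sentinel value, since the patch does not state the value of IWL_MVM_INVALID_STA.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_STATION_COUNT 16
#define DEMO_INVALID_STA   0xff /* assumed sentinel, not taken from the patch */

/* Hypothetical firmware station table and internal station record. */
struct demo_fw {
	bool in_use[DEMO_STATION_COUNT];
};

struct demo_int_sta {
	uint8_t sta_id;
};

/* Remove an internal station, refusing ids that were never assigned. */
int demo_rm_int_sta(struct demo_fw *fw, struct demo_int_sta *sta)
{
	if (sta->sta_id == DEMO_INVALID_STA)
		return -EINVAL;
	if (sta->sta_id >= DEMO_STATION_COUNT)
		return -EINVAL;

	fw->in_use[sta->sta_id] = false;
	sta->sta_id = DEMO_INVALID_STA;
	return 0;
}

/*
 * Simulate "remove without add" on a zeroed record and report whether
 * the unrelated regular station 0 is still present afterwards.
 */
bool demo_station0_survives(bool invalidate_at_start)
{
	struct demo_fw fw;
	struct demo_int_sta aux;

	memset(&fw, 0, sizeof(fw));
	memset(&aux, 0, sizeof(aux)); /* zeroed allocation: sta_id == 0 */
	fw.in_use[0] = true;          /* a regular station owns id 0 */

	if (invalidate_at_start)
		aux.sta_id = DEMO_INVALID_STA;

	demo_rm_int_sta(&fw, &aux);   /* remove without a prior add */
	return fw.in_use[0];
}
```

With the id invalidated at start, the stray remove is rejected and station 0 stays in place; without it, the zeroed id silently removes station 0, which is the scenario the WARN_ON_ONCE checks are meant to surface.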