date:20210201

RE: [RFC PATCH v1 0/4] vfio: Add IOPF support for VFIO passthrough

2021-02-01 Thread Tian, Kevin

> From: Alex Williamson 
> Sent: Saturday, January 30, 2021 6:58 AM
> 
> On Mon, 25 Jan 2021 17:03:58 +0800
> Shenming Lu  wrote:
> 
> > Hi,
> >
> > The static pinning and mapping problem in VFIO and possible solutions
> > have been discussed a lot [1, 2]. One of the solutions is to add I/O
> > page fault support for VFIO devices. Different from those relatively
> > complicated software approaches such as presenting a vIOMMU that provides
> > the DMA buffer information (might include para-virtualized optimizations),
> > IOPF mainly depends on the hardware faulting capability, such as the PCIe
> > PRI extension or Arm SMMU stall model. What's more, the IOPF support in
> > the IOMMU driver is being implemented in SVA [3]. So should we consider
> > adding IOPF support for VFIO passthrough based on the IOPF part of SVA at
> > present?
> >
> > We have implemented a basic demo only for one stage of translation (GPA
> > -> HPA in virtualization, note that it can be configured at either stage),
> > and tested on Hisilicon Kunpeng920 board. The nested mode is more complicated
> > since VFIO only handles the second stage page faults (same as the non-nested
> > case), while the first stage page faults need to be further delivered to
> > the guest, which is being implemented in [4] on ARM. My thought on this
> > is to report the page faults to VFIO regardless of the occurred stage (try
> > to carry the stage information), and handle respectively according to the
> > configured mode in VFIO. Or the IOMMU driver might evolve to support more...
> >
> > Might TODO:
> >  - Optimize the faulting path, and measure the performance (it might still
> >    be a big issue).
> >  - Add support for PRI.
> >  - Add a MMU notifier to avoid pinning.
> >  - Add support for the nested mode.
> > ...
> >
> > Any comments and suggestions are very welcome. :-)
> 
> I expect performance to be pretty bad here, the lookup involved per
> fault is excessive.  There are cases where a user is not going to be
> willing to have a slow ramp up of performance for their devices as they
> fault in pages, so we might need to consider making this
> configurable through the vfio interface.  Our page mapping also only

There is another factor to be considered. The presence of
IOMMU_DEV_FEAT_IOPF just indicates that the device is capable of
triggering I/O page faults through the IOMMU, but does not necessarily
mean that the device can tolerate I/O page faults for arbitrary DMA
requests. In reality, many devices allow I/O faulting only in selective
contexts. However, there is no standard way (e.g. PCISIG) for the device
to report whether arbitrary I/O faults are allowed. Then we may have to
maintain device-specific knowledge in software, e.g. in an opt-in table
listing devices which allow arbitrary faults. For devices which only
support selective faulting, a mediator (either through vendor extensions
on vfio-pci-core or an mdev wrapper) might be necessary to help lock down
non-faultable mappings and then enable faulting on the remaining mappings.
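
A minimal sketch of what such an opt-in table could look like, assuming a
lookup keyed by PCI vendor/device ID. The structure, the function name and
the IDs below are hypothetical placeholders, not an existing kernel
interface; the point is only that unlisted devices default to "no
arbitrary faults".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: a PCI vendor/device pair known to tolerate faults on any DMA. */
struct iopf_optin_entry {
	uint16_t vendor;
	uint16_t device;
};

/* Placeholder IDs for illustration only; they do not describe real hardware. */
static const struct iopf_optin_entry iopf_optin_table[] = {
	{ 0x1234, 0x0001 },
	{ 0x1234, 0x0002 },
};

/* Returns true only when the device is explicitly listed (opt-in, default deny). */
bool iopf_arbitrary_fault_allowed(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(iopf_optin_table) / sizeof(iopf_optin_table[0]); i++) {
		if (iopf_optin_table[i].vendor == vendor &&
		    iopf_optin_table[i].device == device)
			return true;
	}
	return false;
}
```

A device that is not in the table would then have to go through the
mediator path described above.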

> grows here, should mappings expire or do we need a least recently
> mapped tracker to avoid exceeding the user's locked memory limit?  How
> does a user know what to set for a locked memory limit?  The behavior
> here would lead to cases where an idle system might be ok, but as soon
> as load increases with more inflight DMA, we start seeing
> "unpredictable" I/O faults from the user perspective.  Seems like there
> are lots of outstanding considerations and I'd also like to hear from
> the SVA folks about how this meshes with their work.  Thanks,
> 

The main overlap between this feature and SVA is the IOPF reporting
framework, which currently still has a gap in supporting both in nested
mode, as discussed here:

https://lore.kernel.org/linux-acpi/YAaxjmJW+ZMvrhac@myrica/

Once that gap is resolved in the future, the VFIO fault handler just 
adopts different actions according to the fault-level: 1st level faults
are forwarded to userspace thru the vSVA path while 2nd-level faults
are fixed (or warned if not intended) by VFIO itself thru the IOMMU
mapping interface.
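
The dispatch described above could be sketched roughly as follows. All
names (the enums, vfio_fault_dispatch, the l2_faulting_enabled flag) are
hypothetical and only illustrate the decision; they are not taken from
the VFIO or IOMMU code.

```c
#include <stdbool.h>

/* Hypothetical fault levels as they might be reported by the IOMMU driver. */
enum fault_level {
	FAULT_LEVEL_1 = 1,
	FAULT_LEVEL_2 = 2,
};

enum fault_action {
	FAULT_FORWARD_TO_USERSPACE, /* 1st level: the guest owns that page table */
	FAULT_FIX_IN_VFIO,          /* 2nd level: map the page through the IOMMU mapping interface */
	FAULT_WARN,                 /* 2nd-level fault that was not intended */
};

/* Pick an action from the fault level and whether 2nd-level faulting is expected. */
enum fault_action vfio_fault_dispatch(enum fault_level level, bool l2_faulting_enabled)
{
	if (level == FAULT_LEVEL_1)
		return FAULT_FORWARD_TO_USERSPACE;

	return l2_faulting_enabled ? FAULT_FIX_IN_VFIO : FAULT_WARN;
}
```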

Thanks
Kevin

[PATCH] riscv: Improve kasan population by using hugepages when possible

2021-02-01 Thread Alexandre Ghiti

The KASAN function that populates the shadow regions used to allocate them
page by page and did not take advantage of hugepages. Fix this by
trying to allocate 1GB hugepages and falling back to 2MB hugepages or 4K
pages if that fails.

This reduces the page table memory consumption and improves TLB usage,
as shown below:

Before this patch:

---[ Kasan shadow start ]---
0xffffffc000000000-0xffffffc400000000    0x00000000818ef000        16G PTE     . A . . . . R V
0xffffffc400000000-0xffffffc447fc0000    0x00000002b7f4f000   1179392K PTE     D A . . . W R V
0xffffffc480000000-0xffffffc800000000    0x00000000818ef000        14G PTE     . A . . . . R V
---[ Kasan shadow end ]---

After this patch:

---[ Kasan shadow start ]---
0xffffffc000000000-0xffffffc400000000    0x00000000818ef000        16G PTE     . A . . . . R V
0xffffffc400000000-0xffffffc440000000    0x0000000240000000         1G PGD     D A . . . W R V
0xffffffc440000000-0xffffffc447e00000    0x00000002b7e00000       126M PMD     D A . . . W R V
0xffffffc447e00000-0xffffffc447fc0000    0x00000002b818f000      1792K PTE     D A . . . W R V
0xffffffc480000000-0xffffffc800000000    0x00000000818ef000        14G PTE     . A . . . . R V
---[ Kasan shadow end ]---
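
The split of the populated 1179392K range into 1G + 126M + 1792K can be
reproduced with a small userspace calculation. This is only a sketch of
the "largest mapping that fits" rule, assuming every allocation succeeds
(the patch additionally falls back when memblock cannot provide a
hugepage); count_mappings and struct map_count are names made up for this
example.

```c
#include <stdint.h>

#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

/* Number of mappings of each size needed to cover a range. */
struct map_count {
	uint64_t pgd_1g;
	uint64_t pmd_2m;
	uint64_t pte_4k;
};

/*
 * Walk [start, end) and, at each step, pick the largest mapping whose
 * alignment and remaining length allow it. Both bounds must be 4K aligned.
 */
struct map_count count_mappings(uint64_t start, uint64_t end)
{
	struct map_count c = { 0, 0, 0 };
	uint64_t va = start;

	while (va < end) {
		uint64_t left = end - va;

		if ((va % SZ_1G) == 0 && left >= SZ_1G) {
			c.pgd_1g++;
			va += SZ_1G;
		} else if ((va % SZ_2M) == 0 && left >= SZ_2M) {
			c.pmd_2m++;
			va += SZ_2M;
		} else {
			c.pte_4k++;
			va += SZ_4K;
		}
	}

	return c;
}
```

For a range of 1179392K starting on a 1G boundary this yields one 1G
mapping, 63 2M mappings (126M) and 448 4K mappings (1792K).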

Signed-off-by: Alexandre Ghiti 
---
 arch/riscv/mm/kasan_init.c | 101 +++--
 1 file changed, 73 insertions(+), 28 deletions(-)

diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
index a8a2ffd9114a..8f11b73018b1 100644
--- a/arch/riscv/mm/kasan_init.c
+++ b/arch/riscv/mm/kasan_init.c
@@ -47,37 +47,82 @@ asmlinkage void __init kasan_early_init(void)
local_flush_tlb_all();
 }
 
-static void __init populate(void *start, void *end)
+static void kasan_populate_pte(pmd_t *pmd, unsigned long vaddr, unsigned long end)
+{
+   phys_addr_t phys_addr;
+   pte_t *ptep = memblock_alloc(PTRS_PER_PTE * sizeof(pte_t), PAGE_SIZE);
+
+   do {
+   phys_addr = memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
+   set_pte(ptep, pfn_pte(PFN_DOWN(phys_addr), PAGE_KERNEL));
+   } while (ptep++, vaddr += PAGE_SIZE, vaddr != end);
+
+   set_pmd(pmd, pfn_pmd(PFN_DOWN(__pa(ptep)), PAGE_TABLE));
+}
+
+static void kasan_populate_pmd(pgd_t *pgd, unsigned long vaddr, unsigned long end)
+{
+   phys_addr_t phys_addr;
+   pmd_t *pmdp = memblock_alloc(PTRS_PER_PMD * sizeof(pmd_t), PAGE_SIZE);
+   unsigned long next;
+
+   do {
+   next = pmd_addr_end(vaddr, end);
+
+   if (IS_ALIGNED(vaddr, PMD_SIZE) && (next - vaddr) >= PMD_SIZE) {
+   phys_addr = memblock_phys_alloc(PMD_SIZE, PMD_SIZE);
+   if (phys_addr) {
+   set_pmd(pmdp, pfn_pmd(PFN_DOWN(phys_addr), PAGE_KERNEL));
+   continue;
+   }
+   }
+
+   kasan_populate_pte(pmdp, vaddr, end);
+   } while (pmdp++, vaddr = next, vaddr != end);
+
+   /*
+    * Wait for the whole PGD to be populated before setting the PGD in
+    * the page table, otherwise, if we did set the PGD before populating
+    * it entirely, memblock could allocate a page at a physical address
+    * where KASAN is not populated yet and then we'd get a page fault.
+    */
+   set_pgd(pgd, pfn_pgd(PFN_DOWN(__pa(pmdp)), PAGE_TABLE));
+}
+
+static void kasan_populate_pgd(unsigned long vaddr, unsigned long end)
+{
+   phys_addr_t phys_addr;
+   pgd_t *pgdp = pgd_offset_k(vaddr);
+   unsigned long next;
+
+   do {
+   next = pgd_addr_end(vaddr, end);
+
+   if (IS_ALIGNED(vaddr, PGDIR_SIZE) && (next - vaddr) >= PGDIR_SIZE) {
+   phys_addr = memblock_phys_alloc(PGDIR_SIZE, PGDIR_SIZE);
+   if (phys_addr) {
+   set_pgd(pgdp, pfn_pgd(PFN_DOWN(phys_addr), PAGE_KERNEL));
+   continue;
+   }
+   }
+
+   kasan_populate_pmd(pgdp, vaddr, end);
+   } while (pgdp++, vaddr = next, vaddr != end);
+}
+
+/*
+ * This function populates KASAN shadow region focusing on hugepages in
+ * order to minimize the page table cost and TLB usage too.
+ * Note that start must be PGDIR_SIZE-aligned in SV39 which amounts to be
+ * 1G aligned (that represents a 8G alignment constraint on virtual address
+ * ranges because of KASAN_SHADOW_SCALE_SHIFT).
+ */
+static void __init kasan_populate(void *start, void *end)
 {
-   unsigned long i, offset;
unsigned long vaddr = (unsigned long)start & PAGE_MASK;
unsigned long vend = PAGE_ALIGN((unsigned long)end);
-   unsigned long n_pages = (vend - vaddr) / PAGE_SIZE;
-   unsigned long n_ptes =
-   ((n_pages + PTRS_PER_PTE) & -PTRS_PER_PTE) / PTRS_PER_PTE;
-   unsigned long n_pmds =
-   ((n_ptes + PTRS_PER_PMD) & -PTRS_PER_PMD) / PTRS_PER_PMD;
-
-   pte_t *pte =

Re: [PATCH 3/8] scsi: ufshpb: Add region's reads counter

2021-02-01 Thread gre...@linuxfoundation.org

On Mon, Feb 01, 2021 at 07:51:19AM +0000, Avri Altman wrote:
> > 
> > On Mon, Feb 01, 2021 at 07:12:53AM +0000, Avri Altman wrote:
> > > > > +#define WORK_PENDING 0
> > > > > +#define ACTIVATION_THRSHLD 4 /* 4 IOs */
> > > > Rather than fixing it with macro, how about using sysfs and make it
> > > > configurable?
> > > Yes.
> > > I will add a patch making all the logic configurable.
> > > As all those are hpb-related parameters, I think module parameters are
> > > more adequate.
> > 
> > No, this is not the 1990's, please never add new module parameters to
> > drivers.  If not for the basic problem of they do not work on a
> > per-device basis, but on a per-driver basis, which is what you almost
> > never want.
> OK.
> 
> > 
> > But why would you want to change this value, why can't the driver "just
> > work" and not need manual intervention?
> It is.
> But those are knobs each vendor may want to tweak,
> so it'll be optimized for its internal device's implementation.
> 
> Tweaking the parameters, as well as the entire logic, is really an endless task.
> Some logic works better for some scenarios, while falling behind on others.

Shouldn't the hardware know how to handle this dynamically?  If not, how
is a user going to know?

> How about leaving it for now, to be elaborated in the future?

I do not care, just do not make it a module parameter, for the reason
that it does not work on a per-device basis.

> Maybe even can be a part of a scheme, to make the logic proprietary?

What do you mean by "proprietary"?

thanks,

greg k-h

[PATCH] ASoC: Intel: catpt: remove unneeded semicolon

2021-02-01 Thread Yang Li

Eliminate the following coccicheck warning:
./sound/soc/intel/catpt/pcm.c:355:2-3: Unneeded semicolon

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 
---
 sound/soc/intel/catpt/pcm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/catpt/pcm.c b/sound/soc/intel/catpt/pcm.c
index e5d54bb..88a0879 100644
--- a/sound/soc/intel/catpt/pcm.c
+++ b/sound/soc/intel/catpt/pcm.c
@@ -352,7 +352,7 @@ static int catpt_dai_apply_usettings(struct snd_soc_dai *dai,
break;
default:
return 0;
-   };
+   }
 
list_for_each_entry(pos, &component->card->snd_card->controls, list) {
if (pos->private_data == component &&
-- 
1.8.3.1

KASAN: use-after-free Read in rxrpc_send_data_packet

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:78031381 bpf: Drop disabled LSM hooks from the sleepable set
git tree:   bpf
console output: https://syzkaller.appspot.com/x/log.txt?x=11274530d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=be33d8015c9de024
dashboard link: https://syzkaller.appspot.com/bug?extid=174de899852504e4a74a
compiler:   gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+174de899852504e4a...@syzkaller.appspotmail.com

==
BUG: KASAN: use-after-free in rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372
Read of size 4 at addr 888011606e04 by task kworker/0:0/5

CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.11.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: krxrpcd rxrpc_process_call
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:230
 __kasan_report mm/kasan/report.c:396 [inline]
 kasan_report.cold+0x79/0xd5 mm/kasan/report.c:413
 rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372
 rxrpc_resend net/rxrpc/call_event.c:266 [inline]
 rxrpc_process_call+0x1634/0x1f60 net/rxrpc/call_event.c:412
 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Allocated by task 2318:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:401 [inline]
 kasan_kmalloc.constprop.0+0x82/0xa0 mm/kasan/common.c:429
 kasan_slab_alloc include/linux/kasan.h:209 [inline]
 slab_post_alloc_hook mm/slab.h:512 [inline]
 slab_alloc_node mm/slub.c:2891 [inline]
 kmem_cache_alloc_node+0x1e0/0x470 mm/slub.c:2927
 __alloc_skb+0x71/0x5a0 net/core/skbuff.c:198
 alloc_skb include/linux/skbuff.h:1099 [inline]
 alloc_skb_with_frags+0x93/0x5d0 net/core/skbuff.c:5894
 sock_alloc_send_pskb+0x793/0x920 net/core/sock.c:2348
 rxrpc_send_data+0xb51/0x2bf0 net/rxrpc/sendmsg.c:358
 rxrpc_do_sendmsg+0xc03/0x1350 net/rxrpc/sendmsg.c:744
 rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:560
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 sys_sendmsg+0x6e8/0x810 net/socket.c:2345
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2399
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2432
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 2318:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:356
 kasan_slab_free+0xe1/0x110 mm/kasan/common.c:362
 kasan_slab_free include/linux/kasan.h:192 [inline]
 slab_free_hook mm/slub.c:1547 [inline]
 slab_free_freelist_hook+0x5d/0x150 mm/slub.c:1580
 slab_free mm/slub.c:3142 [inline]
 kmem_cache_free+0x82/0x350 mm/slub.c:3158
 kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:636
 __kfree_skb net/core/skbuff.c:693 [inline]
 kfree_skb net/core/skbuff.c:710 [inline]
 kfree_skb+0x140/0x3f0 net/core/skbuff.c:704
 rxrpc_free_skb+0x11d/0x150 net/rxrpc/skbuff.c:78
 rxrpc_cleanup_ring net/rxrpc/call_object.c:485 [inline]
 rxrpc_release_call+0x5dd/0x860 net/rxrpc/call_object.c:552
 rxrpc_release_calls_on_socket+0x21c/0x300 net/rxrpc/call_object.c:579
 rxrpc_release_sock net/rxrpc/af_rxrpc.c:885 [inline]
 rxrpc_release+0x263/0x5a0 net/rxrpc/af_rxrpc.c:916
 __sock_release+0xcd/0x280 net/socket.c:597
 sock_close+0x18/0x20 net/socket.c:1256
 __fput+0x283/0x920 fs/file_table.c:280
 task_work_run+0xdd/0x190 kernel/task_work.c:140
 get_signal+0x1c7f/0x20f0 kernel/signal.c:2554
 arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
 handle_signal_work kernel/entry/common.c:147 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 exit_to_user_mode_prepare+0x148/0x250 kernel/entry/common.c:201
 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:302
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at 888011606dc0
 which belongs to the cache skbuff_head_cache of size 232
The buggy address is located 68 bytes inside of
 232-byte region [888011606dc0, 888011606ea8)
The buggy address belongs to the page:
page:03512b7c refcount:1 mapcount:0 mapping: index:0x0 pfn:0x11606
flags: 0xfff200(slab)
raw: 00fff200 ea8b6e00 000b000b 888010cbbc80
raw:  000c000c 0001 
page dumped because: kasan: bad access detected

Memory state a

Re: possible deadlock in send_sigio (2)

2021-02-01 Thread Dmitry Vyukov

On Fri, Jan 29, 2021 at 6:36 PM syzbot wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 8d1ddb5e79374fb277985a6b3faa2ed8631c5b4c
> Author: Boqun Feng 
> Date:   Thu Nov 5 06:23:51 2020 +0000
>
> fcntl: Fix potential deadlock in send_sig{io, urg}()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17455db4d0
> start commit:   7b1b868e Merge tag 'for-linus' of git://git.kernel.org/pub..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3416bb960d5c705d
> dashboard link: https://syzkaller.appspot.com/bug?extid=907b8537e3b0e55151fc
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=163e046b50
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12f8b62350
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: fcntl: Fix potential deadlock in send_sig{io, urg}()
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: fcntl: Fix potential deadlock in send_sig{io, urg}()

Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-01 Thread Marek Szyprowski

Hi Saravana,

On 30.01.2021 05:08, Saravana Kannan wrote:
> On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  wrote:
>> This patch series solves two general issues with fw_devlink=on
>>
>> Patch 1/2 addresses the issue of firmware nodes that look like they'll
>> have struct devices created for them, but will never actually have
>> struct devices added for them. For example, DT nodes with a compatible
>> property that don't have devices added for them.
>>
>> Patch 2/2 addresses (for static kernels) the issue of optional suppliers
>> that'll never have a driver registered for them. So, if the device could
>> have probed with fw_devlink=permissive with a static kernel, this patch
>> should allow those devices to probe with a fw_devlink=on. This doesn't
>> solve it for the case where modules are enabled because there's no way
>> to tell if a driver will never be registered or it's just about to be
>> registered. I have some other ideas for that, but it'll have to come
>> later, after thinking about it a bit.
>>
>> These two patches might remove the need for several other patches that
>> went in as fixes for commit e590474768f1 ("driver core: Set
>> fw_devlink=on by default"), but I think all those fixes are good
>> changes. So I think we should leave those in.
>>
>> Marek, Geert,
>>
>> Can you try this series on a static kernel with your OF_POPULATED
>> changes reverted? I just want to make sure these patches can identify
>> and fix those cases.
>>
>> Tudor,
>>
>> You should still make the clock driver fix (because it's a bug), but I
>> think this series will fix your issue too (even without the clock driver
>> fix). Can you please give this a shot?
> Marek, Geert, Tudor,
>
> Forgot to say that this will probably fix your issues only in a static
> kernel. So please try this with a static kernel. If you can also try
> and confirm that this does not fix the issue for a modular kernel,
> that'd be good too.

I've checked those patches on top of linux next-20210129 with 
c09a3e6c97f0 ("soc: samsung: pm_domains: Convert to regular platform 
driver") commit reverted. Sadly it doesn't help. All devices that belong 
to the Exynos power domains are never probed and stay endlessly on the 
deferred devices list. I've used a static kernel build - the one from 
exynos_defconfig.

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

[PATCH] drm/amd/amdgpu/amdgpu_debugfs: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE

2021-02-01 Thread Jiapeng Chong

Fix the following coccicheck warning:

./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1591:0-23: WARNING:
fops_sclk_set should be defined with DEFINE_DEBUGFS_ATTRIBUTE.

./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1588:0-23: WARNING:
fops_ib_preempt should be defined with DEFINE_DEBUGFS_ATTRIBUTE.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index a6667a2..54f3f68 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1585,11 +1585,8 @@ static int amdgpu_debugfs_sclk_set(void *data, u64 val)
return 0;
 }
 
-DEFINE_SIMPLE_ATTRIBUTE(fops_ib_preempt, NULL,
-   amdgpu_debugfs_ib_preempt, "%llu\n");
-
-DEFINE_SIMPLE_ATTRIBUTE(fops_sclk_set, NULL,
-   amdgpu_debugfs_sclk_set, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(fops_ib_preempt, NULL, amdgpu_debugfs_ib_preempt, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(fops_sclk_set, NULL, amdgpu_debugfs_sclk_set, "%llu\n");
 
 int amdgpu_debugfs_init(struct amdgpu_device *adev)
 {
-- 
1.8.3.1

[PATCH] cpufreq: Remove CPUFREQ_STICKY flag

2021-02-01 Thread Viresh Kumar

During a cpufreq driver's registration, if the ->init() callback fails for
all the CPUs, then there is not much point in keeping the driver around,
as it will only cause unnecessary noise; for example, the cpufreq
core will try to suspend/resume a driver which never got registered
properly.

The removal of such a driver is avoided if the driver carries the
CPUFREQ_STICKY flag. This was added way back [1] in 2004 and perhaps no
one should ever need it now. A lot of drivers do set this flag, probably
because they just copied it from another driver.

Remove the flag and update the relevant drivers.
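
The resulting registration behaviour can be sketched in userspace as
follows. This is a simplification that assumes one policy per CPU whose
->init() succeeded; register_driver_result and its parameters are
hypothetical names, not the cpufreq core API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * init_ok[i] tells whether ->init() succeeded for CPU i. Each success
 * stands for one initialized policy. With no policy at all the driver
 * is unregistered again and -ENODEV is returned; there is no sticky
 * flag that could keep it around.
 */
int register_driver_result(const bool *init_ok, size_t ncpus)
{
	size_t i, policies = 0;

	for (i = 0; i < ncpus; i++) {
		if (init_ok[i])
			policies++;
	}

	return policies ? 0 : -ENODEV;
}
```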

[1] https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/include/linux/cpufreq.h?id=7cc9f0d9a1ab04cedc60d64fd8dcf7df224a3b4d

Signed-off-by: Viresh Kumar 
---
 drivers/cpufreq/cpufreq-dt.c   |  2 +-
 drivers/cpufreq/cpufreq.c  |  3 +--
 drivers/cpufreq/davinci-cpufreq.c  |  2 +-
 drivers/cpufreq/loongson1-cpufreq.c|  2 +-
 drivers/cpufreq/mediatek-cpufreq.c |  2 +-
 drivers/cpufreq/omap-cpufreq.c |  2 +-
 drivers/cpufreq/qcom-cpufreq-hw.c  |  2 +-
 drivers/cpufreq/s3c24xx-cpufreq.c  |  2 +-
 drivers/cpufreq/s5pv210-cpufreq.c  |  2 +-
 drivers/cpufreq/sa1100-cpufreq.c   |  2 +-
 drivers/cpufreq/sa1110-cpufreq.c   |  2 +-
 drivers/cpufreq/scmi-cpufreq.c |  2 +-
 drivers/cpufreq/scpi-cpufreq.c |  2 +-
 drivers/cpufreq/spear-cpufreq.c|  2 +-
 drivers/cpufreq/tegra186-cpufreq.c |  2 +-
 drivers/cpufreq/tegra194-cpufreq.c |  3 +--
 drivers/cpufreq/vexpress-spc-cpufreq.c |  3 +--
 include/linux/cpufreq.h| 17 +++--
 18 files changed, 24 insertions(+), 30 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index ad4234518ef6..b1e1bdc63b01 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -175,7 +175,7 @@ static int cpufreq_exit(struct cpufreq_policy *policy)
 }
 
 static struct cpufreq_driver dt_cpufreq_driver = {
-   .flags = CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK |
+   .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
 CPUFREQ_IS_COOLING_DEV,
.verify = cpufreq_generic_frequency_table_verify,
.target_index = set_target,
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d0a3525ce27f..7d0ae968def7 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2810,8 +2810,7 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
if (ret)
goto err_boost_unreg;
 
-   if (!(cpufreq_driver->flags & CPUFREQ_STICKY) &&
-   list_empty(&cpufreq_policy_list)) {
+   if (unlikely(list_empty(&cpufreq_policy_list))) {
/* if all ->init() calls failed, unregister */
ret = -ENODEV;
pr_debug("%s: No CPU initialized for driver %s\n", __func__,
diff --git a/drivers/cpufreq/davinci-cpufreq.c b/drivers/cpufreq/davinci-cpufreq.c
index 91f477a6cbc4..9e97f60f8199 100644
--- a/drivers/cpufreq/davinci-cpufreq.c
+++ b/drivers/cpufreq/davinci-cpufreq.c
@@ -95,7 +95,7 @@ static int davinci_cpu_init(struct cpufreq_policy *policy)
 }
 
 static struct cpufreq_driver davinci_driver = {
-   .flags  = CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK,
+   .flags  = CPUFREQ_NEED_INITIAL_FREQ_CHECK,
.verify = cpufreq_generic_frequency_table_verify,
.target_index   = davinci_target,
.get= cpufreq_generic_get,
diff --git a/drivers/cpufreq/loongson1-cpufreq.c b/drivers/cpufreq/loongson1-cpufreq.c
index 86f612593e49..fb72d709db56 100644
--- a/drivers/cpufreq/loongson1-cpufreq.c
+++ b/drivers/cpufreq/loongson1-cpufreq.c
@@ -116,7 +116,7 @@ static int ls1x_cpufreq_exit(struct cpufreq_policy *policy)
 
 static struct cpufreq_driver ls1x_cpufreq_driver = {
.name   = "cpufreq-ls1x",
-   .flags  = CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK,
+   .flags  = CPUFREQ_NEED_INITIAL_FREQ_CHECK,
.verify = cpufreq_generic_frequency_table_verify,
.target_index   = ls1x_cpufreq_target,
.get= cpufreq_generic_get,
diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
index 022e3e966e71..f2e491b25b07 100644
--- a/drivers/cpufreq/mediatek-cpufreq.c
+++ b/drivers/cpufreq/mediatek-cpufreq.c
@@ -463,7 +463,7 @@ static int mtk_cpufreq_exit(struct cpufreq_policy *policy)
 }
 
 static struct cpufreq_driver mtk_cpufreq_driver = {
-   .flags = CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK |
+   .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
 CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
 CPUFREQ_IS_COOLING_DEV,
.verify = cpufreq_generic_frequency_table_verify,
diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
index 3694bb030df3..e035ee216b0f 100644
--- a/drivers/cpuf

[PATCH] ASoC: fsl_xcvr: remove unneeded semicolon

2021-02-01 Thread Yang Li

Eliminate the following coccicheck warning:
./sound/soc/fsl/fsl_xcvr.c:739:2-3: Unneeded semicolon

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 
---
 sound/soc/fsl/fsl_xcvr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/fsl/fsl_xcvr.c b/sound/soc/fsl/fsl_xcvr.c
index 3d58c88..65b388a 100644
--- a/sound/soc/fsl/fsl_xcvr.c
+++ b/sound/soc/fsl/fsl_xcvr.c
@@ -736,7 +736,7 @@ static int fsl_xcvr_load_firmware(struct fsl_xcvr *xcvr)
/* clean current page, including data memory */
memset_io(xcvr->ram_addr, 0, size);
}
-   };
+   }
 
 err_firmware:
release_firmware(fw);
-- 
1.8.3.1

[PATCH 1/2] KVM: selftests: Keep track of memslots more efficiently

2021-02-01 Thread Maciej S. Szmigiero

From: "Maciej S. Szmigiero" 

The KVM selftest framework was using a simple list for keeping track of
the memslots currently in use.
This resulted in lookups and adding a single memslot being O(n), the
latter due to linear scanning of the existing memslot set to check for
the presence of any conflicting entries.

Before this change, benchmarking a high count of memslots was more or less
impossible as pretty much all the benchmark time was spent in the
selftest framework code.

We can simply use an rbtree for keeping track of both gfn and hva.
We don't need an interval tree for hva here as we can't have overlapping
memslots because we allocate a completely new memory chunk for each new
memslot.
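
Why an ordinary search tree is enough can be seen in a small standalone
sketch. It uses a plain unbalanced binary search tree keyed by start
address instead of the kernel rbtree, and struct region, region_find and
region_insert are names made up for this example. Since the stored
regions never overlap, a query that misses the current node can only
overlap regions on one side of it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct region {
	uint64_t start;              /* first address covered by the region */
	uint64_t size;               /* length in bytes, non-zero */
	struct region *left, *right; /* children ordered by start address */
};

/* Return a region overlapping [start, end] (inclusive), or NULL if none does. */
struct region *region_find(struct region *root, uint64_t start, uint64_t end)
{
	struct region *node = root;

	while (node) {
		uint64_t node_end = node->start + node->size - 1;

		if (start <= node_end && end >= node->start)
			return node;

		/* No overlap here, so the whole query lies on one side of this node. */
		node = (start < node->start) ? node->left : node->right;
	}

	return NULL;
}

/* Insert a new region; refuse it if it is empty or conflicts with an existing one. */
bool region_insert(struct region **root, uint64_t start, uint64_t size)
{
	struct region **link = root;
	struct region *new_region;

	if (size == 0 || region_find(*root, start, start + size - 1))
		return false;

	while (*link)
		link = (start < (*link)->start) ? &(*link)->left : &(*link)->right;

	new_region = calloc(1, sizeof(*new_region));
	if (!new_region)
		return false;

	new_region->start = start;
	new_region->size = size;
	*link = new_region;
	return true;
}
```

With a balanced tree such as the rbtree, both the conflict check and the
lookup take O(log n) steps instead of a linear scan.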

Signed-off-by: Maciej S. Szmigiero 
---
 tools/testing/selftests/kvm/Makefile  |   2 +-
 tools/testing/selftests/kvm/lib/kvm_util.c| 141 ++
 .../selftests/kvm/lib/kvm_util_internal.h |  15 +-
 tools/testing/selftests/kvm/lib/rbtree.c  |   1 +
 4 files changed, 124 insertions(+), 35 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/lib/rbtree.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index fe41c6a0fa67..e7c6237d7383 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -33,7 +33,7 @@ ifeq ($(ARCH),s390)
UNAME_M := s390x
 endif
 
-LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
+LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
 LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
 LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index fa5a90e6c6f0..632433dbfa25 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -195,7 +195,9 @@ struct kvm_vm *vm_create(enum vm_guest_mode mode, uint64_t phy_pages, int perm)
TEST_ASSERT(vm != NULL, "Insufficient Memory");
 
INIT_LIST_HEAD(&vm->vcpus);
-   INIT_LIST_HEAD(&vm->userspace_mem_regions);
+   vm->regions.gpa_tree = RB_ROOT;
+   vm->regions.hva_tree = RB_ROOT;
+   hash_init(vm->regions.slot_hash);
 
vm->mode = mode;
vm->type = 0;
@@ -347,13 +349,14 @@ struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages,
  */
 void kvm_vm_restart(struct kvm_vm *vmp, int perm)
 {
+   int ctr;
struct userspace_mem_region *region;
 
vm_open(vmp, perm);
if (vmp->has_irqchip)
vm_create_irqchip(vmp);
 
-   list_for_each_entry(region, &vmp->userspace_mem_regions, list) {
+   hash_for_each(vmp->regions.slot_hash, ctr, region, slot_node) {
int ret = ioctl(vmp->fd, KVM_SET_USER_MEMORY_REGION, &region->region);
TEST_ASSERT(ret == 0, "KVM_SET_USER_MEMORY_REGION IOCTL failed,\n"
"  rc: %i errno: %i\n"
@@ -416,14 +419,21 @@ uint32_t kvm_vm_reset_dirty_ring(struct kvm_vm *vm)
 static struct userspace_mem_region *
 userspace_mem_region_find(struct kvm_vm *vm, uint64_t start, uint64_t end)
 {
-   struct userspace_mem_region *region;
+   struct rb_node *node;
 
-   list_for_each_entry(region, &vm->userspace_mem_regions, list) {
+   for (node = vm->regions.gpa_tree.rb_node; node; ) {
+   struct userspace_mem_region *region =
+   container_of(node, struct userspace_mem_region, gpa_node);
uint64_t existing_start = region->region.guest_phys_addr;
uint64_t existing_end = region->region.guest_phys_addr
+ region->region.memory_size - 1;
if (start <= existing_end && end >= existing_start)
return region;
+
+   if (start < existing_start)
+   node = node->rb_left;
+   else
+   node = node->rb_right;
}
 
return NULL;
@@ -538,11 +548,16 @@ void kvm_vm_release(struct kvm_vm *vmp)
 }
 
 static void __vm_mem_region_delete(struct kvm_vm *vm,
-  struct userspace_mem_region *region)
+  struct userspace_mem_region *region,
+  bool unlink)
 {
int ret;
 
-   list_del(&region->list);
+   if (unlink) {
+   rb_erase(&region->gpa_node, &vm->regions.gpa_tree);
+   rb_erase(&region->hva_node, &vm->regions.hva_tree);
+   hash_del(&region->slot_node);
+   }
 
region->region.memory_size = 0;
ret = ioctl(vm->fd, KVM_SET_USER_MEMORY_REGION, &region->region);
@@ -5

Re: [PATCH V1 1/3] scsi: ufs: export api for use in vendor file

2021-02-01 Thread nitirawa


On 2021-01-31 19:29, Avri Altman wrote:


Exporting functions ufshcd_set_dev_pwr_mode, ufshcd_disable_vreg
and ufshcd_enable_vreg so that vendor drivers can make use of
them in setting vendor specific regulator setting
in vendor specific file.

As for ufshcd_{enable,disable}_vreg - maybe inline ufshcd_toggle_vreg
and use it instead?



Signed-off-by: Nitin Rawat 
Signed-off-by: Bao D. Nguyen 
Signed-off-by: Veerabhadrarao Badiganti 
---
 drivers/scsi/ufs/ufshcd.c | 9 ++---
 drivers/scsi/ufs/ufshcd.h | 4 
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 9c691e4..000a03a 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -8091,7 +8091,7 @@ static int ufshcd_config_vreg(struct device *dev,
return ret;
 }

-static int ufshcd_enable_vreg(struct device *dev, struct ufs_vreg *vreg)
+int ufshcd_enable_vreg(struct device *dev, struct ufs_vreg *vreg)
 {
int ret = 0;

@@ -8110,8 +8110,9 @@ static int ufshcd_enable_vreg(struct device *dev, struct ufs_vreg *vreg)
 out:
return ret;
 }
+EXPORT_SYMBOL(ufshcd_enable_vreg);

Why do you need to export it across the kernel?
Doesn't making it non-static suffice?
Do you need it for a loadable module?



-static int ufshcd_disable_vreg(struct device *dev, struct ufs_vreg *vreg)
+int ufshcd_disable_vreg(struct device *dev, struct ufs_vreg *vreg)
 {
int ret = 0;

@@ -8131,6 +8132,7 @@ static int ufshcd_disable_vreg(struct device *dev, struct ufs_vreg *vreg)
 out:
return ret;
 }
+EXPORT_SYMBOL(ufshcd_disable_vreg);

 static int ufshcd_setup_vreg(struct ufs_hba *hba, bool on)
 {
@@ -8455,7 +8457,7 @@ ufshcd_send_request_sense(struct ufs_hba *hba, struct scsi_device *sdp)
  * Returns 0 if requested power mode is set successfully
  * Returns non-zero if failed to set the requested power mode
  */
-static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
+int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 enum ufs_dev_pwr_mode pwr_mode)
 {
unsigned char cmd[6] = { START_STOP };
@@ -8513,6 +8515,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
hba->host->eh_noresume = 0;
return ret;
 }
+EXPORT_SYMBOL(ufshcd_set_dev_pwr_mode);

 static int ufshcd_link_state_transition(struct ufs_hba *hba,
enum uic_link_state req_link_state,

diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index ee61f82..1410c95 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -997,6 +997,10 @@ extern int ufshcd_dme_get_attr(struct ufs_hba *hba, u32 attr_sel,
   u32 *mib_val, u8 peer);
 extern int ufshcd_config_pwr_mode(struct ufs_hba *hba,
struct ufs_pa_layer_attr *desired_pwr_mode);
+extern int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
+   enum ufs_dev_pwr_mode pwr_mode);
+extern int ufshcd_enable_vreg(struct device *dev, struct ufs_vreg *vreg);
+extern int ufshcd_disable_vreg(struct device *dev, struct ufs_vreg *vreg);


 /* UIC command interfaces for DME primitives */
 #define DME_LOCAL  0
--
2.7.4


Hi Avri,
Thanks for reviewing it.
ufs-qcom.c can be a loadable module, so just inlining won't suffice in 
that case.

Hence export is needed.

Thanks,
Nitin

[PATCH 2/2] KVM: selftests: add a memslot-related performance benchmark

2021-02-01 Thread Maciej S. Szmigiero

From: "Maciej S. Szmigiero" 

This benchmark contains the following tests:
* Map test, where the host unmaps guest memory while the guest writes to
it (maps it).

The test is designed in a way to make the unmap operation on the host
take a negligible amount of time in comparison with the mapping
operation in the guest.

The test area is actually split in two: the first half is being mapped
by the guest while the second half is being unmapped by the host.
Then a guest <-> host sync happens and the areas are reversed.

* Unmap test which is broadly similar to the above map test, but it is
designed in an opposite way: to make the mapping operation in the guest
take a negligible amount of time in comparison with the unmap operation
on the host.
This test is available in two variants: with per-page unmap operation
or a chunked one (using 2 MiB chunk size).

* Move active area test which involves moving the last (highest gfn)
memslot a bit back and forth on the host while the guest is
concurrently writing around the area being moved (including over the
moved memslot).

* Move inactive area test which is similar to the previous move active
area test, but now guest writes all happen outside of the area being
moved.

* Read / write test in which the guest writes to the beginning of each
page of the test area while the host writes to the middle of each such
page.
Then each side checks the values the other side has written.
This particular test is not expected to give different results depending
on particular memslots implementation, it is meant as a rough sanity
check and to provide insight on the spread of test results expected.

Each test performs its operation in a loop until a test period ends
(this is 5 seconds by default, but it is configurable).
Then the total count of loops done is divided by the actual elapsed
time to give the test result.

The tests have a configurable memslot cap with the "-s" test option, by
default the system maximum is used.
Each test is repeated a particular number of times (by default 20
times), the best result achieved is printed.

The test memory area is divided equally between memslots, the remainder
is added to the last memslot.
The test area size does not depend on the number of memslots in use.
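
The split described above can be sketched as follows. This is only an
illustration: slot_pages() is a hypothetical helper, not a function taken
from memslot_perf_test.c, and it assumes that sizes are counted in pages.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from memslot_perf_test.c): number of pages
 * given to memslot `slot` when `total_pages` are divided equally between
 * `nslots` memslots and the remainder is added to the last memslot.
 */
uint64_t slot_pages(uint64_t total_pages, uint32_t nslots, uint32_t slot)
{
	uint64_t per_slot = total_pages / nslots;

	/* Only the last memslot absorbs what does not divide evenly. */
	if (slot == nslots - 1)
		return per_slot + total_pages % nslots;

	return per_slot;
}
```

With 10 pages and 3 memslots this gives 3, 3 and 4 pages, so the total
covered area stays the same regardless of the memslot count.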

The tests also measure the time that it took to add all these memslots.
The best result from the tests that use the whole test area is printed
after all the requested tests are done.

In general, these tests are designed to use as much memory as possible
(within reason) while still doing 100+ loops even on high memslot counts
with the default test length.
Increasing the test runtime makes it increasingly more likely that some
event will happen on the system during the test run, which might lower
the test result.

Signed-off-by: Maciej S. Szmigiero 
---
 tools/testing/selftests/kvm/.gitignore|1 +
 tools/testing/selftests/kvm/Makefile  |1 +
 .../testing/selftests/kvm/memslot_perf_test.c | 1039 +
 3 files changed, 1041 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/memslot_perf_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index ce8f4ad39684..059a655053ca 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -31,3 +31,4 @@
 /kvm_create_max_vcpus
 /set_memory_region_test
 /steal_time
+/memslot_perf_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index e7c6237d7383..2abc9e182c30 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -65,6 +65,7 @@ TEST_GEN_PROGS_x86_64 += dirty_log_perf_test
 TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus
 TEST_GEN_PROGS_x86_64 += set_memory_region_test
 TEST_GEN_PROGS_x86_64 += steal_time
+TEST_GEN_PROGS_x86_64 += memslot_perf_test
 
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list-sve
diff --git a/tools/testing/selftests/kvm/memslot_perf_test.c b/tools/testing/selftests/kvm/memslot_perf_test.c
new file mode 100644
index ..1a632094d2eb
--- /dev/null
+++ b/tools/testing/selftests/kvm/memslot_perf_test.c
@@ -0,0 +1,1039 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * A memslot-related performance benchmark.
+ *
+ * Copyright (C) 2021 Oracle and/or its affiliates.
+ *
+ * Basic guest setup / host vCPU thread code lifted from set_memory_region_test.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+#include 
+#include 
+
+#define VCPU_ID 0
+
+#define MEM_SIZE   ((512U << 20) + 4096)
+#define MEM_SIZE_PAGES (MEM_SIZE / 4096)
+#define MEM_GPA 0x1000UL
+#define MEM_AUX_GPA MEM_GPA
+#define MEM_SYNC_GPA   MEM_AUX_GPA
+#define MEM_TEST_GPA   (MEM_AUX_GPA + 4096)
+#define MEM_TEST_SIZE

Re: [PATCH 23/29] fuse: Avoid comma separated statements

2021-02-01 Thread Miklos Szeredi

On Tue, Aug 25, 2020 at 6:57 AM Joe Perches  wrote:
>
> Use semicolons and braces.

Reference to coding style doc?  Or other important reason?  Or just
personal preference?

Thanks,
Miklos

[PATCH 1/2] KVM: x86/mmu: Make HVA handler retpoline-friendly

2021-02-01 Thread Maciej S. Szmigiero

From: "Maciej S. Szmigiero" 

When retpolines are enabled they have high overhead in the inner loop
inside kvm_handle_hva_range() that iterates over the provided memory area.

Implement a static dispatch there, just like commit 7a02674d154d
("KVM: x86/mmu: Avoid retpoline on ->page_fault() with TDP") did for the
MMU page fault handler.

This significantly improves performance on the unmap test on the existing
kernel memslot code (tested on a Xeon 8167M machine):
30 slots in use:
Test        Before      After       Improvement
Unmap       0.0368s     0.0353s      4%
Unmap 2M    0.000952s   0.000431s   55%

509 slots in use:
Unmap       0.0872s     0.0777s     11%
Unmap 2M    0.00236s    0.00168s    29%

Looks like performing this indirect call via a retpoline might have
interfered with unrolling of the whole loop in the CPU.

Provide such static dispatch only for kvm_unmap_rmapp() and
kvm_age_rmapp() and their TDP MMU equivalents since other handlers are
called in ranges of single byte only, so they already have high overhead
to begin with if walking over a large memory area.
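
As a rough user-space illustration of the static dispatch idea (not the
kernel code itself): the handler pointer is compared against the known hot
handlers, which are then called by name, so the compiler can emit direct
calls for them. unmap_handler() and age_handler() below are hypothetical
stand-ins for kvm_unmap_rmapp() and kvm_age_rmapp(), with made-up bodies.

```c
/* Hypothetical handler type; the real handlers take more arguments. */
typedef int (*range_handler_t)(unsigned long gfn, unsigned long data);

/* Stand-in for kvm_unmap_rmapp(): reports 1 for every gfn except 0. */
int unmap_handler(unsigned long gfn, unsigned long data)
{
	(void)data;
	return gfn != 0;
}

/* Stand-in for kvm_age_rmapp(): reports 1 if gfn has a bit from data set. */
int age_handler(unsigned long gfn, unsigned long data)
{
	return (gfn & data) != 0;
}

/*
 * Static dispatch: known handlers are called directly by name, anything
 * else still goes through the (possibly retpolined) indirect call.
 */
int dispatch(range_handler_t handler, unsigned long gfn, unsigned long data)
{
	if (handler == unmap_handler)
		return unmap_handler(gfn, data);
	else if (handler == age_handler)
		return age_handler(gfn, data);
	else
		return handler(gfn, data);
}

/* Walk [start, end) and OR the handler results together. */
int handle_range(unsigned long start, unsigned long end, unsigned long data,
		 range_handler_t handler)
{
	unsigned long gfn;
	int ret = 0;

	for (gfn = start; gfn < end; gfn++)
		ret |= dispatch(handler, gfn, data);

	return ret;
}
```

The result is the same as with a plain indirect call; only the way the
call is emitted for the two listed handlers changes.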

Signed-off-by: Maciej S. Szmigiero 
---
 arch/x86/kvm/mmu/mmu.c |  59 +--
 arch/x86/kvm/mmu/tdp_mmu.c | 116 ++---
 2 files changed, 112 insertions(+), 63 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6d16481aa29d..4140e308cf30 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1456,6 +1456,45 @@ static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
 slot_rmap_walk_okay(_iter_);   \
 slot_rmap_walk_next(_iter_))
 
+static int kvm_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+struct kvm_memory_slot *slot, gfn_t gfn, int level,
+unsigned long data)
+{
+   u64 *sptep;
+   struct rmap_iterator iter;
+   int young = 0;
+
+   for_each_rmap_spte(rmap_head, &iter, sptep)
+   young |= mmu_spte_age(sptep);
+
+   trace_kvm_age_page(gfn, level, slot, young);
+   return young;
+}
+
+static int kvm_handle_hva_do(struct kvm *kvm,
+struct slot_rmap_walk_iterator *iterator,
+struct kvm_memory_slot *memslot,
+unsigned long data,
+int (*handler)(struct kvm *kvm,
+   struct kvm_rmap_head *rmap_head,
+   struct kvm_memory_slot *slot,
+   gfn_t gfn,
+   int level,
+   unsigned long data))
+{
+#ifdef CONFIG_RETPOLINE
+   if (handler == kvm_unmap_rmapp)
+   return kvm_unmap_rmapp(kvm, iterator->rmap, memslot,
+  iterator->gfn, iterator->level, data);
+   else if (handler == kvm_age_rmapp)
+   return kvm_age_rmapp(kvm, iterator->rmap, memslot,
+iterator->gfn, iterator->level, data);
+   else
+#endif
+   return handler(kvm, iterator->rmap, memslot,
+  iterator->gfn, iterator->level, data);
+}
+
 static int kvm_handle_hva_range(struct kvm *kvm,
unsigned long start,
unsigned long end,
@@ -1495,8 +1534,9 @@ static int kvm_handle_hva_range(struct kvm *kvm,
 KVM_MAX_HUGEPAGE_LEVEL,
 gfn_start, gfn_end - 1,
 &iterator)
-   ret |= handler(kvm, iterator.rmap, memslot,
-  iterator.gfn, iterator.level, 
data);
+   ret |= kvm_handle_hva_do(kvm, &iterator,
+memslot, data,
+handler);
}
}
 
@@ -1539,21 +1579,6 @@ int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
return r;
 }
 
-static int kvm_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
-struct kvm_memory_slot *slot, gfn_t gfn, int level,
-unsigned long data)
-{
-   u64 *sptep;
-   struct rmap_iterator iter;
-   int young = 0;
-
-   for_each_rmap_spte(rmap_head, &iter, sptep)
-   young |= mmu_spte_age(sptep);
-
-   trace_kvm_age_page(gfn, level, slot, young);
-   return young;
-}
-
 static int kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
  struct kvm_memory_slot *slot, gfn_t gfn,
  int level, unsigned long data)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kv

[PATCH] gcc-plugins: Remove unneeded return variable

2021-02-01 Thread Yang Li

This patch removes unneeded return variables, using only
'0' instead.
It fixes the following warning detected by coccinelle:
./scripts/gcc-plugins/structleak_plugin.c:173:14-17: Unneeded variable:
"ret". Return "0" on line 203

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 
---
 scripts/gcc-plugins/structleak_plugin.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/scripts/gcc-plugins/structleak_plugin.c b/scripts/gcc-plugins/structleak_plugin.c
index 29b480c..d7190e4 100644
--- a/scripts/gcc-plugins/structleak_plugin.c
+++ b/scripts/gcc-plugins/structleak_plugin.c
@@ -170,7 +170,6 @@ static void initialize(tree var)
 static unsigned int structleak_execute(void)
 {
basic_block bb;
-   unsigned int ret = 0;
tree var;
unsigned int i;
 
@@ -200,7 +199,7 @@ static unsigned int structleak_execute(void)
initialize(var);
}
 
-   return ret;
+   return 0;
 }
 
 #define PASS_NAME structleak
-- 
1.8.3.1

[PATCH 2/2] KVM: Scalable memslots implementation

2021-02-01 Thread Maciej S. Szmigiero

From: "Maciej S. Szmigiero" 

The current memslot code uses a (reverse) gfn-ordered memslot array for
keeping track of them.
This only allows quick binary search by gfn, quick lookup by hva is not
possible (the implementation has to do a linear scan of the whole memslot
array).

Because the memslot array that is currently in use cannot be modified,
every memslot management operation (create, delete, move, change flags)
has to make a copy of the whole array so it has a scratch copy to work
on.

Strictly speaking, however, it is only necessary to make a copy of the
memslot that is being modified; copying all the memslots currently
present is just a limitation of the array-based memslot implementation.

Two memslot sets, however, are still needed so the VM continues to
run on the currently active set while the requested operation is being
performed on the second, currently inactive one.

In order to have two memslot sets, but only one copy of the actual
memslots it is necessary to split out the memslot data from the
memslot sets.

The memslots themselves should be also kept independent of each other
so they can be individually added or deleted.

These two memslot sets should normally point to the same set of
memslots. They can, however, be desynchronized when performing a
memslot management operation by replacing the memslot to be modified
by its copy.
After the operation is complete, both memslot sets once again
point to the same, common set of memslot data.

This commit implements the aforementioned idea.

The new implementation uses two trees to perform quick lookups:
For tracking of gfn an ordinary rbtree is used since memslots cannot
overlap in the guest address space and so this data structure is
sufficient for ensuring that lookups are done quickly.

For tracking of hva, however, an interval tree is needed since they
can overlap between memslots.
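
The difference between the two lookups can be sketched in user space as
follows. This is a simplified model under stated assumptions, not the
kernel implementation: a sorted array with binary search stands in for the
rbtree, a linear scan stands in for the interval tree query, and
struct range, find_by_gfn() and count_overlaps() are hypothetical names.

```c
#include <stddef.h>

/* Half-open range [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Point lookup over ranges that are sorted by start and do not overlap
 * (the gfn case). Returns the index of the matching range, or -1.
 */
long find_by_gfn(const struct range *r, size_t n, unsigned long gfn)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (gfn < r[mid].start)
			hi = mid;
		else if (gfn >= r[mid].end)
			lo = mid + 1;
		else
			return (long)mid;
	}

	return -1;
}

/*
 * Overlap query (the hva case): ranges may overlap each other, so one
 * query can match several of them. An interval tree answers this without
 * visiting every range; the linear scan here only shows the semantics.
 */
size_t count_overlaps(const struct range *r, size_t n,
		      unsigned long start, unsigned long end)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (r[i].start < end && start < r[i].end)
			hits++;

	return hits;
}
```

A point lookup can return at most one range, while an overlap query over
hva ranges may have to report several memslots.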

ID to memslot mappings are kept in a hash table instead of using
a statically allocated "id_to_index" array.

The "lru slot" mini-cache, that keeps track of the last found-by-gfn
memslot, is still present in the new code.

There was also a desire to make the new structure operate on
"pay as you go" basis, that is, that the user only pays the price of the
memslot count that is actually used, not of the maximum count allowed.
Because of that, the implementation makes it possible to make the
maximum memslot count allowed by the KVM API available for use.

The operation semantics were carefully matched to the original
implementation, the outside-visible behavior should not change.
Only the timing will be different.

Making lookup and memslot management operations O(log(n)) brings
some performance benefits (tested on a Xeon 8167M machine):
509 slots in use:
Test            Before      After       Improvement
Map             0,0246s     0,0240s      2%
Unmap           0,0833s     0,0318s     62%
Unmap 2M        0,00177s    0,000917s   48%
Move active     0,959s      0,816s      15%
Move inactive   0,960s      0,799s      17%
Slot setup      0,0107s     0,00825s    23%

100 slots in use:
Test            Before      After       Improvement
Map             0,0208s     0,0207s     None
Unmap           0,0406s     0,0315s     22%
Unmap 2M        0,000534s   0,000504s    6%
Move active     0,845s      0,828s       2%
Move inactive   0,861s      0,805s       7%
Slot setup      0,00193s    0,00181s     6%

50 slots in use:
Test            Before      After       Improvement
Map             0,0207s     0,0202s      2%
Unmap           0,0360s     0,0317s     12%
Unmap 2M        0,000454s   0,000449s   None
Move active     0,890s      0,875s       2%
Move inactive   0,807s      0,806s      None
Slot setup      0,00108s    0,00103s     4%

30 slots in use:
Test            Before      After       Improvement
Map             0,0205s     0,0202s      1%
Unmap           0,0342s     0,0317s      7%
Unmap 2M        0,000426s   0,000430s   -1% / None
Move active     0,868s      0,841s       3%
Move inactive   0,908s      0,882s       3%
Slot setup      0,000810s   0,000777s    4%

10 slots in use:
Test            Before      After       Improvement
Map             0,0205s     0,0203s     None
Unmap           0,0319s     0,0312s      2%
Unmap 2M        0,000399s   0,000406s   -2%
Move active     0,955s      0,953s      None
Move inactive   0,911s      0,909s      None
Slot setup      0,000286s   0,000284s   None

For comparison, 32k memslots get the following results with
the new code:
Map (8194)  0,0563s
Unmap   0,0351s
Unmap 2M0,0350s
Move active 0,812s
Move inactive   0,847s
Slot setup  0,585s

Since the map test can be done with up to 8194 slots, the result above
for this

Re: [PATCH] ALSA: intel8x0: Fix missing check in snd_intel8x0m_create

2021-02-01 Thread Takashi Iwai

On Sun, 31 Jan 2021 11:09:14 +0100,
Dinghao Liu wrote:
> 
> When device_type == DEVICE_ALI, we should also check the return
> value of pci_iomap() to avoid potential null pointer dereference.
> 
> Signed-off-by: Dinghao Liu 

Thanks, applied.


Takashi

[PATCH v2] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

2021-02-01 Thread Aili Yao

When a page has already been hwpoisoned by an AO action, the process
mapping it may not be killed. That process may later make a syscall that
touches this page and triggers a VM_FAULT_HWPOISON fault. If the fault
happens in kernel mode it may be fixed up by fixup_exception, and the
current code then just returns an error code to the user process.

This is not sufficient: we should send a SIGBUS to the process and log
the info to the console, as we can't trust the process to handle the
error correctly.

Suggested-by: Feng Yang 
Signed-off-by: Aili Yao 
---
 arch/x86/mm/fault.c | 34 +++---
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index f1f1b5a0956a..23095b94cf42 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -631,7 +631,7 @@ static void set_signal_archinfo(unsigned long address,
 
 static noinline void
 no_context(struct pt_regs *regs, unsigned long error_code,
-  unsigned long address, int signal, int si_code)
+  unsigned long address, int signal, int si_code, vm_fault_t fault)
 {
struct task_struct *tsk = current;
unsigned long flags;
@@ -662,12 +662,32 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 * In this case we need to make sure we're not recursively
 * faulting through the emulate_vsyscall() logic.
 */
+
+   if (IS_ENABLED(CONFIG_MEMORY_FAILURE) &&
+   fault & (VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) {
+   unsigned int lsb = 0;
+
+   pr_err("MCE: Killing %s:%d due to hardware memory corruption fault at %lx\n",
+   current->comm, current->pid, address);
+
+   sanitize_error_code(address, &error_code);
+   set_signal_archinfo(address, error_code);
+
+   if (fault & VM_FAULT_HWPOISON_LARGE)
+   lsb = hstate_index_to_shift(VM_FAULT_GET_HINDEX(fault));
+   if (fault & VM_FAULT_HWPOISON)
+   lsb = PAGE_SHIFT;
+
+   force_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb);
+
+   return;
+   }
+
if (current->thread.sig_on_uaccess_err && signal) {
sanitize_error_code(address, &error_code);
 
set_signal_archinfo(address, error_code);
 
-   /* XXX: hwpoison faults will set the wrong code. */
force_sig_fault(signal, si_code, (void __user *)address);
}
 
@@ -836,7 +856,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
if (is_f00f_bug(regs, address))
return;
 
-   no_context(regs, error_code, address, SIGSEGV, si_code);
+   no_context(regs, error_code, address, SIGSEGV, si_code, 0);
 }
 
 static noinline void
@@ -927,7 +947,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 {
/* Kernel mode? Handle exceptions or die: */
if (!(error_code & X86_PF_USER)) {
-   no_context(regs, error_code, address, SIGBUS, BUS_ADRERR);
+   no_context(regs, error_code, address, SIGBUS, BUS_ADRERR, fault);
return;
}
 
@@ -966,7 +986,7 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
   unsigned long address, vm_fault_t fault)
 {
if (fatal_signal_pending(current) && !(error_code & X86_PF_USER)) {
-   no_context(regs, error_code, address, 0, 0);
+   no_context(regs, error_code, address, 0, 0, 0);
return;
}
 
@@ -974,7 +994,7 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
/* Kernel mode? Handle exceptions or die: */
if (!(error_code & X86_PF_USER)) {
no_context(regs, error_code, address,
-  SIGSEGV, SEGV_MAPERR);
+  SIGSEGV, SEGV_MAPERR, 0);
return;
}
 
@@ -1396,7 +1416,7 @@ void do_user_addr_fault(struct pt_regs *regs,
if (fault_signal_pending(fault, regs)) {
if (!user_mode(regs))
no_context(regs, hw_error_code, address, SIGBUS,
-  BUS_ADRERR);
+  BUS_ADRERR, 0);
return;
}
 
-- 
2.25.1

[PATCH v2] nbd: Fix NULL pointer in flush_workqueue

2021-02-01 Thread Sun Ke

Open /dev/nbdX first: config_refs becomes 1 while the pointers in
nbd_device are still NULL. Then disconnect /dev/nbdX, which
dereferences the NULL recv_workq. In this case the protection by
config_refs in nbd_genl_disconnect is useless.

[  656.366194] BUG: kernel NULL pointer dereference, address: 0020
[  656.368943] #PF: supervisor write access in kernel mode
[  656.369844] #PF: error_code(0x0002) - not-present page
[  656.370717] PGD 10cc87067 P4D 10cc87067 PUD 1074b4067 PMD 0
[  656.371693] Oops: 0002 [#1] SMP
[  656.372242] CPU: 5 PID: 7977 Comm: nbd-client Not tainted 5.11.0-rc5-00040-g76c057c84d28 #1
[  656.373661] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[  656.375904] RIP: 0010:mutex_lock+0x29/0x60
[  656.376627] Code: 00 0f 1f 44 00 00 55 48 89 fd 48 83 05 6f d7 fe 08 01 e8 7a c3 ff ff 48 83 05 6a d7 fe 08 01 31 c0 65 48 8b 14 25 00 6d 01 00  48 0f b1 55 d
[  656.378934] RSP: 0018:c95eb9b0 EFLAGS: 00010246
[  656.379350] RAX:  RBX:  RCX: 
[  656.379915] RDX: 888104cf2600 RSI: aae8f452 RDI: 0020
[  656.380473] RBP: 0020 R08:  R09: 88813bd6b318
[  656.381039] R10: 00c7 R11: fefefefefefefeff R12: 888102710b40
[  656.381599] R13: c95eb9e0 R14: b2930680 R15: 88810770ef00
[  656.382166] FS:  7fdf117ebb40() GS:88813bd4() knlGS:
[  656.382806] CS:  0010 DS:  ES:  CR0: 80050033
[  656.383261] CR2: 0020 CR3: 000100c84000 CR4: 06e0
[  656.383819] DR0:  DR1:  DR2: 
[  656.384370] DR3:  DR6: fffe0ff0 DR7: 0400
[  656.384927] Call Trace:
[  656.385111]  flush_workqueue+0x92/0x6c0
[  656.385395]  nbd_disconnect_and_put+0x81/0xd0
[  656.385716]  nbd_genl_disconnect+0x125/0x2a0
[  656.386034]  genl_family_rcv_msg_doit.isra.0+0x102/0x1b0
[  656.386422]  genl_rcv_msg+0xfc/0x2b0
[  656.386685]  ? nbd_ioctl+0x490/0x490
[  656.386954]  ? genl_family_rcv_msg_doit.isra.0+0x1b0/0x1b0
[  656.387354]  netlink_rcv_skb+0x62/0x180
[  656.387638]  genl_rcv+0x34/0x60
[  656.387874]  netlink_unicast+0x26d/0x590
[  656.388162]  netlink_sendmsg+0x398/0x6c0
[  656.388451]  ? netlink_rcv_skb+0x180/0x180
[  656.388750]  sys_sendmsg+0x1da/0x320
[  656.389038]  ? sys_recvmsg+0x130/0x220
[  656.389334]  ___sys_sendmsg+0x8e/0xf0
[  656.389605]  ? ___sys_recvmsg+0xa2/0xf0
[  656.389889]  ? handle_mm_fault+0x1671/0x21d0
[  656.390201]  __sys_sendmsg+0x6d/0xe0
[  656.390464]  __x64_sys_sendmsg+0x23/0x30
[  656.390751]  do_syscall_64+0x45/0x70
[  656.391017]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

To fix it, just add a check for a non null task_recv in
nbd_genl_disconnect.

Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Signed-off-by: Sun Ke 
---
v2: Use jump target unlock.
---
 drivers/block/nbd.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 6727358e147d..fb62b57102c6 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -2011,12 +2011,14 @@ static int nbd_genl_disconnect(struct sk_buff *skb, struct genl_info *info)
   index);
return -EINVAL;
}
+   mutex_lock(&nbd->config_lock);
if (!refcount_inc_not_zero(&nbd->refs)) {
-   mutex_unlock(&nbd_index_mutex);
-   printk(KERN_ERR "nbd: device at index %d is going down\n",
-  index);
-   return -EINVAL;
+   goto unlock;
}
+   if (!nbd->recv_workq) {
+   goto unlock;
+   }
+   mutex_unlock(&nbd->config_lock);
mutex_unlock(&nbd_index_mutex);
if (!refcount_inc_not_zero(&nbd->config_refs)) {
nbd_put(nbd);
@@ -2026,6 +2028,12 @@ static int nbd_genl_disconnect(struct sk_buff *skb, struct genl_info *info)
nbd_config_put(nbd);
nbd_put(nbd);
return 0;
+
+unlock:
+   mutex_unlock(&nbd->config_lock);
+   mutex_unlock(&nbd_index_mutex);
+   printk(KERN_ERR "nbd: device at index %d is going down\n", index);
+   return -EINVAL;
 }
 
 static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
-- 
2.25.4

RE: [PATCH 3/8] scsi: ufshpb: Add region's reads counter

2021-02-01 Thread Avri Altman

> 
> On Mon, Feb 01, 2021 at 07:51:19AM +, Avri Altman wrote:
> > >
> > > On Mon, Feb 01, 2021 at 07:12:53AM +, Avri Altman wrote:
> > > > > > +#define WORK_PENDING 0
> > > > > > +#define ACTIVATION_THRSHLD 4 /* 4 IOs */
> > > > > Rather than fixing it with macro, how about using sysfs and make it
> > > > > configurable?
> > > > Yes.
> > > > I will add a patch making all the logic configurable.
> > > > As all those are hpb-related parameters, I think module parameters are
> > > > more adequate.
> > >
> > > No, this is not the 1990's, please never add new module parameters to
> > > drivers.  If not for the basic problem of they do not work on a
> > > per-device basis, but on a per-driver basis, which is what you almost
> > > never want.
> > OK.
> >
> > >
> > > But why would you want to change this value, why can't the driver "just
> > > work" and not need manual intervention?
> > It is.
> > But those are knobs each vendor may want to tweak,
> > so that it is optimized for its internal device implementation.
> >
> > Tweaking the parameters, as well as the entire logic, is really an endless
> > task.
> > Some logic works better for some scenarios, while falling behind on others.
> 
> Shouldn't the hardware know how to handle this dynamically?  If not, how
> is a user going to know?
There is one "brain".
It is either in the device (device mode) or in the host (host control mode).
The "brain" decides which region is active, thus carrying the physical address
along with the logical one and minimizing context switches in the device's RAM.

There can be up to N active regions.
Activation and deactivation has its overhead.
So basically it is a constraint-optimization problem.

> 
> > How about leaving it for now, to be elaborated in the future?
> 
> I do not care, just do not make it a module parameter for the reason
> that does not work on a per-device basis.
OK.  Will make it a sysfs per hpb-lun, like Daejun proposed.

Thanks,
Avri

Re: [PATCH 1/2] media: dvb-usb: Fix memory leak at error in dvb_usb_device_init()

2021-02-01 Thread Takashi Iwai

On Sun, 31 Jan 2021 15:53:20 +0100,
Sean Young wrote:
> 
> On Wed, Jan 20, 2021 at 11:20:56AM +0100, Takashi Iwai wrote:
> > dvb_usb_device_init() allocates a dvb_usb_device object, but it
> > doesn't release it even when returning an error.  The callers don't
> > seem to care about it either, hence the memory is leaked.
> > 
> > This patch ensures that the memory is released on the error path in
> > dvb_usb_device_init().  It also makes sure that the USB intfdata is
> > reset and that no bogus pointer is returned to the caller on the
> > error path.
> > 
> > Cc: 
> > Signed-off-by: Takashi Iwai 
> > ---
> >  drivers/media/usb/dvb-usb/dvb-usb-init.c | 18 --
> >  1 file changed, 12 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/media/usb/dvb-usb/dvb-usb-init.c 
> > b/drivers/media/usb/dvb-usb/dvb-usb-init.c
> > index c1a7634e27b4..5befec87f26a 100644
> > --- a/drivers/media/usb/dvb-usb/dvb-usb-init.c
> > +++ b/drivers/media/usb/dvb-usb/dvb-usb-init.c
> > @@ -281,15 +281,21 @@ int dvb_usb_device_init(struct usb_interface *intf,
> >  
> > usb_set_intfdata(intf, d);
> >  
> > -   if (du != NULL)
> > +   ret = dvb_usb_init(d, adapter_nums);
> 
> dvb_usb_init() has different error paths. 
> 
> 1. It can return -ENOMEM if it cannot kzalloc(). No other side effects.
> 2. It can return an error if dvb_usb_i2c_init() or dvb_usb_adapter_init()
>fails. In this case, dvb_usb_exit() is called, which frees 
>struct dvb_usb_device*
> 
> In the last case we now have a double free.

A good catch, indeed the function has inconsistent behavior.
I'll update the patch and resubmit to address it.


thanks,

Takashi

[PATCH] crypto: caam -Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE

2021-02-01 Thread Jiapeng Chong

Fix the following coccicheck warning:

./drivers/crypto/caam/debugfs.c:23:0-23: WARNING: caam_fops_u64_ro
should be defined with DEFINE_DEBUGFS_ATTRIBUTE.

./drivers/crypto/caam/debugfs.c:22:0-23: WARNING: caam_fops_u32_ro
should be defined with DEFINE_DEBUGFS_ATTRIBUTE.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
 drivers/crypto/caam/debugfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/caam/debugfs.c b/drivers/crypto/caam/debugfs.c
index 8ebf183..806bb20 100644
--- a/drivers/crypto/caam/debugfs.c
+++ b/drivers/crypto/caam/debugfs.c
@@ -19,8 +19,8 @@ static int caam_debugfs_u32_get(void *data, u64 *val)
return 0;
 }
 
-DEFINE_SIMPLE_ATTRIBUTE(caam_fops_u32_ro, caam_debugfs_u32_get, NULL, "%llu\n");
-DEFINE_SIMPLE_ATTRIBUTE(caam_fops_u64_ro, caam_debugfs_u64_get, NULL, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(caam_fops_u32_ro, caam_debugfs_u32_get, NULL, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(caam_fops_u64_ro, caam_debugfs_u64_get, NULL, "%llu\n");
 
 #ifdef CONFIG_CAAM_QI
 /*
-- 
1.8.3.1

Re: [PATCH 1/2] KVM: x86/mmu: Make HVA handler retpoline-friendly

2021-02-01 Thread Paolo Bonzini


On 01/02/21 09:13, Maciej S. Szmigiero wrote:

  static int kvm_handle_hva_range(struct kvm *kvm,
unsigned long start,
unsigned long end,
@@ -1495,8 +1534,9 @@ static int kvm_handle_hva_range(struct kvm *kvm,




-static int kvm_tdp_mmu_handle_hva_range(struct kvm *kvm, unsigned long start,
-   unsigned long end, unsigned long data,
-   int (*handler)(struct kvm *kvm, struct kvm_memory_slot *slot,
-  struct kvm_mmu_page *root, gfn_t start,
-  gfn_t end, unsigned long data))
-{


Can you look into just marking these functions __always_inline?  This 
should help the compiler change (*handler)(...) into a regular function 
call.
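
A minimal user-space sketch of that idea, assuming GCC or Clang: the walker
is forced inline, so at the call site the handler argument is a
compile-time constant and the compiler has the chance to turn the indirect
call into a direct one. Whether it actually does so depends on the compiler
and the optimization level. walk_range(), count_odd() and
count_odd_in_range() are hypothetical names, not KVM functions.

```c
/* Roughly what the kernel's __always_inline expands to on GCC/Clang. */
#define sketch_always_inline inline __attribute__((always_inline))

typedef int (*page_handler_t)(unsigned long gfn);

/* Hypothetical handler: reports 1 for odd gfns. */
static int count_odd(unsigned long gfn)
{
	return gfn & 1;
}

/*
 * Generic walker taking a handler pointer. Because it is always inlined,
 * each caller gets its own copy in which `handler` is a known constant.
 */
static sketch_always_inline int walk_range(unsigned long start,
					   unsigned long end,
					   page_handler_t handler)
{
	unsigned long gfn;
	int ret = 0;

	for (gfn = start; gfn < end; gfn++)
		ret += handler(gfn);

	return ret;
}

/* Thin wrapper: after inlining, the loop can call count_odd() directly. */
int count_odd_in_range(unsigned long start, unsigned long end)
{
	return walk_range(start, end, count_odd);
}
```

Compared with comparing the pointer against known handlers, this keeps a
single generic walker in the source and leaves the specialization to the
compiler.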


Paolo

Re: [PATCH 2/2] media: dvb-usb: Fix use-after-free access

2021-02-01 Thread Takashi Iwai

On Sun, 31 Jan 2021 16:04:56 +0100,
Sean Young wrote:
> 
> Hi Takashi,
> 
> On Fri, Jan 22, 2021 at 04:47:44PM +0100, Robert Foss wrote:
> > Hey Takashi,
> > 
> > This patch is generating a checkpatch warning, but I think it is
> > spurious and can be ignored.
> 
> The checkpatch warning isn't spurious and should really be corrected.

It's case-by-case, checkpatch is no bible by itself.  In this
particular case, it was rather a false-positive of checkpatch: the
commit reference including a line-break.

This issue has always been annoying and I hope this check will be dropped
from checkpatch in the near future...

thanks,

Takashi

[PATCH] hugetlbfs: rework calculation code of Hugepage size in hugetlbfs_show_options()

2021-02-01 Thread Miaohe Lin

Rework calculation code of the Hugepage size to make it more readable and
straightforward.
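
As a sanity check that the rework does not change the reported value, both
calculations can be compared in user space. This is only a sketch:
struct hpage_fmt and the two functions are hypothetical, and SZ_1K / SZ_1M
are defined locally as stand-ins for the kernel's <linux/sizes.h> constants.

```c
/* Local stand-ins for the kernel's SZ_1K and SZ_1M. */
#define SZ_1K 0x400UL
#define SZ_1M 0x100000UL

/* Value and unit suffix as printed in the "pagesize=" mount option. */
struct hpage_fmt {
	unsigned long value;
	char mod;
};

/* Calculation before the patch: divide by 1024 once, maybe twice. */
struct hpage_fmt hpage_fmt_old(unsigned long hpage_size)
{
	struct hpage_fmt f;

	hpage_size /= 1024;
	f.mod = 'K';
	if (hpage_size >= 1024) {
		hpage_size /= 1024;
		f.mod = 'M';
	}
	f.value = hpage_size;

	return f;
}

/* Calculation after the patch: pick the unit first, then divide once. */
struct hpage_fmt hpage_fmt_new(unsigned long hpage_size)
{
	struct hpage_fmt f;

	if (hpage_size >= SZ_1M) {
		f.value = hpage_size / SZ_1M;
		f.mod = 'M';
	} else {
		f.value = hpage_size / SZ_1K;
		f.mod = 'K';
	}

	return f;
}
```

For example, a 2 MiB huge page yields 2 with suffix 'M' and a 64 KiB huge
page yields 64 with suffix 'K' in both variants.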

Signed-off-by: Miaohe Lin 
---
 fs/hugetlbfs/inode.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 3a08fbae3b53..1be18de4b537 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1014,11 +1014,12 @@ static int hugetlbfs_show_options(struct seq_file *m, struct dentry *root)
if (sbinfo->max_inodes != -1)
seq_printf(m, ",nr_inodes=%lu", sbinfo->max_inodes);
 
-   hpage_size /= 1024;
-   mod = 'K';
-   if (hpage_size >= 1024) {
-   hpage_size /= 1024;
+   if (hpage_size >= SZ_1M) {
+   hpage_size /= SZ_1M;
mod = 'M';
+   } else {
+   hpage_size /= SZ_1K;
+   mod = 'K';
}
seq_printf(m, ",pagesize=%lu%c", hpage_size, mod);
if (spool) {
-- 
2.19.1

Re: LINE_MAX: was: Re: [PATCH printk-rework 04/12] printk: define CONSOLE_LOG_MAX in printk.h

2021-02-01 Thread John Ogness

On 2021-01-29, Petr Mladek  wrote:
>> diff --git a/include/linux/printk.h b/include/linux/printk.h
>> index fe7eb2351610..6d8f844bfdff 100644
>> --- a/include/linux/printk.h
>> +++ b/include/linux/printk.h
>> @@ -45,6 +45,7 @@ static inline const char *printk_skip_headers(const char *buffer)
>>  }
>>  
>>  #define CONSOLE_EXT_LOG_MAX 8192
>> +#define CONSOLE_LOG_MAX 1024
>>  
>>  /* printk's without a loglevel use this.. */
>>  #define MESSAGE_LOGLEVEL_DEFAULT CONFIG_MESSAGE_LOGLEVEL_DEFAULT
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index ec2174882b8e..5faf9c0db171 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -410,7 +410,7 @@ static u64 clear_seq;
>>  #else
>>  #define PREFIX_MAX  32
>>  #endif
>> -#define LOG_LINE_MAX(1024 - PREFIX_MAX)
>> +#define LOG_LINE_MAX(CONSOLE_LOG_MAX - PREFIX_MAX)
>
> CONSOLE_LOG_MAX defines size of buffers that are written by
> record_print_text(). We must make sure that all stored
> messages can actually get printed even with the trailing '\0'.
>
> We should limit the stored messages by:
>
> /*
>  * Console log buffer needs extra space for the trailing '\0',
>  * see record_print_text().
>  */
> #define LOG_LINE_MAX  (CONSOLE_LOG_MAX - PREFIX_MAX - 1)
>
> It should not be a big problem. The PREFIX_MAX size has already
> increased in the patch, for example, because of the caller ID.
>
> Does it make sense, please?

If we want to make sure "all stored messages can actually get printed",
then CONSOLE_LOG_MAX needs to be set to:

   PREFIX_MAX * LOG_LINE_MAX + 1

and we should be specifying LOG_LINE_MAX instead of
CONSOLE_LOG_MAX. record_print_text() adds up to PREFIX_MAX for every
'\n' in the message.

I was initially confused by this, which led to my patch [0] to fix
it. But then I realized that the buffer is way too small anyway. If we
want to fix the issue, then LOG_LINE_MAX needs to be much larger.

IMO it makes no sense to do the -1 change because the buffer is too
small anyway.

John Ogness

[0] https://lkml.kernel.org/r/20210120194106.26441-2-john.ogn...@linutronix.de
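
To make the sizing argument concrete, here is a rough userspace sketch of
how much space a message needs once a prefix is added to every line. It
is not the kernel's record_print_text(); the helper name expanded_size()
and the parameter prefix_len (standing in for PREFIX_MAX) are assumptions
made for illustration, and the sketch assumes one prefix per line.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes needed to print `text` of
 * length `len` when every line is preceded by `prefix_len` bytes,
 * including the trailing '\0'.
 */
static size_t expanded_size(const char *text, size_t len, size_t prefix_len)
{
	size_t lines = 1;	/* the first line always gets a prefix */
	size_t i;

	for (i = 0; i < len; i++) {
		/* a '\n' followed by more text starts another prefixed line */
		if (text[i] == '\n' && i + 1 < len)
			lines++;
	}

	return len + lines * prefix_len + 1;
}
```

With a 32-byte prefix, the 5-byte message "a\nb\nc" already needs 102
bytes, so the required buffer grows with the number of lines rather than
with the message length alone.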

Re: [PATCH 1/1] vsock: fix the race conditions in multi-transport support

2021-02-01 Thread Stefano Garzarella


On Sun, Jan 31, 2021 at 01:59:14PM +0300, Alexander Popov wrote:

There are multiple similar bugs implicitly introduced by the
commit c0cfa2d8a788fcf4 ("vsock: add multi-transports support") and
commit 6a2c0962105ae8ce ("vsock: prevent transport modules unloading").

The bug pattern:
[1] vsock_sock.transport pointer is copied to a local variable,
[2] lock_sock() is called,
[3] the local variable is used.
VSOCK multi-transport support introduced the race condition:
vsock_sock.transport value may change between [1] and [2].

Let's copy vsock_sock.transport pointer to local variables after
the lock_sock() call.


We can add:

Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")



Signed-off-by: Alexander Popov 
---
net/vmw_vsock/af_vsock.c | 17 -
1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index d10916ab4526..28edac1f9aa6 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -997,9 +997,12 @@ static __poll_t vsock_poll(struct file *file, struct 
socket *sock,
mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;

} else if (sock->type == SOCK_STREAM) {
-   const struct vsock_transport *transport = vsk->transport;
+   const struct vsock_transport *transport = NULL;


I think we can avoid initializing to NULL since we assign it shortly 
after.



+
lock_sock(sk);

+   transport = vsk->transport;
+
/* Listening sockets that have connections in their accept
 * queue can be read.
 */
@@ -1082,10 +1085,11 @@ static int vsock_dgram_sendmsg(struct socket *sock, 
struct msghdr *msg,
err = 0;
sk = sock->sk;
vsk = vsock_sk(sk);
-   transport = vsk->transport;

lock_sock(sk);

+   transport = vsk->transport;
+
err = vsock_auto_bind(vsk);
if (err)
goto out;
@@ -1544,10 +1548,11 @@ static int vsock_stream_setsockopt(struct 
socket *sock,

err = 0;
sk = sock->sk;
vsk = vsock_sk(sk);
-   transport = vsk->transport;

lock_sock(sk);

+   transport = vsk->transport;
+
switch (optname) {
case SO_VM_SOCKETS_BUFFER_SIZE:
COPY_IN(val);
@@ -1680,7 +1685,6 @@ static int vsock_stream_sendmsg(struct socket *sock, 
struct msghdr *msg,

sk = sock->sk;
vsk = vsock_sk(sk);
-   transport = vsk->transport;
total_written = 0;
err = 0;

@@ -1689,6 +1693,8 @@ static int vsock_stream_sendmsg(struct socket *sock, 
struct msghdr *msg,

lock_sock(sk);

+   transport = vsk->transport;
+
/* Callers should not provide a destination with stream sockets. */
if (msg->msg_namelen) {
err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
@@ -1823,11 +1829,12 @@ vsock_stream_recvmsg(struct socket *sock, struct msghdr 
*msg, size_t len,

sk = sock->sk;
vsk = vsock_sk(sk);
-   transport = vsk->transport;
err = 0;

lock_sock(sk);

+   transport = vsk->transport;
+
if (!transport || sk->sk_state != TCP_ESTABLISHED) {
/* Recvmsg is supposed to return 0 if a peer performs an
 * orderly shutdown. Differentiate between that case and when a
--
2.26.2



Thanks for fixing these issues. With the small changes applied:

Reviewed-by: Stefano Garzarella 

Thanks,
Stefano
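
The pattern behind the fix can be sketched in userspace, with a pthread
mutex standing in for lock_sock()/release_sock(). The types and the
function name below are hypothetical and only illustrate the ordering:
the shared pointer is read after the lock is taken, never before.

```c
#include <pthread.h>
#include <stddef.h>

struct transport {
	int id;
};

struct sock_like {
	pthread_mutex_t lock;			/* stand-in for the socket lock */
	const struct transport *transport;	/* may be reassigned under the lock */
};

/* Returns the transport id, or -1 when no transport is assigned. */
static int transport_id_locked(struct sock_like *sk)
{
	const struct transport *t;
	int id;

	pthread_mutex_lock(&sk->lock);
	t = sk->transport;	/* copy the pointer only after locking */
	id = t ? t->id : -1;
	pthread_mutex_unlock(&sk->lock);

	return id;
}
```

Copying sk->transport before pthread_mutex_lock() would reintroduce the
window in which another thread can replace the pointer between the copy
and its use.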

Re: [PATCH] platform/x86: dell-wmi-sysman: fix a NULL pointer dereference

2021-02-01 Thread Hans de Goede

Hi,

On 2/1/21 3:36 AM, Perry Yuan wrote:



> Hi Hans.
> Could you share the commit link after you apply this patch to your
> for-next branch?

https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git/commit/?h=for-next&id=64b0efa18f8c3b1baac369b8d74d0fdae02bc4bc

Regards,

Hans

Re: [PATCH v6 4/4] ARM: Add support for Hisilicon Kunpeng L3 cache controller

2021-02-01 Thread Arnd Bergmann

On Mon, Feb 1, 2021 at 4:36 AM Zhen Lei  wrote:
>
> Add support for the Hisilicon Kunpeng L3 cache controller as used with
> Kunpeng506 and Kunpeng509 SoCs.
>
> These Hisilicon SoCs support LPAE, so the physical address is wider than
> 32 bits, but the actual bit width does not exceed 36 bits. When a cache
> operation is performed on an address range, the upper 30 bits of the
> physical address are recorded in registers L3_MAINT_START and
> L3_MAINT_END, and the lower 6 bits (the cacheline offset) are ignored.
>
> Signed-off-by: Zhen Lei 

Reviewed-by: Arnd Bergmann 

If you add one more thing:

> +static void l3cache_maint_common(u32 range, u32 op_type)
> +{
> +   u32 reg;
> +
> +   reg = readl_relaxed(l3_ctrl_base + L3_MAINT_CTRL);
> +   reg &= ~(L3_MAINT_RANGE_MASK | L3_MAINT_TYPE_MASK);
> +   reg |= range | op_type;
> +   reg |= L3_MAINT_STATUS_START;
> +   writel(reg, l3_ctrl_base + L3_MAINT_CTRL);
> +
> +   /* Wait until the hardware maintenance operation is complete. */
> +   do {
> +   cpu_relax();
> +   reg = readl(l3_ctrl_base + L3_MAINT_CTRL);
> +   } while ((reg & L3_MAINT_STATUS_MASK) != L3_MAINT_STATUS_END);
> +}
> +
> +static void l3cache_maint_range(phys_addr_t start, phys_addr_t end, u32 
> op_type)
> +{
> +   start = start >> L3_CACHE_LINE_SHITF;
> +   end = ((end - 1) >> L3_CACHE_LINE_SHITF) + 1;
> +
> +   writel_relaxed(start, l3_ctrl_base + L3_MAINT_START);
> +   writel_relaxed(end, l3_ctrl_base + L3_MAINT_END);
> +
> +   l3cache_maint_common(L3_MAINT_RANGE_ADDR, op_type);
> +}

As mentioned, I'd like to see a code comment that explains the use
of relaxed vs non-relaxed MMIO accessors, as it will be impossible
for a reader to later understand why you picked a mix of the two,
and it also ensures that you have considered which one is the best
option to use here and that your explanation matches what you do.

Based on Russell's comments, I had expected that you would use
only relaxed accessors, plus explicit barriers if you change it, matching
what l2x0 does (l2x0 has to do it because of __l2c210_cache_sync(),
while you don't have a sync callback and don't need to).

  Arnd
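
As a side note, the address arithmetic in l3cache_maint_range() can be
checked in isolation. The sketch below is a userspace approximation: the
shift of 6 is taken from the commit message (the lower 6 bits are the
cacheline offset), while the struct and helper names are hypothetical.

```c
#include <stdint.h>

#define LINE_SHIFT 6	/* assumed: 64-byte cache lines */

struct line_range {
	uint32_t start;	/* first cache line covered */
	uint32_t end;	/* one past the last cache line covered */
};

/*
 * Convert a byte range [start, end) into cache line numbers. A 36-bit
 * physical address shifted right by 6 fits into a 32-bit register.
 */
static struct line_range to_line_range(uint64_t start, uint64_t end)
{
	struct line_range r;

	r.start = (uint32_t)(start >> LINE_SHIFT);
	r.end = (uint32_t)(((end - 1) >> LINE_SHIFT) + 1);

	return r;
}
```

The (end - 1) step keeps an aligned end address from pulling in one cache
line too many, while an unaligned end still rounds up to cover the partly
touched line.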

[PATCH v2 1/2] media: dvb-usb: Fix memory leak at error in dvb_usb_device_init()

2021-02-01 Thread Takashi Iwai

dvb_usb_device_init() allocates a dvb_usb_device object, but it
doesn't release the object by itself even on errors.  The object is
released on the callee side (dvb_usb_init()) in some error cases via
a dvb_usb_exit() call, but the callee also misses freeing the object
in other error paths.  And the caller of dvb_usb_init() (which is only
dvb_usb_device_init()) doesn't seem to care about the resource
management either, hence that memory is leaked.

This patch makes sure the memory is released on the error path in
dvb_usb_device_init().  Now dvb_usb_init() frees the resources it
allocated but leaves the passed dvb_usb_device object intact.  In
turn, the dvb_usb_device object is released in dvb_usb_device_init()
instead.
We could use the dvb_usb_exit() function for releasing everything in the
callee (as it was used for some error cases in the original code), but
releasing the passed object in the callee is non-intuitive and
error-prone.  So I took this approach (which is more standard in Linux
kernel code), although it ended up with a bit more open-coded cleanup.

Along with the change, the patch makes sure that USB intfdata is reset
and that the bogus pointer is not returned to the caller of
dvb_usb_device_init() on the error path, too.

Cc: 
Signed-off-by: Takashi Iwai 
---
v1->v2: Fix double-free and reorganize the code

 drivers/media/usb/dvb-usb/dvb-usb-init.c | 47 
 1 file changed, 31 insertions(+), 16 deletions(-)

diff --git a/drivers/media/usb/dvb-usb/dvb-usb-init.c 
b/drivers/media/usb/dvb-usb/dvb-usb-init.c
index c1a7634e27b4..c78158d12540 100644
--- a/drivers/media/usb/dvb-usb/dvb-usb-init.c
+++ b/drivers/media/usb/dvb-usb/dvb-usb-init.c
@@ -158,22 +158,20 @@ static int dvb_usb_init(struct dvb_usb_device *d, short 
*adapter_nums)
 
if (d->props.priv_init != NULL) {
ret = d->props.priv_init(d);
-   if (ret != 0) {
-   kfree(d->priv);
-   d->priv = NULL;
-   return ret;
-   }
+   if (ret != 0)
+   goto err_priv_init;
}
}
 
/* check the capabilities and set appropriate variables */
dvb_usb_device_power_ctrl(d, 1);
 
-   if ((ret = dvb_usb_i2c_init(d)) ||
-   (ret = dvb_usb_adapter_init(d, adapter_nums))) {
-   dvb_usb_exit(d);
-   return ret;
-   }
+   ret = dvb_usb_i2c_init(d);
+   if (ret)
+   goto err_i2c_init;
+   ret = dvb_usb_adapter_init(d, adapter_nums);
+   if (ret)
+   goto err_adapter_init;
 
if ((ret = dvb_usb_remote_init(d)))
err("could not initialize remote control.");
@@ -181,6 +179,17 @@ static int dvb_usb_init(struct dvb_usb_device *d, short 
*adapter_nums)
dvb_usb_device_power_ctrl(d, 0);
 
return 0;
+
+err_adapter_init:
+   dvb_usb_adapter_exit(d);
+err_i2c_init:
+   dvb_usb_i2c_exit(d);
+   if (d->priv && d->props.priv_destroy)
+   d->props.priv_destroy(d);
+err_priv_init:
+   kfree(d->priv);
+   d->priv = NULL;
+   return ret;
 }
 
 /* determine the name and the state of the just found USB device */
@@ -281,15 +290,21 @@ int dvb_usb_device_init(struct usb_interface *intf,
 
usb_set_intfdata(intf, d);
 
-   if (du != NULL)
+   ret = dvb_usb_init(d, adapter_nums);
+   if (ret) {
+   info("%s error while loading driver (%d)", desc->name, ret);
+   goto error;
+   }
+
+   if (du)
*du = d;
 
-   ret = dvb_usb_init(d, adapter_nums);
+   info("%s successfully initialized and connected.", desc->name);
+   return 0;
 
-   if (ret == 0)
-   info("%s successfully initialized and connected.", desc->name);
-   else
-   info("%s error while loading driver (%d)", desc->name, ret);
+ error:
+   usb_set_intfdata(intf, NULL);
+   kfree(d);
return ret;
 }
 EXPORT_SYMBOL(dvb_usb_device_init);
-- 
2.26.2
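
The ownership rule described in the changelog can be illustrated with a
small userspace sketch. Everything here is hypothetical (struct device,
device_setup(), device_create(), device_destroy()) and uses malloc()/free()
in place of the kernel allocators; it only shows the shape of the pattern:
the callee unwinds what it allocated through goto labels, and the caller
that allocated the object frees it.

```c
#include <stdlib.h>

struct device {
	void *priv;
	void *i2c;
};

/* Callee: undoes only what it allocated, never frees `d` itself. */
static int device_setup(struct device *d, int fail_i2c)
{
	int ret;

	d->priv = malloc(16);
	if (!d->priv)
		return -1;

	/* fail_i2c simulates a failing second init step */
	d->i2c = fail_i2c ? NULL : malloc(16);
	if (!d->i2c) {
		ret = -2;
		goto err_i2c;
	}

	return 0;

err_i2c:
	free(d->priv);
	d->priv = NULL;
	return ret;
}

/* Caller: it allocated `d`, so it frees `d` on the error path. */
static struct device *device_create(int fail_i2c)
{
	struct device *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;

	if (device_setup(d, fail_i2c)) {
		free(d);
		return NULL;
	}

	return d;
}

static void device_destroy(struct device *d)
{
	if (!d)
		return;
	free(d->i2c);
	free(d->priv);
	free(d);
}
```

Because device_setup() never frees the object it was handed, the caller
cannot double-free it, which is the kind of bug the v1->v2 note refers to.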

[PATCH v2 2/2] media: dvb-usb: Fix use-after-free access

2021-02-01 Thread Takashi Iwai

dvb_usb_device_init() copies the properties to its own data, so that
the callers can release the original properties later (as done in the
commit 299c7007e936 "media: dw2102: Fix memleak on sequence of
probes").  However, it also stores the dev->desc pointer, which is a
reference to the original properties data.  Since dev->desc is
referred to later, it may result in a use-after-free, in the worst
case leading to a kernel Oops as reported.

This patch addresses the problem by allocating and copying the
properties first, then getting the desc from the copied properties.

Reported-and-tested-by: Stefan Seyfried 
BugLink: http://bugzilla.opensuse.org/show_bug.cgi?id=1181104
Reviewed-by: Robert Foss 
Cc: 
Signed-off-by: Takashi Iwai 
---
v1->v2: Only tag update

 drivers/media/usb/dvb-usb/dvb-usb-init.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/media/usb/dvb-usb/dvb-usb-init.c 
b/drivers/media/usb/dvb-usb/dvb-usb-init.c
index c78158d12540..12983baf6c3f 100644
--- a/drivers/media/usb/dvb-usb/dvb-usb-init.c
+++ b/drivers/media/usb/dvb-usb/dvb-usb-init.c
@@ -264,27 +264,30 @@ int dvb_usb_device_init(struct usb_interface *intf,
if (du != NULL)
*du = NULL;
 
-   if ((desc = dvb_usb_find_device(udev, props, &cold)) == NULL) {
+   d = kzalloc(sizeof(struct dvb_usb_device), GFP_KERNEL);
+   if (!d) {
+   err("no memory for 'struct dvb_usb_device'");
+   return -ENOMEM;
+   }
+
+   memcpy(&d->props, props, sizeof(struct dvb_usb_device_properties));
+
+   desc = dvb_usb_find_device(udev, &d->props, &cold);
+   if (!desc) {
deb_err("something went very wrong, device was not found in 
current device list - let's see what comes next.\n");
-   return -ENODEV;
+   ret = -ENODEV;
+   goto error;
}
 
if (cold) {
info("found a '%s' in cold state, will try to load a firmware", 
desc->name);
ret = dvb_usb_download_firmware(udev, props);
if (!props->no_reconnect || ret != 0)
-   return ret;
+   goto error;
}
 
info("found a '%s' in warm state.", desc->name);
-   d = kzalloc(sizeof(struct dvb_usb_device), GFP_KERNEL);
-   if (d == NULL) {
-   err("no memory for 'struct dvb_usb_device'");
-   return -ENOMEM;
-   }
-
d->udev = udev;
-   memcpy(&d->props, props, sizeof(struct dvb_usb_device_properties));
d->desc = desc;
d->owner = owner;
 
-- 
2.26.2

Re: [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device

2021-02-01 Thread Stefano Garzarella


On Sun, Jan 31, 2021 at 05:31:43PM +0200, Max Gurtovoy wrote:


On 1/28/2021 4:41 PM, Stefano Garzarella wrote:

From: Max Gurtovoy 

This will allow running vDPA for virtio block protocol.

Signed-off-by: Max Gurtovoy 
[sgarzare: various cleanups/fixes]
Signed-off-by: Stefano Garzarella 
---
v2:
- rebased on top of other changes (dev_attr, get_config(), notify(), etc.)
- memset to 0 the config structure in vdpasim_blk_get_config()
- used vdpasim pointer in vdpasim_blk_get_config()

v1:
- Removed unused headers
- Used cpu_to_vdpasim*() to store config fields
- Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
  option can not depend on other [Jason]
- Start with a single queue for now [Jason]
- Add comments to memory barriers
---
 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 145 +++
 drivers/vdpa/Kconfig |   7 ++
 drivers/vdpa/vdpa_sim/Makefile   |   1 +
 3 files changed, 153 insertions(+)
 create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c 
b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
new file mode 100644
index ..999f9ca0b628
--- /dev/null
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDPA simulator for block device.
+ *
+ * Copyright (c) 2020, Mellanox Technologies. All rights reserved.


I guess we can change the copyright from Mellanox to:

Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.


I'll update in the next version.

Thanks,
Stefano

[PATCH v2 0/2] media: dvb-usb: Fix UAF and memory leaks

2021-02-01 Thread Takashi Iwai

Hi,

here is a revised patch set to address the use-after-free at
disconnecting a USB DVB device that was recently reported on openSUSE
Bugzilla.  The bug itself seems to be a long-standing one, and I
spotted another memory leak there, which is covered in the first
patch.

This revision addresses the double-free bug Sean pointed out.

I added Robert's R-b tag only to patch#2, as patch#1 differs from the
v1 significantly.


thanks,

Takashi

===

Takashi Iwai (2):
  media: dvb-usb: Fix memory leak at error in dvb_usb_device_init()
  media: dvb-usb: Fix use-after-free access

 drivers/media/usb/dvb-usb/dvb-usb-init.c | 70 +++-
 1 file changed, 44 insertions(+), 26 deletions(-)

-- 
2.26.2

Re: [PATCH V1 0/3] scsi: ufs: Add a vops to configure VCC voltage level

2021-02-01 Thread nitirawa


On 2021-01-31 19:32, Avri Altman wrote:


UFS specification allows different VCC configurations for UFS devices,
for example,
(1)2.70V - 3.60V (For UFS 2.x devices)
(2)2.40V - 2.70V (For UFS 3.x devices)
For platforms supporting both ufs 2.x (2.7v-3.6v) and
ufs 3.x (2.4v-2.7v), the VCC voltage requirement is 2.4v-3.6v.
So to support this, we need to start the ufs device initialization with
the common VCC voltage (2.7v), and after reading the device descriptor we
need to switch to the correct VCC voltage range (vcc min and vcc max)
for the UFS device type, since 2.7v is the marginal voltage as per specs
for both types of devices.

Once the VCC regulator supply has been initialised to 2.7v and the UFS
device type has been read from the device descriptor, we follow the
steps below to change the VCC voltage values.

1. Set the device to SLEEP state.
2. Disable the Vcc Regulator.
3. Set the vcc voltage according to the device type and reenable
   the regulator.
4. Set the device mode back to ACTIVE.

The above changes are done in vendor specific file by
adding a vops which will be needed for platform
supporting both ufs 2.x and ufs 3.x devices.

The flow should be generic - isn't it?
Why do you need the entire flow to be vendor-specific?
Why not just the parameters vendor-specific?

Thanks,
Avri


Hi Avri,
This vops change was done as per the below mail thread
discussion where it was decided to go with vops and
let vendors handle it, until specs provides more clarity.

https://www.spinics.net/lists/kernel/msg3754995.html

Regards,
Nitin
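
For readers following the voltage discussion, the range selection itself
can be sketched as below. This is only an illustration built from the
ranges quoted above; the helper name ufs_vcc_range(), the struct and the
VCC_COMMON_UV macro are hypothetical and do not come from the patch
series.

```c
/* Voltage limits in microvolts, taken from the ranges quoted above. */
struct vcc_range {
	int min_uv;
	int max_uv;
};

/* 2.7v lies in both ranges, so it is usable before the type is known. */
#define VCC_COMMON_UV 2700000

/*
 * Hypothetical helper: pick the VCC range from the UFS major version
 * read from the device descriptor. Returns 0 on success, -1 otherwise.
 */
static int ufs_vcc_range(int major, struct vcc_range *r)
{
	switch (major) {
	case 2:
		r->min_uv = 2700000;
		r->max_uv = 3600000;
		return 0;
	case 3:
		r->min_uv = 2400000;
		r->max_uv = 2700000;
		return 0;
	default:
		return -1;
	}
}
```

Whether this selection lives in generic code or behind a vops is exactly
the open question in the thread; the sketch takes no position on that.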

Re: [Intel-gfx] v5.11-rc5 BUG kmalloc-1k (Not tainted): Redzone overwritten

2021-02-01 Thread Chris Wilson

Quoting Jani Nikula (2021-01-28 13:23:48)
> 
> A number of our CI systems are hitting redzone overwritten errors after
> s2idle, with the errors introduced between v5.11-rc4 and v5.11-rc5. See
> snippet below, full logs for one affected machine at [1].
> 
> Known issue?

Fwiw, I think this should be fixed by

commit 08d60e5999540110576e7c1346d486220751b7f9
Author: John Ogness 
Date:   Sun Jan 24 21:33:28 2021 +0106

printk: fix string termination for record_print_text()

Commit f0e386ee0c0b ("printk: fix buffer overflow potential for
print_text()") added string termination in record_print_text().
However it used the wrong base pointer for adding the terminator.
This led to a 0-byte being written somewhere beyond the buffer.

Use the correct base pointer when adding the terminator.

Fixes: f0e386ee0c0b ("printk: fix buffer overflow potential for 
print_text()")
Reported-by: Sven Schnelle 
Signed-off-by: John Ogness 
Signed-off-by: Petr Mladek 
Link: 
https://lore.kernel.org/r/20210124202728.4718-1-john.ogn...@linutronix.de

din should be rolled forward, but there's yet another regression in rc6
breaking suspend on all machines.
-Chris

Re: [PATCH v6 2/4] ARM: hisi: add support for Kunpeng50x SoC

2021-02-01 Thread Arnd Bergmann

On Mon, Feb 1, 2021 at 4:35 AM Zhen Lei  wrote:
>
> Enable support for the Hisilicon Kunpeng506 and Kunpeng509 SoC.
>
> Signed-off-by: Zhen Lei 

Reviewed-by: Arnd Bergmann 

Russell, do you have a preference for how to get this series merged
after the last comments are resolved?

I think there is no technical problem in having patch two merged through
the soc tree, while merging the other three through your tree, but it
seems more logical to keep all four together in either location.

   Arnd

[PATCH v3] KVM: kvmclock: Fix vCPUs > 64 can't be online/hotpluged

2021-02-01 Thread Wanpeng Li

From: Wanpeng Li 

The per-cpu vsyscall pvclock data pointer is assigned either an element of the
static array hv_clock_boot (#vCPU <= 64) or dynamically allocated memory
hvclock_mem (#vCPU > 64). The dynamic memory is not allocated if the
kvmclock vsyscall is disabled, so CPU hotplug can fail in
kvmclock_setup_percpu(), which returns -ENOMEM. This is broken for no-vsyscall,
and sometimes you end up with vsyscall disabled if the host does something
strange. This patch fixes it by allocating the dynamic memory
unconditionally, even if vsyscall is disabled.

Fixes: 6a1cac56f4 ("x86/kvm: Use __bss_decrypted attribute in shared variables")
Reported-by: Zelin Deng 
Tested-by: Haiwei Li 
Cc: Brijesh Singh 
Cc: sta...@vger.kernel.org#v4.19-rc5+
Signed-off-by: Wanpeng Li 
---
v2 -> v3:
 * allocate dynamic memory unconditionally
v1 -> v2:
 * add code comments

 arch/x86/kernel/kvmclock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index aa59374..a72b16e 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -268,6 +268,8 @@ static void __init kvmclock_init_mem(void)
 
 static int __init kvm_setup_vsyscall_timeinfo(void)
 {
+   kvmclock_init_mem();
+
 #ifdef CONFIG_X86_64
u8 flags;
 
@@ -281,8 +283,6 @@ static int __init kvm_setup_vsyscall_timeinfo(void)
kvm_clock.vdso_clock_mode = VDSO_CLOCKMODE_PVCLOCK;
 #endif
 
-   kvmclock_init_mem();
-
return 0;
 }
 early_initcall(kvm_setup_vsyscall_timeinfo);
-- 
2.7.4
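
The failure mode from the changelog can be modelled in a few lines of
userspace C. This is a simplified sketch under assumptions, not the
kernel code: boot_clocks, extra_clocks, NR_BOOT and clock_for_cpu() are
hypothetical stand-ins for hv_clock_boot, hvclock_mem, the array size of
64 and the lookup done in kvmclock_setup_percpu().

```c
#include <errno.h>
#include <stddef.h>

#define NR_BOOT 64	/* assumed size of the static boot array */

struct clock_info {
	unsigned long long tsc;
};

static struct clock_info boot_clocks[NR_BOOT];
static struct clock_info *extra_clocks;	/* for CPUs >= NR_BOOT, may be NULL */

/* Pick the per-CPU slot; fails when the dynamic area was never set up. */
static int clock_for_cpu(unsigned int cpu, struct clock_info **out)
{
	if (cpu < NR_BOOT) {
		*out = &boot_clocks[cpu];
		return 0;
	}

	if (!extra_clocks)
		return -ENOMEM;

	*out = &extra_clocks[cpu - NR_BOOT];
	return 0;
}
```

In this model, a CPU number of 64 or above fails with -ENOMEM for as long
as extra_clocks stays NULL, which corresponds to the hotplug failure the
patch avoids by allocating the memory unconditionally.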

[PATCH] gcc-plugins: remove unneeded semicolon

2021-02-01 Thread Yang Li

Eliminate the following coccicheck warning:
./scripts/gcc-plugins/latent_entropy_plugin.c:527:2-3: Unneeded
semicolon

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 
---
 scripts/gcc-plugins/latent_entropy_plugin.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/gcc-plugins/latent_entropy_plugin.c 
b/scripts/gcc-plugins/latent_entropy_plugin.c
index 9dced66..589454b 100644
--- a/scripts/gcc-plugins/latent_entropy_plugin.c
+++ b/scripts/gcc-plugins/latent_entropy_plugin.c
@@ -524,7 +524,7 @@ static unsigned int latent_entropy_execute(void)
while (bb != EXIT_BLOCK_PTR_FOR_FN(cfun)) {
perturb_local_entropy(bb, local_entropy);
bb = bb->next_bb;
-   };
+   }
 
/* 4. mix local entropy into the global entropy variable */
perturb_latent_entropy(local_entropy);
-- 
1.8.3.1

Re: [PATCH 0/8] gpio: implement the configfs testing module

2021-02-01 Thread Bartosz Golaszewski

On Sat, Jan 30, 2021 at 10:20 PM Uwe Kleine-König
 wrote:
>
> Hello,
>
> On Fri, Jan 29, 2021 at 02:46:16PM +0100, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski 
> >
> > This series adds a new GPIO testing module based on configfs committable 
> > items
> > and sysfs. The goal is to provide a testing driver that will be configurable
> > at runtime (won't need module reload) and easily extensible. The control 
> > over
> > the attributes is also much more fine-grained than in gpio-mockup.
> >
> > I am aware that Uwe submitted a virtual driver called gpio-simulator some 
> > time
> > ago and I was against merging it as it wasn't much different from 
> > gpio-mockup.
> > I would ideally want to have a single testing driver to maintain so I am
> > proposing this module as a replacement for gpio-mockup but since selftests
> > and libgpiod depend on it and it also has users in the community, we can't
> > outright remove it until everyone switched to the new interface. As for 
> > Uwe's
> > idea for linking two simulated chips so that one controls the other - while
> > I prefer to have an independent code path for controlling the lines (hence
> > the sysfs attributes), I'm open to implementing it in this new driver. It
> > should be much more feature friendly thanks to configfs than gpio-mockup.
>
> Funny you still think about my simulator driver. I recently thought

It's because I always feel bad when I refuse to merge someone's hard work.

> about reanimating it for my private use. The idea was to implement a
> rotary-encoder driver (that contrast to
> drivers/input/misc/rotary_encoder.c really implements an encoder and not
> a decoder). With the two linked chips I can plug
> drivers/input/misc/rotary_encoder.c on one side and my encoder on the
> other to test both drivers completely in software.
>
> I didn't look into your driver yet, but getting such a driver into
> mainline would be very welcome!
>

My idea for linking chips (although that's not implemented yet) is an
attribute in each configfs group called 'link' or something like that,
that would take as argument the name of the chip to link to making the
'linker' the input and the 'linkee' the output.

It would be tempting to use symbolic links too but I'm afraid this
would need further extension of configfs.

> I intend to look into your driver next week, but please don't hold back
> on merging for my feedback.
>

Don't worry, I'm not really aiming at v5.12 with this.

> Best regards
> Uwe
>

Bart

Re: [PATCH 8/8] gpio: sim: new testing module

2021-02-01 Thread Bartosz Golaszewski

On Sun, Jan 31, 2021 at 1:43 AM Kent Gibson  wrote:
>
> On Sat, Jan 30, 2021 at 09:37:55PM +0100, Bartosz Golaszewski wrote:
> > On Fri, Jan 29, 2021 at 4:57 PM Andy Shevchenko
> >  wrote:
> > >
> > > On Fri, Jan 29, 2021 at 02:46:24PM +0100, Bartosz Golaszewski wrote:
> > > > From: Bartosz Golaszewski 
> > > ...
> > >
>
> [snip]
>
> > > Honestly, I don't like the idea of Yet Another (custom) Parser in the 
> > > kernel.
> > >
> > > Have you investigated existing parsers? We have cmdline.c, 
> > > gpio-aggregator.c,
> > > etc. Besides the fact of test cases which are absent here. And who knows 
> > > what
> > > we allow to be entered.
> > >
> >
> > Yes, I looked all around the kernel to find something I could reuse
> > but failed to find anything useful for this particular purpose. If you
> > have something you could point me towards, I'm open to alternatives.
> >
> > Once we agree on the form of the module, I'll port self-tests to using
> > it instead of gpio-mockup, so we'll have some tests in the tree.
> >
>
> Given the existing selftests focus on testing the gpio-mockup itself, it
> would be more appropriate that you add separate tests for gpio-sim.
>
> As an end user I'm interested in the concrete example of driving gpio-sim
> that selftests would provide, so I'm looking forward to seeing that.
>
> Cheers,
> Kent.

Makes sense, I'll add tests in v2.

Bartosz

[PATCH] KVM: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE

2021-02-01 Thread Jiapeng Chong

Fix the following coccicheck warning:

./arch/x86/kvm/debugfs.c:44:0-23: WARNING: vcpu_tsc_scaling_frac_fops
should be defined with DEFINE_DEBUGFS_ATTRIBUTE.

./arch/x86/kvm/debugfs.c:36:0-23: WARNING: vcpu_tsc_scaling_fops should
be defined with DEFINE_DEBUGFS_ATTRIBUTE.

./arch/x86/kvm/debugfs.c:27:0-23: WARNING: vcpu_tsc_offset_fops should
be defined with DEFINE_DEBUGFS_ATTRIBUTE.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
 arch/x86/kvm/debugfs.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/debugfs.c b/arch/x86/kvm/debugfs.c
index 7e818d6..9c0e29e 100644
--- a/arch/x86/kvm/debugfs.c
+++ b/arch/x86/kvm/debugfs.c
@@ -15,7 +15,7 @@ static int vcpu_get_timer_advance_ns(void *data, u64 *val)
return 0;
 }
 
-DEFINE_SIMPLE_ATTRIBUTE(vcpu_timer_advance_ns_fops, vcpu_get_timer_advance_ns, 
NULL, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(vcpu_timer_advance_ns_fops, 
vcpu_get_timer_advance_ns, NULL, "%llu\n");
 
 static int vcpu_get_tsc_offset(void *data, u64 *val)
 {
@@ -24,7 +24,7 @@ static int vcpu_get_tsc_offset(void *data, u64 *val)
return 0;
 }
 
-DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_offset_fops, vcpu_get_tsc_offset, NULL, 
"%lld\n");
+DEFINE_DEBUGFS_ATTRIBUTE(vcpu_tsc_offset_fops, vcpu_get_tsc_offset, NULL, 
"%lld\n");
 
 static int vcpu_get_tsc_scaling_ratio(void *data, u64 *val)
 {
@@ -33,7 +33,7 @@ static int vcpu_get_tsc_scaling_ratio(void *data, u64 *val)
return 0;
 }
 
-DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_scaling_fops, vcpu_get_tsc_scaling_ratio, 
NULL, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(vcpu_tsc_scaling_fops, vcpu_get_tsc_scaling_ratio, 
NULL, "%llu\n");
 
 static int vcpu_get_tsc_scaling_frac_bits(void *data, u64 *val)
 {
@@ -41,7 +41,8 @@ static int vcpu_get_tsc_scaling_frac_bits(void *data, u64 
*val)
return 0;
 }
 
-DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_scaling_frac_fops, 
vcpu_get_tsc_scaling_frac_bits, NULL, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(vcpu_tsc_scaling_frac_fops, 
vcpu_get_tsc_scaling_frac_bits,
+ NULL, "%llu\n");
 
 void kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu, struct dentry 
*debugfs_dentry)
 {
-- 
1.8.3.1

Re: [PATCH 10/29] drm/i915: Avoid comma separated statements

2021-02-01 Thread Jani Nikula

On Sat, 30 Jan 2021, Joe Perches  wrote:
> On Mon, 2020-08-24 at 21:56 -0700, Joe Perches wrote:
>> Use semicolons and braces.
>
> Ping?

Seems to have fallen between the cracks.

The first two hunks have been fixed, the last two are still there. Care
to respin and rebase against drm-tip (or linux-next) please?

BR,
Jani.

>
>> Signed-off-by: Joe Perches 
>> ---
>>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c   | 8 +---
>>  drivers/gpu/drm/i915/gt/intel_gt_requests.c| 6 --
>>  drivers/gpu/drm/i915/gt/selftest_workarounds.c | 6 --
>>  drivers/gpu/drm/i915/intel_runtime_pm.c| 6 --
>>  4 files changed, 17 insertions(+), 9 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
>> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> index 699125928272..114c13285ff1 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> @@ -323,10 +323,12 @@ static int __gen8_ppgtt_alloc(struct 
>> i915_address_space * const vm,
>>  }
>>  
>> 
>>  spin_lock(&pd->lock);
>> -if (likely(!pd->entry[idx]))
>> +if (likely(!pd->entry[idx])) {
>>  set_pd_entry(pd, idx, pt);
>> -else
>> -alloc = pt, pt = pd->entry[idx];
>> +} else {
>> +alloc = pt;
>> +pt = pd->entry[idx];
>> +}
>>  }
>>  
>> 
>>  if (lvl) {
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c 
>> b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>> index 66fcbf9d0fdd..54408d0b5e6e 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>> @@ -139,8 +139,10 @@ long intel_gt_retire_requests_timeout(struct intel_gt 
>> *gt, long timeout)
>>  LIST_HEAD(free);
>>  
>> 
>>  interruptible = true;
>> -if (unlikely(timeout < 0))
>> -timeout = -timeout, interruptible = false;
>> +if (unlikely(timeout < 0)) {
>> +timeout = -timeout;
>> +interruptible = false;
>> +}
>>  
>> 
>>  flush_submission(gt, timeout); /* kick the ksoftirqd tasklets */
>>  spin_lock(&timelines->lock);
>> diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c 
>> b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
>> index febc9e6692ba..3e4cbeed20bd 100644
>> --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
>> +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
>> @@ -521,8 +521,10 @@ static int check_dirty_whitelist(struct intel_context 
>> *ce)
>>  
>> 
>>  srm = MI_STORE_REGISTER_MEM;
>>  lrm = MI_LOAD_REGISTER_MEM;
>> -if (INTEL_GEN(engine->i915) >= 8)
>> -lrm++, srm++;
>> +if (INTEL_GEN(engine->i915) >= 8) {
>> +lrm++;
>> +srm++;
>> +}
>>  
>> 
>>  pr_debug("%s: Writing garbage to %x\n",
>>   engine->name, reg);
>> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c 
>> b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> index 153ca9e65382..f498f1c80755 100644
>> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> @@ -201,8 +201,10 @@ __print_intel_runtime_pm_wakeref(struct drm_printer *p,
>>  unsigned long rep;
>>  
>> 
>>  rep = 1;
>> -while (i + 1 < dbg->count && dbg->owners[i + 1] == stack)
>> -rep++, i++;
>> +while (i + 1 < dbg->count && dbg->owners[i + 1] == stack) {
>> +rep++;
>> +i++;
>> +}
>>  __print_depot_stack(stack, buf, PAGE_SIZE, 2);
>>  drm_printf(p, "Wakeref x%lu taken at:\n%s", rep, buf);
>>  }
>
>

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [PATCH v2] KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off

2021-02-01 Thread Paolo Bonzini


On 29/01/21 17:58, Sean Christopherson wrote:

On Fri, Jan 29, 2021, Paolo Bonzini wrote:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76bce832cade..15733013b266 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1401,7 +1401,7 @@ static u64 kvm_get_arch_capabilities(void)
 *This lets the guest use VERW to clear CPU buffers.



This comment should be updated to call out the new TSX_CTRL behavior.

/*
 * On TAA affected systems:
 *  - nothing to do if TSX is disabled on the host.
 *  - we emulate TSX_CTRL if present on the host.
 *    This lets the guest use VERW to clear CPU buffers.
 */


Ok.


 */
if (!boot_cpu_has(X86_FEATURE_RTM))
-   data &= ~(ARCH_CAP_TAA_NO | ARCH_CAP_TSX_CTRL_MSR);
+   data &= ~ARCH_CAP_TAA_NO;


Hmm, simply clearing TSX_CTRL will only preserve the host value.  Since
ARCH_CAPABILITIES is unconditionally emulated by KVM, wouldn't it make sense to
unconditionally expose TSX_CTRL as well, as opposed to exposing it only if it's
supported in the host?  I.e. allow migrating a TSX-disabled guest to a host
without TSX.  Or am I misunderstanding how TSX_CTRL is checked/used?


I'm a bit wary of having a combination (MDS_NO=0, TSX_CTRL=1) that does 
not exist on bare metal.  There are other cases where such combinations 
can happen, especially with the Spectre and SSBD mitigations (for 
example due to AMD CPUID bits for Intel processors), but at least those 
are just redundancies in the CPUID bits and it's more likely that the 
guest does something sensible with them.


Paolo


else if (!boot_cpu_has_bug(X86_BUG_TAA))
data |= ARCH_CAP_TAA_NO;
  
--

2.26.2
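
To make the effect of the one-line change above easier to see, here is a minimal userspace sketch of the masking before and after the patch. The bit positions and the helper names filter_caps_old()/filter_caps_new() are assumptions for illustration only, and the host RTM check is reduced to a plain flag; the kernel's own definitions are authoritative.

```c
#include <stdint.h>

/* Bit positions assumed for illustration; the kernel defines the real ones. */
#define ARCH_CAP_TSX_CTRL_MSR	(1ULL << 7)
#define ARCH_CAP_TAA_NO		(1ULL << 8)

/* Before the patch: without RTM on the host, both bits are hidden. */
static uint64_t filter_caps_old(uint64_t data, int host_has_rtm)
{
	if (!host_has_rtm)
		data &= ~(ARCH_CAP_TAA_NO | ARCH_CAP_TSX_CTRL_MSR);
	return data;
}

/* After the patch: only TAA_NO is cleared; TSX_CTRL keeps the host value. */
static uint64_t filter_caps_new(uint64_t data, int host_has_rtm)
{
	if (!host_has_rtm)
		data &= ~ARCH_CAP_TAA_NO;
	return data;
}
```

Note that the new variant never sets TSX_CTRL on its own; it only stops clearing it, which is the "preserve the host value" behavior discussed above.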

[PATCH v2 1/1] vsock: fix the race conditions in multi-transport support

2021-02-01 Thread Alexander Popov

There are multiple similar bugs implicitly introduced by the
commit c0cfa2d8a788fcf4 ("vsock: add multi-transports support") and
commit 6a2c0962105ae8ce ("vsock: prevent transport modules unloading").

The bug pattern:
 [1] vsock_sock.transport pointer is copied to a local variable,
 [2] lock_sock() is called,
 [3] the local variable is used.
VSOCK multi-transport support introduced the race condition:
vsock_sock.transport value may change between [1] and [2].

Let's copy vsock_sock.transport pointer to local variables after
the lock_sock() call.

Fixes: c0cfa2d8a788fcf4 ("vsock: add multi-transports support")

Reviewed-by: Stefano Garzarella 
Signed-off-by: Alexander Popov 
---
 net/vmw_vsock/af_vsock.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index d10916ab4526..f64e681493a5 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -997,9 +997,12 @@ static __poll_t vsock_poll(struct file *file, struct 
socket *sock,
mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;
 
} else if (sock->type == SOCK_STREAM) {
-   const struct vsock_transport *transport = vsk->transport;
+   const struct vsock_transport *transport;
+
lock_sock(sk);
 
+   transport = vsk->transport;
+
/* Listening sockets that have connections in their accept
 * queue can be read.
 */
@@ -1082,10 +1085,11 @@ static int vsock_dgram_sendmsg(struct socket *sock, 
struct msghdr *msg,
err = 0;
sk = sock->sk;
vsk = vsock_sk(sk);
-   transport = vsk->transport;
 
lock_sock(sk);
 
+   transport = vsk->transport;
+
err = vsock_auto_bind(vsk);
if (err)
goto out;
@@ -1544,10 +1548,11 @@ static int vsock_stream_setsockopt(struct socket *sock,
err = 0;
sk = sock->sk;
vsk = vsock_sk(sk);
-   transport = vsk->transport;
 
lock_sock(sk);
 
+   transport = vsk->transport;
+
switch (optname) {
case SO_VM_SOCKETS_BUFFER_SIZE:
COPY_IN(val);
@@ -1680,7 +1685,6 @@ static int vsock_stream_sendmsg(struct socket *sock, 
struct msghdr *msg,
 
sk = sock->sk;
vsk = vsock_sk(sk);
-   transport = vsk->transport;
total_written = 0;
err = 0;
 
@@ -1689,6 +1693,8 @@ static int vsock_stream_sendmsg(struct socket *sock, 
struct msghdr *msg,
 
lock_sock(sk);
 
+   transport = vsk->transport;
+
/* Callers should not provide a destination with stream sockets. */
if (msg->msg_namelen) {
err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
@@ -1823,11 +1829,12 @@ vsock_stream_recvmsg(struct socket *sock, struct msghdr 
*msg, size_t len,
 
sk = sock->sk;
vsk = vsock_sk(sk);
-   transport = vsk->transport;
err = 0;
 
lock_sock(sk);
 
+   transport = vsk->transport;
+
if (!transport || sk->sk_state != TCP_ESTABLISHED) {
/* Recvmsg is supposed to return 0 if a peer performs an
 * orderly shutdown. Differentiate between that case and when a
-- 
2.26.2
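
The bug pattern described in the commit message can be sketched in userspace as follows. This is only an illustration under stated assumptions: a pthread mutex stands in for lock_sock(), and struct sock_state, struct transport and both helper functions are hypothetical names, not part of the vsock code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for vsock_sock and its transport. */
struct transport {
	int id;
};

struct sock_state {
	pthread_mutex_t lock;              /* plays the role of lock_sock() */
	const struct transport *transport; /* may be reassigned under the lock */
};

/* Racy pattern: the pointer is sampled before the lock is taken. */
static int get_transport_id_racy(struct sock_state *s)
{
	const struct transport *t = s->transport; /* [1] copied too early */
	int id;

	pthread_mutex_lock(&s->lock);             /* [2] */
	id = t ? t->id : -1;                      /* [3] may use a stale pointer */
	pthread_mutex_unlock(&s->lock);
	return id;
}

/* Fixed pattern: the pointer is read only once the lock is held. */
static int get_transport_id_locked(struct sock_state *s)
{
	const struct transport *t;
	int id;

	pthread_mutex_lock(&s->lock);
	t = s->transport;
	id = t ? t->id : -1;
	pthread_mutex_unlock(&s->lock);
	return id;
}
```

In a single thread both helpers return the same value; they differ only when another thread changes the transport pointer between steps [1] and [2].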

Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS

2021-02-01 Thread Xu, Like

On 2021/1/29 10:52, Liuxiangdong (Aven, Cloud Infrastructure Service 
Product Dept.) wrote:



On 2021/1/26 15:08, Xu, Like wrote:

On 2021/1/25 22:47, Liuxiangdong (Aven, Cloud Infrastructure Service
Product Dept.) wrote:

Thanks for replying,

On 2021/1/25 10:41, Like Xu wrote:

+ k...@vger.kernel.org

Hi Liuxiangdong,

On 2021/1/22 18:02, Liuxiangdong (Aven, Cloud Infrastructure Service
Product Dept.) wrote:

Hi Like,

Some questions about
https://lore.kernel.org/kvm/20210104131542.495413-1-like...@linux.intel.com/ 

 




Thanks for trying the PEBS feature in the guest,
and I assume you have correctly applied the QEMU patches for guest PEBS.


Is there any other patch that needs to be applied? I use qemu 5.2.0.
(download from github on January 14th)

Two qemu patches are attached against qemu tree
(commit 31ee895047bdcf7387e3570cbd2a473c6f744b08)
and then run the guest with "-cpu,pebs=true".

Note, these two patches are just for testing and are not finalized for qemu upstream.

Yes, we can use pebs in IceLake when qemu patches applied.
Thanks very much!


Thanks for your verification on this earlier version.


1)Test in IceLake

In the [PATCH v3 10/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM,
DS, DTES64, we only support Ice Lake with the following x86_model(s):

#define INTEL_FAM6_ICELAKE_X    0x6A
#define INTEL_FAM6_ICELAKE_D    0x6C

you can check the eax output of "cpuid -l 1 -1 -r",
for example "0x000606a4" meets this requirement.

It's INTEL_FAM6_ICELAKE_X

Yes, it's the target hardware.


cpuid -l 1 -1 -r

CPU:
    0x0001 0x00: eax=0x000606a6 ebx=0xb4800800 ecx=0x7ffefbf7
edx=0xbfebfbff


HOST:

CPU family:  6

Model:   106

Model name:  Intel(R) Xeon(R) Platinum 8378A CPU


microcode: sig=0x606a6, pf=0x1, revision=0xd000122
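
As a cross-check of the values above, the family and model can be derived from the CPUID leaf 1 eax value. A minimal sketch, assuming the usual Intel signature layout; the helper names cpuid_family() and cpuid_model() are hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: decode the display family from CPUID.1:EAX. */
static unsigned int cpuid_family(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;

	/* The extended family field is only added when the base family is 0xf. */
	if (family == 0xf)
		family += (eax >> 20) & 0xff;
	return family;
}

/* Hypothetical helper: decode the display model from CPUID.1:EAX. */
static unsigned int cpuid_model(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;
	unsigned int model = (eax >> 4) & 0xf;

	/* The extended model field is the high nibble for families 6 and 0xf. */
	if (family == 0x6 || family == 0xf)
		model |= ((eax >> 16) & 0xf) << 4;
	return model;
}
```

With eax=0x000606a6 this yields family 6 and model 0x6A (106), which matches INTEL_FAM6_ICELAKE_X.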

As long as you get the latest BIOS from the provider,
you may check 'cat /proc/cpuinfo | grep code | uniq' with the latest one.

OK. I'll do it later.

Guest:  linux kernel 5.11.0-rc2

I assume it's the "upstream tag v5.11-rc2" which is fine.

Yes.

We can find pebs/intel_pt flag in guest cpuinfo, but there still exists
error when we use perf

Just a note, intel_pt and pebs are two features and we can write
pebs records to intel_pt buffer with extra hardware support.
(by default, pebs records are written to the pebs buffer)

You may check the output of "dmesg | grep PEBS" in the guest
to see if the guest PEBS cpuinfo is exposed and use "perf record
-e cycles:pp" to see if the PEBS feature actually works in the guest.

I applied only the pebs patch set to linux kernel 5.11.0-rc2, tested perf in
the guest, and dumped the stack where -EOPNOTSUPP is returned.

Yes, you may apply the qemu patches and try it again.


(1)
# perf record -e instructions:pp
Error:
instructions:pp: PMU Hardware doesn't support
sampling/overflow-interrupts. Try 'perf stat'

[  117.793266] Call Trace:
[  117.793270]  dump_stack+0x57/0x6a
[  117.793275]  intel_pmu_setup_lbr_filter+0x137/0x190
[  117.793280]  intel_pmu_hw_config+0x18b/0x320
[  117.793288]  hsw_hw_config+0xe/0xa0
[  117.793290]  x86_pmu_event_init+0x8e/0x210
[  117.793293]  perf_try_init_event+0x40/0x130
[  117.793297]  perf_event_alloc.part.22+0x611/0xde0
[  117.793299]  ? alloc_fd+0xba/0x180
[  117.793302]  __do_sys_perf_event_open+0x1bd/0xd90
[  117.793305]  do_syscall_64+0x33/0x40
[  117.793308]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Do we need lbr when we use pebs?

No, lbr and pebs are two separate features and we enable them separately.


I tried to apply the lbr patch set
(https://lore.kernel.org/kvm/911adb63-ba05-ea93-c038-1c09cff15...@intel.com/)
to the kernel and qemu, but there is still another problem.
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
event
...

We don't need that patch for PEBS feature.


(2)
# perf record -e instructions:ppp
Error:
instructions:ppp: PMU Hardware doesn't support
sampling/overflow-interrupts. Try 'perf stat'

[  115.188498] Call Trace:
[  115.188503]  dump_stack+0x57/0x6a
[  115.188509]  x86_pmu_hw_config+0x1eb/0x220
[  115.188515]  intel_pmu_hw_config+0x13/0x320
[  115.188519]  hsw_hw_config+0xe/0xa0
[  115.188521]  x86_pmu_event_init+0x8e/0x210
[  115.188524]  perf_try_init_event+0x40/0x130
[  115.188528]  perf_event_alloc.part.22+0x611/0xde0
[  115.188530]  ? alloc_fd+0xba/0x180
[  115.188534]  __do_sys_perf_event_open+0x1bd/0xd90
[  115.188538]  do_syscall_64+0x33/0x40
[  115.188541]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

This is because x86_pmu.intel_cap.pebs_format is always 0 in
x86_pmu_max_precise().

We rdmsr MSR_IA32_PERF_CAPABILITIES (0x0345) from the HOST, it's f4c5.
From the guest, it's 2000.


# perf record -e cycles:pp

Error:

cycles:pp: PMU Hardware doesn't support sampling/overflow-interrupts.
Try 'perf stat'

Could you give some advice?

If you have more specific comments or any concerns, just let me know.


2)Test in Skylake

HOST:

CP

Re: [PATCH 3/8] scsi: ufshpb: Add region's reads counter

2021-02-01 Thread gre...@linuxfoundation.org

On Mon, Feb 01, 2021 at 08:17:59AM +, Avri Altman wrote:
> > 
> > On Mon, Feb 01, 2021 at 07:51:19AM +, Avri Altman wrote:
> > > >
> > > > On Mon, Feb 01, 2021 at 07:12:53AM +, Avri Altman wrote:
> > > > > > > +#define WORK_PENDING 0
> > > > > > > +#define ACTIVATION_THRSHLD 4 /* 4 IOs */
> > > > > > Rather than fixing it with macro, how about using sysfs and make it
> > > > > > configurable?
> > > > > Yes.
> > > > > I will add a patch making all the logic configurable.
> > > > > As all those are hpb-related parameters, I think module parameters are
> > > > > more adequate.
> > > >
> > > > No, this is not the 1990's, please never add new module parameters to
> > > > drivers.  If not for the basic problem of they do not work on a
> > > > per-device basis, but on a per-driver basis, which is what you almost
> > > > never want.
> > > OK.
> > >
> > > >
> > > > But why would you want to change this value, why can't the driver "just
> > > > work" and not need manual intervention?
> > > It is.
> > > But those are knobs each vendor may want to tweak,
> > > so it'll be optimized for its internal device's implementation.
> > >
> > > Tweaking the parameters, as well as the entire logic, is really an endless
> > > task.
> > > Some logic works better for some scenarios, while falling behind on
> > > others.
> > 
> > Shouldn't the hardware know how to handle this dynamically?  If not, how
> > is a user going to know?
> There is one "brain".
> It is either in the device - in device mode, Or in the host - in host mode 
> control.
> The "brain" decides which region is active, thus carrying the physical 
> address along with the logical -
> minimizing context switches in the device's RAM.
> 
> There can be up to N active regions.
> Activation and deactivation has its overhead.
> So basically it is a constraint-optimization problem.

So how do you solve it?  And how would you expect a user to solve it if
the kernel can not?

You better document the heck out of these configuration options :)

thanks,

greg k-h

Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-02-01 Thread Chao Leng





On 2021/2/1 15:29, Hannes Reinecke wrote:

On 2/1/21 3:16 AM, Chao Leng wrote:



On 2021/1/29 17:20, Hannes Reinecke wrote:

On 1/29/21 9:46 AM, Chao Leng wrote:



On 2021/1/29 16:33, Hannes Reinecke wrote:

On 1/29/21 8:45 AM, Chao Leng wrote:



On 2021/1/29 15:06, Hannes Reinecke wrote:

On 1/29/21 4:07 AM, Chao Leng wrote:



On 2021/1/29 9:42, Sagi Grimberg wrote:



You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is another bug that causes nvme_next_ns to behave abnormally.
I reviewed the code around head->list and head->current_path and found 2 bugs
that may cause this:
First, I already sent a patch, see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/
Second, in nvme_ns_remove, list_del_rcu is called before
nvme_mpath_clear_current_path. This may cause "old" to be deleted from the
"head" while "old" is still in use. I'm not sure whether there is any other
consideration here, I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).

ok, I see.


nvme_ns_remove fences the continuation of the ns deletion by synchronizing
the srcu, so that the current_path clearance is guaranteed to be visible.

The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;


Where is 'old' pointing to?


This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
 ns != old;
 ns = nvme_next_ns(head, ns))
The ns will always be ns1, and then infinite loop.


No. nvme_next_ns() will return NULL.

If there is just one path (the "old") and the "old" is deleted,
nvme_next_ns() will return NULL.
The list looks like this:
head->next = head;
old->next = head;
If there are two or more paths and the "old" is deleted,
the "for" loop will be infinite, because nvme_next_ns() will only return
paths that are in the list, never the "old", so the check condition will
be true forever.


But that will be caught by the statement above:

if (list_is_singular(&head->list))

no?

Two paths is just a simple example.
If there are just two paths, we will enter it, which may report no path although
there is actually one path. It falsely assumes that the "old" has not been deleted.
If there are more than two paths, it will cause an infinite loop.
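
The scenario described above can be reproduced with a small userspace model. This is only a sketch under stated assumptions: struct node, next_ns() and bounded_walk() are hypothetical stand-ins for the kernel list and for nvme_next_ns(), and the step limit and NULL guard are added here so that the demonstration terminates.

```c
#include <stddef.h>

/* Simplified stand-in for struct nvme_ns: a circular list with a head sentinel. */
struct node {
	struct node *next;
};

/* Mimics nvme_next_ns(): next entry, wrapping past the head; NULL if list empty. */
static struct node *next_ns(struct node *head, struct node *ns)
{
	if (ns->next != head)
		return ns->next;
	return head->next != head ? head->next : NULL;
}

/*
 * Walk the list the way the round-robin loop does, but give up after
 * max_steps. Returns the number of steps taken before the loop condition
 * ended the walk, or -1 if the step limit was hit, meaning the unbounded
 * loop would never terminate. The NULL check is a guard added for this sketch.
 */
static int bounded_walk(struct node *head, struct node *old, int max_steps)
{
	struct node *ns;
	int steps = 0;

	for (ns = next_ns(head, old); ns && ns != old; ns = next_ns(head, ns)) {
		if (++steps >= max_steps)
			return -1;
	}
	return steps;
}
```

With "old" unlinked but still pointing into a list of two remaining paths, the walk only ever visits those two paths and never meets "old" again, so bounded_walk() hits its limit. With a single deleted path, next_ns() returns NULL right away.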

So you mean we'll need something like this?

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 71696819c228..8ffccaf9c19a 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -202,10 +202,12 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)
  static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
 struct nvme_ns *ns)
  {
-   ns = list_next_or_null_rcu(&head->list, &ns->siblings, struct nvme_ns,
-   siblings);
-   if (ns)
-   return ns;
+   if (ns) {
+   ns = list_next_or_null_rcu(&head->list, &ns->siblings,
+  struct nvme_ns, siblings);
+   if (ns)
+   return ns;
+   }

No, in the scenario, ns should not be NULL.


Why not? 'ns == NULL' is precisely the corner-case this is trying to fix...


Maybe we can do it like this:

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 282b7a4ea9a9..b895011a2cbd 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -199,30 +199,24 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)
 return found;
  }

-static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
-   struct nvme_ns *ns)
-{
-   ns = list_next_or_null_rcu(&head->list, &ns->siblings, struct nvme_ns,
-   siblings);
-   if (ns)
-   return ns;
-   return list_first_or_null_rcu(&head->list, struct nvme_ns, siblings);
-}
+#define nvme_next_ns_condition(head, current, condition) \
+({ \
+   struct nvme_ns *__ptr = list_next_or_null_rcu(&(head)->list, \
+   &(current)->siblings, struct nvme_ns, siblings); \
+   __ptr ? __ptr : (condition) ? (condition) = false, \
+   list_first_or_null_rcu(&(head)->list, struct nvme_ns, \
+   siblings) : NULL; \
+})


Urgh. Please, no. That is well impossible to debug.
Can you please open-code it to demonstrate where the difference to the current 
(and my fixed) versions is?
I'm still not clear where the problem is once we applied both patches.

For example, assume the list has three paths, and none of the paths is
NVME_ANA_OPTIMIZED:
head->next = ns1;
ns1->next = ns2;
ns2->next = head;
old->next = ns2;

My patch's work flow:
nvme_next_ns_condition(head, old, true

Re: [PATCH] x86: Remove unnecessary kmap() from sgx_ioc_enclave_init()

2021-02-01 Thread Christoph Hellwig

On Fri, Jan 29, 2021 at 09:37:30AM -0800, Sean Christopherson wrote:
> On Thu, Jan 28, 2021, ira.we...@intel.com wrote:
> > From: Ira Weiny 
> > 
> > There is no reason to alloc a page and kmap it to store this temporary
> > data from the user. 
> 
> Actually, there is, it's just poorly documented.  The sigstruct needs to be
> page aligned, and the token needs to be 512-byte aligned.  kmcalloc doesn't
> guarantee alignment.  IIRC things will work until slub_debug is enabled, at
> which point the natural alignment behavior goes out the window.

Well, there still is absolutely no need for the kmap as you can use
page_address for a GFP_KERNEL allocation.
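
The alignment requirement mentioned above can be checked with a small userspace analogue. This is a sketch only: it assumes a 4096-byte page, uses C11 aligned_alloc() in place of a kernel page allocation, and the helper names is_aligned() and alloc_page_aligned() are hypothetical. Placing the token at a 2048-byte offset inside the page is likewise an assumption for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns nonzero if ptr is aligned to align bytes (align must be a power of two). */
static int is_aligned(const void *ptr, uintptr_t align)
{
	return ((uintptr_t)ptr & (align - 1)) == 0;
}

/* Allocate one page-aligned 4096-byte buffer; the caller releases it with free(). */
static void *alloc_page_aligned(void)
{
	return aligned_alloc(4096, 4096);
}
```

A whole-page allocation satisfies both constraints at once: the start of the page is page aligned, and any offset inside it that is a multiple of 512 is 512-byte aligned.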

Re: [PATCH] arm64: dts: mt8183: Fix GCE include path

2021-02-01 Thread Matthias Brugger




On 31/01/2021 17:17, Chun-Kuang Hu wrote:
> Hi, Matthias:
> 
>  wrote on Sunday, January 31, 2021 at 6:17 PM:
>>
>> From: Matthias Brugger 
>>
>> The header file of GCE should be for MT8183 SoC instead of MT8173.
>>
> 
> Reviewed-by: Chun-Kuang Hu 
> 

Applied to v5.11-next/dts64

Thanks

>> Fixes: 91f9c963ce79 ("arm64: dts: mt8183: Add display nodes for MT8183")
>> Reported-by: CK Hu 
>> Signed-off-by: Matthias Brugger 
>>
>> ---
>>
>>  arch/arm64/boot/dts/mediatek/mt8183.dtsi | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi 
>> b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
>> index 5b782a4769e7..80e466ce99f1 100644
>> --- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
>> +++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
>> @@ -6,7 +6,7 @@
>>   */
>>
>>  #include 
>> -#include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> --
>> 2.30.0
>>
>>
>> ___
>> Linux-mediatek mailing list
>> linux-media...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-mediatek

Re: Process-wide watchpoints

2021-02-01 Thread Dmitry Vyukov

On Sun, Jan 31, 2021 at 11:28 AM Dmitry Vyukov  wrote:
>
> On Sun, Jan 31, 2021 at 11:04 AM Dmitry Vyukov  wrote:
> >
> > On Thu, Nov 12, 2020 at 11:43 AM Dmitry Vyukov  wrote:
> > > > > for sampling race detection),
> > > > > number of threads in the process can be up to, say, ~~10K and the
> > > > > watchpoint is intended to be set for a very brief period of time
> > > > > (~~few ms).
> > > >
> > > > Performance is a consideration here, doing lots of IPIs in such a short
> > > > window, on potentially large machines is a DoS risk.
> > > >
> > > > > This can be done today with both perf_event_open and ptrace.
> > > > > However, the problem is that both APIs work on a single thread level
> > > > > (? perf_event_open can be inherited by children, but not for existing
> > > > > siblings). So doing this would require iterating over, say, 10K
> > > >
> > > > One way would be to create the event before the process starts spawning
> > > > threads and keeping it disabled. Then every thread will inherit it, but
> > > > it'll be inactive.
> > > >
> > > > > I see at least one potential problem: what do we do if some sibling
> > > > > thread already has all 4 watchpoints consumed?
> > > >
> > > > That would be immediately avoided by this, since it will have the
> > > > watchpoint reserved per inheriting the event.
> > > >
> > > > Then you can do ioctl(PERF_EVENT_IOC_{MODIFY_ATTRIBUTES,ENABLE,DISABLE})
> > > > to update the watch location and enable/disable it. This _will_ indeed
> > > > result in a shitload of IPIs if the threads are active, but it should
> > > > work.
> > >
> > > Aha! That's the possibility I missed.
> > > We will try to prototype this and get back with more questions if/when
> > > we have them.
> > > Thanks!
> >
> > Hi Peter,
> >
> > I've tested this approach and it only half works.
> > PERF_EVENT_IOC_{ENABLE,DISABLE} work as advertised.
> > However, PERF_EVENT_IOC_MODIFY_ATTRIBUTES does not work for inherited
> > child events.
> > Does something like this make any sense to you? Are you willing to
> > accept such a change?
> >
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 55d18791a72d..f6974807a32c 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -3174,7 +3174,7 @@ int perf_event_refresh(struct perf_event *event,
> > int refresh)
> >  }
> >  EXPORT_SYMBOL_GPL(perf_event_refresh);
> >
> > -static int perf_event_modify_breakpoint(struct perf_event *bp,
> > +static int _perf_event_modify_breakpoint(struct perf_event *bp,
> >  struct perf_event_attr *attr)
> >  {
> > int err;
> > @@ -3189,6 +3189,28 @@ static int perf_event_modify_breakpoint(struct
> > perf_event *bp,
> > return err;
> >  }
> >
> > +static int perf_event_modify_breakpoint(struct perf_event *bp,
> > +   struct perf_event_attr *attr)
> > +{
> > +   struct perf_event *child;
> > +   int err;
> > +
> > +   WARN_ON_ONCE(bp->ctx->parent_ctx);
> > +
> > +   mutex_lock(&bp->child_mutex);
> > +   err = _perf_event_modify_breakpoint(bp, attr);
> > +   if (err)
> > +   goto unlock;
> > +   list_for_each_entry(child, &bp->child_list, child_list) {
> > +   err = _perf_event_modify_breakpoint(child, attr);
> > +   if (err)
> > +   goto unlock;
> > +   }
> > +unlock:
> > +   mutex_unlock(&bp->child_mutex);
> > +   return err;
> > +}
> > +
> >  static int perf_event_modify_attr(struct perf_event *event,
> >   struct perf_event_attr *attr)
>
>
> Not directly related to the above question, but related to my use case.
> Could we extend bpf_perf_event_data with some more data re breakpoint events?
>
> struct bpf_perf_event_data {
> bpf_user_pt_regs_t regs;
> __u64 sample_period;
> __u64 addr;
> };
>
> Ideally, I would like to have an actual access address, size and
> read/write type (may not match bp addr/size). Is that info easily
> available at the point of bpf hook call?
> Or, if that's not available at least breakpoint bp_type/bp_size.
>
> Is it correct that we can materialize in bpf_perf_event_data anything
> that's available in bpf_perf_event_data_kern (if it makes sense in the
> public interface of course)?
>
> struct bpf_perf_event_data_kern {
> bpf_user_pt_regs_t *regs;
> struct perf_sample_data *data;
> struct perf_event *event;
> };
>
> Unfortunately I don't see perf_event_attr.bp_type/bp_size
> stored/accessible anywhere in bpf_perf_event_data_kern. What would be
> the right way to expose them in bpf_perf_event_data?

Or, alternatively would it be reasonable for perf to generate SIGTRAP
directly on watchpoint hit (like ptrace does)? That's what I am
ultimately trying to do by attaching a bpf program.

Re: [PATCH v1 2/2] arm64: configs: Support DEVAPC on MediaTek platforms

2021-02-01 Thread Matthias Brugger




On 31/01/2021 23:23, Arnd Bergmann wrote:
> On Sun, Jan 31, 2021 at 3:07 PM Matthias Brugger  
> wrote:
>> On 23/12/2020 09:44, Neal Liu wrote:
>>> Support DEVAPC on MediaTek platforms by enabling CONFIG_MTK_DEVAPC.
>>>
>>> Signed-off-by: Neal Liu 
>>> ---
>>>  arch/arm64/configs/defconfig |1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
>>> index 17a2df6..a373776 100644
>>> --- a/arch/arm64/configs/defconfig
>>> +++ b/arch/arm64/configs/defconfig
>>> @@ -257,6 +257,7 @@ CONFIG_MTD_NAND_MARVELL=y
>>>  CONFIG_MTD_NAND_FSL_IFC=y
>>>  CONFIG_MTD_NAND_QCOM=y
>>>  CONFIG_MTD_SPI_NOR=y
>>> +CONFIG_MTK_DEVAPC=m
>>>  CONFIG_SPI_CADENCE_QUADSPI=y
>>>  CONFIG_BLK_DEV_LOOP=y
>>>  CONFIG_BLK_DEV_NBD=m
>>>
>>
>> From my understanding, defconfig is for a minimal config that allows booting
>> a machine. As MTK_DEVAPC is a rather exotic driver to detect bus access
>> violations, I think it's not a good candidate for inclusion in defconfig.
>>
>> In any case, I added the SoC maintainer, so that they can correct me, if I'm
>> wrong :)
> 
> I generally don't mind adding platform specific drivers as loadable modules
> even if they are somewhat obscure. For built-in drivers, this is
> different though,
> as those have a noticeable impact on other platforms.
> 
> I haven't kept track of what this particular driver does, but from the Kconfig
> description, I'd say it should get enabled in defconfig.
> 

Thanks for the feedback Arnd.
Applied now to v5.11-next/defconfig

Re: [PATCH v8 0/5] media: i2c: Add RDACM21 camera module

2021-02-01 Thread Jacopo Mondi

Hi Sakari,

On Thu, Jan 14, 2021 at 06:04:24PM +0100, Jacopo Mondi wrote:
> One more iteration to squash in all the fixups sent in v7 and address
> a comment from Sergei in [2/5] commit message.
>
> All patches now reviewed and hopefully ready to be collected!

All patches seem to be reviewed; do you think we can still collect this for
the v5.12 merge window?

Thanks
  j

>
> Thanks
>   j
>
> Jacopo Mondi (5):
>   media: i2c: Add driver for RDACM21 camera module
>   dt-bindings: media: max9286: Document
> 'maxim,reverse-channel-microvolt'
>   media: i2c: max9286: Break-out reverse channel setup
>   media: i2c: max9286: Make channel amplitude programmable
>   media: i2c: max9286: Configure reverse channel amplitude
>
>  .../bindings/media/i2c/maxim,max9286.yaml |  22 +
>  MAINTAINERS   |  12 +
>  drivers/media/i2c/Kconfig |  13 +
>  drivers/media/i2c/Makefile|   2 +
>  drivers/media/i2c/max9286.c   |  60 +-
>  drivers/media/i2c/rdacm21.c   | 623 ++
>  6 files changed, 719 insertions(+), 13 deletions(-)
>  create mode 100644 drivers/media/i2c/rdacm21.c
>
> --
> 2.29.2
>

Re: [PATCH 1/1] vsock: fix the race conditions in multi-transport support

2021-02-01 Thread Alexander Popov

On 01.02.2021 11:26, Stefano Garzarella wrote:
> On Sun, Jan 31, 2021 at 01:59:14PM +0300, Alexander Popov wrote:
>> There are multiple similar bugs implicitly introduced by the
>> commit c0cfa2d8a788fcf4 ("vsock: add multi-transports support") and
>> commit 6a2c0962105ae8ce ("vsock: prevent transport modules unloading").
>>
>> The bug pattern:
>> [1] vsock_sock.transport pointer is copied to a local variable,
>> [2] lock_sock() is called,
>> [3] the local variable is used.
>> VSOCK multi-transport support introduced the race condition:
>> vsock_sock.transport value may change between [1] and [2].
>>
>> Let's copy vsock_sock.transport pointer to local variables after
>> the lock_sock() call.
> 
> We can add:
> 
> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
> 
>>
>> Signed-off-by: Alexander Popov 
>> ---
>> net/vmw_vsock/af_vsock.c | 17 -
>> 1 file changed, 12 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> index d10916ab4526..28edac1f9aa6 100644
>> --- a/net/vmw_vsock/af_vsock.c
>> +++ b/net/vmw_vsock/af_vsock.c
>> @@ -997,9 +997,12 @@ static __poll_t vsock_poll(struct file *file, struct 
>> socket *sock,
>>  mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;
>>
>>  } else if (sock->type == SOCK_STREAM) {
>> -const struct vsock_transport *transport = vsk->transport;
>> +const struct vsock_transport *transport = NULL;
> 
> I think we can avoid initializing to NULL since we assign it shortly 
> after.
> 
>> +
>>  lock_sock(sk);
>>
>> +transport = vsk->transport;
>> +
>>  /* Listening sockets that have connections in their accept
>>   * queue can be read.
>>   */
>> @@ -1082,10 +1085,11 @@ static int vsock_dgram_sendmsg(struct socket *sock, 
>> struct msghdr *msg,
>>  err = 0;
>>  sk = sock->sk;
>>  vsk = vsock_sk(sk);
>> -transport = vsk->transport;
>>
>>  lock_sock(sk);
>>
>> +transport = vsk->transport;
>> +
>>  err = vsock_auto_bind(vsk);
>>  if (err)
>>  goto out;
>> @@ -1544,10 +1548,11 @@ static int vsock_stream_setsockopt(struct 
>> socket *sock,
>>  err = 0;
>>  sk = sock->sk;
>>  vsk = vsock_sk(sk);
>> -transport = vsk->transport;
>>
>>  lock_sock(sk);
>>
>> +transport = vsk->transport;
>> +
>>  switch (optname) {
>>  case SO_VM_SOCKETS_BUFFER_SIZE:
>>  COPY_IN(val);
>> @@ -1680,7 +1685,6 @@ static int vsock_stream_sendmsg(struct socket *sock, 
>> struct msghdr *msg,
>>
>>  sk = sock->sk;
>>  vsk = vsock_sk(sk);
>> -transport = vsk->transport;
>>  total_written = 0;
>>  err = 0;
>>
>> @@ -1689,6 +1693,8 @@ static int vsock_stream_sendmsg(struct socket *sock, 
>> struct msghdr *msg,
>>
>>  lock_sock(sk);
>>
>> +transport = vsk->transport;
>> +
>>  /* Callers should not provide a destination with stream sockets. */
>>  if (msg->msg_namelen) {
>>  err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
>> @@ -1823,11 +1829,12 @@ vsock_stream_recvmsg(struct socket *sock, struct 
>> msghdr *msg, size_t len,
>>
>>  sk = sock->sk;
>>  vsk = vsock_sk(sk);
>> -transport = vsk->transport;
>>  err = 0;
>>
>>  lock_sock(sk);
>>
>> +transport = vsk->transport;
>> +
>>  if (!transport || sk->sk_state != TCP_ESTABLISHED) {
>>  /* Recvmsg is supposed to return 0 if a peer performs an
>>   * orderly shutdown. Differentiate between that case and when a
>> -- 
>> 2.26.2
>>
> 
> Thanks for fixing these issues. With the small changes applied:
> 
> Reviewed-by: Stefano Garzarella 

Hello Stefano,

Thanks for the review.

I've just sent the v2.

Best regards,
Alexander

Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-02-01 Thread Hannes Reinecke


On 2/1/21 9:47 AM, Chao Leng wrote:



On 2021/2/1 15:29, Hannes Reinecke wrote:

[ .. ]

Urgh. Please, no. That is well impossible to debug.
Can you please open-code it to demonstrate where the difference to the 
current (and my fixed) versions is?

I'm still not clear where the problem is once we applied both patches.
For example, assume the list has three paths, and none of the paths is
NVME_ANA_OPTIMIZED:

head->next = ns1;
ns1->next = ns2;
ns2->next = head;
old->next = ns2;


And this is what I have issues with.
Where does 'old' come from?
Clearly it was part of the list at one point; so what happened to it?

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

Re: [PATCH v9 7/7] at24: Support probing while off

2021-02-01 Thread Bartosz Golaszewski

On Fri, Jan 29, 2021 at 1:20 PM Sakari Ailus
 wrote:
>
> Hi Bartosz,
>
> Thanks for the review.
>
> On Fri, Jan 29, 2021 at 11:56:00AM +0100, Bartosz Golaszewski wrote:
> > On Fri, Jan 29, 2021 at 12:27 AM Sakari Ailus
> >  wrote:
> > >
> > > In certain use cases (where the chip is part of a camera module, and the
> > > camera module is wired together with a camera privacy LED), powering on
> > > the device during probe is undesirable. Add support for the at24 to
> > > execute probe while being powered off. For this to happen, a hint in form
> > > of a device property is required from the firmware.
> > >
> > > Signed-off-by: Sakari Ailus 
> > > ---
> > >  drivers/misc/eeprom/at24.c | 43 +++---
> > >  1 file changed, 26 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/drivers/misc/eeprom/at24.c b/drivers/misc/eeprom/at24.c
> > > index 926408b41270c..dd0b3f24e3808 100644
> > > --- a/drivers/misc/eeprom/at24.c
> > > +++ b/drivers/misc/eeprom/at24.c
> > > @@ -595,6 +595,7 @@ static int at24_probe(struct i2c_client *client)
> > > bool i2c_fn_i2c, i2c_fn_block;
> > > unsigned int i, num_addresses;
> > > struct at24_data *at24;
> > > +   bool low_power;
> > > struct regmap *regmap;
> > > bool writable;
> > > u8 test_byte;
> > > @@ -750,14 +751,16 @@ static int at24_probe(struct i2c_client *client)
> > >
> > > i2c_set_clientdata(client, at24);
> > >
> > > -   err = regulator_enable(at24->vcc_reg);
> > > -   if (err) {
> > > -   dev_err(dev, "Failed to enable vcc regulator\n");
> > > -   return err;
> > > -   }
> > > +   low_power = acpi_dev_state_low_power(&client->dev);
> >
> > I've raised my concern about the naming of this before but no
> > discussion followed. Do we really want to name it: "low power"? This
> > is misleading as the device can actually be powered off at probe().
> > "Low power" suggests some low-power state or even low battery IMO.
>
> This was suggested by Rafael in place of "powered off" as it's not known
> whether the device is powered off. The same terms should be used in all
> contexts (ACPI and I²C frameworks and drivers). Others haven't expressed
> concerns.
>

So we're describing a situation where "device may be powered off" by
calling it "low_power". This doesn't make sense. Why not something
like: acpi_dev_may_be_off(), acpi_dev_powerdown_possible(),
acpi_dev_possibly_off(). If I'm reading a driver's code and see
"acpi_dev_state_low_power()", I would have never guessed it refers to
a situation where the device may be potentially powered-down.

> ACPI spec appears to be using terms "on" and "off".
>
> The use of the function is not limited to driver probe time.
>
> >
> > If anything: I'd prefer the 'low_power' local variable be changed to
> > "no_test_read".
>
> That misses the power management related suggestion now present in the name
> --- the device needs to be suspended using runtime PM if probe fails and
> it's not in "low power state".
>
> How about "off_during_probe"?
>

Yes, this is much better than low_power.

Bartosz

> --
> Kind regards,
>
> Sakari Ailus

Re: linux-5.10.11 build failure

2021-02-01 Thread Chris Clayton

Hi Greg,

On 29/01/2021 15:14, Josh Poimboeuf wrote:
> On Fri, Jan 29, 2021 at 12:09:53PM +0100, Greg Kroah-Hartman wrote:
>> On Fri, Jan 29, 2021 at 11:03:26AM +, Chris Clayton wrote:
>>>
>>>
>>> On 29/01/2021 10:11, Greg Kroah-Hartman wrote:
 On Thu, Jan 28, 2021 at 10:00:15AM -0600, Josh Poimboeuf wrote:
...

 It is in Linus's tree now :)

 Now grabbed.

>>>
>>> Are you sure, Greg? I don't see the patch in Linus' tree at
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git. Nor do
>>> I see it in your stable queue at
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/.
>>> For clarity, I've attached the patch which
>>> fixes the problem I reported and is currently sat in
>>> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git As I
>>> understand it, the patch is scheduled to be included in a pull request to 
>>> Linus this weekend in time for -rc6.
>>>
>>> In fact, I did a pull from Linus' tree a few minutes ago and the build 
>>> failed in the way I reported in this thread. I
>>> added the patch and the build now succeeds.
>>
>> Ok, sorry, no, I grabbed 1d489151e9f9 ("objtool: Don't fail on missing
>> symbol table") which is what Josh asked me to take.  I got that confused
>> here.
> 
> I'm probably responsible for that confusion, I got mixed up myself.
> It'll be a good idea to take both anyway.
> 

The patch is now in Linus' tree at 5e6dca82bcaa49348f9e5fcb48df4881f6d6c4ae

Thanks.

Chris

Re: [PATCH 1/2] soc: mediatek: pm-domains: Use correct mask for bus_prot_clr

2021-02-01 Thread Matthias Brugger




On 01/02/2021 06:45, Bilal Wasim wrote:
> When "bus_prot_reg_update" is false, the driver should use
> INFRA_TOPAXI_PROTECTEN for both setting and clearing the bus
> protection. However, the driver does not use this mask for
> clearing bus protection which causes failure when booting
> the imgtec gpu.
> 
> Corrected and tested with mt8173 chromebook.
> 
> Signed-off-by: Bilal Wasim 
> ---
>  drivers/soc/mediatek/mtk-pm-domains.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/soc/mediatek/mtk-pm-domains.h 
> b/drivers/soc/mediatek/mtk-pm-domains.h
> index 141dc76054e6..7454c0b4f768 100644
> --- a/drivers/soc/mediatek/mtk-pm-domains.h
> +++ b/drivers/soc/mediatek/mtk-pm-domains.h
> @@ -60,7 +60,7 @@
>  #define BUS_PROT_UPDATE_TOPAXI(_mask)\
>   BUS_PROT_UPDATE(_mask,  \
>   INFRA_TOPAXI_PROTECTEN, \
> - INFRA_TOPAXI_PROTECTEN_CLR, \
> + INFRA_TOPAXI_PROTECTEN, \

BUS_PROT_UPDATE sets bus_prot_reg_update to true, which contradicts what you say
in the commit message.

Please clarify.

Regards,
Matthias

>   INFRA_TOPAXI_PROTECTSTA1)
>  
>  struct scpsys_bus_prot_data {
>

Re: turbostat: Fix Pkg Power on Zen

2021-02-01 Thread Kurt Garloff

Hi Len,

Issue persists on Ryzen in 5.11-rc6:

kvmadmin@KurtSrv2018(//):~ [0]$ sudo 
/casa/src/linux-stable/tools/power/x86/turbostat/turbostat
turbostat version 20.09.30 - Len Brown 
CPUID(0): AuthenticAMD 0x10 CPUID levels; 0x8023 xlevels; 
family:model:stepping 0x19:21:0 (25:33:0)
CPUID(1): SSE3 MONITOR - - - TSC MSR - HT -
CPUID(6): APERF, No-TURBO, No-DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, 
No-HWPepp, No-HWPpkg, No-EPB
CPUID(7): No-SGX
RAPL: 234 sec. Joule Counter Range, at 280 Watts
/dev/cpu_dma_latency: 20 usec (default)
current_driver: acpi_idle
current_governor: menu
current_governor_ro: menu
cpu22: POLL: CPUIDLE CORE POLL IDLE
cpu22: C1: ACPI FFH MWAIT 0x0
cpu22: C2: ACPI IOPORT 0x414
cpu22: cpufreq driver: acpi-cpufreq
cpu22: cpufreq governor: schedutil
cpufreq boost: 1
cpu0: MSR_RAPL_PWR_UNIT: 0x000a1003 (0.125000 Watts, 0.15 Joules, 0.000977 
sec.)
kvmadmin@KurtSrv2018(//):~ [243]$

    ^^^ Exit code

With the patch:

kvmadmin@KurtSrv2018(//):~ [243]$ sudo 
/casa/src/linux-stable/tools/power/x86/turbostat/turbostat   
turbostat version 20.09.30 - Len Brown 
CPUID(0): AuthenticAMD 0x10 CPUID levels; 0x8023 xlevels; 
family:model:stepping 0x19:21:0 (25:33:0)
CPUID(1): SSE3 MONITOR - - - TSC MSR - HT -
CPUID(6): APERF, No-TURBO, No-DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, 
No-HWPepp, No-HWPpkg, No-EPB
CPUID(7): No-SGX
RAPL: 234 sec. Joule Counter Range, at 280 Watts
/dev/cpu_dma_latency: 20 usec (default)
current_driver: acpi_idle
current_governor: menu
current_governor_ro: menu
cpu28: POLL: CPUIDLE CORE POLL IDLE
cpu28: C1: ACPI FFH MWAIT 0x0
cpu28: C2: ACPI IOPORT 0x414
cpu28: cpufreq driver: acpi-cpufreq
cpu28: cpufreq governor: schedutil
cpufreq boost: 1
cpu0: MSR_RAPL_PWR_UNIT: 0x000a1003 (0.125000 Watts, 0.15 Joules, 0.000977 
sec.)
Core  CPU  Avg_MHz  Busy%  Bzy_MHz  TSC_MHz  IRQ    POLL  C1    C2     POLL%  C1%   C2%    CorWatt  PkgWatt
-     -    27       1.04   2562     3411     16046  33    2931  12895  0.00   0.85  98.48  1.57     18.81
0     0    12       0.55   2193     3400     885    1     111   757    0.00   1.12  98.42  0.04     18.74
0     16   1        0.05   2351     3400     53     0     3     54     0.00   0.05  99.92
1     1    20       0.89   2261     3400     478    0     39    427    0.00   0.37  98.80  0.06
1     17   9        0.40   2329     3400     308    0     38    282    0.00   0.35  99.29
[...]

-- 
Kurt Garloff 
Cologne, Germany

On 26/12/2020 13:13, Kurt Garloff wrote:
> Hi Len,
>
> find attached fix to avoid exiting with -13 on Zen. Patch is against 
> turbostat as included in Linux-5.10.2.
> Please merge.
>
> PS: This is probably material for -stable, as it used to work before on Zen 
> (Zen2 aka Ryzen 3000 in my case).
>

RE: [PATCH 3/8] scsi: ufshpb: Add region's reads counter

2021-02-01 Thread Avri Altman

 
> On Mon, Feb 01, 2021 at 08:17:59AM +, Avri Altman wrote:
> > >
> > > On Mon, Feb 01, 2021 at 07:51:19AM +, Avri Altman wrote:
> > > > >
> > > > > On Mon, Feb 01, 2021 at 07:12:53AM +, Avri Altman wrote:
> > > > > > > > +#define WORK_PENDING 0
> > > > > > > > +#define ACTIVATION_THRSHLD 4 /* 4 IOs */
> > > > > > > > Rather than fixing it with macro, how about using sysfs and make it
> > > > > > > > configurable?
> > > > > > Yes.
> > > > > > I will add a patch making all the logic configurable.
> > > > > > > As all those are hpb-related parameters, I think module parameters are
> > > > > > > more adequate.
> > > > >
> > > > > No, this is not the 1990's, please never add new module parameters to
> > > > > drivers.  If not for the basic problem of they do not work on a
> > > > > per-device basis, but on a per-driver basis, which is what you almost
> > > > > never want.
> > > > OK.
> > > >
> > > > >
> > > > > > But why would you want to change this value, why can't the driver
> > > > > > "just work" and not need manual intervention?
> > > > > It is.
> > > > > But those are knobs each vendor may want to tweak,
> > > > > so it can be tuned to its device's internal implementation.
> > > > >
> > > > > Tweaking the parameters, as well as the entire logic, is really an
> > > > > endless task.
> > > > > Some logic works better for some scenarios, while falling behind on
> > > > > others.
> > > > Shouldn't the hardware know how to handle this dynamically?  If not, how
> > > > is a user going to know?
> > > There is one "brain".
> > > It is either in the device - in device mode, Or in the host - in host mode
> > > control.
> > > The "brain" decides which region is active, thus carrying the physical
> > > address along with the logical -
> > > minimizing context switches in the device's RAM.
> >
> > There can be up to N active regions.
> > Activation and deactivation has its overhead.
> > So basically it is a constraint-optimization problem.
> 
> So how do you solve it?  And how would you expect a user to solve it if
> the kernel can not?
> 
> You better document the heck out of these configuration options :)
Yes.  Will do.

Thanks,
Avri

Re: [PATCH 01/11] x86/fault: Fix AMD erratum #91 errata fixup for user code

2021-02-01 Thread Christoph Hellwig

On Sun, Jan 31, 2021 at 09:24:32AM -0800, Andy Lutomirski wrote:
> While we're at it, disable the workaround on all CPUs except AMD Family
> 0xF.  By my reading of the Revision Guide for AMD Athlon™ 64 and AMD
> Opteron™ Processors, only family 0xF is affected.

I think it would be better to have one no-risk regression fix that
just probes both user and kernel addresses, and a separate one to
restrict the workaround.

> + if (likely(boot_cpu_data.x86_vendor != X86_VENDOR_AMD
> +|| boot_cpu_data.x86 != 0xf))

Normally kernel style would be to have the || on the first line.

WARNING in cfg80211_inform_single_bss_frame_data

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:6642d600 Merge tag '5.11-rc5-smb3' of git://git.samba.org/..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13f36b44d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=96b123631a6700e9
dashboard link: https://syzkaller.appspot.com/bug?extid=405843667e93b9790fc1
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=13fdeca0d0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=173e9028d0

The issue was bisected to:

commit 4abb52a46e7336c1e568a53761c8b7a81bbaaeaf
Author: Sara Sharon 
Date:   Wed Jan 16 10:14:41 2019 +

mac80211: pass bssids to elements parsing function

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=153eaac4d0
final oops: https://syzkaller.appspot.com/x/report.txt?x=173eaac4d0
console output: https://syzkaller.appspot.com/x/log.txt?x=133eaac4d0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+405843667e93b9790...@syzkaller.appspotmail.com
Fixes: 4abb52a46e73 ("mac80211: pass bssids to elements parsing function")

[ cut here ]
WARNING: CPU: 1 PID: 18 at net/wireless/scan.c:2337 
cfg80211_inform_single_bss_frame_data+0xc7f/0xe90 net/wireless/scan.c:2337
Modules linked in:
CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 5.11.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:cfg80211_inform_single_bss_frame_data+0xc7f/0xe90 
net/wireless/scan.c:2337
Code: 0f 0b 45 31 e4 e9 37 fb ff ff e8 3c 4a 3c f9 0f 0b 45 31 e4 e9 28 fb ff 
ff e8 2d 4a 3c f9 0f 0b e9 58 f4 ff ff e8 21 4a 3c f9 <0f> 0b 45 31 e4 e9 0d fb 
ff ff e8 12 4a 3c f9 0f 0b e9 4f fd ff ff
RSP: 0018:c9d874c0 EFLAGS: 00010246
RAX:  RBX: c9d87a38 RCX: 0100
RDX: 888010db3780 RSI: 8836717f RDI: 0003
RBP: 888011512c00 R08: 0023 R09: 0080
R10: 883666e3 R11: 001c R12: 0023
R13: 8880183b0580 R14: 0080 R15: 0080
FS:  () GS:8880b9f0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 2200 CR3: 0b08e000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 cfg80211_inform_bss_frame_data+0xa7/0xb10 net/wireless/scan.c:2433
 ieee80211_bss_info_update+0x3ce/0xb20 net/mac80211/scan.c:190
 ieee80211_scan_rx+0x45f/0x7c0 net/mac80211/scan.c:299
 __ieee80211_rx_handle_packet net/mac80211/rx.c:4558 [inline]
 ieee80211_rx_list+0x1faf/0x2430 net/mac80211/rx.c:4746
 ieee80211_rx_napi+0xf7/0x3d0 net/mac80211/rx.c:4769
 ieee80211_rx include/net/mac80211.h:4508 [inline]
 ieee80211_tasklet_handler+0xd4/0x130 net/mac80211/main.c:235
 tasklet_action_common.constprop.0+0x1d7/0x2d0 kernel/softirq.c:555
 __do_softirq+0x2bc/0xa29 kernel/softirq.c:343
 run_ksoftirqd kernel/softirq.c:650 [inline]
 run_ksoftirqd+0x2d/0x50 kernel/softirq.c:642
 smpboot_thread_fn+0x655/0x9e0 kernel/smpboot.c:165
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

WARNING: suspicious RCU usage in kernfs_iop_permission

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:6642d600 Merge tag '5.11-rc5-smb3' of git://git.samba.org/..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=171405bf50
kernel config:  https://syzkaller.appspot.com/x/.config?x=9408d1770a50819c
dashboard link: https://syzkaller.appspot.com/bug?extid=0e507d08417ca2d565bf
compiler:   clang version 11.0.1
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=13b8a330d0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16f3628cd0

The issue was bisected to:

commit 89bdfaf93d9157499c3a0d61f489df66f2dead7f
Author: Miklos Szeredi 
Date:   Mon Dec 14 14:26:14 2020 +

ovl: make ioctl() safe

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11b85fb4d0
final oops: https://syzkaller.appspot.com/x/report.txt?x=13b85fb4d0
console output: https://syzkaller.appspot.com/x/log.txt?x=15b85fb4d0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0e507d08417ca2d56...@syzkaller.appspotmail.com
Fixes: 89bdfaf93d91 ("ovl: make ioctl() safe")

=
WARNING: suspicious RCU usage
5.11.0-rc5-syzkaller #0 Not tainted
-
kernel/sched/core.c:7932 Illegal context switch in RCU-sched read-side critical 
section!

other info that might help us debug this:


rcu_scheduler_active = 2, debug_locks = 0
no locks held by systemd/1.

stack backtrace:
CPU: 0 PID: 1 Comm: systemd Not tainted 5.11.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x137/0x1be lib/dump_stack.c:120
 ___might_sleep+0xb4/0x530 kernel/sched/core.c:7932
 __mutex_lock_common+0x4e/0x2f00 kernel/locking/mutex.c:935
 __mutex_lock kernel/locking/mutex.c:1103 [inline]
 mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1118
 kernfs_iop_permission+0x66/0x2f0 fs/kernfs/inode.c:284
 do_inode_permission fs/namei.c:398 [inline]
 inode_permission+0x234/0x4a0 fs/namei.c:463
 may_lookup fs/namei.c:1575 [inline]
 link_path_walk+0x226/0xc10 fs/namei.c:2128
 path_openat+0x1f5/0x37a0 fs/namei.c:3367
 do_filp_open+0x191/0x3a0 fs/namei.c:3398
 do_sys_openat2+0xba/0x380 fs/open.c:1172
 do_sys_open fs/open.c:1188 [inline]
 __do_sys_open fs/open.c:1196 [inline]
 __se_sys_open fs/open.c:1192 [inline]
 __x64_sys_open+0x1af/0x1e0 fs/open.c:1192
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fdaf5f1370d
Code: 30 2c 00 00 75 10 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 
ec 08 e8 fe 9d 01 00 48 89 04 24 b8 02 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 
47 9e 01 00 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:7ffedccb56f0 EFLAGS: 0293 ORIG_RAX: 0002
RAX: ffda RBX: 55b998e80590 RCX: 7fdaf5f1370d
RDX: 01b6 RSI: 0008 RDI: 7ffedccb57d0
RBP: 0008 R08: 0008 R09: 0001
R10: 0008 R11: 0293 R12: 7fdaf764d7b4
R13: 0001 R14: 55b998d5ad60 R15: 7ffedccb57d0


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Re: [PATCH] KVM: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE

2021-02-01 Thread Paolo Bonzini


On 01/02/21 09:38, Jiapeng Chong wrote:

Fix the following coccicheck warning:

./arch/x86/kvm/debugfs.c:44:0-23: WARNING: vcpu_tsc_scaling_frac_fops
should be defined with DEFINE_DEBUGFS_ATTRIBUTE.

./arch/x86/kvm/debugfs.c:36:0-23: WARNING: vcpu_tsc_scaling_fops should
be defined with DEFINE_DEBUGFS_ATTRIBUTE.

./arch/x86/kvm/debugfs.c:27:0-23: WARNING: vcpu_tsc_offset_fops should
be defined with DEFINE_DEBUGFS_ATTRIBUTE.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
  arch/x86/kvm/debugfs.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/debugfs.c b/arch/x86/kvm/debugfs.c
index 7e818d6..9c0e29e 100644
--- a/arch/x86/kvm/debugfs.c
+++ b/arch/x86/kvm/debugfs.c
@@ -15,7 +15,7 @@ static int vcpu_get_timer_advance_ns(void *data, u64 *val)
return 0;
  }
  
-DEFINE_SIMPLE_ATTRIBUTE(vcpu_timer_advance_ns_fops, vcpu_get_timer_advance_ns, NULL, "%llu\n");

+DEFINE_DEBUGFS_ATTRIBUTE(vcpu_timer_advance_ns_fops, vcpu_get_timer_advance_ns, NULL, 
"%llu\n");
  
  static int vcpu_get_tsc_offset(void *data, u64 *val)

  {
@@ -24,7 +24,7 @@ static int vcpu_get_tsc_offset(void *data, u64 *val)
return 0;
  }
  
-DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_offset_fops, vcpu_get_tsc_offset, NULL, "%lld\n");

+DEFINE_DEBUGFS_ATTRIBUTE(vcpu_tsc_offset_fops, vcpu_get_tsc_offset, NULL, 
"%lld\n");
  
  static int vcpu_get_tsc_scaling_ratio(void *data, u64 *val)

  {
@@ -33,7 +33,7 @@ static int vcpu_get_tsc_scaling_ratio(void *data, u64 *val)
return 0;
  }
  
-DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_scaling_fops, vcpu_get_tsc_scaling_ratio, NULL, "%llu\n");

+DEFINE_DEBUGFS_ATTRIBUTE(vcpu_tsc_scaling_fops, vcpu_get_tsc_scaling_ratio, NULL, 
"%llu\n");
  
  static int vcpu_get_tsc_scaling_frac_bits(void *data, u64 *val)

  {
@@ -41,7 +41,8 @@ static int vcpu_get_tsc_scaling_frac_bits(void *data, u64 
*val)
return 0;
  }
  
-DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_scaling_frac_fops, vcpu_get_tsc_scaling_frac_bits, NULL, "%llu\n");

+DEFINE_DEBUGFS_ATTRIBUTE(vcpu_tsc_scaling_frac_fops, 
vcpu_get_tsc_scaling_frac_bits,
+ NULL, "%llu\n");
  
  void kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu, struct dentry *debugfs_dentry)

  {



If you wanted to do this, you would have to use 
debugfs_create_file_unsafe() as well.


In practice, nobody does that because it's not a performance-sensitive 
path and it's not worth the maintenance cost of using a less safe API.


Paolo

Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-01 Thread Saravana Kannan

On Mon, Feb 1, 2021 at 12:05 AM Marek Szyprowski
 wrote:
>
> Hi Saravana,
>
> On 30.01.2021 05:08, Saravana Kannan wrote:
> > On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  
> > wrote:
> >> This patch series solves two general issues with fw_devlink=on
> >>
> >> Patch 1/2 addresses the issue of firmware nodes that look like they'll
> >> have struct devices created for them, but will never actually have
> >> struct devices added for them. For example, DT nodes with a compatible
> >> property that don't have devices added for them.
> >>
> >> Patch 2/2 addresses (for static kernels) the issue of optional suppliers
> >> that'll never have a driver registered for them. So, if the device could
> >> have probed with fw_devlink=permissive with a static kernel, this patch
> >> should allow those devices to probe with a fw_devlink=on. This doesn't
> >> solve it for the case where modules are enabled because there's no way
> >> to tell if a driver will never be registered or it's just about to be
> >> registered. I have some other ideas for that, but they'll have to come
> >> later, after thinking about it a bit.
> >>
> >> These two patches might remove the need for several other patches that
> >> went in as fixes for commit e590474768f1 ("driver core: Set
> >> fw_devlink=on by default"), but I think all those fixes are good
> >> changes. So I think we should leave those in.
> >>
> >> Marek, Geert,
> >>
> >> Can you try this series on a static kernel with your OF_POPULATED
> >> changes reverted? I just want to make sure these patches can identify
> >> and fix those cases.
> >>
> >> Tudor,
> >>
> >> You should still make the clock driver fix (because it's a bug), but I
> >> think this series will fix your issue too (even without the clock driver
> >> fix). Can you please give this a shot?
> > Marek, Geert, Tudor,
> >
> > Forgot to say that this will probably fix your issues only in a static
> > kernel. So please try this with a static kernel. If you can also try
> > and confirm that this does not fix the issue for a modular kernel,
> > that'd be good too.
>
> I've checked those patches on top of linux next-20210129 with
> c09a3e6c97f0 ("soc: samsung: pm_domains: Convert to regular platform
> driver") commit reverted.

Hi Marek,

Thanks for testing!

> Sadly it doesn't help.

That sucks. I even partly "tested" it out on my platform (that needs
CONFIG_MODULES) by commenting out the CONFIG_MODULES check. And I saw
some device links getting dropped.

> All devices that belong

By belong, I assume you meant "are consumers"?

> to the Exynos power domains are never probed and stay endlessly on the
> deferred devices list. I've used static kernel build - the one from
> exynos_defconfig.

Can you enable the dev_dbg in __device_link_del() (the SRCU variant)?
Hopefully at least some of the device links would be dropped?

If the PD device link is not dropped, I wonder why this condition is
not hitting for consumers of the PD.

if (fw_devlink_def_probe_retry &&
link->flags & DL_FLAG_INFERRED &&
!device_links_probe_blocked_by(link->supplier)) {
device_link_drop_managed(link);
continue;
}

Could you try logging dev, link->supplier and
device_links_probe_blocked_by() return value. That should tell when a
consumer is waiting on a PD, why the PD might appear as waiting on
something else. I can't imagine the DL_FLAG_INFERRED being cleared
(it'll only happen when a driver/framework explicitly creates a device
link). Remind me again where the DT for this board is? Does the PD
depend on something else?

One other possibility is that some of the consumers of the PD could be
using the *_platform_driver_probe() macro/function that never
reattempts a probe. So even though this patch might drop the device
links, the consumer never tries again.

-Saravana

Re: [PATCH V2] scsi: ufs: Add UFS3.0 in ufs HCI version check

2021-02-01 Thread nitirawa


On 2021-01-30 00:55, Bean Huo wrote:

On Tue, 2021-01-19 at 17:37 +0530, Nitin Rawat wrote:

As per JESD223D UFS HCI v3.0 spec, HCI version 3.0
is also supported. Hence Adding UFS3.0 in UFS HCI
version check to avoid logging of the error message.

Signed-off-by: Nitin Rawat 
---
 drivers/scsi/ufs/ufshcd.c | 5 +++--
 drivers/scsi/ufs/ufshci.h | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 82ad317..54ca765 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -9255,8 +9255,9 @@ int ufshcd_init(struct ufs_hba *hba, void
__iomem *mmio_base, unsigned int irq)
if ((hba->ufs_version != UFSHCI_VERSION_10) &&
(hba->ufs_version != UFSHCI_VERSION_11) &&
(hba->ufs_version != UFSHCI_VERSION_20) &&
-   (hba->ufs_version != UFSHCI_VERSION_21))
-   dev_err(hba->dev, "invalid UFS version 0x%x\n",
+   (hba->ufs_version != UFSHCI_VERSION_21) &&
+   (hba->ufs_version != UFSHCI_VERSION_30))
+   dev_err(hba->dev, "invalid UFS HCI version 0x%x\n",
hba->ufs_version);


Hi Nitin
Apart from HCI 1.0 / 1.1 / 2.0 / 2.1 / 3.0, do you have any other UFS HCI
version? If not, the current driver supports all of them. Instead of
extending these checks just to avoid logging the error message, I suggest
you delete these redundant checks outright.

If there is an odd HCI version that is not supported by the current
driver, you can add a list of unsupported versions for it. That way you
don't need to keep extending this check.

Bean


Hi Bean,
That's a good suggestion. If nobody has any concerns, I will
post a new patchset removing these redundant checks.

Regards,
Nitin

KMSAN: uninit-value in crc32_le_base (2)

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:73d62e81 kmsan: random: prevent boot-time reports in _mix_..
git tree:   https://github.com/google/kmsan.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=120020acd0
kernel config:  https://syzkaller.appspot.com/x/.config?x=31d3b433c9628854
dashboard link: https://syzkaller.appspot.com/bug?extid=ce18ece82e1fede33bf7
compiler:   clang version 11.0.1
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ce18ece82e1fede33...@syzkaller.appspotmail.com

=
BUG: KMSAN: uninit-value in crc32_body lib/crc32.c:110 [inline]
BUG: KMSAN: uninit-value in crc32_le_generic lib/crc32.c:179 [inline]
BUG: KMSAN: uninit-value in crc32_le_base+0x558/0xe70 lib/crc32.c:197
CPU: 1 PID: 11631 Comm: segctord Not tainted 5.10.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x21c/0x280 lib/dump_stack.c:118
 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
 __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197
 crc32_body lib/crc32.c:110 [inline]
 crc32_le_generic lib/crc32.c:179 [inline]
 crc32_le_base+0x558/0xe70 lib/crc32.c:197
 nilfs_segbuf_fill_in_segsum_crc fs/nilfs2/segbuf.c:182 [inline]
 nilfs_add_checksums_on_logs+0x388/0xde0 fs/nilfs2/segbuf.c:320
 nilfs_segctor_do_construct+0x7dd6/0xbf20 fs/nilfs2/segment.c:2072
 nilfs_segctor_construct+0x302/0x1040 fs/nilfs2/segment.c:2377
 nilfs_segctor_thread_construct+0xd6/0x840 fs/nilfs2/segment.c:2485
 nilfs_segctor_thread+0xf35/0x13c0 fs/nilfs2/segment.c:2568
 kthread+0x51c/0x560 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Uninit was stored to memory at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
 kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:289
 kmsan_memcpy_memmove_metadata+0x25e/0x2d0 mm/kmsan/kmsan.c:226
 kmsan_memcpy_metadata+0xb/0x10 mm/kmsan/kmsan.c:246
 __msan_memcpy+0x46/0x60 mm/kmsan/kmsan_instr.c:110
 nilfs_write_dat_node_binfo+0x17a/0x370 fs/nilfs2/segment.c:663
 nilfs_segctor_update_payload_blocknr fs/nilfs2/segment.c:1602 [inline]
 nilfs_segctor_assign fs/nilfs2/segment.c:1625 [inline]
 nilfs_segctor_do_construct+0x4b00/0xbf20 fs/nilfs2/segment.c:2052
 nilfs_segctor_construct+0x302/0x1040 fs/nilfs2/segment.c:2377
 nilfs_segctor_thread_construct+0xd6/0x840 fs/nilfs2/segment.c:2485
 nilfs_segctor_thread+0xf35/0x13c0 fs/nilfs2/segment.c:2568
 kthread+0x51c/0x560 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Local variable binfo.i.i@nilfs_segctor_do_construct created at:
 nilfs_segctor_update_payload_blocknr fs/nilfs2/segment.c:1558 [inline]
 nilfs_segctor_assign fs/nilfs2/segment.c:1625 [inline]
 nilfs_segctor_do_construct+0x3deb/0xbf20 fs/nilfs2/segment.c:2052
 nilfs_segctor_update_payload_blocknr fs/nilfs2/segment.c:1558 [inline]
 nilfs_segctor_assign fs/nilfs2/segment.c:1625 [inline]
 nilfs_segctor_do_construct+0x3deb/0xbf20 fs/nilfs2/segment.c:2052
=


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

KMSAN: uninit-value in reiserfs_new_inode

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:73d62e81 kmsan: random: prevent boot-time reports in _mix_..
git tree:   https://github.com/google/kmsan.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=12a62e6f50
kernel config:  https://syzkaller.appspot.com/x/.config?x=31d3b433c9628854
dashboard link: https://syzkaller.appspot.com/bug?extid=2a318f14e5e6bb69b96b
compiler:   clang version 11.0.1
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2a318f14e5e6bb69b...@syzkaller.appspotmail.com

=
BUG: KMSAN: uninit-value in reiserfs_new_inode+0x207c/0x3c30 
fs/reiserfs/inode.c:2058
CPU: 0 PID: 8539 Comm: syz-executor.0 Not tainted 5.10.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x21c/0x280 lib/dump_stack.c:118
 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
 __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197
 reiserfs_new_inode+0x207c/0x3c30 fs/reiserfs/inode.c:2058
 reiserfs_create+0x89b/0xf00 fs/reiserfs/namei.c:667
 xattr_create fs/reiserfs/xattr.c:69 [inline]
 xattr_lookup+0x495/0x6a0 fs/reiserfs/xattr.c:412
 reiserfs_xattr_set_handle+0x1eb/0x2ab0 fs/reiserfs/xattr.c:540
 reiserfs_xattr_set+0x84d/0x9f0 fs/reiserfs/xattr.c:640
 trusted_set+0x1ea/0x260 fs/reiserfs/xattr_trusted.c:30
 __vfs_setxattr+0x90e/0x960 fs/xattr.c:177
 __vfs_setxattr_noperm+0x376/0xc70 fs/xattr.c:208
 __vfs_setxattr_locked+0x5ed/0x690 fs/xattr.c:266
 vfs_setxattr+0x1e4/0x4d0 fs/xattr.c:283
 setxattr+0x446/0x900 fs/xattr.c:548
 path_setxattr+0x2cd/0x4e0 fs/xattr.c:567
 __do_sys_setxattr fs/xattr.c:582 [inline]
 __se_sys_setxattr+0xee/0x110 fs/xattr.c:578
 __ia32_sys_setxattr+0x62/0x80 fs/xattr.c:578
 do_syscall_32_irqs_on arch/x86/entry/common.c:80 [inline]
 __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:139
 do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:162
 do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:205
 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
RIP: 0023:0xf7fa5549
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 
00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 
eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 002b:f555d0cc EFLAGS: 0296 ORIG_RAX: 00e2
RAX: ffda RBX: 21c0 RCX: 2280
RDX: 22c0 RSI: 0001 RDI: 0003
RBP:  R08:  R09: 
R10:  R11:  R12: 
R13:  R14:  R15: 

Uninit was created at:
 kmsan_save_stack_with_flags+0x3c/0x90 mm/kmsan/kmsan.c:121
 kmsan_alloc_page+0xd3/0x1f0 mm/kmsan/kmsan_shadow.c:274
 __alloc_pages_nodemask+0x827/0xf90 mm/page_alloc.c:4989
 alloc_pages_current+0x7b6/0xb60 mm/mempolicy.c:2271
 alloc_pages include/linux/gfp.h:547 [inline]
 alloc_slab_page mm/slub.c:1630 [inline]
 allocate_slab+0x346/0x11a0 mm/slub.c:1773
 new_slab mm/slub.c:1834 [inline]
 new_slab_objects mm/slub.c:2593 [inline]
 ___slab_alloc+0xd42/0x1930 mm/slub.c:2756
 __slab_alloc mm/slub.c:2796 [inline]
 slab_alloc_node mm/slub.c:2871 [inline]
 slab_alloc mm/slub.c:2915 [inline]
 kmem_cache_alloc+0xb71/0x1040 mm/slub.c:2920
 reiserfs_alloc_inode+0x5a/0x170 fs/reiserfs/super.c:642
 alloc_inode fs/inode.c:234 [inline]
 iget5_locked+0x1d7/0x990 fs/inode.c:1150
 reiserfs_fill_super+0x29a5/0x6010 fs/reiserfs/super.c:2063
 mount_bdev+0x618/0x900 fs/super.c:1419
 get_super_block+0xc9/0xe0 fs/reiserfs/super.c:2606
 legacy_get_tree+0x163/0x2e0 fs/fs_context.c:592
 vfs_get_tree+0xd8/0x5e0 fs/super.c:1549
 do_new_mount fs/namespace.c:2875 [inline]
 path_mount+0x3df0/0x5e50 fs/namespace.c:3205
 do_mount fs/namespace.c:3218 [inline]
 __do_sys_mount fs/namespace.c:3426 [inline]
 __se_sys_mount+0x921/0xa10 fs/namespace.c:3403
 __ia32_sys_mount+0x62/0x80 fs/namespace.c:3403
 do_syscall_32_irqs_on arch/x86/entry/common.c:80 [inline]
 __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:139
 do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:162
 do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:205
 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
=


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Re: [PATCH v2 3/3] arm64: dts: mediatek: mt8183: Add domain supply for mfg

2021-02-01 Thread Matthias Brugger

On 31/01/2021 13:05, Matthias Brugger wrote:
> 
> 
> On 29/01/2021 11:12, Hsin-Yi Wang wrote:
>> Add domain supply node.
>>
>> Signed-off-by: Hsin-Yi Wang 
>> ---
> 
> Applied to v5.11-next/dts64
> 

I just realized that we will also need a patch for the MT8183 EVB. I'll leave
this series in, but please provide a follow-up patch for the dts.

Thanks.
Matthias

> Thanks
> 
>>  arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi | 4 
>>  arch/arm64/boot/dts/mediatek/mt8183.dtsi   | 2 +-
>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi 
>> b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
>> index bf2ad1294dd30..ebd53755d538a 100644
>> --- a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
>> +++ b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
>> @@ -709,6 +709,10 @@ cros_ec {
>>  };
>>  };
>>  
>> +&mfg {
>> +domain-supply = <&mt6358_vgpu_reg>;
>> +};
>> +
>>  &soc_data {
>>  status = "okay";
>>  };
>> diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi 
>> b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
>> index 5b782a4769e7e..bda283fa92452 100644
>> --- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
>> +++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
>> @@ -360,7 +360,7 @@ power-domain@MT8183_POWER_DOMAIN_MFG_ASYNC {
>>  #size-cells = <0>;
>>  #power-domain-cells = <1>;
>>  
>> -power-domain@MT8183_POWER_DOMAIN_MFG {
>> +mfg: power-domain@MT8183_POWER_DOMAIN_MFG {
>>  reg = ;
>>  #address-cells = <1>;
>>  #size-cells = <0>;
>>

Re: [PATCH v2] KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off

2021-02-01 Thread Paolo Bonzini


On 01/02/21 09:46, Paolo Bonzini wrote:


This comment should be updated to call out the new TSX_CTRL behavior.

/*
 * On TAA affected systems:
 *  - nothing to do if TSX is disabled on the host.
 *  - we emulate TSX_CTRL if present on the host.
 *  This lets the guest use VERW to clear CPU buffers.
 */


Ok.


Hmm, but the comment is even more accurate now than before, isn't it? 
It said nothing about hiding TSX_CTRL, so now it matches the code below.


Paolo

Re: [PATCH 06/11] x86/fault: Improve kernel-executing-user-memory handling

2021-02-01 Thread Christoph Hellwig

On Sun, Jan 31, 2021 at 09:24:37AM -0800, Andy Lutomirski wrote:
>  #if defined(CONFIG_X86_64) && defined(CONFIG_CPU_SUP_AMD)
> + if (likely(boot_cpu_data.x86_vendor != X86_VENDOR_AMD
> +|| boot_cpu_data.x86 != 0xf))

Same nitpick as for the other patch.  Maybe we want a little inline
helper for the specific erratum that includes the vendor and family
checks in addition to using IS_ENABLED for the config options?
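
A minimal userspace sketch of the kind of helper suggested here, not the
actual kernel change. The struct, the vendor constant and the SKETCH_CONFIG_*
macros are hypothetical stand-ins for boot_cpu_data, X86_VENDOR_AMD and
IS_ENABLED(CONFIG_...); only the vendor and family 0xf tests are taken from
the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-ins so the sketch compiles outside the kernel. */
#define SKETCH_VENDOR_AMD          2   /* assumed value, illustration only */
#define SKETCH_CONFIG_X86_64       1   /* would be IS_ENABLED(CONFIG_X86_64) */
#define SKETCH_CONFIG_CPU_SUP_AMD  1   /* would be IS_ENABLED(CONFIG_CPU_SUP_AMD) */

struct cpuinfo_sketch {
	unsigned char x86_vendor;   /* CPU vendor id */
	unsigned char x86;          /* CPU family */
};

/*
 * One helper that folds the config, vendor and family checks together,
 * so callers no longer need an #if block around an open-coded test.
 */
static inline bool cpu_may_have_erratum(const struct cpuinfo_sketch *c)
{
	if (!SKETCH_CONFIG_X86_64 || !SKETCH_CONFIG_CPU_SUP_AMD)
		return false;

	return c->x86_vendor == SKETCH_VENDOR_AMD && c->x86 == 0xf;
}
```

With constant config macros the compiler can drop the whole body when the
option is off, which is the usual argument for IS_ENABLED over #ifdef.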

linux-next: qemu boot failure after merge of the tip tree

2021-02-01 Thread Stephen Rothwell

Hi all,

After merging the tip tree, today's linux-next qemu boot test (powerpc
pseries_le_defconfig) failed like this:

[0.005355][T1] smp: Brought up 1 node, 1 CPU
[0.005415][T1] numa: Node 0 CPUs: 0
[0.005496][T1] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[0.005559][T1] Faulting instruction address: 0x
[0.005613][T1] Oops: Kernel access of bad area, sig: 11 [#1]
[0.005665][T1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[0.005719][T1] Modules linked in:
[0.005754][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.11.0-rc6 #2
[0.005808][T1] NIP:   LR: c01a22ac CTR: 
0001
[0.005870][T1] REGS: c63a3860 TRAP: 0480   Not tainted  
(5.11.0-rc6)
[0.005933][T1] MSR:  82009033   CR: 
24002242  XER: 2000
[0.006014][T1] CFAR: c01a22a8 IRQMASK: 0 
[0.006014][T1] GPR00: c01a21ac c63a3b00 
c1439400  
[0.006014][T1] GPR04:  00c4 
0001 c1509400 
[0.006014][T1] GPR08:  c11f5af0 
7eaa 0001 
[0.006014][T1] GPR12: 0001 c161 
c6350f18 0001 
[0.006014][T1] GPR16: c1507bb0  
c12106b0 c146dce0 
[0.006014][T1] GPR20: c6054a90 0001 
 8ad0 
[0.006014][T1] GPR24: 8ad0 c6054a00 
 c6055000 
[0.006014][T1] GPR28:  c6350f00 
c6350f00 c1472380 
[0.006590][T1] NIP [] 0x0
[0.006633][T1] LR [c01a22ac] build_sched_domains+0x47c/0x1500
[0.006687][T1] Call Trace:
[0.006719][T1] [c63a3b00] [c01a21ac] 
build_sched_domains+0x37c/0x1500 (unreliable)
[0.006794][T1] [c63a3c40] [c01a42d0] 
sched_init_domains+0xe0/0x120
[0.006858][T1] [c63a3c90] [c1075f38] 
sched_init_smp+0x50/0xc4
[0.006922][T1] [c63a3cc0] [c10545a4] 
kernel_init_freeable+0x1d4/0x398
[0.006987][T1] [c63a3da0] [c0013144] 
kernel_init+0x2c/0x168
[0.007051][T1] [c63a3e10] [c000dff0] 
ret_from_kernel_thread+0x5c/0x6c
[0.007116][T1] Instruction dump:
[0.007150][T1]       
  
[0.007226][T1]       
  
[0.007310][T1] ---[ end trace e117133fa9cbc962 ]---

(full boot log attached)

Presumably caused by commit

  620a6dc40754 ("sched/topology: Make sched_init_numa() use a set for the 
deduplicating sort")

I note a similar report from the kernel test robot on LKML.

I have reverted that commit for today (which fixed the boot failure).

-- 
Cheers,
Stephen Rothwell
spawn qemu-system-ppc64 -M pseries -m 2G -vga none -nographic -enable-kvm 
-kernel /home/sfr/next/powerpc_pseries_le_defconfig/vmlinux -initrd 
./ppc64le-rootfs.cpio.gz
KVM: Failed to create TCE64 table for liobn 0x7102
KVM: Failed to create TCE64 table for liobn 0x7103
KVM: Failed to create TCE64 table for liobn 0x8000


SLOF
**
QEMU Starting
 Build Date = Dec 29 2020 12:07:03
 FW Version = release 20200717
 Press "s" to enter Open Firmware.

CC0100C0120C0140C0200C0240C0260C02E0C0300C0320C0340C0360C0370C0380C0371C0373C0374C03F0C0400C0480C04C0C04D0C0500Populating /vdevice methods
Populating /vdevice/vty@7100
Populating /vdevice/nvram@7101
Populating /vdevice/l-lan@7102
Populating /vdevice/v-scsi@7103
   SCSI: Looking for devices
  8200 CD-ROM   : "QEMU QEMU CD-ROM  2.5+"
C05A0Populating /pci@8002000
C0600C06C0C0700C0800C0880No NVRAM common partition, re-initializing...
C0890C08A0C08A8C08B0Scanning USB 
C08C0C08D0Using default console: /vdevice/vty@7100
C08E0C08E8Detected RAM kernel at 40 (160a5a8 bytes) 
C08FF 
  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
OF stdout device is: /vdevice/vty@7100
Preparing to boot Linux version 5.11.0-rc6 (sfr@ash) (gcc (Debian 10.2.1-3) 
10.2.1 20201224, GNU ld (GNU Binutils for Debian) 2.35.1) #2 SMP Mon Feb 1 
19:12:28 AEDT 2021
Detected machine type: 0101
command line:  
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 00

Re: [PATCH V5 3/4] s390/mm: Define arch_get_mappable_range()

2021-02-01 Thread David Hildenbrand


On 01.02.21 04:25, Anshuman Khandual wrote:

This overrides arch_get_mappable_range() on s390 platform which will be
used with recently added generic framework. It modifies the existing range
check in vmem_add_mapping() using arch_get_mappable_range(). It also adds a
VM_BUG_ON() check that would ensure that mhp_range_allowed() has already
been called on the hotplug path.

Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: David Hildenbrand 
Cc: linux-s...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Heiko Carstens 
Signed-off-by: Anshuman Khandual 
---
  arch/s390/mm/init.c |  1 +
  arch/s390/mm/vmem.c | 14 +-
  2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 73a163065b95..0e76b2127dc6 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -297,6 +297,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
return -EINVAL;
  
+	VM_BUG_ON(!mhp_range_allowed(start, size, true));

rc = vmem_add_mapping(start, size);
if (rc)
return rc;
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index 01f3a5f58e64..82dbf9450105 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -4,6 +4,7 @@
   *Author(s): Heiko Carstens 
   */
  
+#include 

  #include 
  #include 
  #include 
@@ -532,11 +533,22 @@ void vmem_remove_mapping(unsigned long start, unsigned 
long size)
mutex_unlock(&vmem_mutex);
  }
  
+struct range arch_get_mappable_range(void)

+{
+   struct range mhp_range;
+
+   mhp_range.start = 0;
+   mhp_range.end =  VMEM_MAX_PHYS - 1;
+   return mhp_range;
+}
+
  int vmem_add_mapping(unsigned long start, unsigned long size)
  {
+   struct range range = arch_get_mappable_range();
int ret;
  
-	if (start + size > VMEM_MAX_PHYS ||

+   if (start < range.start ||
+   start + size > range.end + 1 ||
start + size < start)
return -ERANGE;
  



Reviewed-by: David Hildenbrand 

--
Thanks,

David / dhildenb
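
For readers following the range check in the vmem_add_mapping() hunk above,
here is a small userspace sketch of the same three tests. The struct and
function names are hypothetical; the inclusive 'end' mirrors how the patch
sets mhp_range.end to VMEM_MAX_PHYS - 1, and the sketch assumes 'end' is
below the maximum unsigned long so that end + 1 does not wrap.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct range; 'end' is inclusive. */
struct range_sketch {
	unsigned long start;
	unsigned long end;
};

/*
 * Returns true when [start, start + size) lies inside the mappable range.
 * The wraparound test comes first so the later comparisons only ever see
 * a sum that did not overflow.
 */
static bool mapping_in_range(struct range_sketch r,
			     unsigned long start, unsigned long size)
{
	if (start + size < start)       /* unsigned wraparound */
		return false;
	if (start < r.start)            /* begins below the range */
		return false;
	if (start + size > r.end + 1)   /* ends past the inclusive end */
		return false;
	return true;
}
```

A request that ends exactly at end + 1 is accepted, which matches the
original "start + size > VMEM_MAX_PHYS" test that the patch replaces.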

Re: WARNING in sta_info_insert_check

2021-02-01 Thread Johannes Berg

On Sun, 2021-01-31 at 21:26 -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:bec4c296 Merge tag 'ecryptfs-5.11-rc6-setxattr-fix' of git..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=11991778d0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f75d66d6d359ef2f
> dashboard link: https://syzkaller.appspot.com/bug?extid=8dcc087eb24227ded47e
> userspace arch: arm64
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+8dcc087eb24227ded...@syzkaller.appspotmail.com

Looks like this is a dup.

#syz dup: WARNING in sta_info_insert_rcu

Just in this case sta_info_insert_check() didn't get inlined into
sta_info_insert_rcu().

johannes

Re: [PATCH 09/11] x86/fault: Rename no_context() to kernelmode_fixup_or_oops()

2021-02-01 Thread Christoph Hellwig

On Sun, Jan 31, 2021 at 09:24:40AM -0800, Andy Lutomirski wrote:
> + kernelmode_fixup_or_oops(regs, error_code, address, pkey, 
> si_code);

>   if (!user_mode(regs)) {
> - no_context(regs, error_code, address, SIGBUS, BUS_ADRERR);
> + kernelmode_fixup_or_oops(regs, error_code, address, SIGBUS, 
> BUS_ADRERR);

These overly long lines are a little annoying..

Re: [PATCH 2/2] soc: mediatek: pm-domains: Add domain_supply cap for mfg_async PD

2021-02-01 Thread Matthias Brugger




On 01/02/2021 06:45, Bilal Wasim wrote:
> The mfg_async power domain in mt8173 is used to power up imgtec
> gpu. This domain requires the da9211 regulator to be enabled before
> the power domain can be enabled successfully.
> 
> Signed-off-by: Bilal Wasim 
> ---
>  drivers/soc/mediatek/mt8173-pm-domains.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/soc/mediatek/mt8173-pm-domains.h 
> b/drivers/soc/mediatek/mt8173-pm-domains.h
> index 3e8ee5dabb43..065b8195e7d6 100644
> --- a/drivers/soc/mediatek/mt8173-pm-domains.h
> +++ b/drivers/soc/mediatek/mt8173-pm-domains.h
> @@ -63,6 +63,7 @@ static const struct scpsys_domain_data 
> scpsys_domain_data_mt8173[] = {
>   .ctl_offs = SPM_MFG_ASYNC_PWR_CON,
>   .sram_pdn_bits = GENMASK(11, 8),
>   .sram_pdn_ack_bits = 0,
> + .caps = MTK_SCPD_DOMAIN_SUPPLY,
>   },
>   [MT8173_POWER_DOMAIN_MFG_2D] = {
>   .sta_mask = PWR_STATUS_MFG_2D,
> 

We are missing a third patch for the DTS to actually add the regulator. Please
provide it for both mt8173-evb.dts and mt8173-elm.dts.

Thanks a lot and I'm very happy to see you starting to contribute!

Regards,
Matthias

linux-next: Tree for Feb 1

2021-02-01 Thread Stephen Rothwell

Hi all,

Changes since 20210129:

Removed tree: ia64 (deprecated with maintainer's permission)

The drm tree gained a conflict against Linus' tree.

The drm-misc tree gained a build failure so I used the version from
next-20210129.

The tip tree gained a boot failure so I reverted a commit.

The irqchip tree gained a conflict against the sunxi tree.

Non-merge commits (relative to Linus' tree): 6821
 7529 files changed, 276976 insertions(+), 214054 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig and htmldocs. And finally, a simple boot test
of the powerpc pseries_le_defconfig kernel in qemu (with and without
kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 332 trees (counting Linus' and 86 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (ac8c6edd20bc Merge tag 'efi-urgent-for-v5.11' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
Merging fixes/fixes (e71ba9452f0b Linux 5.11-rc2)
Merging kbuild-current/fixes (ed4e9e615b7e Documentation/llvm: Add a section 
about supported architectures)
Merging arc-current/for-curr (7c53f6b671f4 Linux 5.11-rc3)
Merging arm-current/fixes (d80cd9abcd94 ARM: decompressor: tidy up register 
usage)
Merging arm64-fixes/for-next/fixes (a1df829ead58 ACPI/IORT: Do not blindly 
trust DMA masks from firmware)
Merging arm-soc-fixes/arm/fixes (e2fc2de8e1aa Merge tag 'amlogic-fixes-3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into 
arm/fixes)
Merging drivers-memory-fixes/fixes (5c8fe583cce5 Linux 5.11-rc1)
Merging m68k-current/for-linus (2ae92e8b9b7e MAINTAINERS: Update m68k Mac entry)
Merging powerpc-fixes/fixes (66f0a9e058fa powerpc/vdso64: remove meaningless 
vgettimeofday.o build rule)
Merging s390-fixes/fixes (e82080e1f456 s390: uv: Fix sysfs max number of VCPUs 
reporting)
Merging sparc/master (0a95a6d1a4cd sparc: use for_each_child_of_node() macro)
Merging fscrypt-current/for-stable (d19d8d345eec fscrypt: fix inline encryption 
not used on new files)
Merging net/master (eb4e8fac00d1 neighbour: Prevent a dead entry from updating 
gc_list)
Merging bpf/master (06cc6e5dc659 Merge 
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf)
Merging ipsec/master (da64ae2d35d3 xfrm: Fix wraparound in 
xfrm_policy_addr_delta())
Merging netfilter/master (44a674d6f798 Merge tag 'mlx5-fixes-2021-01-26' of 
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux)
Merging ipvs/master (44a674d6f798 Merge tag 'mlx5-fixes-2021-01-26' of 
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux)
Merging wireless-drivers/master (93a1d4791c10 mt76: dma: fix a possible memory 
leak in mt76_add_fragment())
Merging mac80211/master (44a674d6f798 Merge tag 'mlx5-fixes-2021-01-26' of 
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux)
Merging rdma-fixes/for-rc (f1b0a8ea9f12 Revert "RDMA/rxe: Remove VLAN code 
leftovers from RXE")
Merging sound-current/for-linus (4961167bf748 ALSA: hda/via: Apply the 
workaround generically for Clevo machines)
Merging sound-asoc-fixes/for-linus (87277d99081a Merge remote-tracking branch 
'asoc/for-5.11' into asoc-linus)
Merging regmap-fixes/for-linus (19c329f68089 Linux 5.11-rc4)
Merging regulator-fixes/for-linus (b96353f3607a Merge remote-tracking branch 
'regulator/for-5.11' into regulator-linus)
Merging spi-fixes/for-linus (3277f2e72f86 Merge remote-tracking branch 
'spi/for-5.11' into spi-linus)
Merging pci-current/for-linus (7e69d07d7c3c Revert "PCI/ASPM: Save/restore L1

UBSAN: shift-out-of-bounds in ext4_mb_init

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:6642d600 Merge tag '5.11-rc5-smb3' of git://git.samba.org/..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16f064acd0
kernel config:  https://syzkaller.appspot.com/x/.config?x=9408d1770a50819c
dashboard link: https://syzkaller.appspot.com/bug?extid=a8b4b0c60155e87e9484
compiler:   clang version 11.0.1
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=127be3d8d0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1065a9e8d0

The issue was bisected to:

commit cfd73237722135807967f389bcbda558a60a30d6
Author: Alex Zhuravlev 
Date:   Tue Apr 21 07:54:07 2020 +

ext4: add prefetching for block allocation bitmaps

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13b5c1d8d0
final oops: https://syzkaller.appspot.com/x/report.txt?x=1075c1d8d0
console output: https://syzkaller.appspot.com/x/log.txt?x=17b5c1d8d0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a8b4b0c60155e87e9...@syzkaller.appspotmail.com
Fixes: cfd732377221 ("ext4: add prefetching for block allocation bitmaps")

loop0: detected capacity change from 264192 to 0

UBSAN: shift-out-of-bounds in fs/ext4/mballoc.c:2713:24
shift exponent 60 is too large for 32-bit type 'int'
CPU: 1 PID: 8433 Comm: syz-executor484 Not tainted 5.11.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x137/0x1be lib/dump_stack.c:120
 ubsan_epilogue lib/ubsan.c:148 [inline]
 __ubsan_handle_shift_out_of_bounds+0x432/0x4d0 lib/ubsan.c:395
 ext4_mb_init_backend fs/ext4/mballoc.c:2713 [inline]
 ext4_mb_init+0x19bc/0x19f0 fs/ext4/mballoc.c:2898
 ext4_fill_super+0xc2ec/0xfbe0 fs/ext4/super.c:4983
 mount_bdev+0x26c/0x3a0 fs/super.c:1366
 legacy_get_tree+0xea/0x180 fs/fs_context.c:592
 vfs_get_tree+0x86/0x270 fs/super.c:1496
 do_new_mount fs/namespace.c:2881 [inline]
 path_mount+0x17ad/0x2a00 fs/namespace.c:3211
 do_mount fs/namespace.c:3224 [inline]
 __do_sys_mount fs/namespace.c:3432 [inline]
 __se_sys_mount+0x28c/0x320 fs/namespace.c:3409
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x44710a
Code: b8 08 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 fd ad fb ff c3 66 2e 0f 1f 
84 00 00 00 00 00 66 90 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 
da ad fb ff c3 66 0f 1f 84 00 00 00 00 00
RSP: 002b:7ffc5b95ff48 EFLAGS: 0206 ORIG_RAX: 00a5
RAX: ffda RBX: 7ffc5b95ffa0 RCX: 0044710a
RDX: 2000 RSI: 2180 RDI: 7ffc5b95ff60
RBP: 7ffc5b95ff60 R08: 7ffc5b95ffa0 R09: 
R10:  R11: 0206 R12: 0013
R13: 0004 R14: 0003 R15: 0003
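
The message "shift exponent 60 is too large for 32-bit type 'int'" means a
32-bit value was shifted by a count that is not smaller than its width,
which C leaves undefined. The sketch below shows one generic way to guard
such a shift; it is an illustration with hypothetical names, not the
expression at fs/ext4/mballoc.c:2713 and not a proposed fix for it.

```c
#include <stdbool.h>
#include <limits.h>

/*
 * Compute 1 << exponent in an unsigned int, refusing exponents that are
 * not smaller than the type's width. Returns false and leaves *out
 * untouched when the shift would be undefined.
 */
static bool checked_pow2(unsigned int exponent, unsigned int *out)
{
	if (exponent >= sizeof(unsigned int) * CHAR_BIT)
		return false;

	*out = 1U << exponent;
	return true;
}
```

A caller that gets the exponent from untrusted on-disk data could treat a
false return as a corrupted field and reject the input.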



---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Re: [PATCH v3 2/2] mm: fix initialization of struct page for holes in memory layout

2021-02-01 Thread David Hildenbrand


On 11.01.21 20:40, Mike Rapoport wrote:

From: Mike Rapoport 

There could be struct pages that are not backed by actual physical memory.
This can happen when the actual memory bank is not a multiple of
SECTION_SIZE or when an architecture does not register memory holes
reserved by the firmware as memblock.memory.

Such pages are currently initialized using the init_unavailable_mem() function
that iterates through PFNs in holes in memblock.memory and if there is a
struct page corresponding to a PFN, the fields of this page are set to
default values and the page is marked as Reserved.

init_unavailable_mem() does not take into account zone and node the page
belongs to and sets both zone and node links in struct page to zero.

On a system that has firmware reserved holes in a zone above ZONE_DMA, for
instance in a configuration below:

# grep -A1 E820 /proc/iomem
7a17b000-7a216fff : Unknown E820 type
7a217000-7bff : System RAM

unset zone link in struct page will trigger

VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page);

because there are pages in both ZONE_DMA32 and ZONE_DMA (unset zone link in
struct page) in the same pageblock.

Update init_unavailable_mem() to use zone constraints defined by an
architecture to properly setup the zone link and use node ID of the
adjacent range in memblock.memory to set the node link.
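
The zone-constraint step described above amounts to clamping a hole's PFN
range to the bounds of one zone before initializing its struct pages. A
minimal userspace sketch follows; the helper and struct names are
hypothetical, and the zone bounds stand in for the
arch_zone_lowest_possible_pfn[] and arch_zone_highest_possible_pfn[]
entries used in the diff below.

```c
/* Half-open PFN span [start, end). */
struct pfn_span {
	unsigned long start;
	unsigned long end;
};

static unsigned long clamp_ul(unsigned long v, unsigned long lo,
			      unsigned long hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Restrict the hole [spfn, epfn) to the zone [zone_spfn, zone_epfn).
 * If the hole lies entirely outside the zone, both ends clamp to the
 * same bound and the resulting span is empty.
 */
static struct pfn_span clamp_hole_to_zone(unsigned long spfn,
					  unsigned long epfn,
					  unsigned long zone_spfn,
					  unsigned long zone_epfn)
{
	struct pfn_span s;

	s.start = clamp_ul(spfn, zone_spfn, zone_epfn);
	s.end = clamp_ul(epfn, zone_spfn, zone_epfn);
	return s;
}
```

Calling this once per zone for the same hole splits the hole at zone
boundaries, so each page can be given the zone it actually falls in.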

Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that 
check each PFN")
Reported-by: Andrea Arcangeli 
Signed-off-by: Mike Rapoport 
---
  mm/page_alloc.c | 84 +
  1 file changed, 50 insertions(+), 34 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bdbec4c98173..0b56c3ca354e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7077,23 +7077,26 @@ void __init free_area_init_memoryless_node(int nid)
   * Initialize all valid struct pages in the range [spfn, epfn) and mark them
   * PageReserved(). Return the number of struct pages that were initialized.
   */
-static u64 __init init_unavailable_range(unsigned long spfn, unsigned long 
epfn)
+static u64 __init init_unavailable_range(unsigned long spfn, unsigned long 
epfn,
+int zone, int nid)
  {
-   unsigned long pfn;
+   unsigned long pfn, zone_spfn, zone_epfn;
u64 pgcnt = 0;
  
+	zone_spfn = arch_zone_lowest_possible_pfn[zone];

+   zone_epfn = arch_zone_highest_possible_pfn[zone];
+
+   spfn = clamp(spfn, zone_spfn, zone_epfn);
+   epfn = clamp(epfn, zone_spfn, zone_epfn);
+
for (pfn = spfn; pfn < epfn; pfn++) {
if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) {
pfn = ALIGN_DOWN(pfn, pageblock_nr_pages)
+ pageblock_nr_pages - 1;
continue;
}
-   /*
-* Use a fake node/zone (0) for now. Some of these pages
-* (in memblock.reserved but not in memblock.memory) will
-* get re-initialized via reserve_bootmem_region() later.
-*/
-   __init_single_page(pfn_to_page(pfn), pfn, 0, 0);
+
+   __init_single_page(pfn_to_page(pfn), pfn, zone, nid);
__SetPageReserved(pfn_to_page(pfn));
pgcnt++;
}
@@ -7102,51 +7105,64 @@ static u64 __init init_unavailable_range(unsigned long 
spfn, unsigned long epfn)
  }
  
  /*

- * Only struct pages that are backed by physical memory are zeroed and
- * initialized by going through __init_single_page(). But, there are some
- * struct pages which are reserved in memblock allocator and their fields
- * may be accessed (for example page_to_pfn() on some configuration accesses
- * flags). We must explicitly initialize those struct pages.
+ * Only struct pages that correspond to ranges defined by memblock.memory
+ * are zeroed and initialized by going through __init_single_page() during
+ * memmap_init().
+ *
+ * But, there could be struct pages that correspond to holes in
+ * memblock.memory. This can happen because of the following reasons:
+ * - physical memory bank size is not necessarily the exact multiple of the
+ *   arbitrary section size
+ * - early reserved memory may not be listed in memblock.memory
+ * - memory layouts defined with memmap= kernel parameter may not align
+ *   nicely with memmap sections
   *
- * This function also addresses a similar issue where struct pages are left
- * uninitialized because the physical address range is not covered by
- * memblock.memory or memblock.reserved. That could happen when memblock
- * layout is manually configured via memmap=, or when the highest physical
- * address (max_pfn) does not end on a section boundary.
+ * Explicitly initialize those struct pages so that:
+ * - PG_Reserved is set
+ * - zone link is set according to the architecture constraints
+ * - node is set to node id of the next populated regio

INFO: task hung in rsvp_delete_filter_work

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:14e8e0f6 tcp: shrink inet_connection_sock icsk_mtup enable..
git tree:   net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=114362c4d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=ac6e76902c1abb76
dashboard link: https://syzkaller.appspot.com/bug?extid=a2ec7a7fb2331091aecf
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=114d33d8d0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12610ac4d0

The issue was bisected to:

commit 0fedc63fadf0404a729e73a35349481c8009c02f
Author: Cong Wang 
Date:   Wed Sep 23 03:56:24 2020 +

net_sched: commit action insertions together

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1065c1d8d0
final oops: https://syzkaller.appspot.com/x/report.txt?x=1265c1d8d0
console output: https://syzkaller.appspot.com/x/log.txt?x=1465c1d8d0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a2ec7a7fb2331091a...@syzkaller.appspotmail.com
Fixes: 0fedc63fadf0 ("net_sched: commit action insertions together")

INFO: task kworker/u4:0:8 blocked for more than 143 seconds.
  Not tainted 5.11.0-rc5-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:0state:D stack:23440 pid:8 ppid: 2 flags:0x4000
Workqueue: tc_filter_workqueue rsvp_delete_filter_work
Call Trace:
 context_switch kernel/sched/core.c:4327 [inline]
 __schedule+0x90c/0x21a0 kernel/sched/core.c:5078
 schedule+0xcf/0x270 kernel/sched/core.c:5157
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:5216
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x81a/0x1110 kernel/locking/mutex.c:1103
 rsvp_delete_filter_work+0xe/0x20 net/sched/cls_rsvp.h:293
 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
INFO: task kworker/0:3:3217 blocked for more than 143 seconds.
  Not tainted 5.11.0-rc5-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/0:3 state:D stack:26720 pid: 3217 ppid: 2 flags:0x4000
Workqueue: ipv6_addrconf addrconf_verify_work
Call Trace:
 context_switch kernel/sched/core.c:4327 [inline]
 __schedule+0x90c/0x21a0 kernel/sched/core.c:5078
 schedule+0xcf/0x270 kernel/sched/core.c:5157
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:5216
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x81a/0x1110 kernel/locking/mutex.c:1103
 addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4572
 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Showing all locks held in the system:
3 locks held by kworker/u4:0/8:
 #0: 88814156a938 ((wq_completion)tc_filter_workqueue){+.+.}-{0:0}, at: 
arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: 88814156a938 ((wq_completion)tc_filter_workqueue){+.+.}-{0:0}, at: 
atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: 88814156a938 ((wq_completion)tc_filter_workqueue){+.+.}-{0:0}, at: 
atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: 88814156a938 ((wq_completion)tc_filter_workqueue){+.+.}-{0:0}, at: 
set_work_data kernel/workqueue.c:616 [inline]
 #0: 88814156a938 ((wq_completion)tc_filter_workqueue){+.+.}-{0:0}, at: 
set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: 88814156a938 ((wq_completion)tc_filter_workqueue){+.+.}-{0:0}, at: 
process_one_work+0x871/0x15f0 kernel/workqueue.c:2246
 #1: c9cd7da8 ((work_completion)(&(rwork)->work)){+.+.}-{0:0}, at: 
process_one_work+0x8a5/0x15f0 kernel/workqueue.c:2250
 #2: 8ca5a488 (rtnl_mutex){+.+.}-{3:3}, at: 
rsvp_delete_filter_work+0xe/0x20 net/sched/cls_rsvp.h:293
1 lock held by khungtaskd/1659:
 #0: 8b373d20 (rcu_read_lock){}-{1:2}, at: 
debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6259
3 locks held by kworker/0:3/3217:
 #0: 8881472c5138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: 
arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: 8881472c5138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: 
atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: 8881472c5138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: 
atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: 8881472c5138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: 
set_work_data kernel/workqueue.c:616 [inline]
 #0: 8881472c5138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: 
set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: 8881472c5138 ((w

inconsistent lock state in io_dismantle_req

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:b01f250d Add linux-next specific files for 20210129
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=160cda90d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=725bc96dc234fda7
dashboard link: https://syzkaller.appspot.com/bug?extid=81d17233a2b02eafba33
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=14f8a330d0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12c10440d0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+81d17233a2b02eafb...@syzkaller.appspotmail.com


WARNING: inconsistent lock state
5.11.0-rc5-next-20210129-syzkaller #0 Not tainted

inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
syz-executor217/8450 [HC1[1]:SC0[0]:HE0:SE1] takes:
888023d6e620 (&fs->lock){?.+.}-{2:2}, at: spin_lock 
include/linux/spinlock.h:354 [inline]
888023d6e620 (&fs->lock){?.+.}-{2:2}, at: io_req_clean_work 
fs/io_uring.c:1398 [inline]
888023d6e620 (&fs->lock){?.+.}-{2:2}, at: io_dismantle_req+0x66f/0xf60 
fs/io_uring.c:2029
{HARDIRQ-ON-W} state was registered at:
  lock_acquire kernel/locking/lockdep.c:5509 [inline]
  lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5474
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:354 [inline]
  set_fs_pwd+0x85/0x2a0 fs/fs_struct.c:39
  init_chdir+0xdf/0x127 fs/init.c:54
  devtmpfs_setup drivers/base/devtmpfs.c:418 [inline]
  devtmpfsd+0x76/0x333 drivers/base/devtmpfs.c:433
  kthread+0x3b1/0x4a0 kernel/kthread.c:292
  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
irq event stamp: 786
hardirqs last  enabled at (785): [] __raw_spin_unlock_irq 
include/linux/spinlock_api_smp.h:168 [inline]
hardirqs last  enabled at (785): [] 
_raw_spin_unlock_irq+0x1f/0x40 kernel/locking/spinlock.c:199
hardirqs last disabled at (786): [] 
sysvec_apic_timer_interrupt+0xc/0x100 arch/x86/kernel/apic/apic.c:1096
softirqs last  enabled at (664): [] read_pnet 
include/net/net_namespace.h:324 [inline]
softirqs last  enabled at (664): [] sock_net 
include/net/sock.h:2550 [inline]
softirqs last  enabled at (664): [] unix_create1+0x484/0x570 
net/unix/af_unix.c:814
softirqs last disabled at (662): [] unix_sockets_unbound 
net/unix/af_unix.c:133 [inline]
softirqs last disabled at (662): [] unix_create1+0x401/0x570 
net/unix/af_unix.c:808

other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(&fs->lock);
  
lock(&fs->lock);

 *** DEADLOCK ***

1 lock held by syz-executor217/8450:
 #0: 88802417c3e8 (&ctx->uring_lock){+.+.}-{3:3}, at: 
__do_sys_io_uring_enter+0x1071/0x1f30 fs/io_uring.c:9442

stack backtrace:
CPU: 1 PID: 8450 Comm: syz-executor217 Not tainted 
5.11.0-rc5-next-20210129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 print_usage_bug kernel/locking/lockdep.c:3806 [inline]
 valid_state kernel/locking/lockdep.c:3817 [inline]
 mark_lock_irq kernel/locking/lockdep.c:4020 [inline]
 mark_lock.cold+0x61/0x8e kernel/locking/lockdep.c:4477
 mark_usage kernel/locking/lockdep.c:4369 [inline]
 __lock_acquire+0x1468/0x54c0 kernel/locking/lockdep.c:4853
 lock_acquire kernel/locking/lockdep.c:5509 [inline]
 lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5474
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 io_req_clean_work fs/io_uring.c:1398 [inline]
 io_dismantle_req+0x66f/0xf60 fs/io_uring.c:2029
 __io_free_req+0x3d/0x2e0 fs/io_uring.c:2046
 io_free_req fs/io_uring.c:2269 [inline]
 io_double_put_req fs/io_uring.c:2392 [inline]
 io_put_req+0xf9/0x570 fs/io_uring.c:2388
 io_link_timeout_fn+0x30c/0x480 fs/io_uring.c:6497
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_interrupt+0x334/0x940 kernel/time/hrtimer.c:1645
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1085 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1102
 asm_call_irq_on_stack+0xf/0x20
 
 __run_sysvec_on_irqstack arch/x86/include/asm/irq_stack.h:37 [inline]
 run_sysvec_on_irqstack_cond arch/x86/include/asm/irq_stack.h:89 [inline]
 sysvec_apic_timer_interrupt+0xbd/0x100 arch/x86/kernel/apic/apic.c:1096
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:629
RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:169 [inline]
RIP: 0010:_raw_spin_unlock_irq+0x25/0x40 kernel/locking/spinlock.c:199
Code: 0f 1f 44 00 00 55 48 8b 74 24 08 48 89 fd

[PATCH 1/1]: turbostat: Fix Pkg Power on Zen

2021-02-01 Thread Kurt Garloff

commit 5d399d05df42ffcaa2b3836b580631c4024487a0
Author: Kurt Garloff 
Date:   Mon Feb 1 09:01:47 2021 +

    turbostat: Fix Pkg Power tracking on Zen
   
    AMD Zen processors use a different MSR (MSR_PKG_ENERGY_STAT) than Intel
    (MSR_PKG_ENERGY_STATUS) to track package power; however, we want to record
    it at the same offset in our package_data.
    offset_to_idx(), however, only recognized the Intel MSR, erroring
    out with -13 on Zen.
   
    With this fix, it will support the Zen MSR.
    Tested successfully on Ryzen 3000 & 5000.
   
    Signed-off-by: Kurt Garloff 

diff --git a/tools/power/x86/turbostat/turbostat.c 
b/tools/power/x86/turbostat/turbostat.c
index 389ea5209a83..cb830e73d899 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -325,6 +325,7 @@ int offset_to_idx(int offset)
 int idx;
 
 switch (offset) {
+    case MSR_PKG_ENERGY_STAT:
 case MSR_PKG_ENERGY_STATUS:
     idx = IDX_PKG_ENERGY;
     break;

-- 
Kurt Garloff 
Cologne, Germany
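
The fall-through added by the patch can be modelled in user space. This is only a sketch, not the turbostat source: the MSR numbers and the slot value IDX_PKG_ENERGY = 0 are assumptions for illustration and should be checked against turbostat.c, and offset_to_idx_sketch() is a hypothetical name. Only the -13 default mirrors the error described in the commit message.

```c
/* Assumed MSR numbers, for illustration only; verify against turbostat.c. */
#define MSR_PKG_ENERGY_STATUS 0x00000611u	/* Intel package energy counter */
#define MSR_PKG_ENERGY_STAT   0xc001029bu	/* AMD Zen package energy counter */

#define IDX_PKG_ENERGY 0	/* hypothetical slot number in package_data */

/* Map an MSR offset to a slot; both vendors share the package-energy slot. */
static int offset_to_idx_sketch(unsigned int offset)
{
	switch (offset) {
	case MSR_PKG_ENERGY_STAT:	/* Zen: the case the patch adds */
	case MSR_PKG_ENERGY_STATUS:	/* Intel: already handled */
		return IDX_PKG_ENERGY;
	default:
		return -13;	/* unknown MSR: what Zen hit before the fix */
	}
}
```

With the extra case label, both MSRs resolve to the same slot, so the rest of the accounting code does not need to know which vendor it runs on.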

possible deadlock in cfg80211_netdev_notifier_call

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:b01f250d Add linux-next specific files for 20210129
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14daa408d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=725bc96dc234fda7
dashboard link: https://syzkaller.appspot.com/bug?extid=2ae0ca9d7737ad1a62b7
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1757f2a0d0

The issue was bisected to:

commit cc9327f3b085ba5be5639a5ec3ce5b08a0f14a7c
Author: Mike Rapoport 
Date:   Thu Jan 28 07:42:40 2021 +

mm: introduce memfd_secret system call to create "secret" memory areas

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1505d28cd0
final oops: https://syzkaller.appspot.com/x/report.txt?x=1705d28cd0
console output: https://syzkaller.appspot.com/x/log.txt?x=1305d28cd0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2ae0ca9d7737ad1a6...@syzkaller.appspotmail.com
Fixes: cc9327f3b085 ("mm: introduce memfd_secret system call to create "secret" 
memory areas")


WARNING: possible recursive locking detected
5.11.0-rc5-next-20210129-syzkaller #0 Not tainted

syz-executor.1/27924 is trying to acquire lock:
88801c7305e8 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock 
include/net/cfg80211.h:5267 [inline]
88801c7305e8 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: 
cfg80211_netdev_notifier_call+0x68c/0x1180 net/wireless/core.c:1407

but task is already holding lock:
88801c7305e8 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock 
include/net/cfg80211.h:5267 [inline]
88801c7305e8 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: 
nl80211_pre_doit+0x347/0x5a0 net/wireless/nl80211.c:14837

other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(&rdev->wiphy.mtx);
  lock(&rdev->wiphy.mtx);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by syz-executor.1/27924:
 #0: 8cd04eb0 (cb_lock){}-{3:3}, at: genl_rcv+0x15/0x40 
net/netlink/genetlink.c:810
 #1: 8cc75248 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x22/0x5a0 
net/wireless/nl80211.c:14793
 #2: 88801c7305e8 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock 
include/net/cfg80211.h:5267 [inline]
 #2: 88801c7305e8 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: 
nl80211_pre_doit+0x347/0x5a0 net/wireless/nl80211.c:14837

stack backtrace:
CPU: 1 PID: 27924 Comm: syz-executor.1 Not tainted 
5.11.0-rc5-next-20210129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 print_deadlock_bug kernel/locking/lockdep.c:2829 [inline]
 check_deadlock kernel/locking/lockdep.c:2872 [inline]
 validate_chain kernel/locking/lockdep.c:3661 [inline]
 __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4899
 lock_acquire kernel/locking/lockdep.c:5509 [inline]
 lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5474
 __mutex_lock_common kernel/locking/mutex.c:956 [inline]
 __mutex_lock+0x134/0x1110 kernel/locking/mutex.c:1103
 wiphy_lock include/net/cfg80211.h:5267 [inline]
 cfg80211_netdev_notifier_call+0x68c/0x1180 net/wireless/core.c:1407
 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2040
 call_netdevice_notifiers_extack net/core/dev.c:2052 [inline]
 call_netdevice_notifiers net/core/dev.c:2066 [inline]
 unregister_netdevice_many+0x943/0x1750 net/core/dev.c:10704
 unregister_netdevice_queue+0x2dd/0x3c0 net/core/dev.c:10638
 register_netdevice+0x109f/0x14a0 net/core/dev.c:10013
 cfg80211_register_netdevice+0x11d/0x2a0 net/wireless/core.c:1349
 ieee80211_if_add+0xfb8/0x18f0 net/mac80211/iface.c:1990
 ieee80211_add_iface+0x99/0x160 net/mac80211/cfg.c:125
 rdev_add_virtual_intf net/wireless/rdev-ops.h:45 [inline]
 nl80211_new_interface+0x541/0x1100 net/wireless/nl80211.c:3977
 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
 genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:674
 sys_sendmsg+0x6e8/0x810 net/socket.c:2350
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2437
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45e219
Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00

possible deadlock in ovl_dir_real_file

2021-02-01 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:6642d600 Merge tag '5.11-rc5-smb3' of git://git.samba.org/..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=148aef78d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=9408d1770a50819c
dashboard link: https://syzkaller.appspot.com/bug?extid=6a023cb2262c79301432
compiler:   clang version 11.0.1

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+6a023cb2262c79301...@syzkaller.appspotmail.com


WARNING: possible recursive locking detected
5.11.0-rc5-syzkaller #0 Not tainted

syz-executor.2/3639 is trying to acquire lock:
888084c0b5f0 (&ovl_i_mutex_dir_key[depth]){}-{3:3}, at: inode_lock 
include/linux/fs.h:773 [inline]
888084c0b5f0 (&ovl_i_mutex_dir_key[depth]){}-{3:3}, at: 
ovl_dir_real_file+0x20b/0x310 fs/overlayfs/readdir.c:886

but task is already holding lock:
888084c0b5f0 (&ovl_i_mutex_dir_key[depth]){}-{3:3}, at: inode_lock 
include/linux/fs.h:773 [inline]
888084c0b5f0 (&ovl_i_mutex_dir_key[depth]){}-{3:3}, at: 
ovl_ioctl_set_flags fs/overlayfs/file.c:530 [inline]
888084c0b5f0 (&ovl_i_mutex_dir_key[depth]){}-{3:3}, at: 
ovl_ioctl+0x2fb/0x960 fs/overlayfs/file.c:569

other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(&ovl_i_mutex_dir_key[depth]);
  lock(&ovl_i_mutex_dir_key[depth]);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by syz-executor.2/3639:
 #0: 88807a706460 (sb_writers#17){.+.+}-{0:0}, at: 
mnt_want_write_file+0x5a/0x250 fs/namespace.c:412
 #1: 888084c0b5f0 (&ovl_i_mutex_dir_key[depth]){}-{3:3}, at: inode_lock 
include/linux/fs.h:773 [inline]
 #1: 888084c0b5f0 (&ovl_i_mutex_dir_key[depth]){}-{3:3}, at: 
ovl_ioctl_set_flags fs/overlayfs/file.c:530 [inline]
 #1: 888084c0b5f0 (&ovl_i_mutex_dir_key[depth]){}-{3:3}, at: 
ovl_ioctl+0x2fb/0x960 fs/overlayfs/file.c:569

stack backtrace:
CPU: 1 PID: 3639 Comm: syz-executor.2 Not tainted 5.11.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x137/0x1be lib/dump_stack.c:120
 __lock_acquire+0x2333/0x5e90 kernel/locking/lockdep.c:4670
 lock_acquire+0x114/0x5e0 kernel/locking/lockdep.c:5442
 down_write+0x56/0x120 kernel/locking/rwsem.c:1406
 inode_lock include/linux/fs.h:773 [inline]
 ovl_dir_real_file+0x20b/0x310 fs/overlayfs/readdir.c:886
 ovl_real_fdget fs/overlayfs/file.c:136 [inline]
 ovl_real_ioctl fs/overlayfs/file.c:499 [inline]
 ovl_ioctl_set_flags fs/overlayfs/file.c:545 [inline]
 ovl_ioctl+0x4de/0x960 fs/overlayfs/file.c:569
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45e219
Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 
db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:7f02ed677c68 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 0003 RCX: 0045e219
RDX:  RSI: 40086602 RDI: 0003
RBP: 0119bfc0 R08:  R09: 
R10:  R11: 0246 R12: 0119bf8c
R13: 7ffd373df6ef R14: 7f02ed6789c0 R15: 0119bf8c


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Re: [PATCH net] net: hdlc_x25: Use qdisc to queue outgoing LAPB frames

2021-02-01 Thread Martin Schiller


On 2021-01-31 04:16, Xie He wrote:
On Sat, Jan 30, 2021 at 11:16 AM Jakub Kicinski  
wrote:


Sounds like too much effort for a sub-optimal workaround.
The qdisc semantics are broken in the proposed scheme (double
counting packets) - both in terms of statistics and if the user decides
to add a policer, filter etc.


Hmm...

Another solution might be creating another virtual device on top of
the HDLC device (similar to what "hdlc_fr.c" does), so that we can
first queue L3 packets in the virtual device's qdisc queue, and then
queue the L2 frames in the actual HDLC device's qdisc queue. This way
we can avoid the same outgoing data being queued to qdisc twice. But
this would significantly change the way the user uses the hdlc_x25
driver.


Another worry is that something may just inject a packet with
skb->protocol == ETH_P_HDLC but unexpected structure (IDK if
that's a real concern).


This might not be a problem. Ethernet devices also allow the user to
inject raw frames with user constructed headers. "hdlc_fr.c" also
allows the user to bypass the virtual circuit interfaces and inject
raw frames directly on the HDLC interface. I think the receiving side
should be able to recognize and drop invalid frames.


It may be better to teach LAPB to stop / start the internal queue.
The lower level drivers just need to call LAPB instead of making
the start/wake calls directly to the stack, and LAPB can call the
stack. Would that not work?


I think this is a good solution. But this requires changing a lot of
code. The HDLC subsystem needs to be changed to allow HDLC Hardware
Drivers to ask HDLC Protocol Drivers (like hdlc_x25.c) to stop/wake
the TX queue. The hdlc_x25.c driver can then ask the LAPB module to
stop/wake the queue.

So this means new APIs need to be added to both the HDLC subsystem and
the LAPB module, and a number of HDLC Hardware Drivers need to be
changed to call the new API of the HDLC subsystem.

Martin, do you have any suggestions?


I have thought about this issue again.

I also have to say that I have never noticed any problems in this area
before.

So again for (my) understanding:
When a hardware driver calls netif_stop_queue, the frames sent from
layer 3 (X.25) with dev_queue_xmit are queued and not passed "directly"
to x25_xmit of the hdlc_x25 driver.

So nothing is added to the write_queue anymore (except possibly
un-acked-frames by lapb_requeue_frames).

Shouldn't it actually be sufficient to check for netif_queue_stopped in
lapb_kick and then do "nothing" if necessary?

As soon as the hardware driver calls netif_wake_queue, the whole thing
should just continue running.

Or am I missing something?
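
The idea in the message above can be sketched as a small user-space model. This is not the LAPB code: struct dev_model, struct lapb_model and lapb_kick_model() are hypothetical stand-ins, the window and timer handling of the real lapb_kick() is ignored, and the sketch assumes that something calls the kick function again after the queue is woken, which the thread does not settle.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the net_device state the check would look at. */
struct dev_model {
	bool queue_stopped;	/* set on "netif_stop_queue", cleared on wake */
	int frames_sent;	/* frames handed down to the hardware driver */
};

/* Hypothetical stand-in for the LAPB control block. */
struct lapb_model {
	struct dev_model *dev;
	int write_queue_len;	/* frames waiting in the LAPB write queue */
};

/* Send queued frames unless the lower driver has stopped the queue. */
static void lapb_kick_model(struct lapb_model *lapb)
{
	if (lapb->dev->queue_stopped)
		return;		/* do "nothing"; frames stay in the write queue */

	while (lapb->write_queue_len > 0) {
		lapb->write_queue_len--;
		lapb->dev->frames_sent++;
	}
}
```

While the queue is stopped, a kick leaves the write queue untouched; after the wake, the next kick drains it.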

Re: [next PATCH] usb: xhci-mtk: skip dropping bandwidth of unchecked endpoints

2021-02-01 Thread Ikjoon Jang

Hi Chunfeng,

On Mon, Feb 1, 2021 at 1:58 PM Chunfeng Yun  wrote:
>
> For those unchecked endpoints, we don't allocate bandwidth for
> them, so there is no need to free the bandwidth; otherwise it will
> decrease the allocated bandwidth.
> Meanwhile use xhci_dbg() instead of dev_dbg() to print logs and
> rename bw_ep_list_new as bw_ep_chk_list.
>
> Fixes: 1d69f9d901ef ("usb: xhci-mtk: fix unreleased bandwidth data")
> Cc: stable 
> Signed-off-by: Chunfeng Yun 

Reviewed-and-tested-by: Ikjoon Jang 

> ---
>  drivers/usb/host/xhci-mtk-sch.c | 61 ++---
>  drivers/usb/host/xhci-mtk.h |  4 ++-
>  2 files changed, 36 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-mtk-sch.c b/drivers/usb/host/xhci-mtk-sch.c
> index a313e75ff1c6..dee8a329076d 100644
> --- a/drivers/usb/host/xhci-mtk-sch.c
> +++ b/drivers/usb/host/xhci-mtk-sch.c
> @@ -200,6 +200,7 @@ static struct mu3h_sch_ep_info *create_sch_ep(struct 
> usb_device *udev,
>
> sch_ep->sch_tt = tt;
> sch_ep->ep = ep;
> +   INIT_LIST_HEAD(&sch_ep->endpoint);
> INIT_LIST_HEAD(&sch_ep->tt_endpoint);
>
> return sch_ep;
> @@ -374,6 +375,7 @@ static void update_bus_bw(struct mu3h_sch_bw_info *sch_bw,
> sch_ep->bw_budget_table[j];
> }
> }
> +   sch_ep->allocated = used;

Yes, this is really needed!

>  }
>
>  static int check_sch_tt(struct usb_device *udev,
> @@ -542,6 +544,22 @@ static int check_sch_bw(struct usb_device *udev,
> return 0;
>  }
>
> +static void destroy_sch_ep(struct usb_device *udev,
> +   struct mu3h_sch_bw_info *sch_bw, struct mu3h_sch_ep_info *sch_ep)
> +{
> +   /* only release ep bw check passed by check_sch_bw() */
> +   if (sch_ep->allocated)
> +   update_bus_bw(sch_bw, sch_ep, 0);

So only these two lines really matter.

> +
> +   list_del(&sch_ep->endpoint);
> +
> +   if (sch_ep->sch_tt) {
> +   list_del(&sch_ep->tt_endpoint);
> +   drop_tt(udev);
> +   }
> +   kfree(sch_ep);
> +}
> +
>  static bool need_bw_sch(struct usb_host_endpoint *ep,
> enum usb_device_speed speed, int has_tt)
>  {
> @@ -584,7 +602,7 @@ int xhci_mtk_sch_init(struct xhci_hcd_mtk *mtk)
>
> mtk->sch_array = sch_array;
>
> -   INIT_LIST_HEAD(&mtk->bw_ep_list_new);
> +   INIT_LIST_HEAD(&mtk->bw_ep_chk_list);
>
> return 0;
>  }
> @@ -636,29 +654,12 @@ int xhci_mtk_add_ep_quirk(struct usb_hcd *hcd, struct 
> usb_device *udev,
>
> setup_sch_info(udev, ep_ctx, sch_ep);
>
> -   list_add_tail(&sch_ep->endpoint, &mtk->bw_ep_list_new);
> +   list_add_tail(&sch_ep->endpoint, &mtk->bw_ep_chk_list);
>
> return 0;
>  }
>  EXPORT_SYMBOL_GPL(xhci_mtk_add_ep_quirk);
>
> -static void xhci_mtk_drop_ep(struct xhci_hcd_mtk *mtk, struct usb_device 
> *udev,
> -struct mu3h_sch_ep_info *sch_ep)
> -{
> -   struct xhci_hcd *xhci = hcd_to_xhci(mtk->hcd);
> -   int bw_index = get_bw_index(xhci, udev, sch_ep->ep);
> -   struct mu3h_sch_bw_info *sch_bw = &mtk->sch_array[bw_index];
> -
> -   update_bus_bw(sch_bw, sch_ep, 0);
> -   list_del(&sch_ep->endpoint);
> -
> -   if (sch_ep->sch_tt) {
> -   list_del(&sch_ep->tt_endpoint);
> -   drop_tt(udev);
> -   }
> -   kfree(sch_ep);
> -}
> -
>  void xhci_mtk_drop_ep_quirk(struct usb_hcd *hcd, struct usb_device *udev,
> struct usb_host_endpoint *ep)
>  {
> @@ -688,9 +689,8 @@ void xhci_mtk_drop_ep_quirk(struct usb_hcd *hcd, struct 
> usb_device *udev,
> sch_bw = &sch_array[bw_index];
>
> list_for_each_entry_safe(sch_ep, tmp, &sch_bw->bw_ep_list, endpoint) {
> -   if (sch_ep->ep == ep) {
> -   xhci_mtk_drop_ep(mtk, udev, sch_ep);
> -   }
> +   if (sch_ep->ep == ep)
> +   destroy_sch_ep(udev, sch_bw, sch_ep);

Not so critical, but I also missed a 'break' here.
Can you please add a break statement here?

> }
>  }
>  EXPORT_SYMBOL_GPL(xhci_mtk_drop_ep_quirk);
> @@ -704,9 +704,9 @@ int xhci_mtk_check_bandwidth(struct usb_hcd *hcd, struct 
> usb_device *udev)
> struct mu3h_sch_ep_info *sch_ep, *tmp;
> int bw_index, ret;
>
> -   dev_dbg(&udev->dev, "%s\n", __func__);
> +   xhci_dbg(xhci, "%s() udev %s\n", __func__, dev_name(&udev->dev));
>
> -   list_for_each_entry(sch_ep, &mtk->bw_ep_list_new, endpoint) {
> +   list_for_each_entry(sch_ep, &mtk->bw_ep_chk_list, endpoint) {
> bw_index = get_bw_index(xhci, udev, sch_ep->ep);
> sch_bw = &mtk->sch_array[bw_index];
>
> @@ -717,7 +717,7 @@ int xhci_mtk_check_bandwidth(struct usb_hcd *hcd, struct 
> usb_device *udev)
> }
> }
>
> -   list_for_each_entry_safe(sch_ep, tmp, &mtk->bw_ep_list_new, endpoint) 
> {
> +   list_for_each_entry_safe(sch_ep, tmp, &mtk->bw_ep_chk_list, e

Re: extended bpf_send_signal_thread with argument

2021-02-01 Thread Peter Zijlstra

On Sun, Jan 31, 2021 at 12:14:02PM +0100, Dmitry Vyukov wrote:
> Hi,
> 
> I would like to send a signal from a bpf program invoked from a
> perf_event. There is:

You can't. Sending signals requires sighand lock, and you're not allowed
to take locks from perf_event context.

Re: [PATCH] media: allegro-dvt: Use __packed sentence

2021-02-01 Thread Michael Tretter

On Fri, 29 Jan 2021 23:54:41 +, David Laight wrote:
> From: Emmanuel Arias
> > Sent: 29 January 2021 20:02
> > 
> > Fix coding style using __packed sentence instead of
> > __attribute__((__packed__)).
> > 
> > Signed-off-by: Emmanuel Arias 
> > ---
> >  drivers/staging/media/allegro-dvt/allegro-core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/staging/media/allegro-dvt/allegro-core.c 
> > b/drivers/staging/media/allegro-
> > dvt/allegro-core.c
> > index 9f718f43282b..cee624dac61a 100644
> > --- a/drivers/staging/media/allegro-dvt/allegro-core.c
> > +++ b/drivers/staging/media/allegro-dvt/allegro-core.c
> > @@ -670,7 +670,7 @@ static ssize_t allegro_mbox_read(struct allegro_mbox 
> > *mbox,
> > struct {
> > u16 length;
> > u16 type;
> > -   } __attribute__ ((__packed__)) *header;
> > +   } __packed *header;
> > struct regmap *sram = mbox->dev->sram;
> 
> Does this actually need to be packed?
> The only reason would be if the structure could exist on a 2n+1
> boundary.

Not sure what you mean by this.

> But that is only likely if part of some binary sequence.
> In which case I'd expect it to be marked __be or __le.

It is part of a binary sequence. It is the header of messages in a mailbox
that is used to exchange data with a co-processor (video encoder). In fact, it
should be marked as __le.

Michael
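
To illustrate what treating the header as little-endian means, here is a user-space sketch, assuming a 4-byte wire header with two 16-bit fields as in the patch. In the kernel this would normally be expressed with __le16 fields and le16_to_cpu(); the names struct mbox_header, get_le16() and decode_header() below are hypothetical.

```c
#include <stdint.h>

/* Host-order view of the mailbox header from the patch (length, type). */
struct mbox_header {
	uint16_t length;
	uint16_t type;
};

/* Read a 16-bit little-endian value byte by byte; works at any alignment. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the 4-byte wire header independently of host endianness. */
static struct mbox_header decode_header(const uint8_t *wire)
{
	struct mbox_header h;

	h.length = get_le16(wire);
	h.type = get_le16(wire + 2);
	return h;
}
```

Decoding byte by byte also sidesteps the alignment question raised above, since no 16-bit load is ever issued on the raw buffer.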

Re: [PATCH 0/8] gpio: implement the configfs testing module

2021-02-01 Thread Uwe Kleine-König

On Mon, Feb 01, 2021 at 09:37:30AM +0100, Bartosz Golaszewski wrote:
> On Sat, Jan 30, 2021 at 10:20 PM Uwe Kleine-König
>  wrote:
> >
> > Hello,
> >
> > On Fri, Jan 29, 2021 at 02:46:16PM +0100, Bartosz Golaszewski wrote:
> > > From: Bartosz Golaszewski 
> > >
> > > This series adds a new GPIO testing module based on configfs committable 
> > > items
> > > and sysfs. The goal is to provide a testing driver that will be 
> > > configurable
> > > at runtime (won't need module reload) and easily extensible. The control 
> > > over
> > > the attributes is also much more fine-grained than in gpio-mockup.
> > >
> > > I am aware that Uwe submitted a virtual driver called gpio-simulator some 
> > > time
> > > ago and I was against merging it as it wasn't much different from 
> > > gpio-mockup.
> > > I would ideally want to have a single testing driver to maintain so I am
> > > proposing this module as a replacement for gpio-mockup but since selftests
> > > and libgpiod depend on it and it also has users in the community, we can't
> > > outright remove it until everyone switched to the new interface. As for 
> > > Uwe's
> > > idea for linking two simulated chips so that one controls the other - 
> > > while
> > > I prefer to have an independent code path for controlling the lines (hence
> > > the sysfs attributes), I'm open to implementing it in this new driver. It
> > > should be much more feature friendly thanks to configfs than gpio-mockup.
> >
> > Funny you still think about my simulator driver. I recently thought
> 
> It's because I always feel bad when I refuse to merge someone's hard work.
> 
> > about reanimating it for my private use. The idea was to implement a
> > rotary-encoder driver (that in contrast to
> > drivers/input/misc/rotary_encoder.c really implements an encoder and not
> > a decoder). With the two linked chips I can plug
> > drivers/input/misc/rotary_encoder.c on one side and my encoder on the
> > other to test both drivers completely in software.
> >
> > I didn't look into your driver yet, but getting such a driver into
> > mainline would be very welcome!
> >
> 
> My idea for linking chips (although that's not implemented yet) is an
> attribute in each configfs group called 'link' or something like that,
> that would take as argument the name of the chip to link to making the
> 'linker' the input and the 'linkee' the output.

I still wonder why you prefer to drive the lines using configfs (or
sysfs before). Using the idea of two interlinked chips and being able to
use gpio functions on one side to modify the other side is (in my eyes)
so simple and beautiful that it's obviously the right choice. But note I
still didn't look into details so there might be stuff you can modify
that wouldn't be possible with my idea. But obviously your mileage
varies here.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |



Re: [PATCH 3/3] power: supply: max8997_charger: Switch to new binding

2021-02-01 Thread Timon Baetz

On Sun, 31 Jan 2021 18:28:40 +0100, Krzysztof Kozlowski wrote:
> On Sat, Jan 30, 2021 at 05:30:14PM +, Timon Baetz wrote:
> > Get regulator from parent device's node and extcon by name.
> >
> > Signed-off-by: Timon Baetz 
> > ---
> >  drivers/power/supply/max8997_charger.c | 12 
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/power/supply/max8997_charger.c 
> > b/drivers/power/supply/max8997_charger.c
> > index 321bd6b8ee41..625d8cc4312a 100644
> > --- a/drivers/power/supply/max8997_charger.c
> > +++ b/drivers/power/supply/max8997_charger.c
> > @@ -168,6 +168,7 @@ static int max8997_battery_probe(struct platform_device 
> > *pdev)
> > int ret = 0;
> > struct charger_data *charger;
> > struct max8997_dev *iodev = dev_get_drvdata(pdev->dev.parent);
> > +   struct device_node *np = pdev->dev.of_node;
> > struct i2c_client *i2c = iodev->i2c;
> > struct max8997_platform_data *pdata = iodev->pdata;
> > struct power_supply_config psy_cfg = {};
> > @@ -237,20 +238,23 @@ static int max8997_battery_probe(struct 
> > platform_device *pdev)
> > return PTR_ERR(charger->battery);
> > }
> >
> > +   // grab regulator from parent device's node
> > +   pdev->dev.of_node = iodev->dev->of_node;
> > charger->reg = devm_regulator_get_optional(&pdev->dev, "charger");
> > +   pdev->dev.of_node = np;  
> 
> I think the device does not have its own node anymore. Or did I miss
> something?

The idea is to reset of_node to whatever it was before (NULL) and basically 
leave the device unchanged. Probe might run again because of deferral.

> > if (IS_ERR(charger->reg)) {
> > if (PTR_ERR(charger->reg) == -EPROBE_DEFER)
> > return -EPROBE_DEFER;
> > dev_info(&pdev->dev, "couldn't get charger regulator\n");
> > }
> > -   charger->edev = extcon_get_edev_by_phandle(&pdev->dev, 0);
> > -   if (IS_ERR(charger->edev)) {
> > -   if (PTR_ERR(charger->edev) == -EPROBE_DEFER)
> > +   charger->edev = extcon_get_extcon_dev("max8997-muic");
> > +   if (IS_ERR_OR_NULL(charger->edev)) {
> > +   if (!charger->edev)  
> 
> Isn't NULL returned when there is simply no extcon? It's different than
> deferred probe. Returning here EPROBE_DEFER might lead to infinite probe
> tries (on every new device probe) instead of just failing it.

extcon_get_extcon_dev() just loops through all registered extcon devices
and compares names. It will return NULL when "max8997-muic" isn't
registered yet. extcon_get_extcon_dev() never returns EPROBE_DEFER so
checking for NULL seems to be the only way. Other drivers using that
function also do NULL check and return EPROBE_DEFER.

Thanks for reviewing,
Timon
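
The lookup-then-defer pattern described above can be modelled in user space. This is a sketch under assumptions, not the extcon implementation: the registry array, get_extcon_by_name() and probe_model() are hypothetical, and only the "max8997-muic" name and the NULL-means-not-registered behaviour come from the message.

```c
#include <stddef.h>
#include <string.h>

#define EPROBE_DEFER 517	/* value from the kernel's include/linux/errno.h */

struct extcon_model {
	const char *name;
};

/* Hypothetical registry standing in for the list of registered extcon devices. */
static struct extcon_model *registry[4];
static size_t registry_len;

/* Walk the registry and compare names; NULL means "not registered (yet)". */
static struct extcon_model *get_extcon_by_name(const char *name)
{
	for (size_t i = 0; i < registry_len; i++)
		if (strcmp(registry[i]->name, name) == 0)
			return registry[i];
	return NULL;
}

/* Probe step: defer while the provider is missing, succeed once it appears. */
static int probe_model(void)
{
	struct extcon_model *edev = get_extcon_by_name("max8997-muic");

	if (!edev)
		return -EPROBE_DEFER;
	return 0;
}
```

The sketch does not address the reviewer's concern: if the provider never registers, every retry returns the deferral code again.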

Re: [PATCH] init: clean up early_param_on_off() macro

2021-02-01 Thread Johan Hovold

On Mon, Feb 01, 2021 at 01:15:32PM +0900, Masahiro Yamada wrote:
> Use early_param() to define early_param_on_off().
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
>  include/linux/init.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/init.h b/include/linux/init.h
> index e668832ef66a..ae2c2aace0d0 100644
> --- a/include/linux/init.h
> +++ b/include/linux/init.h
> @@ -277,14 +277,14 @@ struct obs_kernel_param {
>   var = 1;\
>   return 0;   \
>   }   \
> - __setup_param(str_on, parse_##var##_on, parse_##var##_on, 1);   \
> + early_param(str_on, parse_##var##_on);  \
>   \
>   static int __init parse_##var##_off(char *arg)  \
>   {   \
>   var = 0;\
>   return 0;   \
>   }   \
> - __setup_param(str_off, parse_##var##_off, parse_##var##_off, 1)
> + early_param(str_off, parse_##var##_off)
>  
>  /* Relies on boot_command_line being set */
>  void __init parse_early_param(void);

Looks good:

Reviewed-by: Johan Hovold 

Johan

[PATCH] arm64: dts: mediatek: mt8183: evb: Add domain supply for mfg

2021-02-01 Thread Hsin-Yi Wang

Add domain supply node for mt8183-evb

Signed-off-by: Hsin-Yi Wang 
---
 arch/arm64/boot/dts/mediatek/mt8183-evb.dts | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts 
b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
index 3249c959f76fc..edff1e03e6fee 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
+++ b/arch/arm64/boot/dts/mediatek/mt8183-evb.dts
@@ -352,6 +352,10 @@ pins_pwm {
};
 };
 
+&mfg {
+   domain-supply = <&mt6358_vgpu_reg>;
+};
+
 &spi0 {
pinctrl-names = "default";
pinctrl-0 = <&spi_pins_0>;
-- 
2.30.0.365.g02bc693789-goog

Re: [PATCH 0/2] Add MediaTek MT8192 clock provider device nodes

2021-02-01 Thread Weiyi Lu

On Sun, 2021-01-31 at 14:27 +0100, Matthias Brugger wrote:
> 
> On 22/12/2020 14:40, Weiyi Lu wrote:
> > This series is based on v5.10-rc1, MT8192 dts v6[1] and
> > MT8192 clock v6 series[2].
> > 
> > [1] https://patchwork.kernel.org/project/linux-mediatek/list/?series=373899
> > [2] https://patchwork.kernel.org/project/linux-mediatek/list/?series=405295
> > 
> 
> [1] is already mainline. You could add this patch as a new one to [2]. But
> please try to improve the series, before sending just a new version with this
> patch added.
> 
> Regards,
> Matthias
> 
Hi Matthias,

Actually I'm a little confused now. Stephen suggested that I send the clock
dts separately because the dts may not go through his tree.
So I have separated it from the MT8192 clock series since clock v6.
What do you suggest I do next time?

> > Weiyi Lu (2):
> >   arm64: dts: mediatek: Add mt8192 clock controllers
> >   arm64: dts: mediatek: Correct UART0 bus clock of MT8192
> > 
> >  arch/arm64/boot/dts/mediatek/mt8192.dtsi | 165 ++-
> >  1 file changed, 164 insertions(+), 1 deletion(-)
> >

[PATCH] mm/huge_memory.c: use helper range_in_vma() in __split_huge_p[u|m]d_locked()

2021-02-01 Thread Miaohe Lin

The helper range_in_vma() is introduced via commit 017b1660df89 ("mm:
migration: fix migration of huge PMD shared pages"). But we forgot to
use it in __split_huge_pud_locked() and __split_huge_pmd_locked().

Signed-off-by: Miaohe Lin 
---
 mm/huge_memory.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 987cf5e4cf90..33353a4f95fb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1959,8 +1959,7 @@ static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud,
 		unsigned long haddr)
 {
 	VM_BUG_ON(haddr & ~HPAGE_PUD_MASK);
-	VM_BUG_ON_VMA(vma->vm_start > haddr, vma);
-	VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PUD_SIZE, vma);
+	VM_BUG_ON_VMA(!range_in_vma(vma, haddr, haddr + HPAGE_PUD_SIZE), vma);
 	VM_BUG_ON(!pud_trans_huge(*pud) && !pud_devmap(*pud));
 
 	count_vm_event(THP_SPLIT_PUD);
@@ -2039,8 +2038,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	int i;
 
 	VM_BUG_ON(haddr & ~HPAGE_PMD_MASK);
-	VM_BUG_ON_VMA(vma->vm_start > haddr, vma);
-	VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PMD_SIZE, vma);
+	VM_BUG_ON_VMA(!range_in_vma(vma, haddr, haddr + HPAGE_PMD_SIZE), vma);
 	VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd)
 				&& !pmd_devmap(*pmd));
 
-- 
2.19.1
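
For readers who want to convince themselves that the replacement is
behaviour-preserving, here is a minimal userspace sketch. It assumes
range_in_vma() checks that [start, end) lies inside [vm_start, vm_end);
the struct and function names with a "sketch" suffix are hypothetical
stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two vm_area_struct fields used here. */
struct vma_sketch {
	unsigned long vm_start;	/* first address inside the mapping */
	unsigned long vm_end;	/* first address past the mapping */
};

/* Models range_in_vma(): is [start, end) fully inside the mapping? */
static bool range_in_vma_sketch(const struct vma_sketch *vma,
				unsigned long start, unsigned long end)
{
	return vma && vma->vm_start <= start && end <= vma->vm_end;
}

/* The two open-coded conditions removed by the patch, as "would BUG". */
static bool old_checks_would_bug(const struct vma_sketch *vma,
				 unsigned long haddr, unsigned long size)
{
	return vma->vm_start > haddr || vma->vm_end < haddr + size;
}

/* True when the old and the new form reach the same verdict. */
static bool checks_agree(const struct vma_sketch *vma,
			 unsigned long haddr, unsigned long size)
{
	bool new_would_bug = !range_in_vma_sketch(vma, haddr, haddr + size);

	return new_would_bug == old_checks_would_bug(vma, haddr, size);
}

/* Small self-check over an in-range and an out-of-range huge page. */
static void range_self_check(void)
{
	struct vma_sketch vma = { 0x200000UL, 0x600000UL };

	assert(checks_agree(&vma, 0x200000UL, 0x200000UL));
	assert(checks_agree(&vma, 0x400000UL, 0x400000UL));
}
```

Calling range_self_check() from a test program exercises both cases; by
De Morgan's law the two forms should agree for every input with a
non-NULL vma.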

Re: [PATCH v4 1/2] x86/setup: always add the beginning of RAM as memblock.memory

2021-02-01 Thread David Hildenbrand


On 30.01.21 23:10, Mike Rapoport wrote:

From: Mike Rapoport 

The physical memory on an x86 system starts at address 0, but this is not
always reflected in e820 map. For example, the BIOS can have e820 entries
like

[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x1000-0x0009] usable

or

[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0fff] reserved
[0.00] BIOS-e820: [mem 0x1000-0x00057fff] usable

In either case, e820__memblock_setup() won't add the range 0x - 0x1000
to memblock.memory and later during memory map initialization this range is
left outside any zone.

With SPARSEMEM=y there is always a struct page for pfn 0 and this struct
page will have its zone link wrong no matter what value is set there.

To avoid this inconsistency, add the beginning of RAM to memblock.memory.
Limit the added chunk size to match the reserved memory to avoid
registering memory that may be used by the firmware but never reserved at
e820__memblock_setup() time.

Fixes: bde9cfa3afe4 ("x86/setup: don't remove E820_TYPE_RAM for pfn 0")
Signed-off-by: Mike Rapoport 
Cc: sta...@vger.kernel.org
---
  arch/x86/kernel/setup.c | 8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 3412c4595efd..67c77ed6eef8 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -727,6 +727,14 @@ static void __init trim_low_memory_range(void)
 * Kconfig help text for X86_RESERVE_LOW.
 */
memblock_reserve(0, ALIGN(reserve_low, PAGE_SIZE));
+
+   /*
+* Even if the firmware does not report the memory at address 0 as
+* usable, inform the generic memory management about its existence
+* to ensure it is a part of ZONE_DMA and the memory map for it is
+* properly initialized.
+*/
+   memblock_add(0, ALIGN(reserve_low, PAGE_SIZE));
  }

  /*



I think, to make that code more robust, and to not rely on archs to do 
the right thing, we should do something like


1) Make sure in free_area_init() that each PFN with a memmap (i.e., one 
that falls into a partially present section) is spanned by a zone; that 
would include PFN 0 in this case.


2) In init_zone_unavailable_mem(), similar to round_up(max_pfn, 
PAGES_PER_SECTION) handling, consider range

[round_down(min_pfn, PAGES_PER_SECTION), min_pfn - 1]
which would handle in the x86-64 case [0..0] and, therefore, initialize 
PFN 0.


Also, I think the special case of PFN 0 is analogous to the 
round_up(max_pfn, PAGES_PER_SECTION) handling in 
init_zone_unavailable_mem(): who guarantees that these PFNs above the 
highest present PFN are actually spanned by a zone?


I'd suggest going through all zone ranges in free_area_init() first, 
dealing with zones that have "not section aligned start/end", clamping 
them up/down if required such that no holes within a section are left 
uncovered by a zone.
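
To make the ranges concrete, here is a rough userspace sketch of the two
section-alignment holes discussed above. It is not kernel code: the
value 32768 for PAGES_PER_SECTION (128 MiB sections with 4 KiB pages)
and the treatment of max_pfn as the first PFN past the end of memory are
assumptions, and all names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value: 128 MiB sections with 4 KiB pages. */
#define SKETCH_PAGES_PER_SECTION 32768UL

/* Inclusive PFN range; 'empty' is set when there is no hole to handle. */
struct pfn_range {
	unsigned long start;
	unsigned long end;
	bool empty;
};

static unsigned long section_round_down(unsigned long pfn)
{
	return pfn & ~(SKETCH_PAGES_PER_SECTION - 1);
}

static unsigned long section_round_up(unsigned long pfn)
{
	return section_round_down(pfn + SKETCH_PAGES_PER_SECTION - 1);
}

/* Hole below the first present PFN: [round_down(min_pfn), min_pfn - 1]. */
static struct pfn_range leading_hole(unsigned long min_pfn)
{
	struct pfn_range r = { 0, 0, true };
	unsigned long start = section_round_down(min_pfn);

	if (start < min_pfn) {
		r.start = start;
		r.end = min_pfn - 1;
		r.empty = false;
	}
	return r;
}

/* Hole above the last present PFN: [max_pfn, round_up(max_pfn) - 1]. */
static struct pfn_range trailing_hole(unsigned long max_pfn)
{
	struct pfn_range r = { 0, 0, true };
	unsigned long end = section_round_up(max_pfn);

	if (max_pfn < end) {
		r.start = max_pfn;
		r.end = end - 1;
		r.empty = false;
	}
	return r;
}
```

With min_pfn = 1, as in the e820 examples quoted above, leading_hole()
yields the single-PFN range [0..0]; a section-aligned min_pfn or max_pfn
yields an empty range.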


--
Thanks,

David / dhildenb

[RFC] sched/rt: Fix RT (group) throttling with nohz_full

2021-02-01 Thread Jonathan Schwender

If nohz_full is enabled (more precisely HK_FLAG_TIMER is set), then
do_sched_rt_period_timer may be called on a housekeeping CPU,
which would not service the isolated CPU for a non-root cgroup
(requires a kernel with RT_GROUP_SCHEDULING).
This causes RT tasks in a non-root cgroup to get throttled 
indefinitely (unless throttling is disabled) once the timer has 
been moved to a housekeeping CPU.
To fix this, housekeeping CPUs now service all online CPUs 
if HK_FLAG_TIMER (nohz_full) is set.

I'm not really sure how this relates to Mike Galbraith's previous
commit e221d028bb08 ("sched,rt: fix isolated CPUs leaving root_task_group
indefinitely throttled") (which predates the housekeeping changes),
so I'm posting this as an RFC.


Signed-off-by: Jonathan Schwender 
---
 kernel/sched/rt.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 49ec096a8aa1..3185e00b828a 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -865,9 +865,16 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
 	 * isolation is really required, the user will turn the throttle
 	 * off to kill the perturbations it causes anyway.  Meanwhile,
 	 * this maintains functionality for boot and/or troubleshooting.
+	 * If nohz_full is active and the timer was offloaded to a
+	 * housekeeping CPU, sched_rt_period_mask() will not contain
+	 * the isolated CPU. To prevent indefinite throttling of tasks
+	 * on isolated CPUs, housekeeping CPUs service all online CPUs.
 	 */
-	if (rt_b == &root_task_group.rt_bandwidth)
+	if (rt_b == &root_task_group.rt_bandwidth
+		|| (housekeeping_enabled(HK_FLAG_TIMER)
+			&& housekeeping_cpu(this_rq()->cpu, HK_FLAG_TIMER))) {
 		span = cpu_online_mask;
+	}
 #endif
 	for_each_cpu(i, span) {
 		int enqueue = 0;
-- 
2.29.2
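
The span selection in the patch above can be modelled in userspace as
follows. This is only a sketch: CPU masks are plain bitmasks, and the
struct, its fields and pick_span() are hypothetical names that stand in
for the kernel state named in the comments.

```c
#include <assert.h>
#include <stdbool.h>

/* One bit per CPU; a stand-in for struct cpumask. */
typedef unsigned int cpumask_sketch;

struct rt_period_ctx {
	bool is_root_group;		/* rt_b == &root_task_group.rt_bandwidth */
	bool hk_timer_enabled;		/* housekeeping_enabled(HK_FLAG_TIMER) */
	cpumask_sketch housekeeping;	/* CPUs that run offloaded timers */
	cpumask_sketch period_mask;	/* result of sched_rt_period_mask() */
	cpumask_sketch online;		/* cpu_online_mask */
};

/* Returns the set of CPUs the period timer services on this_cpu. */
static cpumask_sketch pick_span(const struct rt_period_ctx *c,
				unsigned int this_cpu)
{
	bool on_housekeeping = c->hk_timer_enabled &&
			       (c->housekeeping & (1u << this_cpu));

	/* Root group, or timer running on a housekeeping CPU: all online CPUs. */
	if (c->is_root_group || on_housekeeping)
		return c->online;

	/* Otherwise keep the default mask, which may miss isolated CPUs. */
	return c->period_mask;
}
```

For a non-root group with CPU 0 as the only housekeeping CPU, a period
mask of 0x1 and four online CPUs (0xf), pick_span() on CPU 0 returns 0xf
when the housekeeping flag is set and 0x1 when it is not, which is the
case that leaves CPUs 1-3 unserviced.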

[PATCH] drm: Fix drm_atomic_get_new_crtc_state call error

2021-02-01 Thread Zhaoge Zhang

This code clears the connector's bit in the connector_mask of the
previous CRTC, so drm_atomic_get_crtc_state() should be used.

Signed-off-by: Zhaoge Zhang 
---
 drivers/gpu/drm/drm_atomic_uapi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
b/drivers/gpu/drm/drm_atomic_uapi.c
index 268bb69..07fe01b 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -313,8 +313,8 @@ drm_atomic_set_crtc_for_connector(struct 
drm_connector_state *conn_state,
return 0;
 
if (conn_state->crtc) {
-   crtc_state = drm_atomic_get_new_crtc_state(conn_state->state,
-  conn_state->crtc);
+   crtc_state = drm_atomic_get_crtc_state(conn_state->state,
+   conn_state->crtc);
 
crtc_state->connector_mask &=
~drm_connector_mask(conn_state->connector);
-- 
2.7.4

[PATCH net] net: mvpp2: TCAM entry enable should be written after SRAM data

2021-02-01 Thread stefanc

From: Stefan Chulski 

The last TCAM data word contains the TCAM entry enable bit.
It should be written after the SRAM data, so that the SRAM data is in
place before the entry is enabled.

Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Stefan Chulski 
---
 drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c 
b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c
index 0b2ff08..f4a905f 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c
@@ -29,16 +29,16 @@ static int mvpp2_prs_hw_write(struct mvpp2 *priv, struct mvpp2_prs_entry *pe)
 	/* Clear entry invalidation bit */
 	pe->tcam[MVPP2_PRS_TCAM_INV_WORD] &= ~MVPP2_PRS_TCAM_INV_MASK;
 
-	/* Write tcam index - indirect access */
-	mvpp2_write(priv, MVPP2_PRS_TCAM_IDX_REG, pe->index);
-	for (i = 0; i < MVPP2_PRS_TCAM_WORDS; i++)
-		mvpp2_write(priv, MVPP2_PRS_TCAM_DATA_REG(i), pe->tcam[i]);
-
 	/* Write sram index - indirect access */
 	mvpp2_write(priv, MVPP2_PRS_SRAM_IDX_REG, pe->index);
 	for (i = 0; i < MVPP2_PRS_SRAM_WORDS; i++)
 		mvpp2_write(priv, MVPP2_PRS_SRAM_DATA_REG(i), pe->sram[i]);
 
+	/* Write tcam index - indirect access */
+	mvpp2_write(priv, MVPP2_PRS_TCAM_IDX_REG, pe->index);
+	for (i = 0; i < MVPP2_PRS_TCAM_WORDS; i++)
+		mvpp2_write(priv, MVPP2_PRS_TCAM_DATA_REG(i), pe->tcam[i]);
+
 	return 0;
 }
 
-- 
1.9.1
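
Why the write order matters can be illustrated with a small userspace
model. This is a sketch under assumptions, not the driver: the word
counts, the position of the invalidation bit and every "fake_" or
"SKETCH_" name are hypothetical, and the model simply assumes that an
entry becomes live as soon as the TCAM word with a cleared invalidation
bit is written.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes and bit position, chosen only for this sketch. */
#define SKETCH_TCAM_WORDS	6
#define SKETCH_SRAM_WORDS	4
#define SKETCH_INV_WORD		5
#define SKETCH_INV_MASK		(1u << 31)

struct fake_prs_entry {
	unsigned int tcam[SKETCH_TCAM_WORDS];
	unsigned int sram[SKETCH_SRAM_WORDS];
};

struct fake_prs_hw {
	unsigned int tcam[SKETCH_TCAM_WORDS];
	unsigned int sram[SKETCH_SRAM_WORDS];
	bool sram_complete;	/* all SRAM words have been written */
	bool live_without_sram;	/* entry went live before SRAM was ready */
};

static void fake_write_sram(struct fake_prs_hw *hw,
			    const struct fake_prs_entry *pe)
{
	for (int i = 0; i < SKETCH_SRAM_WORDS; i++)
		hw->sram[i] = pe->sram[i];
	hw->sram_complete = true;
}

static void fake_write_tcam(struct fake_prs_hw *hw,
			    const struct fake_prs_entry *pe)
{
	for (int i = 0; i < SKETCH_TCAM_WORDS; i++) {
		hw->tcam[i] = pe->tcam[i];
		/* A cleared invalidation bit makes the entry live at once. */
		if (i == SKETCH_INV_WORD &&
		    !(pe->tcam[i] & SKETCH_INV_MASK) && !hw->sram_complete)
			hw->live_without_sram = true;
	}
}

/* sram_first == true models the patched order, false the old order. */
static void fake_hw_write(struct fake_prs_hw *hw, struct fake_prs_entry *pe,
			  bool sram_first)
{
	pe->tcam[SKETCH_INV_WORD] &= ~SKETCH_INV_MASK;

	if (sram_first) {
		fake_write_sram(hw, pe);
		fake_write_tcam(hw, pe);
	} else {
		fake_write_tcam(hw, pe);
		fake_write_sram(hw, pe);
	}
}
```

In this model, only the TCAM-first order sets live_without_sram, which
is the window the patch is meant to close.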

Build regressions/improvements in v5.11-rc6

2021-02-01 Thread Geert Uytterhoeven

Below is the list of build error/warning regressions/improvements in
v5.11-rc6[1] compared to v5.10[2].

Summarized:
  - build errors: +0/-3
  - build warnings: +31/-96

JFYI, when comparing v5.11-rc6[1] to v5.11-rc5[3], the summaries are:
  - build errors: +0/-0
  - build warnings: +0/-1

Happy fixing! ;-)

Thanks to the linux-next team for providing the build service.

[1] 
http://kisskb.ellerman.id.au/kisskb/branch/linus/head/1048ba83fb1c00cd24172e23e8263972f6b5d9ac/
 (all 192 configs)
[2] 
http://kisskb.ellerman.id.au/kisskb/branch/linus/head/2c85ebc57b3e1817b6ce1a6b703928e113a90442/
 (all 192 configs)
[3] 
http://kisskb.ellerman.id.au/kisskb/branch/linus/head/6ee1d745b7c9fd573fba142a2efdad76a9f1cb04/
 (all 192 configs)


*** ERRORS ***

3 error improvements:
  - /kisskb/src/arch/powerpc/platforms/powermac/smp.c: error: implicit 
declaration of function 'cleanup_cpu_mmu_context' 
[-Werror=implicit-function-declaration]: 914:2 => 
  - /kisskb/src/drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c: 
error: implicit declaration of function 'disable_kernel_vsx' 
[-Werror=implicit-function-declaration]: 676:2 => 
  - /kisskb/src/drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c: 
error: implicit declaration of function 'enable_kernel_vsx' 
[-Werror=implicit-function-declaration]: 640:2 => 


*** WARNINGS ***

31 warning regressions:
  + .config: warning: override: reassigning to symbol 
GCC_PLUGIN_CYC_COMPLEXITY:  => 4525, 4499
  + .config: warning: override: reassigning to symbol 
GCC_PLUGIN_LATENT_ENTROPY:  => 4527, 4501
  + .config: warning: override: reassigning to symbol MIPS_CPS_NS16550_SHIFT: 
12743, 12729 => 12884, 12888, 12901
  + .config: warning: override: reassigning to symbol PPC_64K_PAGES:  => 13264
  + /kisskb/src/arch/arm/mach-omap1/board-h2.c: warning: 'isp1301_gpiod_table' 
defined but not used [-Wunused-variable]:  => 347:34
  + /kisskb/src/arch/sh/kernel/traps.c: warning: unused variable 'cpu' 
[-Wunused-variable]:  => 183:15
  + /kisskb/src/drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn20.c: 
warning: (near initialization for 'boot_options.bits') [-Wmissing-braces]:  => 
326:8
  + /kisskb/src/drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_dcn20.c: 
warning: missing braces around initializer [-Wmissing-braces]:  => 326:8
  + /kisskb/src/drivers/rtc/rtc-rx6110.c: warning: 'rx6110_probe' defined but 
not used [-Wunused-function]:  => 314:12
  + /kisskb/src/drivers/soc/qcom/pdr_interface.c: warning: (near initialization 
for 'req.service_path') [-Wmissing-braces]:  => 572:9
  + /kisskb/src/drivers/soc/qcom/pdr_interface.c: warning: missing braces 
around initializer [-Wmissing-braces]:  => 572:9
  + /kisskb/src/include/linux/minmax.h: warning: comparison of distinct pointer 
types lacks a cast:  => 18:28
  + /kisskb/src/lib/bitfield_kunit.c: warning: the frame size of 4200 bytes is 
larger than 2048 bytes [-Wframe-larger-than=]:  => 93:1
  + /kisskb/src/lib/bitfield_kunit.c: warning: the frame size of 4224 bytes is 
larger than 2048 bytes [-Wframe-larger-than=]:  => 93:1
  + /kisskb/src/lib/bitfield_kunit.c: warning: the frame size of 7432 bytes is 
larger than 2048 bytes [-Wframe-larger-than=]:  => 93:1
  + /kisskb/src/lib/bitfield_kunit.c: warning: the frame size of 7440 bytes is 
larger than 2048 bytes [-Wframe-larger-than=]:  => 93:1
  + /kisskb/src/lib/bitfield_kunit.c: warning: the frame size of 7456 bytes is 
larger than 2048 bytes [-Wframe-larger-than=]:  => 93:1
  + /kisskb/src/lib/zstd/compress.c: warning: the frame size of 1348 bytes is 
larger than 1280 bytes [-Wframe-larger-than=]:  => 2262:1
  + 
/opt/cross/kisskb/br-aarch64-glibc-2016.08-613-ge98b4dd/bin/../lib/gcc/aarch64-buildroot-linux-gnu/5.4.0/plugin/include/config/elfos.h:
 warning: invalid suffix on literal; C++11 requires a space between literal and 
string macro [-Wliteral-suffix]:  => 102:21, 170:24
  + 
/opt/cross/kisskb/br-aarch64-glibc-2016.08-613-ge98b4dd/bin/../lib/gcc/aarch64-buildroot-linux-gnu/5.4.0/plugin/include/defaults.h:
 warning: invalid suffix on literal; C++11 requires a space between literal and 
string macro [-Wliteral-suffix]:  => 126:24
  + 
/opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/../lib/gcc/mipsel-buildroot-linux-uclibc/5.4.0/plugin/include/config/elfos.h:
 warning: invalid suffix on literal; C++11 requires a space between literal and 
string macro [-Wliteral-suffix]:  => 102:21, 170:24
  + 
/opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/../lib/gcc/mipsel-buildroot-linux-uclibc/5.4.0/plugin/include/config/mips/mips.h:
 warning: invalid suffix on literal; C++11 requires a space between literal and 
string macro [-Wliteral-suffix]:  => 2913:20
  + 
/opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/../lib/gcc/mipsel-buildroot-linux-uclibc/5.4.0/plugin/include/defaults.h:
 warning: invalid suffix on literal; C++11 requires a space between literal and 
string macro [-Wliteral-suffix]:  => 126:24
  + 
/opt/cross/kisskb/kor

Re: [PATCH v2 00/12] arm64: dts: zynqmp: DT updates to match latest drivers

2021-02-01 Thread Michal Simek




On 1/21/21 11:26 AM, Michal Simek wrote:
> Hi,
> 
> I am sending this series to reflect the latest drivers which have been
> merged into the mainline kernel. I have booted it on zcu102-rev1.0 and also
> zcu104-rev1.0. That's why I have also added DT for this newer revision.
> 
> The series is based on https://github.com/Xilinx/linux-xlnx/tree/zynqmp/dt.
> And mio-bank patch requires update in dt-binding which has been posted here
> https://lore.kernel.org/r/5fa17dfe4b42abefd84b4cbb7b8bcd4d31398f40.1606914986.git.michal.si...@xilinx.com
> 
> Thanks,
> Michal
> 
> Changes in v2:
> - Remove reset description for IPs from this patch. IPs will be enabled
>   separately with DT binding update.
> - Change patch subject
> 
> Michal Simek (12):
>   arm64: dts: zynqmp: Fix u48 si5382 chip on zcu111
>   arm64: dts: zynqmp: Add DT description for si5328 for zcu102/zcu106
>   arm64: dts: zynqmp: Enable si5341 driver for zcu102/106/111
>   arm64: dts: zynqmp: Enable reset controller driver
>   arm64: dts: zynqmp: Enable phy driver for Sata on zcu102/zcu104/zcu106
>   arm64: dts: zynqmp: Add label for zynqmp_ipi
>   arm64: dts: zynqmp: Add missing mio-bank properties to sdhcis
>   arm64: dts: zynqmp: Wire arasan nand controller
>   arm64: dts: zynqmp: Wire zynqmp qspi controller
>   arm64: dts: zynqmp: Add missing lpd watchdog node
>   arm64: dts: zynqmp: Add missing iommu IDs
>   arm64: dts: zynqmp: Add description for zcu104 revC
> 
>  arch/arm64/boot/dts/xilinx/Makefile   |   1 +
>  .../arm64/boot/dts/xilinx/zynqmp-clk-ccf.dtsi |  12 +
>  .../boot/dts/xilinx/zynqmp-zcu100-revC.dts|   2 +
>  .../boot/dts/xilinx/zynqmp-zcu102-revA.dts|  84 +-
>  .../boot/dts/xilinx/zynqmp-zcu104-revA.dts|  29 ++
>  .../boot/dts/xilinx/zynqmp-zcu104-revC.dts| 282 ++
>  .../boot/dts/xilinx/zynqmp-zcu106-revA.dts|  78 +
>  .../boot/dts/xilinx/zynqmp-zcu111-revA.dts|  59 +++-
>  arch/arm64/boot/dts/xilinx/zynqmp.dtsi|  94 +-
>  9 files changed, 637 insertions(+), 4 deletions(-)
>  create mode 100644 arch/arm64/boot/dts/xilinx/zynqmp-zcu104-revC.dts
> 

Applied all.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Xilinx Microblaze
Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP/Versal SoCs
