Re: [PATCH v4 04/16] powerpc/64s/exceptions: machine check reconcile irq state
Excerpts from Michael Ellerman's message of May 8, 2020 11:39 pm: > Nicholas Piggin writes: > >> pseries fwnmi machine check code pops the soft-irq checks in rtas_call >> (after the previous patch to remove rtas_token from this call path). > ^ > I changed this to "next" which I think is what you meant? Yes I rearranged them, so yes. Thanks, Nick
Re: [PATCH v4 11/16] powerpc/64s: machine check interrupt update NMI accounting
Excerpts from kbuild test robot's message of May 9, 2020 1:13 pm: > Hi Nicholas, > > I love your patch! Yet something to improve: ... > 1419#if defined(CONFIG_4xx) || defined(CONFIG_BOOKE) > 1420pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, > regs->dsisr); > 1421#else > 1422pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, > regs->dsisr); > 1423#endif > 1424#ifdef CONFIG_PPC64 >> 1425 pr_cont("IRQMASK: %lx IN_NMI:%d IN_MCE:%d", regs->softe, >> (int)get_paca()->in_nmi, (int)get_paca()->in_mce); Oh I meant to get rid of that hunk, it crept back in :( mpe if you could please take it out if you're merging this. It was quite useful for debugging this stuff, I might do a proper patch for this, but for now not necessary (it doesn't matter for "normal" crashes only crash crashes). Thanks, Nick
Re: [PATCH RFC 1/4] powerpc/radix: Fix compilation for radix with CONFIG_SMP=n
Excerpts from Paul Mackerras's message of May 9, 2020 3:02 pm: > This fixes the compile errors we currently get with CONFIG_SMP=n and > CONFIG_PPC_RADIX_MMU=y. Did I already fix this, or does it keep getting broken?! :( Anyway fine by me if it's required. Thanks, Nick
Re: [PATCH RFC 2/4] powerpc: Add Microwatt platform
Excerpts from Paul Mackerras's message of May 9, 2020 3:02 pm: > Microwatt is a FPGA-based implementation of the Power ISA. It > currently only implements little-endian 64-bit mode, and does > not (yet) support SMP. > > This adds a new machine type to support FPGA-based SoCs with a > Microwatt core. Very cool! Would there be any point sharing this with the "naked metal" platform Alistair has for booting POWER in L3 without OPAL? Or is it easy enough to have a several different simple 64s platforms? I have an HPT conditional compile patch and a few other diet Kconfig things I'll now be better justified to try get merged :) Thanks, Nick
Re: [PATCH RFC 4/4] powerpc/radix: Add support for microwatt's PRTBL SPR
Excerpts from Paul Mackerras's message of May 9, 2020 3:04 pm: > Microwatt currently doesn't implement hypervisor mode and therefore > doesn't implement the partition table. It does implement the process > table and radix page table walks. > > This adds code to write the base address of the process table to the > PRTBL SPR, which has been assigned SPR 720 for now, as that is in the > range of SPR numbers assigned for experimental use. PRTBL is only > written when we have neither the FW_FEATURE_LPAR feature nor the > CPU_FTR_HVMODE feature. What do you think about adding a FW_FEATURE and/or CPU_FTR for microwatt? No big deal now, but I'm sure you'll be adding other things. Thanks, Nick > > Signed-off-by: Paul Mackerras > --- > arch/powerpc/include/asm/reg.h | 1 + > arch/powerpc/mm/book3s64/radix_pgtable.c | 13 + > 2 files changed, 10 insertions(+), 4 deletions(-) > > diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h > index 1aa46dff0957..6ea3fc42740d 100644 > --- a/arch/powerpc/include/asm/reg.h > +++ b/arch/powerpc/include/asm/reg.h > @@ -721,6 +721,7 @@ > #endif > #define SPRN_TIR 0x1BE /* Thread Identification Register */ > #define SPRN_PTCR0x1D0 /* Partition table control Register */ > +#define SPRN_PRTBL 0x2D0 /* Process table pointer */ > #define SPRN_PSPB0x09F /* Problem State Priority Boost reg */ > #define SPRN_PTEHI 0x3D5 /* 981 7450 PTE HI word (S/W TLB load) */ > #define SPRN_PTELO 0x3D6 /* 982 7450 PTE LO word (S/W TLB load) */ > diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c > b/arch/powerpc/mm/book3s64/radix_pgtable.c > index dd1bea45325c..2e6a376c9d82 100644 > --- a/arch/powerpc/mm/book3s64/radix_pgtable.c > +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c > @@ -600,10 +600,15 @@ void __init radix__early_init_mmu(void) > radix_init_pgtable(); > > if (!firmware_has_feature(FW_FEATURE_LPAR)) { > - lpcr = mfspr(SPRN_LPCR); > - mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR); > - radix_init_partition_table(); > - radix_init_amor(); > + if (cpu_has_feature(CPU_FTR_HVMODE)) { > + lpcr = mfspr(SPRN_LPCR); > + mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR); > + radix_init_partition_table(); > + radix_init_amor(); > + } else { > + mtspr(SPRN_PRTBL, (__pa(process_tb) | > +(PRTB_SIZE_SHIFT - 12))); > + } > } else { > radix_init_pseries(); > } > -- > 2.25.3 > >
Re: ioremap() called early from pnv_pci_init_ioda_phb()
On Sat, May 9, 2020 at 12:41 AM Qian Cai wrote: > > Booting POWER9 PowerNV has this message, > > "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use > early_ioremap() instead” > > but use the patch below will result in leaks because it will never call > early_iounmap() anywhere. However, it looks me it was by design that > phb->regs mapping would be there forever where it would be used in > pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() initcall > too strong? The warning there is junk. The PHBs are setup at boot and never torn down so we're not "leaking" the mapping. It's supposed to be there for the lifetime of the kernel. That said, we could probably move the PCI setup to a point later in boot where the normal ioremap can be be used. We would have to check for initcalls which depend on the PHBs being setup and delay those too though. Oliver
Re: [PATCH 1/4] dma-mapping: move the remaining DMA API calls out of line
On Tue, May 05, 2020 at 02:18:37PM +1000, Alexey Kardashevskiy wrote: > > > On 17/04/2020 17:58, Christoph Hellwig wrote: > > On Wed, Apr 15, 2020 at 09:21:37PM +1000, Alexey Kardashevskiy wrote: > >> And the fact they were exported leaves possibility that there is a > >> driver somewhere relying on these symbols or distro kernel won't build > >> because the symbol disappeared from exports (I do not know what KABI > >> guarantees or if mainline kernel cares). > > > > We absolutely do not care. In fact for abuses of APIs that drivers > > should not use we almost care to make them private and break people > > abusing them. > > ok :) > > >> I do not care in particular but > >> some might, a line separated with empty lines in the commit log would do. > > > > I'll add a blurb for the next version. > > > Has it gone anywhere? Thanks, I've been hoping for the sg_buf helpers to land first, as they need backporting and would conflict. Do you urgently need the series?
Re: remove a few uses of ->queuedata
On Fri, May 08, 2020 at 11:04:45AM -0700, Dan Williams wrote: > On Fri, May 8, 2020 at 9:16 AM Christoph Hellwig wrote: > > > > Hi all, > > > > various bio based drivers use queue->queuedata despite already having > > set up disk->private_data, which can be used just as easily. This > > series cleans them up to only use a single private data pointer. > > ...but isn't the queue pretty much guaranteed to be cache hot and the > gendisk cache cold? I'm not immediately seeing what else needs the > gendisk in the I/O path. Is there another motivation I'm missing? ->private_data is right next to the ->queue pointer, pat0 and part_tbl which are all used in the I/O submission path (generic_make_request / generic_make_request_checks). This is mostly a prep cleanup patch to also remove the pointless queue argument from ->make_request - then ->queue is an extra dereference and extra churn.
Re: remove a few uses of ->queuedata
On Sat, May 09, 2020 at 06:13:21AM +0800, Ming Lei wrote: > On Fri, May 08, 2020 at 06:15:02PM +0200, Christoph Hellwig wrote: > > Hi all, > > > > various bio based drivers use queue->queuedata despite already having > > set up disk->private_data, which can be used just as easily. This > > series cleans them up to only use a single private data pointer. > > > > blk-mq based drivers that have code pathes that can't easily get at > > the gendisk are unaffected by this series. > > Yeah, before adding disk, there still may be requests queued to LLD > for blk-mq based drivers. > > So are there this similar situation for these bio based drivers? bio submittsion is based on the gendisk, so we can't submit before it is added. The passthrough request based path obviously doesn't apply here.
Re: [PATCH RFC 2/4] powerpc: Add Microwatt platform
On Saturday, 9 May 2020 5:58:57 PM AEST Nicholas Piggin wrote: > Excerpts from Paul Mackerras's message of May 9, 2020 3:02 pm: > > Microwatt is a FPGA-based implementation of the Power ISA. It > > currently only implements little-endian 64-bit mode, and does > > not (yet) support SMP. > > > > This adds a new machine type to support FPGA-based SoCs with a > > Microwatt core. > > Very cool! > > Would there be any point sharing this with the "naked metal" platform > Alistair has for booting POWER in L3 without OPAL? Or is it easy enough > to have a several different simple 64s platforms? It looks pretty similar at the moment, I've been meaning to clean those patches up and send them upstream but Paul has beaten me to it. The main difference so far is how the console is setup. For booting cache contained I was using a device tree pointing at a standard UART driver and enabling the standard OF platform device tree probing. - Alistair > I have an HPT conditional compile patch and a few other diet Kconfig > things I'll now be better justified to try get merged :) > > Thanks, > Nick
Re: ioremap() called early from pnv_pci_init_ioda_phb()
Excerpts from Qian Cai's message of May 9, 2020 3:41 am: > > >> On May 8, 2020, at 10:39 AM, Qian Cai wrote: >> >> Booting POWER9 PowerNV has this message, >> >> "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use >> early_ioremap() instead” >> >> but use the patch below will result in leaks because it will never call >> early_iounmap() anywhere. However, it looks me it was by design that >> phb->regs mapping would be there forever where it would be used in >> pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() >> initcall too strong? >> >> --- a/arch/powerpc/platforms/powernv/pci-ioda.c >> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c >> @@ -36,6 +36,7 @@ >> #include >> #include >> #include >> +#include >> >> #include >> >> @@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct >> device_node *np, >>/* Get registers */ >>if (!of_address_to_resource(np, 0, &r)) { >>phb->regs_phys = r.start; >> - phb->regs = ioremap(r.start, resource_size(&r)); >> + phb->regs = early_ioremap(r.start, resource_size(&r)); >>if (phb->regs == NULL) >>pr_err(" Failed to map registers !\n”); > > This will also trigger a panic with debugfs reads, so isn’t that this commit > bogus at least for powerpc64? Your patch to use early_ioremap is faulting? I wonder why? Thanks, Nick > > d538aadc2718 (“powerpc/ioremap: warn on early use of ioremap()") > > 11017.617022][T122068] Faulting instruction address: 0xc00db564 > [11017.617257][T122066] Faulting instruction address: 0xc00db564 > [11017.617950][T122073] Faulting instruction address: 0xc00db564 > [11017.61][T122064] BUG: Unable to handle kernel data access on read at > 0xffe20e10 > [11017.618935][T122064] Faulting instruction address: 0xc00db564 > [11017.737996][T122072] > [11017.738010][T122073] Oops: Kernel access of bad area, sig: 11 [#2] > [11017.738024][T122073] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 > DEBUG_PAGEALLOC NUMA PowerNV > [11017.738051][T122073] Modules linked in: brd ext4 crc16 mbcache jbd2 loop > kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci tg3 mdio libata > libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod > [11017.738110][T122073] CPU: 108 PID: 122073 Comm: read_all Tainted: G D > W 5.7.0-rc4-next-20200508+ #4 > [11017.738138][T122073] NIP: c00db564 LR: c056f660 CTR: > c00db550 > [11017.738173][T122073] REGS: c00374f6f980 TRAP: 0380 Tainted: G D > W (5.7.0-rc4-next-20200508+) > [11017.738234][T122073] MSR: 90009033 CR: > 22002282 XER: 2004 > [11017.738278][T122073] CFAR: c056f65c IRQMASK: 0 > [11017.738278][T122073] GPR00: c056f660 c00374f6fc10 > c1689400 c000201ffc41aa00 > [11017.738278][T122073] GPR04: c00374f6fc70 > 0001 > [11017.738278][T122073] GPR08: ffe2 > c008ee380080 > [11017.738278][T122073] GPR12: c00db550 c000201fff671280 > > [11017.738278][T122073] GPR16: 0002 10040800 > 1001ccd8 1001cc80 > [11017.738278][T122073] GPR20: 1001cc98 1001ccc8 > 1001cca8 1001cb48 > [11017.738278][T122073] GPR24: > 03ff 7fffebb67390 > [11017.738278][T122073] GPR28: c00374f6fd90 c000200c0c6a7550 > c000200c0c6a7500 > [11017.738542][T122073] NIP [c00db564] > pnv_eeh_dbgfs_get_inbB+0x14/0x30 > [11017.738579][T122073] LR [c056f660] simple_attr_read+0xa0/0x180 > [11017.738613][T122073] Call Trace: > [11017.738645][T122073] [c00374f6fc10] [c056f630] > simple_attr_read+0x70/0x180 (unreliable) > [11017.738672][T122073] [c00374f6fcb0] [c064a2e0] > full_proxy_read+0x90/0xe0 > [11017.738686][T122073] [c00374f6fd00] [c051fe0c] > __vfs_read+0x3c/0x70 > [11017.738722][T122073] [c00374f6fd20] [c051feec] > vfs_read+0xac/0x170 > [11017.738757][T122073] [c00374f6fd70] [c052034c] > ksys_read+0x7c/0x140 > [11017.738818][T122073] [c00374f6fdc0] [c0038af4] > system_call_exception+0x114/0x1e0 > [11017.738867][T122073] [c00374f6fe20] [c000c8f0] > system_call_common+0xf0/0x278 > [11017.738916][T122073] Instruction dump: > [11017.738948][T122073] 7c0004ac f9490d10 a14d0c78 3860 b14d0c7a 4e800020 > 6000 7c0802a6 > [11017.739001][T122073] 6000 e9230278 e9290028 7c0004ac > 0c09 4c00012c 3860
[PATCH 09/15, fіxed] lightnvm: stop using ->queuedata
Signed-off-by: Christoph Hellwig --- fixed the compilation in the print_ppa arguments drivers/lightnvm/core.c | 1 - drivers/lightnvm/pblk-init.c | 2 +- drivers/lightnvm/pblk.h | 2 +- 3 files changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c index db38a68abb6c0..85c5490cdfd2e 100644 --- a/drivers/lightnvm/core.c +++ b/drivers/lightnvm/core.c @@ -400,7 +400,6 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create) } tdisk->private_data = targetdata; - tqueue->queuedata = targetdata; mdts = (dev->geo.csecs >> 9) * NVM_MAX_VLBA; if (dev->geo.mdts) { diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c index 9a967a2e83dd7..bec904ec0f7c0 100644 --- a/drivers/lightnvm/pblk-init.c +++ b/drivers/lightnvm/pblk-init.c @@ -49,7 +49,7 @@ struct bio_set pblk_bio_set; static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio) { - struct pblk *pblk = q->queuedata; + struct pblk *pblk = bio->bi_disk->private_data; if (bio_op(bio) == REQ_OP_DISCARD) { pblk_discard(pblk, bio); diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h index 86ffa875bfe16..49718105bc0dc 100644 --- a/drivers/lightnvm/pblk.h +++ b/drivers/lightnvm/pblk.h @@ -1255,7 +1255,7 @@ static inline int pblk_boundary_ppa_checks(struct nvm_tgt_dev *tgt_dev, continue; } - print_ppa(tgt_dev->q->queuedata, ppa, "boundary", i); + print_ppa(tgt_dev->disk->q->private_data, ppa, "boundary", i); return 1; } -- 2.26.2
Re: ioremap() called early from pnv_pci_init_ioda_phb()
Excerpts from Oliver O'Halloran's message of May 9, 2020 6:11 pm: > On Sat, May 9, 2020 at 12:41 AM Qian Cai wrote: >> >> Booting POWER9 PowerNV has this message, >> >> "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use >> early_ioremap() instead” >> >> but use the patch below will result in leaks because it will never call >> early_iounmap() anywhere. However, it looks me it was by design that >> phb->regs mapping would be there forever where it would be used in >> pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() >> initcall too strong? > > The warning there is junk. The PHBs are setup at boot and never torn > down so we're not "leaking" the mapping. It's supposed to be there for > the lifetime of the kernel. > > That said, we could probably move the PCI setup to a point later in > boot where the normal ioremap can be be used. We would have to check > for initcalls which depend on the PHBs being setup and delay those too > though. I think it helps to unify code a bit more and take special cases out of ioremap to have all these early calls use early_ioremap. We actually do want to move these later if possible too, on radix they use memblock for page tables, and on hash they don't even set up proper kernel page tables but just bolt PTEs into the hash table. Thanks, Nick
Re: [PATCH RFC 2/4] powerpc: Add Microwatt platform
We should do that here too. Time to replace our silly UART !On 9 May 2020 6:36 pm, Alistair Popple wrote:On Saturday, 9 May 2020 5:58:57 PM AEST Nicholas Piggin wrote: > Excerpts from Paul Mackerras's message of May 9, 2020 3:02 pm: > > Microwatt is a FPGA-based implementation of the Power ISA. It > > currently only implements little-endian 64-bit mode, and does > > not (yet) support SMP. > > > > This adds a new machine type to support FPGA-based SoCs with a > > Microwatt core. > > Very cool! > > Would there be any point sharing this with the "naked metal" platform > Alistair has for booting POWER in L3 without OPAL? Or is it easy enough > to have a several different simple 64s platforms? It looks pretty similar at the moment, I've been meaning to clean those patches up and send them upstream but Paul has beaten me to it. The main difference so far is how the console is setup. For booting cache contained I was using a device tree pointing at a standard UART driver and enabling the standard OF platform device tree probing. - Alistair > I have an HPT conditional compile patch and a few other diet Kconfig > things I'll now be better justified to try get merged :) > > Thanks, > Nick
Re: [PATCH RFC 2/4] powerpc: Add Microwatt platform
Excerpts from Alistair Popple's message of May 9, 2020 6:36 pm: > On Saturday, 9 May 2020 5:58:57 PM AEST Nicholas Piggin wrote: >> Excerpts from Paul Mackerras's message of May 9, 2020 3:02 pm: >> > Microwatt is a FPGA-based implementation of the Power ISA. It >> > currently only implements little-endian 64-bit mode, and does >> > not (yet) support SMP. >> > >> > This adds a new machine type to support FPGA-based SoCs with a >> > Microwatt core. >> >> Very cool! >> >> Would there be any point sharing this with the "naked metal" platform >> Alistair has for booting POWER in L3 without OPAL? Or is it easy enough >> to have a several different simple 64s platforms? > > It looks pretty similar at the moment, I've been meaning to clean those > patches up and send them upstream but Paul has beaten me to it. The main > difference so far is how the console is setup. For booting cache contained I > was using a device tree pointing at a standard UART driver and enabling the > standard OF platform device tree probing. Well I'd only merge them if you think it makes sense. If the platform is a perfectly good abstraction for the differences and merging them would just result in painful special cases it wouldn't be worthwhile. It's clearly not a lot of code. Thanks, Nick
[PATCH fixes] powerpc/vdso32: Fallback on getres syscall when clock is unknown
There are other clocks than the standard ones, for instance per process clocks. Therefore, being above the last standard clock doesn't mean it is a bad clock. So, fallback to syscall instead of returning -EINVAL inconditionaly. Fixes: e33ffc956b08 ("powerpc/vdso32: implement clock_getres entirely") Cc: sta...@vger.kernel.org Reported-by: Aurelien Jarno Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/vdso32/gettimeofday.S | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S index a3951567118a..e7f8f9f1b3f4 100644 --- a/arch/powerpc/kernel/vdso32/gettimeofday.S +++ b/arch/powerpc/kernel/vdso32/gettimeofday.S @@ -218,11 +218,11 @@ V_FUNCTION_BEGIN(__kernel_clock_getres) blr /* -* invalid clock +* syscall fallback */ 99: - li r3, EINVAL - crset so + li r0,__NR_clock_getres + sc blr .cfi_endproc V_FUNCTION_END(__kernel_clock_getres) -- 2.25.0
Re: [PATCH 01/15] nfblock: use gendisk private_data
Hi Christoph, On Fri, May 8, 2020 at 6:16 PM Christoph Hellwig wrote: > Signed-off-by: Christoph Hellwig Thanks for your patch! > --- a/arch/m68k/emu/nfblock.c > +++ b/arch/m68k/emu/nfblock.c > @@ -61,7 +61,7 @@ struct nfhd_device { > > static blk_qc_t nfhd_make_request(struct request_queue *queue, struct bio > *bio) > { > - struct nfhd_device *dev = queue->queuedata; > + struct nfhd_device *dev = bio->bi_disk->private_data; > struct bio_vec bvec; > struct bvec_iter iter; > int dir, len, shift; > @@ -122,7 +122,6 @@ static int __init nfhd_init_one(int id, u32 blocks, u32 > bsize) > if (dev->queue == NULL) > goto free_dev; > > - dev->queue->queuedata = dev; > blk_queue_logical_block_size(dev->queue, bsize); > > dev->disk = alloc_disk(16); > @@ -136,6 +135,7 @@ static int __init nfhd_init_one(int id, u32 blocks, u32 > bsize) > sprintf(dev->disk->disk_name, "nfhd%u", dev_id); > set_capacity(dev->disk, (sector_t)blocks * (bsize / 512)); > dev->disk->queue = dev->queue; > + dev->disk->private_data = dev; This is already set above, just before the quoted sprintf() call. > > add_disk(dev->disk); Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH -next] soc: fsl: dpio: remove set but not used variable 'addr_cena'
Pls ignore this duplicate. On 2020/5/8 22:10, YueHaibing wrote: > drivers/soc/fsl/dpio//qbman-portal.c:650:11: warning: variable 'addr_cena' > set but not used [-Wunused-but-set-variable] > uint64_t addr_cena; >^ > > It is never used, so remove it. > > Signed-off-by: YueHaibing > --- > drivers/soc/fsl/dpio/qbman-portal.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/drivers/soc/fsl/dpio/qbman-portal.c > b/drivers/soc/fsl/dpio/qbman-portal.c > index e2e9fbb58a72..0ce859b60888 100644 > --- a/drivers/soc/fsl/dpio/qbman-portal.c > +++ b/drivers/soc/fsl/dpio/qbman-portal.c > @@ -647,7 +647,6 @@ int qbman_swp_enqueue_multiple_direct(struct qbman_swp *s, > const uint32_t *cl = (uint32_t *)d; > uint32_t eqcr_ci, eqcr_pi, half_mask, full_mask; > int i, num_enqueued = 0; > - uint64_t addr_cena; > > spin_lock(&s->access_spinlock); > half_mask = (s->eqcr.pi_ci_mask>>1); > @@ -700,7 +699,6 @@ int qbman_swp_enqueue_multiple_direct(struct qbman_swp *s, > > /* Flush all the cacheline without load/store in between */ > eqcr_pi = s->eqcr.pi; > - addr_cena = (size_t)s->addr_cena; > for (i = 0; i < num_enqueued; i++) > eqcr_pi++; > s->eqcr.pi = eqcr_pi & full_mask; >
Re: [PATCH v3] powerpc/64s/pgtable: fix an undefined behaviour
> On Mar 6, 2020, at 1:56 AM, Christophe Leroy wrote: > > > > Le 06/03/2020 à 05:48, Qian Cai a écrit : >> Booting a power9 server with hash MMU could trigger an undefined >> behaviour because pud_offset(p4d, 0) will do, >> 0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10) >> Fix it by converting pud_index() and friends to static inline >> functions. >> UBSAN: shift-out-of-bounds in arch/powerpc/mm/ptdump/ptdump.c:282:15 >> shift exponent 34 is too large for 32-bit type 'int' >> CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc4-next-20200303+ #13 >> Call Trace: >> dump_stack+0xf4/0x164 (unreliable) >> ubsan_epilogue+0x18/0x78 >> __ubsan_handle_shift_out_of_bounds+0x160/0x21c >> walk_pagetables+0x2cc/0x700 >> walk_pud at arch/powerpc/mm/ptdump/ptdump.c:282 >> (inlined by) walk_pagetables at arch/powerpc/mm/ptdump/ptdump.c:311 >> ptdump_check_wx+0x8c/0xf0 >> mark_rodata_ro+0x48/0x80 >> kernel_init+0x74/0x194 >> ret_from_kernel_thread+0x5c/0x74 >> Suggested-by: Christophe Leroy >> Signed-off-by: Qian Cai > > Reviewed-by: Christophe Leroy Michael, can you take a look at this patch when you have a chance? It looks falling through the cracks. > >> --- >> v3: convert pud_index() etc to static inline functions. >> v2: convert pud_offset() etc to static inline functions. >> arch/powerpc/include/asm/book3s/64/pgtable.h | 23 >> 1 file changed, 19 insertions(+), 4 deletions(-) >> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h >> b/arch/powerpc/include/asm/book3s/64/pgtable.h >> index 201a69e6a355..bd432c6706b9 100644 >> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h >> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h >> @@ -998,10 +998,25 @@ extern struct page *pgd_page(pgd_t pgd); >> #define pud_page_vaddr(pud) __va(pud_val(pud) & ~PUD_MASKED_BITS) >> #define pgd_page_vaddr(pgd) __va(pgd_val(pgd) & ~PGD_MASKED_BITS) >> -#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - >> 1)) >> -#define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1)) >> -#define pmd_index(address) (((address) >> (PMD_SHIFT)) & (PTRS_PER_PMD - 1)) >> -#define pte_index(address) (((address) >> (PAGE_SHIFT)) & (PTRS_PER_PTE - >> 1)) >> +static inline unsigned long pgd_index(unsigned long address) >> +{ >> +return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1); >> +} >> + >> +static inline unsigned long pud_index(unsigned long address) >> +{ >> +return (address >> PUD_SHIFT) & (PTRS_PER_PUD - 1); >> +} >> + >> +static inline unsigned long pmd_index(unsigned long address) >> +{ >> +return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1); >> +} >> + >> +static inline unsigned long pte_index(unsigned long address) >> +{ >> +return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1); >> +} >>/* >> * Find an entry in a page-table-directory. We combine the address region
Re: [PATCH 1/4] dma-mapping: move the remaining DMA API calls out of line
On 09/05/2020 18:19, Christoph Hellwig wrote: > On Tue, May 05, 2020 at 02:18:37PM +1000, Alexey Kardashevskiy wrote: >> >> >> On 17/04/2020 17:58, Christoph Hellwig wrote: >>> On Wed, Apr 15, 2020 at 09:21:37PM +1000, Alexey Kardashevskiy wrote: And the fact they were exported leaves possibility that there is a driver somewhere relying on these symbols or distro kernel won't build because the symbol disappeared from exports (I do not know what KABI guarantees or if mainline kernel cares). >>> >>> We absolutely do not care. In fact for abuses of APIs that drivers >>> should not use we almost care to make them private and break people >>> abusing them. >> >> ok :) >> I do not care in particular but some might, a line separated with empty lines in the commit log would do. >>> >>> I'll add a blurb for the next version. >> >> >> Has it gone anywhere? Thanks, > > I've been hoping for the sg_buf helpers to land first, as they need > backporting and would conflict. Do you urgently need the series? Nah, not that urgent. Thanks, -- Alexey
Re: remove a few uses of ->queuedata
On Sat, May 9, 2020 at 1:24 AM Christoph Hellwig wrote: > > On Fri, May 08, 2020 at 11:04:45AM -0700, Dan Williams wrote: > > On Fri, May 8, 2020 at 9:16 AM Christoph Hellwig wrote: > > > > > > Hi all, > > > > > > various bio based drivers use queue->queuedata despite already having > > > set up disk->private_data, which can be used just as easily. This > > > series cleans them up to only use a single private data pointer. > > > > ...but isn't the queue pretty much guaranteed to be cache hot and the > > gendisk cache cold? I'm not immediately seeing what else needs the > > gendisk in the I/O path. Is there another motivation I'm missing? > > ->private_data is right next to the ->queue pointer, pat0 and part_tbl > which are all used in the I/O submission path (generic_make_request / > generic_make_request_checks). This is mostly a prep cleanup patch to > also remove the pointless queue argument from ->make_request - then > ->queue is an extra dereference and extra churn. Ah ok. If the changelogs had been filled in with something like "In preparation for removing @q from make_request_fn, stop using ->queuedata", I probably wouldn't have looked twice. For the nvdimm/ driver updates you can add: Reviewed-by: Dan Williams ...or just let me know if you want me to pick those up through the nvdimm tree.
Re: ioremap() called early from pnv_pci_init_ioda_phb()
> On May 9, 2020, at 4:38 AM, Nicholas Piggin wrote: > > Your patch to use early_ioremap is faulting? I wonder why? Yes, I don’t know the reasons either. I suppose not many places in other parts of the kernel which keep using those addresses from early_ioremap() after system booted. Otherwise, we would see those leak warnings elsewhere. Thus, we probably have to audit the code, and if still necessary, call early_ioremap() and then early_iounmap() followed by a ioremap() once slab allocator is available?
Re: ioremap() called early from pnv_pci_init_ioda_phb()
Le 08/05/2020 à 19:41, Qian Cai a écrit : On May 8, 2020, at 10:39 AM, Qian Cai wrote: Booting POWER9 PowerNV has this message, "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use early_ioremap() instead” but use the patch below will result in leaks because it will never call early_iounmap() anywhere. However, it looks me it was by design that phb->regs mapping would be there forever where it would be used in pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() initcall too strong? --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -36,6 +36,7 @@ #include #include #include +#include #include @@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np, /* Get registers */ if (!of_address_to_resource(np, 0, &r)) { phb->regs_phys = r.start; - phb->regs = ioremap(r.start, resource_size(&r)); + phb->regs = early_ioremap(r.start, resource_size(&r)); if (phb->regs == NULL) pr_err(" Failed to map registers !\n”); This will also trigger a panic with debugfs reads, so isn’t that this commit bogus at least for powerpc64? d538aadc2718 (“powerpc/ioremap: warn on early use of ioremap()") No d538aadc2718 is not bogus. That's the point, we want to remove all early usages of ioremap() in order to remove the hack with the ioremap_bot stuff and all, and stick to the generic ioremap logic. In order to do so, all early use of ioremap() has to be converted to early_ioremap() or to fixmap or anything else that allows to do ioremaps before the slab is ready. early_ioremap() is for temporary mappings necessary at boottime. For long lasting mappings, another method is to be used. Now, the point is that other architectures like for instance x86 don't seem to have to use early_ioremap() much. Powerpc is for instance doing early mappings for PCI. Seems like x86 initialises PCI once slab is ready. Can't powerpc do the same ? Christophe
Re: [PATCH v8 8/8] powerpc/vdso: Provide __kernel_clock_gettime64() on vdso32
Le 28/04/2020 à 18:05, Arnd Bergmann a écrit : On Tue, Apr 28, 2020 at 3:16 PM Christophe Leroy wrote: Provides __kernel_clock_gettime64() on vdso32. This is the 64 bits version of __kernel_clock_gettime() which is y2038 compliant. Signed-off-by: Christophe Leroy Looks good to me Reviewed-by: Arnd Bergmann There was a bug on ARM for the corresponding function, so far it is unclear if this was a problem related to particular hardware, the 32-bit kernel code, or the common implementation of clock_gettime64 in the vdso library, see https://github.com/richfelker/musl-cross-make/issues/96 Just to be sure that powerpc is not affected by the same issue, can you confirm that repeatedly calling clock_gettime64 on powerpc32, alternating between vdso and syscall, results in monotically increasing times? I think that's one of the things vdsotest checks, so yes that's ok I think. Christophe
[PATCH vdsotest] Add support for clock_gettime64() on powerpc32
libc test is commented out because at the time being very few libc if any supports clock_gettime64() syscall number is hardcoded when it doesn't exists in unistd.h Signed-off-by: Christophe Leroy --- Based on master from https://github.com/nathanlynch/vdsotest.git src/clock-boottime.c | 4 + src/clock-monotonic-coarse.c | 4 + src/clock-monotonic-raw.c | 4 + src/clock-monotonic.c | 4 + src/clock-realtime-coarse.c| 4 + src/clock-realtime.c | 4 + src/clock-tai.c| 4 + src/clock_gettime64_template.c | 401 + 8 files changed, 429 insertions(+) create mode 100644 src/clock_gettime64_template.c diff --git a/src/clock-boottime.c b/src/clock-boottime.c index 9fb1ac48501d..07ef31d08614 100644 --- a/src/clock-boottime.c +++ b/src/clock-boottime.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-monotonic-coarse.c b/src/clock-monotonic-coarse.c index ca1df58691ca..da064188756f 100644 --- a/src/clock-monotonic-coarse.c +++ b/src/clock-monotonic-coarse.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-monotonic-raw.c b/src/clock-monotonic-raw.c index 5dbb1842e698..55373b94ecfd 100644 --- a/src/clock-monotonic-raw.c +++ b/src/clock-monotonic-raw.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-monotonic.c b/src/clock-monotonic.c index 44318ae1e1c2..a900d24598a1 100644 --- a/src/clock-monotonic.c +++ b/src/clock-monotonic.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-realtime-coarse.c b/src/clock-realtime-coarse.c index 8f33f9a2d30b..8f2e6242bf0d 100644 --- a/src/clock-realtime-coarse.c +++ b/src/clock-realtime-coarse.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-realtime.c b/src/clock-realtime.c index 079fd801e654..ab76329a6676 100644 --- a/src/clock-realtime.c +++ b/src/clock-realtime.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-tai.c b/src/clock-tai.c index ad0448adeba5..cb2511039ddc 100644 --- a/src/clock-tai.c +++ b/src/clock-tai.c @@ -7,3 +7,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock_gettime64_template.c b/src/clock_gettime64_template.c new file mode 100644 index ..4752845148d6 --- /dev/null +++ b/src/clock_gettime64_template.c @@ -0,0 +1,401 @@ +/* + * Copyright 2014 Mentor Graphics Corporation. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; version 2 of the + * License. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "compiler.h" +#include "vdsotest.h" + +struct timespec64 { + long long tv_sec; + long tv_nsec; +}; + +#ifndef SYS_clock_gettime64 +#define SYS_clock_gettime64 403 +#endif + +static int (*clock_gettime64_vdso)(clockid_t id, struct timespec64 *ts); + +static bool vdso_has_clock_gettime64(void) +{ + return clock_gettime64_vdso != NULL; +} + +static int clock_gettime64_syscall_wrapper(clockid_t id, struct timespec64 *ts) +{ + return syscall(SYS_clock_gettime64, id, ts); +} + +static void clock_gettime64_syscall_nofail(clockid_t id, struct timespec64 *ts) +{ + int err; + + err = clock_gettime64_syscall_wrapper(id, ts); + if (err) + error(EXIT_FAILURE, errno, "SYS_clock_gettime64"); +} + +stat
[PATCH vdsotest] Add support for clock_gettime64() on powerpc32
libc test is commented out because at the time being very few libc if any supports clock_gettime64() syscall number is hardcoded when it doesn't exists in unistd.h Signed-off-by: Christophe Leroy --- Based on master from https://github.com/nathanlynch/vdsotest.git src/clock-boottime.c | 4 + src/clock-monotonic-coarse.c | 4 + src/clock-monotonic-raw.c | 4 + src/clock-monotonic.c | 4 + src/clock-realtime-coarse.c| 4 + src/clock-realtime.c | 4 + src/clock-tai.c| 4 + src/clock_gettime64_template.c | 401 + 8 files changed, 429 insertions(+) create mode 100644 src/clock_gettime64_template.c diff --git a/src/clock-boottime.c b/src/clock-boottime.c index 9fb1ac48501d..07ef31d08614 100644 --- a/src/clock-boottime.c +++ b/src/clock-boottime.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-monotonic-coarse.c b/src/clock-monotonic-coarse.c index ca1df58691ca..da064188756f 100644 --- a/src/clock-monotonic-coarse.c +++ b/src/clock-monotonic-coarse.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-monotonic-raw.c b/src/clock-monotonic-raw.c index 5dbb1842e698..55373b94ecfd 100644 --- a/src/clock-monotonic-raw.c +++ b/src/clock-monotonic-raw.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-monotonic.c b/src/clock-monotonic.c index 44318ae1e1c2..a900d24598a1 100644 --- a/src/clock-monotonic.c +++ b/src/clock-monotonic.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-realtime-coarse.c b/src/clock-realtime-coarse.c index 8f33f9a2d30b..8f2e6242bf0d 100644 --- a/src/clock-realtime-coarse.c +++ b/src/clock-realtime-coarse.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-realtime.c b/src/clock-realtime.c index 079fd801e654..ab76329a6676 100644 --- a/src/clock-realtime.c +++ b/src/clock-realtime.c @@ -3,3 +3,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock-tai.c b/src/clock-tai.c index ad0448adeba5..cb2511039ddc 100644 --- a/src/clock-tai.c +++ b/src/clock-tai.c @@ -7,3 +7,7 @@ #include "clock_gettime_template.c" #include "clock_getres_template.c" + +#if defined(__powerpc__) && !defined(__powerpc64__) +#include "clock_gettime64_template.c" +#endif diff --git a/src/clock_gettime64_template.c b/src/clock_gettime64_template.c new file mode 100644 index ..4752845148d6 --- /dev/null +++ b/src/clock_gettime64_template.c @@ -0,0 +1,401 @@ +/* + * Copyright 2014 Mentor Graphics Corporation. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; version 2 of the + * License. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "compiler.h" +#include "vdsotest.h" + +struct timespec64 { + long long tv_sec; + long tv_nsec; +}; + +#ifndef SYS_clock_gettime64 +#define SYS_clock_gettime64 403 +#endif + +static int (*clock_gettime64_vdso)(clockid_t id, struct timespec64 *ts); + +static bool vdso_has_clock_gettime64(void) +{ + return clock_gettime64_vdso != NULL; +} + +static int clock_gettime64_syscall_wrapper(clockid_t id, struct timespec64 *ts) +{ + return syscall(SYS_clock_gettime64, id, ts); +} + +static void clock_gettime64_syscall_nofail(clockid_t id, struct timespec64 *ts) +{ + int err; + + err = clock_gettime64_syscall_wrapper(id, ts); + if (err) + error(EXIT_FAILURE, errno, "SYS_clock_gettime64"); +} + +stat
[PATCH fixes] powerpc/40x: Make more space for system call exception
When CONFIG_VIRT_CPU_ACCOUNTING is selected, system call exception handler doesn't fit below 0xd00 and build fails. As exception 0xd00 doesn't exist and is never generated by 40x, comment it out in order to get more space for system call exception. Reported-by: kbuild test robot Fixes: 9e27086292aa ("powerpc/32: Warn and return ENOSYS on syscalls from kernel") Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_40x.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S index 9bb663977e84..2cec543c38f0 100644 --- a/arch/powerpc/kernel/head_40x.S +++ b/arch/powerpc/kernel/head_40x.S @@ -344,8 +344,9 @@ _ENTRY(saved_ksp_limit) /* 0x0C00 - System Call Exception */ START_EXCEPTION(0x0C00, SystemCall) SYSCALL_ENTRY 0xc00 +/* Trap_0D is commented out to get more space for system call exception */ - EXCEPTION(0x0D00, Trap_0D, unknown_exception, EXC_XFER_STD) +/* EXCEPTION(0x0D00, Trap_0D, unknown_exception, EXC_XFER_STD) */ EXCEPTION(0x0E00, Trap_0E, unknown_exception, EXC_XFER_STD) EXCEPTION(0x0F00, Trap_0F, unknown_exception, EXC_XFER_STD) -- 2.25.0
Re: [PATCH v8 8/8] powerpc/vdso: Provide __kernel_clock_gettime64() on vdso32
Le 09/05/2020 à 17:54, Christophe Leroy a écrit : Le 28/04/2020 à 18:05, Arnd Bergmann a écrit : On Tue, Apr 28, 2020 at 3:16 PM Christophe Leroy wrote: Provides __kernel_clock_gettime64() on vdso32. This is the 64 bits version of __kernel_clock_gettime() which is y2038 compliant. Signed-off-by: Christophe Leroy Looks good to me Reviewed-by: Arnd Bergmann There was a bug on ARM for the corresponding function, so far it is unclear if this was a problem related to particular hardware, the 32-bit kernel code, or the common implementation of clock_gettime64 in the vdso library, see https://github.com/richfelker/musl-cross-make/issues/96 Just to be sure that powerpc is not affected by the same issue, can you confirm that repeatedly calling clock_gettime64 on powerpc32, alternating between vdso and syscall, results in monotically increasing times? I think that's one of the things vdsotest checks, so yes that's ok I think. Here is the full result with vdsotest: gettimeofday: syscall: 3715 nsec/call gettimeofday:libc: 794 nsec/call gettimeofday:vdso: 947 nsec/call getcpu: syscall: 1614 nsec/call getcpu:libc: 484 nsec/call getcpu:vdso: 184 nsec/call clock-gettime64-realtime-coarse: syscall: 3152 nsec/call clock-gettime64-realtime-coarse:libc: not tested clock-gettime64-realtime-coarse:vdso: 653 nsec/call clock-getres-realtime-coarse: syscall: 2963 nsec/call clock-getres-realtime-coarse:libc: 745 nsec/call clock-getres-realtime-coarse:vdso: 553 nsec/call clock-gettime-realtime-coarse: syscall: 5120 nsec/call clock-gettime-realtime-coarse:libc: 731 nsec/call clock-gettime-realtime-coarse:vdso: 577 nsec/call clock-gettime64-realtime: syscall: 3719 nsec/call clock-gettime64-realtime:libc: not tested clock-gettime64-realtime:vdso: 976 nsec/call clock-getres-realtime: syscall: 2496 nsec/call clock-getres-realtime:libc: 745 nsec/call clock-getres-realtime:vdso: 546 nsec/call clock-gettime-realtime: syscall: 4800 nsec/call clock-gettime-realtime:libc: 1080 nsec/call clock-gettime-realtime:vdso: 1798 nsec/call clock-gettime64-boottime: syscall: 4132 nsec/call clock-gettime64-boottime:libc: not tested clock-gettime64-boottime:vdso: 975 nsec/call clock-getres-boottime: syscall: 2497 nsec/call clock-getres-boottime:libc: 730 nsec/call clock-getres-boottime:vdso: 546 nsec/call clock-gettime-boottime: syscall: 3728 nsec/call clock-gettime-boottime:libc: 1079 nsec/call clock-gettime-boottime:vdso: 941 nsec/call clock-gettime64-tai: syscall: 4148 nsec/call clock-gettime64-tai:libc: not tested clock-gettime64-tai:vdso: 955 nsec/call clock-getres-tai: syscall: 2494 nsec/call clock-getres-tai:libc: 730 nsec/call clock-getres-tai:vdso: 545 nsec/call clock-gettime-tai: syscall: 3729 nsec/call clock-gettime-tai:libc: 1079 nsec/call clock-gettime-tai:vdso: 927 nsec/call clock-gettime64-monotonic-raw: syscall: 3677 nsec/call clock-gettime64-monotonic-raw:libc: not tested clock-gettime64-monotonic-raw:vdso: 1032 nsec/call clock-getres-monotonic-raw: syscall: 2494 nsec/call clock-getres-monotonic-raw:libc: 745 nsec/call clock-getres-monotonic-raw:vdso: 545 nsec/call clock-gettime-monotonic-raw: syscall: 3276 nsec/call clock-gettime-monotonic-raw:libc: 1140 nsec/call clock-gettime-monotonic-raw:vdso: 1002 nsec/call clock-gettime64-monotonic-coarse: syscall: 4099 nsec/call clock-gettime64-monotonic-coarse:libc: not tested clock-gettime64-monotonic-coarse:vdso: 653 nsec/call clock-getres-monotonic-coarse: syscall: 2962 nsec/call clock-getres-monotonic-coarse:libc: 745 nsec/call clock-getres-monotonic-coarse:vdso: 545 nsec/call clock-gettime-monotonic-coarse: syscall: 4297 nsec/call clock-gettime-monotonic-coarse:libc: 730 nsec/call clock-gettime-monotonic-coarse:vdso: 592 nsec/call clock-gettime64-monotonic: syscall: 3863 nsec/call clock-gettime64-monotonic:libc: not tested clock-gettime64-monotonic:vdso: 975 nsec/call clock-getres-monotonic: syscall: 2494 nsec/call clock-getres-monotonic:libc: 745 nsec/call clock-getres-monotonic:vdso: 545 nsec/call clock-gettime-monotonic: syscall: 3465 nsec/call clock-gettime-monotonic:libc: 1079 nsec/call clock-gettime-monotonic:vdso: 927 nsec/call Christophe
[PATCH v2 4/9] drivers/ps3: Remove duplicate error messages
From: Markus Elfring Remove duplicate memory allocation failure error messages. Signed-off-by: Markus Elfring Signed-off-by: Geoff Levand --- drivers/ps3/ps3-lpm.c | 2 -- drivers/ps3/ps3-vuart.c | 1 - 2 files changed, 3 deletions(-) diff --git a/drivers/ps3/ps3-lpm.c b/drivers/ps3/ps3-lpm.c index 83c45659bc9d..fcc346296e3a 100644 --- a/drivers/ps3/ps3-lpm.c +++ b/drivers/ps3/ps3-lpm.c @@ -,8 +,6 @@ int ps3_lpm_open(enum ps3_lpm_tb_type tb_type, void *tb_cache, lpm_priv->tb_cache_internal = kzalloc( lpm_priv->tb_cache_size + 127, GFP_KERNEL); if (!lpm_priv->tb_cache_internal) { - dev_err(sbd_core(), "%s:%u: alloc internal tb_cache " - "failed\n", __func__, __LINE__); result = -ENOMEM; goto fail_malloc; } diff --git a/drivers/ps3/ps3-vuart.c b/drivers/ps3/ps3-vuart.c index ddaa5ea5801a..3b5878edc6c2 100644 --- a/drivers/ps3/ps3-vuart.c +++ b/drivers/ps3/ps3-vuart.c @@ -917,7 +917,6 @@ static int ps3_vuart_bus_interrupt_get(void) vuart_bus_priv.bmp = kzalloc(sizeof(struct ports_bmp), GFP_KERNEL); if (!vuart_bus_priv.bmp) { - pr_debug("%s:%d: kzalloc failed.\n", __func__, __LINE__); result = -ENOMEM; goto fail_bmp_malloc; } -- 2.20.1
[PATCH v2 5/9] net/ps3_gelic_net: Remove duplicate error message
From: Markus Elfring Remove an extra message for a memory allocation failure in function gelic_descr_prepare_rx(). Signed-off-by: Markus Elfring Signed-off-by: Geoff Levand --- drivers/net/ethernet/toshiba/ps3_gelic_net.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c index 070dd6fa9401..8522f3898e0d 100644 --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c @@ -382,8 +382,6 @@ static int gelic_descr_prepare_rx(struct gelic_card *card, descr->skb = dev_alloc_skb(bufsize + GELIC_NET_RXBUF_ALIGN - 1); if (!descr->skb) { descr->buf_addr = 0; /* tell DMAC don't touch memory */ - dev_info(ctodev(card), -"%s:allocate skb failed !!\n", __func__); return -ENOMEM; } descr->buf_size = cpu_to_be32(bufsize); -- 2.20.1
[PATCH v2 6/9] ps3disk: use the default segment boundary
From: Emmanuel Nicolet Hi, since commit dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count"), the kernel will bug_on on the PS3 because bio_split() is called with sectors == 0: kernel BUG at block/bio.c:1853! Oops: Exception in kernel mode, sig: 5 [#1] BE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=8 NUMA PS3 Modules linked in: firewire_sbp2 rtc_ps3(+) soundcore ps3_gelic(+) \ ps3rom(+) firewire_core ps3vram(+) usb_common crc_itu_t CPU: 0 PID: 97 Comm: blkid Not tainted 5.3.0-rc4 #1 NIP: c027d0d0 LR: c027d0b0 CTR: REGS: c135ae90 TRAP: 0700 Not tainted (5.3.0-rc4) MSR: 80028032 CR: 44008240 XER: 2000 IRQMASK: 0 GPR00: c0289368 c135b120 c084a500 c4ff8300 GPR04: 0c00 c4c905e0 c4c905e0 GPR08: 0001 GPR12: c08ef000 003e 00080001 GPR16: 0100 0004 GPR20: c062fd7e 0001 0080 GPR24: c0781788 c135b350 0080 c4c905e0 GPR28: c135b348 c4ff8300 c4c9 NIP [c027d0d0] .bio_split+0x28/0xac LR [c027d0b0] .bio_split+0x8/0xac Call Trace: [c135b120] [c027d130] .bio_split+0x88/0xac (unreliable) [c135b1b0] [c0289368] .__blk_queue_split+0x11c/0x53c [c135b2d0] [c028f614] .blk_mq_make_request+0x80/0x7d4 [c135b3d0] [c0283a8c] .generic_make_request+0x118/0x294 [c135b4b0] [c0283d34] .submit_bio+0x12c/0x174 [c135b580] [c0205a44] .mpage_bio_submit+0x3c/0x4c [c135b600] [c0206184] .mpage_readpages+0xa4/0x184 [c135b750] [c01ff8fc] .blkdev_readpages+0x24/0x38 [c135b7c0] [c01589f0] .read_pages+0x6c/0x1a8 [c135b8b0] [c0158c74] .__do_page_cache_readahead+0x118/0x184 [c135b9b0] [c01591a8] .force_page_cache_readahead+0xe4/0xe8 [c135ba50] [c014fc24] .generic_file_read_iter+0x1d8/0x830 [c135bb50] [c01ffadc] .blkdev_read_iter+0x40/0x5c [c135bbc0] [c01b9e00] .new_sync_read+0x144/0x1a0 [c135bcd0] [c01bc454] .vfs_read+0xa0/0x124 [c135bd70] [c01bc7a4] .ksys_read+0x70/0xd8 [c135be20] [c000a524] system_call+0x5c/0x70 Instruction dump: 7fe3fb78 482e30dc 7c0802a6 482e3085 7c9e2378 f821ff71 7ca42b78 7d3e00d0 7c7d1b78 79290fe0 7cc53378 69290001 <0b09> 81230028 7bca0020 7929ba62 [ end trace 313fec760f30aa1f ]--- The problem originates from setting the segment boundary of the request queue to -1UL. This makes get_max_segment_size() return zero when offset is zero, whatever the max segment size. The test with BLK_SEG_BOUNDARY_MASK fails and 'mask - (mask & offset) + 1' overflows to zero in the return statement. Not setting the segment boundary and using the default value (BLK_SEG_BOUNDARY_MASK) fixes the problem. Maybe BLK_SEG_BOUNDARY_MASK should be set to -1UL? It's currently set to only 0xUL. I don't know if that would break anything. Signed-off-by: Emmanuel Nicolet Signed-off-by: Geoff Levand --- drivers/block/ps3disk.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c index c5c6487a19d5..7b55811c2a81 100644 --- a/drivers/block/ps3disk.c +++ b/drivers/block/ps3disk.c @@ -454,7 +454,6 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev) queue->queuedata = dev; blk_queue_max_hw_sectors(queue, dev->bounce_size >> 9); - blk_queue_segment_boundary(queue, -1UL); blk_queue_dma_alignment(queue, dev->blk_size-1); blk_queue_logical_block_size(queue, dev->blk_size); -- 2.20.1
[PATCH v2 3/9] powerpc/head_check: Avoid broken pipe
Remove the '-m4' option to grep to allow grep to process all of nm's output. This avoids the nm warning: nm terminated with signal 13 [Broken pipe] Signed-off-by: Geoff Levand --- arch/powerpc/tools/head_check.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/tools/head_check.sh b/arch/powerpc/tools/head_check.sh index 37061fb9b58e..e32d3162e5ed 100644 --- a/arch/powerpc/tools/head_check.sh +++ b/arch/powerpc/tools/head_check.sh @@ -46,7 +46,7 @@ nm="$1" vmlinux="$2" # gcc-4.6-era toolchain make _stext an A (absolute) symbol rather than T -$nm "$vmlinux" | grep -e " [TA] _stext$" -e " t start_first_256B$" -e " a text_start$" -e " t start_text$" -m4 > .tmp_symbols.txt +$nm "$vmlinux" | grep -e " [TA] _stext$" -e " t start_first_256B$" -e " a text_start$" -e " t start_text$" > .tmp_symbols.txt vma=$(cat .tmp_symbols.txt | grep -e " [TA] _stext$" | cut -d' ' -f1) -- 2.20.1
[PATCH v2 2/9] powerpc/wrapper: Output linker map file
To aid debugging wrapper troubles, output a linker map file 'wrapper.map' when the build is verbose. Signed-off-by: Geoff Levand --- arch/powerpc/boot/wrapper | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper index ed6266367bc0..35ace40d9fc2 100755 --- a/arch/powerpc/boot/wrapper +++ b/arch/powerpc/boot/wrapper @@ -29,6 +29,7 @@ set -e # Allow for verbose output if [ "$V" = 1 ]; then set -x +map="-Map wrapper.map" fi # defaults @@ -500,7 +501,7 @@ if [ "$platform" != "miboot" ]; then text_start="-Ttext $link_address" fi #link everything -${CROSS}ld -m $format -T $lds $text_start $pie $nodl -o "$ofile" \ +${CROSS}ld -m $format -T $lds $text_start $pie $nodl -o "$ofile" $map \ $platformo $tmp $object/wrapper.a rm $tmp fi -- 2.20.1
[PATCH v2 8/9] powerpc/ps3: Fix kexec shutdown hang
The ps3_mm_region_destroy() and ps3_mm_vas_destroy() routines are called very late in the shutdown via kexec's mmu_cleanup_all routine. By the time mmu_cleanup_all runs it is too late to use udbg_printf, and calling it will cause PS3 systems to hang. Remove all debugging statements from ps3_mm_region_destroy() and ps3_mm_vas_destroy() and replace any error reporting with calls to lv1_panic. With this change builds with 'DEBUG' defined will not cause kexec reboots to hang, and builds with 'DEBUG' defined or not will end in lv1_panic if an error is encountered. Signed-off-by: Geoff Levand --- arch/powerpc/platforms/ps3/mm.c | 22 -- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c index 423be34f0f5f..f42fe4e86ce5 100644 --- a/arch/powerpc/platforms/ps3/mm.c +++ b/arch/powerpc/platforms/ps3/mm.c @@ -200,13 +200,14 @@ void ps3_mm_vas_destroy(void) { int result; - DBG("%s:%d: map.vas_id= %llu\n", __func__, __LINE__, map.vas_id); - if (map.vas_id) { result = lv1_select_virtual_address_space(0); - BUG_ON(result); - result = lv1_destruct_virtual_address_space(map.vas_id); - BUG_ON(result); + result += lv1_destruct_virtual_address_space(map.vas_id); + + if (result) { + lv1_panic(0); + } + map.vas_id = 0; } } @@ -304,19 +305,20 @@ static void ps3_mm_region_destroy(struct mem_region *r) int result; if (!r->destroy) { - pr_info("%s:%d: Not destroying high region: %llxh %llxh\n", - __func__, __LINE__, r->base, r->size); return; } - DBG("%s:%d: r->base = %llxh\n", __func__, __LINE__, r->base); - if (r->base) { result = lv1_release_memory(r->base); - BUG_ON(result); + + if (result) { + lv1_panic(0); + } + r->size = r->base = r->offset = 0; map.total = map.rm.size; } + ps3_mm_set_repository_highmem(NULL); } -- 2.20.1
[PATCH v2 9/9] hvc_console: Allow backends to set I/O buffer size
To allow HVC backends to set the I/O buffer sizes to values that are most efficient for the backend, change the macro definitions where the buffer sizes are set to be conditional on whether or not the macros are already defined. Also, rename the macros from N_OUTBUF to HVC_N_OUBUF and from N_INBUF to HVC_N_INBUF. Typical usage in the backend source file would be: Signed-off-by: Geoff Levand --- drivers/tty/hvc/hvc_console.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 436cc51c92c3..2928bad057fc 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -42,12 +42,15 @@ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ /* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. + * These default sizes are most efficient for vio, because they are + * the native transfer size. */ -#define N_OUTBUF 16 -#define N_INBUF16 +#if !defined(HVC_N_OUTBUF) +# define HVC_N_OUTBUF 16 +#endif +#if !defined(HVC_N_INBUF) +# define HVC_N_INBUF 16 +#endif #define __ALIGNED__ __attribute__((__aligned__(sizeof(long @@ -151,7 +154,7 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char c[HVC_N_OUTBUF] __ALIGNED__; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; @@ -640,7 +643,7 @@ static int __hvc_poll(struct hvc_struct *hp, bool may_sleep) { struct tty_struct *tty; int i, n, count, poll_mask = 0; - char buf[N_INBUF] __ALIGNED__; + char buf[HVC_N_INBUF] __ALIGNED__; unsigned long flags; int read_total = 0; int written_total = 0; @@ -681,7 +684,7 @@ static int __hvc_poll(struct hvc_struct *hp, bool may_sleep) read_again: /* Read data if any */ - count = tty_buffer_request_room(&hp->port, N_INBUF); + count = tty_buffer_request_room(&hp->port, HVC_N_INBUF); /* If flip is full, just reschedule a later read */ if (count == 0) { -- 2.20.1
[PATCH v2 1/9] powerpc/head_check: Automatic verbosity
To aid debugging build problems turn on shell tracing for the head_check script when the build is verbose. Signed-off-by: Geoff Levand --- arch/powerpc/tools/head_check.sh | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/tools/head_check.sh b/arch/powerpc/tools/head_check.sh index ad9e57209aa4..37061fb9b58e 100644 --- a/arch/powerpc/tools/head_check.sh +++ b/arch/powerpc/tools/head_check.sh @@ -31,8 +31,10 @@ # level entry code (boot, interrupt vectors, etc) until r2 is set up. This # could cause the kernel to die in early boot. -# Turn this on if you want more debug output: -# set -x +# Allow for verbose output +if [ "$V" = "1" ]; then + set -x +fi if [ $# -lt 2 ]; then echo "$0 [path to nm] [path to vmlinux]" 1>&2 -- 2.20.1
[PATCH v2 0/9] powerpc + ps3 patches
Hi Michael, This is a combined V2 of the two patch sets I sent out on March 27th, 'PS3 patches for v5.7' and 'powerpc: Minor updates to improve build debugging'. I've dropped these two patches that were in my 'PS3 patches for v5.7' set: powerpc/ps3: Add lv1_panic powerpc/ps3: Add udbg_panic and replaced them with a new patch: powerpc/ps3: Fix kexec shutdown hang There is new patch I've added in this set: hvc_console: Allow backends to set I/O buffer size which I've been using in my debugging. There is nothing using this feature in the upstream kernel, and I don't plan to submit anything that would use it, so use your discretion as to merge it. -Geoff The following changes since commit 0e698dfa282211e414076f9dc7e83c1c288314fd: Linux 5.7-rc4 (2020-05-03 14:56:04 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/geoff/ps3-linux.git for-merge-powerpc-v2 for you to fetch changes up to 6f6294df663a53f47bb28abcbb1ef756c6a59922: hvc_console: Allow backends to set I/O buffer size (2020-05-09 11:24:42 -0700) Emmanuel Nicolet (1): ps3disk: use the default segment boundary Geoff Levand (6): powerpc/head_check: Automatic verbosity powerpc/wrapper: Output linker map file powerpc/head_check: Avoid broken pipe powerpc/ps3: Add check for otheros image size powerpc/ps3: Fix kexec shutdown hang hvc_console: Allow backends to set I/O buffer size Markus Elfring (2): drivers/ps3: Remove duplicate error messages net/ps3_gelic_net: Remove duplicate error message arch/powerpc/boot/wrapper| 20 +--- arch/powerpc/platforms/ps3/mm.c | 22 -- arch/powerpc/tools/head_check.sh | 8 +--- drivers/block/ps3disk.c | 1 - drivers/net/ethernet/toshiba/ps3_gelic_net.c | 2 -- drivers/ps3/ps3-lpm.c| 2 -- drivers/ps3/ps3-vuart.c | 1 - drivers/tty/hvc/hvc_console.c| 19 +++ 8 files changed, 45 insertions(+), 30 deletions(-) -- 2.20.1
[PATCH v2 7/9] powerpc/ps3: Add check for otheros image size
The ps3's otheros flash loader has a size limit of 16 MiB for the uncompressed image. If that limit will be reached output the flash image file as 'otheros-too-big.bld'. Signed-off-by: Geoff Levand --- arch/powerpc/boot/wrapper | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper index 35ace40d9fc2..ab1e3ddc79f3 100755 --- a/arch/powerpc/boot/wrapper +++ b/arch/powerpc/boot/wrapper @@ -571,7 +571,20 @@ ps3) count=$overlay_size bs=1 odir="$(dirname "$ofile.bin")" -rm -f "$odir/otheros.bld" -gzip -n --force -9 --stdout "$ofile.bin" > "$odir/otheros.bld" + +# The ps3's flash loader has a size limit of 16 MiB for the uncompressed +# image. If a compressed image that exceeded this limit is written to +# flash the loader will decompress that image until the 16 MiB limit is +# reached, then enter the system reset vector of the partially decompressed +# image. No warning is issued. +rm -f "$odir"/{otheros,otheros-too-big}.bld +size=$(${CROSS}nm --no-sort --radix=d "$ofile" | egrep ' _end$' | cut -d' ' -f1) +bld="otheros.bld" +if [ $size -gt $((0x100)) ]; then +bld="otheros-too-big.bld" +echo " INFO: Uncompressed kernel is too large to program into PS3 flash memory;" \ +"size=0x$(printf "%x\n" $size), limit=0x100." +fi +gzip -n --force -9 --stdout "$ofile.bin" > "$odir/$bld" ;; esac -- 2.20.1
Re: ioremap() called early from pnv_pci_init_ioda_phb()
On Sun, May 10, 2020 at 1:51 AM Christophe Leroy wrote: > > > > Le 08/05/2020 à 19:41, Qian Cai a écrit : > > > > > >> On May 8, 2020, at 10:39 AM, Qian Cai wrote: > >> > >> Booting POWER9 PowerNV has this message, > >> > >> "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use > >> early_ioremap() instead” > >> > >> but use the patch below will result in leaks because it will never call > >> early_iounmap() anywhere. However, it looks me it was by design that > >> phb->regs mapping would be there forever where it would be used in > >> pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() > >> initcall too strong? > >> > >> --- a/arch/powerpc/platforms/powernv/pci-ioda.c > >> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c > >> @@ -36,6 +36,7 @@ > >> #include > >> #include > >> #include > >> +#include > >> > >> #include > >> > >> @@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct > >> device_node *np, > >> /* Get registers */ > >> if (!of_address_to_resource(np, 0, &r)) { > >> phb->regs_phys = r.start; > >> - phb->regs = ioremap(r.start, resource_size(&r)); > >> + phb->regs = early_ioremap(r.start, resource_size(&r)); > >> if (phb->regs == NULL) > >> pr_err(" Failed to map registers !\n”); > > > > This will also trigger a panic with debugfs reads, so isn’t that this > > commit bogus at least for powerpc64? > > > > d538aadc2718 (“powerpc/ioremap: warn on early use of ioremap()") > > No d538aadc2718 is not bogus. That's the point, we want to remove all > early usages of ioremap() in order to remove the hack with the > ioremap_bot stuff and all, and stick to the generic ioremap logic. > > In order to do so, all early use of ioremap() has to be converted to > early_ioremap() or to fixmap or anything else that allows to do ioremaps > before the slab is ready. > > early_ioremap() is for temporary mappings necessary at boottime. For > long lasting mappings, another method is to be used. > > Now, the point is that other architectures like for instance x86 don't > seem to have to use early_ioremap() much. Powerpc is for instance doing > early mappings for PCI. Seems like x86 initialises PCI once slab is > ready. Can't powerpc do the same ? Patches welcome.
Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
On 2020-05-06 02:48, Greg KH wrote: On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote: Potentially, hvc_open() can be called in parallel when two tasks calls open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add() callback in the function fails, where it sets the tty->driver_data to NULL, the parallel hvc_open() can see this NULL and cause a memory abort. Hence, serialize hvc_open and check if tty->private_data is NULL before proceeding ahead. The issue can be easily reproduced by launching two tasks simultaneously that does nothing but open() and close() on /dev/hvcX. For example: $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 & Signed-off-by: Raghavendra Rao Ananta --- drivers/tty/hvc/hvc_console.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 436cc51c92c3..ebe26fe5ac09 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs); */ static DEFINE_MUTEX(hvc_structs_mutex); +/* Mutex to serialize hvc_open */ +static DEFINE_MUTEX(hvc_open_mutex); /* * This value is used to assign a tty->index value to a hvc_struct based * upon order of exposure via hvc_probe(), when we can not match it to @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver *driver, struct tty_struct *tty) */ static int hvc_open(struct tty_struct *tty, struct file * filp) { - struct hvc_struct *hp = tty->driver_data; + struct hvc_struct *hp; unsigned long flags; int rc = 0; + mutex_lock(&hvc_open_mutex); + + hp = tty->driver_data; + if (!hp) { + rc = -EIO; + goto out; + } + spin_lock_irqsave(&hp->port.lock, flags); /* Check and then increment for fast path open. */ if (hp->port.count++ > 0) { spin_unlock_irqrestore(&hp->port.lock, flags); hvc_kick(); - return 0; + goto out; } /* else count == 0 */ spin_unlock_irqrestore(&hp->port.lock, flags); Wait, why isn't this driver just calling tty_port_open() instead of trying to open-code all of this? Keeping a single mutext for open will not protect it from close, it will just slow things down a bit. There should already be a tty lock held by the tty core for open() to keep it from racing things, right? The tty lock should have been held, but not likely across ->install() and ->open() callbacks, thus resulting in a race between hvc_install() and hvc_open(), where hvc_install() sets a data and the hvc_open() clears it. hvc_open() doesn't check if the data was set to NULL and proceeds. Try just removing all of this logic and replacing it with a call to tty_port_open() and see if that fixes this issue. As "proof" of this, I don't see other serial drivers needing a single mutex for their open calls, do you? thanks, greg k-h Thank you. Raghavendra
[Bug 207359] MegaRAID SAS 9361 controller hang/reset
https://bugzilla.kernel.org/show_bug.cgi?id=207359 --- Comment #3 from Cameron (c...@neo-zeon.de) --- Created attachment 289041 --> https://bugzilla.kernel.org/attachment.cgi?id=289041&action=edit 5.6.11 megaraid POWER hang Still happens with 5.6.11. There seems to be potentially a bit more output this time, and I've included output from shutting down too in case it's useful. -- You are receiving this mail because: You are watching the assignee of the bug.
[PATCH] powerpc/powernv/pci: fix a RCU-list lock
It is unsafe to traverse tbl->it_group_list without the RCU read lock. WARNING: suspicious RCU usage 5.7.0-rc4-next-20200508 #1 Not tainted - arch/powerpc/platforms/powernv/pci-ioda-tce.c:355 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by qemu-kvm/4305: #0: c00bc3fe6988 (&container->group_lock){}-{3:3}, at: vfio_fops_unl_ioctl+0x108/0x410 [vfio] #1: c0080fcc7400 (&vfio.iommu_drivers_lock){+.+.}-{3:3}, at: vfio_fops_unl_ioctl+0x148/0x410 [vfio] #2: c00bc3fe4d68 (&container->lock){+.+.}-{3:3}, at: tce_iommu_attach_group+0x3c/0x4f0 [vfio_iommu_spapr_tce] stack backtrace: CPU: 4 PID: 4305 Comm: qemu-kvm Not tainted 5.7.0-rc4-next-20200508 #1 Call Trace: [c010f29afa60] [c07154c8] dump_stack+0xfc/0x174 (unreliable) [c010f29afab0] [c01d8ff0] lockdep_rcu_suspicious+0x140/0x164 [c010f29afb30] [c00dae2c] pnv_pci_unlink_table_and_group+0x11c/0x200 [c010f29afb70] [c00d4a34] pnv_pci_ioda2_unset_window+0xc4/0x190 [c010f29afbf0] [c00d4b4c] pnv_ioda2_take_ownership+0x4c/0xd0 [c010f29afc30] [c0080fd60ee0] tce_iommu_attach_group+0x2c8/0x4f0 [vfio_iommu_spapr_tce] [c010f29afcd0] [c0080fcc11a0] vfio_fops_unl_ioctl+0x238/0x410 [vfio] [c010f29afd50] [c05430a8] ksys_ioctl+0xd8/0x130 [c010f29afda0] [c0543128] sys_ioctl+0x28/0x40 [c010f29afdc0] [c0038af4] system_call_exception+0x114/0x1e0 [c010f29afe20] [c000c8f0] system_call_common+0xf0/0x278 Signed-off-by: Qian Cai --- arch/powerpc/platforms/powernv/pci-ioda-tce.c | 4 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c b/arch/powerpc/platforms/powernv/pci-ioda-tce.c index 5dc6847d5f4c..6be9cf292b4e 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda-tce.c +++ b/arch/powerpc/platforms/powernv/pci-ioda-tce.c @@ -352,6 +352,8 @@ void pnv_pci_unlink_table_and_group(struct iommu_table *tbl, /* Remove link to a group from table's list of attached groups */ found = false; + + rcu_read_lock(); list_for_each_entry_rcu(tgl, &tbl->it_group_list, next) { if (tgl->table_group == table_group) { list_del_rcu(&tgl->next); @@ -360,6 +362,8 @@ void pnv_pci_unlink_table_and_group(struct iommu_table *tbl, break; } } + rcu_read_unlock(); + if (WARN_ON(!found)) return; -- 2.21.0 (Apple Git-122.2)
[PATCH] powerpc/mm/book3s64/iommu: fix some RCU-list locks
It is safe to traverse mm->context.iommu_group_mem_list with either mem_list_mutex or the RCU read lock held. Silence a few RCU-list false positive warnings and fix a few missing RCU read locks. arch/powerpc/mm/book3s64/iommu_api.c:330 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by qemu-kvm/4305: #0: c00bc3fe4d68 (&container->lock){+.+.}-{3:3}, at: tce_iommu_ioctl.part.9+0xc7c/0x1870 [vfio_iommu_spapr_tce] #1: c1501910 (mem_list_mutex){+.+.}-{3:3}, at: mm_iommu_get+0x50/0x190 arch/powerpc/mm/book3s64/iommu_api.c:132 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by qemu-kvm/4305: #0: c00bc3fe4d68 (&container->lock){+.+.}-{3:3}, at: tce_iommu_ioctl.part.9+0xc7c/0x1870 [vfio_iommu_spapr_tce] #1: c1501910 (mem_list_mutex){+.+.}-{3:3}, at: mm_iommu_do_alloc+0x120/0x5f0 arch/powerpc/mm/book3s64/iommu_api.c:292 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by qemu-kvm/4312: #0: c00ecafe23c8 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0xdc/0x950 [kvm] #1: c00045e6c468 (&kvm->srcu){}-{0:0}, at: kvmppc_h_put_tce+0x88/0x340 [kvm] arch/powerpc/mm/book3s64/iommu_api.c:424 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by qemu-kvm/4312: #0: c00ecafe23c8 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0xdc/0x950 [kvm] #1: c00045e6c468 (&kvm->srcu){}-{0:0}, at: kvmppc_h_put_tce+0x88/0x340 [kvm] Signed-off-by: Qian Cai --- arch/powerpc/mm/book3s64/iommu_api.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c index fa05bbd1f682..bf0108b6f445 100644 --- a/arch/powerpc/mm/book3s64/iommu_api.c +++ b/arch/powerpc/mm/book3s64/iommu_api.c @@ -129,7 +129,8 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, mutex_lock(&mem_list_mutex); - list_for_each_entry_rcu(mem2, &mm->context.iommu_group_mem_list, next) { + list_for_each_entry_rcu(mem2, &mm->context.iommu_group_mem_list, next, + lockdep_is_held(&mem_list_mutex)) { /* Overlap? */ if ((mem2->ua < (ua + (entries << PAGE_SHIFT))) && (ua < (mem2->ua + @@ -289,6 +290,7 @@ struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct *mm, { struct mm_iommu_table_group_mem_t *mem, *ret = NULL; + rcu_read_lock(); list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list, next) { if ((mem->ua <= ua) && (ua + size <= mem->ua + @@ -297,6 +299,7 @@ struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct *mm, break; } } + rcu_read_unlock(); return ret; } @@ -327,7 +330,8 @@ struct mm_iommu_table_group_mem_t *mm_iommu_get(struct mm_struct *mm, mutex_lock(&mem_list_mutex); - list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list, next) { + list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list, next, + lockdep_is_held(&mem_list_mutex)) { if ((mem->ua == ua) && (mem->entries == entries)) { ret = mem; ++mem->used; @@ -421,6 +425,7 @@ bool mm_iommu_is_devmem(struct mm_struct *mm, unsigned long hpa, struct mm_iommu_table_group_mem_t *mem; unsigned long end; + rcu_read_lock(); list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list, next) { if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) continue; @@ -437,6 +442,7 @@ bool mm_iommu_is_devmem(struct mm_struct *mm, unsigned long hpa, return true; } } + rcu_read_unlock(); return false; } -- 2.21.0 (Apple Git-122.2)
[PATCH] powerpc/kvm/book3s64/vio: fix some RCU-list locks
It is unsafe to traverse kvm->arch.spapr_tce_tables and stt->iommu_tables without the RCU read lock held. Also, add cond_resched_rcu() in places with the RCU read lock held that could take a while to finish. arch/powerpc/kvm/book3s_64_vio.c:76 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by qemu-kvm/4265. stack backtrace: CPU: 96 PID: 4265 Comm: qemu-kvm Not tainted 5.7.0-rc4-next-20200508+ #2 Call Trace: [c000201a8690f720] [c0715948] dump_stack+0xfc/0x174 (unreliable) [c000201a8690f770] [c01d9470] lockdep_rcu_suspicious+0x140/0x164 [c000201a8690f7f0] [c00810b9fb48] kvm_spapr_tce_release_iommu_group+0x1f0/0x220 [kvm] [c000201a8690f870] [c00810b8462c] kvm_spapr_tce_release_vfio_group+0x54/0xb0 [kvm] [c000201a8690f8a0] [c00810b84710] kvm_vfio_destroy+0x88/0x140 [kvm] [c000201a8690f8f0] [c00810b7d488] kvm_put_kvm+0x370/0x600 [kvm] [c000201a8690f990] [c00810b7e3c0] kvm_vm_release+0x38/0x60 [kvm] [c000201a8690f9c0] [c05223f4] __fput+0x124/0x330 [c000201a8690fa20] [c0151cd8] task_work_run+0xb8/0x130 [c000201a8690fa70] [c01197e8] do_exit+0x4e8/0xfa0 [c000201a8690fb70] [c011a374] do_group_exit+0x64/0xd0 [c000201a8690fbb0] [c0132c90] get_signal+0x1f0/0x1200 [c000201a8690fcc0] [c0020690] do_notify_resume+0x130/0x3c0 [c000201a8690fda0] [c0038d64] syscall_exit_prepare+0x1a4/0x280 [c000201a8690fe20] [c000c8f8] system_call_common+0xf8/0x278 arch/powerpc/kvm/book3s_64_vio.c:368 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by qemu-kvm/4264: #0: c000201ae2d000d8 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0xdc/0x950 [kvm] #1: c000200c9ed0c468 (&kvm->srcu){}-{0:0}, at: kvmppc_h_put_tce+0x88/0x340 [kvm] arch/powerpc/kvm/book3s_64_vio.c:108 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by qemu-kvm/4257: #0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at: kvm_vfio_set_attr+0x598/0x6c0 [kvm] arch/powerpc/kvm/book3s_64_vio.c:146 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by qemu-kvm/4257: #0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at: kvm_vfio_set_attr+0x598/0x6c0 [kvm] Signed-off-by: Qian Cai --- arch/powerpc/kvm/book3s_64_vio.c | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index 50555ad1db93..4f5016bab723 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -73,6 +73,7 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm, struct kvmppc_spapr_tce_iommu_table *stit, *tmp; struct iommu_table_group *table_group = NULL; + rcu_read_lock(); list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) { table_group = iommu_group_get_iommudata(grp); @@ -87,7 +88,9 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm, kref_put(&stit->kref, kvm_spapr_tce_liobn_put); } } + cond_resched_rcu(); } + rcu_read_unlock(); } extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd, @@ -105,12 +108,14 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd, if (!f.file) return -EBADF; + rcu_read_lock(); list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) { if (stt == f.file->private_data) { found = true; break; } } + rcu_read_unlock(); fdput(f); @@ -143,6 +148,7 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd, if (!tbl) return -EINVAL; + rcu_read_lock(); list_for_each_entry_rcu(stit, &stt->iommu_tables, next) { if (tbl != stit->tbl) continue; @@ -150,14 +156,17 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd, if (!kref_get_unless_zero(&stit->kref)) { /* stit is being destroyed */ iommu_tce_table_put(tbl); + rcu_read_unlock(); return -ENOTTY; } /* * The table is already known to this KVM, we just increased * its KVM reference counter and can return. */ + rcu_read_unlock(); return 0; } + rcu_read_unlock();
Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
On Sat, May 09, 2020 at 06:30:56PM -0700, rana...@codeaurora.org wrote: > On 2020-05-06 02:48, Greg KH wrote: > > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote: > > > Potentially, hvc_open() can be called in parallel when two tasks calls > > > open() on /dev/hvcX. In such a scenario, if the > > > hp->ops->notifier_add() > > > callback in the function fails, where it sets the tty->driver_data to > > > NULL, the parallel hvc_open() can see this NULL and cause a memory > > > abort. > > > Hence, serialize hvc_open and check if tty->private_data is NULL > > > before > > > proceeding ahead. > > > > > > The issue can be easily reproduced by launching two tasks > > > simultaneously > > > that does nothing but open() and close() on /dev/hvcX. > > > For example: > > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 & > > > > > > Signed-off-by: Raghavendra Rao Ananta > > > --- > > > drivers/tty/hvc/hvc_console.c | 16 ++-- > > > 1 file changed, 14 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/tty/hvc/hvc_console.c > > > b/drivers/tty/hvc/hvc_console.c > > > index 436cc51c92c3..ebe26fe5ac09 100644 > > > --- a/drivers/tty/hvc/hvc_console.c > > > +++ b/drivers/tty/hvc/hvc_console.c > > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs); > > > */ > > > static DEFINE_MUTEX(hvc_structs_mutex); > > > > > > +/* Mutex to serialize hvc_open */ > > > +static DEFINE_MUTEX(hvc_open_mutex); > > > /* > > > * This value is used to assign a tty->index value to a hvc_struct > > > based > > > * upon order of exposure via hvc_probe(), when we can not match it > > > to > > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver > > > *driver, struct tty_struct *tty) > > > */ > > > static int hvc_open(struct tty_struct *tty, struct file * filp) > > > { > > > - struct hvc_struct *hp = tty->driver_data; > > > + struct hvc_struct *hp; > > > unsigned long flags; > > > int rc = 0; > > > > > > + mutex_lock(&hvc_open_mutex); > > > + > > > + hp = tty->driver_data; > > > + if (!hp) { > > > + rc = -EIO; > > > + goto out; > > > + } > > > + > > > spin_lock_irqsave(&hp->port.lock, flags); > > > /* Check and then increment for fast path open. */ > > > if (hp->port.count++ > 0) { > > > spin_unlock_irqrestore(&hp->port.lock, flags); > > > hvc_kick(); > > > - return 0; > > > + goto out; > > > } /* else count == 0 */ > > > spin_unlock_irqrestore(&hp->port.lock, flags); > > > > Wait, why isn't this driver just calling tty_port_open() instead of > > trying to open-code all of this? > > > > Keeping a single mutext for open will not protect it from close, it will > > just slow things down a bit. There should already be a tty lock held by > > the tty core for open() to keep it from racing things, right? > The tty lock should have been held, but not likely across ->install() and > ->open() callbacks, thus resulting in a race between hvc_install() and > hvc_open(), How? The tty lock is held in install, and should not conflict with open(), otherwise we would be seeing this happen in all tty drivers, right? > where hvc_install() sets a data and the hvc_open() clears it. hvc_open() > doesn't > check if the data was set to NULL and proceeds. What data is being set that hvc_open is checking? And you are not grabbing a lock in your install callback, you are only serializing your open call here, I don't see how this is fixing anything other than perhaps slowing down your codepaths. As an arument why this isn't correct, can you answer why this same type of change wouldn't be required for all tty drivers in the tree? thanks, greg k-h