[PATCH] lib/scatterlist: Avoid a double memset
'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to clear it twice. Remove the redundant initialization. Signed-off-by: Christophe JAILLET --- lib/scatterlist.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/scatterlist.c b/lib/scatterlist.c index 5d63a8857f36..d94628fa3349 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -504,7 +504,7 @@ struct scatterlist *sgl_alloc_order(unsigned long long length, nalloc++; } sgl = kmalloc_array(nalloc, sizeof(struct scatterlist), - (gfp & ~GFP_DMA) | __GFP_ZERO); + gfp & ~GFP_DMA); if (!sgl) return NULL; -- 2.25.1
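For readers unfamiliar with the helper: sg_init_table() begins with a memset() over the whole table before setting the end marker, which is what makes the __GFP_ZERO above redundant. A rough sketch of its logic, paraphrased from lib/scatterlist.c rather than quoted verbatim:

    void sg_init_table(struct scatterlist *sgl, unsigned int nents)
    {
            memset(sgl, 0, sizeof(*sgl) * nents);   /* zeroes every entry */
            sg_init_marker(sgl, nents);             /* then sets the end marker */
    }

So kmalloc_array() without __GFP_ZERO is sufficient here; the memset() inside sg_init_table() remains the single clear.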
ERROR: modpost: "rcu_idle_enter" undefined!
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 325d0eab4f31c6240b59d5b2b8042c88f59405b5 commit: 1fecfdbb7acc6624655450a609221c89b5197a06 ACPI: processor: Take over RCU-idle for C3-BM idle date: 4 days ago config: ia64-defconfig (attached as .config) compiler: ia64-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout 1fecfdbb7acc6624655450a609221c89b5197a06 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=ia64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>, old ones prefixed by <<): >> ERROR: modpost: "rcu_idle_enter" [drivers/acpi/processor.ko] undefined! >> ERROR: modpost: "rcu_idle_exit" [drivers/acpi/processor.ko] undefined! --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH] lib/scatterlist: Avoid a double memset
On Sun, 2020-09-20 at 09:15 +0200, Christophe JAILLET wrote: > 'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to > clear it twice. > > Remove the redundant initialization. I didn't look very thoroughly, but there are at least a few more of these with kcalloc and kzalloc like block/bsg-lib.c-size_t sz = (sizeof(struct scatterlist) * req->nr_phys_segments); block/bsg-lib.c- block/bsg-lib.c-BUG_ON(!req->nr_phys_segments); block/bsg-lib.c- block/bsg-lib.c-buf->sg_list = kzalloc(sz, GFP_KERNEL); block/bsg-lib.c-if (!buf->sg_list) block/bsg-lib.c-return -ENOMEM; block/bsg-lib.c:sg_init_table(buf->sg_list, req->nr_phys_segments); --- drivers/target/target_core_rd.c-sg = kcalloc(sg_per_table + chain_entry, sizeof(*sg), drivers/target/target_core_rd.c-GFP_KERNEL); drivers/target/target_core_rd.c-if (!sg) drivers/target/target_core_rd.c-return -ENOMEM; drivers/target/target_core_rd.c- drivers/target/target_core_rd.c:sg_init_table(sg, sg_per_table + chain_entry); --- net/rds/rdma.c- sg = kcalloc(nents, sizeof(*sg), GFP_KERNEL); net/rds/rdma.c- if (!sg) { net/rds/rdma.c- ret = -ENOMEM; net/rds/rdma.c- goto out; net/rds/rdma.c- } net/rds/rdma.c- WARN_ON(!nents); net/rds/rdma.c: sg_init_table(sg, nents);
[PATCH] scsi: pmcraid: Fix memory allocation in 'pmcraid_alloc_sglist()'
When the scatter list is allocated in 'pmcraid_alloc_sglist()', the corresponding pointer should be stored in 'scatterlist' within the 'pmcraid_sglist' structure. Otherwise, 'scatterlist' is NULL. This leads to a potential memory leak and NULL pointer dereference. Fixes: ed4414cef2ad ("scsi: pmcraid: Use sgl_alloc_order() and sgl_free_order()") Signed-off-by: Christophe JAILLET --- This patch is completely speculative and untested. Should it be correct, I think that there should be some trouble somewhere. Either a NULL pointer dereference or incorrect behavior. The patch that introduced this potential bug is 2 1/2 years old. This should have been spotted earlier. So unless this driver is mostly unused, this looks odd to me. Feedback appreciated. --- drivers/scsi/pmcraid.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/pmcraid.c b/drivers/scsi/pmcraid.c index d99568fdf4af..00e155c88f03 100644 --- a/drivers/scsi/pmcraid.c +++ b/drivers/scsi/pmcraid.c @@ -3230,8 +3230,9 @@ static struct pmcraid_sglist *pmcraid_alloc_sglist(int buflen) return NULL; sglist->order = order; - sgl_alloc_order(buflen, order, false, - GFP_KERNEL | GFP_DMA | __GFP_ZERO, &sglist->num_sg); + sglist->scatterlist = sgl_alloc_order(buflen, order, false, + GFP_KERNEL | GFP_DMA | __GFP_ZERO, + &sglist->num_sg); return sglist; } -- 2.25.1
[tip:x86/cpu] BUILD SUCCESS e1ebb2b49048c4767cfa0d8466f9c701e549fa5e
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cpu branch HEAD: e1ebb2b49048c4767cfa0d8466f9c701e549fa5e KVM: SVM: Don't flush cache if hardware enforces cache coherency across encryption domains elapsed time: 721m configs tested: 61 configs skipped: 65 The following configs have been built successfully. More configs may be tested in the coming days. gcc tested configs: arm defconfig arm64 allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig parisc generic-32bit_defconfig xtensa common_defconfig arc nsimosci_defconfig sh lboxre2_defconfig powerpc ps3_defconfig powerpc tqm8541_defconfig mips pic32mzda_defconfig arm mmp2_defconfig sparc alldefconfig powerpc icon_defconfig m68k m5208evb_defconfig csky alldefconfig powerpc tqm8555_defconfig xtensa allyesconfig h8300 allyesconfig arc defconfig sh allmodconfig i386 allyesconfig i386 defconfig x86_64 randconfig-a005-20200920 x86_64 randconfig-a003-20200920 x86_64 randconfig-a004-20200920 x86_64 randconfig-a002-20200920 x86_64 randconfig-a006-20200920 x86_64 randconfig-a001-20200920 i386 randconfig-a002-20200920 i386 randconfig-a006-20200920 i386 randconfig-a003-20200920 i386 randconfig-a004-20200920 i386 randconfig-a005-20200920 i386 randconfig-a001-20200920 i386 randconfig-a012-20200920 i386 randconfig-a014-20200920 i386 randconfig-a016-20200920 i386 randconfig-a013-20200920 i386 randconfig-a011-20200920 i386 randconfig-a015-20200920 riscv nommu_k210_defconfig riscv allyesconfig riscv nommu_virt_defconfig riscv allnoconfig riscv defconfig riscv rv32_defconfig riscv allmodconfig x86_64 rhel x86_64 allyesconfig x86_64 rhel-7.6-kselftests x86_64 defconfig x86_64 rhel-8.3 x86_64 kexec clang tested configs: x86_64 randconfig-a011-20200920 x86_64 randconfig-a013-20200920 x86_64 randconfig-a014-20200920 x86_64 randconfig-a015-20200920 x86_64 randconfig-a012-20200920 x86_64 randconfig-a016-20200920 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
On Sun, Sep 20, 2020 at 08:23:26AM +0200, Thomas Gleixner wrote: > On Sat, Sep 19 2020 at 12:37, Daniel Vetter wrote: > > On Sat, Sep 19, 2020 at 12:35 PM Daniel Vetter wrote: > >> I think it should be the case, but I want to double check: Will > >> copy_*_user be allowed within a kmap_temporary section? This would > >> allow us to ditch an absolute pile of slowpaths. > > > > (coffee just kicked in) copy_*_user is ofc allowed, but if you hit a > > page fault you get a short read/write. This looks like it would remove > > the need to handle these in a slowpath, since page faults can now be > > served in this new kmap_temporary sections. But this sounds too good > > to be true, so I'm wondering what I'm missing. > > In principle we could allow pagefaults, but not with the currently > proposed interface which can be called from any context. Obviously if > called from atomic context it can't handle user page faults. Yeah that's clear, but does the implemention need to disable pagefaults unconditionally? > In theory we could make a variant which does not disable pagefaults, but > that's what kmap() already provides. Currently we have a bunch of code which roughly does kmap_atomic(); copy_*_user(); kunmap_atomic(); if (short_copy_user) { kmap(); copy_*_user(remaining_stuff); kunmap(); } And the only reason is that kmap is a notch slower, hence the fastpath. If we get a kmap which is fast and allows pagefaults (only in contexts that allow pagefaults already ofc) then we can ditch a pile of kmap users. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm/bridge: dw-mipi-dsi: Use kmemdup cf. kmalloc+memcpy
On Sat, Sep 19, 2020 at 9:31 PM Alex Dewar wrote: > > On 2020-09-11 13:57, Neil Armstrong wrote: > > On 09/09/2020 21:02, Alex Dewar wrote: > >> kmemdup can be used instead of kmalloc+memcpy. Replace an occurrence of > >> this pattern. > Friendly ping? > >> > >> Issue identified with Coccinelle. > >> > >> Signed-off-by: Alex Dewar > >> --- > >> drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 4 +--- > >> 1 file changed, 1 insertion(+), 3 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c > >> b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c > >> index 52f5c5a2ed64..7e9a62ad56e8 100644 > >> --- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c > >> +++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c > >> @@ -1049,12 +1049,10 @@ static void debugfs_create_files(void *data) > >> }; > >> int i; > >> > >> -dsi->debugfs_vpg = kmalloc(sizeof(debugfs), GFP_KERNEL); > >> +dsi->debugfs_vpg = kmemdup(debugfs, sizeof(debugfs), GFP_KERNEL); > >> if (!dsi->debugfs_vpg) > >> return; > >> > >> -memcpy(dsi->debugfs_vpg, debugfs, sizeof(debugfs)); > >> - > >> for (i = 0; i < ARRAY_SIZE(debugfs); i++) > >> debugfs_create_file(dsi->debugfs_vpg[i].name, 0644, > >> dsi->debugfs, &dsi->debugfs_vpg[i], > >> > > Acked-by: Neil Armstrong Neil has commit rights, so I was assuming he'd push this to drm-misc-next. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
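For reference, kmemdup() is exactly the allocate-then-copy pattern folded into one call, so the conversion is behavior-preserving. Roughly, paraphrasing mm/util.c:

    void *kmemdup(const void *src, size_t len, gfp_t gfp)
    {
            void *p = kmalloc(len, gfp);

            if (p)
                    memcpy(p, src, len);
            return p;
    }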
[PATCH 4.19] dmaengine: at_hdmac: Fix memory leak
This fixes a memory leak in at_hdmac. Mainline does not have the same problem. Signed-off-by: Pavel Machek (CIP) diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c index 86427f6ba78c..0847b2055857 100644 --- a/drivers/dma/at_hdmac.c +++ b/drivers/dma/at_hdmac.c @@ -1714,8 +1714,10 @@ static struct dma_chan *at_dma_xlate(struct of_phandle_args *dma_spec, atslave->dma_dev = &dmac_pdev->dev; chan = dma_request_channel(mask, at_dma_filter, atslave); - if (!chan) + if (!chan) { + kfree(atslave); return NULL; + } atchan = to_at_dma_chan(chan); atchan->per_if = dma_spec->args[0] & 0xff; -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
RE: [PATCH] [v2] nvme: use correct upper limit for tag in nvme_handle_cqe()
Hi, I tested and traced the init flow of the nvme admin queue and io queues in kernel 5.6. Currently, the code uses nvmeq->q_depth as the upper limit for the tag in nvme_handle_cqe(); according to the init flow below, we already have the race today. Admin queue init flow: 1, set nvmeq->q_depth 32 for admin queue; 2, register irq handler(nvme_irq) for admin queue 0; 3, set admin_tagset.queue_depth to 30 and alloc rqs; 4, nvme irq happens on admin queue; IO queue init flow: 1, set nvmeq->q_depth 1024 for io queue 1~16; 2, register irq handler(nvme_irq) for io queue 1~16; 3, set tagset.queue_depth to 1023 and alloc rqs; 4, nvme irq happens on io queue; So we have two issues to fix: 1, register the interrupt handler(nvme_irq) after tagset init(step 3 above) is done, to avoid the race; 2, use the correct upper limit(queue_depth in tagset) for the tag in nvme_handle_cqe(), which is the issue that will be solved in this patch I submitted. Is that right? Thanks a lot. -Original Message- From: tianxianting (RD) Sent: Saturday, September 19, 2020 11:15 AM To: 'Keith Busch' Cc: ax...@fb.com; h...@lst.de; s...@grimberg.me; linux-n...@lists.infradead.org; linux-kernel@vger.kernel.org Subject: RE: [PATCH] [v2] nvme: use correct upper limit for tag in nvme_handle_cqe() Hi Keith, Thanks a lot for your comments, I will try to figure out a safe fix for this issue, then for your review :) -Original Message- From: Keith Busch [mailto:kbu...@kernel.org] Sent: Saturday, September 19, 2020 3:21 AM To: tianxianting (RD) Cc: ax...@fb.com; h...@lst.de; s...@grimberg.me; linux-n...@lists.infradead.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] [v2] nvme: use correct upper limit for tag in nvme_handle_cqe() On Fri, Sep 18, 2020 at 06:44:20PM +0800, Xianting Tian wrote: > @@ -940,7 +940,9 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx) > struct nvme_completion *cqe = &nvmeq->cqes[idx]; > struct request *req; > > - if (unlikely(cqe->command_id >= nvmeq->q_depth)) { > + if (unlikely(cqe->command_id >= > + nvmeq->qid ? nvmeq->dev->tagset.queue_depth : > + nvmeq->dev->admin_tagset.queue_depth)) { Both of these values are set before blk_mq_alloc_tag_set(), so you still have a race. The interrupt handler probably just shouldn't be registered with the queue before the tagset is initialized since there can't be any work for the handler to do before that happens anyway. The controller is definitely broken, though, and will lead to unavoidable corruption if it's really behaving this way.
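One side note on the v2 hunk quoted above: in C, '>=' binds tighter than '?:', so the condition as written parses as '(cqe->command_id >= nvmeq->qid) ? tagset.queue_depth : admin_tagset.queue_depth', and the if() then tests the (always nonzero) queue depth itself. Any respin would need explicit parentheses or a local variable, along these lines (sketch only):

    u32 depth = nvmeq->qid ? nvmeq->dev->tagset.queue_depth :
                             nvmeq->dev->admin_tagset.queue_depth;

    if (unlikely(cqe->command_id >= depth)) {
            ...
    }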
Re: [PATCH] x86/boot: Delay BSP init until after FPU initialization
On Sun, Sep 20, 2020 at 10:03:10AM +0900, Mike Hommey wrote: > FPU initialization handles the clearcpuid command line argument. If it > comes after BSP init, clearcpuid cannot be used to disable features that > trigger some parts of the BSP init code. > > Signed-off-by: Mike Hommey > --- > arch/x86/kernel/cpu/common.c | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) > > I was trying to use clearcpuid=440 to disable X86_FEATURES_AMD_SSBD to > reproduce the behavior that happens on Zen/Zen+ on a Zen2 machine, but > that didn't work because the command line is handled after the setup for > X86_FEATURE_LS_CFG_SSBD. > > I thought about either moving the command line handling earlier, but it > seems there wasn't a specific reason for BSP init being earlier than FPU > initialization so I went with reordering those instead. Our boot order is fragile and the functionality in fpu__init_parse_early_param() which does the clearcpuid= parsing should be independent from FPU, as your use case shows. So I'd prefer if you moved that function perhaps to right after the call setup_force_cpu_cap(X86_FEATURE_CPUID); in early_identify_cpu() and renamed it to something generic instead. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
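A rough sketch of the reordering Boris suggests, with a placeholder name for the renamed helper (the exact call site would need checking against the tree):

    /* arch/x86/kernel/cpu/common.c, in early_identify_cpu() */
    setup_force_cpu_cap(X86_FEATURE_CPUID);

    /*
     * Parse clearcpuid= (currently done in fpu__init_parse_early_param())
     * right here, before BSP init consumes any feature bits, and give the
     * helper an FPU-agnostic name since it is no longer FPU-specific.
     */
    cpu_parse_early_param();        /* hypothetical name */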
[PATCH] watchdog: fix memory leak in error path
Fix memory leak in error path. Signed-off-by: Pavel Machek (CIP) diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index 6798addabd5a..785270ee337c 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -994,8 +994,10 @@ static int watchdog_cdev_register(struct watchdog_device *wdd) wd_data->wdd = wdd; wdd->wd_data = wd_data; - if (IS_ERR_OR_NULL(watchdog_kworker)) + if (IS_ERR_OR_NULL(watchdog_kworker)) { + kfree(wd_data); return -ENODEV; + } device_initialize(&wd_data->dev); wd_data->dev.devt = MKDEV(MAJOR(watchdog_devt), wdd->id); -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH] lib/scatterlist: Avoid a double memset
On Sun, 20 Sep 2020, Joe Perches wrote: > On Sun, 2020-09-20 at 09:15 +0200, Christophe JAILLET wrote: > > 'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to > > clear it twice. > > > > Remove the redundant initialization. > > I didn't look very thoroughly, but there are at least > a few more of these with kcalloc and kzalloc like > > block/bsg-lib.c-size_t sz = (sizeof(struct scatterlist) * > req->nr_phys_segments); > block/bsg-lib.c- > block/bsg-lib.c-BUG_ON(!req->nr_phys_segments); > block/bsg-lib.c- > block/bsg-lib.c-buf->sg_list = kzalloc(sz, GFP_KERNEL); > block/bsg-lib.c-if (!buf->sg_list) > block/bsg-lib.c-return -ENOMEM; > block/bsg-lib.c:sg_init_table(buf->sg_list, req->nr_phys_segments); > --- > drivers/target/target_core_rd.c- sg = kcalloc(sg_per_table + > chain_entry, sizeof(*sg), > drivers/target/target_core_rd.c- GFP_KERNEL); > drivers/target/target_core_rd.c- if (!sg) > drivers/target/target_core_rd.c- return -ENOMEM; > drivers/target/target_core_rd.c- > drivers/target/target_core_rd.c: sg_init_table(sg, sg_per_table > + chain_entry); > --- > net/rds/rdma.c- sg = kcalloc(nents, sizeof(*sg), GFP_KERNEL); > net/rds/rdma.c- if (!sg) { > net/rds/rdma.c- ret = -ENOMEM; > net/rds/rdma.c- goto out; > net/rds/rdma.c- } > net/rds/rdma.c- WARN_ON(!nents); > net/rds/rdma.c: sg_init_table(sg, nents); I found 16 occurrences in the following files: net/rds/rdma.c drivers/infiniband/hw/efa/efa_verbs.c drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c drivers/misc/mic/scif/scif_nodeqp.c block/bsg-lib.c drivers/dma/sh/rcar-dmac.c drivers/spi/spi-topcliff-pch.c net/sunrpc/xprtrdma/frwr_ops.c drivers/dma/imx-dma.c drivers/pci/p2pdma.c drivers/dma/sh/shdma-base.c drivers/target/target_core_rd.c drivers/media/common/saa7146/saa7146_core.c drivers/tty/serial/pch_uart.c drivers/net/wireless/intel/iwlwifi/fw/dbg.c julia
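For any of these sites the conversion is mechanical: drop the zeroing allocator, since sg_init_table() memsets the table anyway. A sketch using the bsg-lib.c instance as the template (untested, for illustration only):

    /* before: the buffer is zeroed twice */
    buf->sg_list = kzalloc(sz, GFP_KERNEL);
    if (!buf->sg_list)
            return -ENOMEM;
    sg_init_table(buf->sg_list, req->nr_phys_segments);

    /* after: sg_init_table() performs the only memset */
    buf->sg_list = kmalloc_array(req->nr_phys_segments,
                                 sizeof(struct scatterlist), GFP_KERNEL);
    if (!buf->sg_list)
            return -ENOMEM;
    sg_init_table(buf->sg_list, req->nr_phys_segments);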
[PATCH] usb: yurex: Rearrange code not to need GFP_ATOMIC
Move prepare_to_wait() around, so that a normal GFP_KERNEL allocation can be used. Signed-off-by: Pavel Machek (CIP) Acked-by: Alan Stern diff --git a/drivers/usb/misc/yurex.c b/drivers/usb/misc/yurex.c index b2e09883c7e2..071f1debebba 100644 --- a/drivers/usb/misc/yurex.c +++ b/drivers/usb/misc/yurex.c @@ -489,10 +489,10 @@ static ssize_t yurex_write(struct file *file, const char __user *user_buffer, } /* send the data as the control msg */ - prepare_to_wait(&dev->waitq, &wait, TASK_INTERRUPTIBLE); dev_dbg(&dev->interface->dev, "%s - submit %c\n", __func__, dev->cntl_buffer[0]); - retval = usb_submit_urb(dev->cntl_urb, GFP_ATOMIC); + retval = usb_submit_urb(dev->cntl_urb, GFP_KERNEL); + prepare_to_wait(&dev->waitq, &wait, TASK_INTERRUPTIBLE); if (retval >= 0) timeout = schedule_timeout(YUREX_WRITE_TIMEOUT); finish_wait(&dev->waitq, &wait); -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH v3 00/14] Adding GAUDI NIC code to habanalabs driver
On Sat, Sep 19, 2020 at 04:22:35PM -0300, Jason Gunthorpe wrote: > On Sat, Sep 19, 2020 at 07:27:30PM +0200, Greg Kroah-Hartman wrote: > > > It's probably heresy, but why do I need to integrate into the RDMA > > > subsystem ? > > > I understand your reasoning about networking (Ethernet) as the driver > > > connects to the kernel networking stack (netdev), but with RDMA the > > > driver doesn't use or connect to anything in that stack. If I were to > > > support IBverbs and declare that I support it, then of course I would > > > need to integrate to the RDMA subsystem and add my backend to > > > rdma-core. > > > > IBverbs are horrid and I would not wish them on anyone. Seriously. > > I'm curious what drives this opinion? Did you have it since you > reviewed the initial submission all those years ago? As I learned more about that interface, yes, I like it less and less :) But that's the userspace api you all are stuck with, for various reasons, my opinion doesn't matter here. > > I think the general rdma apis are the key here, not the userspace api. > > Are you proposing that habana should have uAPI in drivers/misc and > present a standard rdma-core userspace for it? This is the only > userspace programming interface for RoCE HW. I think that would be > much more work. > > If not, what open source userspace are you going to ask them to > present to merge the kernel side into misc? I don't think that they have a userspace api to their rdma feature from what I understand, but I could be totally wrong as I do not know their hardware at all, so I'll let them answer this question. > > Note, I do not know exactly what they are, but no, IBverbs are not ok. > > Should we stop merging new drivers and abandon the RDMA subsystem? Is > there something you'd like to see fixed? > > Don't really understand your position, sorry. For anything that _has_ to have a userspace RMDA interface, sure ibverbs are the one we are stuck with, but I didn't think that was the issue here at all, which is why I wrote the above comments. thanks, greg k-h
Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
On Sun, Sep 20 2020 at 08:41, Thomas Gleixner wrote: > On Sat, Sep 19 2020 at 10:18, Linus Torvalds wrote: >> Maybe I've missed something. Is it because the new interface still >> does "pagefault_disable()" perhaps? >> >> But does it even need the pagefault_disable() at all? Yes, the >> *atomic* one obviously needed it. But why does this new one need to >> disable page faults? > > It disables pagefaults because it can be called from atomic and > non-atomic context. That was the point to get rid of > > if (preeemptible()) > kmap(); > else > kmap_atomic(); > > If it does not disable pagefaults, then it's just a lightweight variant > of kmap() for short lived mappings. Actually most usage sites of kmap_atomic do not need page faults to be disabled at all. As Daniel pointed out the implicit pagefault disable enforces copy_from_user_inatomic() even in places which are clearly preemptible task context. As we need to look at each instance anyway we can add the PF disable explicitly to the very few places which actually need it. Thanks, tglx
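In other words, the rare caller that really does need faults suppressed could say so explicitly, along these lines (sketch only, using the kmap_temporary name floated in this thread):

    kaddr = kmap_temporary(page);   /* mapping itself no longer disables faults */
    pagefault_disable();            /* only where this caller truly requires it */
    ret = copy_from_user_inatomic(kaddr + off, ubuf, len);
    pagefault_enable();
    kunmap_temporary(kaddr);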
Re: [PATCH] lib/scatterlist: Avoid a double memset
Le 20/09/2020 à 10:32, Julia Lawall a écrit : > I found 16 occurrences in the following files: > net/rds/rdma.c > drivers/infiniband/hw/efa/efa_verbs.c > drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c > drivers/misc/mic/scif/scif_nodeqp.c > block/bsg-lib.c > drivers/dma/sh/rcar-dmac.c > drivers/spi/spi-topcliff-pch.c > net/sunrpc/xprtrdma/frwr_ops.c > drivers/dma/imx-dma.c > drivers/pci/p2pdma.c > drivers/dma/sh/shdma-base.c > drivers/target/target_core_rd.c > drivers/media/common/saa7146/saa7146_core.c > drivers/tty/serial/pch_uart.c > drivers/net/wireless/intel/iwlwifi/fw/dbg.c > julia Also in: sound/soc/sprd/sprd-pcm-dma.c sound/soc/sprd/sprd-pcm-compress.c which are a bit tricky. It also uses some, IMHO, pointless devm_ functions. Feel free to propose patches. I'm not focused on that at the moment. CJ
[PATCH] media: firewire: fix memory leak
Fix memory leak in node_probe. Signed-off-by: Pavel Machek (CIP) diff --git a/drivers/media/firewire/firedtv-fw.c b/drivers/media/firewire/firedtv-fw.c index 3f1ca40b9b98..8a8585261bb8 100644 --- a/drivers/media/firewire/firedtv-fw.c +++ b/drivers/media/firewire/firedtv-fw.c @@ -272,8 +272,10 @@ static int node_probe(struct fw_unit *unit, const struct ieee1394_device_id *id) name_len = fw_csr_string(unit->directory, CSR_MODEL, name, sizeof(name)); - if (name_len < 0) - return name_len; + if (name_len < 0) { + err = name_len; + goto fail_free; + } for (i = ARRAY_SIZE(model_names); --i; ) if (strlen(model_names[i]) <= name_len && strncmp(name, model_names[i], name_len) == 0) -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
[PATCH] 6lowpan: Add missing locking
I believe set_lock should be taken in the exit function, too. Signed-off-by: Pavel Machek (CIP) diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c index cff4944d5b66..1420734394e9 100644 --- a/net/bluetooth/6lowpan.c +++ b/net/bluetooth/6lowpan.c @@ -1301,10 +1301,12 @@ static void __exit bt_6lowpan_exit(void) debugfs_remove(lowpan_enable_debugfs); debugfs_remove(lowpan_control_debugfs); + mutex_lock(&set_lock); if (listen_chan) { l2cap_chan_close(listen_chan, 0); l2cap_chan_put(listen_chan); } + mutex_unlock(&set_lock); disconnect_devices(); -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
[PATCH 1/4] blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
The throtl_downgrade_state() function is always used to change to the LIMIT_LOW limit, so remove the meaningless second parameter, which indicates the limit index. Signed-off-by: Baolin Wang --- block/blk-throttle.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 36ba61c..4007b26 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1970,7 +1970,7 @@ static void throtl_upgrade_state(struct throtl_data *td) queue_work(kthrotld_workqueue, &td->dispatch_work); } -static void throtl_downgrade_state(struct throtl_data *td, int new) +static void throtl_downgrade_state(struct throtl_data *td) { td->scale /= 2; @@ -1980,7 +1980,7 @@ static void throtl_downgrade_state(struct throtl_data *td, int new) return; } - td->limit_index = new; + td->limit_index = LIMIT_LOW; td->low_downgrade_time = jiffies; } @@ -2067,7 +2067,7 @@ static void throtl_downgrade_check(struct throtl_grp *tg) * cgroups */ if (throtl_hierarchy_can_downgrade(tg)) - throtl_downgrade_state(tg->td, LIMIT_LOW); + throtl_downgrade_state(tg->td); tg->last_bytes_disp[READ] = 0; tg->last_bytes_disp[WRITE] = 0; -- 1.8.3.1
[PATCH 0/4] Some improvements for blk throttle
Hi, This patch set makes some improvements to blk-throttle; please help to review. Thanks. Baolin Wang (4): blk-throttle: Remove a meaningless parameter for throtl_downgrade_state() blk-throttle: Avoid getting the current time if tg->last_finish_time is 0 blk-throttle: Avoid tracking latency if low limit is invalid blk-throttle: Fix IO hang for a corner case block/blk-throttle.c | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) -- 1.8.3.1
[PATCH 2/4] blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
We only update tg->last_finish_time when the low limitation is enabled, so we can move the tg->last_finish_time validation a little forward to avoid getting the unnecessary current time stamp if the low limitation is not enabled. Signed-off-by: Baolin Wang --- block/blk-throttle.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 4007b26..7e72102 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -2077,10 +2077,14 @@ static void throtl_downgrade_check(struct throtl_grp *tg) static void blk_throtl_update_idletime(struct throtl_grp *tg) { - unsigned long now = ktime_get_ns() >> 10; + unsigned long now; unsigned long last_finish_time = tg->last_finish_time; - if (now <= last_finish_time || last_finish_time == 0 || + if (last_finish_time == 0) + return; + + now = ktime_get_ns() >> 10; + if (now <= last_finish_time || last_finish_time == tg->checked_last_finish_time) return; -- 1.8.3.1
[PATCH 3/4] blk-throttle: Avoid tracking latency if low limit is invalid
The IO latency tracking is only used for the LOW limit, so we should add a check to avoid redundant latency tracking if the LOW limit is not valid. Signed-off-by: Baolin Wang --- block/blk-throttle.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 7e72102..b0d8f7c 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -2100,7 +2100,7 @@ static void throtl_update_latency_buckets(struct throtl_data *td) unsigned long last_latency[2] = { 0 }; unsigned long latency[2]; - if (!blk_queue_nonrot(td->queue)) + if (!blk_queue_nonrot(td->queue) || !td->limit_valid[LIMIT_LOW]) return; if (time_before(jiffies, td->last_calculate_time + HZ)) return; @@ -2338,6 +2338,8 @@ void blk_throtl_bio_endio(struct bio *bio) if (!blkg) return; tg = blkg_to_tg(blkg); + if (!tg->td->limit_valid[LIMIT_LOW]) + return; finish_time_ns = ktime_get_ns(); tg->last_finish_time = finish_time_ns >> 10; -- 1.8.3.1
[PATCH 4/4] blk-throttle: Fix IO hang for a corner case
It cannot scale up in throtl_adjusted_limit() if we set bps or iops to 1, which will cause an IO hang when the low limit is enabled. Thus we should treat 1 as an illegal value to avoid this issue. Signed-off-by: Baolin Wang --- block/blk-throttle.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index b0d8f7c..0649bce 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1687,13 +1687,13 @@ static ssize_t tg_set_limit(struct kernfs_open_file *of, goto out_finish; ret = -EINVAL; - if (!strcmp(tok, "rbps")) + if (!strcmp(tok, "rbps") && val > 1) v[0] = val; - else if (!strcmp(tok, "wbps")) + else if (!strcmp(tok, "wbps") && val > 1) v[1] = val; - else if (!strcmp(tok, "riops")) + else if (!strcmp(tok, "riops") && val > 1) v[2] = min_t(u64, val, UINT_MAX); - else if (!strcmp(tok, "wiops")) + else if (!strcmp(tok, "wiops") && val > 1) v[3] = min_t(u64, val, UINT_MAX); else if (off == LIMIT_LOW && !strcmp(tok, "idle")) idle_time = val; -- 1.8.3.1
[PATCH v2] checkpatch: extend author Signed-off-by check for split From: header
Checkpatch did not handle cases where the author From: header was split into multiple lines. The author identity could not be resolved and checkpatch generated a false NO_AUTHOR_SIGN_OFF warning. A typical example is Commit e33bcbab16d1 ("tee: add support for session's client UUID generation"). When checkpatch was run on this commit, it displayed: "WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author ''" This was due to split header lines not being handled properly; indeed the author himself wrote in Commit cd2614967d8b ("checkpatch: warn if missing author Signed-off-by"): "Split From: headers are not fully handled: only the first part is compared." Support split From: headers by correctly parsing the header extension lines. RFC 2822, Section 2.2.3 states that each extended line must start with a WSP character (a space or htab). The solution was therefore to concatenate the lines which start with a WSP to get the correct long header. Suggested-by: Joe Perches Link: https://lore.kernel.org/linux-kernel-mentees/f5d8124e54a50480b0a9fa638787bc29b6e09854.ca...@perches.com/ Signed-off-by: Dwaipayan Ray --- scripts/checkpatch.pl | 4 1 file changed, 4 insertions(+) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 504d2e431c60..9e65d21456f1 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -2661,6 +2661,10 @@ sub process { # Check the patch for a From: if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) { $author = $1; + my $curline = $linenr; + while(defined($rawlines[$curline]) && ($rawlines[$curline++] =~ /^[ \t]\s*(.*)/)) { + $author .= $1; + } $author = encode("utf8", $author) if ($line =~ /=\?utf-8\?/i); $author =~ s/"//g; $author = reformat_email($author); -- 2.27.0
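As a concrete illustration, a folded From: header of the kind the added loop concatenates looks like this (example identity; per RFC 2822 every continuation line begins with a space or tab):

    From: Some Author With A Long Display Name
     <some.author@example.com>

The while loop appends each such continuation line to $author, so reformat_email() finally sees the complete identity instead of only the first fragment.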
KMSAN: uninit-value in is_idle_task
Hello, syzbot found the following issue on: HEAD commit:3b3ea602 x86: add failure injection to get/put/clear_user git tree: https://github.com/google/kmsan.git master console output: https://syzkaller.appspot.com/x/log.txt?x=144b896590 kernel config: https://syzkaller.appspot.com/x/.config?x=ee5f7a0b2e48ed66 dashboard link: https://syzkaller.appspot.com/bug?extid=a4b29e75da1c2c8bd34f compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81) Unfortunately, I don't have any reproducer for this issue yet. IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+a4b29e75da1c2c8bd...@syzkaller.appspotmail.com = BUG: KMSAN: uninit-value in is_idle_task+0x65/0x70 include/linux/sched.h:1672 CPU: 0 PID: 5351 Comm: syz-executor.3 Not tainted 5.8.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 is_idle_task+0x65/0x70 include/linux/sched.h:1672 sysvec_apic_timer_interrupt+0x2e/0x130 arch/x86/kernel/apic/apic.c:1091 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:593 RIP: 0010:kmsan_get_metadata+0x136/0x180 mm/kmsan/kmsan_shadow.c:201 Code: 75 5f 48 89 df e8 6a 00 00 00 48 85 c0 74 50 48 8b 48 40 48 85 c9 74 47 48 8b 40 48 48 85 c0 74 3e 81 e3 ff 0f 00 00 45 84 f6 <48> 0f 45 c8 48 b8 00 00 00 00 00 16 00 00 48 01 c8 48 c1 e8 04 48 RSP: 0018:8881cde9f460 EFLAGS: 0246 RAX: ea00090451b0 RBX: 0608 RCX: ea00090311b0 RDX: 0001cde9f608 RSI: 0720 RDI: 8881cde9f608 RBP: 8881cde9f478 R08: ea0f R09: 88812fffa000 R10: 0004 R11: R12: 006a R13: R14: R15: kmsan_get_shadow_origin_ptr+0x6c/0xb0 mm/kmsan/kmsan_shadow.c:149 __msan_metadata_ptr_for_load_4+0x10/0x20 mm/kmsan/kmsan_instr.c:54 vga_8planes_fillrect drivers/video/fbdev/vga16fb.c:855 [inline] vga16fb_fillrect+0xbbc/0x1870 drivers/video/fbdev/vga16fb.c:947 bit_clear_margins+0x582/0x740 drivers/video/fbdev/core/bitblit.c:232 fbcon_clear_margins drivers/video/fbdev/core/fbcon.c:1381 [inline] fbcon_switch+0x2e79/0x3830 drivers/video/fbdev/core/fbcon.c:2363 redraw_screen+0x9b2/0x2980 drivers/tty/vt/vt.c:1015 fbcon_modechanged+0x110d/0x1320 drivers/video/fbdev/core/fbcon.c:3001 fbcon_update_vcs+0x86/0xa0 drivers/video/fbdev/core/fbcon.c:3048 fb_set_var+0x1420/0x1850 drivers/video/fbdev/core/fbmem.c:1056 do_fb_ioctl+0xc00/0x1150 drivers/video/fbdev/core/fbmem.c:1109 fb_ioctl+0x1e4/0x210 drivers/video/fbdev/core/fbmem.c:1185 vfs_ioctl fs/ioctl.c:48 [inline] ksys_ioctl fs/ioctl.c:753 [inline] __do_sys_ioctl fs/ioctl.c:762 [inline] __se_sys_ioctl+0x319/0x4d0 fs/ioctl.c:760 __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:760 do_syscall_64+0xad/0x160 arch/x86/entry/common.c:386 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45d5f9 Code: Bad RIP value. RSP: 002b:7fdcda761c78 EFLAGS: 0246 ORIG_RAX: 0010 RAX: ffda RBX: e280 RCX: 0045d5f9 RDX: 20c0 RSI: 4601 RDI: 0003 RBP: 0118cf80 R08: R09: R10: R11: 0246 R12: 0118cf4c R13: 0169fb6f R14: 7fdcda7629c0 R15: 0118cf4c [ cut here ] slab index 32768 out of bounds (558) for stack id 262b8000 WARNING: CPU: 0 PID: 5351 at lib/stackdepot.c:235 stack_depot_fetch+0x2d/0x60 lib/stackdepot.c:234 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
kernel BUG at fs/f2fs/segment.h:LINE!
Hello, syzbot found the following issue on: HEAD commit:fc4f28bb Merge tag 'for-5.9-rc5-tag' of git://git.kernel.o.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=134d97dd90 kernel config: https://syzkaller.appspot.com/x/.config?x=c61f6bd349c981f3 dashboard link: https://syzkaller.appspot.com/bug?extid=3698081bcf0bb2d12174 compiler: gcc (GCC) 10.1.0-syz 20200507 Unfortunately, I don't have any reproducer for this issue yet. IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+3698081bcf0bb2d12...@syzkaller.appspotmail.com [ cut here ] kernel BUG at fs/f2fs/segment.h:657! invalid opcode: [#1] PREEMPT SMP KASAN CPU: 1 PID: 16220 Comm: syz-executor.0 Not tainted 5.9.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:f2fs_ra_meta_pages+0xa51/0xdc0 fs/f2fs/segment.h:657 Code: 08 49 83 ed 01 e9 3f f8 ff ff c7 44 24 18 80 00 00 00 45 31 ff 45 31 f6 31 f6 41 bc 01 00 00 00 e9 cd fb ff ff e8 0f 48 3e fe <0f> 0b e8 08 48 3e fe 4c 8b 6c 24 08 be 04 00 00 00 4c 89 ef e8 76 RSP: 0018:c9000b4177a8 EFLAGS: 00010246 RAX: 0004 RBX: dc00 RCX: c9000ac51000 RDX: 0004 RSI: 833605d1 RDI: 0004 RBP: 88804e738000 R08: 0001 R09: ea00060e7337 R10: 0037 R11: R12: 0001 R13: 0601 R14: 88808cb73800 R15: 001e FS: 7fcc23249700() GS:8880ae70() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 560df32133b7 CR3: 74a09000 CR4: 001506e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: build_sit_entries fs/f2fs/segment.c:4195 [inline] f2fs_build_segment_manager+0x4b8a/0xa3c0 fs/f2fs/segment.c:4779 f2fs_fill_super+0x377d/0x6b80 fs/f2fs/super.c:3633 mount_bdev+0x32e/0x3f0 fs/super.c:1417 legacy_get_tree+0x105/0x220 fs/fs_context.c:592 vfs_get_tree+0x89/0x2f0 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x1387/0x2070 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount fs/namespace.c:3390 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3390 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x46004a Code: b8 a6 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 fd 89 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 da 89 fb ff c3 66 0f 1f 84 00 00 00 00 00 RSP: 002b:7fcc23248a88 EFLAGS: 0202 ORIG_RAX: 00a5 RAX: ffda RBX: 7fcc23248b20 RCX: 0046004a RDX: 2000 RSI: 2100 RDI: 7fcc23248ae0 RBP: 7fcc23248ae0 R08: 7fcc23248b20 R09: 2000 R10: R11: 0202 R12: 2000 R13: 2100 R14: 2200 R15: 20012400 Modules linked in: ---[ end trace a60c6b47a90afa81 ]--- RIP: 0010:f2fs_ra_meta_pages+0xa51/0xdc0 fs/f2fs/segment.h:657 Code: 08 49 83 ed 01 e9 3f f8 ff ff c7 44 24 18 80 00 00 00 45 31 ff 45 31 f6 31 f6 41 bc 01 00 00 00 e9 cd fb ff ff e8 0f 48 3e fe <0f> 0b e8 08 48 3e fe 4c 8b 6c 24 08 be 04 00 00 00 4c 89 ef e8 76 RSP: 0018:c9000b4177a8 EFLAGS: 00010246 RAX: 0004 RBX: dc00 RCX: c9000ac51000 RDX: 0004 RSI: 833605d1 RDI: 0004 RBP: 88804e738000 R08: 0001 R09: ea00060e7337 R10: 0037 R11: R12: 0001 R13: 0601 R14: 88808cb73800 R15: 001e FS: 7fcc23249700() GS:8880ae70() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f253183c000 CR3: 74a09000 CR4: 001506e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue.
See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
[GIT PULL] EDAC urgent for v5.9-rc6
Hi Linus, please pull two more fixes for ghes_edac resulting from playing with CONFIG_DEBUG_TEST_DRIVER_REMOVE=y. Thx. --- The following changes since commit 856deb866d16e29bd65952e0289066f6078af773: Linux 5.9-rc5 (2020-09-13 16:06:00 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/edac_urgent_for_v5.9_rc6 for you to fetch changes up to 251c54ea26fa6029b01a76161a37a12fde5124e4: EDAC/ghes: Check whether the driver is on the safe list correctly (2020-09-15 09:42:15 +0200) Two fixes resulting from CONFIG_DEBUG_TEST_DRIVER_REMOVE=y experiments: * The first one completes a previous fix to reset a local structure containing scanned system data properly so that the driver rescans, as it should, on a second load. * The second one addresses a refcount underflow due to not paying attention to the driver whitelist on unregister. Borislav Petkov (2): EDAC/ghes: Clear scanned data on unload EDAC/ghes: Check whether the driver is on the safe list correctly drivers/edac/ghes_edac.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c index 54ebc8afc6b1..94d1e3165052 100644 --- a/drivers/edac/ghes_edac.c +++ b/drivers/edac/ghes_edac.c @@ -508,6 +508,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev) if (!force_load && idx < 0) return -ENODEV; } else { + force_load = true; idx = 0; } @@ -629,9 +630,13 @@ void ghes_edac_unregister(struct ghes *ghes) struct mem_ctl_info *mci; unsigned long flags; + if (!force_load) + return; + mutex_lock(&ghes_reg_mutex); system_scanned = false; + memset(&ghes_hw, 0, sizeof(struct ghes_hw_desc)); if (!refcount_dec_and_test(&ghes_refcount)) goto unlock; -- Regards/Gruss, Boris. SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg
RE: R: [PATCH v2 4/4] dt-bindings: net: Document use of mac-address-increment
> On Sun, Sep 20, 2020 at 02:39:39AM +0200, ansuels...@gmail.com > wrote: > > > > > > > -Messaggio originale- > > > Da: Andrew Lunn > > > Inviato: domenica 20 settembre 2020 02:31 > > > A: Ansuel Smith > > > Cc: Miquel Raynal ; Richard Weinberger > > > ; Vignesh Raghavendra ; Rob > Herring > > > ; David S. Miller ; Jakub > > > Kicinski ; Heiner Kallweit ; > > > Russell King ; Frank Rowand > > > ; Boris Brezillon ; > linux- > > > m...@lists.infradead.org; devicet...@vger.kernel.org; linux- > > > ker...@vger.kernel.org; net...@vger.kernel.org > > > Oggetto: Re: [PATCH v2 4/4] dt-bindings: net: Document use of mac- > > > address-increment > > > > > > > + mac-address-increment: > > > > +description: > > > > + The MAC address can optionally be increased (or decreased using > > > > + negative values) from the original value readed (from a nvmem > > cell > > > > > > Read is irregular, there is no readed, just read. > > > > > > > + for example). This can be used if the mac is readed from a > > dedicated > > > > + partition and must be increased based on the number of device > > > > + present in the system. > > > > > > You should probably add there is no underflow/overflow to other bytes > > > of the MAC address. 00:01:02:03:04:ff + 1 == 00:01:02:03:04:00. > > > > > > > +minimum: -255 > > > > +maximum: 255 > > > > + > > > > + mac-address-increment-byte: > > > > +description: > > > > + If 'mac-address-increment' is defined, this will tell what byte > > of > > > > + the mac-address will be increased. If 'mac-address-increment' is > > > > + not defined, this option will do nothing. > > > > +default: 5 > > > > +minimum: 0 > > > > +maximum: 5 > > > > > > Is there a real need for this? A value of 0 seems like a bad idea, > > > since a unicast address could easily become a multicast address, which > > > is not valid for an interface address. It also does not seem like a > > > good idea to allow the OUI to be changed. So i think only bytes 3-5 > > > should be allowed, but even then, i don't think this is needed, unless > > > you do have a clear use case. > > > > > > Andrew > > > > Honestly the mac-address-increment-byte is added to give user some > control > > but I > > don't really have a use case for it. Should I limit it to 3 or just remove > > the function? > > If you have no use case, just remove it and document that last byte > will be incremented. If somebody does need it, it can be added in a > backwards compatible way. > > Andrew I just rechecked mac-address-increment-byte and we have one device that uses it and would benefit from this. I will change the Documentation to min 3 and leave it.
[PATCH v3 1/4] mtd: Add nvmem support for mtd nvmem-providers
Introduce 2 new bindings for the mtd structure. Mtd partitions can be set as 'nvmem-provider' and any subpartition defined with the tag 'nvmem-cell' is skipped by the 'fixed-partitions' parser and registered as an nvmem cell by the nvmem API. Signed-off-by: Ansuel Smith --- drivers/mtd/mtdcore.c| 3 ++- drivers/mtd/parsers/ofpart.c | 8 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c index 7d930569a7df..ba5236db8318 100644 --- a/drivers/mtd/mtdcore.c +++ b/drivers/mtd/mtdcore.c @@ -551,6 +551,7 @@ static int mtd_nvmem_reg_read(void *priv, unsigned int offset, static int mtd_nvmem_add(struct mtd_info *mtd) { + struct device_node *mtd_node = mtd_get_of_node(mtd); struct nvmem_config config = {}; config.id = -1; @@ -563,7 +564,7 @@ static int mtd_nvmem_add(struct mtd_info *mtd) config.stride = 1; config.read_only = true; config.root_only = true; - config.no_of_node = true; + config.no_of_node = !of_property_read_bool(mtd_node, "nvmem-provider"); config.priv = mtd; mtd->nvmem = nvmem_register(&config); diff --git a/drivers/mtd/parsers/ofpart.c b/drivers/mtd/parsers/ofpart.c index daf507c123e6..442e039214bc 100644 --- a/drivers/mtd/parsers/ofpart.c +++ b/drivers/mtd/parsers/ofpart.c @@ -61,6 +61,10 @@ static int parse_fixed_partitions(struct mtd_info *master, if (!dedicated && node_has_compatible(pp)) continue; + /* skip adding if a nvmem-cell is detected */ + if (of_property_read_bool(pp, "nvmem-cell")) + continue; + nr_parts++; } @@ -80,6 +84,10 @@ static int parse_fixed_partitions(struct mtd_info *master, if (!dedicated && node_has_compatible(pp)) continue; + /* skip adding if a nvmem-cell is detected */ + if (of_property_read_bool(pp, "nvmem-cell")) + continue; + reg = of_get_property(pp, "reg", &len); if (!reg) { if (dedicated) { -- 2.27.0
[PATCH v3 2/4] dt-bindings: mtd: partition: Document use of nvmem-provider
Document the use of these 2 new bindings, nvmem-provider and nvmem-cell, used to describe the nvmem cells that a subpartition provides to the nvmem API and the system. Nvmem cells are direct subnodes of the subpartition and are skipped by the 'fixed-partitions' parser if they contain the 'nvmem-cell' tag. The subpartition must have the 'nvmem-provider' tag or it will not register the cells with the nvmem API. Signed-off-by: Ansuel Smith --- .../devicetree/bindings/mtd/partition.txt | 59 +++ 1 file changed, 59 insertions(+) diff --git a/Documentation/devicetree/bindings/mtd/partition.txt b/Documentation/devicetree/bindings/mtd/partition.txt index 4a39698221a2..66d3a3f0a021 100644 --- a/Documentation/devicetree/bindings/mtd/partition.txt +++ b/Documentation/devicetree/bindings/mtd/partition.txt @@ -64,6 +64,16 @@ Optional properties: - slc-mode: This parameter, if present, allows one to emulate SLC mode on a partition attached to an MLC NAND thus making this partition immune to paired-pages corruptions +- nvmem-provider : Optionally a subpartition can be set as an nvmem-provider. This can + be very useful if some data like the mac-address is stored in a special partition + at a specific offset. Subnodes that describe nvmem-cells must have the 'nvmem-cell' + tag set or they will be treated as subpartitions instead of being skipped and registered + as nvmem cells. In this specific case '#address-cells' and '#size-cells' must be + provided. +- nvmem-cell : A direct subnode of a subpartition can be described as an nvmem-cell; it is + skipped by the fixed-partition parser and registered as an nvmem-cell of the registered + nvmem subpartition IF the subpartition contains the 'nvmem-provider' tag. If the subpartition + lacks such a tag the subnode will be skipped and the nvmem API won't register it. Examples: @@ -158,3 +168,52 @@ flash@3 { }; }; }; + +flash@0 { + partitions { + compatible = "fixed-partitions"; + #address-cells = <1>; + #size-cells = <1>; + + partition@0 { + label = "u-boot"; + reg = <0x000 0x10>; + read-only; + }; + + art: art@120 { + label = "art"; + reg = <0x120 0x014>; + read-only; + nvmem-provider; + + #address-cells = <1>; + #size-cells = <1>; + + macaddr_gmac1: macaddr_gmac1@0 { + nvmem-cell; + reg = <0x0 0x6>; + }; + + macaddr_gmac2: macaddr_gmac2@6 { + nvmem-cell; + reg = <0x6 0x6>; + }; + + macaddr_wifi: macaddr_wifi@6 { + nvmem-cell; + reg = <0x6 0x6>; + }; + + pre_cal_24g: pre_cal_24g@1000 { + nvmem-cell; + reg = <0x1000 0x2f20>; + }; + + pre_cal_5g: pre_cal_5g@5000{ + nvmem-cell; + reg = <0x5000 0x2f20>; + }; + }; + }; +}; \ No newline at end of file -- 2.27.0
[PATCH v3 0/4] Actually implement nvmem support for mtd
The mtd support for the nvmem api has been stalled since 2018, with a patch half pushed hoping that a scheme would be found for the mtd name later. This patchset tries to address this and adds a much-needed feature for the mac-address. My solution to the already discussed problem here [1] is to keep it simple. An mtd subpartition can be set as an nvmem-provider with a specific tag, and each direct subnode is treated by the nvmem api as an nvmem-cell and gets correctly registered. The mtd driver will treat a direct subnode of the subpartition as a legacy subpartition of the subpartition using the 'fixed-partition' parser. To fix this, every nvmem-cell has to have the 'nvmem-cell' tag and the fixed-partition parser will skip these nodes if this tag is detected. Simple as that. A subpartition with the tag 'nvmem-provider' will then be configured not to skip the OF registration, and the nvmem-cells are correctly registered and can be used to set the mac-address of the devices on the system. The last 2 patches try to address a problem in embedded devices (mostly routers) that have the mac-address saved in a dedicated partition as a ""master"" mac-address. Each device increments or decrements the extracted mac-address based on the device number. The increment function is implemented in the of_net function to get the mac-address that every net driver should already use if it doesn't have a trusted mac-address source. (One example of this implementation are the many ath10k devices that get the mac-address from the art mtd partition assigned to the network driver and increment it by 1 for the 2.4GHz wifi and by 2 for the 5GHz wifi.) I really hope my mtd nvmem implementation doesn't conflict with others and can be used, since this would remove many patches used to get mac-addresses and other nvmem data. [1] https://lore.kernel.org/patchwork/patch/765435/ Changes: v3: * Fix const discard warning in of_net.c * Add some info about overflow/underflow of mac-increment * Limit mac-increment-bytes to the last 3 bytes v2: * Fix compile error (missing mtd_node in mtdcore) Ansuel Smith (4): mtd: Add nvmem support for mtd nvmem-providers dt-bindings: mtd: partition: Document use of nvmem-provider of_net: add mac-address-increment support dt-bindings: net: Document use of mac-address-increment .../devicetree/bindings/mtd/partition.txt | 59 +++ .../bindings/net/ethernet-controller.yaml | 21 +++ drivers/mtd/mtdcore.c | 3 +- drivers/mtd/parsers/ofpart.c | 8 +++ drivers/of/of_net.c | 57 ++ 5 files changed, 134 insertions(+), 14 deletions(-) -- 2.27.0
[PATCH v3 3/4] of_net: add mac-address-increment support
Lots of embedded devices use the mac-address of another interface, extracted from nvmem cells, and increment it by one or two. Add two bindings to integrate this and directly use the right mac-address for the interface. Some examples are routers that use the gmac mac-address stored in the art partition and increment it by one for the wifi. The mac-address-increment-byte binding is used to tell what byte of the mac-address has to be increased (if not defined, the last byte is increased) and mac-address-increment tells by how much the chosen byte has to be increased. Signed-off-by: Ansuel Smith --- drivers/of/of_net.c | 57 ++--- 1 file changed, 44 insertions(+), 13 deletions(-) diff --git a/drivers/of/of_net.c b/drivers/of/of_net.c index 6e411821583e..bafbc833e659 100644 --- a/drivers/of/of_net.c +++ b/drivers/of/of_net.c @@ -45,7 +45,7 @@ int of_get_phy_mode(struct device_node *np, phy_interface_t *interface) } EXPORT_SYMBOL_GPL(of_get_phy_mode); -static const void *of_get_mac_addr(struct device_node *np, const char *name) +static void *of_get_mac_addr(struct device_node *np, const char *name) { struct property *pp = of_find_property(np, name, NULL); @@ -54,26 +54,31 @@ static const void *of_get_mac_addr(struct device_node *np, const char *name) return NULL; } -static const void *of_get_mac_addr_nvmem(struct device_node *np) +static void *of_get_mac_addr_nvmem(struct device_node *np, int *err) { int ret; - const void *mac; + void *mac; u8 nvmem_mac[ETH_ALEN]; struct platform_device *pdev = of_find_device_by_node(np); - if (!pdev) - return ERR_PTR(-ENODEV); + if (!pdev) { + *err = -ENODEV; + return NULL; + } ret = nvmem_get_mac_address(&pdev->dev, &nvmem_mac); if (ret) { put_device(&pdev->dev); - return ERR_PTR(ret); + *err = ret; + return NULL; } mac = devm_kmemdup(&pdev->dev, nvmem_mac, ETH_ALEN, GFP_KERNEL); put_device(&pdev->dev); - if (!mac) - return ERR_PTR(-ENOMEM); + if (!mac) { + *err = -ENOMEM; + return NULL; + } return mac; } @@ -98,24 +103,50 @@ static const void *of_get_mac_addr_nvmem(struct device_node *np) * this case, the real MAC is in 'local-mac-address', and 'mac-address' exists * but is all zeros. * + * DT can tell the system to increment the mac-address after is extracted by + * using: + * - mac-address-increment-byte to decide what byte to increase + * (if not defined is increased the last byte) + * - mac-address-increment to decide how much to increase. The value will + * not overflow to other bytes if the increment is over 255. + * (example 00:01:02:03:04:ff + 1 == 00:01:02:03:04:00) + * * Return: Will be a valid pointer on success and ERR_PTR in case of error. */ const void *of_get_mac_address(struct device_node *np) { - const void *addr; + u32 inc_idx, mac_inc; + int ret = 0; + u8 *addr; + + /* Check first if the increment byte is present and valid. +* If not set assume to increment the last byte if found.
+*/ + if (of_property_read_u32(np, "mac-address-increment-byte", &inc_idx)) + inc_idx = 5; + if (inc_idx < 3 || inc_idx > 5) + return ERR_PTR(-EINVAL); addr = of_get_mac_addr(np, "mac-address"); if (addr) - return addr; + goto found; addr = of_get_mac_addr(np, "local-mac-address"); if (addr) - return addr; + goto found; addr = of_get_mac_addr(np, "address"); if (addr) - return addr; + goto found; + + addr = of_get_mac_addr_nvmem(np, &ret); + if (ret) + return ERR_PTR(ret); + +found: + if (!of_property_read_u32(np, "mac-address-increment", &mac_inc)) + addr[inc_idx] += mac_inc; - return of_get_mac_addr_nvmem(np); + return addr; } EXPORT_SYMBOL(of_get_mac_address); -- 2.27.0
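The documented wrap-within-one-byte behavior falls straight out of u8 arithmetic in 'addr[inc_idx] += mac_inc': the addition truncates modulo 256 inside the single byte. A minimal standalone illustration (plain userspace C, not driver code):

    #include <stdio.h>

    int main(void)
    {
            unsigned char mac[6] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0xff };

            mac[5] += 1;    /* wraps within the byte: 0xff -> 0x00 */
            printf("%02x:%02x:%02x:%02x:%02x:%02x\n",
                   mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
            /* prints 00:01:02:03:04:00 - byte 4 is untouched */
            return 0;
    }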
[PATCH v3 4/4] dt-bindings: net: Document use of mac-address-increment
Two new bindings are now supported by the of_net driver to increase (or decrease) a mac-address. This can be very useful in cases where the system extracts the mac-address for the device from a dedicated partition and has a generic mac-address that needs to be incremented based on the device number. - mac-address-increment-byte is used to tell what byte must be incremented (if not set, the last byte is increased) - mac-address-increment is used to tell by how much to increment the chosen byte of the extracted mac-address. Signed-off-by: Ansuel Smith --- .../bindings/net/ethernet-controller.yaml | 21 +++ 1 file changed, 21 insertions(+) diff --git a/Documentation/devicetree/bindings/net/ethernet-controller.yaml b/Documentation/devicetree/bindings/net/ethernet-controller.yaml index fa2baca8c726..8174f64e79bd 100644 --- a/Documentation/devicetree/bindings/net/ethernet-controller.yaml +++ b/Documentation/devicetree/bindings/net/ethernet-controller.yaml @@ -32,6 +32,27 @@ properties: - minItems: 6 maxItems: 6 + mac-address-increment: +description: + The MAC address can optionally be increased (or decreased using + negative values) from the original value read (from a nvmem cell + for example). This can be used if the mac is read from a dedicated + partition and must be increased based on the number of device + present in the system. The increment will not overflow/underflow to + the other bytes if it's over 255 or 0. + (example 00:01:02:03:04:ff + 1 == 00:01:02:03:04:00) +minimum: -255 +maximum: 255 + + mac-address-increment-byte: +description: + If 'mac-address-increment' is defined, this will tell what byte of + the mac-address will be increased. If 'mac-address-increment' is + not defined, this option will do nothing. +default: 5 +minimum: 3 +maximum: 5 + max-frame-size: $ref: /schemas/types.yaml#definitions/uint32 description: -- 2.27.0
Re: [RFC PATCH 21/24] mm/hugetlb: Merge pte to huge pmd only for gigantic page
On Tue, Sep 15, 2020 at 9:03 PM Muchun Song wrote: > > Merge pte to huge pmd if it has ever been split. Now only support > gigantic page which's vmemmap pages size is an integer multiple of > PMD_SIZE. This is the simplest case to handle. > > Signed-off-by: Muchun Song > --- > include/linux/hugetlb.h | 7 +++ > mm/hugetlb.c| 104 +++- > 2 files changed, 109 insertions(+), 2 deletions(-) > > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h > index e3aa192f1c39..c56df0da7ae5 100644 > --- a/include/linux/hugetlb.h > +++ b/include/linux/hugetlb.h > @@ -611,6 +611,13 @@ static inline bool vmemmap_pmd_huge(pmd_t *pmd) > } > #endif > > +#ifndef vmemmap_pmd_mkhuge > +static inline pmd_t vmemmap_pmd_mkhuge(struct page *page) > +{ > + return pmd_mkhuge(mk_pmd(page, PAGE_KERNEL)); > +} > +#endif > + > #ifndef VMEMMAP_HPAGE_SHIFT > #define VMEMMAP_HPAGE_SHIFTPMD_SHIFT > #endif > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 28c154679838..3ca36e259b4e 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -1759,6 +1759,62 @@ static void __remap_huge_page_pte_vmemmap(struct page > *reuse, pte_t *ptep, > } > } > > +static void __replace_huge_page_pte_vmemmap(pte_t *ptep, unsigned long start, > + unsigned int nr, struct page > *huge, > + struct list_head *free_pages) > +{ > + unsigned long addr; > + unsigned long end = start + (nr << PAGE_SHIFT); > + > + for (addr = start; addr < end; addr += PAGE_SIZE, ptep++) { > + struct page *page; > + pte_t old = *ptep; > + pte_t entry; > + > + prepare_vmemmap_page(huge); > + > + entry = mk_pte(huge++, PAGE_KERNEL); > + VM_WARN_ON(!pte_present(old)); > + page = pte_page(old); > + list_add(&page->lru, free_pages); > + > + set_pte_at(&init_mm, addr, ptep, entry); > + } > +} > + > +static void replace_huge_page_pmd_vmemmap(pmd_t *pmd, unsigned long start, > + struct page *huge, > + struct list_head *free_pages) > +{ > + unsigned long end = start + VMEMMAP_HPAGE_SIZE; > + > + flush_cache_vunmap(start, end); > + __replace_huge_page_pte_vmemmap(pte_offset_kernel(pmd, start), start, > + VMEMMAP_HPAGE_NR, huge, free_pages); > + flush_tlb_kernel_range(start, end); > +} > + > +static pte_t *merge_vmemmap_pte(pmd_t *pmdp, unsigned long addr) > +{ > + pte_t *pte; > + struct page *page; > + > + pte = pte_offset_kernel(pmdp, addr); > + page = pte_page(*pte); > + set_pmd(pmdp, vmemmap_pmd_mkhuge(page)); > + > + return pte; > +} > + > +static void merge_huge_page_pmd_vmemmap(pmd_t *pmd, unsigned long start, > + struct page *huge, > + struct list_head *free_pages) > +{ > + replace_huge_page_pmd_vmemmap(pmd, start, huge, free_pages); > + pte_free_kernel(&init_mm, merge_vmemmap_pte(pmd, start)); > + flush_tlb_kernel_range(start, start + VMEMMAP_HPAGE_SIZE); > +} > + > static inline void alloc_vmemmap_pages(struct hstate *h, struct list_head > *list) > { > int i; > @@ -1772,6 +1828,15 @@ static inline void alloc_vmemmap_pages(struct hstate > *h, struct list_head *list) > } > } > > +static inline void dissolve_compound_page(struct page *page, unsigned int > order) > +{ > + int i; > + unsigned int nr_pages = 1 << order; > + > + for (i = 1; i < nr_pages; i++) > + set_page_refcounted(page + i); > +} > + > static void alloc_huge_page_vmemmap(struct hstate *h, struct page *head) > { > pmd_t *pmd; > @@ -1791,10 +1856,45 @@ static void alloc_huge_page_vmemmap(struct hstate *h, > struct page *head) > __remap_huge_page_pte_vmemmap); > if (!freed_vmemmap_hpage_dec(pmd_page(*pmd)) && pmd_split(pmd)) { > /* > -* Todo: > -* Merge pte to huge pmd if it has ever been split. 
> +* Merge pte to huge pmd if it has ever been split. Now only > +* support gigantic page which's vmemmap pages size is an > +* integer multiple of PMD_SIZE. This is the simplest case > +* to handle. > */ > clear_pmd_split(pmd); > + > + if (IS_ALIGNED(nr_vmemmap(h), VMEMMAP_HPAGE_NR)) { > + unsigned long addr = (unsigned long)head; > + unsigned long end = addr + nr_vmemmap_size(h); > + > + spin_unlock(ptl); > + > + for (; addr < end; addr += VMEMMAP_HPAGE_SIZE) { > + void
Re: [PATCH] PR_SPEC_DISABLE_NOEXEC support for arm64.
> As a heads up: I'm currently reworking most of this, and hope to post > something within the next two weeks. Sure. Let me know whether you want to implement the PR_SPEC_DISABLE_NOEXEC support directly or whether this patch would be relevant even after your rework. > > > diff --git a/arch/arm64/include/asm/ssbd.h b/arch/arm64/include/asm/ssbd.h > > new file mode 100644 > > index ..68c716dc5811 > > --- /dev/null > > +++ b/arch/arm64/include/asm/ssbd.h ... > > +} > > I'd prefer to keep these where they are and have an out-of-line call if > necessary. We should try to keep the SSBD stuff in one place. OK. > > > + > > +#endif /* __ASM_SSBD_H */ > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c > > index 6089638c7d43..ad3c67c86c4c 100644 > > --- a/arch/arm64/kernel/process.c > > +++ b/arch/arm64/kernel/process.c > > @@ -54,6 +54,7 @@ > > #include > > #include > > #include > > +#include > > #include > > > > #if defined(CONFIG_STACKPROTECTOR) && > > !defined(CONFIG_STACKPROTECTOR_PER_TASK) > > @@ -588,6 +589,18 @@ void arch_setup_new_exec(void) > > current->mm->context.flags = is_compat_task() ? MMCF_AARCH32 : 0; > > > > ptrauth_thread_init_user(current); > > + > > + /* > > + * Don't inherit TIF_SSBD across exec boundary when > > + * PR_SPEC_DISABLE_NOEXEC is used. > > + */ > > + if (test_thread_flag(TIF_SSBD) && > > + task_spec_ssb_noexec(current)) { > > + clear_thread_flag(TIF_SSBD); > > + task_clear_spec_ssb_disable(current); > > + task_clear_spec_ssb_noexec(current); > > + ssbd_ssbs_enable(current); > > + } > > How is this supposed to work with CPUs that expose SSBS directly to > userspace? I suppose we should be using PR_SPEC_DISABLE_NOEXEC to decide > what we set the SSBS bit to on exec, but the logic here requires TIF_SSBD > to be set and so won't trigger afaict. > You're right. The SSBS support is incomplete. I guess "test_thread_flag(TIF_SSBD)" can be replaced just with "arm64_get_ssbd_state() == ARM64_SSBD_KERNEL". Thanks, Anthony
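For reference, the userspace side of the knob being discussed is the generic speculation-control prctl; a minimal sketch (the fallback #defines guard against older uapi headers):

	#include <stdio.h>
	#include <sys/prctl.h>

	#ifndef PR_SET_SPECULATION_CTRL
	#define PR_SET_SPECULATION_CTRL	53
	#endif
	#ifndef PR_SPEC_STORE_BYPASS
	#define PR_SPEC_STORE_BYPASS	0
	#endif
	#ifndef PR_SPEC_DISABLE_NOEXEC
	#define PR_SPEC_DISABLE_NOEXEC	(1UL << 4)
	#endif

	int main(void)
	{
		/* Disable Speculative Store Bypass, but let execve() undo it. */
		if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
			  PR_SPEC_DISABLE_NOEXEC, 0, 0))
			perror("PR_SPEC_DISABLE_NOEXEC");
		return 0;
	}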
Re: [PATCH v8 04/20] gpio: uapi: define uAPI v2
On Tue, Sep 15, 2020 at 01:06:36PM +0300, Andy Shevchenko wrote: > On Wed, Sep 9, 2020 at 1:29 PM Kent Gibson wrote: > > > > Thanks again for doing this and sorry for quite a delay from my side. > Below my comments regarding uAPI v2 (mostly nit-picks as I agree with > it in general). > No problems with most of your points - so I'll chop those bits out. Responses to the remainder below... > > +/** > > + * enum gpio_v2_line_attr_id - &struct gpio_v2_line_attribute.id values > > Perhaps per-item description (couple of words per each)? > I'll add kernel doc for all the enum values - see if that works. > > + * GPIO_V2_LINE_FLAG_OUTPUT etc, OR:ed together. This is the default for > > I think "OR:ed together" is understandable for programmers but a bit > special for normal phrases. I would rephrase it somehow, but I'm not a > native speaker. > Something like "or any of their combinations". > My bad - I was assuming the primary audience was programmers ;-). Actually that description is drawn from the v1 uAPI, as is the case for all of the fields that were carried over. And "or any of their combinations" is misleading - some combinations are invalid. I'll take another look at it, but I'm ok with it as is. (and I'm ok with 'ok' over 'OK' or 'okay' as well, btw ;-) > > + * @attrs: the configuration attributes associated with the requested > > + * lines. Any attribute should only be associated with a particular line > > + * once. If an attribute is associated with a line multiple times then the > > + * first occurrence (i.e. lowest index) has precedence. > > This is a bit unexpected. I anticipated last come last served as in a > shell script argument list. > I don't want to encourage userspace to just add another entry to attrs to encode a change. The behaviour was originally as you describe, but was changed in v3 to encourage userspace to keep the configuration attributes as simplified as possible, and to make clearer that the fields, particularly the flags bits, are not overlayed. > > + /* > > +* Pad to fill implicit padding and provide space for future use. > > +*/ > > > + __u32 padding[5]; > > This is still questionable. First of all why to waste so many bytes (I > propose 5 -> 1) and how will you do when the structure changes (we > still need some type of versioning or flags for which one u32 is more > than enough). > Ack - we disagree on how to manage future changes and the impact of reserving space. > > + * @num_lines: number of lines requested in this request, i.e. the number > > + * of valid fields in the GPIO_V2_LINES_MAX sized arrays, set to 1 to > > + * request a single line > > I would rather call it "amount" or something which is one word, but > you may have a reason for the current, so I don't insist. > Also, I would describe what will be the expected behaviour outside of > the range (and what is the range?). > For example, would it mean that we consider all lines if num_lines >= > _LINES_MAX? > Using num_X to describe the number of active entries in array X is a common pattern, so I'll stick with that. And num_lines > _LINES_MAX is interpreted as invalid - since they won't fit. The EINVAL the user will get if they try should make that clear. > > + * @consumer: a functional name for the consumer of this GPIO line as set > > + * by whatever is using it, will be empty if there is no current user but > > + * may also be empty if the consumer doesn't set this up > > In both cases "empty" means what? All \0:s, only first \0, white spaces? 
> Another one inherited from v1 - empty => consumer[0] == '\0'. Cheers, Kent.
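To make the num_lines/attrs discussion above concrete, a minimal userspace request sketch assuming the v2 structs from this series; the chip path and line offset are placeholders:

	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/ioctl.h>
	#include <linux/gpio.h>

	int main(void)
	{
		struct gpio_v2_line_request req;
		int fd = open("/dev/gpiochip0", O_RDWR);

		if (fd < 0)
			return 1;

		memset(&req, 0, sizeof(req));
		req.offsets[0] = 3;
		req.num_lines = 1;	/* number of valid entries in offsets[] */
		req.config.flags = GPIO_V2_LINE_FLAG_INPUT;
		/* config.num_attrs left 0: no per-line attribute overrides */
		strncpy(req.consumer, "demo", sizeof(req.consumer) - 1);

		if (ioctl(fd, GPIO_V2_GET_LINE_IOCTL, &req) < 0)
			perror("GPIO_V2_GET_LINE_IOCTL");
		/* On success req.fd is the anonymous file for the requested line. */
		return 0;
	}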
possible deadlock in xfrm_policy_lookup_bytype
Hello, syzbot found the following issue on: HEAD commit:5fa35f24 Add linux-next specific files for 20200916 git tree: linux-next console output: https://syzkaller.appspot.com/x/log.txt?x=16ec67fd90 kernel config: https://syzkaller.appspot.com/x/.config?x=6bdb7e39caf48f53 dashboard link: https://syzkaller.appspot.com/bug?extid=4cbd5e3669aee5ac5149 compiler: gcc (GCC) 10.1.0-syz 20200507 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17ca2c7390 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1366870190 The issue was bisected to: commit 1909760f5fc3f123e47b4e24e0ccdc0fc8f3f106 Author: Ahmed S. Darwish Date: Fri Sep 4 15:32:31 2020 + seqlock: PREEMPT_RT: Do not starve seqlock_t writers bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=156b27dd90 final oops: https://syzkaller.appspot.com/x/report.txt?x=176b27dd90 console output: https://syzkaller.appspot.com/x/log.txt?x=136b27dd90 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+4cbd5e3669aee5ac5...@syzkaller.appspotmail.com Fixes: 1909760f5fc3 ("seqlock: PREEMPT_RT: Do not starve seqlock_t writers") WARNING: possible irq lock inversion dependency detected 5.9.0-rc5-next-20200916-syzkaller #0 Not tainted syz-executor974/6847 just changed the state of lock: 8ae7a3c8 (&s->seqcount#9){+..-}-{0:0}, at: xfrm_policy_lookup_bytype+0x183/0xa40 net/xfrm/xfrm_policy.c:2088 but this lock took another, SOFTIRQ-unsafe lock in the past: (&s->seqcount#8){+.+.}-{0:0} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0CPU1 lock(&s->seqcount#8); local_irq_disable(); lock(&s->seqcount#9); lock(&s->seqcount#8); lock(&s->seqcount#9); *** DEADLOCK *** 4 locks held by syz-executor974/6847: #0: 8aae80a8 (rtnl_mutex){+.+.}-{3:3}, at: tun_detach drivers/net/tun.c:687 [inline] #0: 8aae80a8 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3a/0x180 drivers/net/tun.c:3390 #1: c9007d80 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: lockdep_copy_map include/linux/lockdep.h:35 [inline] #1: c9007d80 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0xd5/0x6b0 kernel/time/timer.c:1403 #2: 89e71cc0 (rcu_read_lock){}-{1:2}, at: read_pnet include/net/net_namespace.h:327 [inline] #2: 89e71cc0 (rcu_read_lock){}-{1:2}, at: dev_net include/linux/netdevice.h:2290 [inline] #2: 89e71cc0 (rcu_read_lock){}-{1:2}, at: mld_sendpack+0x165/0xdb0 net/ipv6/mcast.c:1646 #3: 89e71cc0 (rcu_read_lock){}-{1:2}, at: xfrm_policy_lookup_bytype+0x104/0xa40 net/xfrm/xfrm_policy.c:2082 the shortest dependencies between 2nd lock and 1st lock: -> (&s->seqcount#8){+.+.}-{0:0} { HARDIRQ-ON-W at: lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5398 write_seqcount_t_begin_nested include/linux/seqlock.h:509 [inline] write_seqcount_t_begin include/linux/seqlock.h:535 [inline] write_seqlock include/linux/seqlock.h:883 [inline] xfrm_set_spdinfo+0x302/0x660 net/xfrm/xfrm_user.c:1185 xfrm_user_rcv_msg+0x414/0x700 net/xfrm/xfrm_user.c:2684 netlink_rcv_skb+0x15a/0x430 net/netlink/af_netlink.c:2470 xfrm_netlink_rcv+0x6b/0x90 net/xfrm/xfrm_user.c:2692 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:671 sys_sendmsg+0x6e8/0x810 net/socket.c:2362 ___sys_sendmsg+0xf3/0x170 net/socket.c:2416 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2449 
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 SOFTIRQ-ON-W at: lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5398 write_seqcount_t_begin_nested include/linux/seqlock.h:509 [inline] write_seqcount_t_begin include/linux/seqlock.h:535 [inline] write_seqlock include/linux/seqlock.h:883 [inline] xfrm_set_spdinfo+0x302/0x660 net/xfrm/xfrm_user.c:1185 xfrm_user_rcv_msg+0x414/0x700 net/xfrm/xfrm_user.
[PATCH 1/2] mips: Add strong UC ordering config
In accordance with [1, 2] memory transactions using CCA=2 (Uncached Cacheability and Coherency Attribute) are always strongly ordered. This means that younger memory accesses using CCA=2 are never allowed to be executed before older memory accesses using CCA=2 (no bypassing is allowed), and Loads and Stores using CCA=2 are never speculative. It is expected by the specification that the rest of the system maintains these properties for processor-initiated uncached accesses. So the system IO interconnect doesn't reorder uncached transactions once they have left the processor subsystem. Taking into account these properties and what [3] says about the relaxed IO-accessors we can infer that normal Loads and Stores from/to CCA=2 memory and without any additional execution barriers will fully comply with the {read,write}X_relaxed() methods requirements. Let's then convert the currently generated relaxed IO-accessors to pure Loads and Stores. Seeing that commit 3d474dacae72 ("MIPS: Enforce strong ordering for MMIO accessors") and commit 8b656253a7a4 ("MIPS: Provide actually relaxed MMIO accessors") have already made the preparations in the corresponding macro, we can do that just by replacing the "barrier" parameter utilization with the "relax" one. Note the "barrier" macro argument can be removed, since it isn't fully used anyway other than being always assigned to 1. Of course it would be foolish to believe that all the available MIPS-based CPUs completely follow the denoted specification, especially considering how old the architecture is. Instead we introduce a dedicated kernel config, which when enabled will convert the relaxed IO-accessors to pure Loads and Stores without any additional barriers around. So if some CPU supports the strongly ordered UC memory access, it can enable that config and use the fully optimized relaxed IO-methods. For instance, Baikal-T1 architecture support code will do that. [1] MIPS Coherence Protocol Specification, Document Number: MD00605, Revision 01.01. September 14, 2015, 4.2 Execution Order Behavior, p. 33 [2] MIPS Coherence Protocol Specification, Document Number: MD00605, Revision 01.01. September 14, 2015, 4.8.1 IO Device Access, p. 58 [3] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt, Section "KERNEL I/O BARRIER EFFECTS" Signed-off-by: Serge Semin Cc: Maciej W. 
Rozycki --- arch/mips/Kconfig | 8 arch/mips/include/asm/io.h | 20 ++-- 2 files changed, 18 insertions(+), 10 deletions(-) diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index c95fa3a2484c..2c82d927347d 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -2066,6 +2066,14 @@ config WEAK_ORDERING # config WEAK_REORDERING_BEYOND_LLSC bool + +# +# CPU may not reorder reads and writes R->R, R->W, W->R, W->W within Uncached +# Cacheability and Coherency Attribute (CCA=2) +# +config STRONG_UC_ORDERING + bool + endmenu # diff --git a/arch/mips/include/asm/io.h b/arch/mips/include/asm/io.h index 78537aa23500..130c4b6458fc 100644 --- a/arch/mips/include/asm/io.h +++ b/arch/mips/include/asm/io.h @@ -213,7 +213,7 @@ void iounmap(const volatile void __iomem *addr); #define war_io_reorder_wmb() barrier() #endif -#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, barrier, relax, irq)\ +#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, relax, irq) \ \ static inline void pfx##write##bwlq(type val, \ volatile void __iomem *mem) \ @@ -221,7 +221,7 @@ static inline void pfx##write##bwlq(type val, \ volatile type *__mem; \ type __val; \ \ - if (barrier)\ + if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING))) \ iobarrier_rw(); \ else\ war_io_reorder_wmb(); \ @@ -262,7 +262,7 @@ static inline type pfx##read##bwlq(const volatile void __iomem *mem)\ \ __mem = (void *)__swizzle_addr_##bwlq((unsigned long)(mem));\ \ - if (barrier)\ + if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING))) \ iobarrier_rw(); \ \ if (sizeof(type) != sizeof(u64) || size
[PATCH 2/2] mips: Introduce MIPS CM2 GCR Control register accessors
For some reason these accessors have been absent from the MIPS kernel, while some of them can be used to tune the MIPS code execution up (the default values are fully acceptable though). For instance, in the framework of MIPS P5600/P6600 (see [1] for details) if we are sure the IO interconnect doesn't reorder the requests we can freely set GCR_CONTROL.SYNCDIS, which will make the CM2 respond to SYNCs just after a request is accepted on the L2/Memory interface instead of executing the legacy SYNC and waiting for the full response from L2/Memory. Needless to say this will significantly speed up the {read,write}X() IO-accessors due to having more lightweight barriers around the IO Loads and Stores. There are other MIPS Coherency Manager optimizations available in the framework of that register, like cache ops serialization limits, speculative read enable, etc., which can be useful for various MIPS platforms. [1] MIPS32 P5600 Multiprocessing System Software User's Manual, Document Number: MD01025, Revision 01.60, April 19, 2016, p. 400 Signed-off-by: Serge Semin --- Folks, do you think it would be better to implement a dedicated config for arch/mips/kernel/mips-cm.c code, which would disable the SI_SyncTxEn acceptance by setting the GCR_CONTROL.SYNCDIS bit? Currently I intend to set it in our platform-specific prom_init() method. --- arch/mips/include/asm/mips-cm.h | 15 +++ 1 file changed, 15 insertions(+) diff --git a/arch/mips/include/asm/mips-cm.h b/arch/mips/include/asm/mips-cm.h index aeae2effa123..17b2adf57e0c 100644 --- a/arch/mips/include/asm/mips-cm.h +++ b/arch/mips/include/asm/mips-cm.h @@ -143,6 +143,21 @@ GCR_ACCESSOR_RW(64, 0x008, base) #define CM_GCR_BASE_CMDEFTGT_IOCU02 #define CM_GCR_BASE_CMDEFTGT_IOCU13 +/* GCR_CONTROL - Global CM2 Settings */ +GCR_ACCESSOR_RW(64, 0x010, control) +#define CM_GCR_CONTROL_SYNCCTL BIT(16) +#define CM_GCR_CONTROL_SYNCDIS BIT(5) +#define CM_GCR_CONTROL_IVU_EN BIT(4) +#define CM_GCR_CONTROL_SHST_EN BIT(3) +#define CM_GCR_CONTROL_PARK_EN BIT(2) +#define CM_GCR_CONTROL_MMIO_LIMIT_DIS BIT(1) +#define CM_GCR_CONTROL_SPEC_READ_ENBIT(0) + +/* GCR_CONTROL2 - Global CM2 Settings (continue) */ +GCR_ACCESSOR_RW(64, 0x018, control2) +#define CM_GCR_CONTROL2_L2_CACHEOP_LIMIT GENMASK(19, 16) +#define CM_GCR_CONTROL2_L1_CACHEOP_LIMIT GENMASK(3, 0) + /* GCR_ACCESS - Controls core/IOCU access to GCRs */ GCR_ACCESSOR_RW(32, 0x020, access) #define CM_GCR_ACCESS_ACCESSEN GENMASK(7, 0) -- 2.27.0
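A sketch of the intended platform-code usage, assuming the accessors generated by GCR_ACCESSOR_RW() above and a system where the CM is present and the IO interconnect is known not to reorder uncached requests; the function name is illustrative:

	#include <linux/init.h>
	#include <asm/mips-cm.h>

	static void __init demo_tune_cm2(void)
	{
		if (!mips_cm_present())
			return;

		/* Let SYNCs complete on L2/Memory acceptance instead of the
		 * full response, per the SYNCDIS description above.
		 */
		write_gcr_control(read_gcr_control() | CM_GCR_CONTROL_SYNCDIS);
	}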
[PATCH 0/2] mips: Introduce some IO-accessors optimizations
It has been discovered that on our MIPS P5600-based CPU the IO accessors aren't as fast as they could be, even taking into account a relatively slow AXI2APB bridge embedded into the system interconnect. It turned out we can introduce two types of optimizations. First we can remove the execution barriers from the relaxed IO-accessors as our CPU conforms to the MIPS Coherency Protocol Specification [1, 2]. Of course it also concerns the IO interconnect implementation. So in accordance with [3] we suggest removing the barriers at least for the platforms which conform to the specification the same way ours does. Second there is a dedicated Coherency Manager control register, which can also be used to tune the IO methods up. For some reason it hasn't been added to the MIPS arch code so far, while it provides flags for instance to speed up the SYNC barrier on platforms with a non-reordering IO interconnect, to set the cache ops serialization limits, enable the speculative reads, etc. For now we suggest adding just the macros with the CM2 GCR_CONTROL register accessors and field descriptions. So any platform could use them to activate the corresponding optimization. Platform-wise we'll do this in the framework of our Baikal-T1 platform code in the prom_init() method. [1] MIPS Coherence Protocol Specification, Document Number: MD00605, Revision 01.01. September 14, 2015, 4.2 Execution Order Behavior, p. 33 [2] MIPS Coherence Protocol Specification, Document Number: MD00605, Revision 01.01. September 14, 2015, 4.8.1 IO Device Access, p. 58 [3] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt, Section "KERNEL I/O BARRIER EFFECTS" Signed-off-by: Serge Semin Cc: Alexey Malahov Cc: Pavel Parkhomenko Cc: Vadim Vlasov Cc: Maciej W. Rozycki Cc: linux-m...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Serge Semin (2): mips: Add strong UC ordering config mips: Introduce MIPS CM2 GCR Control register accessors arch/mips/Kconfig | 8 arch/mips/include/asm/io.h | 20 ++-- arch/mips/include/asm/mips-cm.h | 15 +++ 3 files changed, 33 insertions(+), 10 deletions(-) -- 2.27.0
[PATCH 1/3] hwmon: bt1-pvt: Test sensor power supply on probe
The Baikal-T1 PVT sensor has got a dedicated power supply domain (fed by the external GPVT/VPVT_18 pins). In case it isn't powered up, the registers will be accessible, but the sensor conversion just won't happen. Due to that an attempt to read data from any PVT sensor will cause the task to hang up. For instance that will happen if the XP11 jumper isn't installed on the Baikal-T1-based BFK3.1 board. Let's at least test whether the conversion works during the device probe procedure. Doing so will make sure that the PVT sensor is powered up at least at boot time. Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver") Signed-off-by: Serge Semin --- drivers/hwmon/bt1-pvt.c | 40 1 file changed, 40 insertions(+) diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c index 94698cae0497..f4b7353c078a 100644 --- a/drivers/hwmon/bt1-pvt.c +++ b/drivers/hwmon/bt1-pvt.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -982,6 +983,41 @@ static int pvt_request_clks(struct pvt_hwmon *pvt) return 0; } +static int pvt_check_pwr(struct pvt_hwmon *pvt) +{ + unsigned long tout; + int ret = 0; + u32 data; + + /* +* Test out the sensor conversion functionality. If it is not done on +* time then the domain must have been unpowered and we won't be able +* to use the device later in this driver. +* Note if the power source is lost during the normal driver work the +* data read procedure will either return -ETIMEDOUT (for the +* alarm-less driver configuration) or just stop the repeated +* conversion. In the latter case alas we won't be able to detect the +* problem. +*/ + pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_ALL, PVT_INTR_ALL); + pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, PVT_CTRL_EN); + pvt_set_tout(pvt, 0); + readl(pvt->regs + PVT_DATA); + + tout = PVT_TOUT_MIN / NSEC_PER_USEC; + usleep_range(tout, 2 * tout); + + data = readl(pvt->regs + PVT_DATA); + if (!(data & PVT_DATA_VALID)) { + ret = -ENODEV; + dev_err(pvt->dev, "Sensor is powered down\n"); + } + + pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0); + + return ret; +} + static void pvt_init_iface(struct pvt_hwmon *pvt) { u32 trim, temp; @@ -1109,6 +1145,10 @@ static int pvt_probe(struct platform_device *pdev) if (ret) return ret; + ret = pvt_check_pwr(pvt); + if (ret) + return ret; + pvt_init_iface(pvt); ret = pvt_request_irq(pvt); -- 2.27.0
[PATCH 0/3] hwmon: bt1-pvt: Fix PVT sensor being unpowered
Baikal-T1 PVT sensor has got a dedicated power domain with 1.8V intended to be supplied via the external GPVT and VPVT_18 pins. In case if the power isn't supplied, the sensor IO registers will be accessible, but the data conversion just won't happen. The situation of the power loss currently will cause the temperature and voltage read procedures to hang up. The problem is fixed in the framework of this patchset firstly by checking whether the conversion works at the sensor probe procedure, and secondly by keeping the current conversion update timeout in the private driver data cache and using it to set the wait-completion timeout. Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver") Signed-off-by: Serge Semin Cc: Maxim Kaurkin Cc: Alexey Malahov Cc: Pavel Parkhomenko Cc: linux-m...@vger.kernel.org Cc: linux-hw...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Serge Semin (3): hwmon: bt1-pvt: Test sensor power supply on probe hwmon: bt1-pvt: Cache current update timeout hwmon: bt1-pvt: Wait for the completion with timeout drivers/hwmon/bt1-pvt.c | 138 drivers/hwmon/bt1-pvt.h | 3 + 2 files changed, 101 insertions(+), 40 deletions(-) -- 2.27.0
[PATCH 3/3] hwmon: bt1-pvt: Wait for the completion with timeout
If the PVT sensor is suddenly powered down while a caller is waiting for the conversion completion, the request won't be finished and the task will hang up on this procedure until the power is back up again. Let's call the wait_for_completion_timeout() method instead to prevent that. The cached timeout is exactly what we need to predict for how long conversion could normally last. Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver") Signed-off-by: Serge Semin --- drivers/hwmon/bt1-pvt.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c index 2600426a3b21..3e1d56585b91 100644 --- a/drivers/hwmon/bt1-pvt.c +++ b/drivers/hwmon/bt1-pvt.c @@ -477,6 +477,7 @@ static int pvt_read_data(struct pvt_hwmon *pvt, enum pvt_sensor_type type, long *val) { struct pvt_cache *cache = &pvt->cache[type]; + unsigned long timeout; u32 data; int ret; @@ -500,7 +501,14 @@ static int pvt_read_data(struct pvt_hwmon *pvt, enum pvt_sensor_type type, pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID, 0); pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, PVT_CTRL_EN); - wait_for_completion(&cache->conversion); + /* +* Wait with timeout since in case if the sensor is suddenly powered +* down the request won't be completed and the caller will hang up on +* this procedure until the power is back up again. Multiply the +* timeout by the factor of two to prevent a false timeout. +*/ + timeout = 2 * usecs_to_jiffies(ktime_to_us(pvt->timeout)); + ret = wait_for_completion_timeout(&cache->conversion, timeout); pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0); pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID, @@ -510,6 +518,9 @@ static int pvt_read_data(struct pvt_hwmon *pvt, enum pvt_sensor_type type, mutex_unlock(&pvt->iface_mtx); + if (!ret) + return -ETIMEDOUT; + if (type == PVT_TEMP) *val = pvt_calc_poly(&poly_N_to_temp, data); else -- 2.27.0
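The key detail the patch relies on is the wait_for_completion_timeout() return convention, shown here in isolation (names hypothetical):

	#include <linux/completion.h>
	#include <linux/errno.h>

	static int demo_wait(struct completion *done, unsigned long tout_jiffies)
	{
		/* Returns 0 on timeout, otherwise the number of jiffies left. */
		if (!wait_for_completion_timeout(done, tout_jiffies))
			return -ETIMEDOUT;

		return 0;
	}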
[PATCH 2/3] hwmon: bt1-pvt: Cache current update timeout
Instead of converting the update timeout data to the milliseconds each time on the read procedure let's preserve the currently set timeout in the dedicated driver private data cache. The cached value will be then used in the timeout read method and in the alarm-less data conversion to prevent the caller task hanging up in case if the PVT sensor is suddenly powered down. Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver") Signed-off-by: Serge Semin --- drivers/hwmon/bt1-pvt.c | 85 ++--- drivers/hwmon/bt1-pvt.h | 3 ++ 2 files changed, 49 insertions(+), 39 deletions(-) diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c index f4b7353c078a..2600426a3b21 100644 --- a/drivers/hwmon/bt1-pvt.c +++ b/drivers/hwmon/bt1-pvt.c @@ -655,44 +655,16 @@ static int pvt_write_trim(struct pvt_hwmon *pvt, long val) static int pvt_read_timeout(struct pvt_hwmon *pvt, long *val) { - unsigned long rate; - ktime_t kt; - u32 data; - - rate = clk_get_rate(pvt->clks[PVT_CLOCK_REF].clk); - if (!rate) - return -ENODEV; - - /* -* Don't bother with mutex here, since we just read data from MMIO. -* We also have to scale the ticks timeout up to compensate the -* ms-ns-data translations. -*/ - data = readl(pvt->regs + PVT_TTIMEOUT) + 1; + int ret; - /* -* Calculate ref-clock based delay (Ttotal) between two consecutive -* data samples of the same sensor. So we first must calculate the -* delay introduced by the internal ref-clock timer (Tref * Fclk). -* Then add the constant timeout cuased by each conversion latency -* (Tmin). The basic formulae for each conversion is following: -* Ttotal = Tref * Fclk + Tmin -* Note if alarms are enabled the sensors are polled one after -* another, so in order to have the delay being applicable for each -* sensor the requested value must be equally redistirbuted. -*/ -#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS) - kt = ktime_set(PVT_SENSORS_NUM * (u64)data, 0); - kt = ktime_divns(kt, rate); - kt = ktime_add_ns(kt, PVT_SENSORS_NUM * PVT_TOUT_MIN); -#else - kt = ktime_set(data, 0); - kt = ktime_divns(kt, rate); - kt = ktime_add_ns(kt, PVT_TOUT_MIN); -#endif + ret = mutex_lock_interruptible(&pvt->iface_mtx); + if (ret) + return ret; /* Return the result in msec as hwmon sysfs interface requires. */ - *val = ktime_to_ms(kt); + *val = ktime_to_ms(pvt->timeout); + + mutex_unlock(&pvt->iface_mtx); return 0; } @@ -700,7 +672,7 @@ static int pvt_read_timeout(struct pvt_hwmon *pvt, long *val) static int pvt_write_timeout(struct pvt_hwmon *pvt, long val) { unsigned long rate; - ktime_t kt; + ktime_t kt, cache; u32 data; int ret; @@ -713,7 +685,7 @@ static int pvt_write_timeout(struct pvt_hwmon *pvt, long val) * between all available sensors to have the requested delay * applicable to each individual sensor. 
*/ - kt = ms_to_ktime(val); + cache = kt = ms_to_ktime(val); #if defined(CONFIG_SENSORS_BT1_PVT_ALARMS) kt = ktime_divns(kt, PVT_SENSORS_NUM); #endif @@ -742,6 +714,7 @@ static int pvt_write_timeout(struct pvt_hwmon *pvt, long val) return ret; pvt_set_tout(pvt, data); + pvt->timeout = cache; mutex_unlock(&pvt->iface_mtx); @@ -1018,10 +991,17 @@ static int pvt_check_pwr(struct pvt_hwmon *pvt) return ret; } -static void pvt_init_iface(struct pvt_hwmon *pvt) +static int pvt_init_iface(struct pvt_hwmon *pvt) { + unsigned long rate; u32 trim, temp; + rate = clk_get_rate(pvt->clks[PVT_CLOCK_REF].clk); + if (!rate) { + dev_err(pvt->dev, "Invalid reference clock rate\n"); + return -ENODEV; + } + /* * Make sure all interrupts and controller are disabled so not to * accidentally have ISR executed before the driver data is fully @@ -1036,12 +1016,37 @@ static void pvt_init_iface(struct pvt_hwmon *pvt) pvt_set_mode(pvt, pvt_info[pvt->sensor].mode); pvt_set_tout(pvt, PVT_TOUT_DEF); + /* +* Preserve the current ref-clock based delay (Ttotal) between the +* sensors data samples in the driver data so not to recalculate it +* each time on the data requests and timeout reads. It consists of the +* delay introduced by the internal ref-clock timer (N / Fclk) and the +* constant timeout caused by each conversion latency (Tmin): +* Ttotal = N / Fclk + Tmin +* If alarms are enabled the sensors are polled one after another and +* in order to get the next measurement of a particular sensor the +* caller will have to wait for at most until all the others are +
[rcu:dev.2020.09.17b] BUILD SUCCESS ef749b8a4cc43dcbdf6e5848acdc2cb8d6950b45
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev.2020.09.17b branch HEAD: ef749b8a4cc43dcbdf6e5848acdc2cb8d6950b45 fixup! locktorture: Prevent hangs for invalid arguments elapsed time: 722m configs tested: 110 configs skipped: 2 The following configs have been built successfully. More configs may be tested in the coming days. gcc tested configs: arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig powerpc maple_defconfig sh microdev_defconfig armneponset_defconfig powerpc kmeter1_defconfig powerpc ps3_defconfig powerpc tqm8541_defconfig mips pic32mzda_defconfig armmmp2_defconfig sparcalldefconfig powerpcicon_defconfig riscvnommu_k210_defconfig sh kfr2r09-romimage_defconfig m68k atari_defconfig powerpc rainier_defconfig powerpc sbc8548_defconfig mips jazz_defconfig powerpc pseries_defconfig mips capcella_defconfig xtensa alldefconfig powerpc tqm8540_defconfig powerpc mpc512x_defconfig m68k m5208evb_defconfig csky alldefconfig powerpc tqm8555_defconfig h8300 h8s-sim_defconfig sh rsk7264_defconfig sh se7724_defconfig mips decstation_64_defconfig mips ci20_defconfig powerpcsocrates_defconfig shhp6xx_defconfig shedosk7705_defconfig arm pxa_defconfig powerpc mpc832x_mds_defconfig arm h5000_defconfig mips ip32_defconfig mips ip22_defconfig powerpc allnoconfig ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig arc allyesconfig nds32 allnoconfig c6x allyesconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig i386defconfig mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig x86_64 randconfig-a005-20200920 x86_64 randconfig-a003-20200920 x86_64 randconfig-a004-20200920 x86_64 randconfig-a002-20200920 x86_64 randconfig-a006-20200920 x86_64 randconfig-a001-20200920 i386 randconfig-a002-20200920 i386 randconfig-a006-20200920 i386 randconfig-a003-20200920 i386 randconfig-a004-20200920 i386 randconfig-a005-20200920 i386 randconfig-a001-20200920 i386 randconfig-a012-20200920 i386 randconfig-a014-20200920 i386 randconfig-a016-20200920 i386 randconfig-a013-20200920 i386 randconfig-a011-20200920 i386 randconfig-a015-20200920 riscvallyesconfig riscvnommu_virt_defconfig riscv allnoconfig riscv defconfig riscv rv32_defconfig riscvallmodconfig x86_64 rhel x86_64 allyesconfig
Re: [PATCH v7 07/20] gpiolib: cdev: support GPIO_V2_GET_LINE_IOCTL and GPIO_V2_LINE_GET_VALUES_IOCTL
On Tue, Sep 15, 2020 at 01:31:27PM +0300, Andy Shevchenko wrote: > On Sat, Sep 5, 2020 at 4:49 PM Kent Gibson wrote: > > > > Add support for requesting lines using the GPIO_V2_GET_LINE_IOCTL, and > > returning their current values using GPIO_V2_LINE_GET_VALUES_IOCTL. > > > > The struct linereq implementation is based on the v1 struct linehandle > > implementation. > > > > Signed-off-by: Kent Gibson > > --- > > [snip] > > > + /* Bias requires explicit direction. */ > > + if ((flags & GPIO_V2_LINE_BIAS_FLAGS) && > > + !(flags & GPIO_V2_LINE_DIRECTION_FLAGS)) > > + return -EINVAL; > > Why is this? If I request a line as is and after set a bias, should I > really care about direction? > Yeah, you probably should be aware of the direction if they are setting the bias, so this makes sure they are. The practical reason is that gpiod_direction_output() or gpiod_direction_input() have to be called to set the bias, and they are only called if the corresponding flag is set. > > > + for (num_get = 0, i = 0; i < lr->num_lines; i++) { > > + if (lv.mask & BIT_ULL(i)) { > > for_each_set_bit() ? > No - lv.mask is u64, not unsigned long. You could do a cast, but it would break on BE-32. Sound familiar? - you caught me doing just that in your review of an earlier version. > > + ulr.consumer[sizeof(ulr.consumer)-1] = '\0'; > > + if (strlen(ulr.consumer)) { > > + lr->label = kstrdup(ulr.consumer, GFP_KERNEL); > > Sounds like kstrndup() > Been here before too - they differ slightly in that here lr->label is left null if consumer is empty, whereas kstrndup() would alloc one byte just for the null terminator. > > + } > > + > > + blocking_notifier_call_chain(&desc->gdev->notifier, > > +GPIOLINE_CHANGED_REQUESTED, > > desc); > > + > > > + dev_dbg(&gdev->dev, "registered chardev handle for line > > %d\n", > > + offset); > > Hmm... I would rather see trace events / points than new dev_dbg() / > pr_debug() calls. > Agreed - it is on the TODO list. I have looked at it and doing it properly would mean adding tracepoints to gpiolib.c, and modifying the v1 code as well, so it is best done in a separate patch later... > > @@ -1104,6 +1505,25 @@ int gpiolib_cdev_register(struct gpio_device *gdev, dev_t devt) > > MAJOR(devt), gdev->id); > > > > return 0; > > + /* > > +* array sizes must ensure 64-bit alignment and not create holes in > > +* the struct packing. > > +*/ > > + BUILD_BUG_ON(IS_ALIGNED(GPIO_V2_LINES_MAX, 2)); > > + BUILD_BUG_ON(IS_ALIGNED(GPIO_MAX_NAME_SIZE, 8)); > > + > > + /* > > +* check that uAPI structs are 64-bit aligned for 32/64-bit > > +* compatibility > > +*/ > > + BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_attribute), 8)); > > + BUILD_BUG_ON(IS_ALIGNED(sizeof(struct > > gpio_v2_line_config_attribute), 8)); > > + BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_config), 8)); > > + BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_request), 8)); > > + BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_info), 8)); > > + BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_info_changed), > > 8)); > > + BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_event), 8)); > > + BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_values), 8)); > > Can we use static_assert() at the top of the file? Presumably after > inclusion block. > Good idea - will do. Cheers, Kent.
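For reference, the suggested static_assert() form could look roughly like this at file scope (note the inverted sense when converting: static_assert() fires when the condition is false, while BUILD_BUG_ON() fires when it is true, so the IS_ALIGNED() checks keep their natural polarity). The include locations are assumptions:

	#include <linux/build_bug.h>
	#include <linux/kernel.h>	/* IS_ALIGNED() */
	#include <uapi/linux/gpio.h>	/* assumed location of the v2 structs */

	static_assert(IS_ALIGNED(sizeof(struct gpio_v2_line_request), 8));
	static_assert(IS_ALIGNED(sizeof(struct gpio_v2_line_values), 8));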
[PATCH] clk: baikal-t1: Mark Ethernet PLL as critical
We've discovered that disabling the so-called Ethernet PLL causes a reset of the devices consuming its outgoing clock. The resets happen automatically even if each underlying clock gate is turned off. Due to that we can't disable the Ethernet PLL until the kernel is prepared for the corresponding resets. So for now just mark the PLL clock provider as critical. Signed-off-by: Serge Semin Cc: Alexey Malahov Cc: linux-m...@vger.kernel.org --- drivers/clk/baikal-t1/clk-ccu-pll.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/clk/baikal-t1/clk-ccu-pll.c b/drivers/clk/baikal-t1/clk-ccu-pll.c index 1eec8c0b8f50..2445d4b12baf 100644 --- a/drivers/clk/baikal-t1/clk-ccu-pll.c +++ b/drivers/clk/baikal-t1/clk-ccu-pll.c @@ -51,11 +51,13 @@ struct ccu_pll_info { }; /* - * Mark as critical all PLLs except Ethernet one. CPU and DDR PLLs are sources - * of CPU cores and DDR controller reference clocks, due to which they - * obviously shouldn't be ever gated. SATA and PCIe PLLs are the parents of - * APB-bus and DDR controller AXI-bus clocks. If they are gated the system will - * be unusable. + * Alas we have to mark all PLLs as critical. CPU and DDR PLLs are sources of + * CPU cores and DDR controller reference clocks, due to which they obviously + * shouldn't be ever gated. SATA and PCIe PLLs are the parents of APB-bus and + * DDR controller AXI-bus clocks. If they are gated the system will be + * unusable. Moreover disabling SATA and Ethernet PLLs causes automatic reset + * of the corresponding subsystems. So until we are ready to re-initialize + * all the devices consuming those PLLs, they will be marked as critical too. */ static const struct ccu_pll_info pll_info[] = { CCU_PLL_INFO(CCU_CPU_PLL, "cpu_pll", "ref_clk", CCU_CPU_PLL_BASE, @@ -67,7 +69,7 @@ static const struct ccu_pll_info pll_info[] = { CCU_PLL_INFO(CCU_PCIE_PLL, "pcie_pll", "ref_clk", CCU_PCIE_PLL_BASE, CLK_IS_CRITICAL), CCU_PLL_INFO(CCU_ETH_PLL, "eth_pll", "ref_clk", CCU_ETH_PLL_BASE, -CLK_SET_RATE_GATE) +CLK_IS_CRITICAL | CLK_SET_RATE_GATE) }; struct ccu_pll_data { -- 2.27.0
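For context (not part of the patch), the ordinary consumer-side sequence that would otherwise gate the PLL; with CLK_IS_CRITICAL the clk core holds its own enable reference on eth_pll, so the final disable below no longer propagates a gate up to it. Names are illustrative:

	#include <linux/clk.h>
	#include <linux/device.h>
	#include <linux/err.h>

	static int demo_use_eth_clock(struct device *dev)
	{
		struct clk *clk = devm_clk_get(dev, NULL);
		int ret;

		if (IS_ERR(clk))
			return PTR_ERR(clk);

		ret = clk_prepare_enable(clk);
		if (ret)
			return ret;

		/* ... MMIO access to the ethernet controller ... */

		clk_disable_unprepare(clk);	/* may gate parents without the flag */
		return 0;
	}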
[PATCH v3] mtd: physmap: Add Baikal-T1 physically mapped ROM support
The Baikal-T1 Boot Controller provides access to RO storages, which are physically mapped into the SoC MMIO space. In particular there are an Internal ROM embedded into the SoC with a pre-installed firmware, an externally attached SPI flash (also accessed in read-only mode) and a memory region, which mirrors one of them in accordance with the currently enabled system boot mode (also called Boot ROM). This commit adds the Internal ROM support to the physmap driver of the MTD kernel subsystem. The driver will create the Internal ROM MTD as long as it is defined in the system dts file. The physically mapped SPI flash region will be used to implement the SPI-mem interface. The mirroring memory region won't be accessible directly since it's redundant due to both bootable regions being exposed anyway. Note we had to create dedicated code for the ROMs since reads from the corresponding memory regions must be done via dword-aligned addresses. Signed-off-by: Serge Semin Cc: Alexey Malahov Cc: Pavel Parkhomenko Cc: Lee Jones Cc: linux-m...@vger.kernel.org --- Link: https://lore.kernel.org/linux-mtd/20200508100905.5854-1-sergey.se...@baikalelectronics.ru/ Changelog v2: - Resend. Link: https://lore.kernel.org/linux-mtd/20200526225849.20985-1-sergey.se...@baikalelectronics.ru Changelog v3: - Alphabetically order the include statements. - Discard dummy IOs. - Discard Baikal-T1 Boot ROM support. The directly mapped SPI flash memory will be used in the DW APB SSI driver instead. --- drivers/mtd/maps/Kconfig | 11 +++ drivers/mtd/maps/Makefile | 1 + drivers/mtd/maps/physmap-bt1-rom.c | 126 + drivers/mtd/maps/physmap-bt1-rom.h | 17 drivers/mtd/maps/physmap-core.c| 5 ++ 5 files changed, 160 insertions(+) create mode 100644 drivers/mtd/maps/physmap-bt1-rom.c create mode 100644 drivers/mtd/maps/physmap-bt1-rom.h diff --git a/drivers/mtd/maps/Kconfig b/drivers/mtd/maps/Kconfig index fd37553f1b07..6650acbc961e 100644 --- a/drivers/mtd/maps/Kconfig +++ b/drivers/mtd/maps/Kconfig @@ -75,6 +75,17 @@ config MTD_PHYSMAP_OF physically into the CPU's memory. The mapping description here is taken from OF device tree. +config MTD_PHYSMAP_BT1_ROM + bool "Baikal-T1 Boot ROMs OF-based physical memory map handling" + depends on MTD_PHYSMAP_OF + depends on MIPS_BAIKAL_T1 || COMPILE_TEST + select MTD_COMPLEX_MAPPINGS + select MULTIPLEXER + select MUX_MMIO + help + This provides some extra DT physmap parsing for the Baikal-T1 + platforms, some detection and setting up ROMs-specific accessors. 
+ config MTD_PHYSMAP_VERSATILE bool "ARM Versatile OF-based physical memory map handling" depends on MTD_PHYSMAP_OF diff --git a/drivers/mtd/maps/Makefile b/drivers/mtd/maps/Makefile index c0da86a5d26f..79f018cf412f 100644 --- a/drivers/mtd/maps/Makefile +++ b/drivers/mtd/maps/Makefile @@ -18,6 +18,7 @@ obj-$(CONFIG_MTD_CK804XROM) += ck804xrom.o obj-$(CONFIG_MTD_TSUNAMI) += tsunami_flash.o obj-$(CONFIG_MTD_PXA2XX) += pxa2xx-flash.o physmap-objs-y += physmap-core.o +physmap-objs-$(CONFIG_MTD_PHYSMAP_BT1_ROM) += physmap-bt1-rom.o physmap-objs-$(CONFIG_MTD_PHYSMAP_VERSATILE) += physmap-versatile.o physmap-objs-$(CONFIG_MTD_PHYSMAP_GEMINI) += physmap-gemini.o physmap-objs-$(CONFIG_MTD_PHYSMAP_IXP4XX) += physmap-ixp4xx.o diff --git a/drivers/mtd/maps/physmap-bt1-rom.c b/drivers/mtd/maps/physmap-bt1-rom.c new file mode 100644 index ..27cfe1c63a2e --- /dev/null +++ b/drivers/mtd/maps/physmap-bt1-rom.c @@ -0,0 +1,126 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2020 BAIKAL ELECTRONICS, JSC + * + * Authors: + * Serge Semin + * + * Baikal-T1 Physically Mapped Internal ROM driver + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "physmap-bt1-rom.h" + +/* + * Baikal-T1 SoC ROMs are only accessible by the dword-aligned instructions. + * We have to take this into account when implementing the data read-methods. + * Note there is no need in bothering with endianness, since both Baikal-T1 + * CPU and MMIO are LE. + */ +static map_word __xipram bt1_rom_map_read(struct map_info *map, + unsigned long ofs) +{ + void __iomem *src = map->virt + ofs; + unsigned long shift; + map_word ret; + u32 data; + + /* Read data within offset dword. */ + shift = (unsigned long)src & 0x3; + data = readl_relaxed(src - shift); + if (!shift) { + ret.x[0] = data; + return ret; + } + ret.x[0] = data >> (shift * BITS_PER_BYTE); + + /* Read data from the next dword. */ + shift = 4 - shift; + if (ofs + shift >= map->size) + return ret; + + data = readl_relaxed(src + shift); + ret.x[0] |= d
[PATCH v2 11/11] spi: dw-dma: Add one-by-one SG list entries transfer
In case if at least one of the requested DMA engine channels doesn't support the hardware accelerated SG list entries traverse, the DMA driver will most likely work that around by performing the IRQ-based SG list entries resubmission. That might and will cause a problem if the DMA Tx channel is recharged and re-executed before the Rx DMA channel. Due to non-deterministic IRQ-handler execution latency the DMA Tx channel will start pushing data to the SPI bus before the Rx DMA channel is even reinitialized with the next inbound SG list entry. By doing so the DMA Tx channel will implicitly start filling the DW APB SSI Rx FIFO up, which while the DMA Rx channel being recharged and re-executed will eventually be overflown. In order to solve the problem we have to feed the DMA engine with SG list entries one-by-one. It shall keep the DW APB SSI Tx and Rx FIFOs synchronized and prevent the Rx FIFO overflow. Since in general the SPI tx_sg and rx_sg lists may have different number of entries of different lengths (though total length should match) we virtually split the SG-lists to the set of DMA transfers, which length is a minimum of the ordered SG-entries lengths. The solution described above is only executed if a full-duplex SPI transfer is requested and the DMA engine hasn't provided channels with hardware accelerated SG list traverse capability to handle both SG lists at once. Signed-off-by: Serge Semin Suggested-by: Andy Shevchenko --- drivers/spi/spi-dw-dma.c | 137 ++- drivers/spi/spi-dw.h | 1 + 2 files changed, 137 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index f333c2e23bf6..1cbb5a9efbba 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -72,6 +72,23 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws) dw_writel(dws, DW_SPI_DMATDLR, dws->txburst); } +static void dw_spi_dma_sg_burst_init(struct dw_spi *dws) +{ + struct dma_slave_caps tx = {0}, rx = {0}; + + dma_get_slave_caps(dws->txchan, &tx); + dma_get_slave_caps(dws->rxchan, &rx); + + if (tx.max_sg_burst > 0 && rx.max_sg_burst > 0) + dws->dma_sg_burst = min(tx.max_sg_burst, rx.max_sg_burst); + else if (tx.max_sg_burst > 0) + dws->dma_sg_burst = tx.max_sg_burst; + else if (rx.max_sg_burst > 0) + dws->dma_sg_burst = rx.max_sg_burst; + else + dws->dma_sg_burst = 0; +} + static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws) { struct dw_dma_slave dma_tx = { .dst_id = 1 }, *tx = &dma_tx; @@ -109,6 +126,8 @@ static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws) dw_spi_dma_maxburst_init(dws); + dw_spi_dma_sg_burst_init(dws); + return 0; free_rxchan: @@ -138,6 +157,8 @@ static int dw_spi_dma_init_generic(struct device *dev, struct dw_spi *dws) dw_spi_dma_maxburst_init(dws); + dw_spi_dma_sg_burst_init(dws); + return 0; } @@ -466,11 +487,125 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws, return ret; } +/* + * In case if at least one of the requested DMA channels doesn't support the + * hardware accelerated SG list entries traverse, the DMA driver will most + * likely work that around by performing the IRQ-based SG list entries + * resubmission. That might and will cause a problem if the DMA Tx channel is + * recharged and re-executed before the Rx DMA channel. Due to + * non-deterministic IRQ-handler execution latency the DMA Tx channel will + * start pushing data to the SPI bus before the Rx DMA channel is even + * reinitialized with the next inbound SG list entry. 
By doing so the DMA Tx + * channel will implicitly start filling the DW APB SSI Rx FIFO up, which while + * the DMA Rx channel being recharged and re-executed will eventually be + * overflown. + * + * In order to solve the problem we have to feed the DMA engine with SG list + * entries one-by-one. It shall keep the DW APB SSI Tx and Rx FIFOs + * synchronized and prevent the Rx FIFO overflow. Since in general the tx_sg + * and rx_sg lists may have different number of entries of different lengths + * (though total length should match) let's virtually split the SG-lists to the + * set of DMA transfers, which length is a minimum of the ordered SG-entries + * lengths. An ASCII-sketch of the implemented algo is following: + * xfer->len + *|___| + * tx_sg list:|___||__| + * rx_sg list:|_||| + * DMA transfers: |_|_|__|_|__| + * + * Note in order to have this workaround solving the denoted problem the DMA + * engine driver should properly initialize the max_sg_burst capability and set + * the DMA device max segment size parameter with maximum data block size the + * DMA engine supports. + */ + +static int dw_spi_dma_transfer_one(struct dw_spi *dws, + struct s
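The length-pairing walk from the ASCII sketch above, reduced to a standalone demonstration (array-based rather than real scatterlists; all names illustrative):

	#include <stddef.h>
	#include <stdio.h>

	static void demo_split(const size_t *tx, size_t ntx,
			       const size_t *rx, size_t nrx)
	{
		size_t i = 0, j = 0, tx_left = tx[0], rx_left = rx[0];

		while (i < ntx && j < nrx) {
			/* Each DMA transfer takes the shorter of the remainders. */
			size_t len = tx_left < rx_left ? tx_left : rx_left;

			printf("DMA transfer of %zu bytes\n", len);

			tx_left -= len;
			rx_left -= len;
			if (!tx_left && ++i < ntx)
				tx_left = tx[i];
			if (!rx_left && ++j < nrx)
				rx_left = rx[j];
		}
	}

	int main(void)
	{
		/* Total lengths match (12 bytes); entry counts and sizes differ. */
		size_t tx[] = { 8, 4 }, rx[] = { 6, 6 };

		demo_split(tx, 2, rx, 2);	/* emits transfers of 6, 2 and 4 bytes */
		return 0;
	}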
[PATCH v2 04/11] spi: dw-dma: Check rx_buf availability in the xfer method
Checking rx_buf for being NULL and returning NULL from the Rx-channel preparation method doesn't let us distinguish that situation from errors happening during the Rx SG-list preparation. So it's better to make sure that rx_buf is not NULL and full-duplex communication is requested prior to calling the Rx preparation method. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index da17897b8acb..d2a67dee1a66 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -369,9 +369,6 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws, { struct dma_async_tx_descriptor *rxdesc; - if (!xfer->rx_buf) - return NULL; - rxdesc = dmaengine_prep_slave_sg(dws->rxchan, xfer->rx_sg.sgl, xfer->rx_sg.nents, @@ -435,10 +432,12 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) return -EINVAL; /* Prepare the RX dma transfer */ - rxdesc = dw_spi_dma_prepare_rx(dws, xfer); + if (xfer->rx_buf) { + rxdesc = dw_spi_dma_prepare_rx(dws, xfer); + if (!rxdesc) + return -EINVAL; - /* rx must be started before tx due to spi instinct */ - if (rxdesc) { + /* rx must be started before tx due to spi instinct */ set_bit(RX_BUSY, &dws->dma_chan_busy); dmaengine_submit(rxdesc); dma_async_issue_pending(dws->rxchan); @@ -458,7 +457,7 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) return ret; } - if (rxdesc && dws->master->cur_msg->status == -EINPROGRESS) + if (xfer->rx_buf && dws->master->cur_msg->status == -EINPROGRESS) ret = dw_spi_dma_wait_rx_done(dws); return ret; -- 2.27.0
[PATCH v2 03/11] spi: dw-dma: Configure the DMA channels in dma_setup
Mainly this is a preparation patch before adding one-by-one DMA SG entries transmission. But logically the Tx and Rx DMA channels setup should be performed in the dma_setup() callback anyway. So we'll move the DMA slave channels src/dst burst lengths, address and address width configuration from the Tx/Rx channels preparation methods to the dedicated functions and then make sure it's called at the DMA setup stage. Note we now make sure the return value of the dmaengine_slave_config() method doesn't indicate an error. It has been unnecessary in case if Dw DMAC is utilized as a DMA engine, since its device_config() callback always returns zero (though it might change in future). But since DW APB SSI driver now supports any DMA back-end we must make sure the DMA device configuration has been successful before proceeding with further setups. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 42 +--- 1 file changed, 31 insertions(+), 11 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index 1b013ac94a3f..da17897b8acb 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -256,11 +256,9 @@ static void dw_spi_dma_tx_done(void *arg) complete(&dws->dma_completion); } -static struct dma_async_tx_descriptor * -dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer) +static int dw_spi_dma_config_tx(struct dw_spi *dws) { struct dma_slave_config txconf; - struct dma_async_tx_descriptor *txdesc; memset(&txconf, 0, sizeof(txconf)); txconf.direction = DMA_MEM_TO_DEV; @@ -270,7 +268,13 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer) txconf.dst_addr_width = dw_spi_dma_convert_width(dws->n_bytes); txconf.device_fc = false; - dmaengine_slave_config(dws->txchan, &txconf); + return dmaengine_slave_config(dws->txchan, &txconf); +} + +static struct dma_async_tx_descriptor * +dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer) +{ + struct dma_async_tx_descriptor *txdesc; txdesc = dmaengine_prep_slave_sg(dws->txchan, xfer->tx_sg.sgl, @@ -345,14 +349,9 @@ static void dw_spi_dma_rx_done(void *arg) complete(&dws->dma_completion); } -static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws, - struct spi_transfer *xfer) +static int dw_spi_dma_config_rx(struct dw_spi *dws) { struct dma_slave_config rxconf; - struct dma_async_tx_descriptor *rxdesc; - - if (!xfer->rx_buf) - return NULL; memset(&rxconf, 0, sizeof(rxconf)); rxconf.direction = DMA_DEV_TO_MEM; @@ -362,7 +361,16 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws, rxconf.src_addr_width = dw_spi_dma_convert_width(dws->n_bytes); rxconf.device_fc = false; - dmaengine_slave_config(dws->rxchan, &rxconf); + return dmaengine_slave_config(dws->rxchan, &rxconf); +} + +static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws, + struct spi_transfer *xfer) +{ + struct dma_async_tx_descriptor *rxdesc; + + if (!xfer->rx_buf) + return NULL; rxdesc = dmaengine_prep_slave_sg(dws->rxchan, xfer->rx_sg.sgl, @@ -381,10 +389,22 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws, static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer) { u16 imr, dma_ctrl; + int ret; if (!xfer->tx_buf) return -EINVAL; + /* Setup DMA channels */ + ret = dw_spi_dma_config_tx(dws); + if (ret) + return ret; + + if (xfer->rx_buf) { + ret = dw_spi_dma_config_rx(dws); + if (ret) + return ret; + } + /* Set the DMA handshaking interface */ dma_ctrl = SPI_DMA_TDMAE; if 
(xfer->rx_buf) -- 2.27.0
Re: [PATCH v8 09/20] gpiolib: cdev: support edge detection for uAPI v2
On Tue, Sep 15, 2020 at 01:39:41PM +0300, Andy Shevchenko wrote: > On Wed, Sep 9, 2020 at 1:33 PM Kent Gibson wrote: > > > > Add support for edge detection to lines requested using > > GPIO_V2_GET_LINE_IOCTL. > > > > The edge_detector implementation is based on the v1 lineevent > > implementation. > > ... > [snip] > > + > > + /* > > +* We may be running from a nested threaded interrupt in which case > > +* we didn't get the timestamp from edge_irq_handler(). > > +*/ > > > + if (!line->timestamp) { > > Can it be positive conditional? > Not sure what you mean - switch the order of the if/else? > > + le.timestamp = ktime_get_ns(); > > + if (lr->num_lines != 1) > > + line->req_seqno = atomic_inc_return(&lr->seqno); > > + } else { > > + le.timestamp = line->timestamp; > > + } > > + line->timestamp = 0; > > + > > + if (line->eflags == (GPIO_V2_LINE_FLAG_EDGE_RISING | > > +GPIO_V2_LINE_FLAG_EDGE_FALLING)) { > > + int level = gpiod_get_value_cansleep(line->desc); > > + > > + if (level) > > + /* Emit low-to-high event */ > > + le.id = GPIO_V2_LINE_EVENT_RISING_EDGE; > > + else > > + /* Emit high-to-low event */ > > + le.id = GPIO_V2_LINE_EVENT_FALLING_EDGE; > > + } else if (line->eflags == GPIO_V2_LINE_FLAG_EDGE_RISING) { > > + /* Emit low-to-high event */ > > + le.id = GPIO_V2_LINE_EVENT_RISING_EDGE; > > + } else if (line->eflags == GPIO_V2_LINE_FLAG_EDGE_FALLING) { > > + /* Emit high-to-low event */ > > + le.id = GPIO_V2_LINE_EVENT_FALLING_EDGE; > > + } else { > > + return IRQ_NONE; > > + } > > + line->line_seqno++; > > + le.line_seqno = line->line_seqno; > > + le.seqno = (lr->num_lines == 1) ? le.line_seqno : line->req_seqno; > > + le.offset = gpio_chip_hwgpio(line->desc); > > + > > + ret = kfifo_in_spinlocked_noirqsave(&lr->events, &le, > > + 1, &lr->wait.lock); > > + if (ret) > > + wake_up_poll(&lr->wait, EPOLLIN); > > + else > > > + pr_debug_ratelimited("event FIFO is full - event > > dropped\n"); > > Oh, can we really avoid printf() in IRQ context? > Even in the IRQ thread? Would a tracepoint be preferable? Btw, this is drawn from the v1 implementation. > > + > > +static void edge_detector_stop(struct line *line) > > +{ > > > + if (line->irq) { > > + free_irq(line->irq, line); > > + line->irq = 0; > > + } > > Perhaps > > if (!line->irq) > return; > > ? > No - the function is extended in subsequent patches. I usually make a note of cases like this in the commentary, but missed this one. > > + if (!eflags) > > + return 0; > > + > > + irq = gpiod_to_irq(line->desc); > > + if (irq <= 0) > > > + return -ENODEV; > > Why shadowing actual error code? > Another one drawn from the v1 implementation, so not sure. gpiod_to_irq() can potentially return EINVAL, which is definitely not appropriate to return, or ENXIO, which is actually more appropriate?? Cheers, Kent.
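For completeness, the reader side of the event path discussed above; a sketch assuming the v2 event layout from this series and a line fd requested with edge detection enabled:

	#include <stdio.h>
	#include <unistd.h>
	#include <linux/gpio.h>

	static void demo_read_events(int line_fd)
	{
		struct gpio_v2_line_event ev;

		/* read() blocks until the event kfifo has an entry. */
		while (read(line_fd, &ev, sizeof(ev)) == sizeof(ev))
			printf("offset %u: %s edge, seqno %u\n", ev.offset,
			       ev.id == GPIO_V2_LINE_EVENT_RISING_EDGE ?
					"rising" : "falling",
			       ev.seqno);
	}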
[PATCH v2 02/11] spi: dw-dma: Fail DMA-based transfer if no Tx-buffer specified
Since commit 46164fde6b78 ("spi: dw: Fix Rx-only DMA transfers"), if the DMA interface is enabled, a Tx buffer must be available in each SPI transfer. It's required since, in order to activate the incoming data reception, either the DMA engine or the CPU must be pushing data out to the SPI bus. But the DW APB SSI DMA driver code is still left in a state as if the Tx buffer were optional, which is no longer true. Let's fix it so an error is returned if no Tx buffer is detected, and DMA Tx is always enabled. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 27 +-- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index a7ff1e357f8b..1b013ac94a3f 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -262,9 +262,6 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer) struct dma_slave_config txconf; struct dma_async_tx_descriptor *txdesc; - if (!xfer->tx_buf) - return NULL; - memset(&txconf, 0, sizeof(txconf)); txconf.direction = DMA_MEM_TO_DEV; txconf.dst_addr = dws->dma_addr; @@ -383,17 +380,19 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws, static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer) { - u16 imr = 0, dma_ctrl = 0; + u16 imr, dma_ctrl; - if (xfer->tx_buf) - dma_ctrl |= SPI_DMA_TDMAE; + if (!xfer->tx_buf) + return -EINVAL; + + /* Set the DMA handshaking interface */ + dma_ctrl = SPI_DMA_TDMAE; if (xfer->rx_buf) dma_ctrl |= SPI_DMA_RDMAE; dw_writel(dws, DW_SPI_DMACR, dma_ctrl); /* Set the interrupt mask */ - if (xfer->tx_buf) - imr |= SPI_INT_TXOI; + imr = SPI_INT_TXOI; if (xfer->rx_buf) imr |= SPI_INT_RXUI | SPI_INT_RXOI; spi_umask_intr(dws, imr); @@ -412,6 +411,8 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) /* Prepare the TX dma transfer */ txdesc = dw_spi_dma_prepare_tx(dws, xfer); + if (!txdesc) + return -EINVAL; /* Prepare the RX dma transfer */ rxdesc = dw_spi_dma_prepare_rx(dws, xfer); @@ -423,17 +424,15 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) dma_async_issue_pending(dws->rxchan); } - if (txdesc) { - set_bit(TX_BUSY, &dws->dma_chan_busy); - dmaengine_submit(txdesc); - dma_async_issue_pending(dws->txchan); - } + set_bit(TX_BUSY, &dws->dma_chan_busy); + dmaengine_submit(txdesc); + dma_async_issue_pending(dws->txchan); ret = dw_spi_dma_wait(dws, xfer); if (ret) return ret; - if (txdesc && dws->master->cur_msg->status == -EINPROGRESS) { + if (dws->master->cur_msg->status == -EINPROGRESS) { ret = dw_spi_dma_wait_tx_done(dws, xfer); if (ret) return ret; -- 2.27.0
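For client code, the contract this change enforces can be illustrated with a small hedged sketch (the function and buffer names are illustrative, not from the patch): even a read-only peripheral access has to carry a Tx buffer when the DW DMA path is in use, since the outbound data is what clocks the inbound data in.

	/* Illustrative SPI client: always supply a Tx buffer for the clock-out data. */
	static int example_duplex_read(struct spi_device *spi, void *rx, size_t len)
	{
		static const u8 dummy_tx[64];		/* zero-filled outbound data */
		struct spi_transfer xfer = {
			.tx_buf = dummy_tx,		/* mandatory on the DW DMA path */
			.rx_buf = rx,
			.len = min_t(size_t, len, sizeof(dummy_tx)),
		};

		return spi_sync_transfer(spi, &xfer, 1);
	}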
[PATCH v2 10/11] spi: dw-dma: Pass exact data to the DMA submit and wait methods
In order to use the DMA submission and waiting methods in both the generic DMA-based SPI transfer and the one-by-one DMA SG entries transmission functions, we need to modify the dw_spi_dma_wait() and dw_spi_dma_submit_tx()/dw_spi_dma_submit_rx() prototypes. So instead of getting the SPI transfer object as the second argument, they must accept exactly the data they actually use. Those are the current transfer length and the SPI bus frequency for dw_spi_dma_wait(), and the SG list together with its number of entries for the DMA Tx/Rx submission methods. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 35 +-- 1 file changed, 17 insertions(+), 18 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index 935f073a3523..f333c2e23bf6 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -188,12 +188,12 @@ static enum dma_slave_buswidth dw_spi_dma_convert_width(u8 n_bytes) return DMA_SLAVE_BUSWIDTH_UNDEFINED; } -static int dw_spi_dma_wait(struct dw_spi *dws, struct spi_transfer *xfer) +static int dw_spi_dma_wait(struct dw_spi *dws, unsigned int len, u32 speed) { unsigned long long ms; - ms = xfer->len * MSEC_PER_SEC * BITS_PER_BYTE; - do_div(ms, xfer->effective_speed_hz); + ms = len * MSEC_PER_SEC * BITS_PER_BYTE; + do_div(ms, speed); ms += ms + 200; if (ms > UINT_MAX) @@ -268,17 +268,16 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws) return dmaengine_slave_config(dws->txchan, &txconf); } -static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer) +static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct scatterlist *sgl, + unsigned int nents) { struct dma_async_tx_descriptor *txdesc; dma_cookie_t cookie; int ret; - txdesc = dmaengine_prep_slave_sg(dws->txchan, - xfer->tx_sg.sgl, - xfer->tx_sg.nents, - DMA_MEM_TO_DEV, - DMA_PREP_INTERRUPT | DMA_CTRL_ACK); + txdesc = dmaengine_prep_slave_sg(dws->txchan, sgl, nents, +DMA_MEM_TO_DEV, +DMA_PREP_INTERRUPT | DMA_CTRL_ACK); if (!txdesc) return -ENOMEM; @@ -370,17 +369,16 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws) return dmaengine_slave_config(dws->rxchan, &rxconf); } -static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer) +static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct scatterlist *sgl, + unsigned int nents) { struct dma_async_tx_descriptor *rxdesc; dma_cookie_t cookie; int ret; - rxdesc = dmaengine_prep_slave_sg(dws->rxchan, - xfer->rx_sg.sgl, - xfer->rx_sg.nents, - DMA_DEV_TO_MEM, - DMA_PREP_INTERRUPT | DMA_CTRL_ACK); + rxdesc = dmaengine_prep_slave_sg(dws->rxchan, sgl, nents, +DMA_DEV_TO_MEM, +DMA_PREP_INTERRUPT | DMA_CTRL_ACK); if (!rxdesc) return -ENOMEM; @@ -443,13 +441,14 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws, int ret; /* Submit the DMA Tx transfer */ - ret = dw_spi_dma_submit_tx(dws, xfer); + ret = dw_spi_dma_submit_tx(dws, xfer->tx_sg.sgl, xfer->tx_sg.nents); if (ret) goto err_clear_dmac; /* Submit the DMA Rx transfer if required */ if (xfer->rx_buf) { - ret = dw_spi_dma_submit_rx(dws, xfer); + ret = dw_spi_dma_submit_rx(dws, xfer->rx_sg.sgl, + xfer->rx_sg.nents); if (ret) goto err_clear_dmac; @@ -459,7 +458,7 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws, dma_async_issue_pending(dws->txchan); - ret = dw_spi_dma_wait(dws, xfer); + ret = dw_spi_dma_wait(dws, xfer->len, xfer->effective_speed_hz); err_clear_dmac: dw_writel(dws, DW_SPI_DMACR, 0); -- 2.27.0
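As a rough sketch of what the new prototypes enable (this is not the patch code; the loop below is an illustration with the Rx side and its error handling elided), a follow-up change can now charge the channels with a single SG entry at a time:

	/* Illustrative one-entry-at-a-time Tx submission using the new prototypes. */
	static int example_transfer_tx_one_by_one(struct dw_spi *dws,
						  struct spi_transfer *xfer)
	{
		struct scatterlist *sg;
		int i, ret;

		for_each_sg(xfer->tx_sg.sgl, sg, xfer->tx_sg.nents, i) {
			ret = dw_spi_dma_submit_tx(dws, sg, 1);	/* exactly one entry */
			if (ret)
				return ret;

			dma_async_issue_pending(dws->txchan);

			/* Wait for just this entry before re-charging the channel */
			ret = dw_spi_dma_wait(dws, sg_dma_len(sg),
					      xfer->effective_speed_hz);
			if (ret)
				return ret;
		}

		return 0;
	}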
Re: [PATCH v2 1/3] MAINTAINERS: Add Fabrizio Castro to Renesas DRIF
Hi Fabrizio, > > Renesas are expanding their DRIF support and offering, > I'll be the internal maintainer for DRIF. > > Signed-off-by: Fabrizio Castro > Reviewed-by: Laurent Pinchart Thank you for volunteering :-). Reviewed-by: Ramesh Shanmugasundaram Please feel free to take my name off the maintainers list, as I have not been spending time on this topic for a while now. Thanks, Ramesh > --- > v1->v2: > * No change > > MAINTAINERS | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/MAINTAINERS b/MAINTAINERS > index 2575f449139a..d9ebaf0c179b 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -10909,6 +10909,7 @@ F: include/media/drv-intf/renesas-ceu.h > > MEDIA DRIVERS FOR RENESAS - DRIF > M: Ramesh Shanmugasundaram > +M: Fabrizio Castro > L: linux-me...@vger.kernel.org > L: linux-renesas-...@vger.kernel.org > S: Supported > -- > 2.25.1 >
[PATCH v2 00/11] spi: dw-dma: Add max SG entries burst capability support
Mainly this series is about fixing a very nasty problem discovered in the DW APB SSI driver working in tandem with a DW DMAC which doesn't have multi-block capability support (a.k.a. accelerated SG list entries traversal/burst, automatic LLP entries reload, etc.). The DW DMA IP core and its DMA channels can be tuned at the synthesis stage by setting a vast number of model parameters, some of which are provided to create a more optimized/performant controller. In particular, two of those parameters are DMAH_CHx_MULTI_BLK_EN and DMAH_CHx_HC_LLP. If at least one of them is set to zero (false), then the target DMA controller will be smaller and faster but will lack the DMA block chaining support. In kernel notation it means that the controller won't be able to automatically reload the next SG-list entry when the current one is finished. Since the Linux kernel DMA subsystem interface requires SG-list execution support, the DW DMA engine driver is designed to resubmit the SG-list entries one-by-one by software means: the completion of each SG-list entry is tracked by the DW DMA controller interrupt, whose handler executes a tasklet, which then re-charges the DW DMA channel with the next SG-list entry if one is available. Such an implementation is a normal design and works well for most of the DW DMAC client devices. But it may cause problems for devices which send and receive data by means of internal FIFOs, as they require several DMA channels working synchronously in order to prevent FIFO overflow. A bright example of such a device is the DW APB SSI controller. It has Tx and Rx FIFOs, which first need to be filled with data before sending data out or receiving it in. But those FIFOs are inter-dependent because of the SPI nature and its shift-register design: each byte pushed out via the Tx FIFO immediately causes a byte from the SPI bus to land in the Rx FIFO. That makes the strategy of working with the SPI controller a bit tricky. The more data we push into the Tx FIFO and the faster the SPI bus is, the more careful we have to be in pulling data from the Rx FIFO, since if the software or the DMA engine misses the moment when the Rx FIFO is full (for instance, because it's busy with some other activity, blocked while the system bus is occupied, or simply too slow to keep up with the incoming data), then the Rx FIFO will overflow, which consequently causes data loss. Needless to say, such a situation is fatal and mustn't be tolerated for a bus like SPI. In application to the DW APB SSI driver, the problem above may happen when the DW DMAC is synthesized with no multi-block capability support and is enabled to work with the DW APB SSI for full-duplex transfers. The DW APB SSI driver allocates two DW DMAC channels to perform Tx and Rx SPI transfers, initializes them, submits the Tx and Rx SG-lists for execution and then waits until the DMA transfers are finished. The issue happens when the Rx SG-list consists of more than one entry. Since the multi-block capability is unsupported, the DW DMAC driver will use the software-based SG-list traversal implementation, which by design may introduce non-deterministic latency during the Rx SG-list entry re-charge. During the entry re-charge procedure the DMA Tx channel will keep pushing data into the SPI Tx FIFO. The DW APB SSI controller in its turn will keep pushing data out of the Tx FIFO onto the SPI bus, immediately filling the Rx FIFO.
So if the latency is big enough, then we'll eventually end up with an Rx FIFO overflow. One possible solution to the problem denoted above is to feed the DMA engine with the Tx and Rx SG-list entries one-by-one. This patch-series provides an implementation of that approach. First it moves the requested DMA channels configuration into the dma_setup() callback, which should have been there in the first place. Then the DMA transfer submission is moved into the DMA-preparation methods, so all the setup is collected in a single place. After that, the current straightforward SG-list DMA transfer implementation is detached into a dedicated method, dw_spi_dma_transfer_all(), since we are about to introduce an alternative DMA-based SPI transfer approach. Since the DMA transfer completion is now detected locally, we can simplify the driver code a bit and move the DMAC register cleanup to a single place in the dw_spi_dma_transfer_all() method. In order to re-use the DMA-{wait, submit Tx, submit Rx} methods we have to alter their prototypes, so they accept SG-lists instead of the SPI-transfer structure. Finally we introduce a new DMA-based SPI-transfer method, which traverses the SG-list entries in a loop and synchronously submits each of them to the Tx and Rx DMA channels in such a way that the DMA engine never needs to activate the (in our case error-prone) SG-list entry re-charge implementation. That new method is utilized only for the DMA controllers, which can't handle all T
[PATCH v2 05/11] spi: dw-dma: Move DMA transfers submission to the channels prep methods
Indeed we can freely move the dmaengine_submit() method invocation and the Tx and Rx busy flag setting into the DMA Tx/Rx prepare methods. Since the Tx/Rx preparation methods are now mainly used for DMA transfer submission, we suggest renaming them to have the _submit_{r,t}x suffix instead. This alteration, first, is another code preparation before adding the one-by-one DMA SG entries transmission; second, it leaves the dma_async_tx_descriptor used locally only in the new DMA transfer submission methods (this will be cleaned up a bit later); third, it makes the generic transfer method more readable, with the submission, execution and wait procedures now transparently split up instead of having a preparation, an intermixed submission/execution, and a wait stage. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 22 -- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index d2a67dee1a66..769d10ca74b4 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -272,7 +272,7 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws) } static struct dma_async_tx_descriptor * -dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer) +dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer) { struct dma_async_tx_descriptor *txdesc; @@ -287,6 +287,9 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer) txdesc->callback = dw_spi_dma_tx_done; txdesc->callback_param = dws; + dmaengine_submit(txdesc); + set_bit(TX_BUSY, &dws->dma_chan_busy); + return txdesc; } @@ -364,7 +367,7 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws) return dmaengine_slave_config(dws->rxchan, &rxconf); } -static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws, +static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer) { struct dma_async_tx_descriptor *rxdesc; @@ -380,6 +383,9 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws, rxdesc->callback = dw_spi_dma_rx_done; rxdesc->callback_param = dws; + dmaengine_submit(rxdesc); + set_bit(RX_BUSY, &dws->dma_chan_busy); + return rxdesc; } @@ -426,25 +432,21 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) struct dma_async_tx_descriptor *txdesc, *rxdesc; int ret; - /* Prepare the TX dma transfer */ - txdesc = dw_spi_dma_prepare_tx(dws, xfer); + /* Submit the DMA Tx transfer */ + txdesc = dw_spi_dma_submit_tx(dws, xfer); if (!txdesc) return -EINVAL; - /* Prepare the RX dma transfer */ + /* Submit the DMA Rx transfer if required */ if (xfer->rx_buf) { - rxdesc = dw_spi_dma_prepare_rx(dws, xfer); + rxdesc = dw_spi_dma_submit_rx(dws, xfer); if (!rxdesc) return -EINVAL; /* rx must be started before tx due to spi instinct */ - set_bit(RX_BUSY, &dws->dma_chan_busy); - dmaengine_submit(rxdesc); dma_async_issue_pending(dws->rxchan); } - set_bit(TX_BUSY, &dws->dma_chan_busy); - dmaengine_submit(txdesc); dma_async_issue_pending(dws->txchan); ret = dw_spi_dma_wait(dws, xfer); -- 2.27.0
[PATCH v2 01/11] spi: dw-dma: Set DMA Level registers on init
Indeed the register contents don't get cleared when the SPI controller is disabled and re-enabled. The max burst lengths aren't changed since the Rx and Tx DMA channels are requested at the init stage and are kept acquired until the device is removed. Obviously the SPI controller FIFO depth can't be changed either. Due to all of that we can safely move the DMA Transmit and Receive data level registers initialization to the SPI controller DMA init stage (when the SPI controller is being probed) instead of doing it for each SPI transfer when dma_setup is called. This shall speed up the DMA-based SPI transfer initialization a bit, particularly if the APB bus is relatively slow. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 28 +--- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index bb390ff67d1d..a7ff1e357f8b 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -49,6 +49,7 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws) max_burst = RX_BURST_LEVEL; dws->rxburst = min(max_burst, def_burst); + dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1); ret = dma_get_slave_caps(dws->txchan, &caps); if (!ret && caps.max_burst) @@ -56,7 +57,19 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws) else max_burst = TX_BURST_LEVEL; + /* +* Having a Rx DMA channel serviced with higher priority than a Tx DMA +* channel might not be enough to provide a well balanced DMA-based +* SPI transfer interface. There might still be moments when the Tx DMA +* channel is occasionally handled faster than the Rx DMA channel. +* That in its turn will eventually cause the SPI Rx FIFO overflow if +* SPI bus speed is high enough to fill the SPI Rx FIFO in before it's +* cleared by the Rx DMA channel. In order to fix the problem the Tx +* DMA activity is intentionally slowed down by limiting the SPI Tx +* FIFO depth with a value twice bigger than the Tx burst length. +*/ dws->txburst = min(max_burst, def_burst); + dw_writel(dws, DW_SPI_DMATDLR, dws->txburst); } static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws) @@ -372,21 +385,6 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer) { u16 imr = 0, dma_ctrl = 0; - /* -* Having a Rx DMA channel serviced with higher priority than a Tx DMA -* channel might not be enough to provide a well balanced DMA-based -* SPI transfer interface. There might still be moments when the Tx DMA -* channel is occasionally handled faster than the Rx DMA channel. -* That in its turn will eventually cause the SPI Rx FIFO overflow if -* SPI bus speed is high enough to fill the SPI Rx FIFO in before it's -* cleared by the Rx DMA channel. In order to fix the problem the Tx -* DMA activity is intentionally slowed down by limiting the SPI Tx -* FIFO depth with a value twice bigger than the Tx burst length -* calculated earlier by the dw_spi_dma_maxburst_init() method. -*/ - dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1); - dw_writel(dws, DW_SPI_DMATDLR, dws->txburst); - if (xfer->tx_buf) dma_ctrl |= SPI_DMA_TDMAE; if (xfer->rx_buf) -- 2.27.0
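A hedged numeric illustration of the two thresholds (the values are examples, not from the patch, assuming the driver's default burst def_burst works out to half the FIFO depth): take fifo_len = 32, so def_burst = 16, and let both DMA channels report max_burst = 8 in their slave capabilities.

	u32 def_burst = 32 / 2;				/* fifo_len / 2 = 16 */

	dws->rxburst = min(8U, def_burst);		/* 8 */
	dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1); /* 7: Rx request at >= 8 words */

	dws->txburst = min(8U, def_burst);		/* 8 */
	dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);	/* Tx request at <= 8 slots used */

With DMATDLR = 8 the Tx DMA request is raised only once the FIFO has drained to 8 entries or fewer, so a full burst tops it up to at most 2 * txburst = 16 of the 32 slots, which is the "twice bigger than the Tx burst length" cap described in the comment.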
[PATCH v2 07/11] spi: dw-dma: Remove DMA Tx-desc passing around
It's pointless to pass the Rx and Tx transfer DMA descriptors around, since they are used in the Tx/Rx submit methods only. Instead just return the submission status from these methods. This alteration will make the code less complex. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 31 ++- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index aa3900809126..9f70818acce6 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -271,8 +271,7 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws) return dmaengine_slave_config(dws->txchan, &txconf); } -static struct dma_async_tx_descriptor * -dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer) +static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer) { struct dma_async_tx_descriptor *txdesc; dma_cookie_t cookie; @@ -284,7 +283,7 @@ dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer) DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT | DMA_CTRL_ACK); if (!txdesc) - return NULL; + return -ENOMEM; txdesc->callback = dw_spi_dma_tx_done; txdesc->callback_param = dws; @@ -293,12 +292,12 @@ dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer) ret = dma_submit_error(cookie); if (ret) { dmaengine_terminate_sync(dws->txchan); - return NULL; + return ret; } set_bit(TX_BUSY, &dws->dma_chan_busy); - return txdesc; + return 0; } static inline bool dw_spi_dma_rx_busy(struct dw_spi *dws) @@ -375,8 +374,7 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws) return dmaengine_slave_config(dws->rxchan, &rxconf); } -static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws, - struct spi_transfer *xfer) +static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer) { struct dma_async_tx_descriptor *rxdesc; dma_cookie_t cookie; @@ -388,7 +386,7 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws, DMA_DEV_TO_MEM, DMA_PREP_INTERRUPT | DMA_CTRL_ACK); if (!rxdesc) - return NULL; + return -ENOMEM; rxdesc->callback = dw_spi_dma_rx_done; rxdesc->callback_param = dws; @@ -397,12 +395,12 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws, ret = dma_submit_error(cookie); if (ret) { dmaengine_terminate_sync(dws->rxchan); - return NULL; + return ret; } set_bit(RX_BUSY, &dws->dma_chan_busy); - return rxdesc; + return 0; } static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer) @@ -445,19 +443,18 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer) static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) { - struct dma_async_tx_descriptor *txdesc, *rxdesc; int ret; /* Submit the DMA Tx transfer */ - txdesc = dw_spi_dma_submit_tx(dws, xfer); - if (!txdesc) - return -EINVAL; + ret = dw_spi_dma_submit_tx(dws, xfer); + if (ret) + return ret; /* Submit the DMA Rx transfer if required */ if (xfer->rx_buf) { - rxdesc = dw_spi_dma_submit_rx(dws, xfer); - if (!rxdesc) - return -EINVAL; + ret = dw_spi_dma_submit_rx(dws, xfer); + if (ret) + return ret; /* rx must be started before tx due to spi instinct */ dma_async_issue_pending(dws->rxchan); -- 2.27.0
[PATCH v2 06/11] spi: dw-dma: Check DMA Tx-desc submission status
We suggest adding a test of the dmaengine_submit() return value for errors. It used to be unnecessary while the driver was expected to be utilized in a pair with the DW DMAC only. But since now the driver can be used with any DMA engine, it might be useful to track errors on DMA submissions, so as not to miss them and end up with unpredictable driver behaviour. Signed-off-by: Serge Semin --- Changelog v2: - Replace negative conditional statements with the positive ones. - Terminate the prepared descriptors on submission errors. --- drivers/spi/spi-dw-dma.c | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index 769d10ca74b4..aa3900809126 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -275,6 +275,8 @@ static struct dma_async_tx_descriptor * dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer) { struct dma_async_tx_descriptor *txdesc; + dma_cookie_t cookie; + int ret; txdesc = dmaengine_prep_slave_sg(dws->txchan, xfer->tx_sg.sgl, @@ -287,7 +289,13 @@ dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer) txdesc->callback = dw_spi_dma_tx_done; txdesc->callback_param = dws; - dmaengine_submit(txdesc); + cookie = dmaengine_submit(txdesc); + ret = dma_submit_error(cookie); + if (ret) { + dmaengine_terminate_sync(dws->txchan); + return NULL; + } + set_bit(TX_BUSY, &dws->dma_chan_busy); return txdesc; @@ -371,6 +379,8 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer) { struct dma_async_tx_descriptor *rxdesc; + dma_cookie_t cookie; + int ret; rxdesc = dmaengine_prep_slave_sg(dws->rxchan, xfer->rx_sg.sgl, @@ -383,7 +393,13 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws, rxdesc->callback = dw_spi_dma_rx_done; rxdesc->callback_param = dws; - dmaengine_submit(rxdesc); + cookie = dmaengine_submit(rxdesc); + ret = dma_submit_error(cookie); + if (ret) { + dmaengine_terminate_sync(dws->rxchan); + return NULL; + } + set_bit(RX_BUSY, &dws->dma_chan_busy); return rxdesc; -- 2.27.0
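The underlying idiom is the generic dmaengine one, sketched below under the assumption of an already prepared descriptor (the helper name is illustrative): dmaengine_submit() returns a cookie, dma_submit_error() converts a bad cookie into an errno, and the descriptor queued by the prep call has to be reclaimed on failure.

	/* Illustrative submit-and-check helper for any dmaengine channel. */
	static int example_submit_desc(struct dma_chan *chan,
				       struct dma_async_tx_descriptor *desc)
	{
		dma_cookie_t cookie = dmaengine_submit(desc);
		int ret = dma_submit_error(cookie);

		if (ret)
			dmaengine_terminate_sync(chan);	/* drop the queued descriptor */

		return ret;
	}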
[PATCH v2 08/11] spi: dw-dma: Detach DMA transfer into a dedicated method
In order to add an alternative method of DMA-based SPI transfer, first we need to detach the currently available one from the common code. Here we move the normal DMA-based SPI transfer execution functionality into a dedicated method. It will be utilized if either the DMA engine supports an unlimited number of SG entries or a Tx-only SPI transfer is requested. For now, though, it's used for any SPI transfer. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index 9f70818acce6..f2baefcae9ae 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -441,7 +441,8 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer) return 0; } -static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) +static int dw_spi_dma_transfer_all(struct dw_spi *dws, + struct spi_transfer *xfer) { int ret; @@ -462,7 +463,14 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) dma_async_issue_pending(dws->txchan); - ret = dw_spi_dma_wait(dws, xfer); + return dw_spi_dma_wait(dws, xfer); +} + +static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) +{ + int ret; + + ret = dw_spi_dma_transfer_all(dws, xfer); if (ret) return ret; -- 2.27.0
[PATCH v2 09/11] spi: dw-dma: Move DMAC register cleanup to DMA transfer method
The DW APB SSI DMA driver hasn't used the native SPI core wait API since commit bdbdf0f06337 ("spi: dw: Locally wait for the DMA transfers completion"). Thanks to that, the driver can now clear the DMAC register in a single place, synchronously with the DMA transactions completion or failure. After that, all the possible code paths are still covered: 1) The DMA completion callbacks are executed when the corresponding DMA transactions are finished. When they are, one of them will eventually wake the SPI messages pump kernel thread, and the dw_spi_dma_transfer_all() method will clear the DMAC register as implied by this patch. 2) dma_stop is called when the SPI core detects an error either returned from the transfer_one() callback or set in the SPI message status field. Both types of errors will be noticed by the dw_spi_dma_transfer_all() method. 3) dma_exit is called when either the SPI controller driver or the corresponding device is removed. In any case the SPI core will first flush the SPI messages pump kernel thread, so any pending or in-flight SPI transfers will be finished before that. Due to all of that, let's simplify the DW APB SSI DMA driver a bit and move the DMAC register cleanup to a single place in the dw_spi_dma_transfer_all() method. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index f2baefcae9ae..935f073a3523 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -152,8 +152,6 @@ static void dw_spi_dma_exit(struct dw_spi *dws) dmaengine_terminate_sync(dws->rxchan); dma_release_channel(dws->rxchan); } - - dw_writel(dws, DW_SPI_DMACR, 0); } static irqreturn_t dw_spi_dma_transfer_handler(struct dw_spi *dws) @@ -252,7 +250,6 @@ static void dw_spi_dma_tx_done(void *arg) if (test_bit(RX_BUSY, &dws->dma_chan_busy)) return; - dw_writel(dws, DW_SPI_DMACR, 0); complete(&dws->dma_completion); } @@ -355,7 +352,6 @@ static void dw_spi_dma_rx_done(void *arg) if (test_bit(TX_BUSY, &dws->dma_chan_busy)) return; - dw_writel(dws, DW_SPI_DMACR, 0); complete(&dws->dma_completion); } @@ -449,13 +445,13 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws, /* Submit the DMA Tx transfer */ ret = dw_spi_dma_submit_tx(dws, xfer); if (ret) - return ret; + goto err_clear_dmac; /* Submit the DMA Rx transfer if required */ if (xfer->rx_buf) { ret = dw_spi_dma_submit_rx(dws, xfer); if (ret) - return ret; + goto err_clear_dmac; /* rx must be started before tx due to spi instinct */ dma_async_issue_pending(dws->rxchan); @@ -463,7 +459,12 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws, dma_async_issue_pending(dws->txchan); - return dw_spi_dma_wait(dws, xfer); + ret = dw_spi_dma_wait(dws, xfer); + +err_clear_dmac: + dw_writel(dws, DW_SPI_DMACR, 0); + + return ret; } static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer) @@ -496,8 +497,6 @@ static void dw_spi_dma_stop(struct dw_spi *dws) dmaengine_terminate_sync(dws->rxchan); clear_bit(RX_BUSY, &dws->dma_chan_busy); } - - dw_writel(dws, DW_SPI_DMACR, 0); } static const struct dw_spi_dma_ops dw_spi_dma_mfld_ops = { -- 2.27.0
[PATCH 01/30] spi: dw: Discard IRQ threshold macro
The macro has been unused since half of the FIFO length was defined to be the IRQ threshold. Let's remove its definition. Signed-off-by: Serge Semin --- drivers/spi/spi-dw.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index 90dfd21622d6..51bab30b9f85 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -92,9 +92,6 @@ #define SPI_DMA_RDMAE (1 << 0) #define SPI_DMA_TDMAE (1 << 1) -/* TX RX interrupt level threshold, max can be 256 */ -#define SPI_INT_THRESHOLD 32 - enum dw_ssi_type { SSI_MOTO_SPI = 0, SSI_TI_SSP, -- 2.27.0
[PATCH 12/30] spi: dw: Detach SPI device specific CR0 config method
Indeed there is no point in detecting the SPI peripheral device parameters and initializing the CR0 register fields each time an SPI transfer is executed. Instead let's define a dedicated CR0 chip-data member, which will be initialized in accordance with the SPI device settings at the moment the device is set up. By doing so we'll finally make the SPI device chip_data serve its intended purpose: preserving the SPI-device-specific DW SPI configuration. See the spi-fsl-dspi.c, spi-pl022.c and spi-pxa2xx.c drivers for examples of the way the chip data is utilized. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 35 +++ 1 file changed, 27 insertions(+), 8 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index c21641a485ce..a9644351d75f 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -27,6 +27,7 @@ struct chip_data { u16 clk_div;/* baud rate divider */ u32 speed_hz; /* baud rate */ + u32 cr0; u32 rx_sample_dly; /* RX sample delay */ }; @@ -228,31 +229,41 @@ static irqreturn_t dw_spi_irq(int irq, void *dev_id) return dws->transfer_handler(dws); } -static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi, - struct spi_transfer *transfer) +static u32 dw_spi_get_cr0(struct dw_spi *dws, struct spi_device *spi) { - struct chip_data *chip = spi_get_ctldata(spi); - u32 cr0; - - cr0 = (transfer->bits_per_word - 1); + u32 cr0 = 0; if (!(dws->caps & DW_SPI_CAP_DWC_SSI)) { cr0 |= SSI_MOTO_SPI << SPI_FRF_OFFSET; cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET; cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET; cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET; - cr0 |= chip->tmode << SPI_TMOD_OFFSET; } else { cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET; cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << DWC_SSI_CTRLR0_SCPOL_OFFSET; cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << DWC_SSI_CTRLR0_SCPH_OFFSET; cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET; - cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET; if (dws->caps & DW_SPI_CAP_KEEMBAY_MST) cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST; } + return cr0; +} + +static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi, + struct spi_transfer *transfer) +{ + struct chip_data *chip = spi_get_ctldata(spi); + u32 cr0 = chip->cr0; + + cr0 |= (transfer->bits_per_word - 1); + + if (!(dws->caps & DW_SPI_CAP_DWC_SSI)) + cr0 |= chip->tmode << SPI_TMOD_OFFSET; + else + cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET; + dw_writel(dws, DW_SPI_CTRLR0, cr0); } @@ -350,6 +361,7 @@ static void dw_spi_handle_err(struct spi_controller *master, /* This may be called twice for each spi dev */ static int dw_spi_setup(struct spi_device *spi) { + struct dw_spi *dws = spi_controller_get_devdata(spi->controller); struct chip_data *chip; /* Only alloc on first setup */ @@ -373,6 +385,13 @@ static int dw_spi_setup(struct spi_device *spi) dws->max_freq); } + /* +* Update CR0 data each time the setup callback is invoked since +* the device parameters could have been changed, for instance, by +* the MMC SPI driver or something else. +*/ + chip->cr0 = dw_spi_get_cr0(dws, spi); + chip->tmode = SPI_TMOD_TR; return 0; -- 2.27.0
[PATCH 07/30] spi: dw: Use relaxed IO-methods to access FIFOs
In accordance with [1], the relaxed methods are guaranteed to be ordered with respect to other accesses from the same CPU thread to the same peripheral. This is exactly what we need during the data reads/writes from/to the controller FIFOs, which are executed within a single IRQ handler or kernel task. Such an optimization shall significantly speed up the data reads and writes. For instance, using the relaxed IO accessors on Baikal-T1 lets the driver support SPI memory operations at a bus frequency three times higher than with the normal IO accessors. [1] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt, Section "KERNEL I/O BARRIER EFFECTS" Signed-off-by: Serge Semin --- drivers/spi/spi-dw.h | 18 -- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index ff77f39047ce..063fa1b1352d 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -161,29 +161,19 @@ static inline u32 dw_readl(struct dw_spi *dws, u32 offset) return __raw_readl(dws->regs + offset); } -static inline u16 dw_readw(struct dw_spi *dws, u32 offset) -{ - return __raw_readw(dws->regs + offset); -} - static inline void dw_writel(struct dw_spi *dws, u32 offset, u32 val) { __raw_writel(val, dws->regs + offset); } -static inline void dw_writew(struct dw_spi *dws, u32 offset, u16 val) -{ - __raw_writew(val, dws->regs + offset); -} - static inline u32 dw_read_io_reg(struct dw_spi *dws, u32 offset) { switch (dws->reg_io_width) { case 2: - return dw_readw(dws, offset); + return readw_relaxed(dws->regs + offset); case 4: default: - return dw_readl(dws, offset); + return readl_relaxed(dws->regs + offset); } } @@ -191,11 +181,11 @@ static inline void dw_write_io_reg(struct dw_spi *dws, u32 offset, u32 val) { switch (dws->reg_io_width) { case 2: - dw_writew(dws, offset, val); + writew_relaxed(val, dws->regs + offset); break; case 4: default: - dw_writel(dws, offset, val); + writel_relaxed(val, dws->regs + offset); break; } } -- 2.27.0
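A minimal sketch of the effect (the function is illustrative, not from the patch): inside a single-CPU FIFO drain loop the relaxed accessor still keeps program order against the same device, so the full memory barrier that readl() would add on every iteration buys nothing here.

	/* Illustrative Rx FIFO drain using a relaxed MMIO read (8-bit frames). */
	static void example_drain_rx_fifo(struct dw_spi *dws, u8 *buf, unsigned int n)
	{
		while (n--)
			*buf++ = readl_relaxed(dws->regs + DW_SPI_DR); /* no per-access barrier */
	}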
[PATCH 10/30] spi: dw: Add KeemBay Master capability
In a further commit we'll have to get rid of the update_cr0() callback and define a DW SSI capability instead. Since the Keem Bay master/slave functionality is controlled by a CTRLR0 register bitfield, we first need to move the master mode selection into the corresponding internal update_cr0 method, activated by means of the dedicated DW_SPI_CAP_KEEMBAY_MST capability. Note this will also be useful if the driver is ever altered to support the DW SPI slave interface. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 4 drivers/spi/spi-dw-mmio.c | 20 +++- drivers/spi/spi-dw.h | 8 3 files changed, 15 insertions(+), 17 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 78e5af8ed173..8f9737640ec1 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -252,6 +252,7 @@ u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master, struct spi_device *spi, struct spi_transfer *transfer) { + struct dw_spi *dws = spi_controller_get_devdata(master); struct chip_data *chip = spi_get_ctldata(spi); u32 cr0; @@ -275,6 +276,9 @@ u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master, /* CTRLR0[13] Shift Register Loop */ cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET; + if (dws->caps & DW_SPI_CAP_KEEMBAY_MST) + cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST; + return cr0; } EXPORT_SYMBOL_GPL(dw_spi_update_cr0_v1_01a); diff --git a/drivers/spi/spi-dw-mmio.c b/drivers/spi/spi-dw-mmio.c index 7111cb7ca23b..c0d351fde782 100644 --- a/drivers/spi/spi-dw-mmio.c +++ b/drivers/spi/spi-dw-mmio.c @@ -48,13 +48,6 @@ struct dw_spi_mmio { #define SPARX5_FORCE_ENA 0xa4 #define SPARX5_FORCE_VAL 0xa8 -/* - * For Keem Bay, CTRLR0[31] is used to select controller mode. - * 0: SSI is slave - * 1: SSI is master - */ -#define KEEMBAY_CTRLR0_SSIC_IS_MST BIT(31) - struct dw_spi_mscc { struct regmap *syscon; void __iomem*spi_mst; /* Not sparx5 */ @@ -234,20 +227,13 @@ static int dw_spi_dwc_ssi_init(struct platform_device *pdev, return 0; } -static u32 dw_spi_update_cr0_keembay(struct spi_controller *master, -struct spi_device *spi, -struct spi_transfer *transfer) -{ - u32 cr0 = dw_spi_update_cr0_v1_01a(master, spi, transfer); - - return cr0 | KEEMBAY_CTRLR0_SSIC_IS_MST; -} - static int dw_spi_keembay_init(struct platform_device *pdev, struct dw_spi_mmio *dwsmmio) { + dwsmmio->dws.caps = DW_SPI_CAP_KEEMBAY_MST; + /* Register hook to configure CTRLR0 */ - dwsmmio->dws.update_cr0 = dw_spi_update_cr0_keembay; + dwsmmio->dws.update_cr0 = dw_spi_update_cr0_v1_01a; return 0; } diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index 4c748b2853a8..da9b543322c9 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -71,6 +71,13 @@ #define DWC_SSI_CTRLR0_FRF_OFFSET 6 #define DWC_SSI_CTRLR0_DFS_OFFSET 0 +/* + * For Keem Bay, CTRLR0[31] is used to select controller mode. + * 0: SSI is slave + * 1: SSI is master + */ +#define DWC_SSI_CTRLR0_KEEMBAY_MST BIT(31) + /* Bit fields in SR, 7 bits */ #define SR_MASK0x7f/* cover 7 bits */ #define SR_BUSY(1 << 0) @@ -101,6 +108,7 @@ enum dw_ssi_type { /* DW SPI capabilities */ #define DW_SPI_CAP_CS_OVERRIDE BIT(0) +#define DW_SPI_CAP_KEEMBAY_MST BIT(1) struct dw_spi; struct dw_spi_dma_ops { -- 2.27.0
[PATCH 03/30] spi: dw: Initialize n_bytes before the memory barrier
Since the n_bytes field of the DW SPI private data is also utilized by the IRQ handler, we need to make sure its initialization is done before the memory barrier. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 8b3ce5a0378a..1af74362461d 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -299,6 +299,7 @@ static int dw_spi_transfer_one(struct spi_controller *master, dws->dma_mapped = 0; spin_lock_irqsave(&dws->buf_lock, flags); + dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE); dws->tx = (void *)transfer->tx_buf; dws->tx_end = dws->tx + transfer->len; dws->rx = transfer->rx_buf; @@ -323,7 +324,6 @@ static int dw_spi_transfer_one(struct spi_controller *master, } transfer->effective_speed_hz = dws->max_freq / chip->clk_div; - dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE); cr0 = dws->update_cr0(master, spi, transfer); dw_writel(dws, DW_SPI_CTRLR0, cr0); -- 2.27.0
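The ordering pattern being restored is the usual publish-then-barrier one; a hedged sketch (field names follow the driver, the helper name and the surrounding locking are illustrative):

	/* Illustrative: publish all fields the IRQ handler reads, then fence. */
	static void example_publish_xfer(struct dw_spi *dws, struct spi_transfer *transfer)
	{
		dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
		dws->tx = (void *)transfer->tx_buf;	/* read by the IRQ handler */
		dws->rx = transfer->rx_buf;

		smp_mb();	/* make the writes visible before IRQs are unmasked */
	}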
[PATCH 09/30] spi: dw: Convert CS-override to DW SPI capabilities
There are several vendor-specific versions of the DW SPI controllers, each of which may have some peculiarities with respect to the original IP core. Seeing that this has already caused flags and a callback to be added to the DW SPI private data, let's introduce a generic capabilities interface to tune the generic DW SPI controller driver in accordance with the particular controller's specifics. It's done by converting the simple Alpine-specific CS-override capability into a DW SPI controller capability activated by setting the DW_SPI_CAP_CS_OVERRIDE flag. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 4 ++-- drivers/spi/spi-dw-mmio.c | 2 +- drivers/spi/spi-dw.h | 7 ++- 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index dd9574f9fafc..78e5af8ed173 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -104,7 +104,7 @@ void dw_spi_set_cs(struct spi_device *spi, bool enable) */ if (cs_high == enable) dw_writel(dws, DW_SPI_SER, BIT(spi->chip_select)); - else if (dws->cs_override) + else if (dws->caps & DW_SPI_CAP_CS_OVERRIDE) dw_writel(dws, DW_SPI_SER, 0); } EXPORT_SYMBOL_GPL(dw_spi_set_cs); @@ -435,7 +435,7 @@ static void spi_hw_init(struct device *dev, struct dw_spi *dws) } /* enable HW fixup for explicit CS deselect for Amazon's alpine chip */ - if (dws->cs_override) + if (dws->caps & DW_SPI_CAP_CS_OVERRIDE) dw_writel(dws, DW_SPI_CS_OVERRIDE, 0xF); } diff --git a/drivers/spi/spi-dw-mmio.c b/drivers/spi/spi-dw-mmio.c index 18772c0c9220..7111cb7ca23b 100644 --- a/drivers/spi/spi-dw-mmio.c +++ b/drivers/spi/spi-dw-mmio.c @@ -204,7 +204,7 @@ static int dw_spi_mscc_sparx5_init(struct platform_device *pdev, static int dw_spi_alpine_init(struct platform_device *pdev, struct dw_spi_mmio *dwsmmio) { - dwsmmio->dws.cs_override = 1; + dwsmmio->dws.caps = DW_SPI_CAP_CS_OVERRIDE; /* Register hook to configure CTRLR0 */ dwsmmio->dws.update_cr0 = dw_spi_update_cr0; diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index 4420b4d29bac..4c748b2853a8 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -2,6 +2,7 @@ #ifndef DW_SPI_HEADER_H #define DW_SPI_HEADER_H +#include #include #include #include @@ -98,6 +99,9 @@ enum dw_ssi_type { SSI_NS_MICROWIRE, }; +/* DW SPI capabilities */ +#define DW_SPI_CAP_CS_OVERRIDE BIT(0) + struct dw_spi; struct dw_spi_dma_ops { int (*dma_init)(struct device *dev, struct dw_spi *dws); @@ -118,7 +122,8 @@ struct dw_spi { u32 fifo_len; /* depth of the FIFO buffer */ u32 max_freq; /* max bus freq supported */ - int cs_override; + u32 caps; /* DW SPI capabilities */ + u32 reg_io_width; /* DR I/O width in bytes */ u16 bus_num; u16 num_cs; /* supported slave numbers */ -- 2.27.0
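For a glue driver the approach reduces to ORing flags instead of wiring callbacks; a hedged sketch modeled on the Alpine hook in the diff above (the function name is illustrative):

	/* Illustrative glue-layer init: select vendor quirks via the caps bitmask. */
	static int dw_spi_example_init(struct platform_device *pdev,
				       struct dw_spi_mmio *dwsmmio)
	{
		dwsmmio->dws.caps = DW_SPI_CAP_CS_OVERRIDE;	/* opt in to the quirk */

		return 0;
	}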
[PATCH 00/30] spi: dw: Add full Baikal-T1 SPI Controllers support
Originally I intended to merge a dedicated Baikal-T1 System Boot SPI Controller driver into the kernel and leave the DW APB SSI driver untouched. But after a long discussion (see the link at the bottom of the letter) Mark and Andy persuaded me to integrate what we developed there into the DW APB SSI core driver so it would be useful for other controllers, which may have the same peculiarities/problems as ours: - No IRQ. - No DMA. - No GPIO CS, so a native CS is utilized. - Small Tx/Rx FIFO depth. - Automatic CS assertion/de-assertion. - Slow system bus. All of them have been addressed in the framework of this patchset, to some extent at least, for the SPI memory operations. As I expected it wasn't that easy, and the integration took as many patches as you can see from the subject. Though some of them are mere cleanups or fixes only weakly related to the subject, we just couldn't leave the code as is in some places since we were working with the DW APB SSI driver anyway. Here is what we did to fix the original DW APB SSI driver and to make it less messy. The first two patches are just cleanups to simplify the DW APB SSI device initialization a bit. We suggest discarding the IRQ threshold macro as unused and using a ternary operator to initialize the set_cs callback instead of assigning and then updating it. Then we've discovered that the n_bytes field of the driver private data is used by the DW APB SSI IRQ handler, which requires it to be initialized before the SMP memory barrier and to be visible from other CPUs. Speaking of the SMP memory barrier: having one right after the shared resources initialization is enough, and there is no point in using the spin-lock to protect the Tx/Rx buffer pointers. The protection functionality is redundant there by the driver design. (Though I have a doubt whether the SMP memory barrier is required there at all, because the normal IO methods like readl/writel imply a full memory barrier, so any memory operations performed before them are supposed to be seen by devices and other CPUs. See the patch log for details of my concern.) Thirdly we've found out that there is some confusion in the IRQ masking/unmasking/clearing in the SPI-transfer procedure. Multiple interrupts are unmasked on the SPI-transfer initialization, but only TXEI is masked back on completion. Similarly the IRQ status isn't cleared on the controller reset, which actually makes the reset incomplete and error-prone in the controller probe procedure. Another very important optimization is using the IO-relaxed accessors in the dw_read_io_reg()/dw_write_io_reg() methods. Since the Tx/Rx FIFO data registers are the most frequently accessed controller resource, using relaxed accessors there will significantly improve the data read/write performance. At least on the Baikal-T1 SoC such a modification opens up a way to have the DW APB SSI controller working at higher SPI bus speeds than without it. Fifthly we've made an effort to clean up the code using the SPI-device private data (chip_data). We suggest removing the chip type from there since it isn't used and isn't implemented right anyway. Then, instead of preserving the bus speed, clock divider and transfer mode there while recalculating the CR0 fields of the SPI-device-specific phase, polarity and frame format each time an SPI transfer is requested, we can save the resulting CR0 value in the chip_data instance. By doing so we'll make that structure finally be used as it was supposed to by design (see the spi-fsl-dspi.c, spi-pl022.c, spi-pxa2xx.c drivers for examples).
Sixthly, instead of having the SPI-transfer-specific CR0-update callback, we suggest implementing a DW APB SSI controller capabilities approach. By doing so we can now inject the vendor-specific peculiarities in different parts of the DW APB SSI core driver (which is required to implement both SPI transfers and the SPI memory operations). This also makes the code less confusing than defining a callback in the core driver, setting it up in the glue layer, and then calling it from the core driver again. A small capabilities implementation embedded in-situ is more readable than tracking callback assignments. This concerns the CS-override, the Keem Bay master setup, and the DW SSI-specific CR0 register layout capabilities. Seventhly, since there are going to be two types of transfers implemented in the DW APB SSI core driver, we need a common method to set the controller configuration: the Tx/Rx mode, bus speed, data frame size and, in case of the memory operations, the number of data frames to read. So we just detached the corresponding code from the transfer-one method and made it part of the new dw_spi_update_config() function, the former update_cr0(). Note that the new method will also be useful for the glue drivers, which due to the hardware design need to create their own memory operations (for instance, the dirmap operations provided in the Baikal-T System Boot SPI controller driver). Eighthly
[PATCH 02/30] spi: dw: Use ternary op to init set_cs callback
Simplify the dw_spi_add_host() method a bit by replacing the set_cs callback overwrite procedure with directly setting the callback when a custom version is specified. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 55afdcee7d2b..8b3ce5a0378a 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -482,7 +482,7 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws) master->num_chipselect = dws->num_cs; master->setup = dw_spi_setup; master->cleanup = dw_spi_cleanup; - master->set_cs = dw_spi_set_cs; + master->set_cs = dws->set_cs ?: dw_spi_set_cs; master->transfer_one = dw_spi_transfer_one; master->handle_err = dw_spi_handle_err; master->max_speed_hz = dws->max_freq; @@ -491,9 +491,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws) master->flags = SPI_MASTER_GPIO_SS; master->auto_runtime_pm = true; - if (dws->set_cs) - master->set_cs = dws->set_cs; - /* Get default rx sample delay */ device_property_read_u32(dev, "rx-sample-delay-ns", &dws->def_rx_sample_dly_ns); -- 2.27.0
[PATCH 14/30] spi: dw: Simplify the SPI bus speed config procedure
The code currently responsible for setting up the SPI communication speed is a bit messy. Most likely for some historical reason the bus frequency is saved in the peripheral chip private data. It's pointless now, since the custom communication speed is an SPI-transfer-specific thing, and only if there is no SPI transfer specified (as during the SPI memory operations) can it be taken from the SPI device structure. But even in the latter case there is no point in having the clock divider and the SPI bus frequency saved in the chip data, because the controller can be used for both SPI-transfer-based and SPI-transfer-less communications. From a software point of view keeping the current clock divider in an SPI-device-specific storage may give a small performance gain (occasionally avoiding a round-up division), but it just isn't worth saving a few CPU cycles, in comparison to the total SPI transfer time, at the cost of harder-to-read code. The only optimization worth preserving in the code is avoiding unnecessary DW SPI controller register updates where possible. So let's simplify the SPI communication speed update procedure by removing the clock-related fields from the peripheral chip data and updating the DW SPI clock divider only if it has really changed. The latter is achieved by keeping the effective SPI bus speed in the internal DW SPI private data. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 23 ++- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 0b80a16d9872..92138a6ada12 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -24,9 +24,6 @@ struct chip_data { u8 tmode; /* TR/TO/RO/EEPROM */ - u16 clk_div;/* baud rate divider */ - u32 speed_hz; /* baud rate */ - u32 cr0; u32 rx_sample_dly; /* RX sample delay */ }; @@ -256,6 +253,8 @@ static void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, { struct chip_data *chip = spi_get_ctldata(spi); u32 cr0 = chip->cr0; + u32 speed_hz; + u16 clk_div; cr0 |= (transfer->bits_per_word - 1); @@ -266,15 +265,13 @@ static void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, dw_writel(dws, DW_SPI_CTRLR0, cr0); - /* Handle per transfer options for bpw and speed */ - if (transfer->speed_hz != dws->current_freq) { - if (transfer->speed_hz != chip->speed_hz) { - /* clk_div doesn't support odd number */ - chip->clk_div = (DIV_ROUND_UP(dws->max_freq, transfer->speed_hz) + 1) & 0xfffe; - chip->speed_hz = transfer->speed_hz; - } - dws->current_freq = transfer->speed_hz; - spi_set_clk(dws, chip->clk_div); + /* Note DW APB SSI clock divider doesn't support odd numbers */ + clk_div = (DIV_ROUND_UP(dws->max_freq, transfer->speed_hz) + 1) & 0xfffe; + speed_hz = dws->max_freq / clk_div; + + if (dws->current_freq != speed_hz) { + spi_set_clk(dws, clk_div); + dws->current_freq = speed_hz; } } @@ -302,7 +299,7 @@ static int dw_spi_transfer_one(struct spi_controller *master, dw_spi_update_config(dws, spi, transfer); - transfer->effective_speed_hz = dws->max_freq / chip->clk_div; + transfer->effective_speed_hz = dws->current_freq; /* Check if current transfer is a DMA transaction */ if (master->can_dma && master->can_dma(master, spi, transfer)) -- 2.27.0
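A worked example of the divider rounding may help (the numbers are illustrative): the DW APB SSI divider must be even, and the "+ 1, mask bit 0" trick rounds an odd quotient up while leaving an even one untouched.

	/* max_freq = 100 MHz, requested speed = 24 MHz */
	u16 clk_div = (DIV_ROUND_UP(100000000U, 24000000U) + 1) & 0xfffe;
	/* DIV_ROUND_UP(...) = 5, (5 + 1) & 0xfffe = 6 */
	u32 speed_hz = 100000000U / clk_div;	/* ~16.7 MHz effective */

An already-even quotient, e.g. 4 for a 25 MHz request, passes through unchanged: (4 + 1) & 0xfffe = 4.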
[PATCH 08/30] spi: dw: Discard DW SSI chip type storages
Keeping the SPI peripheral device type is pointless since, first, it hasn't been functionally utilized by any of the client drivers/code and, second, it won't work for the Microwire type at the very least. Moreover there is no point in setting up the type by means of the chip-data in the modern kernel. Peripheral devices with a specific interface type need to be detected in order to activate the corresponding frame format, which will most likely require some peripheral-device-specific DT property or the like to find out the interface protocol. So let's remove the serial interface type fields from the DW APB SSI controller and the SPI peripheral device private data. Note we'll preserve the explicit SSI_MOTO_SPI interface type setting to signify the only currently supported interface protocol. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 6 ++ drivers/spi/spi-dw.h | 1 - 2 files changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index be94ed5bb896..dd9574f9fafc 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -23,7 +23,6 @@ /* Slave spi_dev related */ struct chip_data { u8 tmode; /* TR/TO/RO/EEPROM */ - u8 type;/* SPI/SSP/MicroWire */ u16 clk_div;/* baud rate divider */ u32 speed_hz; /* baud rate */ @@ -238,7 +237,7 @@ u32 dw_spi_update_cr0(struct spi_controller *master, struct spi_device *spi, /* Default SPI mode is SCPOL = 0, SCPH = 0 */ cr0 = (transfer->bits_per_word - 1) - | (chip->type << SPI_FRF_OFFSET) + | (SSI_MOTO_SPI << SPI_FRF_OFFSET) | ((((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET) | (((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET) | (((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET)) @@ -260,7 +259,7 @@ u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master, cr0 = (transfer->bits_per_word - 1); /* CTRLR0[ 7: 6] Frame Format */ - cr0 |= chip->type << DWC_SSI_CTRLR0_FRF_OFFSET; + cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET; /* * SPI mode (SCPOL|SCPH) @@ -453,7 +452,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws) return -ENOMEM; dws->master = master; - dws->type = SSI_MOTO_SPI; dws->dma_addr = (dma_addr_t)(dws->paddr + DW_SPI_DR); spi_controller_set_devdata(master, dws); diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index 063fa1b1352d..4420b4d29bac 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -111,7 +111,6 @@ struct dw_spi_dma_ops { struct dw_spi { struct spi_controller *master; - enum dw_ssi_typetype; void __iomem*regs; unsigned long paddr; -- 2.27.0
[PATCH 16/30] spi: dw: Add DW SPI controller config structure
The DW APB SSI controller can be used via two SPI core interfaces: traditional SPI transfers and SPI memory operations. The controller needs to be configured accordingly at runtime when the corresponding operations are executed. In order to do that for both interfaces from a single function, we introduce a new data wrapper for the transfer mode, data width, number of data frames (for the automatic data transfer) and the bus frequency. It will be used by the update_config() method to tune the DW APB SSI. The update_config() method is exported so it can be used not only by the DW SPI core driver but by the glue layer drivers too. This will be required in a further commit. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 29 + drivers/spi/spi-dw.h | 10 ++ 2 files changed, 27 insertions(+), 12 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index f00fc4828480..9102685c1523 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -20,10 +20,8 @@ #include #endif -/* Slave spi_dev related */ +/* Slave spi_device related */ struct chip_data { - u8 tmode; /* TR/TO/RO/EEPROM */ - u32 cr0; u32 rx_sample_dly; /* RX sample delay */ }; @@ -248,25 +246,28 @@ static u32 dw_spi_get_cr0(struct dw_spi *dws, struct spi_device *spi) return cr0; } -static void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, -struct spi_transfer *transfer) +void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, + struct dw_spi_cfg *cfg) { struct chip_data *chip = spi_get_ctldata(spi); u32 cr0 = chip->cr0; u32 speed_hz; u16 clk_div; - cr0 |= (transfer->bits_per_word - 1); + cr0 |= (cfg->dfs - 1); if (!(dws->caps & DW_SPI_CAP_DWC_SSI)) - cr0 |= chip->tmode << SPI_TMOD_OFFSET; + cr0 |= cfg->tmode << SPI_TMOD_OFFSET; else - cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET; + cr0 |= cfg->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET; dw_writel(dws, DW_SPI_CTRLR0, cr0); + if (cfg->tmode == SPI_TMOD_EPROMREAD || cfg->tmode == SPI_TMOD_RO) + dw_writel(dws, DW_SPI_CTRLR1, cfg->ndf ? cfg->ndf - 1 : 0); + /* Note DW APB SSI clock divider doesn't support odd numbers */ - clk_div = (DIV_ROUND_UP(dws->max_freq, transfer->speed_hz) + 1) & 0xfffe; + clk_div = (DIV_ROUND_UP(dws->max_freq, cfg->freq) + 1) & 0xfffe; speed_hz = dws->max_freq / clk_div; if (dws->current_freq != speed_hz) { @@ -280,11 +281,17 @@ static void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, dws->cur_rx_sample_dly = chip->rx_sample_dly; } } +EXPORT_SYMBOL_GPL(dw_spi_update_config); static int dw_spi_transfer_one(struct spi_controller *master, struct spi_device *spi, struct spi_transfer *transfer) { struct dw_spi *dws = spi_controller_get_devdata(master); + struct dw_spi_cfg cfg = { + .tmode = SPI_TMOD_TR, + .dfs = transfer->bits_per_word, + .freq = transfer->speed_hz, + }; u8 imask = 0; u16 txlevel = 0; int ret; @@ -302,7 +309,7 @@ static int dw_spi_transfer_one(struct spi_controller *master, spi_enable_chip(dws, 0); - dw_spi_update_config(dws, spi, transfer); + dw_spi_update_config(dws, spi, &cfg); transfer->effective_speed_hz = dws->current_freq; @@ -388,8 +395,6 @@ static int dw_spi_setup(struct spi_device *spi) */ chip->cr0 = dw_spi_get_cr0(dws, spi); - chip->tmode = SPI_TMOD_TR; - return 0; } diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index c02351cf2f99..2a2346438564 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -111,6 +111,14 @@ enum dw_ssi_type { #define DW_SPI_CAP_KEEMBAY_MST BIT(1) #define DW_SPI_CAP_DWC_SSI BIT(2) +/* Slave spi_transfer/spi_mem_op related */ +struct dw_spi_cfg { + u8 tmode; + u8 dfs; + u32 ndf; + u32 freq; +}; + struct dw_spi; struct dw_spi_dma_ops { int (*dma_init)(struct device *dev, struct dw_spi *dws); @@ -249,6 +257,8 @@ static inline void spi_shutdown_chip(struct dw_spi *dws) } extern void dw_spi_set_cs(struct spi_device *spi, bool enable); +extern void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, +struct dw_spi_cfg *cfg); extern int dw_spi_add_host(struct device *dev, struct dw_spi *dws); extern void dw_spi_remove_host(struct dw_spi *dws); extern int dw_spi_suspend_host(struct dw_spi *dws); -- 2.27.0
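A hedged sketch of how the new wrapper is meant to be consumed, e.g. from a memory-operation path (the helper name and parameters are illustrative, not from the patch): the caller fills in a dw_spi_cfg and hands it to the now-exported update method.

	/* Illustrative: configure the controller for an EEPROM-style read. */
	static void example_config_eeprom_read(struct dw_spi *dws,
					       struct spi_device *spi,
					       u32 nframes, u32 freq)
	{
		struct dw_spi_cfg cfg = {
			.tmode = SPI_TMOD_EPROMREAD,	/* receive-only after the command phase */
			.dfs = 8,			/* 8-bit data frames */
			.ndf = nframes,			/* frames the controller clocks in by itself */
			.freq = freq,
		};

		dw_spi_update_config(dws, spi, &cfg);
	}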
[PATCH 27/30] spi: dw: Introduce max mem-ops SPI bus frequency setting
In some circumstances the current implementation of the SPI memory operations may occasionally fail even though they are executed in the atomic context. This may happen if the system bus is relatively slow in comparison to the SPI bus frequency, or if there is concurrent access to it, which makes the MMIO operations occasionally stall before push-pulling data from the DW APB SPI FIFOs. We discovered both of these problems on the Baikal-T1 SoC. In order to fix them we have no choice but to set an artificial limitation on the SPI bus speed. Note that currently this limitation is applicable only to the memory operations, since the standard SPI core interface is implemented with an assumption that there is no problem with the automatic CS toggling. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 4 +++- drivers/spi/spi-dw.h | 1 + 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index ca22f427d82d..7b901226fd38 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -608,7 +608,7 @@ static int dw_spi_exec_mem_op(struct spi_mem *mem, const struct spi_mem_op *op) * operation. Transmit-only mode is suitable for the rest of them. */ cfg.dfs = 8; - cfg.freq = mem->spi->max_speed_hz; + cfg.freq = clamp(mem->spi->max_speed_hz, 0U, dws->max_mem_freq); if (op->data.dir == SPI_MEM_DATA_IN) { cfg.tmode = SPI_TMOD_EPROMREAD; cfg.ndf = op->data.nbytes; @@ -695,6 +695,8 @@ static void dw_spi_init_mem_ops(struct dw_spi *dws) dws->mem_ops.adjust_op_size = dw_spi_adjust_mem_op_size; dws->mem_ops.supports_op = dw_spi_supports_mem_op; dws->mem_ops.exec_op = dw_spi_exec_mem_op; + if (!dws->max_mem_freq) + dws->max_mem_freq = dws->max_freq; } } diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index 4b08fe34a85d..dc5781236cc6 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -148,6 +148,7 @@ struct dw_spi { unsigned long paddr; int irq; u32 fifo_len; /* depth of the FIFO buffer */ + u32 max_mem_freq; /* max mem-ops bus freq */ u32 max_freq; /* max bus freq supported */ u32 caps; /* DW SPI capabilities */ -- 2.27.0
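For a glue layer, the intended usage presumably looks like the sketch below: pre-initialize max_mem_freq before registering the host, or leave it zero to get the max_freq fallback from dw_spi_init_mem_ops(). All the "foo" names, the 20 MHz cap and the omitted register/IRQ/DMA setup are illustrative assumptions, not part of the patch.

#include <linux/clk.h>
#include <linux/err.h>
#include <linux/platform_device.h>
#include <linux/slab.h>

#include "spi-dw.h"

struct foo_spi {
	struct dw_spi dws;
	struct clk *clk;
};

/* Hypothetical glue-layer probe; MMIO, IRQ and error handling omitted. */
static int foo_spi_probe(struct platform_device *pdev)
{
	struct foo_spi *foo;

	foo = devm_kzalloc(&pdev->dev, sizeof(*foo), GFP_KERNEL);
	if (!foo)
		return -ENOMEM;

	foo->clk = devm_clk_get(&pdev->dev, NULL);
	if (IS_ERR(foo->clk))
		return PTR_ERR(foo->clk);

	/* Normal transfers may use the full rate... */
	foo->dws.max_freq = clk_get_rate(foo->clk);
	/* ...while the mem-ops get capped to dodge the MMIO-stall problem. */
	foo->dws.max_mem_freq = 20000000U;

	return dw_spi_add_host(&pdev->dev, &foo->dws);
}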
[PATCH 18/30] spi: dw: Refactor IRQ-based SPI transfer procedure
The current IRQ-based SPI transfer execution procedure doesn't work well at the final stage of the execution. If all the Tx data is sent out (written to the Tx FIFO) but there is some data left to receive, the Tx FIFO Empty IRQ will constantly happen until all of the requested inbound data is received. Even though that lasts only for a short period of time, it makes the system less responsive. In order to fix that, let's refactor the SPI transfer execution procedure by taking the Rx FIFO Full IRQ into account. We'll read and write SPI transfer data each time the IRQ happens, as before. If all the outbound data is sent out, we'll disable the Tx FIFO Empty IRQ. If there is still some data to receive, we'll adjust the Rx FIFO Threshold level, so the next IRQ is raised at the moment all of the incoming data becomes available in the Rx FIFO. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 33 - 1 file changed, 24 insertions(+), 9 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 84c1fdfd6e52..682463b2f68b 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -189,17 +189,30 @@ static irqreturn_t interrupt_transfer(struct dw_spi *dws) return IRQ_HANDLED; } + /* +* Read data from the Rx FIFO every time we've got a chance executing +* this method. If there is nothing left to receive, terminate the +* procedure. Otherwise adjust the Rx FIFO Threshold level if it's a +* final stage of the transfer. By doing so we'll get the next IRQ +* right when the leftover incoming data is received. +*/ dw_reader(dws); if (!dws->rx_len) { spi_mask_intr(dws, 0xff); spi_finalize_current_transfer(dws->master); - return IRQ_HANDLED; + } else if (dws->rx_len <= dw_readl(dws, DW_SPI_RXFTLR)) { + dw_writel(dws, DW_SPI_RXFTLR, dws->rx_len - 1); } + + /* +* Send data out if Tx FIFO Empty IRQ is received. The IRQ will be +* disabled after the data transmission is finished so as not to +* have the TXE IRQ flood at the final stage of the transfer. +*/ if (irq_status & SPI_INT_TXEI) { - spi_mask_intr(dws, SPI_INT_TXEI); dw_writer(dws); - /* Enable TX irq always, it will be disabled when RX finished */ - spi_umask_intr(dws, SPI_INT_TXEI); + if (!dws->tx_len) + spi_mask_intr(dws, SPI_INT_TXEI); } return IRQ_HANDLED; @@ -317,10 +330,6 @@ static int dw_spi_transfer_one(struct spi_controller *master, /* For poll mode just disable all interrupts */ spi_mask_intr(dws, 0xff); - /* -* Interrupt mode -* we only need set the TXEI IRQ, as TX/RX always happen syncronizely -*/ if (dws->dma_mapped) { ret = dws->dma_ops->dma_setup(dws, transfer); if (ret < 0) { @@ -328,12 +337,18 @@ static int dw_spi_transfer_one(struct spi_controller *master, return ret; } } else { + /* +* Originally Tx and Rx data lengths match. Rx FIFO Threshold level +* will be adjusted at the final stage of the IRQ-based SPI transfer +* execution so as not to lose the leftover of the incoming data. +*/ txlevel = min_t(u16, dws->fifo_len / 2, dws->tx_len); dw_writel(dws, DW_SPI_TXFTLR, txlevel); + dw_writel(dws, DW_SPI_RXFTLR, txlevel - 1); /* Set the interrupt mask */ imask |= SPI_INT_TXEI | SPI_INT_TXOI | -SPI_INT_RXUI | SPI_INT_RXOI; +SPI_INT_RXUI | SPI_INT_RXOI | SPI_INT_RXFI; spi_umask_intr(dws, imask); dws->transfer_handler = interrupt_transfer; -- 2.27.0
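The final-stage threshold trick is easier to see on a toy model. The following stand-alone sketch (plain user-space C, example numbers) mimics the handler: each "IRQ" delivers threshold + 1 words, and once the leftover fits under the current threshold, the model does what the driver does with DW_SPI_RXFTLR, lowering it to rx_len - 1 so exactly one more IRQ fires when the tail arrives.

#include <stdio.h>

int main(void)
{
	unsigned int rx_len = 37;	/* words left to receive, example */
	unsigned int rxftlr = 15;	/* IRQ fires when FIFO level > RXFTLR */
	unsigned int irqs = 0;

	while (rx_len) {
		unsigned int burst = rxftlr + 1; /* words present at IRQ time */

		if (burst > rx_len)
			burst = rx_len;
		rx_len -= burst;	/* dw_reader() drains the FIFO */
		irqs++;

		if (rx_len && rx_len <= rxftlr) {
			rxftlr = rx_len - 1; /* DW_SPI_RXFTLR = rx_len - 1 */
			printf("threshold lowered to %u for the tail\n", rxftlr);
		}
	}

	printf("transfer completed in %u IRQs\n", irqs);
	return 0;
}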
[PATCH 19/30] spi: dw: Perform IRQ setup in a dedicated function
In order to make the transfer_one() callback method more readable and for unification with the DMA-based transfer, let's detach the IRQ setup procedure into a dedicated function. While at it, rename the IRQ-based transfer handler function to have a dw_spi prefix and to look more like the DMA-related one. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 41 ++- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 682463b2f68b..08bc53b9de88 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -178,7 +178,7 @@ static void int_error_stop(struct dw_spi *dws, const char *msg) spi_finalize_current_transfer(dws->master); } -static irqreturn_t interrupt_transfer(struct dw_spi *dws) +static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws) { u16 irq_status = dw_readl(dws, DW_SPI_ISR); @@ -294,6 +294,27 @@ void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, } EXPORT_SYMBOL_GPL(dw_spi_update_config); +static void dw_spi_irq_setup(struct dw_spi *dws) +{ + u16 level; + u8 imask; + + /* +* Originally Tx and Rx data lengths match. Rx FIFO Threshold level +* will be adjusted at the final stage of the IRQ-based SPI transfer +* execution so as not to lose the leftover of the incoming data. +*/ + level = min_t(u16, dws->fifo_len / 2, dws->tx_len); + dw_writel(dws, DW_SPI_TXFTLR, level); + dw_writel(dws, DW_SPI_RXFTLR, level - 1); + + imask = SPI_INT_TXEI | SPI_INT_TXOI | SPI_INT_RXUI | SPI_INT_RXOI | + SPI_INT_RXFI; + spi_umask_intr(dws, imask); + + dws->transfer_handler = dw_spi_transfer_handler; +} + static int dw_spi_transfer_one(struct spi_controller *master, struct spi_device *spi, struct spi_transfer *transfer) { @@ -303,8 +324,6 @@ static int dw_spi_transfer_one(struct spi_controller *master, .dfs = transfer->bits_per_word, .freq = transfer->speed_hz, }; - u8 imask = 0; - u16 txlevel = 0; int ret; dws->dma_mapped = 0; @@ -337,21 +356,7 @@ static int dw_spi_transfer_one(struct spi_controller *master, return ret; } } else { - /* -* Originally Tx and Rx data lengths match. Rx FIFO Threshold level -* will be adjusted at the final stage of the IRQ-based SPI transfer -* execution so as not to lose the leftover of the incoming data. -*/ - txlevel = min_t(u16, dws->fifo_len / 2, dws->tx_len); - dw_writel(dws, DW_SPI_TXFTLR, txlevel); - dw_writel(dws, DW_SPI_RXFTLR, txlevel - 1); - - /* Set the interrupt mask */ - imask |= SPI_INT_TXEI | SPI_INT_TXOI | -SPI_INT_RXUI | SPI_INT_RXOI | SPI_INT_RXFI; - spi_umask_intr(dws, imask); - - dws->transfer_handler = interrupt_transfer; + dw_spi_irq_setup(dws); } spi_enable_chip(dws, 1); -- 2.27.0
[PATCH 05/30] spi: dw: Clear IRQ status on DW SPI controller reset
It turns out the IRQ status isn't cleared after switching the controller off and getting it back on, which may cause false error interrupts to be raised if the controller has been used unsuccessfully by, for instance, a bootloader before the driver is loaded. Let's explicitly clear the interrupt status in the dedicated controller reset method. Signed-off-by: Serge Semin --- drivers/spi/spi-dw.h | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index 1ab704d1ebd8..ff77f39047ce 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -229,14 +229,15 @@ static inline void spi_umask_intr(struct dw_spi *dws, u32 mask) } /* - * This does disable the SPI controller, interrupts, and re-enable the - * controller back. Transmit and receive FIFO buffers are cleared when the - * device is disabled. + * This disables the SPI controller and interrupts, clears the interrupt + * status, and re-enables the controller. Transmit and receive FIFO buffers + * are cleared when the device is disabled. */ static inline void spi_reset_chip(struct dw_spi *dws) { spi_enable_chip(dws, 0); spi_mask_intr(dws, 0xff); + dw_readl(dws, DW_SPI_ICR); spi_enable_chip(dws, 1); } -- 2.27.0
[PATCH 17/30] spi: dw: Refactor data IO procedure
The Tx and Rx data write/read procedure can be significantly simplified by using Tx/Rx transfer lengths instead of the end pointers. By keeping the Tx/Rx data leftover lengths (in number of transfer words) we can get rid of all the subtraction and division operations utilized here and there in the tx_max(), rx_max(), dw_writer() and dw_reader() methods. Such a modification will not only give us more optimized IO procedures, but will also make the data IO methods much more readable than before. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 37 + drivers/spi/spi-dw.h | 5 ++--- 2 files changed, 19 insertions(+), 23 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 9102685c1523..84c1fdfd6e52 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -108,9 +108,8 @@ EXPORT_SYMBOL_GPL(dw_spi_set_cs); /* Return the max entries we can fill into tx fifo */ static inline u32 tx_max(struct dw_spi *dws) { - u32 tx_left, tx_room, rxtx_gap; + u32 tx_room, rxtx_gap; - tx_left = (dws->tx_end - dws->tx) / dws->n_bytes; tx_room = dws->fifo_len - dw_readl(dws, DW_SPI_TXFLR); /* @@ -121,18 +120,15 @@ static inline u32 tx_max(struct dw_spi *dws) * shift registers. So a control from sw point of * view is taken. */ - rxtx_gap = ((dws->rx_end - dws->rx) - (dws->tx_end - dws->tx)) - / dws->n_bytes; + rxtx_gap = dws->fifo_len - (dws->rx_len - dws->tx_len); - return min3(tx_left, tx_room, (u32) (dws->fifo_len - rxtx_gap)); + return min3((u32)dws->tx_len, tx_room, rxtx_gap); } /* Return the max entries we should read out of rx fifo */ static inline u32 rx_max(struct dw_spi *dws) { - u32 rx_left = (dws->rx_end - dws->rx) / dws->n_bytes; - - return min_t(u32, rx_left, dw_readl(dws, DW_SPI_RXFLR)); + return min_t(u32, dws->rx_len, dw_readl(dws, DW_SPI_RXFLR)); } static void dw_writer(struct dw_spi *dws) @@ -141,15 +137,16 @@ static void dw_writer(struct dw_spi *dws) u16 txw = 0; while (max--) { - /* Set the tx word if the transfer's original "tx" is not null */ - if (dws->tx_end - dws->len) { + if (dws->tx) { if (dws->n_bytes == 1) txw = *(u8 *)(dws->tx); else txw = *(u16 *)(dws->tx); + + dws->tx += dws->n_bytes; } dw_write_io_reg(dws, DW_SPI_DR, txw); - dws->tx += dws->n_bytes; + --dws->tx_len; } } @@ -160,14 +157,15 @@ static void dw_reader(struct dw_spi *dws) while (max--) { rxw = dw_read_io_reg(dws, DW_SPI_DR); - /* Care rx only if the transfer's original "rx" is not null */ - if (dws->rx_end - dws->len) { + if (dws->rx) { if (dws->n_bytes == 1) *(u8 *)(dws->rx) = rxw; else *(u16 *)(dws->rx) = rxw; + + dws->rx += dws->n_bytes; } - dws->rx += dws->n_bytes; + --dws->rx_len; } } @@ -192,7 +190,7 @@ static irqreturn_t interrupt_transfer(struct dw_spi *dws) } dw_reader(dws); - if (dws->rx_end == dws->rx) { + if (!dws->rx_len) { spi_mask_intr(dws, 0xff); spi_finalize_current_transfer(dws->master); return IRQ_HANDLED; @@ -299,12 +297,11 @@ static int dw_spi_transfer_one(struct spi_controller *master, dws->dma_mapped = 0; dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE); dws->tx = (void *)transfer->tx_buf; - dws->tx_end = dws->tx + transfer->len; + dws->tx_len = transfer->len / dws->n_bytes; dws->rx = transfer->rx_buf; - dws->rx_end = dws->rx + transfer->len; - dws->len = transfer->len; + dws->rx_len = dws->tx_len; - /* Ensure dw->rx and dw->rx_end are visible */ + /* Ensure the data above is visible for all CPUs */ smp_mb(); spi_enable_chip(dws, 0); @@ -331,7 +328,7 @@ static int dw_spi_transfer_one(struct spi_controller *master, return ret; } } 
else { - txlevel = min_t(u16, dws->fifo_len / 2, dws->len / dws->n_bytes); + txlevel = min_t(u16, dws->fifo_len / 2, dws->tx_len); dw_writel(dws, DW_SPI_TXFTLR, txlevel); /* Set the interrupt mask */ diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index 2a2346438564..cfc9f63acde4 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -147,11 +147,10 @@ struct dw_spi { void (*set_cs)(struct spi_device *spi, bool en
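The effect of switching to leftover lengths is easy to verify numerically. Below is a small user-space model of the reworked tx_max(); the FIFO depth, tx_room and the sample lengths are arbitrary example values. The Tx/Rx skew guard needs no pointer subtraction and no division by n_bytes anymore.

#include <stdio.h>

#define FIFO_LEN 16

static unsigned int min3u(unsigned int a, unsigned int b, unsigned int c)
{
	unsigned int m = a < b ? a : b;

	return m < c ? m : c;
}

/* Model of the reworked tx_max(): tx_room stands in for
 * fifo_len - TXFLR, i.e. the free room in the Tx FIFO. */
static unsigned int tx_max(unsigned int tx_len, unsigned int rx_len,
			   unsigned int tx_room)
{
	unsigned int rxtx_gap = FIFO_LEN - (rx_len - tx_len);

	return min3u(tx_len, tx_room, rxtx_gap);
}

int main(void)
{
	/* 10 words in flight (sent, not yet read back): rx_len - tx_len
	 * == 10, so only 6 more words may be pushed out. */
	printf("%u\n", tx_max(40, 50, 16));	/* -> 6 */

	/* Nothing in flight, empty FIFO: limited by tx_len alone. */
	printf("%u\n", tx_max(3, 3, 16));	/* -> 3 */

	return 0;
}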
[PATCH 22/30] spi: dw: De-assert chip-select on reset
The SPI memory operations implementation will require the CS register to be cleared before executing the operation, so that the transmission isn't automatically started before the Tx FIFO is pre-initialized. Let's clear the register on an explicit controller reset then, to fulfil that requirement in case of an error or the CS having been left set by a bootloader or other software. Signed-off-by: Serge Semin --- drivers/spi/spi-dw.h | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index cfc9f63acde4..eb1d46983319 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -237,15 +237,16 @@ static inline void spi_umask_intr(struct dw_spi *dws, u32 mask) } /* - * This disables the SPI controller and interrupts, clears the interrupt - * status, and re-enables the controller. Transmit and receive FIFO buffers - * are cleared when the device is disabled. + * This disables the SPI controller and interrupts, clears the interrupt + * status and CS, then re-enables the controller. Transmit and receive FIFO + * buffers are cleared when the device is disabled. */ static inline void spi_reset_chip(struct dw_spi *dws) { spi_enable_chip(dws, 0); spi_mask_intr(dws, 0xff); dw_readl(dws, DW_SPI_ICR); + dw_writel(dws, DW_SPI_SER, 0); spi_enable_chip(dws, 1); } -- 2.27.0
[PATCH 29/30] dt-bindings: spi: dw: Add Baikal-T1 SPI Controllers
These controllers are based on the DW APB SSI IP-core and embedded into the SoC: two of them are equipped with IRQ, DMA, 64-word FIFOs and 4 native CS, while the third one, being utilized by the Baikal-T1 System Boot Controller, has got very limited resources: no IRQ, no DMA, only a single native chip-select and just 8-byte Tx/Rx FIFOs available. That's why we have to mark the IRQ as optional for the latter interface. The SPI controller embedded into the Baikal-T1 System Boot Controller can also be used to directly access an external SPI flash by means of a dedicated FSM. The availability of the corresponding MMIO region is switched by the embedded multiplexer, whose phandle can be specified in the dts node. A new example is added to test out the non-standard Baikal-T1 System Boot SPI Controller DT binding. Co-developed-by: Ramil Zaripov Signed-off-by: Ramil Zaripov Signed-off-by: Serge Semin --- .../bindings/spi/snps,dw-apb-ssi.yaml | 33 +-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml b/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml index c62cbe79f00d..d6ae35777dac 100644 --- a/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml +++ b/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml @@ -22,6 +22,21 @@ allOf: properties: reg: minItems: 2 + - if: + properties: +compatible: + contains: +enum: + - baikal,bt1-sys-ssi +then: + properties: +mux-controls: + maxItems: 1 + required: +- mux-controls +else: + required: +- interrupts properties: compatible: @@ -44,12 +59,16 @@ properties: - const: snps,dw-apb-ssi - description: Intel Keem Bay SPI Controller const: intel,keembay-ssi + - description: Baikal-T1 SPI Controller +const: baikal,bt1-ssi + - description: Baikal-T1 System Boot SPI Controller +const: baikal,bt1-sys-ssi reg: minItems: 1 items: - description: DW APB SSI controller memory mapped registers - - description: SPI MST region map + - description: SPI MST region map or directly mapped SPI ROM interrupts: maxItems: 1 @@ -114,7 +133,6 @@ required: - reg - "#address-cells" - "#size-cells" - - interrupts - clocks examples: @@ -130,4 +148,15 @@ examples: cs-gpios = <&gpio0 13 0>, <&gpio0 14 0>; }; + - | +spi@1f040100 { + compatible = "baikal,bt1-sys-ssi"; + reg = <0x1f040100 0x900>, +<0x1c00 0x100>; + #address-cells = <1>; + #size-cells = <0>; + mux-controls = <&boot_mux>; + clocks = <&ccu_sys>; + clock-names = "ssi_clk"; +}; ... -- 2.27.0
[PATCH 13/30] spi: dw: Update SPI bus speed in a config function
The SPI bus speed update functionality will be useful in other parts of the driver too (for instance, to implement the SPI memory operations, and from the DW SPI glue layers). Let's move it to the update_cr0() method then, and since the latter now updates more than just the CTRLR0 register, alter its prototype to have a generic function name not related to CR0. Leave the over-long line with the chip->clk_div setting as is for now, since it's going to be changed later anyway. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 28 ++-- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index a9644351d75f..0b80a16d9872 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -251,8 +251,8 @@ static u32 dw_spi_get_cr0(struct dw_spi *dws, struct spi_device *spi) return cr0; } -static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi, - struct spi_transfer *transfer) +static void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, +struct spi_transfer *transfer) { struct chip_data *chip = spi_get_ctldata(spi); u32 cr0 = chip->cr0; @@ -265,6 +265,17 @@ static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi, cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET; dw_writel(dws, DW_SPI_CTRLR0, cr0); + + /* Handle per transfer options for bpw and speed */ + if (transfer->speed_hz != dws->current_freq) { + if (transfer->speed_hz != chip->speed_hz) { + /* clk_div doesn't support odd number */ + chip->clk_div = (DIV_ROUND_UP(dws->max_freq, transfer->speed_hz) + 1) & 0xfffe; + chip->speed_hz = transfer->speed_hz; + } + dws->current_freq = transfer->speed_hz; + spi_set_clk(dws, chip->clk_div); + } } static int dw_spi_transfer_one(struct spi_controller *master, @@ -289,21 +300,10 @@ static int dw_spi_transfer_one(struct spi_controller *master, spi_enable_chip(dws, 0); - /* Handle per transfer options for bpw and speed */ - if (transfer->speed_hz != dws->current_freq) { - if (transfer->speed_hz != chip->speed_hz) { - /* clk_div doesn't support odd number */ - chip->clk_div = (DIV_ROUND_UP(dws->max_freq, transfer->speed_hz) + 1) & 0xfffe; - chip->speed_hz = transfer->speed_hz; - } - dws->current_freq = transfer->speed_hz; - spi_set_clk(dws, chip->clk_div); - } + dw_spi_update_config(dws, spi, transfer); transfer->effective_speed_hz = dws->max_freq / chip->clk_div; - dw_spi_update_cr0(dws, spi, transfer); - /* Check if current transfer is a DMA transaction */ if (master->can_dma && master->can_dma(master, spi, transfer)) dws->dma_mapped = master->cur_msg_mapped; -- 2.27.0
[PATCH 26/30] spi: dw: Add memory operations support
Aside from the synchronous Tx-Rx mode, which has been utilized to create the normal SPI transfers in the framework of the DW SSI driver, the DW SPI controller supports Tx-only and EEPROM-read modes. The former one just enables the controller to transmit all the data from the Tx FIFO, ignoring anything retrieved from the MISO lane. The latter mode is a so-called write-then-read operation: the DW SPI controller first pushes out all the data from the Tx FIFO, after which it'll automatically receive as much data as has been specified by means of the CTRLR1 register. Both of those modes can be used to implement the memory operations supported by the SPI-memory subsystem. The memory operation implementation is pretty much straightforward, except for a few peculiarities we have had to take into account to make things work. Since the DW SPI controller doesn't provide a way to directly set and clear the native CS lane level, but instead automatically de-asserts it when no transfer is going on, we have to make sure the Tx FIFO isn't empty during the entire Tx procedure. In addition we also need to read data from the Rx FIFO as fast as possible to prevent its overflow with the automatically fetched incoming traffic. The denoted peculiarities cause even more problems if the DW SSI controller is equipped with a relatively small FIFO and is connected to a relatively slow system bus (APB) (with respect to the SPI bus speed). In order to work around the problems as much as possible, the memory operation execution procedure collects all the Tx data into a single buffer and disables the local IRQs to speed the write-then-optionally-read method up. Note the provided memory operations are utilized by default only if a glue driver hasn't provided a custom version of them and this is not a DW APB SSI controller with the fixed automatic CS toggle functionality. Co-developed-by: Ramil Zaripov Signed-off-by: Ramil Zaripov Signed-off-by: Serge Semin --- drivers/spi/Kconfig | 1 + drivers/spi/spi-dw-core.c | 300 ++ drivers/spi/spi-dw.h | 13 ++ 3 files changed, 314 insertions(+) diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig index c6ea760ea5f0..1f70bb1e7fa9 100644 --- a/drivers/spi/Kconfig +++ b/drivers/spi/Kconfig @@ -235,6 +235,7 @@ config SPI_DAVINCI config SPI_DESIGNWARE tristate "DesignWare SPI controller core support" + imply SPI_MEM help general driver for SPI controller core from DesignWare diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 77d61ded9256..ca22f427d82d 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -8,10 +8,13 @@ #include #include #include +#include #include #include #include #include +#include +#include #include #include "spi-dw.h" @@ -401,6 +404,300 @@ static void dw_spi_handle_err(struct spi_controller *master, spi_reset_chip(dws); } +static int dw_spi_adjust_mem_op_size(struct spi_mem *mem, struct spi_mem_op *op) +{ + if (op->data.dir == SPI_MEM_DATA_IN) + op->data.nbytes = clamp_val(op->data.nbytes, 0, SPI_NDF_MASK + 1); + + return 0; +} + +static bool dw_spi_supports_mem_op(struct spi_mem *mem, + const struct spi_mem_op *op) +{ + if (op->data.buswidth > 1 || op->addr.buswidth > 1 || + op->dummy.buswidth > 1 || op->cmd.buswidth > 1) + return false; + + return spi_mem_default_supports_op(mem, op); +} + +static int dw_spi_init_mem_buf(struct dw_spi *dws, const struct spi_mem_op *op) +{ + unsigned int i, j, len; + u8 *out; + + /* +* Calculate the total length of the EEPROM command transfer and +* either use the pre-allocated buffer or create a temporary one. 
+*/ + len = op->cmd.nbytes + op->addr.nbytes + op->dummy.nbytes; + if (op->data.dir == SPI_MEM_DATA_OUT) + len += op->data.nbytes; + + if (len <= SPI_BUF_SIZE) { + out = dws->buf; + } else { + out = kzalloc(len, GFP_KERNEL); + if (!out) + return -ENOMEM; + } + + /* +* Collect the operation code, address and dummy bytes into the single +* buffer. If it's a transfer with data to be sent, also copy it into the +* single buffer in order to speed the data transmission up. +*/ + for (i = 0; i < op->cmd.nbytes; ++i) + out[i] = SPI_GET_BYTE(op->cmd.opcode, op->cmd.nbytes - i - 1); + for (j = 0; j < op->addr.nbytes; ++i, ++j) + out[i] = SPI_GET_BYTE(op->addr.val, op->addr.nbytes - j - 1); + for (j = 0; j < op->dummy.nbytes; ++i, ++j) + out[i] = 0x0; + + if (op->data.dir == SPI_MEM_DATA_OUT) + memcpy(&out[i], op->data.buf.out, op->data.nbytes); + + dws->n_bytes = 1; + dws->tx = out; + dws->tx_len = len; + if (op->data.dir
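The cmd/addr/dummy serialization loop is self-contained enough to demonstrate in user space. The sketch below re-implements a SPI_GET_BYTE()-style macro locally (same byte-picking idea as the driver's) and prints the wire order for a hypothetical operation with a 1-byte opcode, 3-byte address and one dummy byte: "0b 12 34 56 00", i.e. everything goes out MSB-first.

#include <stdio.h>
#include <stdint.h>

#define BITS_PER_BYTE 8
/* Pick byte _idx of _val, where index 0 is the least significant byte. */
#define SPI_GET_BYTE(_val, _idx) (((_val) >> (BITS_PER_BYTE * (_idx))) & 0xff)

int main(void)
{
	uint16_t opcode = 0x0b;		/* example fast-read opcode */
	uint64_t addr = 0x123456;	/* example 3-byte address */
	unsigned int cmd_nbytes = 1, addr_nbytes = 3, dummy_nbytes = 1;
	uint8_t out[8];
	unsigned int i, j;

	/* Serialize MSB-first, the same way dw_spi_init_mem_buf() does. */
	for (i = 0; i < cmd_nbytes; ++i)
		out[i] = SPI_GET_BYTE(opcode, cmd_nbytes - i - 1);
	for (j = 0; j < addr_nbytes; ++i, ++j)
		out[i] = SPI_GET_BYTE(addr, addr_nbytes - j - 1);
	for (j = 0; j < dummy_nbytes; ++i, ++j)
		out[i] = 0x0;

	for (j = 0; j < i; ++j)
		printf("%02x ", out[j]);
	printf("\n");

	return 0;
}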
[PATCH 24/30] spi: dw: Move num-of retries parameter to the header file
The parameter will be needed for another wait-done method being added in the framework of the SPI memory operation modification in a further commit. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-dma.c | 5 ++--- drivers/spi/spi-dw.h | 2 ++ 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index bb390ff67d1d..9db119dc5554 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -17,7 +17,6 @@ #include "spi-dw.h" -#define WAIT_RETRIES 5 #define RX_BUSY0 #define RX_BURST_LEVEL 16 #define TX_BUSY1 @@ -208,7 +207,7 @@ static inline bool dw_spi_dma_tx_busy(struct dw_spi *dws) static int dw_spi_dma_wait_tx_done(struct dw_spi *dws, struct spi_transfer *xfer) { - int retry = WAIT_RETRIES; + int retry = SPI_WAIT_RETRIES; struct spi_delay delay; u32 nents; @@ -283,7 +282,7 @@ static inline bool dw_spi_dma_rx_busy(struct dw_spi *dws) static int dw_spi_dma_wait_rx_done(struct dw_spi *dws) { - int retry = WAIT_RETRIES; + int retry = SPI_WAIT_RETRIES; struct spi_delay delay; unsigned long ns, us; u32 nents; diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index eb1d46983319..946065201c9c 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -100,6 +100,8 @@ #define SPI_DMA_RDMAE (1 << 0) #define SPI_DMA_TDMAE (1 << 1) +#define SPI_WAIT_RETRIES 5 + enum dw_ssi_type { SSI_MOTO_SPI = 0, SSI_TI_SSP, -- 2.27.0
[PATCH 23/30] spi: dw: Explicitly de-assert CS on SPI transfer completion
By design of the currently available native set_cs callback, the CS de-assertion will be done only if it's required by the corresponding controller capability. But in order to pre-fill the Tx FIFO buffer with data during the SPI memory ops execution, the SER register needs to be left cleared before that. We'll also need a way to explicitly set and clear the corresponding CS bit at a certain moment of the operation. Let's alter the set_cs function then to also de-activate the CS when it's required. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 65db4dd3ea8a..7a25ea6f4af6 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -100,7 +100,7 @@ void dw_spi_set_cs(struct spi_device *spi, bool enable) */ if (cs_high == enable) dw_writel(dws, DW_SPI_SER, BIT(spi->chip_select)); - else if (dws->caps & DW_SPI_CAP_CS_OVERRIDE) + else dw_writel(dws, DW_SPI_SER, 0); } EXPORT_SYMBOL_GPL(dw_spi_set_cs); -- 2.27.0
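Condensed into a sketch, the CS choreography this change enables (assuming an active-low chip-select) would presumably be the following. This is illustrative pseudo-use of the driver helpers put together from the description above, not literal code from the series:

/* Illustrative only: the write-then-read CS flow the mem-ops rely on. */
static void dw_spi_mem_op_cs_flow(struct dw_spi *dws, struct spi_device *spi)
{
	spi_reset_chip(dws);		/* SER == 0: CS idle, nothing shifts out */
	dw_writer(dws);			/* pre-fill the Tx FIFO while CS is idle */
	dw_spi_set_cs(spi, false);	/* set the SER bit: CS asserts, Tx starts */

	/* ...keep the Tx FIFO non-empty, drain the Rx FIFO, wait for idle... */

	dw_spi_set_cs(spi, true);	/* clear the SER bit: CS de-asserts */
}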
[PATCH 28/30] spi: dw: Add poll-based SPI transfers support
A functionality of the poll-based transfer has been removed by commit 1ceb09717e98 ("spi: dw: remove cs_control and poll_mode members from chip_data") with a justification that "there is no user of one anymore". It turns out one of our DW APB SSI cores is synthesized with no IRQ line attached, and the only possible way of using it is to implement a poll-based SPI transfer procedure. So we have to get the removed functionality back, but with some alterations described below. First of all, the poll-based transfer is activated only if the DW SPI controller doesn't have an IRQ line attached and the Linux IRQ number is initialized with the IRQ_NOTCONNECTED value. Secondly, the transfer procedure is now executed with a delay performed between the writer and reader methods. The delay value is calculated based on the number of data words expected to be received on the current iteration. Finally, the error status is checked on each iteration. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 40 ++- 1 file changed, 39 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 7b901226fd38..cd01225a0f81 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -343,6 +343,42 @@ static void dw_spi_irq_setup(struct dw_spi *dws) dws->transfer_handler = dw_spi_transfer_handler; } +/* + * The iterative procedure of the poll-based transfer is simple: write as much + * as possible to the Tx FIFO, wait until the pending to receive data is ready + * to be read, read it from the Rx FIFO and check whether the performed + * procedure has been successful. + * + * Note this method, the same way as the IRQ-based transfer, won't work well + * for the SPI devices connected to the controller with native CS due to the + * automatic CS assertion/de-assertion. + */ +static int dw_spi_poll_transfer(struct dw_spi *dws, + struct spi_transfer *transfer) +{ + struct spi_delay delay; + u16 nbits; + int ret; + + delay.unit = SPI_DELAY_UNIT_SCK; + nbits = dws->n_bytes * BITS_PER_BYTE; + + do { + dw_writer(dws); + + delay.value = nbits * (dws->rx_len - dws->tx_len); + spi_delay_exec(&delay, transfer); + + dw_reader(dws); + + ret = dw_spi_check_status(dws, true); + if (ret) + return ret; + } while (dws->rx_len); + + return 0; +} + static int dw_spi_transfer_one(struct spi_controller *master, struct spi_device *spi, struct spi_transfer *transfer) { @@ -387,6 +423,8 @@ static int dw_spi_transfer_one(struct spi_controller *master, if (dws->dma_mapped) return dws->dma_ops->dma_transfer(dws, transfer); + else if (dws->irq == IRQ_NOTCONNECTED) + return dw_spi_poll_transfer(dws, transfer); dw_spi_irq_setup(dws); @@ -795,7 +833,7 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws) ret = request_irq(dws->irq, dw_spi_irq, IRQF_SHARED, dev_name(dev), master); - if (ret < 0) { + if (ret < 0 && ret != -ENOTCONN) { dev_err(dev, "can not get IRQ\n"); goto err_free_master; } -- 2.27.0
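The spi_delay value computed by the poll loop is just one SCLK period per bit of every word pushed out but not yet read back. A quick user-space check of that arithmetic (the frequency and lengths are example numbers):

#include <stdio.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

int main(void)
{
	uint32_t speed_hz = 10000000;	/* effective SCLK rate, example */
	unsigned int n_bytes = 2;	/* 16-bit frames */
	unsigned int rx_len = 64, tx_len = 48;	/* words left to rx/tx */

	unsigned int nbits = n_bytes * BITS_PER_BYTE;
	unsigned int sck_cycles = nbits * (rx_len - tx_len);
	uint64_t ns = (uint64_t)sck_cycles * 1000000000ULL / speed_hz;

	printf("wait %u SCLK cycles ~= %llu ns\n", sck_cycles,
	       (unsigned long long)ns);

	return 0;
}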
[PATCH 20/30] spi: dw: Unmask IRQs after enabling the chip
It's theoretically erroneous to enable IRQ before the chip is turned on. If IRQ handler gets executed before the chip is enabled, then any data written to the Tx FIFO will be just ignored. I say "theoretically" because we haven't noticed any problem with that, but let's fix it anyway just in case... Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 08bc53b9de88..8dbe11c1821c 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -355,8 +355,6 @@ static int dw_spi_transfer_one(struct spi_controller *master, spi_enable_chip(dws, 1); return ret; } - } else { - dw_spi_irq_setup(dws); } spi_enable_chip(dws, 1); @@ -364,6 +362,8 @@ static int dw_spi_transfer_one(struct spi_controller *master, if (dws->dma_mapped) return dws->dma_ops->dma_transfer(dws, transfer); + dw_spi_irq_setup(dws); + return 1; } -- 2.27.0
[PATCH 04/30] Revert: spi: spi-dw: Add lock protect dw_spi rx/tx to prevent concurrent calls
There is no point in having the commit 19b61392c5a8 ("spi: spi-dw: Add lock protect dw_spi rx/tx to prevent concurrent calls") applied. The commit author made an assumption that the problem with the rx data mismatch was due to a lack of data protection, while most likely it was caused by the lack of a memory barrier. So having the commit bfda044533b2 ("spi: dw: use "smp_mb()" to avoid sending spi data error") applied would be enough to fix the problem. Indeed, the spin unlock operation makes sure each memory operation issued before the release is completed before the lock is actually released. In other words it works as an implicit one-way memory barrier. So having both smp_mb() and spin_unlock_irqrestore() here is just redundant; one of them would be enough. It's better to leave the smp_mb(), since the Tx/Rx buffers' consistency is provided by the data transfer algorithm implementation: first we initialize the buffer pointers, then make sure the assignments are visible to the other CPUs by calling smp_mb(), and only after that enable the interrupt, whose handler uses the buffers. Signed-off-by: Serge Semin --- Folks, I also have a doubt whether the SMP memory barrier is required there, because the normal IO methods like readl/writel imply a full memory barrier. So any memory operations performed before them are supposed to be seen by the devices and the other CPUs [1]. So most likely there could have been a problem with the IO implementation on the subject platform, or a spurious interrupt could have been raised during the data initialization. What do you think? Am I missing something? [1] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt, Section "KERNEL I/O BARRIER EFFECTS" --- drivers/spi/spi-dw-core.c | 14 ++ drivers/spi/spi-dw.h | 1 - 2 files changed, 2 insertions(+), 13 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 1af74362461d..18411f5b9954 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -142,11 +142,9 @@ static inline u32 rx_max(struct dw_spi *dws) static void dw_writer(struct dw_spi *dws) { - u32 max; + u32 max = tx_max(dws); u16 txw = 0; - spin_lock(&dws->buf_lock); - max = tx_max(dws); while (max--) { /* Set the tx word if the transfer's original "tx" is not null */ if (dws->tx_end - dws->len) { @@ -158,16 +156,13 @@ static void dw_writer(struct dw_spi *dws) dw_write_io_reg(dws, DW_SPI_DR, txw); dws->tx += dws->n_bytes; } - spin_unlock(&dws->buf_lock); } static void dw_reader(struct dw_spi *dws) { - u32 max; + u32 max = rx_max(dws); u16 rxw; - spin_lock(&dws->buf_lock); - max = rx_max(dws); while (max--) { rxw = dw_read_io_reg(dws, DW_SPI_DR); /* Care rx only if the transfer's original "rx" is not null */ @@ -179,7 +174,6 @@ static void dw_reader(struct dw_spi *dws) } dws->rx += dws->n_bytes; } - spin_unlock(&dws->buf_lock); } static void int_error_stop(struct dw_spi *dws, const char *msg) @@ -291,21 +285,18 @@ static int dw_spi_transfer_one(struct spi_controller *master, { struct dw_spi *dws = spi_controller_get_devdata(master); struct chip_data *chip = spi_get_ctldata(spi); - unsigned long flags; u8 imask = 0; u16 txlevel = 0; u32 cr0; int ret; dws->dma_mapped = 0; - spin_lock_irqsave(&dws->buf_lock, flags); dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE); dws->tx = (void *)transfer->tx_buf; dws->tx_end = dws->tx + transfer->len; dws->rx = transfer->rx_buf; dws->rx_end = dws->rx + transfer->len; dws->len = transfer->len; - spin_unlock_irqrestore(&dws->buf_lock, flags); /* Ensure 
dw->rx and dw->rx_end are visible */ smp_mb(); @@ -464,7 +455,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws) dws->master = master; dws->type = SSI_MOTO_SPI; dws->dma_addr = (dma_addr_t)(dws->paddr + DW_SPI_DR); - spin_lock_init(&dws->buf_lock); spi_controller_set_devdata(master, dws); diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index 51bab30b9f85..1ab704d1ebd8 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -131,7 +131,6 @@ struct dw_spi { size_t len; void*tx; void*tx_end; - spinlock_t buf_lock; void*rx; void*rx_end; int dma_mapped; -- 2.27.0
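For what it's worth, here is a user-space analogy of the ordering argument, with a C11 release store standing in for the smp_mb() plus the IRQ-unmasking register write (the driver obviously doesn't use pthreads; this only models the publish/consume pattern that makes the spinlock redundant):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static int buf[4];
static int *tx, *rx;
static atomic_int ready;

static void *irq_handler(void *arg)
{
	(void)arg;
	/* Spin until the transfer is "started" (the IRQ is "unmasked"). */
	while (!atomic_load_explicit(&ready, memory_order_acquire))
		;
	/* The acquire load guarantees tx/rx are seen fully initialized. */
	printf("handler sees tx=%p rx=%p\n", (void *)tx, (void *)rx);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, irq_handler, NULL);

	tx = &buf[0];
	rx = &buf[2];
	/* Publish: everything stored above is visible before "ready" is. */
	atomic_store_explicit(&ready, 1, memory_order_release);

	pthread_join(t, NULL);
	return 0;
}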
[PATCH 06/30] spi: dw: Disable all IRQs when controller is unused
It's a good practice to disable all IRQs if a device is fully unused. In our case it is supposed to be done before requesting the IRQ and after the last byte of an SPI transfer is received. In the former case it's required to prevent the IRQ handler invocation before the driver data is fully initialized (which may happen if the IRQ status has been left uncleared before the device is probed). So we just move the spi_hw_init() method invocation to an earlier stage, before requesting the IRQ. In the latter case there is just no point in having any of the IRQs enabled between SPI transfers and when there is no SPI message currently being processed. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 18411f5b9954..be94ed5bb896 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -198,7 +198,7 @@ static irqreturn_t interrupt_transfer(struct dw_spi *dws) dw_reader(dws); if (dws->rx_end == dws->rx) { - spi_mask_intr(dws, SPI_INT_TXEI); + spi_mask_intr(dws, 0xff); spi_finalize_current_transfer(dws->master); return IRQ_HANDLED; } @@ -222,7 +222,7 @@ static irqreturn_t dw_spi_irq(int irq, void *dev_id) return IRQ_NONE; if (!master->cur_msg) { - spi_mask_intr(dws, SPI_INT_TXEI); + spi_mask_intr(dws, 0xff); return IRQ_HANDLED; } @@ -458,6 +458,9 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws) spi_controller_set_devdata(master, dws); + /* Basic HW init */ + spi_hw_init(dev, dws); + ret = request_irq(dws->irq, dw_spi_irq, IRQF_SHARED, dev_name(dev), master); if (ret < 0) { @@ -485,9 +488,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws) device_property_read_u32(dev, "rx-sample-delay-ns", &dws->def_rx_sample_dly_ns); - /* Basic HW init */ - spi_hw_init(dev, dws); - if (dws->dma_ops && dws->dma_ops->dma_init) { ret = dws->dma_ops->dma_init(dev, dws); if (ret) { -- 2.27.0
[PATCH 25/30] spi: dw: Add generic DW SSI status-check method
The DW SSI error handling method can be generically implemented for all types of transfers: IRQ, DMA and poll-based ones. It will be a function which checks the overflow/underflow error flags and resets the controller if any of them is set. In the framework of this commit we make use of the new method to detect errors in the IRQ- and DMA-based SPI transfer execution procedures. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 43 +++ drivers/spi/spi-dw-dma.c | 11 ++ drivers/spi/spi-dw.h | 1 + 3 files changed, 37 insertions(+), 18 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 7a25ea6f4af6..77d61ded9256 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -169,23 +169,48 @@ static void dw_reader(struct dw_spi *dws) } } -static void int_error_stop(struct dw_spi *dws, const char *msg) +int dw_spi_check_status(struct dw_spi *dws, bool raw) { - spi_reset_chip(dws); + u32 irq_status; + int ret = 0; + + if (raw) + irq_status = dw_readl(dws, DW_SPI_RISR); + else + irq_status = dw_readl(dws, DW_SPI_ISR); + + if (irq_status & SPI_INT_RXOI) { + dev_err(&dws->master->dev, "RX FIFO overflow detected\n"); + ret = -EIO; + } + + if (irq_status & SPI_INT_RXUI) { + dev_err(&dws->master->dev, "RX FIFO underflow detected\n"); + ret = -EIO; + } - dev_err(&dws->master->dev, "%s\n", msg); - dws->master->cur_msg->status = -EIO; - spi_finalize_current_transfer(dws->master); + if (irq_status & SPI_INT_TXOI) { + dev_err(&dws->master->dev, "TX FIFO overflow detected\n"); + ret = -EIO; + } + + /* Generically handle the erroneous situation */ + if (ret) { + spi_reset_chip(dws); + if (dws->master->cur_msg) + dws->master->cur_msg->status = ret; + } + + return ret; } +EXPORT_SYMBOL_GPL(dw_spi_check_status); static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws) { u16 irq_status = dw_readl(dws, DW_SPI_ISR); - /* Error handling */ - if (irq_status & (SPI_INT_TXOI | SPI_INT_RXOI | SPI_INT_RXUI)) { - dw_readl(dws, DW_SPI_ICR); - int_error_stop(dws, "interrupt_transfer: fifo overrun/underrun"); + if (dw_spi_check_status(dws, false)) { + spi_finalize_current_transfer(dws->master); return IRQ_HANDLED; } diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c index 9db119dc5554..1969b09b4f5e 100644 --- a/drivers/spi/spi-dw-dma.c +++ b/drivers/spi/spi-dw-dma.c @@ -144,17 +144,10 @@ static void dw_spi_dma_exit(struct dw_spi *dws) static irqreturn_t dw_spi_dma_transfer_handler(struct dw_spi *dws) { - u16 irq_status = dw_readl(dws, DW_SPI_ISR); + dw_spi_check_status(dws, false); - if (!irq_status) - return IRQ_NONE; - - dw_readl(dws, DW_SPI_ICR); - spi_reset_chip(dws); - - dev_err(&dws->master->dev, "%s: FIFO overrun/underrun\n", __func__); - dws->master->cur_msg->status = -EIO; complete(&dws->dma_completion); + return IRQ_HANDLED; } diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h index 946065201c9c..5eb98ece2f2a 100644 --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -261,6 +261,7 @@ static inline void spi_shutdown_chip(struct dw_spi *dws) extern void dw_spi_set_cs(struct spi_device *spi, bool enable); extern void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, struct dw_spi_cfg *cfg); +extern int dw_spi_check_status(struct dw_spi *dws, bool raw); extern int dw_spi_add_host(struct device *dev, struct dw_spi *dws); extern void dw_spi_remove_host(struct dw_spi *dws); extern int dw_spi_suspend_host(struct dw_spi *dws); -- 2.27.0
[PATCH 15/30] spi: dw: Update Rx sample delay in the config function
The Rx sample delay can be SPI-device specific, and should be initialized synchronously with the rest of the communication and peripheral-device-related controller setup. So let's move the Rx sample delay setup into the DW APB SSI configuration update method. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 92138a6ada12..f00fc4828480 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -273,13 +273,18 @@ static void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi, spi_set_clk(dws, clk_div); dws->current_freq = speed_hz; } + + /* Update RX sample delay if required */ + if (dws->cur_rx_sample_dly != chip->rx_sample_dly) { + dw_writel(dws, DW_SPI_RX_SAMPLE_DLY, chip->rx_sample_dly); + dws->cur_rx_sample_dly = chip->rx_sample_dly; + } } static int dw_spi_transfer_one(struct spi_controller *master, struct spi_device *spi, struct spi_transfer *transfer) { struct dw_spi *dws = spi_controller_get_devdata(master); - struct chip_data *chip = spi_get_ctldata(spi); u8 imask = 0; u16 txlevel = 0; int ret; @@ -305,12 +310,6 @@ static int dw_spi_transfer_one(struct spi_controller *master, if (master->can_dma && master->can_dma(master, spi, transfer)) dws->dma_mapped = master->cur_msg_mapped; - /* Update RX sample delay if required */ - if (dws->cur_rx_sample_dly != chip->rx_sample_dly) { - dw_writel(dws, DW_SPI_RX_SAMPLE_DLY, chip->rx_sample_dly); - dws->cur_rx_sample_dly = chip->rx_sample_dly; - } - /* For poll mode just disable all interrupts */ spi_mask_intr(dws, 0xff); -- 2.27.0
[PATCH 21/30] spi: dw: Discard chip enabling on DMA setup error
It's pointless to enable the chip back if the DMA setup procedure fails, since we'll disable it on the next transfer anyway. For the same reason we don't do that in case of a failure detected in any other method called from the transfer_one() method. While at it, consider any non-zero value returned from the dma_setup callback to be erroneous, as is the convention in the kernel. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 8dbe11c1821c..65db4dd3ea8a 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -351,10 +351,8 @@ static int dw_spi_transfer_one(struct spi_controller *master, if (dws->dma_mapped) { ret = dws->dma_ops->dma_setup(dws, transfer); - if (ret < 0) { - spi_enable_chip(dws, 1); + if (ret) return ret; - } } spi_enable_chip(dws, 1); -- 2.27.0
[PATCH 11/30] spi: dw: Add DWC SSI capability
Currently the DWC SSI core is supported by means of setting up the core-specific update_cr0() callback. It isn't suitable for multiple reasons. First of all, having exported several methods doing the same thing but for different chips makes the code harder to maintain. Secondly, the spi-dw-core driver exports the methods, then the spi-dw-mmio driver sets the private data callback with one of them so it can be called by the core driver again. That makes the code logic too complicated. Thirdly, using callbacks for just updating the CR0 register is problematic, since if the register needed to be updated from different parts of the code, we'd have to create another callback (for instance the SPI device-specific parameters don't need to be calculated each time the SPI transfer is submitted, so it's better to pre-calculate the CR0 data at the SPI-device setup stage). So, keeping all of the above in mind, let's discard the update_cr0() callbacks, define a generic and static dw_spi_update_cr0() method and create the DW_SPI_CAP_DWC_SSI capability, which when enabled would activate the alternative CR0 register layout. Signed-off-by: Serge Semin --- drivers/spi/spi-dw-core.c | 69 --- drivers/spi/spi-dw-mmio.c | 20 ++-- drivers/spi/spi-dw.h | 9 + 3 files changed, 23 insertions(+), 75 deletions(-) diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c index 8f9737640ec1..c21641a485ce 100644 --- a/drivers/spi/spi-dw-core.c +++ b/drivers/spi/spi-dw-core.c @@ -228,60 +228,33 @@ static irqreturn_t dw_spi_irq(int irq, void *dev_id) return dws->transfer_handler(dws); } -/* Configure CTRLR0 for DW_apb_ssi */ -u32 dw_spi_update_cr0(struct spi_controller *master, struct spi_device *spi, - struct spi_transfer *transfer) +static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi, + struct spi_transfer *transfer) { struct chip_data *chip = spi_get_ctldata(spi); u32 cr0; - /* Default SPI mode is SCPOL = 0, SCPH = 0 */ - cr0 = (transfer->bits_per_word - 1) - | (SSI_MOTO_SPI << SPI_FRF_OFFSET) - | ((((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET) | - (((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET) | - (((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET)) - | (chip->tmode << SPI_TMOD_OFFSET); - - return cr0; -} -EXPORT_SYMBOL_GPL(dw_spi_update_cr0); - -/* Configure CTRLR0 for DWC_ssi */ -u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master, -struct spi_device *spi, -struct spi_transfer *transfer) -{ - struct dw_spi *dws = spi_controller_get_devdata(master); - struct chip_data *chip = spi_get_ctldata(spi); - u32 cr0; - - /* CTRLR0[ 4: 0] Data Frame Size */ cr0 = (transfer->bits_per_word - 1); - /* CTRLR0[ 7: 6] Frame Format */ - cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET; - - /* -* SPI mode (SCPOL|SCPH) -* CTRLR0[ 8] Serial Clock Phase -* CTRLR0[ 9] Serial Clock Polarity -*/ - cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << DWC_SSI_CTRLR0_SCPOL_OFFSET; - cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << DWC_SSI_CTRLR0_SCPH_OFFSET; - - /* CTRLR0[11:10] Transfer Mode */ - cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET; - - /* CTRLR0[13] Shift Register Loop */ - cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET; - - if (dws->caps & DW_SPI_CAP_KEEMBAY_MST) - cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST; + if (!(dws->caps & DW_SPI_CAP_DWC_SSI)) { + cr0 |= SSI_MOTO_SPI << SPI_FRF_OFFSET; + cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET; + cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET; + cr0 |= ((spi->mode & SPI_LOOP) ? 
1 : 0) << SPI_SRL_OFFSET; + cr0 |= chip->tmode << SPI_TMOD_OFFSET; + } else { + cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET; + cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << DWC_SSI_CTRLR0_SCPOL_OFFSET; + cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << DWC_SSI_CTRLR0_SCPH_OFFSET; + cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET; + cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET; + + if (dws->caps & DW_SPI_CAP_KEEMBAY_MST) + cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST; + } - return cr0; + dw_writel(dws, DW_SPI_CTRLR0, cr0); } -EXPORT_SYMBOL_GPL(dw_spi_update_cr0_v1_01a); static int dw_spi_transfer_one(struct spi_controller *master, struct spi_device *spi, struct spi_transfer *transfer) @@ -290,7 +263,6 @@ static int dw_spi_transfer_one(struct spi_controller *master, struct chip_data *chip = spi_get
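The two CTRLR0 layouts differ only in where the fields sit, which a stand-alone sketch can show. The offset values below are copied from spi-dw.h so the program compiles on its own; for SPI mode 3 at 8 bits per word it prints 0x000000c7 for DW APB SSI versus 0x00000307 for DWC SSI (the SRL and Keem Bay bits are left out for brevity).

#include <stdio.h>
#include <stdint.h>

/* Field offsets as defined in spi-dw.h for the two layouts. */
#define SPI_FRF_OFFSET			4
#define SPI_SCPH_OFFSET			6
#define SPI_SCOL_OFFSET			7
#define SPI_TMOD_OFFSET			8

#define DWC_SSI_CTRLR0_FRF_OFFSET	6
#define DWC_SSI_CTRLR0_SCPH_OFFSET	8
#define DWC_SSI_CTRLR0_SCPOL_OFFSET	9
#define DWC_SSI_CTRLR0_TMOD_OFFSET	10

#define SSI_MOTO_SPI	0
#define SPI_TMOD_TR	0
#define SPI_CPHA	0x01
#define SPI_CPOL	0x02

static uint32_t build_cr0(int dwc_ssi, unsigned int bpw, unsigned int mode,
			  unsigned int tmode)
{
	uint32_t cr0 = bpw - 1;	/* DFS always lives in the low bits */

	if (!dwc_ssi) {
		cr0 |= SSI_MOTO_SPI << SPI_FRF_OFFSET;
		cr0 |= ((mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET;
		cr0 |= ((mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET;
		cr0 |= tmode << SPI_TMOD_OFFSET;
	} else {
		cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET;
		cr0 |= ((mode & SPI_CPOL) ? 1 : 0) << DWC_SSI_CTRLR0_SCPOL_OFFSET;
		cr0 |= ((mode & SPI_CPHA) ? 1 : 0) << DWC_SSI_CTRLR0_SCPH_OFFSET;
		cr0 |= tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
	}

	return cr0;
}

int main(void)
{
	unsigned int mode3 = SPI_CPOL | SPI_CPHA;

	printf("DW APB SSI CTRLR0: 0x%08x\n", build_cr0(0, 8, mode3, SPI_TMOD_TR));
	printf("DWC SSI CTRLR0: 0x%08x\n", build_cr0(1, 8, mode3, SPI_TMOD_TR));
	return 0;
}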
[PATCH 1/2] lib: devres: provide devm_iounremap_resource()
Drivers don't have a single helper function to release memory allocated by devm_ioremap_resource(). That means they need to call devm_release_mem_region() and devm_iounmap() respectively for the memory release. This patch creates a helper, devm_iounremap_resource(), to combine the above operations. Signed-off-by: pierre Kuo --- include/linux/device.h | 2 ++ lib/devres.c | 25 + 2 files changed, 27 insertions(+) diff --git a/include/linux/device.h b/include/linux/device.h index 9e6ea8931a52..33ec7e54c1a9 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -240,6 +240,8 @@ void devm_free_pages(struct device *dev, unsigned long addr); void __iomem *devm_ioremap_resource(struct device *dev, const struct resource *res); +void devm_iounremap_resource(struct device *dev, +const struct resource *res, void __iomem *addr); void __iomem *devm_ioremap_resource_wc(struct device *dev, const struct resource *res); diff --git a/lib/devres.c b/lib/devres.c index ebb1573d9ae3..cdda0cd0a263 100644 --- a/lib/devres.c +++ b/lib/devres.c @@ -113,6 +113,31 @@ void devm_iounmap(struct device *dev, void __iomem *addr) } EXPORT_SYMBOL(devm_iounmap); +/** + * devm_iounremap_resource() - release mem region and unmap address + * @dev: generic device to handle the resource for + * @res: resource of mem region to be released + * @addr: address to unmap + * + * Release memory region and unmap address. + */ +void devm_iounremap_resource(struct device *dev, +const struct resource *res, void __iomem *addr) +{ + resource_size_t size; + + BUG_ON(!dev); + if (!res || resource_type(res) != IORESOURCE_MEM) { + dev_err(dev, "invalid resource\n"); + return; + } + + size = resource_size(res); + devm_release_mem_region(dev, res->start, size); + devm_iounmap(dev, addr); +} +EXPORT_SYMBOL(devm_iounremap_resource); + static void __iomem * __devm_ioremap_resource(struct device *dev, const struct resource *res, enum devm_ioremap_type type) -- 2.17.1
[PATCH 2/2] driver core: platform: provide devm_platform_iounremap_resource
Combine platform_get_resource() and devm_iounremap_resource() to release the iomem allocated by devm_platform_get_and_ioremap_resource(). Signed-off-by: pierre Kuo --- drivers/base/platform.c | 24 include/linux/platform_device.h | 4 2 files changed, 28 insertions(+) diff --git a/drivers/base/platform.c b/drivers/base/platform.c index e5d8a0503b4f..e2655c00873f 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -84,6 +84,30 @@ devm_platform_get_and_ioremap_resource(struct platform_device *pdev, } EXPORT_SYMBOL_GPL(devm_platform_get_and_ioremap_resource); +/** + * devm_platform_iounremap_resource - call devm_iounremap_resource() for a + * platform device with memory that addr points to. + * + * @pdev: platform device to use both for memory resource lookup as well as + *resource management + * @index: resource index + * @addr: address to be unmapped + */ +void +devm_platform_iounremap_resource(struct platform_device *pdev, +unsigned int index, void __iomem *addr) +{ + struct resource *r; + + r = platform_get_resource(pdev, IORESOURCE_MEM, index); + if (!r) + dev_err(&pdev->dev, + "MEM resource index %d not found\n", index); + else + devm_iounremap_resource(&pdev->dev, r, addr); +} +EXPORT_SYMBOL_GPL(devm_platform_iounremap_resource); + /** * devm_platform_ioremap_resource - call devm_ioremap_resource() for a platform * device diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h index 77a2aada106d..75da15937679 100644 --- a/include/linux/platform_device.h +++ b/include/linux/platform_device.h @@ -67,6 +67,10 @@ devm_platform_ioremap_resource_wc(struct platform_device *pdev, extern void __iomem * devm_platform_ioremap_resource_byname(struct platform_device *pdev, const char *name); +extern void +devm_platform_iounremap_resource(struct platform_device *pdev, +unsigned int index, +void __iomem *addr); extern int platform_get_irq(struct platform_device *, unsigned int); extern int platform_get_irq_optional(struct platform_device *, unsigned int); extern int platform_irq_count(struct platform_device *); -- 2.17.1
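A hypothetical consumer of the two new helpers might look like the sketch below (the "foo" names and the chip-ID access are illustrative): map the region, do a one-off access, then hand both the mapping and the region back early instead of holding them until the device is unbound.

#include <linux/err.h>
#include <linux/io.h>
#include <linux/platform_device.h>

static int foo_probe(struct platform_device *pdev)
{
	struct resource *res;
	void __iomem *base;
	u32 id;

	base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
	if (IS_ERR(base))
		return PTR_ERR(base);

	id = readl(base);	/* one-off access, e.g. a chip-ID read */

	/* Release both the mem region and the mapping right away... */
	devm_iounremap_resource(&pdev->dev, res, base);
	/* ...or let the platform wrapper look the resource up itself:
	 * devm_platform_iounremap_resource(pdev, 0, base); */

	dev_info(&pdev->dev, "chip id %#x\n", id);
	return 0;
}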
[PATCH 30/30] spi: dw: Add Baikal-T1 SPI Controller glue driver
Baikal-T1 is equipped with three DW APB SSI-based MMIO SPI controllers. Two of them are pretty much normal: with IRQ, DMA, FIFOs of 64 words depth and 4x CSs, but the third one, being a part of the Baikal-T1 System Boot Controller, has got very limited resources: no IRQ, no DMA, only a single native chip-select and a Tx/Rx FIFO of just 8 words depth. In order to provide transparent initial boot code execution, the Boot SPI controller is also utilized by a vendor-specific IP-block, which exposes an SPI flash direct mapping interface. Since direct mapping and normal SPI controller utilization are mutually exclusive, only one of these interfaces can be used to access an external SPI slave device. That's why a dedicated mux is embedded into the System Boot Controller. All of that is taken into account in the Baikal-T1-specific DW APB SSI glue driver, implemented by means of the DW SPI core module. Co-developed-by: Ramil Zaripov Signed-off-by: Ramil Zaripov Signed-off-by: Serge Semin --- drivers/spi/Kconfig | 28 drivers/spi/Makefile | 1 + drivers/spi/spi-dw-bt1.c | 339 +++ 3 files changed, 368 insertions(+) create mode 100644 drivers/spi/spi-dw-bt1.c diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig index 1f70bb1e7fa9..415d57b2057f 100644 --- a/drivers/spi/Kconfig +++ b/drivers/spi/Kconfig @@ -252,6 +252,34 @@ config SPI_DW_MMIO tristate "Memory-mapped io interface driver for DW SPI core" depends on HAS_IOMEM +config SPI_DW_BT1 + tristate "Baikal-T1 SPI driver for DW SPI core" + depends on MIPS_BAIKAL_T1 || COMPILE_TEST + help + Baikal-T1 SoC is equipped with three DW APB SSI-based MMIO SPI + controllers. Two of them are pretty much normal: with IRQ, DMA, + FIFOs of 64 words depth, 4x CSs, but the third one, being a + part of the Baikal-T1 System Boot Controller, has got very + limited resources: no IRQ, no DMA, only a single native + chip-select and a Tx/Rx FIFO of just 8 words depth. + The latter one is normally connected to an external SPI-nor flash + of 128Mb (in general it can be of a bigger size). + +config SPI_DW_BT1_DIRMAP + bool "Directly mapped Baikal-T1 Boot SPI flash support" + depends on SPI_DW_BT1 + select MULTIPLEXER + select MUX_MMIO + help + Directly mapped SPI flash memory is an interface specific to the + Baikal-T1 System Boot Controller. It is a 16MB MMIO region, which + can be used to access a peripheral memory device just by + reading/writing data from/to it. Note that the system APB bus + will stall during each IO from/to the dirmap region until the + operation is finished. So try not to use it concurrently with + time-critical tasks (like the SPI memory operations implemented + in this driver). 
+ endif config SPI_DLN2 diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile index cf955ea803cd..21dc75842aca 100644 --- a/drivers/spi/Makefile +++ b/drivers/spi/Makefile @@ -39,6 +39,7 @@ obj-$(CONFIG_SPI_DLN2)+= spi-dln2.o obj-$(CONFIG_SPI_DESIGNWARE) += spi-dw.o spi-dw-y := spi-dw-core.o spi-dw-$(CONFIG_SPI_DW_DMA)+= spi-dw-dma.o +obj-$(CONFIG_SPI_DW_BT1) += spi-dw-bt1.o obj-$(CONFIG_SPI_DW_MMIO) += spi-dw-mmio.o obj-$(CONFIG_SPI_DW_PCI) += spi-dw-pci.o obj-$(CONFIG_SPI_EFM32)+= spi-efm32.o diff --git a/drivers/spi/spi-dw-bt1.c b/drivers/spi/spi-dw-bt1.c new file mode 100644 index ..f382dfad7842 --- /dev/null +++ b/drivers/spi/spi-dw-bt1.c @@ -0,0 +1,339 @@ +// SPDX-License-Identifier: GPL-2.0-only +// +// Copyright (C) 2020 BAIKAL ELECTRONICS, JSC +// +// Authors: +// Ramil Zaripov +// Serge Semin +// +// Baikal-T1 DW APB SPI and System Boot SPI driver +// + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "spi-dw.h" + +#define BT1_BOOT_DIRMAP0 +#define BT1_BOOT_REGS 1 + +struct dw_spi_bt1 { + struct dw_spi dws; + struct clk *clk; + struct mux_control *mux; + +#ifdef CONFIG_SPI_DW_BT1_DIRMAP + void __iomem*map; + resource_size_t map_len; +#endif +}; +#define to_dw_spi_bt1(_ctlr) \ + container_of(spi_controller_get_devdata(_ctlr), struct dw_spi_bt1, dws) + +typedef int (*dw_spi_bt1_init_cb)(struct platform_device *pdev, + struct dw_spi_bt1 *dwsbt1); + +#ifdef CONFIG_SPI_DW_BT1_DIRMAP + +static int dw_spi_bt1_dirmap_create(struct spi_mem_dirmap_desc *desc) +{ + struct dw_spi_bt1 *dwsbt1 = to_dw_spi_bt1(desc->mem->spi->controller); + + if (!dwsbt1->map || + !dwsbt1->dws.mem
outside repository fatal error
The commit bcf4271d4bc3 ("checkpatch: allow not using -f with files that are in git") in linux-next seems to cause checkpatch to fail on a file containing a patch if that file is not in the directory containing the Linux kernel. Is that intentional? thanks, julia
[PATCH 13/14] misc: mic: drop double zeroing
sg_init_table zeroes its first argument, so the allocation of that argument doesn't have to. The semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/): // @@ expression x,n,flags; @@ x = - kcalloc + kmalloc_array (n,sizeof(struct scatterlist),flags) ... sg_init_table(x,n) // Signed-off-by: Julia Lawall --- drivers/misc/mic/scif/scif_nodeqp.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -u -p a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c --- a/drivers/misc/mic/scif/scif_nodeqp.c +++ b/drivers/misc/mic/scif/scif_nodeqp.c @@ -363,7 +363,7 @@ scif_p2p_setsg(phys_addr_t pa, int page_ struct page *page; int i; - sg = kcalloc(page_cnt, sizeof(struct scatterlist), GFP_KERNEL); + sg = kmalloc_array(page_cnt, sizeof(struct scatterlist), GFP_KERNEL); if (!sg) return NULL; sg_init_table(sg, page_cnt);
[PATCH 08/14] xprtrdma: drop double zeroing
sg_init_table zeroes its first argument, so the allocation of that argument doesn't have to. The semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/): // @@ expression x,n,flags; @@ x = - kcalloc + kmalloc_array (n,sizeof(*x),flags) ... sg_init_table(x,n) // Signed-off-by: Julia Lawall --- net/sunrpc/xprtrdma/frwr_ops.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -u -p a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c --- a/net/sunrpc/xprtrdma/frwr_ops.c +++ b/net/sunrpc/xprtrdma/frwr_ops.c @@ -124,7 +124,7 @@ int frwr_mr_init(struct rpcrdma_xprt *r_ if (IS_ERR(frmr)) goto out_mr_err; - sg = kcalloc(depth, sizeof(*sg), GFP_NOFS); + sg = kmalloc_array(depth, sizeof(*sg), GFP_NOFS); if (!sg) goto out_list_err;
[PATCH 06/14] block: drop double zeroing
sg_init_table zeroes its first argument, so the allocation of that argument doesn't have to. The semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/): // @@ expression x; @@ x = - kzalloc + kmalloc (...) ... sg_init_table(x,...) // Signed-off-by: Julia Lawall --- block/bsg-lib.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -u -p a/block/bsg-lib.c b/block/bsg-lib.c --- a/block/bsg-lib.c +++ b/block/bsg-lib.c @@ -207,7 +207,7 @@ static int bsg_map_buffer(struct bsg_buf BUG_ON(!req->nr_phys_segments); - buf->sg_list = kzalloc(sz, GFP_KERNEL); + buf->sg_list = kmalloc(sz, GFP_KERNEL); if (!buf->sg_list) return -ENOMEM; sg_init_table(buf->sg_list, req->nr_phys_segments);