[PATCH] lib/scatterlist: Avoid a double memset

2020-09-20 Thread Christophe JAILLET
'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to
clear it twice.

Remove the redundant initialization.

Signed-off-by: Christophe JAILLET 
---
 lib/scatterlist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 5d63a8857f36..d94628fa3349 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -504,7 +504,7 @@ struct scatterlist *sgl_alloc_order(unsigned long long length,
nalloc++;
}
sgl = kmalloc_array(nalloc, sizeof(struct scatterlist),
-   (gfp & ~GFP_DMA) | __GFP_ZERO);
+   gfp & ~GFP_DMA);
if (!sgl)
return NULL;
 
-- 
2.25.1



ERROR: modpost: "rcu_idle_enter" undefined!

2020-09-20 Thread kernel test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   325d0eab4f31c6240b59d5b2b8042c88f59405b5
commit: 1fecfdbb7acc6624655450a609221c89b5197a06 ACPI: processor: Take over RCU-idle for C3-BM idle
date:   4 days ago
config: ia64-defconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 1fecfdbb7acc6624655450a609221c89b5197a06
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=ia64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "rcu_idle_enter" [drivers/acpi/processor.ko] undefined!
>> ERROR: modpost: "rcu_idle_exit" [drivers/acpi/processor.ko] undefined!

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH] lib/scatterlist: Avoid a double memset

2020-09-20 Thread Joe Perches
On Sun, 2020-09-20 at 09:15 +0200, Christophe JAILLET wrote:
> 'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to
> clear it twice.
> 
> Remove the redundant initialization.

I didn't look very thoroughly, but there are at least
a few more of these with kcalloc and kzalloc like

block/bsg-lib.c-size_t sz = (sizeof(struct scatterlist) * req->nr_phys_segments);
block/bsg-lib.c-
block/bsg-lib.c-BUG_ON(!req->nr_phys_segments);
block/bsg-lib.c-
block/bsg-lib.c-buf->sg_list = kzalloc(sz, GFP_KERNEL);
block/bsg-lib.c-if (!buf->sg_list)
block/bsg-lib.c-return -ENOMEM;
block/bsg-lib.c:sg_init_table(buf->sg_list, req->nr_phys_segments);
---
drivers/target/target_core_rd.c-sg = kcalloc(sg_per_table + chain_entry, sizeof(*sg),
drivers/target/target_core_rd.c-GFP_KERNEL);
drivers/target/target_core_rd.c-if (!sg)
drivers/target/target_core_rd.c-return -ENOMEM;
drivers/target/target_core_rd.c-
drivers/target/target_core_rd.c:sg_init_table(sg, sg_per_table + chain_entry);
---
net/rds/rdma.c- sg = kcalloc(nents, sizeof(*sg), GFP_KERNEL);
net/rds/rdma.c- if (!sg) {
net/rds/rdma.c- ret = -ENOMEM;
net/rds/rdma.c- goto out;
net/rds/rdma.c- }
net/rds/rdma.c- WARN_ON(!nents);
net/rds/rdma.c: sg_init_table(sg, nents);




[PATCH] scsi: pmcraid: Fix memory allocation in 'pmcraid_alloc_sglist()'

2020-09-20 Thread Christophe JAILLET
When the scatter list is allocated in 'pmcraid_alloc_sglist()', the
corresponding pointer should be stored in 'scatterlist' within the
'pmcraid_sglist' structure. Otherwise, 'scatterlist' is NULL.

This leads to a potential memory leak and NULL pointer dereference.

Fixes: ed4414cef2ad ("scsi: pmcraid: Use sgl_alloc_order() and sgl_free_order()")
Signed-off-by: Christophe JAILLET 
---
This patch is completely speculative and untested.

Should it be correct, I think there should be some trouble somewhere:
either a NULL pointer dereference or incorrect behavior.
The patch that introduced this potential bug is 2 and a half years old. This
should have been spotted earlier.

So unless this driver is mostly unused, this looks odd to me.
Feedback appreciated.
---
 drivers/scsi/pmcraid.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/pmcraid.c b/drivers/scsi/pmcraid.c
index d99568fdf4af..00e155c88f03 100644
--- a/drivers/scsi/pmcraid.c
+++ b/drivers/scsi/pmcraid.c
@@ -3230,8 +3230,9 @@ static struct pmcraid_sglist *pmcraid_alloc_sglist(int buflen)
return NULL;
 
sglist->order = order;
-   sgl_alloc_order(buflen, order, false,
-   GFP_KERNEL | GFP_DMA | __GFP_ZERO, &sglist->num_sg);
+   sglist->scatterlist = sgl_alloc_order(buflen, order, false,
+ GFP_KERNEL | GFP_DMA | __GFP_ZERO,
+ &sglist->num_sg);
 
return sglist;
 }
-- 
2.25.1



[tip:x86/cpu] BUILD SUCCESS e1ebb2b49048c4767cfa0d8466f9c701e549fa5e

2020-09-20 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cpu
branch HEAD: e1ebb2b49048c4767cfa0d8466f9c701e549fa5e  KVM: SVM: Don't flush cache if hardware enforces cache coherency across encryption domains

elapsed time: 721m

configs tested: 61
configs skipped: 65

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm                      defconfig
arm64                    allyesconfig
arm64                    defconfig
arm                      allyesconfig
arm                      allmodconfig
parisc                   generic-32bit_defconfig
xtensa                   common_defconfig
arc                      nsimosci_defconfig
sh                       lboxre2_defconfig
powerpc                  ps3_defconfig
powerpc                  tqm8541_defconfig
mips                     pic32mzda_defconfig
arm                      mmp2_defconfig
sparc                    alldefconfig
powerpc                  icon_defconfig
m68k                     m5208evb_defconfig
csky                     alldefconfig
powerpc                  tqm8555_defconfig
xtensa                   allyesconfig
h8300                    allyesconfig
arc                      defconfig
sh                       allmodconfig
i386                     allyesconfig
i386                     defconfig
x86_64                   randconfig-a005-20200920
x86_64                   randconfig-a003-20200920
x86_64                   randconfig-a004-20200920
x86_64                   randconfig-a002-20200920
x86_64                   randconfig-a006-20200920
x86_64                   randconfig-a001-20200920
i386                     randconfig-a002-20200920
i386                     randconfig-a006-20200920
i386                     randconfig-a003-20200920
i386                     randconfig-a004-20200920
i386                     randconfig-a005-20200920
i386                     randconfig-a001-20200920
i386                     randconfig-a012-20200920
i386                     randconfig-a014-20200920
i386                     randconfig-a016-20200920
i386                     randconfig-a013-20200920
i386                     randconfig-a011-20200920
i386                     randconfig-a015-20200920
riscv                    nommu_k210_defconfig
riscv                    allyesconfig
riscv                    nommu_virt_defconfig
riscv                    allnoconfig
riscv                    defconfig
riscv                    rv32_defconfig
riscv                    allmodconfig
x86_64                   rhel
x86_64                   allyesconfig
x86_64                   rhel-7.6-kselftests
x86_64                   defconfig
x86_64                   rhel-8.3
x86_64                   kexec

clang tested configs:
x86_64   randconfig-a011-20200920
x86_64   randconfig-a013-20200920
x86_64   randconfig-a014-20200920
x86_64   randconfig-a015-20200920
x86_64   randconfig-a012-20200920
x86_64   randconfig-a016-20200920

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-20 Thread Daniel Vetter
On Sun, Sep 20, 2020 at 08:23:26AM +0200, Thomas Gleixner wrote:
> On Sat, Sep 19 2020 at 12:37, Daniel Vetter wrote:
> > On Sat, Sep 19, 2020 at 12:35 PM Daniel Vetter  wrote:
> >> I think it should be the case, but I want to double check: Will
> >> copy_*_user be allowed within a kmap_temporary section? This would
> >> allow us to ditch an absolute pile of slowpaths.
> >
> > (coffee just kicked in) copy_*_user is ofc allowed, but if you hit a
> > page fault you get a short read/write. This looks like it would remove
> > the need to handle these in a slowpath, since page faults can now be
> > served in this new kmap_temporary sections. But this sounds too good
> > to be true, so I'm wondering what I'm missing.
> 
> In principle we could allow pagefaults, but not with the currently
> proposed interface which can be called from any context. Obviously if
> called from atomic context it can't handle user page faults.
 
Yeah, that's clear, but does the implementation need to disable pagefaults
unconditionally?

> In theory we could make a variant which does not disable pagefaults, but
> that's what kmap() already provides.

Currently we have a bunch of code which roughly does

kmap_atomic();
copy_*_user();
kunmap_atomic();

if (short_copy_user) {
kmap();
copy_*_user(remaining_stuff);
kunmap();
}

And the only reason is that kmap is a notch slower, hence the fastpath. If
we get a kmap which is fast and allows pagefaults (only in contexts that
allow pagefaults already ofc) then we can ditch a pile of kmap users.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/bridge: dw-mipi-dsi: Use kmemdup cf. kmalloc+memcpy

2020-09-20 Thread Daniel Vetter
On Sat, Sep 19, 2020 at 9:31 PM Alex Dewar  wrote:
>
> On 2020-09-11 13:57, Neil Armstrong wrote:
> > On 09/09/2020 21:02, Alex Dewar wrote:
> >> kmemdup can be used instead of kmalloc+memcpy. Replace an occurrence of
> >> this pattern.
> Friendly ping?
> >>
> >> Issue identified with Coccinelle.
> >>
> >> Signed-off-by: Alex Dewar 
> >> ---
> >>   drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 4 +---
> >>   1 file changed, 1 insertion(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c 
> >> b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
> >> index 52f5c5a2ed64..7e9a62ad56e8 100644
> >> --- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
> >> +++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
> >> @@ -1049,12 +1049,10 @@ static void debugfs_create_files(void *data)
> >>  };
> >>  int i;
> >>
> >> -dsi->debugfs_vpg = kmalloc(sizeof(debugfs), GFP_KERNEL);
> >> +dsi->debugfs_vpg = kmemdup(debugfs, sizeof(debugfs), GFP_KERNEL);
> >>  if (!dsi->debugfs_vpg)
> >>  return;
> >>
> >> -memcpy(dsi->debugfs_vpg, debugfs, sizeof(debugfs));
> >> -
> >>  for (i = 0; i < ARRAY_SIZE(debugfs); i++)
> >>  debugfs_create_file(dsi->debugfs_vpg[i].name, 0644,
> >>  dsi->debugfs, &dsi->debugfs_vpg[i],
> >>
> > Acked-by: Neil Armstrong 

Neil has commit rights, so I was assuming he'd push this to drm-misc-next.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH 4.19] dmaengine: at_hdmac: Fix memory leak

2020-09-20 Thread Pavel Machek

This fixes memory leak in at_hdmac. Mainline does not have the same
problem.

Signed-off-by: Pavel Machek (CIP) 

diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c
index 86427f6ba78c..0847b2055857 100644
--- a/drivers/dma/at_hdmac.c
+++ b/drivers/dma/at_hdmac.c
@@ -1714,8 +1714,10 @@ static struct dma_chan *at_dma_xlate(struct of_phandle_args *dma_spec,
atslave->dma_dev = &dmac_pdev->dev;
 
chan = dma_request_channel(mask, at_dma_filter, atslave);
-   if (!chan)
+   if (!chan) {
+   kfree(atslave);
return NULL;
+   }
 
atchan = to_at_dma_chan(chan);
atchan->per_if = dma_spec->args[0] & 0xff;

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


RE: [PATCH] [v2] nvme: use correct upper limit for tag in nvme_handle_cqe()

2020-09-20 Thread Tianxianting
Hi,
I tested and traced the init flow of the nvme admin queue and io queues in kernel 5.6.
Currently, the code uses nvmeq->q_depth as the upper limit for the tag in
nvme_handle_cqe(); according to the init flow below, we already have the race.

Admin queue init flow:
1, set nvmeq->q_depth 32 for admin queue;
2, register irq handler(nvme_irq) for admin queue 0;
3, set admin_tagset.queue_depth to 30 and alloc rqs;
4, nvme irq happen on admin queue;

IO queue init flow:
1, set nvmeq->q_depth 1024 for io queue 1~16;
2, register irq handler(nvme_irq) for io queue 1~16;
3, set tagset.queue_depth to 1023 and alloc rqs;
4, nvme irq happen on io queue;

So we have two issues to fix:
1, register the interrupt handler (nvme_irq) only after tagset init (step 3 above)
is done, to avoid the race;
2, use the correct upper limit (queue_depth in the tagset) for the tag in
nvme_handle_cqe(), which is the issue solved by the patch I submitted.

Is that right? Thanks a lot.

-Original Message-
From: tianxianting (RD) 
Sent: Saturday, September 19, 2020 11:15 AM
To: 'Keith Busch' 
Cc: ax...@fb.com; h...@lst.de; s...@grimberg.me; 
linux-n...@lists.infradead.org; linux-kernel@vger.kernel.org
Subject: RE: [PATCH] [v2] nvme: use correct upper limit for tag in 
nvme_handle_cqe()

Hi Keith,
Thanks a lot for your comments,
I will try to figure out a safe fix for this issue, then for you review:) 

-Original Message-
From: Keith Busch [mailto:kbu...@kernel.org] 
Sent: Saturday, September 19, 2020 3:21 AM
To: tianxianting (RD) 
Cc: ax...@fb.com; h...@lst.de; s...@grimberg.me; 
linux-n...@lists.infradead.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] [v2] nvme: use correct upper limit for tag in 
nvme_handle_cqe()

On Fri, Sep 18, 2020 at 06:44:20PM +0800, Xianting Tian wrote:
> @@ -940,7 +940,9 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
>   struct nvme_completion *cqe = &nvmeq->cqes[idx];
>   struct request *req;
>  
> - if (unlikely(cqe->command_id >= nvmeq->q_depth)) {
> + if (unlikely(cqe->command_id >=
> + nvmeq->qid ? nvmeq->dev->tagset.queue_depth :
> + nvmeq->dev->admin_tagset.queue_depth)) {

Both of these values are set before blk_mq_alloc_tag_set(), so you still have a 
race. The interrupt handler probably just shouldn't be registered with the 
queue before the tagset is initialized since there can't be any work for the 
handler to do before that happens anyway.

The controller is definitely broken, though, and will lead to unavoidable 
corruption if it's really behaving this way.


Re: [PATCH] x86/boot: Delay BSP init until after FPU initialization

2020-09-20 Thread Borislav Petkov
On Sun, Sep 20, 2020 at 10:03:10AM +0900, Mike Hommey wrote:
> FPU initialization handles the clearcpuid command line argument. If it
> comes after BSP init, clearcpuid cannot be used to disable features that
> trigger some parts of the BSP init code.
> 
> Signed-off-by: Mike Hommey 
> ---
>  arch/x86/kernel/cpu/common.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> I was trying to use clearcpuid=440 to disable X86_FEATURES_AMD_SSBD to
> reproduce the behavior that happens on Zen/Zen+ on a Zen2 machine, but
> that didn't work because the command line is handled after the setup for
> X86_FEATURE_LS_CFG_SSBD.
> 
> I tought about either moving the command line handling earlier, but it
> seems there wasn't a specific reason for BSP init being earlier than FPU
> initialization so I went with reordering those instead.

Our boot order is fragile, and the functionality in
fpu__init_parse_early_param() which does the clearcpuid= parsing should be
independent of the FPU, as your use case shows.

So I'd prefer if you moved that function perhaps to right after the call

  setup_force_cpu_cap(X86_FEATURE_CPUID);

in early_identify_cpu() and renamed it to something generic instead.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


[PATCH] watchdog: fix memory leak in error path

2020-09-20 Thread Pavel Machek
Fix memory leak in error path.

Signed-off-by: Pavel Machek (CIP) 

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 6798addabd5a..785270ee337c 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -994,8 +994,10 @@ static int watchdog_cdev_register(struct watchdog_device *wdd)
wd_data->wdd = wdd;
wdd->wd_data = wd_data;
 
-   if (IS_ERR_OR_NULL(watchdog_kworker))
+   if (IS_ERR_OR_NULL(watchdog_kworker)) {
+   kfree(wd_data);
return -ENODEV;
+   }
 
device_initialize(&wd_data->dev);
wd_data->dev.devt = MKDEV(MAJOR(watchdog_devt), wdd->id);

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: [PATCH] lib/scatterlist: Avoid a double memset

2020-09-20 Thread Julia Lawall



On Sun, 20 Sep 2020, Joe Perches wrote:

> On Sun, 2020-09-20 at 09:15 +0200, Christophe JAILLET wrote:
> > 'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to
> > clear it twice.
> >
> > Remove the redundant initialization.
>
> I didn't look very thoroughly, but there are at least
> a few more of these with kcalloc and kzalloc like
>
> block/bsg-lib.c-size_t sz = (sizeof(struct scatterlist) * 
> req->nr_phys_segments);
> block/bsg-lib.c-
> block/bsg-lib.c-BUG_ON(!req->nr_phys_segments);
> block/bsg-lib.c-
> block/bsg-lib.c-buf->sg_list = kzalloc(sz, GFP_KERNEL);
> block/bsg-lib.c-if (!buf->sg_list)
> block/bsg-lib.c-return -ENOMEM;
> block/bsg-lib.c:sg_init_table(buf->sg_list, req->nr_phys_segments);
> ---
> drivers/target/target_core_rd.c-  sg = kcalloc(sg_per_table + 
> chain_entry, sizeof(*sg),
> drivers/target/target_core_rd.c-  GFP_KERNEL);
> drivers/target/target_core_rd.c-  if (!sg)
> drivers/target/target_core_rd.c-  return -ENOMEM;
> drivers/target/target_core_rd.c-
> drivers/target/target_core_rd.c:  sg_init_table(sg, sg_per_table 
> + chain_entry);
> ---
> net/rds/rdma.c-   sg = kcalloc(nents, sizeof(*sg), GFP_KERNEL);
> net/rds/rdma.c-   if (!sg) {
> net/rds/rdma.c-   ret = -ENOMEM;
> net/rds/rdma.c-   goto out;
> net/rds/rdma.c-   }
> net/rds/rdma.c-   WARN_ON(!nents);
> net/rds/rdma.c:   sg_init_table(sg, nents);

I found 16 occurrences in the following files:

net/rds/rdma.c
drivers/infiniband/hw/efa/efa_verbs.c
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
drivers/misc/mic/scif/scif_nodeqp.c
block/bsg-lib.c
drivers/dma/sh/rcar-dmac.c
drivers/spi/spi-topcliff-pch.c
net/sunrpc/xprtrdma/frwr_ops.c
drivers/dma/imx-dma.c
drivers/pci/p2pdma.c
drivers/dma/sh/shdma-base.c
drivers/target/target_core_rd.c
drivers/media/common/saa7146/saa7146_core.c
drivers/tty/serial/pch_uart.c
drivers/net/wireless/intel/iwlwifi/fw/dbg.c

julia


[PATCH] usb: yurex: Rearrange code not to need GFP_ATOMIC

2020-09-20 Thread Pavel Machek

Move the prepare_to_wait() call around, so that a normal GFP_KERNEL
allocation can be used.

Signed-off-by: Pavel Machek (CIP) 
Acked-by: Alan Stern 

diff --git a/drivers/usb/misc/yurex.c b/drivers/usb/misc/yurex.c
index b2e09883c7e2..071f1debebba 100644
--- a/drivers/usb/misc/yurex.c
+++ b/drivers/usb/misc/yurex.c
@@ -489,10 +489,10 @@ static ssize_t yurex_write(struct file *file, const char 
__user *user_buffer,
}
 
/* send the data as the control msg */
-   prepare_to_wait(&dev->waitq, &wait, TASK_INTERRUPTIBLE);
dev_dbg(&dev->interface->dev, "%s - submit %c\n", __func__,
dev->cntl_buffer[0]);
-   retval = usb_submit_urb(dev->cntl_urb, GFP_ATOMIC);
+   retval = usb_submit_urb(dev->cntl_urb, GFP_KERNEL);
+   prepare_to_wait(&dev->waitq, &wait, TASK_INTERRUPTIBLE);
if (retval >= 0)
timeout = schedule_timeout(YUREX_WRITE_TIMEOUT);
finish_wait(&dev->waitq, &wait);

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: [PATCH v3 00/14] Adding GAUDI NIC code to habanalabs driver

2020-09-20 Thread Greg Kroah-Hartman
On Sat, Sep 19, 2020 at 04:22:35PM -0300, Jason Gunthorpe wrote:
> On Sat, Sep 19, 2020 at 07:27:30PM +0200, Greg Kroah-Hartman wrote:
> > > It's probably heresy, but why do I need to integrate into the RDMA 
> > > subsystem ?
> > > I understand your reasoning about networking (Ethernet) as the driver
> > > connects to the kernel networking stack (netdev), but with RDMA the
> > > driver doesn't use or connect to anything in that stack. If I were to
> > > support IBverbs and declare that I support it, then of course I would
> > > need to integrate to the RDMA subsystem and add my backend to
> > > rdma-core.
> > 
> > IBverbs are horrid and I would not wish them on anyone.  Seriously.
> 
> I'm curious what drives this opinion? Did you have it since you
> reviewed the initial submission all those years ago?

As I learned more about that interface, yes, I like it less and less :)

But that's the userspace api you all are stuck with, for various
reasons, my opinion doesn't matter here.

> > I think the general rdma apis are the key here, not the userspace api.
> 
> Are you proposing that habana should have uAPI in drivers/misc and
> present a standard rdma-core userspace for it? This is the only
> userspace programming interface for RoCE HW. I think that would be
> much more work.
> 
> If not, what open source userspace are you going to ask them to
> present to merge the kernel side into misc?

I don't think that they have a userspace api to their rdma feature from
what I understand, but I could be totally wrong as I do not know their
hardware at all, so I'll let them answer this question.

> > Note, I do not know exactly what they are, but no, IBverbs are not ok.
> 
> Should we stop merging new drivers and abandon the RDMA subsystem? Is
> there something you'd like to see fixed?
> 
> Don't really understand your position, sorry.

For anything that _has_ to have a userspace RDMA interface, sure ibverbs
are the one we are stuck with, but I didn't think that was the issue
here at all, which is why I wrote the above comments.

thanks,

greg k-h


Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends

2020-09-20 Thread Thomas Gleixner
On Sun, Sep 20 2020 at 08:41, Thomas Gleixner wrote:
> On Sat, Sep 19 2020 at 10:18, Linus Torvalds wrote:
>> Maybe I've missed something.  Is it because the new interface still
>> does "pagefault_disable()" perhaps?
>>
>> But does it even need the pagefault_disable() at all? Yes, the
>> *atomic* one obviously needed it. But why does this new one need to
>> disable page faults?
>
> It disables pagefaults because it can be called from atomic and
> non-atomic context. That was the point to get rid of
>
>  if (preeemptible())
>   kmap();
>  else
> kmap_atomic();
>
> If it does not disable pagefaults, then it's just a lightweight variant
> of kmap() for short lived mappings.

Actually most usage sites of kmap atomic do not need page faults to be
disabled at all. As Daniel pointed out the implicit pagefault disable
enforces copy_from_user_inatomic() even in places which are clearly
preemptible task context.

As we need to look at each instance anyway we can add the PF disable
explicitly to the very few places which actually need it.

Thanks,

tglx



Re: [PATCH] lib/scatterlist: Avoid a double memset

2020-09-20 Thread Christophe JAILLET

On 20/09/2020 10:32, Julia Lawall wrote:



On Sun, 20 Sep 2020, Joe Perches wrote:


On Sun, 2020-09-20 at 09:15 +0200, Christophe JAILLET wrote:

'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to
clear it twice.

Remove the redundant initialization.


I didn't look very thoroughly, but there are at least
a few more of these with kcalloc and kzalloc like

block/bsg-lib.c-size_t sz = (sizeof(struct scatterlist) * 
req->nr_phys_segments);
block/bsg-lib.c-
block/bsg-lib.c-BUG_ON(!req->nr_phys_segments);
block/bsg-lib.c-
block/bsg-lib.c-buf->sg_list = kzalloc(sz, GFP_KERNEL);
block/bsg-lib.c-if (!buf->sg_list)
block/bsg-lib.c-return -ENOMEM;
block/bsg-lib.c:sg_init_table(buf->sg_list, req->nr_phys_segments);
---
drivers/target/target_core_rd.c-sg = kcalloc(sg_per_table + 
chain_entry, sizeof(*sg),
drivers/target/target_core_rd.c-GFP_KERNEL);
drivers/target/target_core_rd.c-if (!sg)
drivers/target/target_core_rd.c-return -ENOMEM;
drivers/target/target_core_rd.c-
drivers/target/target_core_rd.c:sg_init_table(sg, sg_per_table 
+ chain_entry);
---
net/rds/rdma.c- sg = kcalloc(nents, sizeof(*sg), GFP_KERNEL);
net/rds/rdma.c- if (!sg) {
net/rds/rdma.c- ret = -ENOMEM;
net/rds/rdma.c- goto out;
net/rds/rdma.c- }
net/rds/rdma.c- WARN_ON(!nents);
net/rds/rdma.c: sg_init_table(sg, nents);


I found 16 occurrences in the following files:

net/rds/rdma.c
drivers/infiniband/hw/efa/efa_verbs.c
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
drivers/misc/mic/scif/scif_nodeqp.c
block/bsg-lib.c
drivers/dma/sh/rcar-dmac.c
drivers/spi/spi-topcliff-pch.c
net/sunrpc/xprtrdma/frwr_ops.c
drivers/dma/imx-dma.c
drivers/pci/p2pdma.c
drivers/dma/sh/shdma-base.c
drivers/target/target_core_rd.c
drivers/media/common/saa7146/saa7146_core.c
drivers/tty/serial/pch_uart.c
drivers/net/wireless/intel/iwlwifi/fw/dbg.c

julia



Also in:
sound/soc/sprd/sprd-pcm-dma.c
sound/soc/sprd/sprd-pcm-compress.c
which are a bit tricky. They also use some, IMHO, pointless devm_ functions.

Feel free to propose patches. I'm not focused on that at the moment.

CJ


[PATCH] media: firewire: fix memory leak

2020-09-20 Thread Pavel Machek

Fix memory leak in node_probe.

Signed-off-by: Pavel Machek (CIP) 

diff --git a/drivers/media/firewire/firedtv-fw.c 
b/drivers/media/firewire/firedtv-fw.c
index 3f1ca40b9b98..8a8585261bb8 100644
--- a/drivers/media/firewire/firedtv-fw.c
+++ b/drivers/media/firewire/firedtv-fw.c
@@ -272,8 +272,10 @@ static int node_probe(struct fw_unit *unit, const struct ieee1394_device_id *id)
 
name_len = fw_csr_string(unit->directory, CSR_MODEL,
 name, sizeof(name));
-   if (name_len < 0)
-   return name_len;
+   if (name_len < 0) {
+   err = name_len;
+   goto fail_free;
+   }
for (i = ARRAY_SIZE(model_names); --i; )
if (strlen(model_names[i]) <= name_len &&
strncmp(name, model_names[i], name_len) == 0)

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




[PATCH] 6lowpan: Add missing locking

2020-09-20 Thread Pavel Machek
I believe set_lock should be taken in the exit function, too.

Signed-off-by: Pavel Machek (CIP) 


diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c
index cff4944d5b66..1420734394e9 100644
--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -1301,10 +1301,12 @@ static void __exit bt_6lowpan_exit(void)
debugfs_remove(lowpan_enable_debugfs);
debugfs_remove(lowpan_control_debugfs);
 
+   mutex_lock(&set_lock);
if (listen_chan) {
l2cap_chan_close(listen_chan, 0);
l2cap_chan_put(listen_chan);
}
+   mutex_unlock(&set_lock);
 
disconnect_devices();
 

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




[PATCH 1/4] blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()

2020-09-20 Thread Baolin Wang
throtl_downgrade_state() is always used to switch to the LIMIT_LOW
limit, so remove the now-meaningless parameter that indicates the
limit index.

Signed-off-by: Baolin Wang 
---
 block/blk-throttle.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 36ba61c..4007b26 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1970,7 +1970,7 @@ static void throtl_upgrade_state(struct throtl_data *td)
queue_work(kthrotld_workqueue, &td->dispatch_work);
 }
 
-static void throtl_downgrade_state(struct throtl_data *td, int new)
+static void throtl_downgrade_state(struct throtl_data *td)
 {
td->scale /= 2;
 
@@ -1980,7 +1980,7 @@ static void throtl_downgrade_state(struct throtl_data *td, int new)
return;
}
 
-   td->limit_index = new;
+   td->limit_index = LIMIT_LOW;
td->low_downgrade_time = jiffies;
 }
 
@@ -2067,7 +2067,7 @@ static void throtl_downgrade_check(struct throtl_grp *tg)
 * cgroups
 */
if (throtl_hierarchy_can_downgrade(tg))
-   throtl_downgrade_state(tg->td, LIMIT_LOW);
+   throtl_downgrade_state(tg->td);
 
tg->last_bytes_disp[READ] = 0;
tg->last_bytes_disp[WRITE] = 0;
-- 
1.8.3.1



[PATCH 0/4] Some improvements for blk throttle

2020-09-20 Thread Baolin Wang
Hi,

This patch set did some improvements for blk throttle, please
help to review. Thanks.

Baolin Wang (4):
  blk-throttle: Remove a meaningless parameter for
throtl_downgrade_state()
  blk-throttle: Avoid getting the current time if tg->last_finish_time
is 0
  blk-throttle: Avoid tracking latency if low limit is invalid
  blk-throttle: Fix IO hang for a corner case

 block/blk-throttle.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

-- 
1.8.3.1



[PATCH 2/4] blk-throttle: Avoid getting the current time if tg->last_finish_time is 0

2020-09-20 Thread Baolin Wang
We only update tg->last_finish_time when the low limitation is
enabled, so we can move the tg->last_finish_time validation a little
earlier to avoid fetching the current time stamp unnecessarily when
the low limitation is not enabled.

Signed-off-by: Baolin Wang 
---
 block/blk-throttle.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 4007b26..7e72102 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2077,10 +2077,14 @@ static void throtl_downgrade_check(struct throtl_grp *tg)
 
 static void blk_throtl_update_idletime(struct throtl_grp *tg)
 {
-   unsigned long now = ktime_get_ns() >> 10;
+   unsigned long now;
unsigned long last_finish_time = tg->last_finish_time;
 
-   if (now <= last_finish_time || last_finish_time == 0 ||
+   if (last_finish_time == 0)
+   return;
+
+   now = ktime_get_ns() >> 10;
+   if (now <= last_finish_time ||
last_finish_time == tg->checked_last_finish_time)
return;
 
-- 
1.8.3.1



[PATCH 3/4] blk-throttle: Avoid tracking latency if low limit is invalid

2020-09-20 Thread Baolin Wang
The IO latency tracking is only used for the LOW limit, so add a
check to avoid redundant latency tracking when the LOW limit
is not valid.

Signed-off-by: Baolin Wang 
---
 block/blk-throttle.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 7e72102..b0d8f7c 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2100,7 +2100,7 @@ static void throtl_update_latency_buckets(struct throtl_data *td)
unsigned long last_latency[2] = { 0 };
unsigned long latency[2];
 
-   if (!blk_queue_nonrot(td->queue))
+   if (!blk_queue_nonrot(td->queue) || !td->limit_valid[LIMIT_LOW])
return;
if (time_before(jiffies, td->last_calculate_time + HZ))
return;
@@ -2338,6 +2338,8 @@ void blk_throtl_bio_endio(struct bio *bio)
if (!blkg)
return;
tg = blkg_to_tg(blkg);
+   if (!tg->td->limit_valid[LIMIT_LOW])
+   return;
 
finish_time_ns = ktime_get_ns();
tg->last_finish_time = finish_time_ns >> 10;
-- 
1.8.3.1



[PATCH 4/4] blk-throttle: Fix IO hang for a corner case

2020-09-20 Thread Baolin Wang
throtl_adjusted_limit() can not scale up the limit if we set bps or iops
to 1, which will cause an IO hang when the low limit is enabled. Thus we
should treat 1 as an illegal value to avoid this issue.
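For reference, the scaling in throtl_adjusted_limit() follows the pattern
low + (low >> 1) * scale, so a configured limit of 1 can never grow. A
minimal Python sketch of that arithmetic (an illustration only, not the
kernel code itself):

```python
def throtl_adjusted_limit(low, scale):
    # Mirrors the kernel's low + (low >> 1) * scale scaling step:
    # the adjusted limit grows with scale only if (low >> 1) is nonzero.
    return low + (low >> 1) * scale

# With low == 2 the limit scales up over time...
print([throtl_adjusted_limit(2, s) for s in range(4)])  # [2, 3, 4, 5]
# ...but with low == 1 it stays at 1 forever, hence the IO hang.
print([throtl_adjusted_limit(1, s) for s in range(4)])  # [1, 1, 1, 1]
```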

Signed-off-by: Baolin Wang 
---
 block/blk-throttle.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index b0d8f7c..0649bce 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1687,13 +1687,13 @@ static ssize_t tg_set_limit(struct kernfs_open_file *of,
goto out_finish;
 
ret = -EINVAL;
-   if (!strcmp(tok, "rbps"))
+   if (!strcmp(tok, "rbps") && val > 1)
v[0] = val;
-   else if (!strcmp(tok, "wbps"))
+   else if (!strcmp(tok, "wbps") && val > 1)
v[1] = val;
-   else if (!strcmp(tok, "riops"))
+   else if (!strcmp(tok, "riops") && val > 1)
v[2] = min_t(u64, val, UINT_MAX);
-   else if (!strcmp(tok, "wiops"))
+   else if (!strcmp(tok, "wiops") && val > 1)
v[3] = min_t(u64, val, UINT_MAX);
else if (off == LIMIT_LOW && !strcmp(tok, "idle"))
idle_time = val;
-- 
1.8.3.1



[PATCH v2] checkpatch: extend author Signed-off-by check for split From: header

2020-09-20 Thread Dwaipayan Ray
Checkpatch did not handle cases where the author From: header
was split into multiple lines. The author identity could not
be resolved and checkpatch generated a false NO_AUTHOR_SIGN_OFF
warning.

A typical example is Commit e33bcbab16d1 ("tee: add support for
session's client UUID generation"). When checkpatch was run on
this commit, it displayed:

"WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal
patch author ''"

This was due to split header lines not being handled properly and
the author himself wrote in Commit cd2614967d8b ("checkpatch: warn
if missing author Signed-off-by"):

"Split From: headers are not fully handled: only the first part
is compared."

Support split From: headers by correctly parsing the header
extension lines. RFC 2822, Section 2.2.3 states that each extended
line must start with a WSP character (a space or htab). The solution
was therefore to concatenate the lines which start with a WSP to
get the correct long header.
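The unfolding rule can be illustrated with a short Python sketch (a
simplified model of the concatenation described above, not the checkpatch
Perl code; the header values are hypothetical):

```python
def unfold_header(lines, start):
    """Collect a possibly-folded RFC 2822 header starting at lines[start].

    Continuation lines begin with a WSP character (space or htab) and are
    appended, with their leading whitespace stripped, to the first line --
    matching the /^[ \t]\s*(.*)/ pattern checkpatch uses.
    """
    value = lines[start]
    i = start + 1
    while i < len(lines) and lines[i][:1] in (" ", "\t"):
        value += lines[i].strip()
        i += 1
    return value

raw = ["From: Very Long Author Name",
       " <very.long.addr@example.com>",
       "Subject: something else"]
print(unfold_header(raw, 0))
# From: Very Long Author Name<very.long.addr@example.com>
```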

Suggested-by: Joe Perches 
Link: 
https://lore.kernel.org/linux-kernel-mentees/f5d8124e54a50480b0a9fa638787bc29b6e09854.ca...@perches.com/
Signed-off-by: Dwaipayan Ray 
---
 scripts/checkpatch.pl | 4 
 1 file changed, 4 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 504d2e431c60..9e65d21456f1 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2661,6 +2661,10 @@ sub process {
 # Check the patch for a From:
if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
$author = $1;
+   my $curline = $linenr;
+   while(defined($rawlines[$curline]) && 
($rawlines[$curline++] =~ /^[ \t]\s*(.*)/)) {
+   $author .= $1;
+   }
$author = encode("utf8", $author) if ($line =~ 
/=\?utf-8\?/i);
$author =~ s/"//g;
$author = reformat_email($author);
-- 
2.27.0



KMSAN: uninit-value in is_idle_task

2020-09-20 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:3b3ea602 x86: add failure injection to get/put/clear_user
git tree:   https://github.com/google/kmsan.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=144b896590
kernel config:  https://syzkaller.appspot.com/x/.config?x=ee5f7a0b2e48ed66
dashboard link: https://syzkaller.appspot.com/bug?extid=a4b29e75da1c2c8bd34f
compiler:   clang version 10.0.0 (https://github.com/llvm/llvm-project/ 
c2443155a0fb245c8f17f2c1c72b6ea391e86e81)

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a4b29e75da1c2c8bd...@syzkaller.appspotmail.com

=
BUG: KMSAN: uninit-value in is_idle_task+0x65/0x70 include/linux/sched.h:1672
CPU: 0 PID: 5351 Comm: syz-executor.3 Not tainted 5.8.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x21c/0x280 lib/dump_stack.c:118
 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121
 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
 is_idle_task+0x65/0x70 include/linux/sched.h:1672
 sysvec_apic_timer_interrupt+0x2e/0x130 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:593
RIP: 0010:kmsan_get_metadata+0x136/0x180 mm/kmsan/kmsan_shadow.c:201
Code: 75 5f 48 89 df e8 6a 00 00 00 48 85 c0 74 50 48 8b 48 40 48 85 c9 74 47 
48 8b 40 48 48 85 c0 74 3e 81 e3 ff 0f 00 00 45 84 f6 <48> 0f 45 c8 48 b8 00 00 
00 00 00 16 00 00 48 01 c8 48 c1 e8 04 48
RSP: 0018:8881cde9f460 EFLAGS: 0246
RAX: ea00090451b0 RBX: 0608 RCX: ea00090311b0
RDX: 0001cde9f608 RSI: 0720 RDI: 8881cde9f608
RBP: 8881cde9f478 R08: ea0f R09: 88812fffa000
R10: 0004 R11:  R12: 006a
R13:  R14:  R15: 
 kmsan_get_shadow_origin_ptr+0x6c/0xb0 mm/kmsan/kmsan_shadow.c:149
 __msan_metadata_ptr_for_load_4+0x10/0x20 mm/kmsan/kmsan_instr.c:54
 vga_8planes_fillrect drivers/video/fbdev/vga16fb.c:855 [inline]
 vga16fb_fillrect+0xbbc/0x1870 drivers/video/fbdev/vga16fb.c:947
 bit_clear_margins+0x582/0x740 drivers/video/fbdev/core/bitblit.c:232
 fbcon_clear_margins drivers/video/fbdev/core/fbcon.c:1381 [inline]
 fbcon_switch+0x2e79/0x3830 drivers/video/fbdev/core/fbcon.c:2363
 redraw_screen+0x9b2/0x2980 drivers/tty/vt/vt.c:1015
 fbcon_modechanged+0x110d/0x1320 drivers/video/fbdev/core/fbcon.c:3001
 fbcon_update_vcs+0x86/0xa0 drivers/video/fbdev/core/fbcon.c:3048
 fb_set_var+0x1420/0x1850 drivers/video/fbdev/core/fbmem.c:1056
 do_fb_ioctl+0xc00/0x1150 drivers/video/fbdev/core/fbmem.c:1109
 fb_ioctl+0x1e4/0x210 drivers/video/fbdev/core/fbmem.c:1185
 vfs_ioctl fs/ioctl.c:48 [inline]
 ksys_ioctl fs/ioctl.c:753 [inline]
 __do_sys_ioctl fs/ioctl.c:762 [inline]
 __se_sys_ioctl+0x319/0x4d0 fs/ioctl.c:760
 __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:760
 do_syscall_64+0xad/0x160 arch/x86/entry/common.c:386
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45d5f9
Code: Bad RIP value.
RSP: 002b:7fdcda761c78 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: e280 RCX: 0045d5f9
RDX: 20c0 RSI: 4601 RDI: 0003
RBP: 0118cf80 R08:  R09: 
R10:  R11: 0246 R12: 0118cf4c
R13: 0169fb6f R14: 7fdcda7629c0 R15: 0118cf4c

[ cut here ]
slab index 32768 out of bounds (558) for stack id 262b8000
WARNING: CPU: 0 PID: 5351 at lib/stackdepot.c:235 stack_depot_fetch+0x2d/0x60 
lib/stackdepot.c:234


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


kernel BUG at fs/f2fs/segment.h:LINE!

2020-09-20 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:fc4f28bb Merge tag 'for-5.9-rc5-tag' of git://git.kernel.o..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=134d97dd90
kernel config:  https://syzkaller.appspot.com/x/.config?x=c61f6bd349c981f3
dashboard link: https://syzkaller.appspot.com/bug?extid=3698081bcf0bb2d12174
compiler:   gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3698081bcf0bb2d12...@syzkaller.appspotmail.com

[ cut here ]
kernel BUG at fs/f2fs/segment.h:657!
invalid opcode:  [#1] PREEMPT SMP KASAN
CPU: 1 PID: 16220 Comm: syz-executor.0 Not tainted 5.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:f2fs_ra_meta_pages+0xa51/0xdc0 fs/f2fs/segment.h:657
Code: 08 49 83 ed 01 e9 3f f8 ff ff c7 44 24 18 80 00 00 00 45 31 ff 45 31 f6 
31 f6 41 bc 01 00 00 00 e9 cd fb ff ff e8 0f 48 3e fe <0f> 0b e8 08 48 3e fe 4c 
8b 6c 24 08 be 04 00 00 00 4c 89 ef e8 76
RSP: 0018:c9000b4177a8 EFLAGS: 00010246
RAX: 0004 RBX: dc00 RCX: c9000ac51000
RDX: 0004 RSI: 833605d1 RDI: 0004
RBP: 88804e738000 R08: 0001 R09: ea00060e7337
R10: 0037 R11:  R12: 0001
R13: 0601 R14: 88808cb73800 R15: 001e
FS:  7fcc23249700() GS:8880ae70() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 560df32133b7 CR3: 74a09000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 build_sit_entries fs/f2fs/segment.c:4195 [inline]
 f2fs_build_segment_manager+0x4b8a/0xa3c0 fs/f2fs/segment.c:4779
 f2fs_fill_super+0x377d/0x6b80 fs/f2fs/super.c:3633
 mount_bdev+0x32e/0x3f0 fs/super.c:1417
 legacy_get_tree+0x105/0x220 fs/fs_context.c:592
 vfs_get_tree+0x89/0x2f0 fs/super.c:1547
 do_new_mount fs/namespace.c:2875 [inline]
 path_mount+0x1387/0x2070 fs/namespace.c:3192
 do_mount fs/namespace.c:3205 [inline]
 __do_sys_mount fs/namespace.c:3413 [inline]
 __se_sys_mount fs/namespace.c:3390 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3390
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x46004a
Code: b8 a6 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 fd 89 fb ff c3 66 2e 0f 1f 
84 00 00 00 00 00 66 90 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 
da 89 fb ff c3 66 0f 1f 84 00 00 00 00 00
RSP: 002b:7fcc23248a88 EFLAGS: 0202 ORIG_RAX: 00a5
RAX: ffda RBX: 7fcc23248b20 RCX: 0046004a
RDX: 2000 RSI: 2100 RDI: 7fcc23248ae0
RBP: 7fcc23248ae0 R08: 7fcc23248b20 R09: 2000
R10:  R11: 0202 R12: 2000
R13: 2100 R14: 2200 R15: 20012400
Modules linked in:
---[ end trace a60c6b47a90afa81 ]---
RIP: 0010:f2fs_ra_meta_pages+0xa51/0xdc0 fs/f2fs/segment.h:657
Code: 08 49 83 ed 01 e9 3f f8 ff ff c7 44 24 18 80 00 00 00 45 31 ff 45 31 f6 
31 f6 41 bc 01 00 00 00 e9 cd fb ff ff e8 0f 48 3e fe <0f> 0b e8 08 48 3e fe 4c 
8b 6c 24 08 be 04 00 00 00 4c 89 ef e8 76
RSP: 0018:c9000b4177a8 EFLAGS: 00010246
RAX: 0004 RBX: dc00 RCX: c9000ac51000
RDX: 0004 RSI: 833605d1 RDI: 0004
RBP: 88804e738000 R08: 0001 R09: ea00060e7337
R10: 0037 R11:  R12: 0001
R13: 0601 R14: 88808cb73800 R15: 001e
FS:  7fcc23249700() GS:8880ae70() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f253183c000 CR3: 74a09000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400




[GIT PULL] EDAC urgent for v5.9-rc6

2020-09-20 Thread Borislav Petkov
Hi Linus,

please pull two more fixes for ghes_edac resulting from playing with
CONFIG_DEBUG_TEST_DRIVER_REMOVE=y.

Thx.

---
The following changes since commit 856deb866d16e29bd65952e0289066f6078af773:

  Linux 5.9-rc5 (2020-09-13 16:06:00 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git 
tags/edac_urgent_for_v5.9_rc6

for you to fetch changes up to 251c54ea26fa6029b01a76161a37a12fde5124e4:

  EDAC/ghes: Check whether the driver is on the safe list correctly (2020-09-15 
09:42:15 +0200)


Two fixes resulting from CONFIG_DEBUG_TEST_DRIVER_REMOVE=y experiments:

* The first one completes a previous fix to reset a local structure
containing scanned system data properly so that the driver rescans, as
it should, on a second load.

* The second one addresses a refcount underflow due to not checking
the driver whitelist on unregister.


Borislav Petkov (2):
  EDAC/ghes: Clear scanned data on unload
  EDAC/ghes: Check whether the driver is on the safe list correctly

 drivers/edac/ghes_edac.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 54ebc8afc6b1..94d1e3165052 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -508,6 +508,7 @@ int ghes_edac_register(struct ghes *ghes, struct device 
*dev)
if (!force_load && idx < 0)
return -ENODEV;
} else {
+   force_load = true;
idx = 0;
}
 
@@ -629,9 +630,13 @@ void ghes_edac_unregister(struct ghes *ghes)
struct mem_ctl_info *mci;
unsigned long flags;
 
+   if (!force_load)
+   return;
+
mutex_lock(&ghes_reg_mutex);
 
system_scanned = false;
+   memset(&ghes_hw, 0, sizeof(struct ghes_hw_desc));
 
if (!refcount_dec_and_test(&ghes_refcount))
goto unlock;

-- 
Regards/Gruss,
Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG 
Nürnberg


RE: R: [PATCH v2 4/4] dt-bindings: net: Document use of mac-address-increment

2020-09-20 Thread ansuelsmth
> On Sun, Sep 20, 2020 at 02:39:39AM +0200, ansuels...@gmail.com
> wrote:
> >
> >
> > > -Messaggio originale-
> > > Da: Andrew Lunn 
> > > Inviato: domenica 20 settembre 2020 02:31
> > > A: Ansuel Smith 
> > > Cc: Miquel Raynal ; Richard Weinberger
> > > ; Vignesh Raghavendra ; Rob
> Herring
> > > ; David S. Miller ; Jakub
> > > Kicinski ; Heiner Kallweit ;
> > > Russell King ; Frank Rowand
> > > ; Boris Brezillon ;
> linux-
> > > m...@lists.infradead.org; devicet...@vger.kernel.org; linux-
> > > ker...@vger.kernel.org; net...@vger.kernel.org
> > > Oggetto: Re: [PATCH v2 4/4] dt-bindings: net: Document use of mac-
> > > address-increment
> > >
> > > > +  mac-address-increment:
> > > > +description:
> > > > +  The MAC address can optionally be increased (or decreased
using
> > > > +  negative values) from the original value readed (from a nvmem
> > cell
> > >
> > > Read is irregular, there is no readed, just read.
> > >
> > > > +  for example). This can be used if the mac is readed from a
> > dedicated
> > > > +  partition and must be increased based on the number of device
> > > > +  present in the system.
> > >
> > > You should probably add there is no underflow/overflow to other bytes
> > > of the MAC address. 00:01:02:03:04:ff + 1 == 00:01:02:03:04:00.
> > >
> > > > +minimum: -255
> > > > +maximum: 255
> > > > +
> > > > +  mac-address-increment-byte:
> > > > +description:
> > > > +  If 'mac-address-increment' is defined, this will tell what
byte
> > of
> > > > +  the mac-address will be increased. If 'mac-address-increment'
is
> > > > +  not defined, this option will do nothing.
> > > > +default: 5
> > > > +minimum: 0
> > > > +maximum: 5
> > >
> > > Is there a real need for this? A value of 0 seems like a bad idea,
> > > since a unicast address could easily become a multicast address, which
> > > is not valid for an interface address. It also does not seem like a
> > > good idea to allow the OUI to be changed. So i think only bytes 3-5
> > > should be allowed, but even then, i don't think this is needed, unless
> > > you do have a clear use case.
> > >
> > > Andrew
> >
> > Honestly the mac-address-increment-byte is added to give user some
> control
> > but I
> > don't really have a use case for it. Should I limit it to 3 or just
remove
> > the function?
> 
> If you have no use case, just remove it and document that last byte
> will be incremented. I somebody does need it, it can be added in a
> backwards compatible way.
> 
>  Andrew

I just rechecked mac-address-increment-byte and we have one device that
uses it and would benefit from this. I will change the Documentation to
a minimum of 3 and keep it.



[PATCH v3 1/4] mtd: Add nvmem support for mtd nvmem-providers

2020-09-20 Thread Ansuel Smith
Introduce two new bindings for the mtd structure.
Mtd partitions can be set as 'nvmem-provider', and any subpartition
defined with the tag 'nvmem-cell' is skipped by the 'fixed-partitions'
parser and registered as a nvmem cell by the nvmem api.

Signed-off-by: Ansuel Smith 
---
 drivers/mtd/mtdcore.c| 3 ++-
 drivers/mtd/parsers/ofpart.c | 8 
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index 7d930569a7df..ba5236db8318 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -551,6 +551,7 @@ static int mtd_nvmem_reg_read(void *priv, unsigned int 
offset,
 
 static int mtd_nvmem_add(struct mtd_info *mtd)
 {
+   struct device_node *mtd_node = mtd_get_of_node(mtd);
struct nvmem_config config = {};
 
config.id = -1;
@@ -563,7 +564,7 @@ static int mtd_nvmem_add(struct mtd_info *mtd)
config.stride = 1;
config.read_only = true;
config.root_only = true;
-   config.no_of_node = true;
+   config.no_of_node = !of_property_read_bool(mtd_node, "nvmem-provider");
config.priv = mtd;
 
mtd->nvmem = nvmem_register(&config);
diff --git a/drivers/mtd/parsers/ofpart.c b/drivers/mtd/parsers/ofpart.c
index daf507c123e6..442e039214bc 100644
--- a/drivers/mtd/parsers/ofpart.c
+++ b/drivers/mtd/parsers/ofpart.c
@@ -61,6 +61,10 @@ static int parse_fixed_partitions(struct mtd_info *master,
if (!dedicated && node_has_compatible(pp))
continue;
 
+   /* skip adding if a nvmem-cell is detected */
+   if (of_property_read_bool(pp, "nvmem-cell"))
+   continue;
+
nr_parts++;
}
 
@@ -80,6 +84,10 @@ static int parse_fixed_partitions(struct mtd_info *master,
if (!dedicated && node_has_compatible(pp))
continue;
 
+   /* skip adding if a nvmem-cell is detected */
+   if (of_property_read_bool(pp, "nvmem-cell"))
+   continue;
+
reg = of_get_property(pp, "reg", &len);
if (!reg) {
if (dedicated) {
-- 
2.27.0



[PATCH v3 2/4] dt-bindings: mtd: partition: Document use of nvmem-provider

2020-09-20 Thread Ansuel Smith
Document the use of these two new bindings, nvmem-provider and
nvmem-cell, used to describe the nvmem cells that the subpartition
provides to the nvmem api and the system. Nvmem cells are direct
subnodes of the subpartition and are skipped by the 'fixed-partitions'
parser if they contain the 'nvmem-cell' tag. The subpartition must have
the 'nvmem-provider' tag or it will not register the cells with the
nvmem api.

Signed-off-by: Ansuel Smith 
---
 .../devicetree/bindings/mtd/partition.txt | 59 +++
 1 file changed, 59 insertions(+)

diff --git a/Documentation/devicetree/bindings/mtd/partition.txt 
b/Documentation/devicetree/bindings/mtd/partition.txt
index 4a39698221a2..66d3a3f0a021 100644
--- a/Documentation/devicetree/bindings/mtd/partition.txt
+++ b/Documentation/devicetree/bindings/mtd/partition.txt
@@ -64,6 +64,16 @@ Optional properties:
 - slc-mode: This parameter, if present, allows one to emulate SLC mode on a
   partition attached to an MLC NAND thus making this partition immune to
   paired-pages corruptions
+- nvmem-provider : Optionally a subpartition can be set as a nvmem-provider.
+  This can be very useful if some data like the mac-address is stored in a
+  special partition at a specific offset. Subnodes that describe nvmem cells
+  must have the 'nvmem-cell' tag or they will be treated as subpartitions
+  instead of being skipped and registered as nvmem cells. In this specific
+  case '#address-cells' and '#size-cells' must be provided.
+- nvmem-cell : A direct subnode of a subpartition can be described as a
+  nvmem-cell; it is skipped by the fixed-partition parser and registered as
+  a nvmem cell of the registered nvmem subpartition IF the subpartition
+  contains the 'nvmem-provider' tag. If the subpartition lacks such a tag,
+  the subnodes will be skipped and the nvmem api won't register them.
 
 Examples:
 
@@ -158,3 +168,52 @@ flash@3 {
};
};
 };
+
+flash@0 {
+   partitions {
+   compatible = "fixed-partitions";
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   partition@0 {
+   label = "u-boot";
+   reg = <0x000 0x10>;
+   read-only;
+   };
+
+   art: art@120 {
+   label = "art";
+   reg = <0x120 0x014>;
+   read-only;
+   nvmem-provider;
+
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   macaddr_gmac1: macaddr_gmac1@0 {
+   nvmem-cell;
+   reg = <0x0 0x6>;
+   };
+
+   macaddr_gmac2: macaddr_gmac2@6 {
+   nvmem-cell;
+   reg = <0x6 0x6>;
+   };
+
+   macaddr_wifi: macaddr_wifi@6 {
+   nvmem-cell;
+   reg = <0x6 0x6>;
+   };
+
+   pre_cal_24g: pre_cal_24g@1000 {
+   nvmem-cell;
+   reg = <0x1000 0x2f20>;
+   };
+
+   pre_cal_5g: pre_cal_5g@5000{
+   nvmem-cell;
+   reg = <0x5000 0x2f20>;
+   };
+   };
+   };
+};
\ No newline at end of file
-- 
2.27.0



[PATCH v3 0/4] Actually implement nvmem support for mtd

2020-09-20 Thread Ansuel Smith
The mtd support for the nvmem api has been stalled since 2018, with a patch
half pushed in the hope that a scheme would be found for the mtd name later.
This patchset tries to address this and adds a much needed feature for the
mac-address.

My solution to the already discussed problem here [1] is to keep it simple.
A mtd subpartition can be set as a nvmem-provider with a specific tag, and
each direct subnode is treated by the nvmem api as a nvmem-cell and gets
correctly registered. The mtd driver would otherwise treat a direct subnode
of the subpartition as a legacy subpartition of the subpartition, using the
'fixed-partition' parser. To fix this, every nvmem-cell has to have the
'nvmem-cell' tag, and the fixed-partition parser will skip these nodes if
the tag is detected. Simple as that. The subpartition with the
'nvmem-provider' tag is then configured not to skip the of node
registration, so the nvmem-cells are correctly registered and can be used
to set the mac-address of the devices on the system.

The last 2 patches try to address a problem on embedded devices (mostly
routers) that have the mac-address saved in a dedicated partition as a
""master"" mac-address. Each device increments or decrements the extracted
mac-address based on the device number. The increment function is
implemented in the of_net function to get the mac-address, which every net
driver should already use if it doesn't have a trusted mac-address source.
(One example of this are many ath10k devices that get the mac-address from
the art mtd partition assigned to the network driver and increment it by 1
for the 2.4GHz wifi and by 2 for the 5GHz wifi.)

I really hope my mtd nvmem implementation doesn't conflict with others
and can be used, since this would remove many patches used to get the
mac-address and other nvmem data.

[1] https://lore.kernel.org/patchwork/patch/765435/

Changes:
v3:
* Fix const discard warning in of_net.c
* Add some info about overflow/underflow of mac-increment
* Limit mac-increment-bytes to the last 3 bytes
v2:
* Fix compile error (missing mtd_node in mtdcore)

Ansuel Smith (4):
  mtd: Add nvmem support for mtd nvmem-providers
  dt-bindings: mtd: partition: Document use of nvmem-provider
  of_net: add mac-address-increment support
  dt-bindings: net: Document use of mac-address-increment

 .../devicetree/bindings/mtd/partition.txt | 59 +++
 .../bindings/net/ethernet-controller.yaml | 21 +++
 drivers/mtd/mtdcore.c |  3 +-
 drivers/mtd/parsers/ofpart.c  |  8 +++
 drivers/of/of_net.c   | 57 ++
 5 files changed, 134 insertions(+), 14 deletions(-)

-- 
2.27.0



[PATCH v3 3/4] of_net: add mac-address-increment support

2020-09-20 Thread Ansuel Smith
Lots of embedded devices use the mac-address of another interface
extracted from nvmem cells and increment it by one or two. Add two
bindings to integrate this and directly use the right mac-address for
the interface. Some examples are routers that use the gmac
mac-address stored in the art partition and increment it by one for the
wifi. The mac-address-increment-byte binding is used to tell which byte
of the mac-address has to be increased (if not defined, the last byte is
increased) and mac-address-increment tells by how much the chosen byte
has to be increased.
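The wrap-around behaviour of the binding (no carry into other bytes, e.g.
00:01:02:03:04:ff + 1 == 00:01:02:03:04:00) can be sketched in Python as
follows (an illustration of the semantics, not the of_net implementation):

```python
def increment_mac(mac, inc, byte_idx=5):
    """Increment one byte of a MAC address, wrapping modulo 256.

    `mac` is a list of six byte values; only mac[byte_idx] changes,
    so the increment never carries into (or borrows from) other bytes.
    """
    out = list(mac)
    out[byte_idx] = (out[byte_idx] + inc) % 256
    return out

base = [0x00, 0x01, 0x02, 0x03, 0x04, 0xFF]
print(increment_mac(base, 1))   # [0, 1, 2, 3, 4, 0] -- wraps, no carry
print(increment_mac(base, -1))  # [0, 1, 2, 3, 4, 254]
```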

Signed-off-by: Ansuel Smith 
---
 drivers/of/of_net.c | 57 ++---
 1 file changed, 44 insertions(+), 13 deletions(-)

diff --git a/drivers/of/of_net.c b/drivers/of/of_net.c
index 6e411821583e..bafbc833e659 100644
--- a/drivers/of/of_net.c
+++ b/drivers/of/of_net.c
@@ -45,7 +45,7 @@ int of_get_phy_mode(struct device_node *np, phy_interface_t 
*interface)
 }
 EXPORT_SYMBOL_GPL(of_get_phy_mode);
 
-static const void *of_get_mac_addr(struct device_node *np, const char *name)
+static void *of_get_mac_addr(struct device_node *np, const char *name)
 {
struct property *pp = of_find_property(np, name, NULL);
 
@@ -54,26 +54,31 @@ static const void *of_get_mac_addr(struct device_node *np, 
const char *name)
return NULL;
 }
 
-static const void *of_get_mac_addr_nvmem(struct device_node *np)
+static void *of_get_mac_addr_nvmem(struct device_node *np, int *err)
 {
int ret;
-   const void *mac;
+   void *mac;
u8 nvmem_mac[ETH_ALEN];
struct platform_device *pdev = of_find_device_by_node(np);
 
-   if (!pdev)
-   return ERR_PTR(-ENODEV);
+   if (!pdev) {
+   *err = -ENODEV;
+   return NULL;
+   }
 
ret = nvmem_get_mac_address(&pdev->dev, &nvmem_mac);
if (ret) {
put_device(&pdev->dev);
-   return ERR_PTR(ret);
+   *err = ret;
+   return NULL;
}
 
mac = devm_kmemdup(&pdev->dev, nvmem_mac, ETH_ALEN, GFP_KERNEL);
put_device(&pdev->dev);
-   if (!mac)
-   return ERR_PTR(-ENOMEM);
+   if (!mac) {
+   *err = -ENOMEM;
+   return NULL;
+   }
 
return mac;
 }
@@ -98,24 +103,50 @@ static const void *of_get_mac_addr_nvmem(struct 
device_node *np)
  * this case, the real MAC is in 'local-mac-address', and 'mac-address' exists
  * but is all zeros.
  *
+ * DT can tell the system to increment the mac-address after is extracted by
+ * using:
+ * - mac-address-increment-byte to decide what byte to increase
+ *   (if not defined is increased the last byte)
+ * - mac-address-increment to decide how much to increase. The value will
+ *   not overflow to other bytes if the increment is over 255.
+ *   (example 00:01:02:03:04:ff + 1 == 00:01:02:03:04:00)
+ *
  * Return: Will be a valid pointer on success and ERR_PTR in case of error.
 */
 const void *of_get_mac_address(struct device_node *np)
 {
-   const void *addr;
+   u32 inc_idx, mac_inc;
+   int ret = 0;
+   u8 *addr;
+
+   /* Check first if the increment byte is present and valid.
+* If not set assume to increment the last byte if found.
+*/
+   if (of_property_read_u32(np, "mac-address-increment-byte", &inc_idx))
+   inc_idx = 5;
+   if (inc_idx < 3 || inc_idx > 5)
+   return ERR_PTR(-EINVAL);
 
addr = of_get_mac_addr(np, "mac-address");
if (addr)
-   return addr;
+   goto found;
 
addr = of_get_mac_addr(np, "local-mac-address");
if (addr)
-   return addr;
+   goto found;
 
addr = of_get_mac_addr(np, "address");
if (addr)
-   return addr;
+   goto found;
+
+   addr = of_get_mac_addr_nvmem(np, &ret);
+   if (ret)
+   return ERR_PTR(ret);
+
+found:
+   if (!of_property_read_u32(np, "mac-address-increment", &mac_inc))
+   addr[inc_idx] += mac_inc;
 
-   return of_get_mac_addr_nvmem(np);
+   return addr;
 }
 EXPORT_SYMBOL(of_get_mac_address);
-- 
2.27.0



[PATCH v3 4/4] dt-bindings: net: Document use of mac-address-increment

2020-09-20 Thread Ansuel Smith
Two new bindings are now supported by the of_net driver to increase (or
decrease) a mac-address. This can be very useful in cases where the
system extracts the mac-address for the device from a dedicated partition
and has a generic mac-address that needs to be incremented based on the
device number.
- mac-address-increment-byte is used to tell which byte must be
  incremented (if not set, the last byte is increased)
- mac-address-increment is used to tell by how much to increment the
  chosen byte of the extracted mac-address.

Signed-off-by: Ansuel Smith 
---
 .../bindings/net/ethernet-controller.yaml | 21 +++
 1 file changed, 21 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/ethernet-controller.yaml 
b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
index fa2baca8c726..8174f64e79bd 100644
--- a/Documentation/devicetree/bindings/net/ethernet-controller.yaml
+++ b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
@@ -32,6 +32,27 @@ properties:
   - minItems: 6
 maxItems: 6
 
+  mac-address-increment:
+description:
+  The MAC address can optionally be increased (or decreased using
+  negative values) from the original value read (from a nvmem cell
+  for example). This can be used if the mac is read from a dedicated
+  partition and must be increased based on the number of devices
+  present in the system. The increment will not overflow/underflow to
+  the other bytes if it's over 255 or 0.
+  (example 00:01:02:03:04:ff + 1 == 00:01:02:03:04:00)
+minimum: -255
+maximum: 255
+
+  mac-address-increment-byte:
+description:
+  If 'mac-address-increment' is defined, this will tell what byte of
+  the mac-address will be increased. If 'mac-address-increment' is
+  not defined, this option will do nothing.
+default: 5
+minimum: 3
+maximum: 5
+
   max-frame-size:
 $ref: /schemas/types.yaml#definitions/uint32
 description:
-- 
2.27.0



Re: [RFC PATCH 21/24] mm/hugetlb: Merge pte to huge pmd only for gigantic page

2020-09-20 Thread Muchun Song
On Tue, Sep 15, 2020 at 9:03 PM Muchun Song  wrote:
>
> Merge pte to huge pmd if it has ever been split. Now only support
> gigantic pages whose vmemmap size is an integer multiple of
> PMD_SIZE. This is the simplest case to handle.
>
> Signed-off-by: Muchun Song 
> ---
>  include/linux/hugetlb.h |   7 +++
>  mm/hugetlb.c| 104 +++-
>  2 files changed, 109 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index e3aa192f1c39..c56df0da7ae5 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -611,6 +611,13 @@ static inline bool vmemmap_pmd_huge(pmd_t *pmd)
>  }
>  #endif
>
> +#ifndef vmemmap_pmd_mkhuge
> +static inline pmd_t vmemmap_pmd_mkhuge(struct page *page)
> +{
> +   return pmd_mkhuge(mk_pmd(page, PAGE_KERNEL));
> +}
> +#endif
> +
>  #ifndef VMEMMAP_HPAGE_SHIFT
>  #define VMEMMAP_HPAGE_SHIFTPMD_SHIFT
>  #endif
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 28c154679838..3ca36e259b4e 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1759,6 +1759,62 @@ static void __remap_huge_page_pte_vmemmap(struct page 
> *reuse, pte_t *ptep,
> }
>  }
>
> +static void __replace_huge_page_pte_vmemmap(pte_t *ptep, unsigned long start,
> +   unsigned int nr, struct page 
> *huge,
> +   struct list_head *free_pages)
> +{
> +   unsigned long addr;
> +   unsigned long end = start + (nr  << PAGE_SHIFT);
> +
> +   for (addr = start; addr < end; addr += PAGE_SIZE, ptep++) {
> +   struct page *page;
> +   pte_t old = *ptep;
> +   pte_t entry;
> +
> +   prepare_vmemmap_page(huge);
> +
> +   entry = mk_pte(huge++, PAGE_KERNEL);
> +   VM_WARN_ON(!pte_present(old));
> +   page = pte_page(old);
> +   list_add(&page->lru, free_pages);
> +
> +   set_pte_at(&init_mm, addr, ptep, entry);
> +   }
> +}
> +
> +static void replace_huge_page_pmd_vmemmap(pmd_t *pmd, unsigned long start,
> + struct page *huge,
> + struct list_head *free_pages)
> +{
> +   unsigned long end = start + VMEMMAP_HPAGE_SIZE;
> +
> +   flush_cache_vunmap(start, end);
> +   __replace_huge_page_pte_vmemmap(pte_offset_kernel(pmd, start), start,
> +   VMEMMAP_HPAGE_NR, huge, free_pages);
> +   flush_tlb_kernel_range(start, end);
> +}
> +
> +static pte_t *merge_vmemmap_pte(pmd_t *pmdp, unsigned long addr)
> +{
> +   pte_t *pte;
> +   struct page *page;
> +
> +   pte = pte_offset_kernel(pmdp, addr);
> +   page = pte_page(*pte);
> +   set_pmd(pmdp, vmemmap_pmd_mkhuge(page));
> +
> +   return pte;
> +}
> +
> +static void merge_huge_page_pmd_vmemmap(pmd_t *pmd, unsigned long start,
> +   struct page *huge,
> +   struct list_head *free_pages)
> +{
> +   replace_huge_page_pmd_vmemmap(pmd, start, huge, free_pages);
> +   pte_free_kernel(&init_mm, merge_vmemmap_pte(pmd, start));
> +   flush_tlb_kernel_range(start, start + VMEMMAP_HPAGE_SIZE);
> +}
> +
>  static inline void alloc_vmemmap_pages(struct hstate *h, struct list_head 
> *list)
>  {
> int i;
> @@ -1772,6 +1828,15 @@ static inline void alloc_vmemmap_pages(struct hstate 
> *h, struct list_head *list)
> }
>  }
>
> +static inline void dissolve_compound_page(struct page *page, unsigned int 
> order)
> +{
> +   int i;
> +   unsigned int nr_pages = 1 << order;
> +
> +   for (i = 1; i < nr_pages; i++)
> +   set_page_refcounted(page + i);
> +}
> +
>  static void alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
>  {
> pmd_t *pmd;
> @@ -1791,10 +1856,45 @@ static void alloc_huge_page_vmemmap(struct hstate *h, 
> struct page *head)
> __remap_huge_page_pte_vmemmap);
> if (!freed_vmemmap_hpage_dec(pmd_page(*pmd)) && pmd_split(pmd)) {
> /*
> -* Todo:
> -* Merge pte to huge pmd if it has ever been split.
> +* Merge pte to huge pmd if it has ever been split. Now only
> +* support gigantic page which's vmemmap pages size is an
> +* integer multiple of PMD_SIZE. This is the simplest case
> +* to handle.
>  */
> clear_pmd_split(pmd);
> +
> +   if (IS_ALIGNED(nr_vmemmap(h), VMEMMAP_HPAGE_NR)) {
> +   unsigned long addr = (unsigned long)head;
> +   unsigned long end = addr + nr_vmemmap_size(h);
> +
> +   spin_unlock(ptl);
> +
> +   for (; addr < end; addr += VMEMMAP_HPAGE_SIZE) {
> +   void 

Re: [PATCH] PR_SPEC_DISABLE_NOEXEC support for arm64.

2020-09-20 Thread Anthony Steinhauser
> As a heads up: I'm currently reworking most of this, and hope to post
> something within the next two weeks.

Sure. Let me know whether you want to implement the
PR_SPEC_DISABLE_NOEXEC support directly or whether this patch would be
relevant even after your rework.
>
> > diff --git a/arch/arm64/include/asm/ssbd.h b/arch/arm64/include/asm/ssbd.h
> > new file mode 100644
> > index ..68c716dc5811
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/ssbd.h
...
> > +}
>
> I'd prefer to keep these where they are and have an out-of-line call if
> necessary. We should try to keep the SSBD stuff in one place.

OK.
>
> > +
> > +#endif   /* __ASM_SSBD_H */
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 6089638c7d43..ad3c67c86c4c 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -54,6 +54,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >
> >  #if defined(CONFIG_STACKPROTECTOR) && 
> > !defined(CONFIG_STACKPROTECTOR_PER_TASK)
> > @@ -588,6 +589,18 @@ void arch_setup_new_exec(void)
> >   current->mm->context.flags = is_compat_task() ? MMCF_AARCH32 : 0;
> >
> >   ptrauth_thread_init_user(current);
> > +
> > + /*
> > +  * Don't inherit TIF_SSBD across exec boundary when
> > +  * PR_SPEC_DISABLE_NOEXEC is used.
> > +  */
> > + if (test_thread_flag(TIF_SSBD) &&
> > + task_spec_ssb_noexec(current)) {
> > + clear_thread_flag(TIF_SSBD);
> > + task_clear_spec_ssb_disable(current);
> > + task_clear_spec_ssb_noexec(current);
> > + ssbd_ssbs_enable(current);
> > + }
>
> How is this supposed to work with CPUs that expose SSBS directly to
> userspace? I suppose we should be using PR_SPEC_DISABLE_NOEXEC to decide
> what we set the SSBS bit to on exec, but the logic here requires TIF_SSBD
> to be set and so won't trigger afaict.
>

You're right. The SSBS support is incomplete. I guess
"test_thread_flag(TIF_SSBD)" can be replaced just with
"arm64_get_ssbd_state() == ARM64_SSBD_KERNEL".
Thanks,
Anthony


Re: [PATCH v8 04/20] gpio: uapi: define uAPI v2

2020-09-20 Thread Kent Gibson
On Tue, Sep 15, 2020 at 01:06:36PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 9, 2020 at 1:29 PM Kent Gibson  wrote:
> >
> 
> Thanks again for doing this and sorry for quite a delay from my side.
> Below my comments regarding uAPI v2 (mostly nit-picks as I agree with
> it in general).
> 

No problems with most of your points - so I'll chop those bits out.
Responses to the remainder below...

> > +/**
> > + * enum gpio_v2_line_attr_id - &struct gpio_v2_line_attribute.id values
> 
> Perhaps per-item description (couple of words per each)?
> 

I'll add kernel doc for all the enum values - see if that works.

> > + * GPIO_V2_LINE_FLAG_OUTPUT etc, OR:ed together.  This is the default for
> 
> I think "OR:ed together" is understandable for programmers but a bit
> special for normal phrases. I would rephrase it somehow, but I'm not a
> native speaker.
> Something like "or any of their combinations".
> 

My bad - I was assuming the primary audience was programmers ;-).

Actually that description is drawn from the v1 uAPI, as is the case for
all of the fields that were carried over.

And "or any of their combinations" is misleading - some combinations are
invalid.

I'll take another look at it, but I'm ok with it as is. (and I'm ok with
'ok' over 'OK' or 'okay' as well, btw ;-)

> > + * @attrs: the configuration attributes associated with the requested
> > + * lines.  Any attribute should only be associated with a particular line
> > + * once.  If an attribute is associated with a line multiple times then the
> > + * first occurrence (i.e. lowest index) has precedence.
> 
> This is a bit unexpected. I anticipated last come last served as in a
> shell script argument list.
> 

I don't want to encourage userspace to just add another entry to attrs
to encode a change.
The behaviour was originally as you describe, but was changed in v3 to
encourage userspace to keep the configuration attributes as simplified as
possible, and to make clearer that the fields, particularly the flags bits,
are not overlaid.

> > +   /*
> > +* Pad to fill implicit padding and provide space for future use.
> > +*/
> 
> > +   __u32 padding[5];
> 
> This is still questionable. First of all why to waste so many bytes (I
> propose 5 -> 1) and how will you do when the structure changes (we
> still need some type of versioning or flags for which one u32 is more
> than enough).
> 

Ack - we disagree on how to manage future changes and the impact of
reserving space.

> > + * @num_lines: number of lines requested in this request, i.e. the number
> > + * of valid fields in the GPIO_V2_LINES_MAX sized arrays, set to 1 to
> > + * request a single line
> 
> I would rather call it "amount" or something which is one word, but
> you may have a reason for the current, so I don't insist.
> Also, I would describe what will be the expected behaviour outside of
> the range (and what is the range?).
> For example, would it mean that we consider all lines if num_lines >=
> _LINES_MAX?
> 

Using num_X to describe the number of active entries in array X is a
common pattern, so I'll stick with that.

And num_lines > _LINES_MAX is interpreted as invalid - since they won't
fit.  The EINVAL the user will get if they try should make that clear.

> > + * @consumer: a functional name for the consumer of this GPIO line as set
> > + * by whatever is using it, will be empty if there is no current user but
> > + * may also be empty if the consumer doesn't set this up
> 
> In both cases "empty" means what? All \0:s, only first \0, white spaces?
> 

Another one inherited from v1 - empty => consumer[0] == '\0'.

Cheers,
Kent.


possible deadlock in xfrm_policy_lookup_bytype

2020-09-20 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:5fa35f24 Add linux-next specific files for 20200916
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=16ec67fd90
kernel config:  https://syzkaller.appspot.com/x/.config?x=6bdb7e39caf48f53
dashboard link: https://syzkaller.appspot.com/bug?extid=4cbd5e3669aee5ac5149
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=17ca2c7390
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1366870190

The issue was bisected to:

commit 1909760f5fc3f123e47b4e24e0ccdc0fc8f3f106
Author: Ahmed S. Darwish 
Date:   Fri Sep 4 15:32:31 2020 +

seqlock: PREEMPT_RT: Do not starve seqlock_t writers

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=156b27dd90
final oops: https://syzkaller.appspot.com/x/report.txt?x=176b27dd90
console output: https://syzkaller.appspot.com/x/log.txt?x=136b27dd90

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+4cbd5e3669aee5ac5...@syzkaller.appspotmail.com
Fixes: 1909760f5fc3 ("seqlock: PREEMPT_RT: Do not starve seqlock_t writers")


WARNING: possible irq lock inversion dependency detected
5.9.0-rc5-next-20200916-syzkaller #0 Not tainted

syz-executor974/6847 just changed the state of lock:
8ae7a3c8 (&s->seqcount#9){+..-}-{0:0}, at: 
xfrm_policy_lookup_bytype+0x183/0xa40 net/xfrm/xfrm_policy.c:2088
but this lock took another, SOFTIRQ-unsafe lock in the past:
 (&s->seqcount#8){+.+.}-{0:0}


and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

   CPU0CPU1
   
  lock(&s->seqcount#8);
   local_irq_disable();
   lock(&s->seqcount#9);
   lock(&s->seqcount#8);
  
lock(&s->seqcount#9);

 *** DEADLOCK ***

4 locks held by syz-executor974/6847:
 #0: 8aae80a8 (rtnl_mutex){+.+.}-{3:3}, at: tun_detach 
drivers/net/tun.c:687 [inline]
 #0: 8aae80a8 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3a/0x180 
drivers/net/tun.c:3390
 #1: c9007d80 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: lockdep_copy_map 
include/linux/lockdep.h:35 [inline]
 #1: c9007d80 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: 
call_timer_fn+0xd5/0x6b0 kernel/time/timer.c:1403
 #2: 89e71cc0 (rcu_read_lock){}-{1:2}, at: read_pnet 
include/net/net_namespace.h:327 [inline]
 #2: 89e71cc0 (rcu_read_lock){}-{1:2}, at: dev_net 
include/linux/netdevice.h:2290 [inline]
 #2: 89e71cc0 (rcu_read_lock){}-{1:2}, at: mld_sendpack+0x165/0xdb0 
net/ipv6/mcast.c:1646
 #3: 89e71cc0 (rcu_read_lock){}-{1:2}, at: 
xfrm_policy_lookup_bytype+0x104/0xa40 net/xfrm/xfrm_policy.c:2082

the shortest dependencies between 2nd lock and 1st lock:
 -> (&s->seqcount#8){+.+.}-{0:0} {
HARDIRQ-ON-W at:
  lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5398
  write_seqcount_t_begin_nested include/linux/seqlock.h:509 
[inline]
  write_seqcount_t_begin include/linux/seqlock.h:535 
[inline]
  write_seqlock include/linux/seqlock.h:883 [inline]
  xfrm_set_spdinfo+0x302/0x660 net/xfrm/xfrm_user.c:1185
  xfrm_user_rcv_msg+0x414/0x700 net/xfrm/xfrm_user.c:2684
  netlink_rcv_skb+0x15a/0x430 net/netlink/af_netlink.c:2470
  xfrm_netlink_rcv+0x6b/0x90 net/xfrm/xfrm_user.c:2692
  netlink_unicast_kernel net/netlink/af_netlink.c:1304 
[inline]
  netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
  netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919
  sock_sendmsg_nosec net/socket.c:651 [inline]
  sock_sendmsg+0xcf/0x120 net/socket.c:671
  sys_sendmsg+0x6e8/0x810 net/socket.c:2362
  ___sys_sendmsg+0xf3/0x170 net/socket.c:2416
  __sys_sendmsg+0xe5/0x1b0 net/socket.c:2449
  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
SOFTIRQ-ON-W at:
  lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5398
  write_seqcount_t_begin_nested include/linux/seqlock.h:509 
[inline]
  write_seqcount_t_begin include/linux/seqlock.h:535 
[inline]
  write_seqlock include/linux/seqlock.h:883 [inline]
  xfrm_set_spdinfo+0x302/0x660 net/xfrm/xfrm_user.c:1185
  xfrm_user_rcv_msg+0x414/0x700 net/xfrm/xfrm_user.

[PATCH 1/2] mips: Add strong UC ordering config

2020-09-20 Thread Serge Semin
In accordance with [1, 2] memory transactions using CCA=2 (Uncached
Cacheability and Coherency Attribute) are always strongly ordered. This
means the younger memory accesses using CCA=2 are never allowed to be
executed before older memory accesses using CCA=2 (no bypassing is
allowed), and Loads and Stores using CCA=2 are never speculative. It is
expected by the specification that the rest of the system maintains these
properties for processor initiated uncached accesses. So the system IO
interconnect doesn't reorder uncached transactions once they have left the
processor subsystem. Taking into account these properties and what [3]
says about the relaxed IO-accessors we can infer that normal Loads and
Stores from/to CCA=2 memory and without any additional execution barriers
will fully comply with the {read,write}X_relaxed() methods requirements.

Let's then convert the currently generated relaxed IO-accessors into pure
Loads and Stores. Since commit 3d474dacae72 ("MIPS: Enforce strong
ordering for MMIO accessors") and commit 8b656253a7a4 ("MIPS: Provide
actually relaxed MMIO accessors") have already prepared the
corresponding macro, we can do that just by replacing the use of the
"barrier" parameter with the "relax" one. Note the "barrier" macro
argument can then be removed, since it isn't really used for anything
other than being always assigned to 1.

Of course it would be foolish to believe that all the available MIPS-based
CPUs completely follow the denoted specification, especially considering
how old the architecture is. Instead, we introduce a dedicated kernel
config which, when enabled, converts the relaxed IO-accessors into
pure Loads and Stores without any additional barriers around them. So if a
CPU supports strongly ordered UC memory accesses, its platform can enable
that config and use the fully optimized relaxed IO-methods. For instance,
the Baikal-T1 architecture support code will do that.

[1] MIPS Coherence Protocol Specification, Document Number: MD00605,
Revision 01.01. September 14, 2015, 4.2 Execution Order Behavior,
p. 33

[2] MIPS Coherence Protocol Specification, Document Number: MD00605,
Revision 01.01. September 14, 2015, 4.8.1 IO Device Access, p. 58

[3] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
Section "KERNEL I/O BARRIER EFFECTS"

Signed-off-by: Serge Semin 
Cc: Maciej W. Rozycki 
---
 arch/mips/Kconfig  |  8 
 arch/mips/include/asm/io.h | 20 ++--
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index c95fa3a2484c..2c82d927347d 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -2066,6 +2066,14 @@ config WEAK_ORDERING
 #
 config WEAK_REORDERING_BEYOND_LLSC
bool
+
+#
+# CPU may not reorder reads and writes R->R, R->W, W->R, W->W within Uncached
+# Cacheability and Coherency Attribute (CCA=2)
+#
+config STRONG_UC_ORDERING
+   bool
+
 endmenu
 
 #
diff --git a/arch/mips/include/asm/io.h b/arch/mips/include/asm/io.h
index 78537aa23500..130c4b6458fc 100644
--- a/arch/mips/include/asm/io.h
+++ b/arch/mips/include/asm/io.h
@@ -213,7 +213,7 @@ void iounmap(const volatile void __iomem *addr);
 #define war_io_reorder_wmb()   barrier()
 #endif
 
-#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, barrier, relax, irq)\
+#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, relax, irq) \
\
 static inline void pfx##write##bwlq(type val,  \
volatile void __iomem *mem) \
@@ -221,7 +221,7 @@ static inline void pfx##write##bwlq(type val,   
\
volatile type *__mem;   \
type __val; \
\
-   if (barrier)\
+   if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING)))  \
iobarrier_rw(); \
else\
war_io_reorder_wmb();   \
@@ -262,7 +262,7 @@ static inline type pfx##read##bwlq(const volatile void 
__iomem *mem)\
\
__mem = (void *)__swizzle_addr_##bwlq((unsigned long)(mem));\
\
-   if (barrier)\
+   if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING)))  \
iobarrier_rw(); \
\
if (sizeof(type) != sizeof(u64) || size

[PATCH 2/2] mips: Introduce MIPS CM2 GCR Control register accessors

2020-09-20 Thread Serge Semin
For some reason these accessors have been absent from the MIPS kernel,
even though some of them can be used to tune up the MIPS code execution
(the default values are fully acceptable though). For instance, on
MIPS P5600/P6600 (see [1] for details), if we are sure the IO
interconnect doesn't reorder the requests, we can freely set
GCR_CONTROL.SYNCDIS, which makes the CM2 respond to SYNCs just
after a request is accepted on the L2/Memory interface instead of
executing the legacy SYNC and waiting for the full response from
L2/Memory. Needless to say, this will significantly speed up the
{read,write}X() IO-accessors due to having more lightweight barriers
around the IO Loads and Stores. There are other MIPS Coherency Manager
optimizations available via that register, like cache ops
serialization limits, speculative read enable, etc., which can be useful
for various MIPS platforms.

[1] MIPS32 P5600 Multiprocessing System Software User's Manual,
Document Number: MD01025, Revision 01.60, April 19, 2016, p. 400

Signed-off-by: Serge Semin 

---

Folks, do you think it would be better to implement a dedicated config for
arch/mips/kernel/mips-cm.c code, which would disable the SI_SyncTxEn
acceptance by setting the GCR_CONTROL.SYNCDIS bit? Currently I intend to
set it in our platform-specific prom_init() method.
---
 arch/mips/include/asm/mips-cm.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/arch/mips/include/asm/mips-cm.h b/arch/mips/include/asm/mips-cm.h
index aeae2effa123..17b2adf57e0c 100644
--- a/arch/mips/include/asm/mips-cm.h
+++ b/arch/mips/include/asm/mips-cm.h
@@ -143,6 +143,21 @@ GCR_ACCESSOR_RW(64, 0x008, base)
 #define  CM_GCR_BASE_CMDEFTGT_IOCU02
 #define  CM_GCR_BASE_CMDEFTGT_IOCU13
 
+/* GCR_CONTROL - Global CM2 Settings */
+GCR_ACCESSOR_RW(64, 0x010, control)
+#define CM_GCR_CONTROL_SYNCCTL BIT(16)
+#define CM_GCR_CONTROL_SYNCDIS BIT(5)
+#define CM_GCR_CONTROL_IVU_EN  BIT(4)
+#define CM_GCR_CONTROL_SHST_EN BIT(3)
+#define CM_GCR_CONTROL_PARK_EN BIT(2)
+#define CM_GCR_CONTROL_MMIO_LIMIT_DIS  BIT(1)
+#define CM_GCR_CONTROL_SPEC_READ_ENBIT(0)
+
+/* GCR_CONTROL2 - Global CM2 Settings (continue) */
+GCR_ACCESSOR_RW(64, 0x018, control2)
+#define CM_GCR_CONTROL2_L2_CACHEOP_LIMIT   GENMASK(19, 16)
+#define CM_GCR_CONTROL2_L1_CACHEOP_LIMIT   GENMASK(3, 0)
+
 /* GCR_ACCESS - Controls core/IOCU access to GCRs */
 GCR_ACCESSOR_RW(32, 0x020, access)
 #define CM_GCR_ACCESS_ACCESSEN GENMASK(7, 0)
-- 
2.27.0



[PATCH 0/2] mips: Introduce some IO-accessors optimizations

2020-09-20 Thread Serge Semin
It has been discovered that on our MIPS P5600-based CPU the IO accessors
aren't as rapid as they could be, even taking into account a relatively
slow AXI2APB bridge embedded into the system interconnect. It turned out we
can introduce two types of optimizations. First, we can remove the
execution barriers from the relaxed IO-accessors, as our CPU conforms to
the MIPS Coherency Protocol Specification [1, 2]. Of course this also
concerns the IO interconnect implementation. So, in accordance with [3], we
suggest removing the barriers at least for the platforms which conform to
the specification the same way ours does. Second, there is a dedicated
Coherency Manager control register, which can also be used to tune the IO
methods up. For some reason it hasn't been added to the MIPS arch code so
far, while it provides flags, for instance, to speed up the SYNC barrier for
platforms with a non-reordering IO interconnect, to set the cache ops
serialization limits, to enable speculative reads, etc. For now we
suggest adding just the macros with the CM2 GCR_CONTROL register accessors
and field descriptions, so any platform could use them to activate the
corresponding optimization. Platform-wise, we'll do this in the framework
of our Baikal-T1 platform code in the prom_init() method.

[1] MIPS Coherence Protocol Specification, Document Number: MD00605,
Revision 01.01. September 14, 2015, 4.2 Execution Order Behavior,
p. 33

[2] MIPS Coherence Protocol Specification, Document Number: MD00605,
Revision 01.01. September 14, 2015, 4.8.1 IO Device Access, p. 58

[3] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
Section "KERNEL I/O BARRIER EFFECTS"

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: Vadim Vlasov 
Cc: Maciej W. Rozycki 
Cc: linux-m...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (2):
  mips: Add strong UC ordering config
  mips: Introduce MIPS CM2 GCR Control register accessors

 arch/mips/Kconfig   |  8 
 arch/mips/include/asm/io.h  | 20 ++--
 arch/mips/include/asm/mips-cm.h | 15 +++
 3 files changed, 33 insertions(+), 10 deletions(-)

-- 
2.27.0



[PATCH 1/3] hwmon: bt1-pvt: Test sensor power supply on probe

2020-09-20 Thread Serge Semin
The Baikal-T1 PVT sensor has got a dedicated power supply domain (fed by
the external GPVT/VPVT_18 pins). In case it isn't powered up, the
registers will be accessible, but the sensor conversion just won't happen.
Due to that, an attempt to read data from any PVT sensor will cause the
task to hang up.  For instance, that will happen if the XP11 jumper isn't
installed on the Baikal-T1-based BFK3.1 board. Let's at least test whether
the conversion works during the device probe procedure. Doing so will make
sure that the PVT sensor is powered up at least at boot time.

Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver")
Signed-off-by: Serge Semin 
---
 drivers/hwmon/bt1-pvt.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c
index 94698cae0497..f4b7353c078a 100644
--- a/drivers/hwmon/bt1-pvt.c
+++ b/drivers/hwmon/bt1-pvt.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -982,6 +983,41 @@ static int pvt_request_clks(struct pvt_hwmon *pvt)
return 0;
 }
 
+static int pvt_check_pwr(struct pvt_hwmon *pvt)
+{
+   unsigned long tout;
+   int ret = 0;
+   u32 data;
+
+   /*
+* Test out the sensor conversion functionality. If it is not done on
+* time then the domain must have been unpowered and we won't be able
+* to use the device later in this driver.
+* Note: if the power source is lost during normal driver operation, the
+* data read procedure will either return -ETIMEDOUT (for the
+* alarm-less driver configuration) or just stop the repeated
+* conversion. In the latter case, alas, we won't be able to detect the
+* problem.
+*/
+   pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_ALL, PVT_INTR_ALL);
+   pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, PVT_CTRL_EN);
+   pvt_set_tout(pvt, 0);
+   readl(pvt->regs + PVT_DATA);
+
+   tout = PVT_TOUT_MIN / NSEC_PER_USEC;
+   usleep_range(tout, 2 * tout);
+
+   data = readl(pvt->regs + PVT_DATA);
+   if (!(data & PVT_DATA_VALID)) {
+   ret = -ENODEV;
+   dev_err(pvt->dev, "Sensor is powered down\n");
+   }
+
+   pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
+
+   return ret;
+}
+
 static void pvt_init_iface(struct pvt_hwmon *pvt)
 {
u32 trim, temp;
@@ -1109,6 +1145,10 @@ static int pvt_probe(struct platform_device *pdev)
if (ret)
return ret;
 
+   ret = pvt_check_pwr(pvt);
+   if (ret)
+   return ret;
+
pvt_init_iface(pvt);
 
ret = pvt_request_irq(pvt);
-- 
2.27.0



[PATCH 0/3] hwmon: bt1-pvt: Fix PVT sensor being unpowered

2020-09-20 Thread Serge Semin
The Baikal-T1 PVT sensor has got a dedicated power domain with 1.8V intended
to be supplied via the external GPVT and VPVT_18 pins. In case the
power isn't supplied, the sensor IO registers will be accessible, but the
data conversion just won't happen. A loss of power
currently causes the temperature and voltage read procedures to hang
up. This patchset fixes the problem firstly by
checking whether the conversion works during the sensor probe procedure, and
secondly by keeping the current conversion update timeout in the private
driver data cache and using it to set the wait-completion timeout.

Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver")
Signed-off-by: Serge Semin 
Cc: Maxim Kaurkin 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: linux-m...@vger.kernel.org
Cc: linux-hw...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (3):
  hwmon: bt1-pvt: Test sensor power supply on probe
  hwmon: bt1-pvt: Cache current update timeout
  hwmon: bt1-pvt: Wait for the completion with timeout

 drivers/hwmon/bt1-pvt.c | 138 
 drivers/hwmon/bt1-pvt.h |   3 +
 2 files changed, 101 insertions(+), 40 deletions(-)

-- 
2.27.0



[PATCH 3/3] hwmon: bt1-pvt: Wait for the completion with timeout

2020-09-20 Thread Serge Semin
If the PVT sensor is suddenly powered down while a caller is waiting for
the conversion completion, the request won't be finished and the task will
hang up on this procedure until the power is back up again. Let's call the
wait_for_completion_timeout() method instead to prevent that. The cached
timeout is exactly what we need to predict how long a conversion could
normally last.

Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver")
Signed-off-by: Serge Semin 
---
 drivers/hwmon/bt1-pvt.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c
index 2600426a3b21..3e1d56585b91 100644
--- a/drivers/hwmon/bt1-pvt.c
+++ b/drivers/hwmon/bt1-pvt.c
@@ -477,6 +477,7 @@ static int pvt_read_data(struct pvt_hwmon *pvt, enum 
pvt_sensor_type type,
 long *val)
 {
struct pvt_cache *cache = &pvt->cache[type];
+   unsigned long timeout;
u32 data;
int ret;
 
@@ -500,7 +501,14 @@ static int pvt_read_data(struct pvt_hwmon *pvt, enum 
pvt_sensor_type type,
pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID, 0);
pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, PVT_CTRL_EN);
 
-   wait_for_completion(&cache->conversion);
+   /*
+* Wait with timeout since in case if the sensor is suddenly powered
+* down the request won't be completed and the caller will hang up on
+* this procedure until the power is back up again. Multiply the
+* timeout by the factor of two to prevent a false timeout.
+*/
+   timeout = 2 * usecs_to_jiffies(ktime_to_us(pvt->timeout));
+   ret = wait_for_completion_timeout(&cache->conversion, timeout);
 
pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID,
@@ -510,6 +518,9 @@ static int pvt_read_data(struct pvt_hwmon *pvt, enum 
pvt_sensor_type type,
 
mutex_unlock(&pvt->iface_mtx);
 
+   if (!ret)
+   return -ETIMEDOUT;
+
if (type == PVT_TEMP)
*val = pvt_calc_poly(&poly_N_to_temp, data);
else
-- 
2.27.0



[PATCH 2/3] hwmon: bt1-pvt: Cache current update timeout

2020-09-20 Thread Serge Semin
Instead of converting the update timeout data to milliseconds each
time in the read procedure, let's preserve the currently set timeout in a
dedicated driver private data cache. The cached value will then be used in
the timeout read method and in the alarm-less data conversion to prevent
the caller task from hanging up in case the PVT sensor is suddenly powered
down.

Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver")
Signed-off-by: Serge Semin 
---
 drivers/hwmon/bt1-pvt.c | 85 ++---
 drivers/hwmon/bt1-pvt.h |  3 ++
 2 files changed, 49 insertions(+), 39 deletions(-)

diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c
index f4b7353c078a..2600426a3b21 100644
--- a/drivers/hwmon/bt1-pvt.c
+++ b/drivers/hwmon/bt1-pvt.c
@@ -655,44 +655,16 @@ static int pvt_write_trim(struct pvt_hwmon *pvt, long val)
 
 static int pvt_read_timeout(struct pvt_hwmon *pvt, long *val)
 {
-   unsigned long rate;
-   ktime_t kt;
-   u32 data;
-
-   rate = clk_get_rate(pvt->clks[PVT_CLOCK_REF].clk);
-   if (!rate)
-   return -ENODEV;
-
-   /*
-* Don't bother with mutex here, since we just read data from MMIO.
-* We also have to scale the ticks timeout up to compensate the
-* ms-ns-data translations.
-*/
-   data = readl(pvt->regs + PVT_TTIMEOUT) + 1;
+   int ret;
 
-   /*
-* Calculate ref-clock based delay (Ttotal) between two consecutive
-* data samples of the same sensor. So we first must calculate the
-* delay introduced by the internal ref-clock timer (Tref * Fclk).
-* Then add the constant timeout cuased by each conversion latency
-* (Tmin). The basic formulae for each conversion is following:
-*   Ttotal = Tref * Fclk + Tmin
-* Note if alarms are enabled the sensors are polled one after
-* another, so in order to have the delay being applicable for each
-* sensor the requested value must be equally redistirbuted.
-*/
-#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
-   kt = ktime_set(PVT_SENSORS_NUM * (u64)data, 0);
-   kt = ktime_divns(kt, rate);
-   kt = ktime_add_ns(kt, PVT_SENSORS_NUM * PVT_TOUT_MIN);
-#else
-   kt = ktime_set(data, 0);
-   kt = ktime_divns(kt, rate);
-   kt = ktime_add_ns(kt, PVT_TOUT_MIN);
-#endif
+   ret = mutex_lock_interruptible(&pvt->iface_mtx);
+   if (ret)
+   return ret;
 
/* Return the result in msec as hwmon sysfs interface requires. */
-   *val = ktime_to_ms(kt);
+   *val = ktime_to_ms(pvt->timeout);
+
+   mutex_unlock(&pvt->iface_mtx);
 
return 0;
 }
@@ -700,7 +672,7 @@ static int pvt_read_timeout(struct pvt_hwmon *pvt, long 
*val)
 static int pvt_write_timeout(struct pvt_hwmon *pvt, long val)
 {
unsigned long rate;
-   ktime_t kt;
+   ktime_t kt, cache;
u32 data;
int ret;
 
@@ -713,7 +685,7 @@ static int pvt_write_timeout(struct pvt_hwmon *pvt, long 
val)
 * between all available sensors to have the requested delay
 * applicable to each individual sensor.
 */
-   kt = ms_to_ktime(val);
+   cache = kt = ms_to_ktime(val);
 #if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
kt = ktime_divns(kt, PVT_SENSORS_NUM);
 #endif
@@ -742,6 +714,7 @@ static int pvt_write_timeout(struct pvt_hwmon *pvt, long 
val)
return ret;
 
pvt_set_tout(pvt, data);
+   pvt->timeout = cache;
 
mutex_unlock(&pvt->iface_mtx);
 
@@ -1018,10 +991,17 @@ static int pvt_check_pwr(struct pvt_hwmon *pvt)
return ret;
 }
 
-static void pvt_init_iface(struct pvt_hwmon *pvt)
+static int pvt_init_iface(struct pvt_hwmon *pvt)
 {
+   unsigned long rate;
u32 trim, temp;
 
+   rate = clk_get_rate(pvt->clks[PVT_CLOCK_REF].clk);
+   if (!rate) {
+   dev_err(pvt->dev, "Invalid reference clock rate\n");
+   return -ENODEV;
+   }
+
/*
 * Make sure all interrupts and controller are disabled so not to
 * accidentally have ISR executed before the driver data is fully
@@ -1036,12 +1016,37 @@ static void pvt_init_iface(struct pvt_hwmon *pvt)
pvt_set_mode(pvt, pvt_info[pvt->sensor].mode);
pvt_set_tout(pvt, PVT_TOUT_DEF);
 
+   /*
+* Preserve the current ref-clock based delay (Ttotal) between the
+* sensors data samples in the driver data so not to recalculate it
+* each time on the data requests and timeout reads. It consists of the
+* delay introduced by the internal ref-clock timer (N / Fclk) and the
+* constant timeout caused by each conversion latency (Tmin):
+*   Ttotal = N / Fclk + Tmin
+* If alarms are enabled the sensors are polled one after another and
+* in order to get the next measurement of a particular sensor the
+* caller will have to wait for at most until all the others are
+

[rcu:dev.2020.09.17b] BUILD SUCCESS ef749b8a4cc43dcbdf6e5848acdc2cb8d6950b45

2020-09-20 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git  
dev.2020.09.17b
branch HEAD: ef749b8a4cc43dcbdf6e5848acdc2cb8d6950b45  fixup! locktorture: 
Prevent hangs for invalid arguments

elapsed time: 722m

configs tested: 110
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm defconfig
arm64allyesconfig
arm64   defconfig
arm  allyesconfig
arm  allmodconfig
powerpc   maple_defconfig
sh microdev_defconfig
armneponset_defconfig
powerpc kmeter1_defconfig
powerpc ps3_defconfig
powerpc tqm8541_defconfig
mips  pic32mzda_defconfig
armmmp2_defconfig
sparcalldefconfig
powerpcicon_defconfig
riscvnommu_k210_defconfig
sh kfr2r09-romimage_defconfig
m68k  atari_defconfig
powerpc rainier_defconfig
powerpc sbc8548_defconfig
mips   jazz_defconfig
powerpc pseries_defconfig
mips   capcella_defconfig
xtensa   alldefconfig
powerpc tqm8540_defconfig
powerpc mpc512x_defconfig
m68k   m5208evb_defconfig
csky alldefconfig
powerpc tqm8555_defconfig
h8300   h8s-sim_defconfig
sh  rsk7264_defconfig
sh   se7724_defconfig
mips  decstation_64_defconfig
mips   ci20_defconfig
powerpcsocrates_defconfig
shhp6xx_defconfig
shedosk7705_defconfig
arm pxa_defconfig
powerpc mpc832x_mds_defconfig
arm   h5000_defconfig
mips   ip32_defconfig
mips   ip22_defconfig
powerpc   allnoconfig
ia64 allmodconfig
ia64defconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
c6x  allyesconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
parisc   allyesconfig
s390defconfig
i386 allyesconfig
sparcallyesconfig
sparc   defconfig
i386defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
x86_64   randconfig-a005-20200920
x86_64   randconfig-a003-20200920
x86_64   randconfig-a004-20200920
x86_64   randconfig-a002-20200920
x86_64   randconfig-a006-20200920
x86_64   randconfig-a001-20200920
i386 randconfig-a002-20200920
i386 randconfig-a006-20200920
i386 randconfig-a003-20200920
i386 randconfig-a004-20200920
i386 randconfig-a005-20200920
i386 randconfig-a001-20200920
i386 randconfig-a012-20200920
i386 randconfig-a014-20200920
i386 randconfig-a016-20200920
i386 randconfig-a013-20200920
i386 randconfig-a011-20200920
i386 randconfig-a015-20200920
riscvallyesconfig
riscvnommu_virt_defconfig
riscv allnoconfig
riscv   defconfig
riscv  rv32_defconfig
riscvallmodconfig
x86_64   rhel
x86_64   allyesconfig

Re: [PATCH v7 07/20] gpiolib: cdev: support GPIO_V2_GET_LINE_IOCTL and GPIO_V2_LINE_GET_VALUES_IOCTL

2020-09-20 Thread Kent Gibson
On Tue, Sep 15, 2020 at 01:31:27PM +0300, Andy Shevchenko wrote:
> On Sat, Sep 5, 2020 at 4:49 PM Kent Gibson  wrote:
> >
> > Add support for requesting lines using the GPIO_V2_GET_LINE_IOCTL, and
> > returning their current values using GPIO_V2_LINE_GET_VALUES_IOCTL.
> >
> > The struct linereq implementation is based on the v1 struct linehandle
> > implementation.
> >
> > Signed-off-by: Kent Gibson 
> > ---
> >

[snip]
> 
> > +   /* Bias requires explicit direction. */
> > +   if ((flags & GPIO_V2_LINE_BIAS_FLAGS) &&
> > +   !(flags & GPIO_V2_LINE_DIRECTION_FLAGS))
> > +   return -EINVAL;
> 
> Why is this? If I request a line as is and after set a bias, should I
> really care about direction?
> 

Yeah, the user probably should be aware of the direction if they are setting
the bias, so this makes sure they are.

The practical reason is that gpiod_direction_output() or
gpiod_direction_input() have to be called to set the bias, and they are
only called if the corresponding flag is set.

> 
> > +   for (num_get = 0, i = 0; i < lr->num_lines; i++) {
> > +   if (lv.mask & BIT_ULL(i)) {
> 
> for_each_set_bit() ?
> 

No - lv.mask is u64, not unsigned long.
You could do a cast, but it would break on BE-32.
Sound familiar? - you caught me doing just that in your review of an earlier
version.

> > +   ulr.consumer[sizeof(ulr.consumer)-1] = '\0';
> > +   if (strlen(ulr.consumer)) {
> > +   lr->label = kstrdup(ulr.consumer, GFP_KERNEL);
> 
> Sounds like kstrndup()
> 

Been here before too - they differ slightly in that here lr->label is
left null if consumer is empty,  whereas kstrndup() would alloc one byte
just for the null terminator.

> > +   }
> > +
> > +   blocking_notifier_call_chain(&desc->gdev->notifier,
> > +GPIOLINE_CHANGED_REQUESTED, 
> > desc);
> > +
> 
> > +   dev_dbg(&gdev->dev, "registered chardev handle for line 
> > %d\n",
> > +   offset);
> 
> Hmm... I would rather see trace events / points than new dev_dbg() /
> pr_debug() calls.
> 

Agreed - it is on the TODO list.
I have looked at it and doing it properly would mean adding tracepoints
to gpiolib.c, and modifying the v1 code as well, so it is best done in a
separate patch later...

> > @@ -1104,6 +1505,25 @@ int gpiolib_cdev_register(struct gpio_device *gdev, 
> > dev_t devt)
> >  MAJOR(devt), gdev->id);
> >
> > return 0;
> > +   /*
> > +* array sizes must ensure 64-bit alignment and not create holes in
> > +* the struct packing.
> > +*/
> > +   BUILD_BUG_ON(IS_ALIGNED(GPIO_V2_LINES_MAX, 2));
> > +   BUILD_BUG_ON(IS_ALIGNED(GPIO_MAX_NAME_SIZE, 8));
> > +
> > +   /*
> > +* check that uAPI structs are 64-bit aligned for 32/64-bit
> > +* compatibility
> > +*/
> > +   BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_attribute), 8));
> > +   BUILD_BUG_ON(IS_ALIGNED(sizeof(struct 
> > gpio_v2_line_config_attribute), 8));
> > +   BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_config), 8));
> > +   BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_request), 8));
> > +   BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_info), 8));
> > +   BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_info_changed), 
> > 8));
> > +   BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_event), 8));
> > +   BUILD_BUG_ON(IS_ALIGNED(sizeof(struct gpio_v2_line_values), 8));
> 
> Can we use static_assert() at the top of the file? Presumably after
> inclusion block.
> 

Good idea - will do.

Cheers,
Kent.



[PATCH] clk: baikal-t1: Mark Ethernet PLL as critical

2020-09-20 Thread Serge Semin
We've discovered that disabling the so-called Ethernet PLL causes a reset of
the devices consuming its outgoing clock. The resets happen automatically
even if each underlying clock gate is turned off. Due to that we can't
disable the Ethernet PLL until the kernel is prepared for the corresponding
resets. So for now just mark the PLL clock provider as critical.

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: linux-m...@vger.kernel.org
---
 drivers/clk/baikal-t1/clk-ccu-pll.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/clk/baikal-t1/clk-ccu-pll.c 
b/drivers/clk/baikal-t1/clk-ccu-pll.c
index 1eec8c0b8f50..2445d4b12baf 100644
--- a/drivers/clk/baikal-t1/clk-ccu-pll.c
+++ b/drivers/clk/baikal-t1/clk-ccu-pll.c
@@ -51,11 +51,13 @@ struct ccu_pll_info {
 };
 
 /*
- * Mark as critical all PLLs except Ethernet one. CPU and DDR PLLs are sources
- * of CPU cores and DDR controller reference clocks, due to which they
- * obviously shouldn't be ever gated. SATA and PCIe PLLs are the parents of
- * APB-bus and DDR controller AXI-bus clocks. If they are gated the system will
- * be unusable.
+ * Alas we have to mark all PLLs as critical. CPU and DDR PLLs are sources of
+ * CPU cores and DDR controller reference clocks, due to which they obviously
+ * shouldn't be ever gated. SATA and PCIe PLLs are the parents of APB-bus and
+ * DDR controller AXI-bus clocks. If they are gated the system will be
+ * unusable. Moreover disabling SATA and Ethernet PLLs causes automatic reset
+ * of the corresponding subsystems. So until we are ready to re-initialize
+ * all the devices consuming those PLLs, they will be marked as critical too.
  */
 static const struct ccu_pll_info pll_info[] = {
CCU_PLL_INFO(CCU_CPU_PLL, "cpu_pll", "ref_clk", CCU_CPU_PLL_BASE,
@@ -67,7 +69,7 @@ static const struct ccu_pll_info pll_info[] = {
CCU_PLL_INFO(CCU_PCIE_PLL, "pcie_pll", "ref_clk", CCU_PCIE_PLL_BASE,
 CLK_IS_CRITICAL),
CCU_PLL_INFO(CCU_ETH_PLL, "eth_pll", "ref_clk", CCU_ETH_PLL_BASE,
-CLK_SET_RATE_GATE)
+CLK_IS_CRITICAL | CLK_SET_RATE_GATE)
 };
 
 struct ccu_pll_data {
-- 
2.27.0



[PATCH v3] mtd: physmap: Add Baikal-T1 physically mapped ROM support

2020-09-20 Thread Serge Semin
The Baikal-T1 Boot Controller provides access to RO storages, which are
physically mapped into the SoC MMIO space. In particular there is an
Internal ROM embedded into the SoC with pre-installed firmware, an
externally attached SPI flash (also accessed in read-only mode) and a
memory region, which mirrors one of them in accordance with the currently
enabled system boot mode (also called Boot ROM).

This commit adds the Internal ROM support to the physmap driver of the MTD
kernel subsystem. The driver will create the Internal ROM MTD as long as
it is defined in the system dts file. The physically mapped SPI flash
region will be used to implement the SPI-mem interface. The mirroring
memory region won't be accessible directly since it's redundant due to
both bootable regions being exposed anyway.

Note we had to create dedicated code for the ROMs since reads from the
corresponding memory regions must be done via dword-aligned addresses.

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: Lee Jones 
Cc: linux-m...@vger.kernel.org

---

Link: 
https://lore.kernel.org/linux-mtd/20200508100905.5854-1-sergey.se...@baikalelectronics.ru/
Changelog v2:
- Resend.

Link: 
https://lore.kernel.org/linux-mtd/20200526225849.20985-1-sergey.se...@baikalelectronics.ru
Changelog v3:
- Alphabetically order the include statements.
- Discard dummy IOs.
- Discard Baikal-T1 Boot ROM support. The directly mapped SPI flash
  memory will be used in the DW APB SSI driver instead.
---
 drivers/mtd/maps/Kconfig   |  11 +++
 drivers/mtd/maps/Makefile  |   1 +
 drivers/mtd/maps/physmap-bt1-rom.c | 126 +
 drivers/mtd/maps/physmap-bt1-rom.h |  17 
 drivers/mtd/maps/physmap-core.c|   5 ++
 5 files changed, 160 insertions(+)
 create mode 100644 drivers/mtd/maps/physmap-bt1-rom.c
 create mode 100644 drivers/mtd/maps/physmap-bt1-rom.h

diff --git a/drivers/mtd/maps/Kconfig b/drivers/mtd/maps/Kconfig
index fd37553f1b07..6650acbc961e 100644
--- a/drivers/mtd/maps/Kconfig
+++ b/drivers/mtd/maps/Kconfig
@@ -75,6 +75,17 @@ config MTD_PHYSMAP_OF
  physically into the CPU's memory. The mapping description here is
  taken from OF device tree.
 
+config MTD_PHYSMAP_BT1_ROM
+   bool "Baikal-T1 Boot ROMs OF-based physical memory map handling"
+   depends on MTD_PHYSMAP_OF
+   depends on MIPS_BAIKAL_T1 || COMPILE_TEST
+   select MTD_COMPLEX_MAPPINGS
+   select MULTIPLEXER
+   select MUX_MMIO
+   help
+ This provides some extra DT physmap parsing for the Baikal-T1
+ platforms, some detection and setting up ROMs-specific accessors.
+
 config MTD_PHYSMAP_VERSATILE
bool "ARM Versatile OF-based physical memory map handling"
depends on MTD_PHYSMAP_OF
diff --git a/drivers/mtd/maps/Makefile b/drivers/mtd/maps/Makefile
index c0da86a5d26f..79f018cf412f 100644
--- a/drivers/mtd/maps/Makefile
+++ b/drivers/mtd/maps/Makefile
@@ -18,6 +18,7 @@ obj-$(CONFIG_MTD_CK804XROM)   += ck804xrom.o
 obj-$(CONFIG_MTD_TSUNAMI)  += tsunami_flash.o
 obj-$(CONFIG_MTD_PXA2XX)   += pxa2xx-flash.o
 physmap-objs-y += physmap-core.o
+physmap-objs-$(CONFIG_MTD_PHYSMAP_BT1_ROM) += physmap-bt1-rom.o
 physmap-objs-$(CONFIG_MTD_PHYSMAP_VERSATILE) += physmap-versatile.o
 physmap-objs-$(CONFIG_MTD_PHYSMAP_GEMINI) += physmap-gemini.o
 physmap-objs-$(CONFIG_MTD_PHYSMAP_IXP4XX) += physmap-ixp4xx.o
diff --git a/drivers/mtd/maps/physmap-bt1-rom.c 
b/drivers/mtd/maps/physmap-bt1-rom.c
new file mode 100644
index ..27cfe1c63a2e
--- /dev/null
+++ b/drivers/mtd/maps/physmap-bt1-rom.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+ *
+ * Authors:
+ *   Serge Semin 
+ *
+ * Baikal-T1 Physically Mapped Internal ROM driver
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "physmap-bt1-rom.h"
+
+/*
+ * Baikal-T1 SoC ROMs are only accessible by the dword-aligned instructions.
+ * We have to take this into account when implementing the data read-methods.
+ * Note there is no need in bothering with endianness, since both Baikal-T1
+ * CPU and MMIO are LE.
+ */
+static map_word __xipram bt1_rom_map_read(struct map_info *map,
+ unsigned long ofs)
+{
+   void __iomem *src = map->virt + ofs;
+   unsigned long shift;
+   map_word ret;
+   u32 data;
+
+   /* Read data within offset dword. */
+   shift = (unsigned long)src & 0x3;
+   data = readl_relaxed(src - shift);
+   if (!shift) {
+   ret.x[0] = data;
+   return ret;
+   }
+   ret.x[0] = data >> (shift * BITS_PER_BYTE);
+
+   /* Read data from the next dword. */
+   shift = 4 - shift;
+   if (ofs + shift >= map->size)
+   return ret;
+
+   data = readl_relaxed(src + shift);
+   ret.x[0] |= d

[PATCH v2 11/11] spi: dw-dma: Add one-by-one SG list entries transfer

2020-09-20 Thread Serge Semin
In case at least one of the requested DMA engine channels doesn't
support hardware accelerated SG list entry traversal, the DMA driver
will most likely work that around by performing IRQ-based SG list
entry resubmission. That might and will cause a problem if the DMA Tx
channel is recharged and re-executed before the Rx DMA channel. Due to
non-deterministic IRQ-handler execution latency the DMA Tx channel will
start pushing data to the SPI bus before the Rx DMA channel is even
reinitialized with the next inbound SG list entry. By doing so the DMA
Tx channel will implicitly start filling the DW APB SSI Rx FIFO up, which
while the DMA Rx channel being recharged and re-executed will eventually
be overflown.

In order to solve the problem we have to feed the DMA engine with SG
list entries one-by-one. It shall keep the DW APB SSI Tx and Rx FIFOs
synchronized and prevent the Rx FIFO overflow. Since in general the SPI
tx_sg and rx_sg lists may have different numbers of entries of different
lengths (though the total lengths should match) we virtually split the
SG-lists into a set of DMA transfers, whose length is the minimum of the
ordered SG-entries' lengths.

The solution described above is only executed if a full-duplex SPI
transfer is requested and the DMA engine hasn't provided channels with
hardware accelerated SG list traverse capability to handle both SG
lists at once.

Signed-off-by: Serge Semin 
Suggested-by: Andy Shevchenko 
---
 drivers/spi/spi-dw-dma.c | 137 ++-
 drivers/spi/spi-dw.h |   1 +
 2 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index f333c2e23bf6..1cbb5a9efbba 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -72,6 +72,23 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
 }
 
+static void dw_spi_dma_sg_burst_init(struct dw_spi *dws)
+{
+   struct dma_slave_caps tx = {0}, rx = {0};
+
+   dma_get_slave_caps(dws->txchan, &tx);
+   dma_get_slave_caps(dws->rxchan, &rx);
+
+   if (tx.max_sg_burst > 0 && rx.max_sg_burst > 0)
+   dws->dma_sg_burst = min(tx.max_sg_burst, rx.max_sg_burst);
+   else if (tx.max_sg_burst > 0)
+   dws->dma_sg_burst = tx.max_sg_burst;
+   else if (rx.max_sg_burst > 0)
+   dws->dma_sg_burst = rx.max_sg_burst;
+   else
+   dws->dma_sg_burst = 0;
+}
+
 static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
 {
struct dw_dma_slave dma_tx = { .dst_id = 1 }, *tx = &dma_tx;
@@ -109,6 +126,8 @@ static int dw_spi_dma_init_mfld(struct device *dev, struct 
dw_spi *dws)
 
dw_spi_dma_maxburst_init(dws);
 
+   dw_spi_dma_sg_burst_init(dws);
+
return 0;
 
 free_rxchan:
@@ -138,6 +157,8 @@ static int dw_spi_dma_init_generic(struct device *dev, 
struct dw_spi *dws)
 
dw_spi_dma_maxburst_init(dws);
 
+   dw_spi_dma_sg_burst_init(dws);
+
return 0;
 }
 
@@ -466,11 +487,125 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
return ret;
 }
 
+/*
+ * In case if at least one of the requested DMA channels doesn't support the
+ * hardware accelerated SG list entries traverse, the DMA driver will most
+ * likely work that around by performing the IRQ-based SG list entries
+ * resubmission. That might and will cause a problem if the DMA Tx channel is
+ * recharged and re-executed before the Rx DMA channel. Due to
+ * non-deterministic IRQ-handler execution latency the DMA Tx channel will
+ * start pushing data to the SPI bus before the Rx DMA channel is even
+ * reinitialized with the next inbound SG list entry. By doing so the DMA Tx
+ * channel will implicitly start filling the DW APB SSI Rx FIFO up, which while
+ * the DMA Rx channel being recharged and re-executed will eventually be
+ * overflown.
+ *
+ * In order to solve the problem we have to feed the DMA engine with SG list
+ * entries one-by-one. It shall keep the DW APB SSI Tx and Rx FIFOs
+ * synchronized and prevent the Rx FIFO overflow. Since in general the tx_sg
+ * and rx_sg lists may have different number of entries of different lengths
+ * (though total length should match) let's virtually split the SG-lists to the
+ * set of DMA transfers, which length is a minimum of the ordered SG-entries
+ * lengths. An ASCII-sketch of the implemented algo is following:
+ *  xfer->len
+ *|___|
+ * tx_sg list:|___||__|
+ * rx_sg list:|_|||
+ * DMA transfers: |_|_|__|_|__|
+ *
+ * Note in order to have this workaround solving the denoted problem the DMA
+ * engine driver should properly initialize the max_sg_burst capability and set
+ * the DMA device max segment size parameter with maximum data block size the
+ * DMA engine supports.
+ */
+
+static int dw_spi_dma_transfer_one(struct dw_spi *dws,
+  struct s

[PATCH v2 04/11] spi: dw-dma: Check rx_buf availability in the xfer method

2020-09-20 Thread Serge Semin
Checking rx_buf for being NULL and returning NULL from the Rx-channel
preparation method doesn't let us distinguish that situation from
errors happening during the Rx SG-list preparation. So it's better to make
sure that rx_buf is not NULL and full-duplex communication is requested
prior to calling the Rx preparation method.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index da17897b8acb..d2a67dee1a66 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -369,9 +369,6 @@ static struct dma_async_tx_descriptor 
*dw_spi_dma_prepare_rx(struct dw_spi *dws,
 {
struct dma_async_tx_descriptor *rxdesc;
 
-   if (!xfer->rx_buf)
-   return NULL;
-
rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
xfer->rx_sg.sgl,
xfer->rx_sg.nents,
@@ -435,10 +432,12 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct 
spi_transfer *xfer)
return -EINVAL;
 
/* Prepare the RX dma transfer */
-   rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
+   if (xfer->rx_buf) {
+   rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
+   if (!rxdesc)
+   return -EINVAL;
 
-   /* rx must be started before tx due to spi instinct */
-   if (rxdesc) {
+   /* rx must be started before tx due to spi instinct */
set_bit(RX_BUSY, &dws->dma_chan_busy);
dmaengine_submit(rxdesc);
dma_async_issue_pending(dws->rxchan);
@@ -458,7 +457,7 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct 
spi_transfer *xfer)
return ret;
}
 
-   if (rxdesc && dws->master->cur_msg->status == -EINPROGRESS)
+   if (xfer->rx_buf && dws->master->cur_msg->status == -EINPROGRESS)
ret = dw_spi_dma_wait_rx_done(dws);
 
return ret;
-- 
2.27.0



[PATCH v2 03/11] spi: dw-dma: Configure the DMA channels in dma_setup

2020-09-20 Thread Serge Semin
Mainly this is a preparation patch before adding one-by-one DMA SG entries
transmission. But logically the Tx and Rx DMA channels setup should be
performed in the dma_setup() callback anyway. So we'll move the DMA slave
channels src/dst burst lengths, address and address width configuration
from the Tx/Rx channels preparation methods to the dedicated functions and
then make sure it's called at the DMA setup stage.

Note we now make sure the return value of the dmaengine_slave_config()
method doesn't indicate an error. It was unnecessary when the DW
DMAC is utilized as a DMA engine, since its device_config() callback
always returns zero (though that might change in the future). But since DW APB
SSI driver now supports any DMA back-end we must make sure the DMA device
configuration has been successful before proceeding with further setups.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 42 +---
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 1b013ac94a3f..da17897b8acb 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -256,11 +256,9 @@ static void dw_spi_dma_tx_done(void *arg)
complete(&dws->dma_completion);
 }
 
-static struct dma_async_tx_descriptor *
-dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_config_tx(struct dw_spi *dws)
 {
struct dma_slave_config txconf;
-   struct dma_async_tx_descriptor *txdesc;
 
memset(&txconf, 0, sizeof(txconf));
txconf.direction = DMA_MEM_TO_DEV;
@@ -270,7 +268,13 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct 
spi_transfer *xfer)
txconf.dst_addr_width = dw_spi_dma_convert_width(dws->n_bytes);
txconf.device_fc = false;
 
-   dmaengine_slave_config(dws->txchan, &txconf);
+   return dmaengine_slave_config(dws->txchan, &txconf);
+}
+
+static struct dma_async_tx_descriptor *
+dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+   struct dma_async_tx_descriptor *txdesc;
 
txdesc = dmaengine_prep_slave_sg(dws->txchan,
xfer->tx_sg.sgl,
@@ -345,14 +349,9 @@ static void dw_spi_dma_rx_done(void *arg)
complete(&dws->dma_completion);
 }
 
-static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi 
*dws,
-   struct spi_transfer *xfer)
+static int dw_spi_dma_config_rx(struct dw_spi *dws)
 {
struct dma_slave_config rxconf;
-   struct dma_async_tx_descriptor *rxdesc;
-
-   if (!xfer->rx_buf)
-   return NULL;
 
memset(&rxconf, 0, sizeof(rxconf));
rxconf.direction = DMA_DEV_TO_MEM;
@@ -362,7 +361,16 @@ static struct dma_async_tx_descriptor 
*dw_spi_dma_prepare_rx(struct dw_spi *dws,
rxconf.src_addr_width = dw_spi_dma_convert_width(dws->n_bytes);
rxconf.device_fc = false;
 
-   dmaengine_slave_config(dws->rxchan, &rxconf);
+   return dmaengine_slave_config(dws->rxchan, &rxconf);
+}
+
+static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi 
*dws,
+   struct spi_transfer *xfer)
+{
+   struct dma_async_tx_descriptor *rxdesc;
+
+   if (!xfer->rx_buf)
+   return NULL;
 
rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
xfer->rx_sg.sgl,
@@ -381,10 +389,22 @@ static struct dma_async_tx_descriptor 
*dw_spi_dma_prepare_rx(struct dw_spi *dws,
 static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 {
u16 imr, dma_ctrl;
+   int ret;
 
if (!xfer->tx_buf)
return -EINVAL;
 
+   /* Setup DMA channels */
+   ret = dw_spi_dma_config_tx(dws);
+   if (ret)
+   return ret;
+
+   if (xfer->rx_buf) {
+   ret = dw_spi_dma_config_rx(dws);
+   if (ret)
+   return ret;
+   }
+
/* Set the DMA handshaking interface */
dma_ctrl = SPI_DMA_TDMAE;
if (xfer->rx_buf)
-- 
2.27.0



Re: [PATCH v8 09/20] gpiolib: cdev: support edge detection for uAPI v2

2020-09-20 Thread Kent Gibson
On Tue, Sep 15, 2020 at 01:39:41PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 9, 2020 at 1:33 PM Kent Gibson  wrote:
> >
> > Add support for edge detection to lines requested using
> > GPIO_V2_GET_LINE_IOCTL.
> >
> > The edge_detector implementation is based on the v1 lineevent
> > implementation.
> 
> ...
> 
[snip]
> > +
> > +   /*
> > +* We may be running from a nested threaded interrupt in which case
> > +* we didn't get the timestamp from edge_irq_handler().
> > +*/
> 
> > +   if (!line->timestamp) {
> 
> Can it be positive conditional?
> 

Not sure what you mean - switch the order of the if/else?

> > +   le.timestamp = ktime_get_ns();
> > +   if (lr->num_lines != 1)
> > +   line->req_seqno = atomic_inc_return(&lr->seqno);
> > +   } else {
> > +   le.timestamp = line->timestamp;
> > +   }
> > +   line->timestamp = 0;
> > +
> > +   if (line->eflags == (GPIO_V2_LINE_FLAG_EDGE_RISING |
> > +GPIO_V2_LINE_FLAG_EDGE_FALLING)) {
> > +   int level = gpiod_get_value_cansleep(line->desc);
> > +
> > +   if (level)
> > +   /* Emit low-to-high event */
> > +   le.id = GPIO_V2_LINE_EVENT_RISING_EDGE;
> > +   else
> > +   /* Emit high-to-low event */
> > +   le.id = GPIO_V2_LINE_EVENT_FALLING_EDGE;
> > +   } else if (line->eflags == GPIO_V2_LINE_FLAG_EDGE_RISING) {
> > +   /* Emit low-to-high event */
> > +   le.id = GPIO_V2_LINE_EVENT_RISING_EDGE;
> > +   } else if (line->eflags == GPIO_V2_LINE_FLAG_EDGE_FALLING) {
> > +   /* Emit high-to-low event */
> > +   le.id = GPIO_V2_LINE_EVENT_FALLING_EDGE;
> > +   } else {
> > +   return IRQ_NONE;
> > +   }
> > +   line->line_seqno++;
> > +   le.line_seqno = line->line_seqno;
> > +   le.seqno = (lr->num_lines == 1) ? le.line_seqno : line->req_seqno;
> > +   le.offset = gpio_chip_hwgpio(line->desc);
> > +
> > +   ret = kfifo_in_spinlocked_noirqsave(&lr->events, &le,
> > +   1, &lr->wait.lock);
> > +   if (ret)
> > +   wake_up_poll(&lr->wait, EPOLLIN);
> > +   else
> 
> > +   pr_debug_ratelimited("event FIFO is full - event 
> > dropped\n");
> 
> Oh, can we really avoid printf() in IRQ context?
> 

Even in the IRQ thread?  Would a tracepoint be preferable?

Btw, this is drawn from the v1 implementation.

> > +
> > +static void edge_detector_stop(struct line *line)
> > +{
> 
> > +   if (line->irq) {
> > +   free_irq(line->irq, line);
> > +   line->irq = 0;
> > +   }
> 
> Perhaps
> 
> if (!line->irq)
>   return;
> 
> ?
> 

No - the function is extended in subsequent patches.  I usually make a
note of cases like this in the commentary, but missed this one.

> > +   if (!eflags)
> > +   return 0;
> > +
> > +   irq = gpiod_to_irq(line->desc);
> > +   if (irq <= 0)
> 
> > +   return -ENODEV;
> 
> Why shadowing actual error code?
> 

Another one drawn from the v1 implementation, so not sure.
gpiod_to_irq() can potentially return EINVAL, which is definitely not
appropriate to return, or ENXIO, which is actually more appropriate??

Cheers,
Kent.


[PATCH v2 02/11] spi: dw-dma: Fail DMA-based transfer if no Tx-buffer specified

2020-09-20 Thread Serge Semin
Since commit 46164fde6b78 ("spi: dw: Fix Rx-only DMA transfers"), if the DMA
interface is enabled, then a Tx-buffer must be available in each SPI
transfer. It's required since in order to activate the incoming data
reception either DMA or the CPU must be pushing data out to the SPI bus.
But the DW APB SSI DMA driver code is still left in a state as if the
Tx-buffer might be optional, which is no longer true. Let's fix it so an
error is returned if no Tx-buffer is detected and DMA Tx is always
enabled.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index a7ff1e357f8b..1b013ac94a3f 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -262,9 +262,6 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct 
spi_transfer *xfer)
struct dma_slave_config txconf;
struct dma_async_tx_descriptor *txdesc;
 
-   if (!xfer->tx_buf)
-   return NULL;
-
memset(&txconf, 0, sizeof(txconf));
txconf.direction = DMA_MEM_TO_DEV;
txconf.dst_addr = dws->dma_addr;
@@ -383,17 +380,19 @@ static struct dma_async_tx_descriptor 
*dw_spi_dma_prepare_rx(struct dw_spi *dws,
 
 static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 {
-   u16 imr = 0, dma_ctrl = 0;
+   u16 imr, dma_ctrl;
 
-   if (xfer->tx_buf)
-   dma_ctrl |= SPI_DMA_TDMAE;
+   if (!xfer->tx_buf)
+   return -EINVAL;
+
+   /* Set the DMA handshaking interface */
+   dma_ctrl = SPI_DMA_TDMAE;
if (xfer->rx_buf)
dma_ctrl |= SPI_DMA_RDMAE;
dw_writel(dws, DW_SPI_DMACR, dma_ctrl);
 
/* Set the interrupt mask */
-   if (xfer->tx_buf)
-   imr |= SPI_INT_TXOI;
+   imr = SPI_INT_TXOI;
if (xfer->rx_buf)
imr |= SPI_INT_RXUI | SPI_INT_RXOI;
spi_umask_intr(dws, imr);
@@ -412,6 +411,8 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct 
spi_transfer *xfer)
 
/* Prepare the TX dma transfer */
txdesc = dw_spi_dma_prepare_tx(dws, xfer);
+   if (!txdesc)
+   return -EINVAL;
 
/* Prepare the RX dma transfer */
rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
@@ -423,17 +424,15 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct 
spi_transfer *xfer)
dma_async_issue_pending(dws->rxchan);
}
 
-   if (txdesc) {
-   set_bit(TX_BUSY, &dws->dma_chan_busy);
-   dmaengine_submit(txdesc);
-   dma_async_issue_pending(dws->txchan);
-   }
+   set_bit(TX_BUSY, &dws->dma_chan_busy);
+   dmaengine_submit(txdesc);
+   dma_async_issue_pending(dws->txchan);
 
ret = dw_spi_dma_wait(dws, xfer);
if (ret)
return ret;
 
-   if (txdesc && dws->master->cur_msg->status == -EINPROGRESS) {
+   if (dws->master->cur_msg->status == -EINPROGRESS) {
ret = dw_spi_dma_wait_tx_done(dws, xfer);
if (ret)
return ret;
-- 
2.27.0



[PATCH v2 10/11] spi: dw-dma: Pass exact data to the DMA submit and wait methods

2020-09-20 Thread Serge Semin
In order to use the DMA submission and waiting methods in both generic
DMA-based SPI transfer and one-by-one DMA SG entries transmission
functions, we need to modify the dw_spi_dma_wait() and
dw_spi_dma_submit_tx()/dw_spi_dma_submit_rx() prototypes. So instead of
getting the SPI transfer object as the second argument they must accept
the exact data structure instances they are supposed to use. Those are the
current transfer length and the SPI bus frequency in the case of
dw_spi_dma_wait(), and the SG list together with the number of list entries
in the case of the DMA Tx/Rx submission methods.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 35 +--
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 935f073a3523..f333c2e23bf6 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -188,12 +188,12 @@ static enum dma_slave_buswidth 
dw_spi_dma_convert_width(u8 n_bytes)
return DMA_SLAVE_BUSWIDTH_UNDEFINED;
 }
 
-static int dw_spi_dma_wait(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_wait(struct dw_spi *dws, unsigned int len, u32 speed)
 {
unsigned long long ms;
 
-   ms = xfer->len * MSEC_PER_SEC * BITS_PER_BYTE;
-   do_div(ms, xfer->effective_speed_hz);
+   ms = len * MSEC_PER_SEC * BITS_PER_BYTE;
+   do_div(ms, speed);
ms += ms + 200;
 
if (ms > UINT_MAX)
@@ -268,17 +268,16 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws)
return dmaengine_slave_config(dws->txchan, &txconf);
 }
 
-static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct scatterlist *sgl,
+   unsigned int nents)
 {
struct dma_async_tx_descriptor *txdesc;
dma_cookie_t cookie;
int ret;
 
-   txdesc = dmaengine_prep_slave_sg(dws->txchan,
-   xfer->tx_sg.sgl,
-   xfer->tx_sg.nents,
-   DMA_MEM_TO_DEV,
-   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+   txdesc = dmaengine_prep_slave_sg(dws->txchan, sgl, nents,
+DMA_MEM_TO_DEV,
+DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!txdesc)
return -ENOMEM;
 
@@ -370,17 +369,16 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws)
return dmaengine_slave_config(dws->rxchan, &rxconf);
 }
 
-static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct scatterlist *sgl,
+   unsigned int nents)
 {
struct dma_async_tx_descriptor *rxdesc;
dma_cookie_t cookie;
int ret;
 
-   rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
-   xfer->rx_sg.sgl,
-   xfer->rx_sg.nents,
-   DMA_DEV_TO_MEM,
-   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+   rxdesc = dmaengine_prep_slave_sg(dws->rxchan, sgl, nents,
+DMA_DEV_TO_MEM,
+DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!rxdesc)
return -ENOMEM;
 
@@ -443,13 +441,14 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
int ret;
 
/* Submit the DMA Tx transfer */
-   ret = dw_spi_dma_submit_tx(dws, xfer);
+   ret = dw_spi_dma_submit_tx(dws, xfer->tx_sg.sgl, xfer->tx_sg.nents);
if (ret)
goto err_clear_dmac;
 
/* Submit the DMA Rx transfer if required */
if (xfer->rx_buf) {
-   ret = dw_spi_dma_submit_rx(dws, xfer);
+   ret = dw_spi_dma_submit_rx(dws, xfer->rx_sg.sgl,
+  xfer->rx_sg.nents);
if (ret)
goto err_clear_dmac;
 
@@ -459,7 +458,7 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
 
dma_async_issue_pending(dws->txchan);
 
-   ret = dw_spi_dma_wait(dws, xfer);
+   ret = dw_spi_dma_wait(dws, xfer->len, xfer->effective_speed_hz);
 
 err_clear_dmac:
dw_writel(dws, DW_SPI_DMACR, 0);
-- 
2.27.0



Re: [PATCH v2 1/3] MAINTAINERS: Add Fabrizio Castro to Renesas DRIF

2020-09-20 Thread Ramesh Shanmugasundaram
Hi Fabrizio,

>
> Renesas are expanding their DRIF support and offering,
> I'll be the internal maintainer for DRIF.
>
> Signed-off-by: Fabrizio Castro 
> Reviewed-by: Laurent Pinchart 

Thank you for volunteering :-).

Reviewed-by: Ramesh Shanmugasundaram 

Please feel free to take my name off the maintainers list as I have not
been spending time on this topic for a while now.

Thanks,
Ramesh

> ---
> v1->v2:
> * No change
>
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2575f449139a..d9ebaf0c179b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -10909,6 +10909,7 @@ F:  include/media/drv-intf/renesas-ceu.h
>
>  MEDIA DRIVERS FOR RENESAS - DRIF
>  M: Ramesh Shanmugasundaram 
> +M: Fabrizio Castro 
>  L: linux-me...@vger.kernel.org
>  L: linux-renesas-...@vger.kernel.org
>  S: Supported
> --
> 2.25.1
>


[PATCH v2 00/11] spi: dw-dma: Add max SG entries burst capability support

2020-09-20 Thread Serge Semin
Mainly this series is about fixing a very nasty problem discovered in the
DW APB SSI driver working in a couple with DW DMAC, which doesn't have
multi-block capability support (a.k.a. accelerated SG list entries
traversal/burst, or automatic LLP entries reload, etc.).

DW DMA IP-core and its DMA channels can be tuned at the synthesis stage by
setting a vast number of the model parameters, some of which are provided
to create a more optimized/performant controller. In particular two of
those parameters are DMAH_CHx_MULTI_BLK_EN and DMAH_CHx_HC_LLP. If at
least one of them is set to zero (false) then the target DMA controller
will be smaller and faster but will lack of the DMA blocks chaining
support. In the kernel notation it means that the controller won't be able
to automatically reload a next SG-list entry, when a current one is
finished. Since Linux kernel DMA subsystem interface requires to have the
SG-lists execution supported, the DW DMA engine driver is developed so as
to resubmit each SG-list entry one-by-one by software means: each
SG-list entry execution finish is tracked by the DW DMA controller
interrupt, which handler executes a tasklet, which then re-charges a DW
DMA channel with a next SG-list entry if one is available. Such
implementation is a normal design and works well for the most of the DW
DMAC client devices. But it may cause problems for devices, which send and
receive data by means of internal FIFOs. That requires having several DMA
channels working synchronously in order to prevent the FIFOs overflow.

A bright example of such device is the DW APB SSI controller. It has Tx
and Rx FIFOs, which first need to be filled in with data before data
sending out or receiving in. But those FIFOs are inter-dependent because
of the SPI nature and its shift-register design. So each sent over Tx FIFO
byte immediately causes getting a byte from the SPI bus into the Rx FIFO.
It makes a strategy of working with the SPI controller a bit tricky. The
more data we push into the Tx FIFO and the faster the SPI bus is, the more
careful we have to be in pulling data from Rx FIFO since if software or
DMA engine misses a moment when the Rx FIFO is full, for instance, due to
being busy with some other activity or due to being blocked if system bus
is busy with doing something else or just due to being too slow to keep up
with incoming data, then Rx FIFO will be overflown, which consequently
causes data loss. Needless to say, such a situation is fatal and mustn't
be tolerated for a bus like SPI.
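The overflow scenario described above can be modeled with a few lines of C. The shift-register coupling means every byte clocked out of the Tx FIFO immediately deposits a byte into the Rx FIFO; if the Rx side stalls longer than the remaining Rx FIFO headroom allows, data is dropped. The FIFO depth, byte counts, and names below are illustrative, not the real DW APB SSI parameters:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the Rx FIFO: every Tx byte pushed onto the bus lands here.
 * If the reader (the Rx DMA channel in the driver) stalls for longer than
 * the remaining headroom, bytes are dropped. */
struct rx_fifo {
	int level;      /* bytes currently queued */
	int depth;      /* total FIFO capacity */
	bool overflow;  /* set once a byte has been lost */
};

static void clock_bytes(struct rx_fifo *rx, int pushed, int drained)
{
	rx->level -= drained < rx->level ? drained : rx->level;
	rx->level += pushed;
	if (rx->level > rx->depth) {
		rx->level = rx->depth;
		rx->overflow = true;  /* fatal for a bus like SPI */
	}
}
```

As long as the drain rate keeps up with the push rate the FIFO never overflows; a single long stall (e.g. during an SG-list entry re-charge) is enough to lose data.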

In application to the DW APB SSI driver the problem above may happen when
DW DMAC is synthesized with no multi-block capability support and it's
enabled to be working with DW APB SSI for full-duplex transfers. DW APB
SSI driver allocates two DW DMAC channels to perform Tx and Rx SPI
transfers, initializes them, submits Tx and Rx SG-lists for execution and
then waits until the DMA transfers are finished. The issue happens when Rx
SG-list consists of more than one entry. Since the multi-block capability
is unsupported the DW DMAC driver will use the software-based SG-list
entries traverse implementation, which by design may cause
non-deterministic latency during the Rx SG-list entries re-charge. During
the entries re-charge procedure the DMA Tx channel will keep pushing data
into the SPI Tx FIFO. DW APB SSI controller in its turn will keep pushing
data out from the Tx FIFO to the SPI bus, and will immediately fill in the
Rx FIFO. So if the latency is big enough, then we'll eventually end up
with Rx FIFO overflow.

One possible solution to the problem denoted above is to feed the
DMA engine with the Tx and Rx SG-list entries one-by-one. This
patch-series provides an implementation of that approach. First it moves
the requested DMA channels configuration into the dma_setup() callback,
which should have been there in the first place. Then it's better to move
the DMA transfers submission into the DMA-preparation methods to collect
all the setups in a single method. After that the current implementation
of a straightforward SG-lists DMA transfer should be unpinned into a
dedicated method dw_spi_dma_transfer_all() since we are about to introduce
an alternative DMA-based SPI transfer approach. Since DMA-transfers finish
is now locally detected we can simplify the driver code a bit and move the
DMAC register cleanup to a single place in the dw_spi_dma_transfer_all()
method. In order to re-use the DMA-{wait, submit Tx, submit Rx} methods we
have to alter their prototypes, so they would accept SG-lists instead of
the SPI-transfer structure. Finally we introduce a new DMA-based
SPI-transfer method, which traverses the SG-list entries in a loop and
synchronously submits each of them to the Tx and Rx DMA channels, so the
DMA engine never needs to activate the SG-list entries re-charge
implementation that is error-prone in our case. That new method is utilized
only for the DMA controllers, which can't handle all T
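The one-by-one traversal strategy the cover letter describes can be sketched in plain C. Here `submit_rx`/`submit_tx`/`wait_done` are hypothetical stand-ins for the driver's DMA submission and wait helpers; they only record the call order, which is the point being illustrated — the CPU walks the SG list synchronously, so no multi-block (LLP reload) capability is required of the DMA controller:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char events[16][8];
static int nevents;

static void record(const char *op, int i)
{
	snprintf(events[nevents++], sizeof(events[0]), "%s%d", op, i);
}

/* Stand-ins for the driver's per-entry submission/wait helpers */
static void submit_rx(int i) { record("rx", i); }
static void submit_tx(int i) { record("tx", i); }
static void wait_done(int i) { record("w", i); }

/* One-by-one traversal: each Tx/Rx SG-entry pair is submitted and awaited
 * before the next one is touched, so the hardware never has to re-charge
 * an LLP chain on its own. */
static void transfer_one_by_one(int nents)
{
	for (int i = 0; i < nents; i++) {
		submit_rx(i);  /* Rx charged before Tx starts */
		submit_tx(i);
		wait_done(i);  /* next entry only after both channels finish */
	}
}
```

The trade-off is extra per-entry synchronization latency in exchange for never exposing the FIFO-coupled controller to a non-deterministic re-charge delay.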

[PATCH v2 05/11] spi: dw-dma: Move DMA transfers submission to the channels prep methods

2020-09-20 Thread Serge Semin
Indeed we can freely move the dmaengine_submit() method invocation and the
Tx and Rx busy flag setting into the DMA Tx/Rx prepare methods. Since the
Tx/Rx preparation method is now mainly used for the DMA transfers
submission, here we suggest renaming it to have the _submit_{r,t}x suffix
instead.

This alteration serves several purposes. First, it is another code
preparation before adding the one-by-one DMA SG entries transmission.
Second, the dma_async_tx_descriptor descriptor is now used locally
only in the new DMA transfers submission methods (this will be cleaned up
a bit later). Third, the generic transfer method gets more readable:
the submission, execution and wait procedures are now transparently
split up instead of having a preparation, an intermixed
submission/execution, and a wait procedure.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index d2a67dee1a66..769d10ca74b4 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -272,7 +272,7 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws)
 }
 
 static struct dma_async_tx_descriptor *
-dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *txdesc;
 
@@ -287,6 +287,9 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
txdesc->callback = dw_spi_dma_tx_done;
txdesc->callback_param = dws;
 
+   dmaengine_submit(txdesc);
+   set_bit(TX_BUSY, &dws->dma_chan_busy);
+
return txdesc;
 }
 
@@ -364,7 +367,7 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws)
return dmaengine_slave_config(dws->rxchan, &rxconf);
 }
 
-static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
+static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *rxdesc;
@@ -380,6 +383,9 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
rxdesc->callback = dw_spi_dma_rx_done;
rxdesc->callback_param = dws;
 
+   dmaengine_submit(rxdesc);
+   set_bit(RX_BUSY, &dws->dma_chan_busy);
+
return rxdesc;
 }
 
@@ -426,25 +432,21 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
struct dma_async_tx_descriptor *txdesc, *rxdesc;
int ret;
 
-   /* Prepare the TX dma transfer */
-   txdesc = dw_spi_dma_prepare_tx(dws, xfer);
+   /* Submit the DMA Tx transfer */
+   txdesc = dw_spi_dma_submit_tx(dws, xfer);
if (!txdesc)
return -EINVAL;
 
-   /* Prepare the RX dma transfer */
+   /* Submit the DMA Rx transfer if required */
if (xfer->rx_buf) {
-   rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
+   rxdesc = dw_spi_dma_submit_rx(dws, xfer);
if (!rxdesc)
return -EINVAL;
 
/* rx must be started before tx due to spi instinct */
-   set_bit(RX_BUSY, &dws->dma_chan_busy);
-   dmaengine_submit(rxdesc);
dma_async_issue_pending(dws->rxchan);
}
 
-   set_bit(TX_BUSY, &dws->dma_chan_busy);
-   dmaengine_submit(txdesc);
dma_async_issue_pending(dws->txchan);
 
ret = dw_spi_dma_wait(dws, xfer);
-- 
2.27.0



[PATCH v2 01/11] spi: dw-dma: Set DMA Level registers on init

2020-09-20 Thread Serge Semin
Indeed the registers content doesn't get cleared when the SPI controller
is disabled and enabled. Max burst lengths aren't changed since the Rx and
Tx DMA channels are requested on init stage and are kept acquired until
the device is removed. Obviously SPI controller FIFO depth can't be
changed. Due to all of that we can safely move the DMA Transmit and
Receive data level registers initialization to the SPI controller DMA init
stage (when the SPI controller is being probed) instead of doing it for
each SPI transfer when dma_setup is called. This shall speed the DMA-based
SPI transfer initialization up a bit, particularly if the APB bus is
relatively slow.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 28 +---
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index bb390ff67d1d..a7ff1e357f8b 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -49,6 +49,7 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
max_burst = RX_BURST_LEVEL;
 
dws->rxburst = min(max_burst, def_burst);
+   dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1);
 
ret = dma_get_slave_caps(dws->txchan, &caps);
if (!ret && caps.max_burst)
@@ -56,7 +57,19 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
else
max_burst = TX_BURST_LEVEL;
 
+   /*
+* Having a Rx DMA channel serviced with higher priority than a Tx DMA
+* channel might not be enough to provide a well balanced DMA-based
+* SPI transfer interface. There might still be moments when the Tx DMA
+* channel is occasionally handled faster than the Rx DMA channel.
+* That in its turn will eventually cause the SPI Rx FIFO overflow if
+* SPI bus speed is high enough to fill the SPI Rx FIFO in before it's
+* cleared by the Rx DMA channel. In order to fix the problem the Tx
+* DMA activity is intentionally slowed down by limiting the SPI Tx
+* FIFO depth with a value twice bigger than the Tx burst length.
+*/
dws->txburst = min(max_burst, def_burst);
+   dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
 }
 
 static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
@@ -372,21 +385,6 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 {
u16 imr = 0, dma_ctrl = 0;
 
-   /*
-* Having a Rx DMA channel serviced with higher priority than a Tx DMA
-* channel might not be enough to provide a well balanced DMA-based
-* SPI transfer interface. There might still be moments when the Tx DMA
-* channel is occasionally handled faster than the Rx DMA channel.
-* That in its turn will eventually cause the SPI Rx FIFO overflow if
-* SPI bus speed is high enough to fill the SPI Rx FIFO in before it's
-* cleared by the Rx DMA channel. In order to fix the problem the Tx
-* DMA activity is intentionally slowed down by limiting the SPI Tx
-* FIFO depth with a value twice bigger than the Tx burst length
-* calculated earlier by the dw_spi_dma_maxburst_init() method.
-*/
-   dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1);
-   dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
-
if (xfer->tx_buf)
dma_ctrl |= SPI_DMA_TDMAE;
if (xfer->rx_buf)
-- 
2.27.0



[PATCH v2 07/11] spi: dw-dma: Remove DMA Tx-desc passing around

2020-09-20 Thread Serge Semin
It's pointless to pass the Rx and Tx transfers DMA Tx-descriptors, since
they are used in the Tx/Rx submit method only. Instead just return the
submission status from these methods. This alteration will make the code
less complex.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 31 ++-
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index aa3900809126..9f70818acce6 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -271,8 +271,7 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws)
return dmaengine_slave_config(dws->txchan, &txconf);
 }
 
-static struct dma_async_tx_descriptor *
-dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *txdesc;
dma_cookie_t cookie;
@@ -284,7 +283,7 @@ dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
DMA_MEM_TO_DEV,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!txdesc)
-   return NULL;
+   return -ENOMEM;
 
txdesc->callback = dw_spi_dma_tx_done;
txdesc->callback_param = dws;
@@ -293,12 +292,12 @@ dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
ret = dma_submit_error(cookie);
if (ret) {
dmaengine_terminate_sync(dws->txchan);
-   return NULL;
+   return ret;
}
 
set_bit(TX_BUSY, &dws->dma_chan_busy);
 
-   return txdesc;
+   return 0;
 }
 
 static inline bool dw_spi_dma_rx_busy(struct dw_spi *dws)
@@ -375,8 +374,7 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws)
return dmaengine_slave_config(dws->rxchan, &rxconf);
 }
 
-static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
-   struct spi_transfer *xfer)
+static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *rxdesc;
dma_cookie_t cookie;
@@ -388,7 +386,7 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!rxdesc)
-   return NULL;
+   return -ENOMEM;
 
rxdesc->callback = dw_spi_dma_rx_done;
rxdesc->callback_param = dws;
@@ -397,12 +395,12 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
ret = dma_submit_error(cookie);
if (ret) {
dmaengine_terminate_sync(dws->rxchan);
-   return NULL;
+   return ret;
}
 
set_bit(RX_BUSY, &dws->dma_chan_busy);
 
-   return rxdesc;
+   return 0;
 }
 
 static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
@@ -445,19 +443,18 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 
 static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
 {
-   struct dma_async_tx_descriptor *txdesc, *rxdesc;
int ret;
 
/* Submit the DMA Tx transfer */
-   txdesc = dw_spi_dma_submit_tx(dws, xfer);
-   if (!txdesc)
-   return -EINVAL;
+   ret = dw_spi_dma_submit_tx(dws, xfer);
+   if (ret)
+   return ret;
 
/* Submit the DMA Rx transfer if required */
if (xfer->rx_buf) {
-   rxdesc = dw_spi_dma_submit_rx(dws, xfer);
-   if (!rxdesc)
-   return -EINVAL;
+   ret = dw_spi_dma_submit_rx(dws, xfer);
+   if (ret)
+   return ret;
 
/* rx must be started before tx due to spi instinct */
dma_async_issue_pending(dws->rxchan);
-- 
2.27.0



[PATCH v2 06/11] spi: dw-dma: Check DMA Tx-desc submission status

2020-09-20 Thread Serge Semin
We suggest adding a test of the dmaengine_submit() return value for errors.
It has been unnecessary while the driver was expected to be utilized in a
pair with the DW DMAC only. But since the driver can now be used with any
DMA engine, it might be useful to track errors on DMA submissions so as not
to miss them and end up with unpredictable driver behaviour.

Signed-off-by: Serge Semin 

---

Changelog v2:
- Replace negative conditional statements with the positive ones.
- Terminate the prepared descriptors on submission errors.
---
 drivers/spi/spi-dw-dma.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 769d10ca74b4..aa3900809126 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -275,6 +275,8 @@ static struct dma_async_tx_descriptor *
 dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *txdesc;
+   dma_cookie_t cookie;
+   int ret;
 
txdesc = dmaengine_prep_slave_sg(dws->txchan,
xfer->tx_sg.sgl,
@@ -287,7 +289,13 @@ dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
txdesc->callback = dw_spi_dma_tx_done;
txdesc->callback_param = dws;
 
-   dmaengine_submit(txdesc);
+   cookie = dmaengine_submit(txdesc);
+   ret = dma_submit_error(cookie);
+   if (ret) {
+   dmaengine_terminate_sync(dws->txchan);
+   return NULL;
+   }
+
set_bit(TX_BUSY, &dws->dma_chan_busy);
 
return txdesc;
@@ -371,6 +379,8 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *rxdesc;
+   dma_cookie_t cookie;
+   int ret;
 
rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
xfer->rx_sg.sgl,
@@ -383,7 +393,13 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
rxdesc->callback = dw_spi_dma_rx_done;
rxdesc->callback_param = dws;
 
-   dmaengine_submit(rxdesc);
+   cookie = dmaengine_submit(rxdesc);
+   ret = dma_submit_error(cookie);
+   if (ret) {
+   dmaengine_terminate_sync(dws->rxchan);
+   return NULL;
+   }
+
set_bit(RX_BUSY, &dws->dma_chan_busy);
 
return rxdesc;
-- 
2.27.0
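The check added by this patch relies on the dmaengine convention that a failed submission is reported through a negative cookie, which `dma_submit_error()` (in `include/linux/dmaengine.h`) turns into an errno code. A user-space model of that convention and of the patched submit path — `submit_checked` and its `terminated` flag are illustrative stand-ins, not driver code:

```c
#include <assert.h>
#include <errno.h>

typedef int dma_cookie_t;

/* Models the kernel helper: negative cookies carry an errno code,
 * non-negative ones identify a successfully queued descriptor. */
static inline int dma_submit_error(dma_cookie_t cookie)
{
	return cookie < 0 ? cookie : 0;
}

/* Models the patched flow: on a bad cookie the channel is terminated
 * (stubbed out via the flag here) and the error is propagated. */
static int submit_checked(dma_cookie_t cookie, int *terminated)
{
	int ret = dma_submit_error(cookie);

	if (ret) {
		*terminated = 1;  /* dmaengine_terminate_sync() in the driver */
		return ret;
	}
	return 0;
}
```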



[PATCH v2 08/11] spi: dw-dma: Detach DMA transfer into a dedicated method

2020-09-20 Thread Serge Semin
In order to add an alternative method of DMA-based SPI transfer first we
need to detach the currently available one from the common code. Here we
move the normal DMA-based SPI transfer execution functionality into a
dedicated method. It will be utilized if either the DMA engine supports
an unlimited number of SG entries or a Tx-only SPI transfer is requested. But
currently just use it for any SPI transfer.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 9f70818acce6..f2baefcae9ae 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -441,7 +441,8 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
return 0;
 }
 
-static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_transfer_all(struct dw_spi *dws,
+  struct spi_transfer *xfer)
 {
int ret;
 
@@ -462,7 +463,14 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
 
dma_async_issue_pending(dws->txchan);
 
-   ret = dw_spi_dma_wait(dws, xfer);
+   return dw_spi_dma_wait(dws, xfer);
+}
+
+static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+   int ret;
+
+   ret = dw_spi_dma_transfer_all(dws, xfer);
if (ret)
return ret;
 
-- 
2.27.0



[PATCH v2 09/11] spi: dw-dma: Move DMAC register cleanup to DMA transfer method

2020-09-20 Thread Serge Semin
DW APB SSI DMA driver doesn't use the native SPI core wait API since
commit bdbdf0f06337 ("spi: dw: Locally wait for the DMA transfers
completion"). Due to that the driver can now clear the DMAC register
in a single place synchronously with the DMA transactions completion
or failure. After that all the possible code paths are still covered:
1) DMA completion callbacks are executed in case if the corresponding DMA
transactions are finished. When they are, one of them will eventually wake
the SPI messages pump kernel thread and dw_spi_dma_transfer_all() method
will clean the DMAC register as implied by this patch.
2) dma_stop is called when the SPI core detects an error either returned
from the transfer_one() callback or set in the SPI message status field.
Both types of errors will be noticed by the dw_spi_dma_transfer_all()
method.
3) dma_exit is called when either SPI controller driver or the
corresponding device is removed. In any case the SPI core will first
flush the SPI messages pump kernel thread, so any pending or in-fly
SPI transfers will be finished before that.

Due to all of that let's simplify the DW APB SSI DMA driver a bit and
move the DMAC register cleanup to a single place in the
dw_spi_dma_transfer_all() method.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index f2baefcae9ae..935f073a3523 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -152,8 +152,6 @@ static void dw_spi_dma_exit(struct dw_spi *dws)
dmaengine_terminate_sync(dws->rxchan);
dma_release_channel(dws->rxchan);
}
-
-   dw_writel(dws, DW_SPI_DMACR, 0);
 }
 
 static irqreturn_t dw_spi_dma_transfer_handler(struct dw_spi *dws)
@@ -252,7 +250,6 @@ static void dw_spi_dma_tx_done(void *arg)
if (test_bit(RX_BUSY, &dws->dma_chan_busy))
return;
 
-   dw_writel(dws, DW_SPI_DMACR, 0);
complete(&dws->dma_completion);
 }
 
@@ -355,7 +352,6 @@ static void dw_spi_dma_rx_done(void *arg)
if (test_bit(TX_BUSY, &dws->dma_chan_busy))
return;
 
-   dw_writel(dws, DW_SPI_DMACR, 0);
complete(&dws->dma_completion);
 }
 
@@ -449,13 +445,13 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
/* Submit the DMA Tx transfer */
ret = dw_spi_dma_submit_tx(dws, xfer);
if (ret)
-   return ret;
+   goto err_clear_dmac;
 
/* Submit the DMA Rx transfer if required */
if (xfer->rx_buf) {
ret = dw_spi_dma_submit_rx(dws, xfer);
if (ret)
-   return ret;
+   goto err_clear_dmac;
 
/* rx must be started before tx due to spi instinct */
dma_async_issue_pending(dws->rxchan);
@@ -463,7 +459,12 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
 
dma_async_issue_pending(dws->txchan);
 
-   return dw_spi_dma_wait(dws, xfer);
+   ret = dw_spi_dma_wait(dws, xfer);
+
+err_clear_dmac:
+   dw_writel(dws, DW_SPI_DMACR, 0);
+
+   return ret;
 }
 
 static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
@@ -496,8 +497,6 @@ static void dw_spi_dma_stop(struct dw_spi *dws)
dmaengine_terminate_sync(dws->rxchan);
clear_bit(RX_BUSY, &dws->dma_chan_busy);
}
-
-   dw_writel(dws, DW_SPI_DMACR, 0);
 }
 
 static const struct dw_spi_dma_ops dw_spi_dma_mfld_ops = {
-- 
2.27.0
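The single-cleanup-point structure this patch gives `dw_spi_dma_transfer_all()` is the classic kernel goto-error-path idiom: every exit route runs through one label that clears the DMAC register. A compact sketch of that control flow — the register variable and error values are illustrative, not the real DW_SPI_DMACR semantics:

```c
#include <assert.h>

static unsigned int dmacr;  /* stand-in for the DW_SPI_DMACR register */

static int transfer_all(int tx_err, int rx_err)
{
	int ret;

	dmacr = 0x3;   /* Tx/Rx DMA handshaking enabled for this transfer */

	ret = tx_err;  /* submit the DMA Tx transfer */
	if (ret)
		goto err_clear_dmac;

	ret = rx_err;  /* submit the DMA Rx transfer */
	if (ret)
		goto err_clear_dmac;

	ret = 0;       /* wait for the DMA transfers completion */

err_clear_dmac:
	dmacr = 0;     /* single cleanup point on every path */
	return ret;
}
```

Whether the Tx submission fails, the Rx submission fails, or the wait completes, the register is cleared exactly once, which is what lets the patch drop the duplicate clears from the callbacks and dma_stop/dma_exit.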



[PATCH 01/30] spi: dw: Discard IRQ threshold macro

2020-09-20 Thread Serge Semin
The macro has been unused since half of the FIFO length was defined to be
the IRQ threshold marker. Let's remove its definition.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 90dfd21622d6..51bab30b9f85 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -92,9 +92,6 @@
 #define SPI_DMA_RDMAE  (1 << 0)
 #define SPI_DMA_TDMAE  (1 << 1)
 
-/* TX RX interrupt level threshold, max can be 256 */
-#define SPI_INT_THRESHOLD  32
-
 enum dw_ssi_type {
SSI_MOTO_SPI = 0,
SSI_TI_SSP,
-- 
2.27.0



[PATCH 12/30] spi: dw: Detach SPI device specific CR0 config method

2020-09-20 Thread Serge Semin
Indeed there is no point in detecting the SPI peripheral device parameters
and initializing the CR0 register fields each time an SPI transfer is
executed. Instead let's define a dedicated CR0 chip-data member, which
will be initialized in accordance with the SPI device settings at the
moment of setting it up.

By doing so we'll finally make the SPI device chip_data serve its intended
purpose: preserving the SPI-device-specific DW SPI configuration.
See spi-fsl-dspi.c, spi-pl022.c, spi-pxa2xx.c drivers for example of the
way the chip data is utilized.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 35 +++
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index c21641a485ce..a9644351d75f 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -27,6 +27,7 @@ struct chip_data {
u16 clk_div;/* baud rate divider */
u32 speed_hz;   /* baud rate */
 
+   u32 cr0;
u32 rx_sample_dly;  /* RX sample delay */
 };
 
@@ -228,31 +229,41 @@ static irqreturn_t dw_spi_irq(int irq, void *dev_id)
return dws->transfer_handler(dws);
 }
 
-static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi,
- struct spi_transfer *transfer)
+static u32 dw_spi_get_cr0(struct dw_spi *dws, struct spi_device *spi)
 {
-   struct chip_data *chip = spi_get_ctldata(spi);
-   u32 cr0;
-
-   cr0 = (transfer->bits_per_word - 1);
+   u32 cr0 = 0;
 
if (!(dws->caps & DW_SPI_CAP_DWC_SSI)) {
cr0 |= SSI_MOTO_SPI << SPI_FRF_OFFSET;
cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET;
cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET;
cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET;
-   cr0 |= chip->tmode << SPI_TMOD_OFFSET;
} else {
cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET;
	cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << DWC_SSI_CTRLR0_SCPOL_OFFSET;
	cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << DWC_SSI_CTRLR0_SCPH_OFFSET;
	cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET;
-   cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
 
if (dws->caps & DW_SPI_CAP_KEEMBAY_MST)
cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST;
}
 
+   return cr0;
+}
+
+static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi,
+ struct spi_transfer *transfer)
+{
+   struct chip_data *chip = spi_get_ctldata(spi);
+   u32 cr0 = chip->cr0;
+
+   cr0 |= (transfer->bits_per_word - 1);
+
+   if (!(dws->caps & DW_SPI_CAP_DWC_SSI))
+   cr0 |= chip->tmode << SPI_TMOD_OFFSET;
+   else
+   cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
+
dw_writel(dws, DW_SPI_CTRLR0, cr0);
 }
 
@@ -350,6 +361,7 @@ static void dw_spi_handle_err(struct spi_controller *master,
 /* This may be called twice for each spi dev */
 static int dw_spi_setup(struct spi_device *spi)
 {
+   struct dw_spi *dws = spi_controller_get_devdata(spi->controller);
struct chip_data *chip;
 
/* Only alloc on first setup */
@@ -373,6 +385,13 @@ static int dw_spi_setup(struct spi_device *spi)
dws->max_freq);
}
 
+   /*
+* Update CR0 data each time the setup callback is invoked since
+* the device parameters could have been changed, for instance, by
+* the MMC SPI driver or something else.
+*/
+   chip->cr0 = dw_spi_get_cr0(dws, spi);
+
chip->tmode = SPI_TMOD_TR;
 
return 0;
-- 
2.27.0
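The split this patch makes — device-dependent CR0 bits computed once at setup time and cached in the chip data, transfer-dependent fields OR-ed in per transfer — can be sketched as follows. The bit positions and macro names here are illustrative, not the real DW CTRLR0 layout:

```c
#include <assert.h>
#include <stdint.h>

#define MODE_CPOL  (1u << 6)   /* illustrative bit positions */
#define MODE_CPHA  (1u << 7)
#define TMOD_SHIFT 8

struct chip_data { uint32_t cr0; };

/* Setup time: SPI device parameters rarely change, so fold them in once
 * (mirrors dw_spi_get_cr0() being called from dw_spi_setup()). */
static void chip_setup(struct chip_data *chip, int cpol, int cpha)
{
	chip->cr0 = (cpol ? MODE_CPOL : 0) | (cpha ? MODE_CPHA : 0);
}

/* Per transfer: only the transfer-specific fields are added on top of
 * the cached value. */
static uint32_t cr0_for_transfer(const struct chip_data *chip,
				 uint32_t bits_per_word, uint32_t tmode)
{
	return chip->cr0 | (bits_per_word - 1) | tmode << TMOD_SHIFT;
}
```

The payoff is that the per-transfer hot path degenerates to a couple of OR operations instead of re-deriving every mode bit on each transfer.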



[PATCH 07/30] spi: dw: Use relaxed IO-methods to access FIFOs

2020-09-20 Thread Serge Semin
In accordance with [1] the relaxed methods are guaranteed to be ordered
with respect to other accesses from the same CPU thread to the same
peripheral.  This is what we need during the data read/write from/to the
controller FIFOs being executed within a single IRQ handler or a kernel
task.

Such an optimization significantly speeds up the data reader and writer.
For instance, using the relaxed IO accessors on Baikal-T1 lets the
driver support SPI memory operations at a bus frequency three times
higher than with the normal IO accessors.

[1] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
Section "KERNEL I/O BARRIER EFFECTS"

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw.h | 18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index ff77f39047ce..063fa1b1352d 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -161,29 +161,19 @@ static inline u32 dw_readl(struct dw_spi *dws, u32 offset)
return __raw_readl(dws->regs + offset);
 }
 
-static inline u16 dw_readw(struct dw_spi *dws, u32 offset)
-{
-   return __raw_readw(dws->regs + offset);
-}
-
 static inline void dw_writel(struct dw_spi *dws, u32 offset, u32 val)
 {
__raw_writel(val, dws->regs + offset);
 }
 
-static inline void dw_writew(struct dw_spi *dws, u32 offset, u16 val)
-{
-   __raw_writew(val, dws->regs + offset);
-}
-
 static inline u32 dw_read_io_reg(struct dw_spi *dws, u32 offset)
 {
switch (dws->reg_io_width) {
case 2:
-   return dw_readw(dws, offset);
+   return readw_relaxed(dws->regs + offset);
case 4:
default:
-   return dw_readl(dws, offset);
+   return readl_relaxed(dws->regs + offset);
}
 }
 
@@ -191,11 +181,11 @@ static inline void dw_write_io_reg(struct dw_spi *dws, u32 offset, u32 val)
 {
switch (dws->reg_io_width) {
case 2:
-   dw_writew(dws, offset, val);
+   writew_relaxed(val, dws->regs + offset);
break;
case 4:
default:
-   dw_writel(dws, offset, val);
+   writel_relaxed(val, dws->regs + offset);
break;
}
 }
-- 
2.27.0



[PATCH 10/30] spi: dw: Add KeemBay Master capability

2020-09-20 Thread Serge Semin
In a further commit we'll have to get rid of the update_cr0() callback and
define a DW SSI capability instead. Since Keem Bay master/slave
functionality is controlled by the CTRLR0 register bitfield, we need to
first move the master mode selection into the internal corresponding
update_cr0 method, which would be activated by means of the dedicated
DW_SPI_CAP_KEEMBAY_MST capability setup.

Note this will also be useful if the driver is ever altered to
support the DW SPI slave interface.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c |  4 
 drivers/spi/spi-dw-mmio.c | 20 +++-
 drivers/spi/spi-dw.h  |  8 
 3 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 78e5af8ed173..8f9737640ec1 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -252,6 +252,7 @@ u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master,
 struct spi_device *spi,
 struct spi_transfer *transfer)
 {
+   struct dw_spi *dws = spi_controller_get_devdata(master);
struct chip_data *chip = spi_get_ctldata(spi);
u32 cr0;
 
@@ -275,6 +276,9 @@ u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master,
/* CTRLR0[13] Shift Register Loop */
cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET;
 
+   if (dws->caps & DW_SPI_CAP_KEEMBAY_MST)
+   cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST;
+
return cr0;
 }
 EXPORT_SYMBOL_GPL(dw_spi_update_cr0_v1_01a);
diff --git a/drivers/spi/spi-dw-mmio.c b/drivers/spi/spi-dw-mmio.c
index 7111cb7ca23b..c0d351fde782 100644
--- a/drivers/spi/spi-dw-mmio.c
+++ b/drivers/spi/spi-dw-mmio.c
@@ -48,13 +48,6 @@ struct dw_spi_mmio {
 #define SPARX5_FORCE_ENA   0xa4
 #define SPARX5_FORCE_VAL   0xa8
 
-/*
- * For Keem Bay, CTRLR0[31] is used to select controller mode.
- * 0: SSI is slave
- * 1: SSI is master
- */
-#define KEEMBAY_CTRLR0_SSIC_IS_MST BIT(31)
-
 struct dw_spi_mscc {
struct regmap   *syscon;
void __iomem*spi_mst; /* Not sparx5 */
@@ -234,20 +227,13 @@ static int dw_spi_dwc_ssi_init(struct platform_device *pdev,
return 0;
 }
 
-static u32 dw_spi_update_cr0_keembay(struct spi_controller *master,
-struct spi_device *spi,
-struct spi_transfer *transfer)
-{
-   u32 cr0 = dw_spi_update_cr0_v1_01a(master, spi, transfer);
-
-   return cr0 | KEEMBAY_CTRLR0_SSIC_IS_MST;
-}
-
 static int dw_spi_keembay_init(struct platform_device *pdev,
   struct dw_spi_mmio *dwsmmio)
 {
+   dwsmmio->dws.caps = DW_SPI_CAP_KEEMBAY_MST;
+
/* Register hook to configure CTRLR0 */
-   dwsmmio->dws.update_cr0 = dw_spi_update_cr0_keembay;
+   dwsmmio->dws.update_cr0 = dw_spi_update_cr0_v1_01a;
 
return 0;
 }
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 4c748b2853a8..da9b543322c9 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -71,6 +71,13 @@
 #define DWC_SSI_CTRLR0_FRF_OFFSET  6
 #define DWC_SSI_CTRLR0_DFS_OFFSET  0
 
+/*
+ * For Keem Bay, CTRLR0[31] is used to select controller mode.
+ * 0: SSI is slave
+ * 1: SSI is master
+ */
+#define DWC_SSI_CTRLR0_KEEMBAY_MST BIT(31)
+
 /* Bit fields in SR, 7 bits */
#define SR_MASK				0x7f	/* cover 7 bits */
 #define SR_BUSY(1 << 0)
@@ -101,6 +108,7 @@ enum dw_ssi_type {
 
 /* DW SPI capabilities */
 #define DW_SPI_CAP_CS_OVERRIDE BIT(0)
+#define DW_SPI_CAP_KEEMBAY_MST BIT(1)
 
 struct dw_spi;
 struct dw_spi_dma_ops {
-- 
2.27.0
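The capability-flag pattern this series converges on — vendor quirks expressed as bits in a `caps` word tested by the common code instead of per-vendor callbacks — can be shown in a few lines. The flag names mirror the patch; the CR0 bit position matches the Keem Bay CTRLR0[31] master-select field described above, while the rest is a simplified sketch:

```c
#include <assert.h>
#include <stdint.h>

#define DW_SPI_CAP_CS_OVERRIDE (1u << 0)
#define DW_SPI_CAP_KEEMBAY_MST (1u << 1)

/* CTRLR0[31]: 0 = SSI is slave, 1 = SSI is master (Keem Bay) */
#define DWC_SSI_CTRLR0_KEEMBAY_MST (1u << 31)

/* Common code checks the caps word instead of calling a vendor hook. */
static uint32_t build_cr0(uint32_t caps)
{
	uint32_t cr0 = 0;

	if (caps & DW_SPI_CAP_KEEMBAY_MST)
		cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST;  /* SSI acts as master */
	return cr0;
}
```

Adding a new vendor peculiarity then costs one flag definition plus one conditional in the shared path, rather than a new callback wired through the platform glue.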



[PATCH 03/30] spi: dw: Initialize n_bytes before the memory barrier

2020-09-20 Thread Serge Semin
Since the n_bytes field of the DW SPI private data is also utilized by
the IRQ handler, we need to make sure its initialization is done before
the memory barrier.
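For reference, the n_bytes value being initialized here is just the round-up byte width of one transfer word. A minimal userspace sketch of that arithmetic (the dw_spi_n_bytes() helper name is illustrative, not driver code):

```c
#include <assert.h>

#define BITS_PER_BYTE 8
/* Round-up integer division, as the kernel's DIV_ROUND_UP() does. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Bytes occupied by one transfer word of the given bit width
 * (illustrative helper; the driver computes this inline). */
static unsigned int dw_spi_n_bytes(unsigned int bits_per_word)
{
	return DIV_ROUND_UP(bits_per_word, BITS_PER_BYTE);
}
```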

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 8b3ce5a0378a..1af74362461d 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -299,6 +299,7 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
 
dws->dma_mapped = 0;
spin_lock_irqsave(&dws->buf_lock, flags);
+   dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
dws->tx = (void *)transfer->tx_buf;
dws->tx_end = dws->tx + transfer->len;
dws->rx = transfer->rx_buf;
@@ -323,7 +324,6 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
}
 
transfer->effective_speed_hz = dws->max_freq / chip->clk_div;
-   dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
 
cr0 = dws->update_cr0(master, spi, transfer);
dw_writel(dws, DW_SPI_CTRLR0, cr0);
-- 
2.27.0



[PATCH 09/30] spi: dw: Convert CS-override to DW SPI capabilities

2020-09-20 Thread Serge Semin
There are several vendor-specific versions of the DW SPI controllers,
each of which may have some peculiarities with respect to the original
IP-core. Since supporting them has already caused flags and a callback
to be added to the DW SPI private data, let's introduce a generic
capabilities interface to tune the DW SPI controller driver up in
accordance with the particular controller specifics. It's done by
converting a simple
Alpine-specific CS-override capability into the DW SPI controller
capability activated by setting the DW_SPI_CAP_CS_OVERRIDE flag.
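The capability flags boil down to the usual bitmask test pattern. A small userspace sketch of it (the struct and helper here are illustrative; only the flag name mirrors the patch):

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/* DW SPI capability flag, as introduced by the patch. */
#define DW_SPI_CAP_CS_OVERRIDE	BIT(0)

struct dw_spi_sketch {
	uint32_t caps;	/* DW SPI capabilities bitmask */
};

/* Nonzero when the controller needs the Alpine CS-override fixup. */
static int needs_cs_override(const struct dw_spi_sketch *dws)
{
	return !!(dws->caps & DW_SPI_CAP_CS_OVERRIDE);
}
```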

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 4 ++--
 drivers/spi/spi-dw-mmio.c | 2 +-
 drivers/spi/spi-dw.h  | 7 ++-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index dd9574f9fafc..78e5af8ed173 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -104,7 +104,7 @@ void dw_spi_set_cs(struct spi_device *spi, bool enable)
 */
if (cs_high == enable)
dw_writel(dws, DW_SPI_SER, BIT(spi->chip_select));
-   else if (dws->cs_override)
+   else if (dws->caps & DW_SPI_CAP_CS_OVERRIDE)
dw_writel(dws, DW_SPI_SER, 0);
 }
 EXPORT_SYMBOL_GPL(dw_spi_set_cs);
@@ -435,7 +435,7 @@ static void spi_hw_init(struct device *dev, struct dw_spi 
*dws)
}
 
/* enable HW fixup for explicit CS deselect for Amazon's alpine chip */
-   if (dws->cs_override)
+   if (dws->caps & DW_SPI_CAP_CS_OVERRIDE)
dw_writel(dws, DW_SPI_CS_OVERRIDE, 0xF);
 }
 
diff --git a/drivers/spi/spi-dw-mmio.c b/drivers/spi/spi-dw-mmio.c
index 18772c0c9220..7111cb7ca23b 100644
--- a/drivers/spi/spi-dw-mmio.c
+++ b/drivers/spi/spi-dw-mmio.c
@@ -204,7 +204,7 @@ static int dw_spi_mscc_sparx5_init(struct platform_device 
*pdev,
 static int dw_spi_alpine_init(struct platform_device *pdev,
  struct dw_spi_mmio *dwsmmio)
 {
-   dwsmmio->dws.cs_override = 1;
+   dwsmmio->dws.caps = DW_SPI_CAP_CS_OVERRIDE;
 
/* Register hook to configure CTRLR0 */
dwsmmio->dws.update_cr0 = dw_spi_update_cr0;
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 4420b4d29bac..4c748b2853a8 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -2,6 +2,7 @@
 #ifndef DW_SPI_HEADER_H
 #define DW_SPI_HEADER_H
 
+#include 
 #include 
 #include 
 #include 
@@ -98,6 +99,9 @@ enum dw_ssi_type {
SSI_NS_MICROWIRE,
 };
 
+/* DW SPI capabilities */
+#define DW_SPI_CAP_CS_OVERRIDE BIT(0)
+
 struct dw_spi;
 struct dw_spi_dma_ops {
int (*dma_init)(struct device *dev, struct dw_spi *dws);
@@ -118,7 +122,8 @@ struct dw_spi {
u32 fifo_len;   /* depth of the FIFO buffer */
u32 max_freq;   /* max bus freq supported */
 
-   int cs_override;
+   u32 caps;   /* DW SPI capabilities */
+
u32 reg_io_width;   /* DR I/O width in bytes */
u16 bus_num;
u16 num_cs; /* supported slave numbers */
-- 
2.27.0



[PATCH 00/30] spi: dw: Add full Baikal-T1 SPI Controllers support

2020-09-20 Thread Serge Semin
Originally I intended to merge a dedicated Baikal-T1 System Boot SPI
Controller driver into the kernel and leave the DW APB SSI driver
untouched. But after a long discussion (see the link at the bottom of the
letter) Mark and Andy persuaded me to integrate what we developed there
into the DW APB SSI core driver to be useful for another controllers,
which may have got the same peculiarities/problems as ours:
- No IRQ.
- No DMA.
- No GPIO CS, so a native CS is utilized.
- small Tx/Rx FIFO depth.
- Automatic CS assertion/de-assertion.
- Slow system bus.
All of them have been fixed in the framework of this patchset, to some
extent at least for the SPI memory operations. As expected it wasn't
that easy, and the integration took as many patches as you can see from
the subject. Though some of them are mere cleanups or fixes only weakly
related to the subject, we just couldn't leave the code as is in some
places since we were working with the DW APB SSI driver anyway. Here is
what we did to fix the original DW APB SSI driver and make it less messy.

First two patches are just cleanups to simplify the DW APB SSI device
initialization a bit. We suggest discarding the IRQ threshold macro as
unused and using a ternary operator to initialize the set_cs callback
instead of assigning-and-updating it.

Then we've discovered that the n_bytes field of the driver private data is
used by the DW APB SSI IRQ handler, which requires it to be initialized
before the SMP memory barrier and to be visible from other CPUs. Speaking
about the SMP memory barrier. Having one right after the shared resources
initialization is enough and there is no point in using the spin-lock to
protect the Tx/Rx buffer pointers. The protection functionality is
redundant there by the driver design. (Though I have a doubt whether the
SMP memory barrier is also required there because the normal IO-methods
like readl/writel implies a full memory barrier. So any memory operations
performed before them are supposed to be seen by devices and other CPUs.
See the patch log for details of my concern.)

Thirdly we've found out that there is some confusion in the IRQs
masking/unmasking/clearing in the SPI-transfer procedure. Multiple interrupts
are unmasked on the SPI-transfer initialization, but only TXEI is
masked back on completion. Similarly the IRQ status isn't cleared on the
controller reset, which actually makes the reset incomplete and error
prone in the controller probe procedure.

Another very important optimization is using the IO-relaxed accessors in
the dw_read_io_reg()/dw_write_io_reg() methods. Since the Tx/Rx FIFO data
registers are the most frequently accessible controller resource, using
relaxed accessors there will significantly improve the data read/write
performance. At least on Baikal-T1 SoC such modification opens up a way to
have the DW APB SSI controller working at higher SPI bus speeds than
without it.

Fifthly we've made an effort to clean up the code using the SPI-device
private data - chip_data. We suggest removing the chip type from there
since it isn't used and isn't implemented right anyway. Then instead of
having a bus speed, clock divider, transfer mode preserved there, and
recalculating the CR0 fields of the SPI-device-specific phase, polarity
and frame format each time the SPI transfer is requested, we can save it
in the chip_data instance. By doing so we'll make that structure finally
used as it was supposed to by design (see the spi-fsl-dspi.c, spi-pl022.c,
spi-pxa2xx.c drivers for examples).

Sixthly instead of having the SPI-transfer specific CR0-update callback,
we suggest to implement the DW APB SSI controller capabilities approach.
By doing so we can now inject the vendor-specific peculiarities in
different parts of the DW APB SSI core driver (which is required to
implement both SPI-transfers and the SPI memory operations). This will
also make the code less confusing like defining a callback in the core
driver, setting it up in the glue layer, then calling it from the core
driver again. A small capabilities implementation embedded in-situ is
more readable than tracking the callback assignments. This
will concern the CS-override, Keembay master setup, DW SSI-specific CR0
registers layout capabilities.

Seventhly since there are going to be two types of transfers
implemented in the DW APB SSI core driver, we need a common method to set
the controller configuration like Tx/Rx-mode, bus speed, data frame size
and number of data frames to read in case of the memory operations. So we
just detached the corresponding code from the SPI-transfer-one method and
made it to be a part of the new dw_spi_update_config() function, which is
former update_cr0(). Note that the new method will be also useful for the
glue drivers, which due to the hardware design need to create their own
memory operations (for instance, for the dirmap-operations provided in the
Baikal-T System Boot SPI controller driver).

Eighthly 

[PATCH 02/30] spi: dw: Use ternary op to init set_cs callback

2020-09-20 Thread Serge Semin
Simplify the dw_spi_add_host() method a bit by replacing the set_cs
callback overwrite procedure with directly setting the callback if a
custom version of it is specified.
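The ternary form used here is the GNU `?:` extension: `a ?: b` yields `a` when it is non-NULL/nonzero, otherwise `b`, evaluating `a` only once. A tiny userspace sketch of the callback selection (function names are illustrative, not driver code):

```c
#include <assert.h>
#include <stddef.h>

typedef void (*set_cs_fn)(int cs, int enable);

static void default_set_cs(int cs, int enable) { (void)cs; (void)enable; }
static void custom_set_cs(int cs, int enable) { (void)cs; (void)enable; }

/* Pick the custom callback if one is set, else fall back to the default,
 * mirroring: master->set_cs = dws->set_cs ?: dw_spi_set_cs; */
static set_cs_fn pick_set_cs(set_cs_fn custom)
{
	return custom ?: default_set_cs;
}
```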

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 55afdcee7d2b..8b3ce5a0378a 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -482,7 +482,7 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
master->num_chipselect = dws->num_cs;
master->setup = dw_spi_setup;
master->cleanup = dw_spi_cleanup;
-   master->set_cs = dw_spi_set_cs;
+   master->set_cs = dws->set_cs ?: dw_spi_set_cs;
master->transfer_one = dw_spi_transfer_one;
master->handle_err = dw_spi_handle_err;
master->max_speed_hz = dws->max_freq;
@@ -491,9 +491,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
master->flags = SPI_MASTER_GPIO_SS;
master->auto_runtime_pm = true;
 
-   if (dws->set_cs)
-   master->set_cs = dws->set_cs;
-
/* Get default rx sample delay */
device_property_read_u32(dev, "rx-sample-delay-ns",
 &dws->def_rx_sample_dly_ns);
-- 
2.27.0



[PATCH 14/30] spi: dw: Simplify the SPI bus speed config procedure

2020-09-20 Thread Serge Semin
The code currently responsible for setting up the SPI communication
speed is a bit messy. Most likely for some historical reason the bus
frequency is saved in the peripheral chip private data. That's pointless
now since the custom communication speed is an SPI-transfer-specific
thing, and only if there is no SPI transfer data specified (like during
the SPI memory operations) can it be taken from the SPI device
structure. But even in the latter case there is no point in having the
clock divider and the SPI bus frequency saved in the chip data, because
the controller can be used for both SPI-transfer-based and
SPI-transfer-less communications. From a software point of view keeping
the current clock divider in an SPI-device-specific storage may give a
small performance gain (by sometimes avoiding a round-up division), but
saving a few CPU cycles isn't worth the harder-to-read code in
comparison to the total SPI transfer time. The only optimization worth
preserving in the code is to avoid unnecessary DW SPI controller
register updates when possible. So let's simplify the SPI communication
speed update procedure by removing the clock-related fields from the
peripheral chip data and updating the DW SPI clock divider only if it
has really changed. The latter change is achieved by keeping the
effective SPI bus speed in the internal DW SPI private data.
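The even-only divider rounding is simple enough to sketch in userspace C (the helper names are illustrative; the expressions mirror the patch):

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* DW APB SSI clock divider doesn't support odd numbers, so round the
 * required divider up to the nearest even value. */
static uint16_t dw_spi_clk_div(uint32_t max_freq, uint32_t speed_hz)
{
	return (DIV_ROUND_UP(max_freq, speed_hz) + 1) & 0xfffe;
}

/* Effective SPI bus speed actually produced by that divider. */
static uint32_t dw_spi_effective_hz(uint32_t max_freq, uint32_t speed_hz)
{
	return max_freq / dw_spi_clk_div(max_freq, speed_hz);
}
```

Caching the effective speed rather than the requested one is what lets the driver skip the register write when two different requests round to the same divider.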

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 23 ++-
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 0b80a16d9872..92138a6ada12 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -24,9 +24,6 @@
 struct chip_data {
u8 tmode;   /* TR/TO/RO/EEPROM */
 
-   u16 clk_div;/* baud rate divider */
-   u32 speed_hz;   /* baud rate */
-
u32 cr0;
u32 rx_sample_dly;  /* RX sample delay */
 };
@@ -256,6 +253,8 @@ static void dw_spi_update_config(struct dw_spi *dws, struct 
spi_device *spi,
 {
struct chip_data *chip = spi_get_ctldata(spi);
u32 cr0 = chip->cr0;
+   u32 speed_hz;
+   u16 clk_div;
 
cr0 |= (transfer->bits_per_word - 1);
 
@@ -266,15 +265,13 @@ static void dw_spi_update_config(struct dw_spi *dws, 
struct spi_device *spi,
 
dw_writel(dws, DW_SPI_CTRLR0, cr0);
 
-   /* Handle per transfer options for bpw and speed */
-   if (transfer->speed_hz != dws->current_freq) {
-   if (transfer->speed_hz != chip->speed_hz) {
-   /* clk_div doesn't support odd number */
-   chip->clk_div = (DIV_ROUND_UP(dws->max_freq, 
transfer->speed_hz) + 1) & 0xfffe;
-   chip->speed_hz = transfer->speed_hz;
-   }
-   dws->current_freq = transfer->speed_hz;
-   spi_set_clk(dws, chip->clk_div);
+   /* Note DW APB SSI clock divider doesn't support odd numbers */
+   clk_div = (DIV_ROUND_UP(dws->max_freq, transfer->speed_hz) + 1) & 
0xfffe;
+   speed_hz = dws->max_freq / clk_div;
+
+   if (dws->current_freq != speed_hz) {
+   spi_set_clk(dws, clk_div);
+   dws->current_freq = speed_hz;
}
 }
 
@@ -302,7 +299,7 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
 
dw_spi_update_config(dws, spi, transfer);
 
-   transfer->effective_speed_hz = dws->max_freq / chip->clk_div;
+   transfer->effective_speed_hz = dws->current_freq;
 
/* Check if current transfer is a DMA transaction */
if (master->can_dma && master->can_dma(master, spi, transfer))
-- 
2.27.0



[PATCH 08/30] spi: dw: Discard DW SSI chip type storages

2020-09-20 Thread Serge Semin
Keeping the SPI peripheral device type is pointless since, first, it
hasn't been functionally utilized by any of the client drivers/code and,
second, it won't work for the Microwire type at the very least. Moreover
there is no
point in setting up the type by means of the chip-data in the modern
kernel. The peripheral devices with specific interface type need to be
detected in order to activate the corresponding frame format. It most
likely will require some peripheral device specific DT property or
whatever to find out the interface protocol. So let's remove the serial
interface type fields from the DW APB SSI controller and the SPI
peripheral device private data.

Note we'll preserve the explicit SSI_MOTO_SPI interface type setup to
signify the only currently supported interface protocol.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 6 ++
 drivers/spi/spi-dw.h  | 1 -
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index be94ed5bb896..dd9574f9fafc 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -23,7 +23,6 @@
 /* Slave spi_dev related */
 struct chip_data {
u8 tmode;   /* TR/TO/RO/EEPROM */
-   u8 type;/* SPI/SSP/MicroWire */
 
u16 clk_div;/* baud rate divider */
u32 speed_hz;   /* baud rate */
@@ -238,7 +237,7 @@ u32 dw_spi_update_cr0(struct spi_controller *master, struct 
spi_device *spi,
 
/* Default SPI mode is SCPOL = 0, SCPH = 0 */
cr0 = (transfer->bits_per_word - 1)
-   | (chip->type << SPI_FRF_OFFSET)
+   | (SSI_MOTO_SPI << SPI_FRF_OFFSET)
| spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET) |
   (((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET) |
   (((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET))
@@ -260,7 +259,7 @@ u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master,
cr0 = (transfer->bits_per_word - 1);
 
/* CTRLR0[ 7: 6] Frame Format */
-   cr0 |= chip->type << DWC_SSI_CTRLR0_FRF_OFFSET;
+   cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET;
 
/*
 * SPI mode (SCPOL|SCPH)
@@ -453,7 +452,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
return -ENOMEM;
 
dws->master = master;
-   dws->type = SSI_MOTO_SPI;
dws->dma_addr = (dma_addr_t)(dws->paddr + DW_SPI_DR);
 
spi_controller_set_devdata(master, dws);
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 063fa1b1352d..4420b4d29bac 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -111,7 +111,6 @@ struct dw_spi_dma_ops {
 
 struct dw_spi {
struct spi_controller   *master;
-   enum dw_ssi_typetype;
 
void __iomem*regs;
unsigned long   paddr;
-- 
2.27.0



[PATCH 16/30] spi: dw: Add DW SPI controller config structure

2020-09-20 Thread Serge Semin
DW APB SSI controller can be used by the two SPI core interfaces:
traditional SPI transfers and SPI memory operations. The controller needs
to be accordingly configured at runtime when the corresponding operations
are executed. In order to do that for the both interfaces from a single
function we introduce a new data wrapper for the transfer mode, data
width, number of data frames (for the automatic data transfer) and the bus
frequency. It will be used by the update_config() method to tune the DW
APB SSI up.

The update_config() method is exported to be used not only by the DW
SPI core driver, but by the glue layer drivers too. This will be
required in a coming commit.
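For illustration, building such a wrapper with designated initializers might look as follows (the tmode constants here are placeholders for this sketch, not the driver's definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the dw_spi_cfg wrapper introduced by the patch. */
struct dw_spi_cfg {
	uint8_t tmode;	/* transfer mode */
	uint8_t dfs;	/* data frame size, bits */
	uint32_t ndf;	/* number of data frames (read-only modes) */
	uint32_t freq;	/* requested bus frequency, Hz */
};

/* Placeholder tmode values for this sketch only. */
enum { SKETCH_TMOD_TR = 0, SKETCH_TMOD_EPROMREAD = 3 };

/* Config a normal full-duplex SPI transfer would use: NDF stays zero. */
static struct dw_spi_cfg transfer_cfg(uint8_t bits, uint32_t hz)
{
	return (struct dw_spi_cfg){
		.tmode = SKETCH_TMOD_TR,
		.dfs = bits,
		.freq = hz,
	};
}
```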

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 29 +
 drivers/spi/spi-dw.h  | 10 ++
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index f00fc4828480..9102685c1523 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -20,10 +20,8 @@
 #include 
 #endif
 
-/* Slave spi_dev related */
+/* Slave spi_device related */
 struct chip_data {
-   u8 tmode;   /* TR/TO/RO/EEPROM */
-
u32 cr0;
u32 rx_sample_dly;  /* RX sample delay */
 };
@@ -248,25 +246,28 @@ static u32 dw_spi_get_cr0(struct dw_spi *dws, struct 
spi_device *spi)
return cr0;
 }
 
-static void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi,
-struct spi_transfer *transfer)
+void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi,
+ struct dw_spi_cfg *cfg)
 {
struct chip_data *chip = spi_get_ctldata(spi);
u32 cr0 = chip->cr0;
u32 speed_hz;
u16 clk_div;
 
-   cr0 |= (transfer->bits_per_word - 1);
+   cr0 |= (cfg->dfs - 1);
 
if (!(dws->caps & DW_SPI_CAP_DWC_SSI))
-   cr0 |= chip->tmode << SPI_TMOD_OFFSET;
+   cr0 |= cfg->tmode << SPI_TMOD_OFFSET;
else
-   cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
+   cr0 |= cfg->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
 
dw_writel(dws, DW_SPI_CTRLR0, cr0);
 
+   if (cfg->tmode == SPI_TMOD_EPROMREAD || cfg->tmode == SPI_TMOD_RO)
+   dw_writel(dws, DW_SPI_CTRLR1, cfg->ndf ? cfg->ndf - 1 : 0);
+
/* Note DW APB SSI clock divider doesn't support odd numbers */
-   clk_div = (DIV_ROUND_UP(dws->max_freq, transfer->speed_hz) + 1) & 
0xfffe;
+   clk_div = (DIV_ROUND_UP(dws->max_freq, cfg->freq) + 1) & 0xfffe;
speed_hz = dws->max_freq / clk_div;
 
if (dws->current_freq != speed_hz) {
@@ -280,11 +281,17 @@ static void dw_spi_update_config(struct dw_spi *dws, 
struct spi_device *spi,
dws->cur_rx_sample_dly = chip->rx_sample_dly;
}
 }
+EXPORT_SYMBOL_GPL(dw_spi_update_config);
 
 static int dw_spi_transfer_one(struct spi_controller *master,
struct spi_device *spi, struct spi_transfer *transfer)
 {
struct dw_spi *dws = spi_controller_get_devdata(master);
+   struct dw_spi_cfg cfg = {
+   .tmode = SPI_TMOD_TR,
+   .dfs = transfer->bits_per_word,
+   .freq = transfer->speed_hz,
+   };
u8 imask = 0;
u16 txlevel = 0;
int ret;
@@ -302,7 +309,7 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
 
spi_enable_chip(dws, 0);
 
-   dw_spi_update_config(dws, spi, transfer);
+   dw_spi_update_config(dws, spi, &cfg);
 
transfer->effective_speed_hz = dws->current_freq;
 
@@ -388,8 +395,6 @@ static int dw_spi_setup(struct spi_device *spi)
 */
chip->cr0 = dw_spi_get_cr0(dws, spi);
 
-   chip->tmode = SPI_TMOD_TR;
-
return 0;
 }
 
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index c02351cf2f99..2a2346438564 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -111,6 +111,14 @@ enum dw_ssi_type {
 #define DW_SPI_CAP_KEEMBAY_MST BIT(1)
 #define DW_SPI_CAP_DWC_SSI BIT(2)
 
+/* Slave spi_transfer/spi_mem_op related */
+struct dw_spi_cfg {
+   u8 tmode;
+   u8 dfs;
+   u32 ndf;
+   u32 freq;
+};
+
 struct dw_spi;
 struct dw_spi_dma_ops {
int (*dma_init)(struct device *dev, struct dw_spi *dws);
@@ -249,6 +257,8 @@ static inline void spi_shutdown_chip(struct dw_spi *dws)
 }
 
 extern void dw_spi_set_cs(struct spi_device *spi, bool enable);
+extern void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi,
+struct dw_spi_cfg *cfg);
 extern int dw_spi_add_host(struct device *dev, struct dw_spi *dws);
 extern void dw_spi_remove_host(struct dw_spi *dws);
 extern int dw_spi_suspend_host(struct dw_spi *dws);
-- 
2.27.0



[PATCH 27/30] spi: dw: Introduce max mem-ops SPI bus frequency setting

2020-09-20 Thread Serge Semin
In some circumstances the current implementation of the SPI memory
operations may occasionally fail even though they are executed in the
atomic context. This may happen if the system bus is relatively slow in
comparison to the SPI bus frequency, or there is a concurrent access to
it, which makes the MMIO operations occasionally stall before
push-pulling data from the DW APB SPI FIFOs. We've discovered these two
problems on the Baikal-T1 SoC. In order to fix them we have no choice
but to set an artificial limitation on the SPI bus speed.

Note currently this limitation will only be applicable to the memory
operations, since the standard SPI core interface is implemented with an
assumption that there is no problem with the automatic CS toggling.
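The speed cap amounts to a clamp plus a fallback when no mem-ops limit was provided by the glue driver. A userspace sketch of that logic (helper names are illustrative, not driver code):

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned clamp, standing in for the kernel's clamp() macro. */
static uint32_t clamp_u32(uint32_t val, uint32_t lo, uint32_t hi)
{
	if (val < lo)
		return lo;
	return val > hi ? hi : val;
}

/* Bus frequency to use for a memory operation: the device's max speed,
 * capped by the mem-ops limit (which defaults to the controller max). */
static uint32_t mem_op_freq(uint32_t dev_max_hz, uint32_t max_mem_freq,
			    uint32_t max_freq)
{
	if (!max_mem_freq)
		max_mem_freq = max_freq;
	return clamp_u32(dev_max_hz, 0, max_mem_freq);
}
```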

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 4 +++-
 drivers/spi/spi-dw.h  | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index ca22f427d82d..7b901226fd38 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -608,7 +608,7 @@ static int dw_spi_exec_mem_op(struct spi_mem *mem, const 
struct spi_mem_op *op)
 * operation. Transmit-only mode is suitable for the rest of them.
 */
cfg.dfs = 8;
-   cfg.freq = mem->spi->max_speed_hz;
+   cfg.freq = clamp(mem->spi->max_speed_hz, 0U, dws->max_mem_freq);
if (op->data.dir == SPI_MEM_DATA_IN) {
cfg.tmode = SPI_TMOD_EPROMREAD;
cfg.ndf = op->data.nbytes;
@@ -695,6 +695,8 @@ static void dw_spi_init_mem_ops(struct dw_spi *dws)
dws->mem_ops.adjust_op_size = dw_spi_adjust_mem_op_size;
dws->mem_ops.supports_op = dw_spi_supports_mem_op;
dws->mem_ops.exec_op = dw_spi_exec_mem_op;
+   if (!dws->max_mem_freq)
+   dws->max_mem_freq = dws->max_freq;
}
 }
 
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 4b08fe34a85d..dc5781236cc6 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -148,6 +148,7 @@ struct dw_spi {
unsigned long   paddr;
int irq;
u32 fifo_len;   /* depth of the FIFO buffer */
+   u32 max_mem_freq;   /* max mem-ops bus freq */
u32 max_freq;   /* max bus freq supported */
 
u32 caps;   /* DW SPI capabilities */
-- 
2.27.0



[PATCH 18/30] spi: dw: Refactor IRQ-based SPI transfer procedure

2020-09-20 Thread Serge Semin
The current IRQ-based SPI transfer execution procedure doesn't work well at
the final stage of the execution. If all the Tx data is sent out (written
to the Tx FIFO) but there is some data left to receive, the Tx FIFO Empty
IRQ will constantly happen until all of the requested inbound data is
received. Though only for a short period of time, it will make the
system less responsive. In order to fix that let's refactor the SPI transfer
execution procedure by taking the Rx FIFO Full IRQ into account. We'll read
and write SPI transfer data each time the IRQ happens as before. If all
the outbound data is sent out, we'll disable the Tx FIFO Empty IRQ. If
there is still some data to receive, we'll adjust the Rx FIFO Threshold
level, so the next IRQ would be raised at the moment of all incoming data
being available in the Rx FIFO.
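The threshold adjustment described above can be expressed as a small pure function over the leftover Rx length and the current RXFTLR value (a sketch only; register semantics as in the patch, where the Rx FIFO Full IRQ fires once the FIFO level exceeds RXFTLR):

```c
#include <assert.h>
#include <stdint.h>

/* New RXFTLR value for the final transfer stage, or the current value
 * when enough data is still expected. rx_left is in transfer words. */
static uint32_t dw_spi_adjust_rxftlr(uint32_t rx_left, uint32_t rxftlr)
{
	/* If fewer words than the threshold remain, lower the threshold so
	 * the leftover still raises the Rx FIFO Full interrupt. */
	if (rx_left && rx_left <= rxftlr)
		return rx_left - 1;
	return rxftlr;
}
```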

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 33 -
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 84c1fdfd6e52..682463b2f68b 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -189,17 +189,30 @@ static irqreturn_t interrupt_transfer(struct dw_spi *dws)
return IRQ_HANDLED;
}
 
+   /*
+* Read data from the Rx FIFO every time we've got a chance executing
+* this method. If there is nothing left to receive, terminate the
+* procedure. Otherwise adjust the Rx FIFO Threshold level if it's a
+* final stage of the transfer. By doing so we'll get the next IRQ
+* right when the leftover incoming data is received.
+*/
dw_reader(dws);
if (!dws->rx_len) {
spi_mask_intr(dws, 0xff);
spi_finalize_current_transfer(dws->master);
-   return IRQ_HANDLED;
+   } else if (dws->rx_len <= dw_readl(dws, DW_SPI_RXFTLR)) {
+   dw_writel(dws, DW_SPI_RXFTLR, dws->rx_len - 1);
}
+
+   /*
+* Send data out if Tx FIFO Empty IRQ is received. The IRQ will be
+* disabled after the data transmission is finished so not to
+* have the TXE IRQ flood at the final stage of the transfer.
+*/
if (irq_status & SPI_INT_TXEI) {
-   spi_mask_intr(dws, SPI_INT_TXEI);
dw_writer(dws);
-   /* Enable TX irq always, it will be disabled when RX finished */
-   spi_umask_intr(dws, SPI_INT_TXEI);
+   if (!dws->tx_len)
+   spi_mask_intr(dws, SPI_INT_TXEI);
}
 
return IRQ_HANDLED;
@@ -317,10 +330,6 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
/* For poll mode just disable all interrupts */
spi_mask_intr(dws, 0xff);
 
-   /*
-* Interrupt mode
-* we only need set the TXEI IRQ, as TX/RX always happen syncronizely
-*/
if (dws->dma_mapped) {
ret = dws->dma_ops->dma_setup(dws, transfer);
if (ret < 0) {
@@ -328,12 +337,18 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
return ret;
}
} else {
+   /*
+* Originally Tx and Rx data lengths match. Rx FIFO Threshold 
level
+* will be adjusted at the final stage of the IRQ-based SPI 
transfer
+* execution so not to lose the leftover of the incoming data.
+*/
txlevel = min_t(u16, dws->fifo_len / 2, dws->tx_len);
dw_writel(dws, DW_SPI_TXFTLR, txlevel);
+   dw_writel(dws, DW_SPI_RXFTLR, txlevel - 1);
 
/* Set the interrupt mask */
imask |= SPI_INT_TXEI | SPI_INT_TXOI |
-SPI_INT_RXUI | SPI_INT_RXOI;
+SPI_INT_RXUI | SPI_INT_RXOI | SPI_INT_RXFI;
spi_umask_intr(dws, imask);
 
dws->transfer_handler = interrupt_transfer;
-- 
2.27.0



[PATCH 19/30] spi: dw: Perform IRQ setup in a dedicated function

2020-09-20 Thread Serge Semin
In order to make the transfer_one() callback method more readable and
for unification with the DMA-based transfer, let's detach the IRQ setup
procedure into a dedicated function. While at it, rename the IRQ-based
transfer handler function to be dw_spi-prefixed and to look more like
the DMA-related one.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 41 ++-
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 682463b2f68b..08bc53b9de88 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -178,7 +178,7 @@ static void int_error_stop(struct dw_spi *dws, const char 
*msg)
spi_finalize_current_transfer(dws->master);
 }
 
-static irqreturn_t interrupt_transfer(struct dw_spi *dws)
+static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws)
 {
u16 irq_status = dw_readl(dws, DW_SPI_ISR);
 
@@ -294,6 +294,27 @@ void dw_spi_update_config(struct dw_spi *dws, struct 
spi_device *spi,
 }
 EXPORT_SYMBOL_GPL(dw_spi_update_config);
 
+static void dw_spi_irq_setup(struct dw_spi *dws)
+{
+   u16 level;
+   u8 imask;
+
+   /*
+* Originally Tx and Rx data lengths match. Rx FIFO Threshold level
+* will be adjusted at the final stage of the IRQ-based SPI transfer
+* execution so not to lose the leftover of the incoming data.
+*/
+   level = min_t(u16, dws->fifo_len / 2, dws->tx_len);
+   dw_writel(dws, DW_SPI_TXFTLR, level);
+   dw_writel(dws, DW_SPI_RXFTLR, level - 1);
+
+   imask = SPI_INT_TXEI | SPI_INT_TXOI | SPI_INT_RXUI | SPI_INT_RXOI |
+   SPI_INT_RXFI;
+   spi_umask_intr(dws, imask);
+
+   dws->transfer_handler = dw_spi_transfer_handler;
+}
+
 static int dw_spi_transfer_one(struct spi_controller *master,
struct spi_device *spi, struct spi_transfer *transfer)
 {
@@ -303,8 +324,6 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
.dfs = transfer->bits_per_word,
.freq = transfer->speed_hz,
};
-   u8 imask = 0;
-   u16 txlevel = 0;
int ret;
 
dws->dma_mapped = 0;
@@ -337,21 +356,7 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
return ret;
}
} else {
-   /*
-* Originally Tx and Rx data lengths match. Rx FIFO Threshold 
level
-* will be adjusted at the final stage of the IRQ-based SPI 
transfer
-* execution so not to lose the leftover of the incoming data.
-*/
-   txlevel = min_t(u16, dws->fifo_len / 2, dws->tx_len);
-   dw_writel(dws, DW_SPI_TXFTLR, txlevel);
-   dw_writel(dws, DW_SPI_RXFTLR, txlevel - 1);
-
-   /* Set the interrupt mask */
-   imask |= SPI_INT_TXEI | SPI_INT_TXOI |
-SPI_INT_RXUI | SPI_INT_RXOI | SPI_INT_RXFI;
-   spi_umask_intr(dws, imask);
-
-   dws->transfer_handler = interrupt_transfer;
+   dw_spi_irq_setup(dws);
}
 
spi_enable_chip(dws, 1);
-- 
2.27.0



[PATCH 05/30] spi: dw: Clear IRQ status on DW SPI controller reset

2020-09-20 Thread Serge Semin
It turns out the IRQ status isn't cleared after switching the controller
off and getting it back on, which may cause false error interrupts to be
raised if the controller has been unsuccessfully used by, for instance,
a bootloader before the driver is loaded. Let's explicitly clear the
interrupt status in the dedicated controller reset method.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw.h | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 1ab704d1ebd8..ff77f39047ce 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -229,14 +229,15 @@ static inline void spi_umask_intr(struct dw_spi *dws, u32 
mask)
 }
 
 /*
- * This does disable the SPI controller, interrupts, and re-enable the
- * controller back. Transmit and receive FIFO buffers are cleared when the
- * device is disabled.
+ * This disables the SPI controller, interrupts, clears the interrupts status,
+ * and re-enable the controller back. Transmit and receive FIFO buffers are
+ * cleared when the device is disabled.
  */
 static inline void spi_reset_chip(struct dw_spi *dws)
 {
spi_enable_chip(dws, 0);
spi_mask_intr(dws, 0xff);
+   dw_readl(dws, DW_SPI_ICR);
spi_enable_chip(dws, 1);
 }
 
-- 
2.27.0



[PATCH 17/30] spi: dw: Refactor data IO procedure

2020-09-20 Thread Serge Semin
The Tx and Rx data write/read procedure can be significantly simplified by
using Tx/Rx transfer lengths instead of the end pointers. By having the
Tx/Rx data leftover lengths (in the number of transfer words) we can get
rid of all subtraction and division operations utilized here and there in
the tx_max(), rx_max(), dw_writer() and dw_reader() methods. Such
modification will not only give us more optimized IO procedures, but
will also make the data IO methods much more readable than before.
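With lengths instead of end pointers, the Tx room calculation collapses into a single min3(). A userspace sketch of the reworked tx_max() arithmetic (function names are illustrative; the expressions mirror the patch):

```c
#include <assert.h>
#include <stdint.h>

static uint32_t min3_u32(uint32_t a, uint32_t b, uint32_t c)
{
	uint32_t m = a < b ? a : b;
	return m < c ? m : c;
}

/* Max words we may push to the Tx FIFO, bounded by the data left to
 * send, the free Tx FIFO entries, and the Rx/Tx gap: rx_len - tx_len
 * words are already in flight and must fit into the Rx FIFO. */
static uint32_t tx_max_words(uint32_t tx_len, uint32_t tx_fifo_level,
			     uint32_t rx_len, uint32_t fifo_len)
{
	uint32_t tx_room = fifo_len - tx_fifo_level;
	uint32_t rxtx_gap = fifo_len - (rx_len - tx_len);

	return min3_u32(tx_len, tx_room, rxtx_gap);
}
```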

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 37 +
 drivers/spi/spi-dw.h  |  5 ++---
 2 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 9102685c1523..84c1fdfd6e52 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -108,9 +108,8 @@ EXPORT_SYMBOL_GPL(dw_spi_set_cs);
 /* Return the max entries we can fill into tx fifo */
 static inline u32 tx_max(struct dw_spi *dws)
 {
-   u32 tx_left, tx_room, rxtx_gap;
+   u32 tx_room, rxtx_gap;
 
-   tx_left = (dws->tx_end - dws->tx) / dws->n_bytes;
tx_room = dws->fifo_len - dw_readl(dws, DW_SPI_TXFLR);
 
/*
@@ -121,18 +120,15 @@ static inline u32 tx_max(struct dw_spi *dws)
 * shift registers. So a control from sw point of
 * view is taken.
 */
-   rxtx_gap =  ((dws->rx_end - dws->rx) - (dws->tx_end - dws->tx))
-   / dws->n_bytes;
+   rxtx_gap = dws->fifo_len - (dws->rx_len - dws->tx_len);
 
-   return min3(tx_left, tx_room, (u32) (dws->fifo_len - rxtx_gap));
+   return min3((u32)dws->tx_len, tx_room, rxtx_gap);
 }
 
 /* Return the max entries we should read out of rx fifo */
 static inline u32 rx_max(struct dw_spi *dws)
 {
-   u32 rx_left = (dws->rx_end - dws->rx) / dws->n_bytes;
-
-   return min_t(u32, rx_left, dw_readl(dws, DW_SPI_RXFLR));
+   return min_t(u32, dws->rx_len, dw_readl(dws, DW_SPI_RXFLR));
 }
 
 static void dw_writer(struct dw_spi *dws)
@@ -141,15 +137,16 @@ static void dw_writer(struct dw_spi *dws)
u16 txw = 0;
 
while (max--) {
-   /* Set the tx word if the transfer's original "tx" is not null 
*/
-   if (dws->tx_end - dws->len) {
+   if (dws->tx) {
if (dws->n_bytes == 1)
txw = *(u8 *)(dws->tx);
else
txw = *(u16 *)(dws->tx);
+
+   dws->tx += dws->n_bytes;
}
dw_write_io_reg(dws, DW_SPI_DR, txw);
-   dws->tx += dws->n_bytes;
+   --dws->tx_len;
}
 }
 
@@ -160,14 +157,15 @@ static void dw_reader(struct dw_spi *dws)
 
while (max--) {
rxw = dw_read_io_reg(dws, DW_SPI_DR);
-   /* Care rx only if the transfer's original "rx" is not null */
-   if (dws->rx_end - dws->len) {
+   if (dws->rx) {
if (dws->n_bytes == 1)
*(u8 *)(dws->rx) = rxw;
else
*(u16 *)(dws->rx) = rxw;
+
+   dws->rx += dws->n_bytes;
}
-   dws->rx += dws->n_bytes;
+   --dws->rx_len;
}
 }
 
@@ -192,7 +190,7 @@ static irqreturn_t interrupt_transfer(struct dw_spi *dws)
}
 
dw_reader(dws);
-   if (dws->rx_end == dws->rx) {
+   if (!dws->rx_len) {
spi_mask_intr(dws, 0xff);
spi_finalize_current_transfer(dws->master);
return IRQ_HANDLED;
@@ -299,12 +297,11 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
dws->dma_mapped = 0;
dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
dws->tx = (void *)transfer->tx_buf;
-   dws->tx_end = dws->tx + transfer->len;
+   dws->tx_len = transfer->len / dws->n_bytes;
dws->rx = transfer->rx_buf;
-   dws->rx_end = dws->rx + transfer->len;
-   dws->len = transfer->len;
+   dws->rx_len = dws->tx_len;
 
-   /* Ensure dw->rx and dw->rx_end are visible */
+   /* Ensure the data above is visible for all CPUs */
smp_mb();
 
spi_enable_chip(dws, 0);
@@ -331,7 +328,7 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
return ret;
}
} else {
-   txlevel = min_t(u16, dws->fifo_len / 2, dws->len / 
dws->n_bytes);
+   txlevel = min_t(u16, dws->fifo_len / 2, dws->tx_len);
dw_writel(dws, DW_SPI_TXFTLR, txlevel);
 
/* Set the interrupt mask */
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 2a2346438564..cfc9f63acde4 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -147,11 +147,10 @@ struct dw_spi {
void (*set_cs)(struct spi_device *spi, bool en

[PATCH 22/30] spi: dw: De-assert chip-select on reset

2020-09-20 Thread Serge Semin
The SPI memory operations implementation will require the CS register to be
cleared before executing an operation, so that the transmission isn't
automatically started before the Tx FIFO is pre-initialized. Let's clear the
register on explicit controller reset then, to fulfil that requirement in
case of an error or if the CS has been left set by a bootloader or other
software.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw.h | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index cfc9f63acde4..eb1d46983319 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -237,15 +237,16 @@ static inline void spi_umask_intr(struct dw_spi *dws, u32 
mask)
 }
 
 /*
- * This disables the SPI controller, interrupts, clears the interrupts status,
- * and re-enable the controller back. Transmit and receive FIFO buffers are
- * cleared when the device is disabled.
+ * This disables the SPI controller, interrupts, clears the interrupts status
+ * and CS, then re-enables the controller back. Transmit and receive FIFO
+ * buffers are cleared when the device is disabled.
  */
 static inline void spi_reset_chip(struct dw_spi *dws)
 {
spi_enable_chip(dws, 0);
spi_mask_intr(dws, 0xff);
dw_readl(dws, DW_SPI_ICR);
+   dw_writel(dws, DW_SPI_SER, 0);
spi_enable_chip(dws, 1);
 }
 
-- 
2.27.0



[PATCH 29/30] dt-bindings: spi: dw: Add Baikal-T1 SPI Controllers

2020-09-20 Thread Serge Semin
These controllers are based on the DW APB SSI IP-core and embedded into
the SoC. Two of them are equipped with IRQ, DMA, 64-word FIFOs and 4
native CS, while the third one, being utilized by the Baikal-T1 System
Boot Controller, has very limited resources: no IRQ, no DMA, only a
single native chip-select and just 8-byte Tx/Rx FIFOs. That's why we have
to mark the IRQ as optional for the latter interface.

The SPI controller embedded into the Baikal-T1 System Boot Controller can
also be used to directly access an external SPI flash by means of a
dedicated FSM. The corresponding MMIO region availability is switched by
the embedded multiplexer, whose phandle can be specified in the dts node.

* We added a new example to test out the non-standard Baikal-T1 System
Boot SPI Controller DT binding.

Co-developed-by: Ramil Zaripov 
Signed-off-by: Ramil Zaripov 
Signed-off-by: Serge Semin 
---
 .../bindings/spi/snps,dw-apb-ssi.yaml | 33 +--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml 
b/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
index c62cbe79f00d..d6ae35777dac 100644
--- a/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
+++ b/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
@@ -22,6 +22,21 @@ allOf:
   properties:
 reg:
   minItems: 2
+  - if:
+  properties:
+compatible:
+  contains:
+enum:
+  - baikal,bt1-sys-ssi
+then:
+  properties:
+mux-controls:
+  maxItems: 1
+  required:
+- mux-controls
+else:
+  required:
+- interrupts
 
 properties:
   compatible:
@@ -44,12 +59,16 @@ properties:
   - const: snps,dw-apb-ssi
   - description: Intel Keem Bay SPI Controller
 const: intel,keembay-ssi
+  - description: Baikal-T1 SPI Controller
+const: baikal,bt1-ssi
+  - description: Baikal-T1 System Boot SPI Controller
+const: baikal,bt1-sys-ssi
 
   reg:
 minItems: 1
 items:
   - description: DW APB SSI controller memory mapped registers
-  - description: SPI MST region map
+  - description: SPI MST region map or directly mapped SPI ROM
 
   interrupts:
 maxItems: 1
@@ -114,7 +133,6 @@ required:
   - reg
   - "#address-cells"
   - "#size-cells"
-  - interrupts
   - clocks
 
 examples:
@@ -130,4 +148,15 @@ examples:
   cs-gpios = <&gpio0 13 0>,
  <&gpio0 14 0>;
 };
+  - |
+spi@1f040100 {
+  compatible = "baikal,bt1-sys-ssi";
+  reg = <0x1f040100 0x900>,
+<0x1c00 0x100>;
+  #address-cells = <1>;
+  #size-cells = <0>;
+  mux-controls = <&boot_mux>;
+  clocks = <&ccu_sys>;
+  clock-names = "ssi_clk";
+};
 ...
-- 
2.27.0



[PATCH 13/30] spi: dw: Update SPI bus speed in a config function

2020-09-20 Thread Serge Semin
The SPI bus speed update functionality will be useful in other parts of
the driver too (e.g. to implement the SPI memory operations and in the
DW SPI glue layers). Let's move it to the update_cr0() method then, and
since the latter now updates more than just the CTRLR0 register, alter its
prototype to have a generic function name not related to CR0.

Leave the too-long line with the chip->clk_div setting as is for now,
since it's going to be changed later anyway.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index a9644351d75f..0b80a16d9872 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -251,8 +251,8 @@ static u32 dw_spi_get_cr0(struct dw_spi *dws, struct 
spi_device *spi)
return cr0;
 }
 
-static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi,
- struct spi_transfer *transfer)
+static void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi,
+struct spi_transfer *transfer)
 {
struct chip_data *chip = spi_get_ctldata(spi);
u32 cr0 = chip->cr0;
@@ -265,6 +265,17 @@ static void dw_spi_update_cr0(struct dw_spi *dws, struct 
spi_device *spi,
cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
 
dw_writel(dws, DW_SPI_CTRLR0, cr0);
+
+   /* Handle per transfer options for bpw and speed */
+   if (transfer->speed_hz != dws->current_freq) {
+   if (transfer->speed_hz != chip->speed_hz) {
+   /* clk_div doesn't support odd number */
+   chip->clk_div = (DIV_ROUND_UP(dws->max_freq, 
transfer->speed_hz) + 1) & 0xfffe;
+   chip->speed_hz = transfer->speed_hz;
+   }
+   dws->current_freq = transfer->speed_hz;
+   spi_set_clk(dws, chip->clk_div);
+   }
 }
 
 static int dw_spi_transfer_one(struct spi_controller *master,
@@ -289,21 +300,10 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
 
spi_enable_chip(dws, 0);
 
-   /* Handle per transfer options for bpw and speed */
-   if (transfer->speed_hz != dws->current_freq) {
-   if (transfer->speed_hz != chip->speed_hz) {
-   /* clk_div doesn't support odd number */
-   chip->clk_div = (DIV_ROUND_UP(dws->max_freq, 
transfer->speed_hz) + 1) & 0xfffe;
-   chip->speed_hz = transfer->speed_hz;
-   }
-   dws->current_freq = transfer->speed_hz;
-   spi_set_clk(dws, chip->clk_div);
-   }
+   dw_spi_update_config(dws, spi, transfer);
 
transfer->effective_speed_hz = dws->max_freq / chip->clk_div;
 
-   dw_spi_update_cr0(dws, spi, transfer);
-
/* Check if current transfer is a DMA transaction */
if (master->can_dma && master->can_dma(master, spi, transfer))
dws->dma_mapped = master->cur_msg_mapped;
-- 
2.27.0



[PATCH 26/30] spi: dw: Add memory operations support

2020-09-20 Thread Serge Semin
Aside from the synchronous Tx-Rx mode, which has been utilized to implement
the normal SPI transfers in the framework of the DW SSI driver, the DW SPI
controller supports Tx-only and EEPROM-read modes. The former just makes
the controller transmit all the data from the Tx FIFO, ignoring anything
retrieved from the MISO lane. The latter is a so-called write-then-read
operation: the DW SPI controller first pushes out all the data from the Tx
FIFO, after which it automatically receives as much data as has been
specified by means of the CTRLR1 register. Both of those modes can be used
to implement the memory operations supported by the SPI-memory subsystem.

The memory operation implementation is pretty much straightforward, except
for a few peculiarities we had to take into account to make things work.
Since the DW SPI controller doesn't provide a way to directly set and
clear the native CS lane level, but instead automatically de-asserts it
when the Tx FIFO gets empty, we have to make sure the Tx FIFO isn't empty
during the entire Tx procedure. In addition we also need to read data from
the Rx FIFO as fast as possible to prevent its overflow with the
automatically fetched incoming traffic. The denoted peculiarities cause
even more problems if the DW SSI controller is equipped with a relatively
small FIFO and is connected to a relatively slow system bus (APB) (with
respect to the SPI bus speed). In order to work around the problems as
much as possible, the memory operation execution procedure collects all
the Tx data into a single buffer and disables the local IRQs to speed up
the write-then-optionally-read method.

Note the provided memory operations are utilized by default only if a glue
driver hasn't provided custom versions of them and this is not a DW APB
SSI controller with the fixed automatic CS toggle functionality.

Co-developed-by: Ramil Zaripov 
Signed-off-by: Ramil Zaripov 
Signed-off-by: Serge Semin 
---
 drivers/spi/Kconfig   |   1 +
 drivers/spi/spi-dw-core.c | 300 ++
 drivers/spi/spi-dw.h  |  13 ++
 3 files changed, 314 insertions(+)

diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig
index c6ea760ea5f0..1f70bb1e7fa9 100644
--- a/drivers/spi/Kconfig
+++ b/drivers/spi/Kconfig
@@ -235,6 +235,7 @@ config SPI_DAVINCI
 
 config SPI_DESIGNWARE
tristate "DesignWare SPI controller core support"
+   imply SPI_MEM
help
  general driver for SPI controller core from DesignWare
 
diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 77d61ded9256..ca22f427d82d 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -8,10 +8,13 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 #include "spi-dw.h"
@@ -401,6 +404,300 @@ static void dw_spi_handle_err(struct spi_controller 
*master,
spi_reset_chip(dws);
 }
 
+static int dw_spi_adjust_mem_op_size(struct spi_mem *mem, struct spi_mem_op 
*op)
+{
+   if (op->data.dir == SPI_MEM_DATA_IN)
+   op->data.nbytes = clamp_val(op->data.nbytes, 0, SPI_NDF_MASK + 
1);
+
+   return 0;
+}
+
+static bool dw_spi_supports_mem_op(struct spi_mem *mem,
+  const struct spi_mem_op *op)
+{
+   if (op->data.buswidth > 1 || op->addr.buswidth > 1 ||
+   op->dummy.buswidth > 1 || op->cmd.buswidth > 1)
+   return false;
+
+   return spi_mem_default_supports_op(mem, op);
+}
+
+static int dw_spi_init_mem_buf(struct dw_spi *dws, const struct spi_mem_op *op)
+{
+   unsigned int i, j, len;
+   u8 *out;
+
+   /*
+* Calculate the total length of the EEPROM command transfer and
+* either use the pre-allocated buffer or create a temporary one.
+*/
+   len = op->cmd.nbytes + op->addr.nbytes + op->dummy.nbytes;
+   if (op->data.dir == SPI_MEM_DATA_OUT)
+   len += op->data.nbytes;
+
+   if (len <= SPI_BUF_SIZE) {
+   out = dws->buf;
+   } else {
+   out = kzalloc(len, GFP_KERNEL);
+   if (!out)
+   return -ENOMEM;
+   }
+
+   /*
+* Collect the operation code, address and dummy bytes into the single
+* buffer. If it's a transfer with data to be sent, also copy it into 
the
+* single buffer in order to speed the data transmission up.
+*/
+   for (i = 0; i < op->cmd.nbytes; ++i)
+   out[i] = SPI_GET_BYTE(op->cmd.opcode, op->cmd.nbytes - i - 1);
+   for (j = 0; j < op->addr.nbytes; ++i, ++j)
+   out[i] = SPI_GET_BYTE(op->addr.val, op->addr.nbytes - j - 1);
+   for (j = 0; j < op->dummy.nbytes; ++i, ++j)
+   out[i] = 0x0;
+
+   if (op->data.dir == SPI_MEM_DATA_OUT)
+   memcpy(&out[i], op->data.buf.out, op->data.nbytes);
+
+   dws->n_bytes = 1;
+   dws->tx = out;
+   dws->tx_len = len;
+   if (op->data.dir 

[PATCH 24/30] spi: dw: Move num-of retries parameter to the header file

2020-09-20 Thread Serge Semin
The parameter will be needed for another wait-done method being added as
part of the SPI memory operation modifications in a further commit.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 5 ++---
 drivers/spi/spi-dw.h | 2 ++
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index bb390ff67d1d..9db119dc5554 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -17,7 +17,6 @@
 
 #include "spi-dw.h"
 
-#define WAIT_RETRIES   5
 #define RX_BUSY0
 #define RX_BURST_LEVEL 16
 #define TX_BUSY1
@@ -208,7 +207,7 @@ static inline bool dw_spi_dma_tx_busy(struct dw_spi *dws)
 static int dw_spi_dma_wait_tx_done(struct dw_spi *dws,
   struct spi_transfer *xfer)
 {
-   int retry = WAIT_RETRIES;
+   int retry = SPI_WAIT_RETRIES;
struct spi_delay delay;
u32 nents;
 
@@ -283,7 +282,7 @@ static inline bool dw_spi_dma_rx_busy(struct dw_spi *dws)
 
 static int dw_spi_dma_wait_rx_done(struct dw_spi *dws)
 {
-   int retry = WAIT_RETRIES;
+   int retry = SPI_WAIT_RETRIES;
struct spi_delay delay;
unsigned long ns, us;
u32 nents;
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index eb1d46983319..946065201c9c 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -100,6 +100,8 @@
 #define SPI_DMA_RDMAE  (1 << 0)
 #define SPI_DMA_TDMAE  (1 << 1)
 
+#define SPI_WAIT_RETRIES   5
+
 enum dw_ssi_type {
SSI_MOTO_SPI = 0,
SSI_TI_SSP,
-- 
2.27.0



[PATCH 23/30] spi: dw: Explicitly de-assert CS on SPI transfer completion

2020-09-20 Thread Serge Semin
By design of the currently available native set_cs callback, the CS
de-assertion is done only if it's required by the corresponding controller
capability. But in order to pre-fill the Tx FIFO buffer with data during
the SPI memory ops execution, the SER register needs to be left cleared
beforehand. We'll also need a way to explicitly set and clear the
corresponding CS bit at certain moments of the operation. Let's alter the
set_cs function then to also de-activate the CS when required.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 65db4dd3ea8a..7a25ea6f4af6 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -100,7 +100,7 @@ void dw_spi_set_cs(struct spi_device *spi, bool enable)
 */
if (cs_high == enable)
dw_writel(dws, DW_SPI_SER, BIT(spi->chip_select));
-   else if (dws->caps & DW_SPI_CAP_CS_OVERRIDE)
+   else
dw_writel(dws, DW_SPI_SER, 0);
 }
 EXPORT_SYMBOL_GPL(dw_spi_set_cs);
-- 
2.27.0



[PATCH 28/30] spi: dw: Add poll-based SPI transfers support

2020-09-20 Thread Serge Semin
The poll-based transfer functionality was removed by
commit 1ceb09717e98 ("spi: dw: remove cs_control and poll_mode members
from chip_data") with the justification that "there is no user of one
anymore". It turns out one of our DW APB SSI cores is synthesized with no
IRQ line attached, and the only possible way of using it is to implement a
poll-based SPI transfer procedure. So we have to bring the removed
functionality back, but with the alterations described below.

First of all the poll-based transfer is activated only if the DW SPI
controller doesn't have an IRQ line attached and the Linux IRQ number is
initialized with the IRQ_NOTCONNECTED value. Secondly the transfer
procedure is now executed with a delay performed between the writer and
reader methods. The delay value is calculated based on the number of data
words expected to be received on the current iteration. Finally the error
status is checked on each iteration.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 40 ++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 7b901226fd38..cd01225a0f81 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -343,6 +343,42 @@ static void dw_spi_irq_setup(struct dw_spi *dws)
dws->transfer_handler = dw_spi_transfer_handler;
 }
 
+/*
+ * The iterative procedure of the poll-based transfer is simple: write as much
+ * as possible to the Tx FIFO, wait until the pending to receive data is ready
+ * to be read, read it from the Rx FIFO and check whether the performed
+ * procedure has been successful.
+ *
+ * Note this method the same way as the IRQ-based transfer won't work well for
+ * the SPI devices connected to the controller with native CS due to the
+ * automatic CS assertion/de-assertion.
+ */
+static int dw_spi_poll_transfer(struct dw_spi *dws,
+   struct spi_transfer *transfer)
+{
+   struct spi_delay delay;
+   u16 nbits;
+   int ret;
+
+   delay.unit = SPI_DELAY_UNIT_SCK;
+   nbits = dws->n_bytes * BITS_PER_BYTE;
+
+   do {
+   dw_writer(dws);
+
+   delay.value = nbits * (dws->rx_len - dws->tx_len);
+   spi_delay_exec(&delay, transfer);
+
+   dw_reader(dws);
+
+   ret = dw_spi_check_status(dws, true);
+   if (ret)
+   return ret;
+   } while (dws->rx_len);
+
+   return 0;
+}
+
 static int dw_spi_transfer_one(struct spi_controller *master,
struct spi_device *spi, struct spi_transfer *transfer)
 {
@@ -387,6 +423,8 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
 
if (dws->dma_mapped)
return dws->dma_ops->dma_transfer(dws, transfer);
+   else if (dws->irq == IRQ_NOTCONNECTED)
+   return dw_spi_poll_transfer(dws, transfer);
 
dw_spi_irq_setup(dws);
 
@@ -795,7 +833,7 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
 
ret = request_irq(dws->irq, dw_spi_irq, IRQF_SHARED, dev_name(dev),
  master);
-   if (ret < 0) {
+   if (ret < 0 && ret != -ENOTCONN) {
dev_err(dev, "can not get IRQ\n");
goto err_free_master;
}
-- 
2.27.0



[PATCH 20/30] spi: dw: Unmask IRQs after enabling the chip

2020-09-20 Thread Serge Semin
It's theoretically erroneous to enable the IRQs before the chip is turned
on. If the IRQ handler gets executed before the chip is enabled, then any
data written to the Tx FIFO will just be ignored.

I say "theoretically" because we haven't noticed any problem with that,
but let's fix it anyway just in case...

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 08bc53b9de88..8dbe11c1821c 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -355,8 +355,6 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
spi_enable_chip(dws, 1);
return ret;
}
-   } else {
-   dw_spi_irq_setup(dws);
}
 
spi_enable_chip(dws, 1);
@@ -364,6 +362,8 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
if (dws->dma_mapped)
return dws->dma_ops->dma_transfer(dws, transfer);
 
+   dw_spi_irq_setup(dws);
+
return 1;
 }
 
-- 
2.27.0



[PATCH 04/30] Revert: spi: spi-dw: Add lock protect dw_spi rx/tx to prevent concurrent calls

2020-09-20 Thread Serge Semin
There is no point in keeping commit 19b61392c5a8 ("spi: spi-dw: Add
lock protect dw_spi rx/tx to prevent concurrent calls") applied. The
commit author assumed that the problem with the rx data mismatch was due
to a lack of data protection, while most likely it was caused by a missing
memory barrier. So having commit bfda044533b2 ("spi: dw: use "smp_mb()" to
avoid sending spi data error") applied would be enough to fix the problem.

Indeed the spin unlock operation makes sure each memory operation issued
before the release completes before the release itself does. In other
words it works as an implicit one-way memory barrier. So having both
smp_mb() and spin_unlock_irqrestore() here is just redundant; one of them
would be enough. It's better to leave the smp_mb(), since the Tx/Rx
buffers' consistency is provided by the data transfer algorithm
implementation: first we initialize the buffer pointers, then make sure
the assignments are visible to the other CPUs by calling smp_mb(), and
only after that enable the interrupt, whose handler uses the buffers.

Signed-off-by: Serge Semin 

---

Folks, I also have a doubt whether the SMP memory barrier is required there,
because the normal IO methods like readl/writel imply a full memory barrier,
so any memory operation performed before them is supposed to be seen by the
devices and the other CPUs [1]. So most likely there could have been a
problem with those IOs' implementation on the subject platform, or a
spurious interrupt could have been raised during the data initialization.
What do you think? Am I missing something?

[1] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
Section "KERNEL I/O BARRIER EFFECTS"
---
 drivers/spi/spi-dw-core.c | 14 ++
 drivers/spi/spi-dw.h  |  1 -
 2 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 1af74362461d..18411f5b9954 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -142,11 +142,9 @@ static inline u32 rx_max(struct dw_spi *dws)
 
 static void dw_writer(struct dw_spi *dws)
 {
-   u32 max;
+   u32 max = tx_max(dws);
u16 txw = 0;
 
-   spin_lock(&dws->buf_lock);
-   max = tx_max(dws);
while (max--) {
/* Set the tx word if the transfer's original "tx" is not null 
*/
if (dws->tx_end - dws->len) {
@@ -158,16 +156,13 @@ static void dw_writer(struct dw_spi *dws)
dw_write_io_reg(dws, DW_SPI_DR, txw);
dws->tx += dws->n_bytes;
}
-   spin_unlock(&dws->buf_lock);
 }
 
 static void dw_reader(struct dw_spi *dws)
 {
-   u32 max;
+   u32 max = rx_max(dws);
u16 rxw;
 
-   spin_lock(&dws->buf_lock);
-   max = rx_max(dws);
while (max--) {
rxw = dw_read_io_reg(dws, DW_SPI_DR);
/* Care rx only if the transfer's original "rx" is not null */
@@ -179,7 +174,6 @@ static void dw_reader(struct dw_spi *dws)
}
dws->rx += dws->n_bytes;
}
-   spin_unlock(&dws->buf_lock);
 }
 
 static void int_error_stop(struct dw_spi *dws, const char *msg)
@@ -291,21 +285,18 @@ static int dw_spi_transfer_one(struct spi_controller 
*master,
 {
struct dw_spi *dws = spi_controller_get_devdata(master);
struct chip_data *chip = spi_get_ctldata(spi);
-   unsigned long flags;
u8 imask = 0;
u16 txlevel = 0;
u32 cr0;
int ret;
 
dws->dma_mapped = 0;
-   spin_lock_irqsave(&dws->buf_lock, flags);
dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
dws->tx = (void *)transfer->tx_buf;
dws->tx_end = dws->tx + transfer->len;
dws->rx = transfer->rx_buf;
dws->rx_end = dws->rx + transfer->len;
dws->len = transfer->len;
-   spin_unlock_irqrestore(&dws->buf_lock, flags);
 
/* Ensure dw->rx and dw->rx_end are visible */
smp_mb();
@@ -464,7 +455,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
dws->master = master;
dws->type = SSI_MOTO_SPI;
dws->dma_addr = (dma_addr_t)(dws->paddr + DW_SPI_DR);
-   spin_lock_init(&dws->buf_lock);
 
spi_controller_set_devdata(master, dws);
 
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 51bab30b9f85..1ab704d1ebd8 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -131,7 +131,6 @@ struct dw_spi {
size_t  len;
void*tx;
void*tx_end;
-   spinlock_t  buf_lock;
void*rx;
void*rx_end;
int dma_mapped;
-- 
2.27.0



[PATCH 06/30] spi: dw: Disable all IRQs when controller is unused

2020-09-20 Thread Serge Semin
It's a good practice to disable all IRQs when a device is fully unused. In
our case this should be done before requesting the IRQ and after the last
byte of an SPI transfer has been received. In the former case it's
required to prevent the IRQ handler from being invoked before the driver
data is fully initialized (which may happen if the IRQ status has been
left uncleared before the device is probed). So we just move the
spi_hw_init() method invocation to an earlier stage, before requesting the
IRQ. In the latter case there is just no point in having any of the IRQs
enabled between SPI transfers, when there is no SPI message currently
being processed.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 18411f5b9954..be94ed5bb896 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -198,7 +198,7 @@ static irqreturn_t interrupt_transfer(struct dw_spi *dws)
 
dw_reader(dws);
if (dws->rx_end == dws->rx) {
-   spi_mask_intr(dws, SPI_INT_TXEI);
+   spi_mask_intr(dws, 0xff);
spi_finalize_current_transfer(dws->master);
return IRQ_HANDLED;
}
@@ -222,7 +222,7 @@ static irqreturn_t dw_spi_irq(int irq, void *dev_id)
return IRQ_NONE;
 
if (!master->cur_msg) {
-   spi_mask_intr(dws, SPI_INT_TXEI);
+   spi_mask_intr(dws, 0xff);
return IRQ_HANDLED;
}
 
@@ -458,6 +458,9 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
 
spi_controller_set_devdata(master, dws);
 
+   /* Basic HW init */
+   spi_hw_init(dev, dws);
+
ret = request_irq(dws->irq, dw_spi_irq, IRQF_SHARED, dev_name(dev),
  master);
if (ret < 0) {
@@ -485,9 +488,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
device_property_read_u32(dev, "rx-sample-delay-ns",
 &dws->def_rx_sample_dly_ns);
 
-   /* Basic HW init */
-   spi_hw_init(dev, dws);
-
if (dws->dma_ops && dws->dma_ops->dma_init) {
ret = dws->dma_ops->dma_init(dev, dws);
if (ret) {
-- 
2.27.0



[PATCH 25/30] spi: dw: Add generic DW SSI status-check method

2020-09-20 Thread Serge Semin
The DW SSI error handling can be generically implemented for all types of
transfers: IRQ-, DMA- and poll-based ones. It will be a function which
checks the overflow/underflow error flags and resets the controller if any
of them is set. In the framework of this commit we make use of the new
method to detect errors in the IRQ- and DMA-based SPI transfer execution
procedures.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 43 +++
 drivers/spi/spi-dw-dma.c  | 11 ++
 drivers/spi/spi-dw.h  |  1 +
 3 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 7a25ea6f4af6..77d61ded9256 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -169,23 +169,48 @@ static void dw_reader(struct dw_spi *dws)
}
 }
 
-static void int_error_stop(struct dw_spi *dws, const char *msg)
+int dw_spi_check_status(struct dw_spi *dws, bool raw)
 {
-   spi_reset_chip(dws);
+   u32 irq_status;
+   int ret = 0;
+
+   if (raw)
+   irq_status = dw_readl(dws, DW_SPI_RISR);
+   else
+   irq_status = dw_readl(dws, DW_SPI_ISR);
+
+   if (irq_status & SPI_INT_RXOI) {
+   dev_err(&dws->master->dev, "RX FIFO overflow detected\n");
+   ret = -EIO;
+   }
+
+   if (irq_status & SPI_INT_RXUI) {
+   dev_err(&dws->master->dev, "RX FIFO underflow detected\n");
+   ret = -EIO;
+   }
 
-   dev_err(&dws->master->dev, "%s\n", msg);
-   dws->master->cur_msg->status = -EIO;
-   spi_finalize_current_transfer(dws->master);
+   if (irq_status & SPI_INT_TXOI) {
+   dev_err(&dws->master->dev, "TX FIFO overflow detected\n");
+   ret = -EIO;
+   }
+
+   /* Generically handle the erroneous situation */
+   if (ret) {
+   spi_reset_chip(dws);
+   if (dws->master->cur_msg)
+   dws->master->cur_msg->status = ret;
+   }
+
+   return ret;
 }
+EXPORT_SYMBOL_GPL(dw_spi_check_status);
 
 static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws)
 {
u16 irq_status = dw_readl(dws, DW_SPI_ISR);
 
-   /* Error handling */
-   if (irq_status & (SPI_INT_TXOI | SPI_INT_RXOI | SPI_INT_RXUI)) {
-   dw_readl(dws, DW_SPI_ICR);
-   int_error_stop(dws, "interrupt_transfer: fifo 
overrun/underrun");
+   if (dw_spi_check_status(dws, false)) {
+   spi_finalize_current_transfer(dws->master);
return IRQ_HANDLED;
}
 
diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 9db119dc5554..1969b09b4f5e 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -144,17 +144,10 @@ static void dw_spi_dma_exit(struct dw_spi *dws)
 
 static irqreturn_t dw_spi_dma_transfer_handler(struct dw_spi *dws)
 {
-   u16 irq_status = dw_readl(dws, DW_SPI_ISR);
+   dw_spi_check_status(dws, false);
 
-   if (!irq_status)
-   return IRQ_NONE;
-
-   dw_readl(dws, DW_SPI_ICR);
-   spi_reset_chip(dws);
-
-   dev_err(&dws->master->dev, "%s: FIFO overrun/underrun\n", __func__);
-   dws->master->cur_msg->status = -EIO;
complete(&dws->dma_completion);
+
return IRQ_HANDLED;
 }
 
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 946065201c9c..5eb98ece2f2a 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -261,6 +261,7 @@ static inline void spi_shutdown_chip(struct dw_spi *dws)
 extern void dw_spi_set_cs(struct spi_device *spi, bool enable);
 extern void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi,
 struct dw_spi_cfg *cfg);
+extern int dw_spi_check_status(struct dw_spi *dws, bool raw);
 extern int dw_spi_add_host(struct device *dev, struct dw_spi *dws);
 extern void dw_spi_remove_host(struct dw_spi *dws);
 extern int dw_spi_suspend_host(struct dw_spi *dws);
-- 
2.27.0



[PATCH 15/30] spi: dw: Update Rx sample delay in the config function

2020-09-20 Thread Serge Semin
The Rx sample delay can be SPI-device specific, and should be initialized
in sync with the rest of the communication and peripheral-device related
controller setup. So let's move the Rx sample delay setup into the DW APB
SSI configuration update method.
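The update being relocated follows a simple cache-and-compare pattern: only write the register when the requested value differs from the last one programmed. A userspace sketch of that pattern (struct and function names here are illustrative, not the driver's API):

```c
#include <assert.h>
#include <stdint.h>

/* Cache-and-compare update: only touch the (simulated) RX_SAMPLE_DLY
 * register when the requested value differs from the last one written. */
struct ctl {
	uint32_t cur_rx_sample_dly;	/* last value "written" to hardware */
	int writes;			/* counts simulated register writes */
};

static void update_rx_sample_dly(struct ctl *c, uint32_t dly)
{
	if (c->cur_rx_sample_dly != dly) {
		c->writes++;		/* stands in for dw_writel() */
		c->cur_rx_sample_dly = dly;
	}
}
```

Calling it repeatedly with an unchanged delay performs no extra writes, which is why caching the value in the device data is worthwhile.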

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 92138a6ada12..f00fc4828480 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -273,13 +273,18 @@ static void dw_spi_update_config(struct dw_spi *dws, struct spi_device *spi,
spi_set_clk(dws, clk_div);
dws->current_freq = speed_hz;
}
+
+   /* Update RX sample delay if required */
+   if (dws->cur_rx_sample_dly != chip->rx_sample_dly) {
+   dw_writel(dws, DW_SPI_RX_SAMPLE_DLY, chip->rx_sample_dly);
+   dws->cur_rx_sample_dly = chip->rx_sample_dly;
+   }
 }
 
 static int dw_spi_transfer_one(struct spi_controller *master,
struct spi_device *spi, struct spi_transfer *transfer)
 {
struct dw_spi *dws = spi_controller_get_devdata(master);
-   struct chip_data *chip = spi_get_ctldata(spi);
u8 imask = 0;
u16 txlevel = 0;
int ret;
@@ -305,12 +310,6 @@ static int dw_spi_transfer_one(struct spi_controller *master,
if (master->can_dma && master->can_dma(master, spi, transfer))
dws->dma_mapped = master->cur_msg_mapped;
 
-   /* Update RX sample delay if required */
-   if (dws->cur_rx_sample_dly != chip->rx_sample_dly) {
-   dw_writel(dws, DW_SPI_RX_SAMPLE_DLY, chip->rx_sample_dly);
-   dws->cur_rx_sample_dly = chip->rx_sample_dly;
-   }
-
/* For poll mode just disable all interrupts */
spi_mask_intr(dws, 0xff);
 
-- 
2.27.0



[PATCH 21/30] spi: dw: Discard chip enabling on DMA setup error

2020-09-20 Thread Serge Semin
It's pointless to re-enable the chip if the DMA setup procedure fails,
since we'll disable it on the next transfer anyway. For the same reason we
don't do that in case of a failure detected in any other method called
from the transfer_one() method.

While at it, consider any non-zero value returned from the dma_setup
callback to be erroneous, as is the convention in the kernel.
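The difference between checking `ret < 0` and checking `ret` that the second paragraph relies on can be shown with a minimal sketch (both function names are made up for the demonstration):

```c
#include <assert.h>

/* Kernel convention: 0 is success, anything else is a failure. */
static int accepts_only_zero(int ret)
{
	if (ret)		/* catches negative *and* positive returns */
		return -1;	/* propagate the failure */
	return 0;
}

static int accepts_negative_only(int ret)
{
	if (ret < 0)		/* a positive "error" value slips through */
		return -1;
	return 0;
}
```

With a positive return value, only the `if (ret)` form reports the failure.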

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 8dbe11c1821c..65db4dd3ea8a 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -351,10 +351,8 @@ static int dw_spi_transfer_one(struct spi_controller *master,
 
if (dws->dma_mapped) {
ret = dws->dma_ops->dma_setup(dws, transfer);
-   if (ret < 0) {
-   spi_enable_chip(dws, 1);
+   if (ret)
return ret;
-   }
}
 
spi_enable_chip(dws, 1);
-- 
2.27.0



[PATCH 11/30] spi: dw: Add DWC SSI capability

2020-09-20 Thread Serge Semin
Currently the DWC SSI core is supported by means of setting up a
core-specific update_cr0() callback. That isn't suitable for multiple
reasons. First of all, having several exported methods that do the same
thing but for different chips makes the code harder to maintain. Secondly,
the spi-dw-core driver exports the methods, and the spi-dw-mmio driver then
sets the private-data callback to one of them so it can be called by the
core driver again. That makes the code logic too convoluted. Thirdly, using
callbacks just for updating the CR0 register is problematic, since if the
register needed to be updated from different parts of the code, we'd have
to create yet another callback (for instance, the SPI device-specific
parameters don't need to be recalculated each time an SPI transfer is
submitted, so it's better to pre-calculate the CR0 data at the SPI-device
setup stage).

So, keeping all the above in mind, let's discard the update_cr0()
callbacks, define a generic static dw_spi_update_cr0() method, and create
the DW_SPI_CAP_DWC_SSI capability, which, when enabled, activates the
alternative CR0 register layout.
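The resulting shape, one static helper choosing the bit layout from a capability flag instead of two exported per-chip callbacks, can be sketched in userspace. The offset values below are placeholders for illustration, not the real DW_apb_ssi/DWC_ssi register definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CAP_DWC_SSI	(1u << 0)	/* capability flag, as in the patch */

/* placeholder offsets for the legacy DW_apb_ssi layout */
#define SCPH_OFF	6
#define SCPOL_OFF	7
/* placeholder offsets for the alternative DWC_ssi layout */
#define DWC_SCPH_OFF	8
#define DWC_SCPOL_OFF	9

/* One helper builds CTRLR0 for both IP variants. */
static uint32_t update_cr0(uint32_t caps, bool cpol, bool cpha, uint8_t bpw)
{
	uint32_t cr0 = bpw - 1;		/* data frame size, common part */

	if (!(caps & CAP_DWC_SSI)) {	/* legacy layout */
		cr0 |= (uint32_t)cpol << SCPOL_OFF;
		cr0 |= (uint32_t)cpha << SCPH_OFF;
	} else {			/* DWC_ssi layout */
		cr0 |= (uint32_t)cpol << DWC_SCPOL_OFF;
		cr0 |= (uint32_t)cpha << DWC_SCPH_OFF;
	}
	return cr0;
}
```

The same mode bits land at different offsets depending on the capability, which is all the two removed callbacks differed in.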

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 69 ---
 drivers/spi/spi-dw-mmio.c | 20 ++--
 drivers/spi/spi-dw.h  |  9 +
 3 files changed, 23 insertions(+), 75 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 8f9737640ec1..c21641a485ce 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -228,60 +228,33 @@ static irqreturn_t dw_spi_irq(int irq, void *dev_id)
return dws->transfer_handler(dws);
 }
 
-/* Configure CTRLR0 for DW_apb_ssi */
-u32 dw_spi_update_cr0(struct spi_controller *master, struct spi_device *spi,
- struct spi_transfer *transfer)
+static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi,
+ struct spi_transfer *transfer)
 {
struct chip_data *chip = spi_get_ctldata(spi);
u32 cr0;
 
-   /* Default SPI mode is SCPOL = 0, SCPH = 0 */
-   cr0 = (transfer->bits_per_word - 1)
-   | (SSI_MOTO_SPI << SPI_FRF_OFFSET)
-   | ((((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET) |
-  (((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET) |
-  (((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET))
-   | (chip->tmode << SPI_TMOD_OFFSET);
-
-   return cr0;
-}
-EXPORT_SYMBOL_GPL(dw_spi_update_cr0);
-
-/* Configure CTRLR0 for DWC_ssi */
-u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master,
-struct spi_device *spi,
-struct spi_transfer *transfer)
-{
-   struct dw_spi *dws = spi_controller_get_devdata(master);
-   struct chip_data *chip = spi_get_ctldata(spi);
-   u32 cr0;
-
-   /* CTRLR0[ 4: 0] Data Frame Size */
cr0 = (transfer->bits_per_word - 1);
 
-   /* CTRLR0[ 7: 6] Frame Format */
-   cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET;
-
-   /*
-* SPI mode (SCPOL|SCPH)
-* CTRLR0[ 8] Serial Clock Phase
-* CTRLR0[ 9] Serial Clock Polarity
-*/
-   cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << DWC_SSI_CTRLR0_SCPOL_OFFSET;
-   cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << DWC_SSI_CTRLR0_SCPH_OFFSET;
-
-   /* CTRLR0[11:10] Transfer Mode */
-   cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
-
-   /* CTRLR0[13] Shift Register Loop */
-   cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET;
-
-   if (dws->caps & DW_SPI_CAP_KEEMBAY_MST)
-   cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST;
+   if (!(dws->caps & DW_SPI_CAP_DWC_SSI)) {
+   cr0 |= SSI_MOTO_SPI << SPI_FRF_OFFSET;
+   cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET;
+   cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET;
+   cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET;
+   cr0 |= chip->tmode << SPI_TMOD_OFFSET;
+   } else {
+   cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET;
+   cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << DWC_SSI_CTRLR0_SCPOL_OFFSET;
+   cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << DWC_SSI_CTRLR0_SCPH_OFFSET;
+   cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET;
+   cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
+
+   if (dws->caps & DW_SPI_CAP_KEEMBAY_MST)
+   cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST;
+   }
 
-   return cr0;
+   dw_writel(dws, DW_SPI_CTRLR0, cr0);
 }
-EXPORT_SYMBOL_GPL(dw_spi_update_cr0_v1_01a);
 
 static int dw_spi_transfer_one(struct spi_controller *master,
struct spi_device *spi, struct spi_transfer *transfer)
@@ -290,7 +263,6 @@ static int dw_spi_transfer_one(struct spi_controller *master,
struct chip_data *chip = spi_get

[PATCH 1/2] lib: devres: provide devm_iounremap_resource()

2020-09-20 Thread pierre Kuo
The driver core doesn't provide a single helper function to release memory
allocated by devm_ioremap_resource(). That means callers need to call both
devm_release_mem_region() and devm_iounmap() to release it.

This patch creates a helper, devm_iounremap_resource(), that combines the
above operations.
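The pairing the helper enforces, region release followed by unmap, can be modelled in a userspace sketch; the stand-in functions below only mimic the call order, they are not the kernel API:

```c
#include <assert.h>
#include <string.h>

static char call_log[64];	/* records the order of the two steps */

static void release_mem_region_stub(void) { strcat(call_log, "release;"); }
static void iounmap_stub(void)            { strcat(call_log, "unmap;"); }

/* Combined helper, mirroring what devm_iounremap_resource() does:
 * release the memory region, then unmap the address. */
static void iounremap_resource_stub(void)
{
	release_mem_region_stub();
	iounmap_stub();
}
```

One call replaces the two-step sequence every caller would otherwise have to repeat.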

Signed-off-by: pierre Kuo 
---
 include/linux/device.h |  2 ++
 lib/devres.c   | 25 +
 2 files changed, 27 insertions(+)

diff --git a/include/linux/device.h b/include/linux/device.h
index 9e6ea8931a52..33ec7e54c1a9 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -240,6 +240,8 @@ void devm_free_pages(struct device *dev, unsigned long addr);
 
 void __iomem *devm_ioremap_resource(struct device *dev,
const struct resource *res);
+void devm_iounremap_resource(struct device *dev,
+const struct resource *res, void __iomem *addr);
 void __iomem *devm_ioremap_resource_wc(struct device *dev,
   const struct resource *res);
 
diff --git a/lib/devres.c b/lib/devres.c
index ebb1573d9ae3..cdda0cd0a263 100644
--- a/lib/devres.c
+++ b/lib/devres.c
@@ -113,6 +113,31 @@ void devm_iounmap(struct device *dev, void __iomem *addr)
 }
 EXPORT_SYMBOL(devm_iounmap);
 
+/**
+ * devm_iounremap_resource() - release mem region, and unremap address
+ * @dev: generic device to handle the resource for
+ * @res: resource of mem region to be released
+ * @addr: address to unmap
+ *
+ * Release memory region and unmap address.
+ */
+void devm_iounremap_resource(struct device *dev,
+const struct resource *res, void __iomem *addr)
+{
+   resource_size_t size;
+
+   BUG_ON(!dev);
+   if (!res || resource_type(res) != IORESOURCE_MEM) {
+   dev_err(dev, "invalid resource\n");
+   return;
+   }
+
+   size = resource_size(res);
+   devm_release_mem_region(dev, res->start, size);
+   devm_iounmap(dev, addr);
+}
+EXPORT_SYMBOL(devm_iounremap_resource);
+
 static void __iomem *
 __devm_ioremap_resource(struct device *dev, const struct resource *res,
enum devm_ioremap_type type)
-- 
2.17.1



[PATCH 2/2] driver core: platform: provide devm_platform_iounremap_resource

2020-09-20 Thread pierre Kuo
Combine platform_get_resource() and devm_iounremap_resource() to release
the iomem allocated by devm_platform_get_and_ioremap_resource().
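The wrapper's lookup-then-release flow can be sketched with illustrative stand-ins (none of the types or names below are the real platform-device API):

```c
#include <assert.h>
#include <stddef.h>

struct res { int start; };	/* stand-in for struct resource */

static struct res table[2] = { { 0x1000 }, { 0x2000 } };

/* stand-in for platform_get_resource(): NULL when index is out of range */
static struct res *get_resource_stub(unsigned int index)
{
	return index < 2 ? &table[index] : NULL;
}

/* Mirrors devm_platform_iounremap_resource(): look the resource up by
 * index first, release only when the lookup succeeds.
 * Returns 0 on success, -1 when the index was invalid. */
static int iounremap_by_index(unsigned int index)
{
	struct res *r = get_resource_stub(index);

	if (!r)
		return -1;	/* mirrors the dev_err() path in the patch */
	/* ... devm_iounremap_resource(dev, r, addr) would run here ... */
	return 0;
}
```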

Signed-off-by: pierre Kuo 
---
 drivers/base/platform.c | 24 
 include/linux/platform_device.h |  4 
 2 files changed, 28 insertions(+)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index e5d8a0503b4f..e2655c00873f 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -84,6 +84,30 @@ devm_platform_get_and_ioremap_resource(struct platform_device *pdev,
 }
 EXPORT_SYMBOL_GPL(devm_platform_get_and_ioremap_resource);
 
+/**
+ * devm_platform_iounremap_resource - call devm_iounremap_resource() for a
+ *   platform device with memory that addr points to.
+ *
+ * @pdev: platform device to use both for memory resource lookup as well as
+ *resource management
+ * @index: resource index
+ * @addr: address to be unmapped.
+ */
+void
+devm_platform_iounremap_resource(struct platform_device *pdev,
+unsigned int index, void __iomem *addr)
+{
+   struct resource *r;
+
+   r = platform_get_resource(pdev, IORESOURCE_MEM, index);
+   if (!r)
+   dev_err(&pdev->dev,
+   "MEM resource index %d not found\n", index);
+   else
+   devm_iounremap_resource(&pdev->dev, r, addr);
+}
+EXPORT_SYMBOL_GPL(devm_platform_iounremap_resource);
+
 /**
  * devm_platform_ioremap_resource - call devm_ioremap_resource() for a platform
  * device
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 77a2aada106d..75da15937679 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -67,6 +67,10 @@ devm_platform_ioremap_resource_wc(struct platform_device *pdev,
 extern void __iomem *
 devm_platform_ioremap_resource_byname(struct platform_device *pdev,
  const char *name);
+extern void
+devm_platform_iounremap_resource(struct platform_device *pdev,
+unsigned int index,
+void __iomem *addr);
 extern int platform_get_irq(struct platform_device *, unsigned int);
 extern int platform_get_irq_optional(struct platform_device *, unsigned int);
 extern int platform_irq_count(struct platform_device *);
-- 
2.17.1



[PATCH 30/30] spi: dw: Add Baikal-T1 SPI Controller glue driver

2020-09-20 Thread Serge Semin
Baikal-T1 is equipped with three DW APB SSI-based MMIO SPI controllers.
Two of them are pretty much normal: with IRQ, DMA, FIFOs of 64 words
depth and 4x CSs, but the third one, being a part of the Baikal-T1 System
Boot Controller, has very limited resources: no IRQ, no DMA, only a
single native chip-select and Tx/Rx FIFOs of just 8 words depth. In order
to provide transparent initial boot code execution, the Boot SPI
controller is also utilized by a vendor-specific IP block, which exposes
an SPI flash direct-mapping interface. Since direct mapping and normal
SPI controller utilization are mutually exclusive, only one of these
interfaces can be used to access an external SPI slave device. That's why
a dedicated mux is embedded into the System Boot Controller. All of this
is taken into account in the Baikal-T1-specific DW APB SSI glue driver
implemented by means of the DW SPI core module.
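The mutually exclusive interface selection the mux enforces can be modelled minimally. This sketch reuses the patch's BT1_BOOT_DIRMAP/BT1_BOOT_REGS values, but the mux structure and functions are illustrative, not the kernel mux-control API:

```c
#include <assert.h>

/* Selector values, matching the #defines in spi-dw-bt1.c */
enum bt1_boot_sel { BT1_BOOT_DIRMAP = 0, BT1_BOOT_REGS = 1 };

struct boot_mux {
	int selected;	/* current owner, or -1 when the mux is free */
};

/* Claim the mux for one interface; fails while the other one holds it.
 * (The real mux_control_select() would block instead of failing.) */
static int mux_select(struct boot_mux *m, int sel)
{
	if (m->selected != -1 && m->selected != sel)
		return -1;	/* busy: dirmap and regs are exclusive */
	m->selected = sel;
	return 0;
}

static void mux_deselect(struct boot_mux *m)
{
	m->selected = -1;
}
```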

Co-developed-by: Ramil Zaripov 
Signed-off-by: Ramil Zaripov 
Signed-off-by: Serge Semin 
---
 drivers/spi/Kconfig  |  28 
 drivers/spi/Makefile |   1 +
 drivers/spi/spi-dw-bt1.c | 339 +++
 3 files changed, 368 insertions(+)
 create mode 100644 drivers/spi/spi-dw-bt1.c

diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig
index 1f70bb1e7fa9..415d57b2057f 100644
--- a/drivers/spi/Kconfig
+++ b/drivers/spi/Kconfig
@@ -252,6 +252,34 @@ config SPI_DW_MMIO
tristate "Memory-mapped io interface driver for DW SPI core"
depends on HAS_IOMEM
 
+config SPI_DW_BT1
+   tristate "Baikal-T1 SPI driver for DW SPI core"
+   depends on MIPS_BAIKAL_T1 || COMPILE_TEST
+   help
+ Baikal-T1 SoC is equipped with three DW APB SSI-based MMIO SPI
+ controllers. Two of them are pretty much normal: with IRQ, DMA,
+ FIFOs of 64 words depth, 4x CSs, but the third one, being a
+ part of the Baikal-T1 System Boot Controller, has very limited
+ resources: no IRQ, no DMA, only a single native chip-select
+ and a Tx/Rx FIFO of just 8 words depth.
+ The latter one is normally connected to an external SPI-nor flash
+ of 128Mb (in general it can be of bigger size).
+
+config SPI_DW_BT1_DIRMAP
+   bool "Directly mapped Baikal-T1 Boot SPI flash support"
+   depends on SPI_DW_BT1
+   select MULTIPLEXER
+   select MUX_MMIO
+   help
+ Directly mapped SPI flash memory is an interface specific to the
+ Baikal-T1 System Boot Controller. It is a 16MB MMIO region, which
+ can be used to access a peripheral memory device just by
+ reading/writing data from/to it. Note that the system APB bus
+ will stall during each IO from/to the dirmap region until the
+ operation is finished. So try not to use it concurrently with
+ time-critical tasks (like the SPI memory operations implemented
+ in this driver).
+
 endif
 
 config SPI_DLN2
diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile
index cf955ea803cd..21dc75842aca 100644
--- a/drivers/spi/Makefile
+++ b/drivers/spi/Makefile
@@ -39,6 +39,7 @@ obj-$(CONFIG_SPI_DLN2)+= spi-dln2.o
 obj-$(CONFIG_SPI_DESIGNWARE)   += spi-dw.o
 spi-dw-y   := spi-dw-core.o
 spi-dw-$(CONFIG_SPI_DW_DMA)+= spi-dw-dma.o
+obj-$(CONFIG_SPI_DW_BT1)   += spi-dw-bt1.o
 obj-$(CONFIG_SPI_DW_MMIO)  += spi-dw-mmio.o
 obj-$(CONFIG_SPI_DW_PCI)   += spi-dw-pci.o
 obj-$(CONFIG_SPI_EFM32)+= spi-efm32.o
diff --git a/drivers/spi/spi-dw-bt1.c b/drivers/spi/spi-dw-bt1.c
new file mode 100644
index ..f382dfad7842
--- /dev/null
+++ b/drivers/spi/spi-dw-bt1.c
@@ -0,0 +1,339 @@
+// SPDX-License-Identifier: GPL-2.0-only
+//
+// Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+//
+// Authors:
+//   Ramil Zaripov 
+//   Serge Semin 
+//
+// Baikal-T1 DW APB SPI and System Boot SPI driver
+//
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "spi-dw.h"
+
+#define BT1_BOOT_DIRMAP0
+#define BT1_BOOT_REGS  1
+
+struct dw_spi_bt1 {
+   struct dw_spi   dws;
+   struct clk  *clk;
+   struct mux_control  *mux;
+
+#ifdef CONFIG_SPI_DW_BT1_DIRMAP
+   void __iomem*map;
+   resource_size_t map_len;
+#endif
+};
+#define to_dw_spi_bt1(_ctlr) \
+   container_of(spi_controller_get_devdata(_ctlr), struct dw_spi_bt1, dws)
+
+typedef int (*dw_spi_bt1_init_cb)(struct platform_device *pdev,
+   struct dw_spi_bt1 *dwsbt1);
+
+#ifdef CONFIG_SPI_DW_BT1_DIRMAP
+
+static int dw_spi_bt1_dirmap_create(struct spi_mem_dirmap_desc *desc)
+{
+   struct dw_spi_bt1 *dwsbt1 = to_dw_spi_bt1(desc->mem->spi->controller);
+
+   if (!dwsbt1->map ||
+   !dwsbt1->dws.mem

outside repository fatal error

2020-09-20 Thread Julia Lawall
The commit bcf4271d4bc3 ("checkpatch: allow not using -f with files that
are in git") in linux-next seems to cause checkpatch to fail on a file
containing a patch if that file is not in the directory containing the
Linux kernel.  Is that intentional?

thanks,
julia


[PATCH 13/14] misc: mic: drop double zeroing

2020-09-20 Thread Julia Lawall
sg_init_table zeroes its first argument, so the allocation of that argument
doesn't have to.

the semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x,n,flags;
@@

x = 
- kcalloc
+ kmalloc_array
  (n,sizeof(struct scatterlist),flags)
...
sg_init_table(x,n)
// </smpl>
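The reason the zeroing is redundant can be shown with a userspace analogue of sg_init_table(), which memsets the whole table before marking its end. The function name and end-marker value below are illustrative stand-ins, not the kernel implementation:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct entry { unsigned long flags; };	/* stand-in for scatterlist */

/* Analogue of sg_init_table(): zero the table, then mark the last
 * entry (as sg_mark_end() would). */
static void init_table_stub(struct entry *tbl, size_t n)
{
	memset(tbl, 0, n * sizeof(*tbl));	/* the redundant zeroing */
	tbl[n - 1].flags = 0x02;		/* illustrative end marker */
}
```

Since the init routine zeroes everything anyway, a plain malloc-style allocation (kmalloc_array) is enough; zeroing at allocation time (kcalloc) does the same work twice.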

Signed-off-by: Julia Lawall 

---
 drivers/misc/mic/scif/scif_nodeqp.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -u -p a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
--- a/drivers/misc/mic/scif/scif_nodeqp.c
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -363,7 +363,7 @@ scif_p2p_setsg(phys_addr_t pa, int page_
struct page *page;
int i;
 
-   sg = kcalloc(page_cnt, sizeof(struct scatterlist), GFP_KERNEL);
+   sg = kmalloc_array(page_cnt, sizeof(struct scatterlist), GFP_KERNEL);
if (!sg)
return NULL;
sg_init_table(sg, page_cnt);



[PATCH 08/14] xprtrdma: drop double zeroing

2020-09-20 Thread Julia Lawall
sg_init_table zeroes its first argument, so the allocation of that argument
doesn't have to.

the semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x,n,flags;
@@

x = 
- kcalloc
+ kmalloc_array
  (n,sizeof(*x),flags)
...
sg_init_table(x,n)
// </smpl>

Signed-off-by: Julia Lawall 

---
 net/sunrpc/xprtrdma/frwr_ops.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -u -p a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -124,7 +124,7 @@ int frwr_mr_init(struct rpcrdma_xprt *r_
if (IS_ERR(frmr))
goto out_mr_err;
 
-   sg = kcalloc(depth, sizeof(*sg), GFP_NOFS);
+   sg = kmalloc_array(depth, sizeof(*sg), GFP_NOFS);
if (!sg)
goto out_list_err;
 



[PATCH 06/14] block: drop double zeroing

2020-09-20 Thread Julia Lawall
sg_init_table zeroes its first argument, so the allocation of that argument
doesn't have to.

the semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x;
@@

x =
- kzalloc
+ kmalloc
 (...)
...
sg_init_table(x,...)
// </smpl>

Signed-off-by: Julia Lawall 

---
 block/bsg-lib.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -u -p a/block/bsg-lib.c b/block/bsg-lib.c
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -207,7 +207,7 @@ static int bsg_map_buffer(struct bsg_buf
 
BUG_ON(!req->nr_phys_segments);
 
-   buf->sg_list = kzalloc(sz, GFP_KERNEL);
+   buf->sg_list = kmalloc(sz, GFP_KERNEL);
if (!buf->sg_list)
return -ENOMEM;
sg_init_table(buf->sg_list, req->nr_phys_segments);


