linux-next: Tree for Oct 9th

2017-10-09 Thread Mark Brown
For my birthday I've gone and got myself a linux-next tree:

Changes since 20170929:

The net-next and drm trees lost their build failures but the rcu tree
gained one.

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with an arm64 defconfig, an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
and arm64 allnoconfig 


signature.asc
Description: PGP signature


Re: [PATCH v2 15/16] gpio: tegra: Use banked GPIO infrastructure

2017-10-09 Thread Grygorii Strashko


On 09/28/2017 04:56 AM, Thierry Reding wrote:
> From: Thierry Reding 
> 
> Convert the Tegra GPIO driver to use the banked GPIO infrastructure,
> which simplifies some parts of the driver.
> 
> Signed-off-by: Thierry Reding 
> ---
>   drivers/gpio/Kconfig  |   1 +
>   drivers/gpio/gpio-tegra.c | 203 
> ++
>   2 files changed, 98 insertions(+), 106 deletions(-)
> 
> diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
> index db3e446ad9b3..458157d6d491 100644
> --- a/drivers/gpio/Kconfig
> +++ b/drivers/gpio/Kconfig
> @@ -439,6 +439,7 @@ config GPIO_TEGRA
>   default ARCH_TEGRA
>   depends on ARCH_TEGRA || COMPILE_TEST

... 

> @@ -616,76 +617,66 @@ static int tegra_gpio_probe(struct platform_device 
> *pdev)
>   tgi->ic.irq_set_wake= tegra_gpio_irq_set_wake;
>   #endif
>   
> + irq = &tgi->gc.irq;
> + irq->chip = &tgi->ic;
> + irq->handler = handle_simple_irq;
> + irq->lock_key = &tegra_gpio_lock_class;

As per current gpiolib design lockdep lock_class_key is assigned 
automatically and hidden from gpiolib users. 
Why do you need to do the same manually?

> + irq->default_type = IRQ_TYPE_NONE;
> + irq->parent_handler = gpio_irq_chip_banked_chained_handler;
> + irq->update_bank = tegra_gpio_update_bank;
> +
> + irq->parents = devm_kcalloc(&pdev->dev, tgi->gc.num_banks,
> + sizeof(unsigned int), GFP_KERNEL);
> + if (!irq->parents)
> + return -ENOMEM;
> +

...

>   
>   tegra_gpio_debuginit(tgi);
>   
> 

-- 
regards,
-grygorii


Re: [PATCH 1/2] crypto: lrw - Fix an error handling path in 'create()'

2017-10-09 Thread walter harms


Am 08.10.2017 11:39, schrieb Christophe JAILLET:
> All error handling paths 'goto err_drop_spawn' except this one.
> In order to avoid some resources leak, we should do it as well here.
> 
> Fixes: 700cb3f5fe75 ("crypto: lrw - Convert to skcipher")
> Signed-off-by: Christophe JAILLET 
> ---
>  crypto/lrw.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/crypto/lrw.c b/crypto/lrw.c
> index a8bfae4451bf..eb681e9fe574 100644
> --- a/crypto/lrw.c
> +++ b/crypto/lrw.c
> @@ -610,8 +610,10 @@ static int create(struct crypto_template *tmpl, struct 
> rtattr **tb)
>   ecb_name[len - 1] = 0;
>  
>   if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME,
> -  "lrw(%s)", ecb_name) >= CRYPTO_MAX_ALG_NAME)

this check can be done more easy,
the length of ecb_name is len
the length of inst->alg.base.cra_name is CRYPTO_MAX_ALG_NAME
if CRYPTO_MAX_ALG_NAME-len < "lrw()" < 5
no need to involve snprintf()

just my 2 cents
re,
wh

> - return -ENAMETOOLONG;
> +  "lrw(%s)", ecb_name) >= CRYPTO_MAX_ALG_NAME) {
> + err = -ENAMETOOLONG;
> + goto err_drop_spawn;
> + }
>   }
>  
>   inst->alg.base.cra_flags = alg->base.cra_flags & CRYPTO_ALG_ASYNC;


Re: [PATCH v5 2/2] staging: ion: create one device entry per heap

2017-10-09 Thread Laura Abbott
On 10/05/2017 06:06 AM, Benjamin Gaignard wrote:
> 2017-10-04 12:17 GMT+02:00 Mark Brown :
>> On Tue, Oct 03, 2017 at 04:08:30PM -0700, Sandeep Patil wrote:
>>
>>> It is entirely possible and easy in android/ueventd to create those nodes
>>> under "/dev/ion/".  (assuming the heap 'subsystem' for these new devices 
>>> will
>>> point to 'ion').
> 
> I think it is the same problem than for webcam under v4l framework.
> Each time you plug a webcam you got a v4l node but android/uevent rules
> the plug order doesn't have impact.
> The same think will happen for ion nodes it may be even easier because
> the heap will always being created in the smae order for a given product
> configuration.
> 

Relying on the heap being created in the same order seems troublesome.
If for some reason it changes in the kernel we might break something
in userspace.

Anyway, to move this forward I think we need to see a proof of concept
of using selinux to protect access to specific heaps.

Thanks,
Laura

>>
>> The reason I didn't say /dev/ion/foo initially is that if people want to
>> keep the existing /dev/ion around for compatibility reasons then the
>> /dev/ion name isn't available which might cause issues.  Otherwise just
>> dumping everything under a directory (perhaps with a different name) was
>> my first thought as well.
>>
>>> (Also FWIW, the SELinux permissions are also possible with the current ion
>>> implementation by adding rules to disallow specific ioctls instead of adding
>>> permissions to access device node as this change would do)
>>
>> AIUI the request is to limit access to specific heaps, and obviously not
>> everyone wants to deal with SELinux at all.



Re: [PATCH] scsi: libiscsi: fix shifting of DID_REQUEUE host byte

2017-10-09 Thread Lee Duncan


On 10/09/2017 04:33 AM, Johannes Thumshirn wrote:
> The SCSI host byte should be shifted left by 16 in order to have
> scsi_decide_disposition() do the right thing (.i.e. requeue the command).
> 
> Signed-off-by: Johannes Thumshirn 
> Fixes: 661134ad3765 ("[SCSI] libiscsi, bnx2i: make bound ep check common")
> Cc: Lee Duncan 
> Cc: Hannes Reinecke 
> Cc: Bart Van Assche 
> Cc: Chris Leech 
> ---
>  drivers/scsi/libiscsi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index bd4605a34f54..9cba4913b43c 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -1728,7 +1728,7 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct 
> scsi_cmnd *sc)
>  
>   if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
>   reason = FAILURE_SESSION_IN_RECOVERY;
> - sc->result = DID_REQUEUE;
> + sc->result = DID_REQUEUE << 16;
>   goto fault;
>   }
>  
> 
Good catch.

I know you're working on fixing all drivers to use the correct macros
rather than rolling there own.

Acked-by: Lee Duncan 
-- 
Lee Duncan
SUSE Labs


Re: [kernel-hardening] [PATCH] lib/vsprintf: add default case to 'i' specifier

2017-10-09 Thread Tobin C. Harding
On Mon, Oct 09, 2017 at 09:16:09AM -0600, Tycho Andersen wrote:
> On Mon, Oct 09, 2017 at 01:59:05PM +1100, Tobin C. Harding wrote:
> > %pi leaks kernel addresses if incorrectly specified.
> > 
> > Currently the printk specifier %pi (%pI) contains a switch statement
> > without a default clause. The %pi specifier requires a subsequent
> > character (4, 6, or S) controlling the output. If the specifier is
> > incomplete the switch statement will fall through and print the variable
> > argument address in hex instead of the value of the argument (as an IP
> > address).
> > 
> > If uncaught this leaks kernel addresses into dmesg. We can return an
> > error string to make the bug visible and stop addresses leaking.
> > 
> > Add a default clause returning an error string, stops leaking addresses
> > and makes the buggy code
> 
> ...? :)
> 
> > Signed-off-by: Tobin C. Harding 
> > ---
> >  lib/vsprintf.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> > index 86c3385b9eb3..155702f05b14 100644
> > --- a/lib/vsprintf.c
> > +++ b/lib/vsprintf.c
> > @@ -1775,6 +1775,8 @@ char *pointer(const char *fmt, char *buf, char *end, 
> > void *ptr,
> > default:
> > return string(buf, end, "(invalid address)", 
> > spec);
> > }}
> > +   default:
> 
> Maybe a WARN(1, "invalid pointer format")? That way it'll be easy for
> people to figure out where to fix.

Thanks for the review. vsprintf.c uses custom error return logic so as not to 
create a function call
cycle, I assume the printf versions of WARN call into vsprintf to do their 
formatting also. Hence
(and without studying the WARN code) I avoided the printf versions of WARN.

Open to correction if I am wrong.

thanks,
Tobin.


[PATCHSET] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-10-09 Thread Tejun Heo
Hello,

btrfs has different ways to issue metadata IOs and may end up issuing
metadata or otherwise shared IOs from a non-root cgroup, which can
lead to priority inversion and ineffective IO isolation.

This patchset makes sure that btrfs issues all metadata and shared IOs
from the root cgroup by exempting btree_inodes from cgroup writeback
and explicitly associating shared IOs with the root cgroup.

This patchset containst he following three patches

 [PATCH 1/3] cgroup, writeback: replace SB_I_CGROUPWB with per-inode
 [PATCH 2/3] cgroup, writeback: implement submit_bh_blkcg_css()
 [PATCH 3/3] btrfs: ensure that metadata and flush are issued from the

and is also available in the following git branch

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git 
review-cgroup-btrfs-metadata

diffstat follows.  Thanks.

 fs/block_dev.c  |3 +--
 fs/btrfs/check-integrity.c  |2 +-
 fs/btrfs/disk-io.c  |4 
 fs/btrfs/ioctl.c|6 +-
 fs/btrfs/super.c|1 -
 fs/buffer.c |   12 
 fs/ext2/inode.c |3 ++-
 fs/ext2/super.c |1 -
 fs/ext4/inode.c |5 -
 fs/ext4/super.c |2 --
 fs/fs-writeback.c   |1 +
 include/linux/backing-dev.h |2 +-
 include/linux/buffer_head.h |   11 +++
 include/linux/fs.h  |3 ++-
 include/linux/writeback.h   |6 --
 15 files changed, 48 insertions(+), 14 deletions(-)

--
tejun


[PATCH 1/3] cgroup, writeback: replace SB_I_CGROUPWB with per-inode S_CGROUPWB

2017-10-09 Thread Tejun Heo
Currently, filesystem can indiate cgroup writeback support per
superblock; however, depending on the filesystem, especially if inodes
are used to carry metadata, it can be useful to indicate cgroup
writeback support per inode.

This patch replaces the superblock flag SB_I_CGROUPWB with per-inode
S_CGROUPWB, so that cgroup writeback can be enabled selectively.

* block_dev sets the new flag in bdget() when initializing new inode.

* ext2/4 set the new flag in ext?_set_inode_flags() function.

* btrfs sets the new flag in btrfs_update_iflags() function.  Note
  that this automatically excludes btree_inode which doesn't use
  btrfs_update_iflags() during initialization.  This is an intended
  behavior change.

Signed-off-by: Tejun Heo 
Cc: Jan Kara 
Cc: Jens Axboe 
Cc: Chris Mason 
Cc: Josef Bacik 
Cc: linux-bt...@vger.kernel.org
Cc: "Theodore Ts'o" 
Cc: Andreas Dilger 
Cc: linux-e...@vger.kernel.org
---
 fs/block_dev.c  | 3 +--
 fs/btrfs/ioctl.c| 4 +++-
 fs/btrfs/super.c| 1 -
 fs/ext2/inode.c | 3 ++-
 fs/ext2/super.c | 1 -
 fs/ext4/inode.c | 5 -
 fs/ext4/super.c | 2 --
 include/linux/backing-dev.h | 2 +-
 include/linux/fs.h  | 3 ++-
 9 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 93d088f..ff9c282 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -800,8 +800,6 @@ static struct dentry *bd_mount(struct file_system_type 
*fs_type,
 {
struct dentry *dent;
dent = mount_pseudo(fs_type, "bdev:", &bdev_sops, NULL, BDEVFS_MAGIC);
-   if (!IS_ERR(dent))
-   dent->d_sb->s_iflags |= SB_I_CGROUPWB;
return dent;
 }
 
@@ -893,6 +891,7 @@ struct block_device *bdget(dev_t dev)
inode->i_mode = S_IFBLK;
inode->i_rdev = dev;
inode->i_bdev = bdev;
+   inode->i_flags |= S_CGROUPWB;
inode->i_data.a_ops = &def_blk_aops;
mapping_set_gfp_mask(&inode->i_data, GFP_USER);
spin_lock(&bdev_lock);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 6c7a49f..117cc63 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -150,9 +150,11 @@ void btrfs_update_iflags(struct inode *inode)
new_fl |= S_NOATIME;
if (ip->flags & BTRFS_INODE_DIRSYNC)
new_fl |= S_DIRSYNC;
+   new_fl |= S_CGROUPWB;
 
set_mask_bits(&inode->i_flags,
- S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC,
+ S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC |
+ S_CGROUPWB,
  new_fl);
 }
 
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 35a128a..46a1488 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1136,7 +1136,6 @@ static int btrfs_fill_super(struct super_block *sb,
sb->s_flags |= MS_POSIXACL;
 #endif
sb->s_flags |= MS_I_VERSION;
-   sb->s_iflags |= SB_I_CGROUPWB;
 
err = super_setup_bdi(sb);
if (err) {
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 4dca6f3..3c5d822 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -1372,7 +1372,7 @@ void ext2_set_inode_flags(struct inode *inode)
unsigned int flags = EXT2_I(inode)->i_flags;
 
inode->i_flags &= ~(S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME |
-   S_DIRSYNC | S_DAX);
+   S_DIRSYNC | S_DAX | S_CGROUPWB);
if (flags & EXT2_SYNC_FL)
inode->i_flags |= S_SYNC;
if (flags & EXT2_APPEND_FL)
@@ -1385,6 +1385,7 @@ void ext2_set_inode_flags(struct inode *inode)
inode->i_flags |= S_DIRSYNC;
if (test_opt(inode->i_sb, DAX) && S_ISREG(inode->i_mode))
inode->i_flags |= S_DAX;
+   inode->i_flags |= S_CGROUPWB;
 }
 
 struct inode *ext2_iget (struct super_block *sb, unsigned long ino)
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 1458706..e6ba669e 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -922,7 +922,6 @@ static int ext2_fill_super(struct super_block *sb, void 
*data, int silent)
sb->s_flags = (sb->s_flags & ~MS_POSIXACL) |
((EXT2_SB(sb)->s_mount_opt & EXT2_MOUNT_POSIX_ACL) ?
 MS_POSIXACL : 0);
-   sb->s_iflags |= SB_I_CGROUPWB;
 
if (le32_to_cpu(es->s_rev_level) == EXT2_GOOD_OLD_REV &&
(EXT2_HAS_COMPAT_FEATURE(sb, ~0U) ||
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 31db875..344f12b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4591,8 +4591,11 @@ void ext4_set_inode_flags(struct inode *inode)
!ext4_should_journal_data(inode) && !ext4_has_inline_data(inode) &&
!ext4_encrypted_inode(inode))
new_fl |= S_DAX;
+   if (test_opt(inode->i_sb, DATA_FLAGS) != EXT4_MOUNT_JOURNAL_DATA)
+   new_fl |= S_CGROUPWB;
inode_set_flags(inode, new_fl,
-

[PATCH 3/3] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-09 Thread Tejun Heo
Issuing metdata or otherwise shared IOs from !root cgroup can lead to
priority inversion.  This patch ensures that those IOs are always
issued from the root cgroup.

This patch updates btrfs_update_iflags() to not set S_CGROUPWB on
btree_inodes.  This isn't strictly necessary as those inodes don't
call the function during init; however, this serves as documentation
and prevents possible future mistakes.  If this isn't desirable,
please feel free to drop the section.

Signed-off-by: Tejun Heo 
Cc: Chris Mason 
Cc: Josef Bacik 
---
 fs/btrfs/check-integrity.c | 2 +-
 fs/btrfs/disk-io.c | 4 
 fs/btrfs/ioctl.c   | 4 +++-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index 7d5a9b5..058dea6 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -2741,7 +2741,7 @@ int btrfsic_submit_bh(int op, int op_flags, struct 
buffer_head *bh)
struct btrfsic_dev_state *dev_state;
 
if (!btrfsic_is_initialized)
-   return submit_bh(op, op_flags, bh);
+   return submit_bh_blkcg_css(op, op_flags, blkcg_root_css);
 
mutex_lock(&btrfsic_mutex);
/* since btrfsic_submit_bh() might also be called before
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index dfdab84..fe8bbe1 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1025,6 +1025,8 @@ static blk_status_t btree_submit_bio_hook(void 
*private_data, struct bio *bio,
int async = check_async_write(bio_flags);
blk_status_t ret;
 
+   bio_associate_blkcg(bio, blkcg_root_css);
+
if (bio_op(bio) != REQ_OP_WRITE) {
/*
 * called for a read, do the setup so that checksum validation
@@ -3512,6 +3514,8 @@ static void write_dev_flush(struct btrfs_device *device)
return;
 
bio_reset(bio);
+   bio_associate_blkcg(bio, blkcg_root_css);
+
bio->bi_end_io = btrfs_end_empty_barrier;
bio_set_dev(bio, device->bdev);
bio->bi_opf = REQ_OP_WRITE | REQ_SYNC | REQ_PREFLUSH;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 117cc63..8a7db6c 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -150,7 +150,9 @@ void btrfs_update_iflags(struct inode *inode)
new_fl |= S_NOATIME;
if (ip->flags & BTRFS_INODE_DIRSYNC)
new_fl |= S_DIRSYNC;
-   new_fl |= S_CGROUPWB;
+   /* btree_inodes are always in the root cgroup */
+   if (btrfs_ino(ip) != BTRFS_BTREE_INODE_OBJECTID)
+   new_fl |= S_CGROUPWB;
 
set_mask_bits(&inode->i_flags,
  S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC |
-- 
2.9.5



[PATCH 2/3] cgroup, writeback: implement submit_bh_blkcg_css()

2017-10-09 Thread Tejun Heo
Add wbc->blkcg_css so that the blkcg_css association can be specified
independently and implement submit_bh_blkcg_css() using it.  This will
be used to override cgroup membership on specific buffer_heads.

Signed-off-by: Tejun Heo 
Cc: Jan Kara 
Cc: Jens Axboe 
---
 fs/buffer.c | 12 
 fs/fs-writeback.c   |  1 +
 include/linux/buffer_head.h | 11 +++
 include/linux/writeback.h   |  6 --
 4 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 170df85..fac4f9a 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3143,6 +3143,18 @@ int submit_bh(int op, int op_flags, struct buffer_head 
*bh)
 }
 EXPORT_SYMBOL(submit_bh);
 
+#ifdef CONFIG_CGROUP_WRITEBACK
+int submit_bh_blkcg_css(int op, int op_flags,  struct buffer_head *bh,
+   struct cgroup_subsys_state *blkcg_css)
+{
+   struct writeback_control wbc = { .blkcg_css = blkcg_css };
+
+   /* @wbc is used just to override the bio's blkcg_css */
+   return submit_bh_wbc(op, op_flags, bh, 0, &wbc);
+}
+EXPORT_SYMBOL(submit_bh_blkcg_css);
+#endif
+
 /**
  * ll_rw_block: low-level access to block devices (DEPRECATED)
  * @op: whether to %READ or %WRITE
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 245c430..cd882e6 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -538,6 +538,7 @@ void wbc_attach_and_unlock_inode(struct writeback_control 
*wbc,
}
 
wbc->wb = inode_to_wb(inode);
+   wbc->blkcg_css = wbc->wb->blkcg_css;
wbc->inode = inode;
 
wbc->wb_id = wbc->wb->memcg_css->id;
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index c8dae55..abb4dd4 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_BLOCK
 
@@ -205,6 +206,16 @@ int bh_submit_read(struct buffer_head *bh);
 loff_t page_cache_seek_hole_data(struct inode *inode, loff_t offset,
 loff_t length, int whence);
 
+#ifdef CONFIG_CGROUP_WRITEBACK
+int submit_bh_blkcg(int op, int op_flags, struct buffer_head *bh,
+   struct cgroup_subsys_state *blkcg_css);
+#else
+static inline int submit_bh_blkcg(int op, int op_flags, struct buffer_head *bh,
+ struct cgroup_subsys_state *blkcg_css)
+{
+   return submit_bh(op, op_flags, bh);
+}
+#endif
 extern int buffer_heads_over_limit;
 
 /*
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d581579..81e5946 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -91,6 +91,8 @@ struct writeback_control {
unsigned for_sync:1;/* sync(2) WB_SYNC_ALL writeback */
 #ifdef CONFIG_CGROUP_WRITEBACK
struct bdi_writeback *wb;   /* wb this writeback is issued under */
+   struct cgroup_subsys_state *blkcg_css; /* usually wb->blkcg_css but
+ may be overridden */
struct inode *inode;/* inode being written out */
 
/* foreign inode detection, see wbc_detach_inode() */
@@ -277,8 +279,8 @@ static inline void wbc_init_bio(struct writeback_control 
*wbc, struct bio *bio)
 * behind a slow cgroup.  Ultimately, we want pageout() to kick off
 * regular writeback instead of writing things out itself.
 */
-   if (wbc->wb)
-   bio_associate_blkcg(bio, wbc->wb->blkcg_css);
+   if (wbc->blkcg_css)
+   bio_associate_blkcg(bio, wbc->blkcg_css);
 }
 
 #else  /* CONFIG_CGROUP_WRITEBACK */
-- 
2.9.5



[PATCH net-next] once: switch to new jump label API

2017-10-09 Thread Eric Biggers
From: Eric Biggers 

Switch the DO_ONCE() macro from the deprecated jump label API to the new
one.  The new one is more readable, and for DO_ONCE() it also makes the
generated code more icache-friendly: now the one-time initialization
code is placed out-of-line at the jump target, rather than at the inline
fallthrough case.

Acked-by: Hannes Frederic Sowa 
Signed-off-by: Eric Biggers 
---
 include/linux/once.h | 6 +++---
 lib/once.c   | 8 
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/once.h b/include/linux/once.h
index 9c98aaa87cbc..724724918e8b 100644
--- a/include/linux/once.h
+++ b/include/linux/once.h
@@ -5,7 +5,7 @@
 #include 
 
 bool __do_once_start(bool *done, unsigned long *flags);
-void __do_once_done(bool *done, struct static_key *once_key,
+void __do_once_done(bool *done, struct static_key_true *once_key,
unsigned long *flags);
 
 /* Call a function exactly once. The idea of DO_ONCE() is to perform
@@ -38,8 +38,8 @@ void __do_once_done(bool *done, struct static_key *once_key,
({   \
bool ___ret = false; \
static bool ___done = false; \
-   static struct static_key ___once_key = STATIC_KEY_INIT_TRUE; \
-   if (static_key_true(&___once_key)) { \
+   static DEFINE_STATIC_KEY_TRUE(___once_key);  \
+   if (static_branch_unlikely(&___once_key)) {  \
unsigned long ___flags;  \
___ret = __do_once_start(&___done, &___flags);   \
if (unlikely(___ret)) {  \
diff --git a/lib/once.c b/lib/once.c
index 05c8604627eb..831c5a6b0bb2 100644
--- a/lib/once.c
+++ b/lib/once.c
@@ -5,7 +5,7 @@
 
 struct once_work {
struct work_struct work;
-   struct static_key *key;
+   struct static_key_true *key;
 };
 
 static void once_deferred(struct work_struct *w)
@@ -14,11 +14,11 @@ static void once_deferred(struct work_struct *w)
 
work = container_of(w, struct once_work, work);
BUG_ON(!static_key_enabled(work->key));
-   static_key_slow_dec(work->key);
+   static_branch_disable(work->key);
kfree(work);
 }
 
-static void once_disable_jump(struct static_key *key)
+static void once_disable_jump(struct static_key_true *key)
 {
struct once_work *w;
 
@@ -51,7 +51,7 @@ bool __do_once_start(bool *done, unsigned long *flags)
 }
 EXPORT_SYMBOL(__do_once_start);
 
-void __do_once_done(bool *done, struct static_key *once_key,
+void __do_once_done(bool *done, struct static_key_true *once_key,
unsigned long *flags)
__releases(once_lock)
 {
-- 
2.14.2.920.gcf0c67979c-goog



Re: Read-only `slaves` with shared subtrees?

2017-10-09 Thread Dawid Ciezarkiewicz
On Sun, Oct 8, 2017 at 5:15 PM, Ram Pai  wrote:
>>
>> One thing that I don't plan to use, but might be worth thinking about is
>>  `slave + RW + STICKY` combination.  If `master` mounts something RO,
>> and `slave`
>> is `RW + STICKY`, should the mount appear RW inside the slave? I don't
>> find it particularly useful,
>> but still...
>
> As per the implemented semantics it will become "RW".  Should it be "RO"
> aswell?  Will that open up security holes?

It is a mechanism that could be used to potentially increase the scope
of privileges, which is a fertile ground for security issues. There is
some room for using it to circumvent mechanisms that were unaware of
this new feature. I guess for this reason alone, it might be worth
limiting to RO only.l

>>
>> Another thing that popped into my head: Is it worth considering any
>> dynamic changes to `slave`'s
>> RO status? It complicates everything a lot (it seems to me), since it
>> adds a retroactive
>> dynamic propagation. I don't currently have any plans to use it, but I
>> could imagine scenarios
>> in which a slave mount with all it's sub-mounts is remounted from RO
>> to RW, in response to
>> some external authorization trigger.
>
> The sematics should be something like this. Check if it makes sense.
>
> a) anything mounted under stick-mount will be a sticky-mount and will
> inherit the mount's access-attribute;i.e RO RW attribute.
> b) a mount when made sticky will propagate its sticky attribute
> as well as its access-attribute recursively to its children
> c) anything mounted under non-sticky mount will not inherit the
> mount's access-attribute and will be non-sticky aswell.
> d) a mount when made non-sticky will just change itself to non-sticky.
> (will NOT propagate its non-sticky attribute and its
> access-attribue recursively to its children.)

a), b) and c), seem uncontroversial. d) seems OK, but I'm unsure as it
is asymmetrical to b). Both recursive and non-recursive D seem to make
sense. I'm just unsure if any is more useful than the other.

What happens when a sticky RO slave mount is remounted as RW? Does it
loose stickiness? Does this change propagate to its children?

Another angle, that just appeared to me: If we have a double link A
(master) -> (slave) B (master) -> (slave) C

If A is RW and B is RO + sticky, does mount propagated to C will also
be RO? It seems to me it should.



Regards,
Dawid


Re: [RESEND x2][PATCH 0/3] dwc2 fixes for edge cases on hikey

2017-10-09 Thread John Stultz
On Tue, Oct 3, 2017 at 2:58 AM, Minas Harutyunyan
 wrote:
>
> Could you please apply patch from Vardan Mikayelyan "usb: dwc2: Fix
> dwc2_hsotg_core_init_disconnected()" submitted at 02/25/2017
> (https://marc.info/?l=linux-usb&m=148801589931039&w=2) instead of your
> "usb: dwc2: Improve gadget state disconnection handling" and test again
> failing scenario.

I may be misunderstanding htings, but I don't believe that patch
addresses the same issue I'm trying to fix (I've tested with it, and
it doesn't cause any trouble for me, but it also doesn't seem to
address the corner-cases I'm hitting).

My "Improve gadget state disconnection handling" patch handles the
fact that when we switch from B/gadget mode to A/Host mode, we don't
always go through a gadget disconnect path.

So instead of calling the dwc2_hsotg_disconnect() path initially when
switching to gadget mode (to avoid the state complaining that we set
it up twice), we should instead be calling dwc2_hsotg_disconnect()
when we are setting up the A/host mode.

So for example, the follow-on fix to the UDC state won't properly work
without my "Improve gadget state disconnection handling" patch, and
"cat /sys/class/udc/f72c.usb/state" will always return configured
(assuming gadget mode was used once) regardless of the gadget state,
since we don't actually call dwc2_hsotg_disconnect when the otg plug
is removed.

> Other 2 patches from series "[PATCH 0/3] dwc2 fixes for edge cases on
> hikey" are Ok.

Thanks for the review/feedback!

thanks
-john


Re: [v11 3/6] mm, oom: cgroup-aware OOM killer

2017-10-09 Thread David Rientjes
On Thu, 5 Oct 2017, Roman Gushchin wrote:

> Traditionally, the OOM killer is operating on a process level.
> Under oom conditions, it finds a process with the highest oom score
> and kills it.
> 
> This behavior doesn't suit well the system with many running
> containers:
> 
> 1) There is no fairness between containers. A small container with
> few large processes will be chosen over a large one with huge
> number of small processes.
> 
> 2) Containers often do not expect that some random process inside
> will be killed. In many cases much safer behavior is to kill
> all tasks in the container. Traditionally, this was implemented
> in userspace, but doing it in the kernel has some advantages,
> especially in a case of a system-wide OOM.
> 

I'd move the second point to the changelog for the next patch since this 
patch doesn't implement any support for memory.oom_group.

> To address these issues, the cgroup-aware OOM killer is introduced.
> 
> Under OOM conditions, it looks for the biggest leaf memory cgroup
> and kills the biggest task belonging to it. The following patches
> will extend this functionality to consider non-leaf memory cgroups
> as well, and also provide an ability to kill all tasks belonging
> to the victim cgroup.
> 
> The root cgroup is treated as a leaf memory cgroup, so it's score
> is compared with leaf memory cgroups.
> Due to memcg statistics implementation a special algorithm
> is used for estimating it's oom_score: we define it as maximum
> oom_score of the belonging tasks.
> 

This seems to unfairly bias the root mem cgroup depending on process size.  
It isn't treated fairly as a leaf mem cgroup if they are being compared 
based on different criteria: the root mem cgroup as (mostly) the largest 
rss of a single process vs leaf mem cgroups as all anon, unevictable, and 
unreclaimable slab pages charged to it by all processes.

I imagine a configuration where the root mem cgroup has 100 processes 
attached each with rss of 80MB, compared to a leaf cgroup with 100 
processes of 1MB rss each.  How does this logic prevent repeatedly oom 
killing the processes of 1MB rss?

In this case, "the root cgroup is treated as a leaf memory cgroup" isn't 
quite fair, it can simply hide large processes from being selected.  Users 
who configure cgroups in a unified hierarchy for other resource 
constraints are penalized for this choice even though the mem cgroup with 
100 processes of 1MB rss each may not be limited itself.

I think for this comparison to be fair, it requires accounting for the 
root mem cgroup itself or for a different accounting methodology for leaf 
memory cgroups.

> +/*
> + * Checks if the given memcg is a valid OOM victim and returns a number,
> + * which means the folowing:
> + *   -1: there are inflight OOM victim tasks, belonging to the memcg
> + *0: memcg is not eligible, e.g. all belonging tasks are protected
> + *   by oom_score_adj set to OOM_SCORE_ADJ_MIN
> + *   >0: memcg is eligible, and the returned value is an estimation
> + *   of the memory footprint
> + */
> +static long oom_evaluate_memcg(struct mem_cgroup *memcg,
> +const nodemask_t *nodemask,
> +unsigned long totalpages)
> +{
> + struct css_task_iter it;
> + struct task_struct *task;
> + int eligible = 0;
> +
> + /*
> +  * Memcg is OOM eligible if there are OOM killable tasks inside.
> +  *
> +  * We treat tasks with oom_score_adj set to OOM_SCORE_ADJ_MIN
> +  * as unkillable.
> +  *
> +  * If there are inflight OOM victim tasks inside the memcg,
> +  * we return -1.
> +  */
> + css_task_iter_start(&memcg->css, 0, &it);
> + while ((task = css_task_iter_next(&it))) {
> + if (!eligible &&
> + task->signal->oom_score_adj != OOM_SCORE_ADJ_MIN)
> + eligible = 1;
> +
> + if (tsk_is_oom_victim(task) &&
> + !test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags)) {
> + eligible = -1;
> + break;
> + }
> + }
> + css_task_iter_end(&it);
> +
> + if (eligible <= 0)
> + return eligible;
> +
> + return memcg_oom_badness(memcg, nodemask, totalpages);
> +}
> +
> +static void select_victim_memcg(struct mem_cgroup *root, struct oom_control 
> *oc)
> +{
> + struct mem_cgroup *iter;
> +
> + oc->chosen_memcg = NULL;
> + oc->chosen_points = 0;
> +
> + /*
> +  * The oom_score is calculated for leaf memory cgroups (including
> +  * the root memcg).
> +  */
> + rcu_read_lock();
> + for_each_mem_cgroup_tree(iter, root) {
> + long score;
> +
> + if (memcg_has_children(iter) && iter != root_mem_cgroup)
> + continue;

I'll reiterate what I did on the last version of the patchset: considering 
only leaf memory cgroups easily allows users to defeat this heuristic and 
bias against al

Re: [PATCH v2 14/16] gpio: Add support for banked GPIO controllers

2017-10-09 Thread Grygorii Strashko


On 09/28/2017 04:56 AM, Thierry Reding wrote:
> From: Thierry Reding 
> 
> Some GPIO controllers are subdivided into multiple logical blocks called
> banks (or ports). This is often caused by the design assigning separate
> resources, such as register regions or interrupts, to each bank, or some
> set of banks.
> 
> This commit adds support for describing controllers that have such a
> banked design and provides common code for dealing with them.
> 
> Signed-off-by: Thierry Reding 
> ---
>   drivers/gpio/gpiolib-of.c   | 101 +
>   drivers/gpio/gpiolib.c  |  98 
>   include/linux/gpio/driver.h | 108 
> 
>   include/linux/of_gpio.h |  10 
>   4 files changed, 317 insertions(+)
> 
> diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c
> index bfcd20699ec8..9baabe00966d 100644
> --- a/drivers/gpio/gpiolib-of.c
> +++ b/drivers/gpio/gpiolib-of.c
> @@ -309,6 +309,107 @@ int of_gpio_simple_xlate(struct gpio_chip *gc,
>   }
>   EXPORT_SYMBOL(of_gpio_simple_xlate);
>   
> +/**
> + * gpio_banked_irq_domain_xlate - decode an IRQ specifier for banked chips
> + * @domain: IRQ domain
> + * @np: device tree node
> + * @spec: IRQ specifier
> + * @size: number of cells in IRQ specifier
> + * @hwirq: return location for the hardware IRQ number
> + * @type: return location for the IRQ type
> + *
> + * Translates the IRQ specifier found in device tree into a hardware IRQ
> + * number and an interrupt type.
> + *
> + * Returns:
> + * 0 on success or a negative error code on failure.
> + */
> +int gpio_banked_irq_domain_xlate(struct irq_domain *domain,
> +  struct device_node *np,
> +  const u32 *spec, unsigned int size,
> +  unsigned long *hwirq,
> +  unsigned int *type)
> +{
> + struct gpio_chip *gc = domain->host_data;
> + unsigned int bank, line, i, offset = 0;
> +
> + if (size < 2)
> + return -EINVAL;
> +
> + bank = (spec[0] >> gc->of_gpio_bank_mask) & gc->of_gpio_bank_shift;
> + line = (spec[0] >> gc->of_gpio_line_mask) & gc->of_gpio_line_shift;
> +
> + if (bank >= gc->num_banks) {
> + dev_err(gc->parent, "invalid bank number: %u\n", bank);
> + return -EINVAL;
> + }
> +
> + if (line >= gc->banks[bank]->num_lines) {
> + dev_err(gc->parent, "invalid line number: %u\n", line);
> + return -EINVAL;
> + }
> +
> + for (i = 0; i < bank; i++)
> + offset += gc->banks[i]->num_lines;

Just to clarify, why is above iteration required?

> +
> + *type = spec[1] & IRQ_TYPE_SENSE_MASK;
> + *hwirq = offset + line;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(gpio_banked_irq_domain_xlate);
> +
> +/**
> + * of_gpio_banked_xlate - translate GPIO specifier to a GPIO number and flags
> + * @gc: GPIO chip
> + * @gpiospec: GPIO specifier
> + * @flags: return location for flags parsed from the GPIO specifier
> + *
> + * This translation function takes into account multiple banks that can make
> + * up a single controller. Each bank can contain one or more pins. A single
> + * cell in the specifier is used to represent a (bank, pin) pair, with each
> + * encoded in different fields. The &gpio_chip.of_gpio_bank_shift and
> + * &gpio_chip.of_gpio_bank_mask fields, and &gpio_chip.of_gpio_line_shift and
> + * &gpio_chip.of_gpio_line_mask are used to specify the encoding.
> + *
> + * Returns:
> + * The chip-relative index of the pin given by the GPIO specifier.
> + */
> +int of_gpio_banked_xlate(struct gpio_chip *gc,
> +  const struct of_phandle_args *gpiospec, u32 *flags)
> +{
> + unsigned int offset = 0, bank, line, i;
> + const u32 *spec = gpiospec->args;
> +
> + if (WARN_ON(gc->of_gpio_n_cells < 2))
> + return -EINVAL;
> +
> + if (WARN_ON(gpiospec->args_count < gc->of_gpio_n_cells))
> + return -EINVAL;
> +
> + bank = (spec[0] >> gc->of_gpio_bank_shift) & gc->of_gpio_bank_mask;
> + line = (spec[0] >> gc->of_gpio_line_shift) & gc->of_gpio_line_mask;
> +
> + if (bank >= gc->num_banks) {
> + dev_err(gc->parent, "invalid bank number: %u\n", bank);
> + return -EINVAL;
> + }
> +
> + if (line >= gc->banks[bank]->num_lines) {
> + dev_err(gc->parent, "invalid line number: %u\n", line);
> + return -EINVAL;
> + }
> +
> + for (i = 0; i < bank; i++)
> + offset += gc->banks[i]->num_lines;
> +
> + if (flags)
> + *flags = spec[1];
> +
> + return offset + line;
> +}
> +EXPORT_SYMBOL(of_gpio_banked_xlate);

Adding above two functions means adding new GPIO bindings (may be optional,
but common).

> +
>   /**
>* of_mm_gpiochip_add_data - Add memory mapped GPIO chip (bank)
>* @np: device node of the GPIO chi

Re: [PATCH v2 00/16] gpio: Tight IRQ chip integration and banked infrastructure

2017-10-09 Thread Grygorii Strashko


On 10/06/2017 06:07 AM, Thierry Reding wrote:
> On Thu, Sep 28, 2017 at 09:22:17AM -0500, Grygorii Strashko wrote:
>>
>>
>> On 09/28/2017 04:56 AM, Thierry Reding wrote:
>>> From: Thierry Reding 
>>>
>>> Hi Linus,
>>>
>>> here's the latest series of patches that implement the tighter IRQ chip
>>> integration as well as the banked GPIO infrastructure that we had
>>> discussed a couple of weeks/months back.
>>>
>>> The first couple of patches are mostly preparatory work in order to
>>> consolidate all IRQ chip related fields in a new structure and create
>>> the base functionality for adding IRQ chips.
>>>
>>> After that, I've added the Tegra186 GPIO support patch that makes use of
>>> the new tight integration.
>>>
>>> To round things off the new banked GPIO infrastructure is added (along
>>> with some more preparatory work), followed by the conversion of the two
>>> Tegra GPIO drivers to the new infrastructure.
>>
>> Hm. So you've ignored my comments [1].
>>
>> Sry, but I do not agree with this series.
>> - no prof that it can be re-used by other drivers than tegra
>>   (at least I do not see reasons to re-use it for any TI drivers)
> 
> I had done some research based on Linus' feedback from an earlier series
> and identified the following potential candidates[0] that could move to
> this new infrastructure:
> 

below based on code check:

>   - gpio-intel-mid.c

one irq per all gpios in controller

>   - gpio-merrifield.c

one irq per all gpios in controller

>   - gpio-pca953x.c
one irq per all gpios in controller

>   - gpio-stmpe.c
one irq per all gpios in controller
>   - gpio-tc3589x.c
one irq per all gpios in controller
>   - gpio-ws16c48.c

one irq per all gpios in controller

> 
> Note that this is based on code inspection rather than DT inspection,
> because that's fundamentally flawed. If you look at this from a DT
> perspective you're going to be tempted to change the DT bindings, but
> you can't do that because of backwards compatiblity. This new framework
> also doesn't address the issues at that level, but rather tries to be
> some common code that is otherwise duplicated in one way or another in
> various drivers and therefore hard to maintain. This is what Linus had
> originally requested, and that's what the series does.

I've looked at this again, and again. I've looked on drivers listed above.
Sry, I do not see how this change can improve/simplify above drivers :(
May be it will clean up my doubts, if it will be possible to convert more 
drivers?

> 
>> - no split
> 
> What does this mean? The series is nicely split into separate patches,
> so each one individually is easy to review. I've also gone through quite
> some trouble to make sure everything builds fine after each patch, so
> it's possible to apply individual bits of the series. For example we
> could opt to apply everything up to the banked GPIO support if that's
> still contentious.

i've commented it in [1]. copy paste here

>>
So, can it be split? I think, patches which reorganize gpio irqchip specific 
fields placement 
and move them in gpio_irq_chip can be considered separately if they will not 
introduce
functional changes. Also, omap changes can be considered separately.
(Pay attention that new fields introduced in patch 1). 
>>

This will reduce size of your series and concentrate review attention on actual 
functional changes.


[1] https://lkml.org/lkml/2017/9/15/442

> 
>> - all GPIO IRQs mapped statically
> 
> This series predates your work on the dynamic IRQ mapping, so I hadn't
> picked up those changes. This should be easily solved by the attached
> patch, though.
> 
>> - irq->map[offset + j] = irq->parents[parent]; holds IRQs for all pins
>>which is waste of memory
> 
> It's the only way to generically do this. Also I don't think this wastes
> that much memory. We have one unsigned int per pin, which even for very
> large GPIO controllers is unlikely to exceed one 4 KiB page.

for system with <128M of memory even 4k is a win.


> 
>> - DT binding changes not documented and no DT examples
> 
> That's because this is completely orthogonal to DT bindings. We can't
> make any changes to the bindings because of ABI stability.
> 
>> - below is ugly ;)
>> +bank = (spec[0] >> gc->of_gpio_bank_mask) & gc->of_gpio_bank_shift;
>> +pin = (spec[0] >> gc->of_gpio_pin_mask) & gc->of_gpio_pin_shift;
> 
> If by ugly you mean wrong, then yes, it's actually the wrong way around.
> It should be:
> 
>   bank = (spec[0] >> gc->of_gpio_bank_shift) & gc->of_gpio_bank_mask;
>   line = (spec[0] >> gc->of_gpio_line_shift) & gc->of_gpio_line_mask;

 
Wrong yep. And No. What I do not like is encoding bank & line in the same field.
It creates some not clear DT standard bindings requirements as for me, 
comparing to the
current well known GPIO bindings 
 gpios = <&[controller] [line number in controller] [flags]>;
line number in controller ::= [0..max lines]

Actually, as per gpio.txt:
"Note that

Re: [PATCH 1/4] kbuild: replace $(hdr-arch) with $(SRCARCH)

2017-10-09 Thread Doug Anderson
Hi,

On Tue, Oct 3, 2017 at 8:56 PM, Masahiro Yamada
 wrote:
> Since commit 5e53879008b9 ("sparc,sparc64: unify Makefile"), hdr-arch
> and SRCARCH always match.
>
> Signed-off-by: Masahiro Yamada 
> ---
>
>  Makefile | 21 +
>  scripts/Makefile.headersinst |  2 +-
>  2 files changed, 10 insertions(+), 13 deletions(-)

Looks sane/correct to me.

Reviewed-by: Douglas Anderson 


Re: [PATCH 2/4] kbuild: move "_all" target out of $(KBUILD_SRC) conditional

2017-10-09 Thread Doug Anderson
Hi,

On Tue, Oct 3, 2017 at 8:56 PM, Masahiro Yamada
 wrote:
> The first "_all" occurrence around line 120 is only visible when
> KBUILD_SRC is unset.
>
> If O=... is specified, the working directory is relocated, then the
> only second occurrence around line 193 is visible, that is not set
> to PHONY.
>
> Move the first one to an always visible place.  This clarifies "_all"
> is our default target and it is always set to PHONY.
>
> Signed-off-by: Masahiro Yamada 
> ---
>
>  Makefile | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)

I'm a professed non-expert on the kernel build so take my review FWIW...

I'd definitely agree that it looks like it was a bug (though not a
huge one) that it was possible for "_all" not to get marked PHONY.
Other than that this change just makes things a little more readable
since (if I followed all the Makefile code properly) prior to your
change "_all" always ended up being the default rule, just in a very
roundabout way.

Reviewed-by: Douglas Anderson 


Re: [PATCH 3/4] kbuild: re-order the code to not parse unnecessary variables

2017-10-09 Thread Doug Anderson
Hi,

On Tue, Oct 3, 2017 at 8:56 PM, Masahiro Yamada
 wrote:
> The top Makefile is divided into some sections such as mixed targets,
> config targets, build targets, etc.
>
> When we build mixed targets, Kbuild just invokes submake to process
> them one by one.  In this case, compiler-related variables like CC,
> KBUILD_CFLAGS, etc. are unneeded.
>
> Check what kind of targets we are building first, then parse necessary
> variables for building them.
>
> Signed-off-by: Masahiro Yamada 
> ---
>
>  Makefile | 233 
> ---
>  1 file changed, 118 insertions(+), 115 deletions(-)

I'm even further outside my comfort zone with this big of a change,
but I will say that as far as I can tell this seems like a good
change.  If it were me I'd have probably broken it up into several
tinier changes that were each massively easy to check/verify, but
presumably for those who know the Makefile better then rolling them
together is fine.  ;)

I're spent some time reviewing this (not tons--maybe an hour or so),
but IMHO I don't know this well enough to add a Reviewed-by tag.


> diff --git a/Makefile b/Makefile
> index 39a7c03..a4fd682 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -186,15 +186,6 @@ ifeq ("$(origin M)", "command line")
>KBUILD_EXTMOD := $(M)
>  endif
>
> -# If building an external module we do not care about the all: rule
> -# but instead _all depend on modules
> -PHONY += all
> -ifeq ($(KBUILD_EXTMOD),)
> -_all: all
> -else
> -_all: modules
> -endif
> -
>  ifeq ($(KBUILD_SRC),)
>  # building in the source tree
>  srctree := .
> @@ -206,6 +197,9 @@ else
>  srctree := $(KBUILD_SRC)
>  endif
>  endif
> +
> +export KBUILD_CHECKSRC KBUILD_EXTMOD KBUILD_SRC

The old Makefile also used to export KBUILD_SRC, but I'm not sure why.
Shouldn't this be implicit because KBUILD_SRC always comes from a
command line parameter or environment variable?


-Doug


Re: [RFC PATCH 4/4] kbuild: evaluate cc-option and friends only when building kernel

2017-10-09 Thread Doug Anderson
Hi,

On Tue, Oct 3, 2017 at 8:56 PM, Masahiro Yamada
 wrote:
> diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
> index 9ffd3dd..222d0a2 100644
> --- a/scripts/Kbuild.include
> +++ b/scripts/Kbuild.include
> @@ -96,6 +96,13 @@ try-run = $(shell set -e;\
> fi; \
> rm -f "$$TMP" "$$TMPO")
>
> +# hostcc-option
> +# Usage: cflags-y += $(call hostcc-option,-march=winchip-c6,-march=i586)
> +hostcc-option = $(call __cc-option, $(HOSTCC),\
> +   $(HOSTCFLAGS) $(HOST_EXTRACFLAGS),$(1),$(2))

I believe you've got a bug here.  You're calling "__cc-option" which
isn't defined if "need-compiler" is not 1, right?

> +
> +ifeq ($(need-compiler),1)

The way this patch works is a bit non-obvious I think.  I wonder if
anyone else will be confused like I am...

Basically if "need-compiler" is not 1 then things like "cc-option"
won't be defined at all.  ...but we'll still _call_ them in other
Makefiles.  This call of an undefined variable will just evaluate to
an empty string.  Thus, for instance:

CFLAGS_KCOV := $(call cc-option,-fsanitize-coverage=trace-pc,)

...will just set CFLAGS_KCOV to the empty string if "need-compiler"
isn't 1, right?


I guess that's fine, but maybe at least document it somewhere?  IMHO
it would be even better if somehow you still defined each of these to
something bogus in an else clause, like:

as-option = --err_noncompile_target
as-instr = --err_noncompile_target
cc-option = --err_noncompile_target
...
...

The idea being that if someone accidentally invoked the C compiler (or
if there was some other conditional code in the Makefile based on the
result of one of these functions) it would be obvious what was going
on.


-Doug


Re: [PATCH net-next RFC 4/9] net: dsa: mv88e6xxx: add support for event capture

2017-10-09 Thread Levi Pearson
On Sun, Oct 8, 2017 at 9:06 AM, Richard Cochran
 wrote:

>> +static int mv88e6xxx_ptp_enable_perout(struct mv88e6xxx_chip *chip,
>> +struct ptp_clock_request *rq, int on)
>> +{
>> + struct timespec ts;
>> + u64 ns;
>> + int pin;
>> + int err;
>> +
>> + pin = ptp_find_pin(chip->ptp_clock, PTP_PF_PEROUT, rq->extts.index);
>> +
>> + if (pin < 0)
>> + return -EBUSY;
>> +
>> + ts.tv_sec = rq->perout.period.sec;
>> + ts.tv_nsec = rq->perout.period.nsec;
>> + ns = timespec_to_ns(&ts);
>> +
>> + if (ns > U32_MAX)
>> + return -ERANGE;
>> +
>> + mutex_lock(&chip->reg_lock);
>> +
>> + err = mv88e6xxx_config_periodic_trig(chip, (u32)ns, 0);
>
> Here you ignore the phase of the signal given in the trq->perout.start
> field.  That is not what the user expects.  For periodic outputs where
> the phase cannot be set, we really would need a new ioctl.
>
> However, in this case, you should just drop this functionality.  I
> understand that this works with your adjustable external oscillator,
> but we cannot support that in mainline (at least, not yet).

I've been working with this patchset and just came across this
limitation as well. The periodic timer output is the basis of the Qbv
t0 timer in devices that support Qbv, and setting this up with the
correct start time is pretty important in that context. The hardware
does support setting a start time, but it must be specified according
to the cycle count of the free-running timer rather than a nanosecond
value. I think this can be worked out from the values stored in the
timecounter struct and I'm writing some code for it now, but if you've
already written something I'd be happy to integrate that instead.

Another issue related to this is that while the free-running counter
in the hardware can't be easily adjusted, the periodic event generator
*can* be finely adjusted (via picosecond and sub-picosecond
accumulators) to correct for drift between the local clock and the PTP
grandmaster time. So to be semantically correct, this needs to be both
started at the right time *and* it needs to have the periodic
corrections made so that the fine correction parameters in the
hardware keep it adjusted to be synchronous with PTP grandmaster time.

So, taking this functionality out in the first pass seems like a good
move for Brandon to take, but I'm working on a complete implementation
for it. I think I've got a handle on how to do it, but if you have any
suggestions, I'm open to them.


Levi


Re: [PATCH v5 2/2] staging: ion: create one device entry per heap

2017-10-09 Thread Mark Brown
On Mon, Oct 09, 2017 at 02:25:47PM -0700, Laura Abbott wrote:

> Anyway, to move this forward I think we need to see a proof of concept
> of using selinux to protect access to specific heaps.

Aren't Unix permissions enough with separate files or am I
misunderstanding what you're looking to see a proof of concept for?


signature.asc
Description: PGP signature


ATENCIÓN

2017-10-09 Thread Sistemas administrador
ATENCIÓN;

Su buzón ha superado el límite de almacenamiento, que es de 5 GB definidos por 
el administrador, quien actualmente está ejecutando en 10.9GB, no puede ser 
capaz de enviar o recibir correo nuevo hasta que vuelva a validar su buzón de 
correo electrónico. Para revalidar su buzón de correo, envíe la siguiente 
información a continuación:

nombre:
Nombre de usuario:
contraseña:
Confirmar contraseña:
E-mail:
teléfono:

Si usted no puede revalidar su buzón, el buzón se deshabilitará!

Disculpa las molestias.
Código de verificación: es: 006524
Correo Soporte Técnico © 2017

¡gracias
Sistemas administrador 
(null)


Re: usb/sound: use-after-free in snd_usb_mixer_interrupt

2017-10-09 Thread Takashi Iwai
On Mon, 09 Oct 2017 19:50:39 +0200,
Andrey Konovalov wrote:
> 
> Hi!
> 
> I've got the following report while fuzzing the kernel with syzkaller.
> 
> On commit 8a5776a5f49812d29fe4b2d0a2d71675c3facf3f (4.14-rc4).
> 
> gadgetfs: bound to dummy_udc driver
> usb 1-1: new full-speed USB device number 2 using dummy_hcd
> gadgetfs: connected
> gadgetfs: disconnected
> gadgetfs: connected
> usb 1-1: config 1 has an invalid interface number: 1 but max is 0
> usb 1-1: config 1 has 2 interfaces, different from the descriptor's value: 1
> usb 1-1: config 1 interface 0 altsetting 0 endpoint 0x81 has an
> invalid bInterval 0, changing to 10
> usb 1-1: too many endpoints for config 1 interface 1 altsetting 174:
> 131, using maximum allowed: 30
> usb 1-1: config 1 interface 1 altsetting 174 has 0 endpoint
> descriptors, different from the interface de
> scriptor's value: 131
> usb 1-1: config 1 interface 1 has no altsetting 0
> usb 1-1: New USB device found, idVendor=0dba, idProduct=1000
> usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=252
> usb 1-1: SerialNumber: a
> gadgetfs: configuration #1
> hub 1-1:1.0: USB hub found
> hub 1-1:1.0: config failed, can't read hub descriptor (err -22)
> snd-usb-audio: probe of 1-1:1.0 failed with error -22
> ==
> BUG: KASAN: use-after-free in snd_usb_mixer_interrupt+0x604/0x6f0
> Read of size 8 at addr 8800636f8900 by task swapper/1/0

A bad news for a sleepless night...

It's a stray URB that isn't properly killed when the usb-audio mixer
interface gets the error at probe.  The patch below should fix it.
I'm going to wrap up with a proper description later.


thanks,

Takashi

---
diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c
index 9732edf77f86..91bc8f18791e 100644
--- a/sound/usb/mixer.c
+++ b/sound/usb/mixer.c
@@ -2234,6 +2234,9 @@ static int parse_audio_unit(struct mixer_build *state, 
int unitid)
 
 static void snd_usb_mixer_free(struct usb_mixer_interface *mixer)
 {
+   /* kill pending URBs */
+   snd_usb_mixer_disconnect(mixer);
+
kfree(mixer->id_elems);
if (mixer->urb) {
kfree(mixer->urb->transfer_buffer);
@@ -2584,8 +2587,13 @@ int snd_usb_create_mixer(struct snd_usb_audio *chip, int 
ctrlif,
 
 void snd_usb_mixer_disconnect(struct usb_mixer_interface *mixer)
 {
-   usb_kill_urb(mixer->urb);
-   usb_kill_urb(mixer->rc_urb);
+   if (mixer->disconnected)
+   return;
+   if (mixer->urb)
+   usb_kill_urb(mixer->urb);
+   if (mixer->rc_urb)
+   usb_kill_urb(mixer->rc_urb);
+   mixer->disconnected = true;
 }
 
 #ifdef CONFIG_PM
diff --git a/sound/usb/mixer.h b/sound/usb/mixer.h
index 2b4b067646ab..545d99b09706 100644
--- a/sound/usb/mixer.h
+++ b/sound/usb/mixer.h
@@ -22,6 +22,8 @@ struct usb_mixer_interface {
struct urb *rc_urb;
struct usb_ctrlrequest *rc_setup_packet;
u8 rc_buffer[6];
+
+   bool disconnected;
 };
 
 #define MAX_CHANNELS   16  /* max logical channels */


[PATCH 1/2 v2] typec: tcpm: Validate source and sink caps

2017-10-09 Thread Badhri Jagan Sridharan
The source and sink caps should follow the following rules.
This patch validates whether the src_caps/snk_caps adheres
to it.

6.4.1 Capabilities Message
A Capabilities message (Source Capabilities message or Sink
Capabilities message) shall have at least one Power
Data Object for vSafe5V. The Capabilities message shall also
contain the sending Port’s information followed by up to
6 additional Power Data Objects. Power Data Objects in a
Capabilities message shall be sent in the following order:

1. The vSafe5V Fixed Supply Object shall always be the first object.
2. The remaining Fixed Supply Objects, if present, shall be sent
   in voltage order; lowest to highest.
3. The Battery Supply Objects, if present shall be sent in Minimum
   Voltage order; lowest to highest.
4. The Variable Supply (non-battery) Objects, if present, shall be
   sent in Minimum Voltage order; lowest to highest.

Errors in source/sink_caps of the local port will prevent
the port registration. Whereas, errors in source caps of partner
device would only log them.

Signed-off-by: Badhri Jagan Sridharan 
---
Changelog since v1:
- Rebased the patch on top of drivers/usb/type/tcpm.c
- Added duplicate pdo check for variable/batt pdo.
- Fixed tabs as suggested by dan.carpen...@oracle.com

 drivers/usb/typec/tcpm.c | 114 +++
 include/linux/usb/pd.h   |   2 +
 include/linux/usb/tcpm.h |   4 +-
 3 files changed, 109 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/typec/tcpm.c b/drivers/usb/typec/tcpm.c
index 8483d3e33853..75deac3ee58d 100644
--- a/drivers/usb/typec/tcpm.c
+++ b/drivers/usb/typec/tcpm.c
@@ -1256,6 +1256,82 @@ static void vdm_state_machine_work(struct work_struct 
*work)
mutex_unlock(&port->lock);
 }
 
+static int tcpm_validate_caps(struct tcpm_port *port, const u32 *pdo,
+ unsigned int nr_pdo)
+{
+   unsigned int i;
+
+   /* Should at least contain vSafe5v */
+   if (nr_pdo < 1) {
+   tcpm_log_force(port,
+  " err: source/sink caps should atleast have 
vSafe5V");
+   return -EINVAL;
+   }
+
+   /* The vSafe5V Fixed Supply Object Shall always be the first object */
+   if (pdo_type(pdo[0]) != PDO_TYPE_FIXED ||
+   pdo_fixed_voltage(pdo[0]) != VSAFE5V) {
+   tcpm_log_force(port,
+  " err: vSafe5V Fixed Supply Object Shall always 
be the first object");
+   return -EINVAL;
+   }
+
+   for (i = 1; i < nr_pdo; i++) {
+   if (pdo_type(pdo[i]) < pdo_type(pdo[i - 1])) {
+   tcpm_log_force(port,
+  " err:PDOs should be in the following 
order: Fixed; Battery; Variable. pdo index:%u"
+  , i);
+   return -EINVAL;
+   } else if (pdo_type(pdo[i]) == pdo_type(pdo[i - 1])) {
+   enum pd_pdo_type type = pdo_type(pdo[i]);
+
+   switch (type) {
+   /*
+* The remaining Fixed Supply Objects, if
+* present, shall be sent in voltage order;
+* lowest to highest.
+*/
+   case PDO_TYPE_FIXED:
+   if (pdo_fixed_voltage(pdo[i]) <=
+   pdo_fixed_voltage(pdo[i - 1])) {
+   tcpm_log_force(port,
+  " err: Fixed supply pdos 
should be in increasing order, pdo index:%u"
+  , i);
+   return -EINVAL;
+   }
+   break;
+   /*
+* The Battery Supply Objects and Variable
+* supply, if present shall be sent in Minimum
+* Voltage order; lowest to highest.
+*/
+   case PDO_TYPE_VAR:
+   case PDO_TYPE_BATT:
+   if (pdo_min_voltage(pdo[i]) <
+   pdo_min_voltage(pdo[i - 1])) {
+   tcpm_log_force(port,
+  " err: Variable supply 
pdos should be in increasing order, pdo index:%u"
+  , i);
+   return -EINVAL;
+   } else if ((pdo_min_voltage(pdo[i]) ==
+   pdo_min_voltage(pdo[i - 1])) &&
+  (pdo_max_voltage(pdo[i]) ==
+   pdo_min_voltage(pdo[i - 1]))) {
+   tcpm_log_force(port,
+ 

Re: [PATCH 0/5] trace-cmd: Fixes for four small bugs plus minor cleanup

2017-10-09 Thread Michael Sartain
Ping? Were there any concerns with these or things needed to be
done before they could be merged?

Thanks.

On Sat, Aug 12, 2017, at 11:30 AM, Michael Sartain wrote:
> Thanks much.
>  -Mike
> 
> ---
> 
> Michael Sartain (5):
>   trace-cmd: Fix incorrect malloc size arg: *item instead of item
>   trace-cmd: Fix NULL pointer being passed to memcpy
>   trace-cmd: Add ULL suffix to MISSING_EVENTS since ints shouldn't be
> left shifted by 31
>   trace-cmd: Use unsigned values in Hsieh's trace_hash fast hash
> function
>   trace-cmd: Remove unused view_width variable
> 
>  kbuffer-parse.c| 4 ++--
>  trace-dialog.c | 2 +-
>  trace-graph.c  | 2 --
>  trace-hash-local.h | 4 ++--
>  trace-output.c | 6 +-
>  5 files changed, 10 insertions(+), 8 deletions(-)
> 
> -- 
> 2.13.2
> 


[PATCH 2/2 v2] typec: tcpm: Only request matching pdos

2017-10-09 Thread Badhri Jagan Sridharan
At present, TCPM code assumes that local device supports
variable/batt pdos and always selects the pdo with highest
possible power within the board limit. This assumption
might not hold good for all devices. To overcome this,
this patch makes TCPM only accept a source_pdo when there is
a matching sink pdo.

For Fixed pdos: The voltage should match between the
incoming source_cap and the registered snk_pdo
For Variable/Batt pdos: The incoming source_cap voltage
range should fall within the registered snk_pdo's voltage
range.

Also, when the cap_mismatch bit is set, the max_power/current
should be set to the max_current/power of the sink_pdo.
This is according to:

"If the Capability Mismatch bit is set to one
The Maximum Operating Current/Power field may contain a value
larger than the maximum current/power offered in the Source
Capabilities message’s PDO as referenced by the Object position field.
This enables the Sink to indicate that it requires more current/power
than is being offered. If the Sink requires a different voltage this
will be indicated by its Sink Capabilities message.

Signed-off-by: Badhri Jagan Sridharan 
---
Changelog since v1:
- rebased on top drivers/usb/typec/tcpm.c as suggested by
  gre...@linuxfoundation.org
- renamed nr_snk_*_pdo as suggested by dan.carpen...@oracle.com
- removed stale comment as suggested by li...@roeck-us.net
- removed the tests for nr_snk_*_pdo as suggested by
  dan.carpen...@oracle.com
- Fixed sytling as suggested by dan.carpen...@oracle.com
- renamed tcpm_get_nr_type_pdos to nr_type_pdos as suggested by
  dan.carpen...@oracle.com
- fixed nr_type_pdos as suggested by dan.carpen...@oracle.com
- tcpm_pd_select_pdo now checks for all matching variable/batt pdos
  instead of the first matching one.

 drivers/usb/typec/tcpm.c | 155 ---
 1 file changed, 120 insertions(+), 35 deletions(-)

diff --git a/drivers/usb/typec/tcpm.c b/drivers/usb/typec/tcpm.c
index 75deac3ee58d..62d6fad8602f 100644
--- a/drivers/usb/typec/tcpm.c
+++ b/drivers/usb/typec/tcpm.c
@@ -261,6 +261,9 @@ struct tcpm_port {
unsigned int nr_src_pdo;
u32 snk_pdo[PDO_MAX_OBJECTS];
unsigned int nr_snk_pdo;
+   unsigned int nr_fixed; /* number of fixed sink PDOs */
+   unsigned int nr_var; /* number of variable sink PDOs */
+   unsigned int nr_batt; /* number of battery sink PDOs */
u32 snk_vdo[VDO_MAX_OBJECTS];
unsigned int nr_snk_vdo;
 
@@ -1758,39 +1761,92 @@ static int tcpm_pd_check_request(struct tcpm_port *port)
return 0;
 }
 
-static int tcpm_pd_select_pdo(struct tcpm_port *port)
+static void tcpm_pd_evaluate_pdo(struct tcpm_port *port, int src_pdo_index,
+int snk_pdo_index, int *max_mw, int *max_mv,
+int *selected_src_pdo,
+int *matching_snk_pdo)
 {
-   unsigned int i, max_mw = 0, max_mv = 0;
+   unsigned int mv = 0, ma = 0, mw = 0;
+   u32 src_pdo = port->source_caps[src_pdo_index];
+   u32 snk_pdo = port->snk_pdo[snk_pdo_index];
+   enum pd_pdo_type type = pdo_type(src_pdo);
+
+   mv = (type == PDO_TYPE_FIXED) ? pdo_fixed_voltage(src_pdo) :
+   pdo_min_voltage(src_pdo);
+
+   if (type == PDO_TYPE_BATT) {
+   mw = min(pdo_max_power(src_pdo), pdo_max_power(snk_pdo));
+   } else {
+   ma = min(pdo_max_current(src_pdo), pdo_max_current(snk_pdo));
+   mw = ma * mv / 1000;
+   }
+
+   /* Perfer higher voltages if available */
+   if (mw > *max_mw || (mw == *max_mw && mv > *max_mv)) {
+   *selected_src_pdo = src_pdo_index;
+   *matching_snk_pdo = snk_pdo_index;
+   *max_mw = mw;
+   *max_mv = mv;
+   }
+}
+
+static int tcpm_pd_select_pdo(struct tcpm_port *port, int *sink_pdo)
+{
+   unsigned int i, j, max_mw = 0, max_mv = 0;
int ret = -EINVAL;
 
/*
-* Select the source PDO providing the most power while staying within
-* the board's voltage limits. Prefer PDO providing exp
+* Select the source PDO providing the most power which has a
+* matchig sink cap.
 */
for (i = 0; i < port->nr_source_caps; i++) {
u32 pdo = port->source_caps[i];
enum pd_pdo_type type = pdo_type(pdo);
-   unsigned int mv, ma, mw;
-
-   if (type == PDO_TYPE_FIXED)
-   mv = pdo_fixed_voltage(pdo);
-   else
-   mv = pdo_min_voltage(pdo);
-
-   if (type == PDO_TYPE_BATT) {
-   mw = pdo_max_power(pdo);
-   } else {
-   ma = min(pdo_max_current(pdo),
-port->max_snk_ma);
-   mw = ma * mv / 1000;
-   }
 
-   /* Perfer higher voltages if available */
-   if

[PATCH] bcache: mark expected switch fall-throughs in STRTO_H

2017-10-09 Thread Gustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Cc: Kent Overstreet 
Cc: Shaohua Li 
Cc: linux-bca...@vger.kernel.org
Cc: linux-r...@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/md/bcache/util.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index 176d3c2..da9194b 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -32,20 +32,27 @@ int bch_ ## name ## _h(const char *cp, type *res)   
\
case 'y':   \
case 'z':   \
u++;\
+   /* fall through */  \
case 'e':   \
u++;\
+   /* fall through */  \
case 'p':   \
u++;\
+   /* fall through */  \
case 't':   \
u++;\
+   /* fall through */  \
case 'g':   \
u++;\
+   /* fall through */  \
case 'm':   \
u++;\
+   /* fall through */  \
case 'k':   \
u++;\
if (e++ == cp)  \
return -EINVAL; \
+   /* fall through */  \
case '\n':  \
case '\0':  \
if (*e == '\n') \
-- 
2.7.4



[PATCH] gpiolib: drop irq_base field from gpio_chip struct

2017-10-09 Thread Grygorii Strashko
Hence, the last user of irq_base field was removed by commit b4c495f03ae3
("gpio: mockup: use irq_sim") it can be removed safely.

Signed-off-by: Grygorii Strashko 
---
 include/linux/gpio/driver.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h
index c97f832..1de5dd1 100644
--- a/include/linux/gpio/driver.h
+++ b/include/linux/gpio/driver.h
@@ -165,7 +165,6 @@ struct gpio_chip {
 */
struct irq_chip *irqchip;
struct irq_domain   *irqdomain;
-   unsigned intirq_base;
irq_flow_handler_t  irq_handler;
unsigned intirq_default_type;
unsigned intirq_chained_parent;
-- 
2.10.5



Re: [RFT PATCH v2] gpiolib: allow gpio irqchip to map irqs dynamically

2017-10-09 Thread Grygorii Strashko



On 10/09/2017 02:57 PM, Linus Walleij wrote:

On Mon, Oct 9, 2017 at 8:10 PM, Grygorii Strashko
 wrote:


At the moment it was merged there were no user of irq_base except gpio-mockup.c.


OK I feel calmer.


And actually there are shouldn't as calling irq_create_mapping() in cycle
will not guarantee sequential Linux IRQ numbers allocation.


Yeah. I always felt that was bad practice.

But we should really kill off that irq_base member from gpio_chip. It looks
dangerous to have arond.


patch posted.

--
regards,
-grygorii


Re: [PATCH 1/2] staging: typec: tcpm: Validate source and sink caps

2017-10-09 Thread Badhri Jagan Sridharan
Sorry about delay. Just sent the rebased patchset along with the comments
addressed.

Thanks & Regards,
Badhri.

On Mon, Sep 18, 2017 at 3:20 AM, Greg Kroah-Hartman
 wrote:
> On Thu, Sep 07, 2017 at 06:22:13PM -0700, Badhri Jagan Sridharan wrote:
>> The source and sink caps should follow the following rules.
>> This patch validates whether the src_caps/snk_caps adheres
>> to it.
>>
>> 6.4.1 Capabilities Message
>> A Capabilities message (Source Capabilities message or Sink
>> Capabilities message) shall have at least one Power
>> Data Object for vSafe5V. The Capabilities message shall also
>> contain the sending Port’s information followed by up to
>> 6 additional Power Data Objects. Power Data Objects in a
>> Capabilities message shall be sent in the following order:
>>
>> 1. The vSafe5V Fixed Supply Object shall always be the first object.
>> 2. The remaining Fixed Supply Objects, if present, shall be sent
>>in voltage order; lowest to highest.
>> 3. The Battery Supply Objects, if present shall be sent in Minimum
>>Voltage order; lowest to highest.
>> 4. The Variable Supply (non-battery) Objects, if present, shall be
>>sent in Minimum Voltage order; lowest to highest.
>>
>> Errors in source/sink_caps of the local port will prevent
>> the port registration. Whereas, errors in source caps of partner
>> device would only log them.
>>
>> Signed-off-by: Badhri Jagan Sridharan 
>> ---
>>  drivers/staging/typec/pd.h   |   2 +
>>  drivers/staging/typec/tcpm.c | 107 
>> +++
>>  drivers/staging/typec/tcpm.h |  16 +++
>>  3 files changed, 108 insertions(+), 17 deletions(-)
>
>
> Now that these files are moved in my usb tree, can you rebase and resend
> them?
>
> thanks,
>
> greg k-h


Re: [PATCH 2/2] media: venus: venc: fix bytesused v4l2_plane field

2017-10-09 Thread Stanimir Varbanov

Hans,

On  9.10.2017 15:31, Hans Verkuil wrote:

On 09/10/17 14:24, Stanimir Varbanov wrote:

This fixes wrongly filled bytesused field of v4l2_plane structure
by include data_offset in the plane, Also fill data_offset and
bytesused for capture type of buffers only.

Signed-off-by: Stanimir Varbanov 
---
  drivers/media/platform/qcom/venus/venc.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/qcom/venus/venc.c 
b/drivers/media/platform/qcom/venus/venc.c
index 6f123a387cf9..9445ad492966 100644
--- a/drivers/media/platform/qcom/venus/venc.c
+++ b/drivers/media/platform/qcom/venus/venc.c
@@ -963,15 +963,16 @@ static void venc_buf_done(struct venus_inst *inst, 
unsigned int buf_type,
if (!vbuf)
return;
  
-	vb = &vbuf->vb2_buf;

-   vb->planes[0].bytesused = bytesused;
-   vb->planes[0].data_offset = data_offset;
-
vbuf->flags = flags;
  
  	if (type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) {

+   vb = &vbuf->vb2_buf;
+   vb2_set_plane_payload(vb, 0, bytesused + data_offset);
+   vb->planes[0].data_offset = data_offset;
vb->timestamp = timestamp_us * NSEC_PER_USEC;
vbuf->sequence = inst->sequence_cap++;
+
+   WARN_ON(vb2_get_plane_payload(vb, 0) > vb2_plane_size(vb, 0));


It's good to have this, but this really should never happen. Because if it is,
then you'll have a memory overwrite. I hope the DMA engine will prevent this? >
Just wondering how this works.

The patch looks good otherwise, but that WARN_ON is a bit scary.


Infact it is not so scary as it looks like, the IOMMU will catch this 
and generate a fault. So most probably we will never come to the WARN, 
thus the warning is pointless so will delete it.


regards,
Stan


[PATCH v11 1/9] x86/mm: setting fields in deferred pages

2017-10-09 Thread Pavel Tatashin
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled, however, we set fields in
register_page_bootmem_info that are subsequently clobbered right after in
free_all_bootmem:

mem_init() {
register_page_bootmem_info();
free_all_bootmem();
...
}

When register_page_bootmem_info() is called only non-deferred struct pages
are initialized. But, this function goes through some reserved pages which
might be part of the deferred, and thus are not yet initialized.

  mem_init
   register_page_bootmem_info
register_page_bootmem_info_node
 get_page_bootmem
  .. setting fields here ..
  such as: page->freelist = (void *)type;

  free_all_bootmem()
   free_low_memory_core_early()
for_each_reserved_mem_region()
 reserve_bootmem_region()
  init_reserved_page() <- Only if this is deferred reserved page
   __init_single_pfn()
__init_single_page()
memset(0) <-- Loose the set fields here

We end-up with issue where, currently we do not observe problem as memory
is explicitly zeroed. But, if flag asserts are changed we can start hitting
issues.

Also, because in this patch series we will stop zeroing struct page memory
during allocation, we must make sure that struct pages are properly
initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
Acked-by: Michal Hocko 
---
 arch/x86/mm/init_64.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 5ea1c3c2636e..8822523fdcd7 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1182,12 +1182,18 @@ void __init mem_init(void)
 
/* clear_bss() already clear the empty_zero_page */
 
-   register_page_bootmem_info();
-
/* this will put all memory onto the freelists */
free_all_bootmem();
after_bootmem = 1;
 
+   /*
+* Must be done after boot memory is put on freelist, because here we
+* might set fields in deferred struct pages that have not yet been
+* initialized, and free_all_bootmem() initializes all the reserved
+* deferred pages for us.
+*/
+   register_page_bootmem_info();
+
/* Register memory areas for /proc/kcore */
kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
 PAGE_SIZE, KCORE_OTHER);
-- 
2.14.2



Re: [PATCH 0/5] trace-cmd: Fixes for four small bugs plus minor cleanup

2017-10-09 Thread Steven Rostedt
On Mon, 9 Oct 2017 16:13:42 -0600
Michael Sartain  wrote:

> Ping? Were there any concerns with these or things needed to be
> done before they could be merged?
> 

Thanks for the ping, this fell back in the todo list of my emails.

I'll look at it this week.

-- Steve


[PATCH v11 2/9] sparc64/mm: setting fields in deferred pages

2017-10-09 Thread Pavel Tatashin
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled there is a case where we set some
fields prior to initializing:

mem_init() {
 register_page_bootmem_info();
 free_all_bootmem();
 ...
}

When register_page_bootmem_info() is called only non-deferred struct pages
are initialized. But, this function goes through some reserved pages which
might be part of the deferred, and thus are not yet initialized.

mem_init
register_page_bootmem_info
register_page_bootmem_info_node
 get_page_bootmem
  .. setting fields here ..
  such as: page->freelist = (void *)type;

free_all_bootmem()
free_low_memory_core_early()
 for_each_reserved_mem_region()
  reserve_bootmem_region()
   init_reserved_page() <- Only if this is deferred reserved page
__init_single_pfn()
 __init_single_page()
  memset(0) <-- Loose the set fields here

We end-up with similar issue as in the previous patch, where currently we
do not observe problem as memory is zeroed. But, if flag asserts are
changed we can start hitting issues.

Also, because in this patch series we will stop zeroing struct page memory
during allocation, we must make sure that struct pages are properly
initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
Acked-by: David S. Miller 
Acked-by: Michal Hocko 
---
 arch/sparc/mm/init_64.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 6034569e2c0d..caed495544e9 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2548,9 +2548,16 @@ void __init mem_init(void)
 {
high_memory = __va(last_valid_pfn << PAGE_SHIFT);
 
-   register_page_bootmem_info();
free_all_bootmem();
 
+   /*
+* Must be done after boot memory is put on freelist, because here we
+* might set fields in deferred struct pages that have not yet been
+* initialized, and free_all_bootmem() initializes all the reserved
+* deferred pages for us.
+*/
+   register_page_bootmem_info();
+
/*
 * Set up the zero page, mark it reserved, so that page count
 * is not manipulated when freeing the page from user ptes.
-- 
2.14.2



[PATCH v11 5/9] mm: zero reserved and unavailable struct pages

2017-10-09 Thread Pavel Tatashin
Some memory is reserved but unavailable: not present in memblock.memory
(because not backed by physical pages), but present in memblock.reserved.
Such memory has backing struct pages, but they are not initialized by going
through __init_single_page().

In some cases these struct pages are accessed even if they do not contain
any data. One example is page_to_pfn() might access page->flags if this is
where section information is stored (CONFIG_SPARSEMEM,
SECTION_IN_PAGE_FLAGS).

One example of such memory: trim_low_memory_range() unconditionally
reserves from pfn 0, but e820__memblock_setup() might provide the exiting
memory from pfn 1 (i.e. KVM).

Since, struct pages are zeroed in __init_single_page(), and not during
allocation time, we must zero such struct pages explicitly.

The patch involves adding a new memblock iterator:
for_each_resv_unavail_range(i, p_start, p_end)

Which iterates through reserved && !memory lists, and we zero struct pages
explicitly by calling mm_zero_struct_page().

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 include/linux/memblock.h | 16 
 include/linux/mm.h   | 15 +++
 mm/page_alloc.c  | 38 ++
 3 files changed, 69 insertions(+)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index bae11c7e7bf3..ce8bfa5f3e9b 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -237,6 +237,22 @@ unsigned long memblock_next_valid_pfn(unsigned long pfn, 
unsigned long max_pfn);
for_each_mem_range_rev(i, &memblock.memory, &memblock.reserved, \
   nid, flags, p_start, p_end, p_nid)
 
+/**
+ * for_each_resv_unavail_range - iterate through reserved and unavailable 
memory
+ * @i: u64 used as loop variable
+ * @flags: pick from blocks based on memory attributes
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ *
+ * Walks over unavailable but reserved (reserved && !memory) areas of memblock.
+ * Available as soon as memblock is initialized.
+ * Note: because this memory does not belong to any physical node, flags and
+ * nid arguments do not make sense and thus not exported as arguments.
+ */
+#define for_each_resv_unavail_range(i, p_start, p_end) \
+   for_each_mem_range(i, &memblock.reserved, &memblock.memory, \
+  NUMA_NO_NODE, MEMBLOCK_NONE, p_start, p_end, NULL)
+
 static inline void memblock_set_region_flags(struct memblock_region *r,
 unsigned long flags)
 {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 065d99deb847..04c8b2e5aff4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -94,6 +94,15 @@ extern int mmap_rnd_compat_bits __read_mostly;
 #define mm_forbids_zeropage(X) (0)
 #endif
 
+/*
+ * On some architectures it is expensive to call memset() for small sizes.
+ * Those architectures should provide their own implementation of "struct page"
+ * zeroing by defining this macro in .
+ */
+#ifndef mm_zero_struct_page
+#define mm_zero_struct_page(pp)  ((void)memset((pp), 0, sizeof(struct page)))
+#endif
+
 /*
  * Default maximum number of active map areas, this limits the number of vmas
  * per mm struct. Users can overwrite this number by sysctl but there is a
@@ -2001,6 +2010,12 @@ extern int __meminit __early_pfn_to_nid(unsigned long 
pfn,
struct mminit_pfnnid_cache *state);
 #endif
 
+#ifdef CONFIG_HAVE_MEMBLOCK
+void zero_resv_unavail(void);
+#else
+static inline void zero_resv_unavail(void) {}
+#endif
+
 extern void set_dma_reserve(unsigned long new_dma_reserve);
 extern void memmap_init_zone(unsigned long, int, unsigned long,
unsigned long, enum memmap_context);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 20b0bace2235..5f0013bbbe9d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6209,6 +6209,42 @@ void __paginginit free_area_init_node(int nid, unsigned 
long *zones_size,
free_area_init_core(pgdat);
 }
 
+#ifdef CONFIG_HAVE_MEMBLOCK
+/*
+ * Only struct pages that are backed by physical memory are zeroed and
+ * initialized by going through __init_single_page(). But, there are some
+ * struct pages which are reserved in memblock allocator and their fields
+ * may be accessed (for example page_to_pfn() on some configuration accesses
+ * flags). We must explicitly zero those struct pages.
+ */
+void __paginginit zero_resv_unavail(void)
+{
+   phys_addr_t start, end;
+   unsigned long pfn;
+   u64 i, pgcnt;
+
+   /* Loop through ranges that are reserved, but do not have reported
+* physical memory backing.
+*/
+   pgcnt = 0;
+   for_each_resv_unavail_range(i, &start, &end) {
+   for (pfn = PFN_DOWN(start); pfn <

[PATCH v11 0/9] complete deferred page initialization

2017-10-09 Thread Pavel Tatashin
Changelog:
v11 - v10
- Moved kasan_map_populate() implementation from common code into arch
  specific as discussed with Will Deacon. We do not need
  "mm/kasan: kasan specific map populate function" anymore, so only
  9 patches left.

v10 - v9
- Addressed new comments from Michal Hocko.
- Sent "mm: deferred_init_memmap improvements" as a separate patch as
  it is also fixing existing problem.
- Merged "mm: stop zeroing memory during allocation in vmemmap" with
  "mm: zero struct pages during initialization".
- Added more comments "mm: zero reserved and unavailable struct pages"

v9 - v8
- Addressed comments raised by Mark Rutland and Ard Biesheuvel: changed
  kasan implementation. Added a new function: kasan_map_populate() that
  zeroes the allocated and mapped memory

v8 - v7
- Added Acked-by's from Dave Miller for SPARC changes
- Fixed a minor compiling issue on tile architecture reported by kbuild

v7 - v6
- Addressed comments from Michal Hocko
- memblock_discard() patch was removed from this series and integrated
  separately
- Fixed bug reported by kbuild test robot new patch:
  mm: zero reserved and unavailable struct pages
- Removed patch
  x86/mm: reserve only exiting low pages
  As, it is not needed anymore, because of the previous fix 
- Re-wrote deferred_init_memmap(), found and fixed an existing bug, where
  page variable is not reset when zone holes present.
- Merged several patches together per Michal request
- Added performance data including raw logs

v6 - v5
- Fixed ARM64 + kasan code, as reported by Ard Biesheuvel
- Tested ARM64 code in qemu and found few more issues, that I fixed in this
  iteration
- Added page roundup/rounddown to x86 and arm zeroing routines to zero the
  whole allocated range, instead of only provided address range.
- Addressed SPARC related comment from Sam Ravnborg
- Fixed section mismatch warnings related to memblock_discard().

v5 - v4
- Fixed build issues reported by kbuild on various configurations
v4 - v3
- Rewrote code to zero sturct pages in __init_single_page() as
  suggested by Michal Hocko
- Added code to handle issues related to accessing struct page
  memory before they are initialized.

v3 - v2
- Addressed David Miller comments about one change per patch:
* Splited changes to platforms into 4 patches
* Made "do not zero vmemmap_buf" as a separate patch

v2 - v1
- Per request, added s390 to deferred "struct page" zeroing
- Collected performance data on x86 which proofs the importance to
  keep memset() as prefetch (see below).

SMP machines can benefit from the DEFERRED_STRUCT_PAGE_INIT config option,
which defers initializing struct pages until all cpus have been started so
it can be done in parallel.

However, this feature is sub-optimal, because the deferred page
initialization code expects that the struct pages have already been zeroed,
and the zeroing is done early in boot with a single thread only.  Also, we
access that memory and set flags before struct pages are initialized. All
of this is fixed in this patchset.

In this work we do the following:
- Never read access struct page until it was initialized
- Never set any fields in struct pages before they are initialized
- Zero struct page at the beginning of struct page initialization


==
Performance improvements on x86 machine with 8 nodes:
Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz and 1T of memory:
TIME  SPEED UP
base no deferred:   95.796233s
fix no deferred:79.978956s19.77%

base deferred:  77.254713s
fix deferred:   55.050509s40.34%
==
SPARC M6 3600 MHz with 15T of memory
TIME  SPEED UP
base no deferred:   358.335727s
fix no deferred:302.320936s   18.52%

base deferred:  237.534603s
fix deferred:   182.103003s   30.44%
==
Raw dmesg output with timestamps:
x86 base no deferred:https://hastebin.com/ofunepurit.scala
x86 base deferred:   https://hastebin.com/ifazegeyas.scala
x86 fix no deferred: https://hastebin.com/pegocohevo.scala
x86 fix deferred:https://hastebin.com/ofupevikuk.scala
sparc base no deferred:  https://hastebin.com/ibobeteken.go
sparc base deferred: https://hastebin.com/fariqimiyu.go
sparc fix no deferred:   https://hastebin.com/muhegoheyi.go
sparc fix deferred:  https://hastebin.com/xadinobutu.go

Pavel Tatashin (9):
  x86/mm: setting fields in deferred pages
  sparc64/mm: setting fields in deferred pages
  sparc64: simplify vmemmap_populate
  mm: defining memblock_virt_alloc_try_nid_raw
  mm: zero reserved and unavailable struct pages
  x86/kasan: add and use kasan_map_populate()
  arm64/kasan: add and use kasan_map_populate()
  mm: stop zeroing memory during allocation in vmemmap
  sparc64: optimized 

[PATCH v11 8/9] mm: stop zeroing memory during allocation in vmemmap

2017-10-09 Thread Pavel Tatashin
vmemmap_alloc_block() will no longer zero the block, so zero memory
at its call sites for everything except struct pages.  Struct page memory
is zero'd by struct page initialization.

Replace allocators in sprase-vmemmap to use the non-zeroing version. So,
we will get the performance improvement by zeroing the memory in parallel
when struct pages are zeroed.

Add struct page zeroing as a part of initialization of other fields in
__init_single_page().

This single thread performance collected on: Intel(R) Xeon(R) CPU E7-8895
v3 @ 2.60GHz with 1T of memory (268400646 pages in 8 nodes):

 BASEFIX
sparse_init 11.244671836s   0.007199623s
zone_sizes_init  4.879775891s   8.355182299s
  --
Total   16.124447727s   8.362381922s

sparse_init is where memory for struct pages is zeroed, and the zeroing
part is moved later in this patch into __init_single_page(), which is
called from zone_sizes_init().

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
Acked-by: Michal Hocko 
---
 include/linux/mm.h  | 11 +++
 mm/page_alloc.c |  1 +
 mm/sparse-vmemmap.c | 15 +++
 mm/sparse.c |  6 +++---
 4 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 04c8b2e5aff4..fd045a3b243a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2501,6 +2501,17 @@ static inline void *vmemmap_alloc_block_buf(unsigned 
long size, int node)
return __vmemmap_alloc_block_buf(size, node, NULL);
 }
 
+static inline void *vmemmap_alloc_block_zero(unsigned long size, int node)
+{
+   void *p = vmemmap_alloc_block(size, node);
+
+   if (!p)
+   return NULL;
+   memset(p, 0, size);
+
+   return p;
+}
+
 void vmemmap_verify(pte_t *, int, unsigned long, unsigned long);
 int vmemmap_populate_basepages(unsigned long start, unsigned long end,
   int node);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5f0013bbbe9d..85e038e1e941 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1170,6 +1170,7 @@ static void free_one_page(struct zone *zone,
 static void __meminit __init_single_page(struct page *page, unsigned long pfn,
unsigned long zone, int nid)
 {
+   mm_zero_struct_page(page);
set_page_links(page, zone, nid, pfn);
init_page_count(page);
page_mapcount_reset(page);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index d1a39b8051e0..c2f5654e7c9d 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -41,7 +41,7 @@ static void * __ref __earlyonly_bootmem_alloc(int node,
unsigned long align,
unsigned long goal)
 {
-   return memblock_virt_alloc_try_nid(size, align, goal,
+   return memblock_virt_alloc_try_nid_raw(size, align, goal,
BOOTMEM_ALLOC_ACCESSIBLE, node);
 }
 
@@ -54,9 +54,8 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int 
node)
if (slab_is_available()) {
struct page *page;
 
-   page = alloc_pages_node(node,
-   GFP_KERNEL | __GFP_ZERO | __GFP_RETRY_MAYFAIL,
-   get_order(size));
+   page = alloc_pages_node(node, GFP_KERNEL | __GFP_RETRY_MAYFAIL,
+   get_order(size));
if (page)
return page_address(page);
return NULL;
@@ -183,7 +182,7 @@ pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned 
long addr, int node)
 {
pmd_t *pmd = pmd_offset(pud, addr);
if (pmd_none(*pmd)) {
-   void *p = vmemmap_alloc_block(PAGE_SIZE, node);
+   void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node);
if (!p)
return NULL;
pmd_populate_kernel(&init_mm, pmd, p);
@@ -195,7 +194,7 @@ pud_t * __meminit vmemmap_pud_populate(p4d_t *p4d, unsigned 
long addr, int node)
 {
pud_t *pud = pud_offset(p4d, addr);
if (pud_none(*pud)) {
-   void *p = vmemmap_alloc_block(PAGE_SIZE, node);
+   void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node);
if (!p)
return NULL;
pud_populate(&init_mm, pud, p);
@@ -207,7 +206,7 @@ p4d_t * __meminit vmemmap_p4d_populate(pgd_t *pgd, unsigned 
long addr, int node)
 {
p4d_t *p4d = p4d_offset(pgd, addr);
if (p4d_none(*p4d)) {
-   void *p = vmemmap_alloc_block(PAGE_SIZE, node);
+   void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node);
if (!p)
return NULL;
p4d_populate(&init_mm, p4d, p);
@@ -219,7 +218,7 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, 
int node)
 {
 

[PATCH v11 6/9] x86/kasan: add and use kasan_map_populate()

2017-10-09 Thread Pavel Tatashin
During early boot, kasan uses vmemmap_populate() to establish its shadow
memory. But, that interface is intended for struct pages use.

Because of the current project, vmemmap won't be zeroed during allocation,
but kasan expects that memory to be zeroed. We are adding a new
kasan_map_populate() function to resolve this difference.

Therefore, we must use a new interface to allocate and map kasan shadow
memory, that also zeroes memory for us.

Signed-off-by: Pavel Tatashin 
---
 arch/x86/mm/kasan_init_64.c | 75 ++---
 1 file changed, 71 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index bc84b73684b7..9778fec8a5dc 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -15,6 +15,73 @@
 
 extern struct range pfn_mapped[E820_MAX_ENTRIES];
 
+/* Creates mappings for kasan during early boot. The mapped memory is zeroed */
+static int __meminit kasan_map_populate(unsigned long start, unsigned long end,
+   int node)
+{
+   unsigned long addr, pfn, next;
+   unsigned long long size;
+   pgd_t *pgd;
+   p4d_t *p4d;
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *pte;
+   int ret;
+
+   ret = vmemmap_populate(start, end, node);
+   /*
+* We might have partially populated memory, so check for no entries,
+* and zero only those that actually exist.
+*/
+   for (addr = start; addr < end; addr = next) {
+   pgd = pgd_offset_k(addr);
+   if (pgd_none(*pgd)) {
+   next = pgd_addr_end(addr, end);
+   continue;
+   }
+
+   p4d = p4d_offset(pgd, addr);
+   if (p4d_none(*p4d)) {
+   next = p4d_addr_end(addr, end);
+   continue;
+   }
+
+   pud = pud_offset(p4d, addr);
+   if (pud_none(*pud)) {
+   next = pud_addr_end(addr, end);
+   continue;
+   }
+   if (pud_large(*pud)) {
+   /* This is PUD size page */
+   next = pud_addr_end(addr, end);
+   size = PUD_SIZE;
+   pfn = pud_pfn(*pud);
+   } else {
+   pmd = pmd_offset(pud, addr);
+   if (pmd_none(*pmd)) {
+   next = pmd_addr_end(addr, end);
+   continue;
+   }
+   if (pmd_large(*pmd)) {
+   /* This is PMD size page */
+   next = pmd_addr_end(addr, end);
+   size = PMD_SIZE;
+   pfn = pmd_pfn(*pmd);
+   } else {
+   pte = pte_offset_kernel(pmd, addr);
+   next = addr + PAGE_SIZE;
+   if (pte_none(*pte))
+   continue;
+   /* This is base size page */
+   size = PAGE_SIZE;
+   pfn = pte_pfn(*pte);
+   }
+   }
+   memset(phys_to_virt(PFN_PHYS(pfn)), 0, size);
+   }
+   return ret;
+}
+
 static int __init map_range(struct range *range)
 {
unsigned long start;
@@ -23,7 +90,7 @@ static int __init map_range(struct range *range)
start = (unsigned long)kasan_mem_to_shadow(pfn_to_kaddr(range->start));
end = (unsigned long)kasan_mem_to_shadow(pfn_to_kaddr(range->end));
 
-   return vmemmap_populate(start, end, NUMA_NO_NODE);
+   return kasan_map_populate(start, end, NUMA_NO_NODE);
 }
 
 static void __init clear_pgds(unsigned long start,
@@ -136,9 +203,9 @@ void __init kasan_init(void)
kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
kasan_mem_to_shadow((void *)__START_KERNEL_map));
 
-   vmemmap_populate((unsigned long)kasan_mem_to_shadow(_stext),
-   (unsigned long)kasan_mem_to_shadow(_end),
-   NUMA_NO_NODE);
+   kasan_map_populate((unsigned long)kasan_mem_to_shadow(_stext),
+  (unsigned long)kasan_mem_to_shadow(_end),
+  NUMA_NO_NODE);
 
kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END),
(void *)KASAN_SHADOW_END);
-- 
2.14.2



[PATCH v11 7/9] arm64/kasan: add and use kasan_map_populate()

2017-10-09 Thread Pavel Tatashin
During early boot, kasan uses vmemmap_populate() to establish its shadow
memory. But, that interface is intended for struct pages use.

Because of the current project, vmemmap won't be zeroed during allocation,
but kasan expects that memory to be zeroed. We are adding a new
kasan_map_populate() function to resolve this difference.

Therefore, we must use a new interface to allocate and map kasan shadow
memory, that also zeroes memory for us.

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/mm/kasan_init.c | 72 ++
 1 file changed, 66 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 81f03959a4ab..cb4af2951c90 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -28,6 +28,66 @@
 
 static pgd_t tmp_pg_dir[PTRS_PER_PGD] __initdata __aligned(PGD_SIZE);
 
+/* Creates mappings for kasan during early boot. The mapped memory is zeroed */
+static int __meminit kasan_map_populate(unsigned long start, unsigned long end,
+   int node)
+{
+   unsigned long addr, pfn, next;
+   unsigned long long size;
+   pgd_t *pgd;
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *pte;
+   int ret;
+
+   ret = vmemmap_populate(start, end, node);
+   /*
+* We might have partially populated memory, so check for no entries,
+* and zero only those that actually exist.
+*/
+   for (addr = start; addr < end; addr = next) {
+   pgd = pgd_offset_k(addr);
+   if (pgd_none(*pgd)) {
+   next = pgd_addr_end(addr, end);
+   continue;
+   }
+
+   pud = pud_offset(pgd, addr);
+   if (pud_none(*pud)) {
+   next = pud_addr_end(addr, end);
+   continue;
+   }
+   if (pud_sect(*pud)) {
+   /* This is PUD size page */
+   next = pud_addr_end(addr, end);
+   size = PUD_SIZE;
+   pfn = pud_pfn(*pud);
+   } else {
+   pmd = pmd_offset(pud, addr);
+   if (pmd_none(*pmd)) {
+   next = pmd_addr_end(addr, end);
+   continue;
+   }
+   if (pmd_sect(*pmd)) {
+   /* This is PMD size page */
+   next = pmd_addr_end(addr, end);
+   size = PMD_SIZE;
+   pfn = pmd_pfn(*pmd);
+   } else {
+   pte = pte_offset_kernel(pmd, addr);
+   next = addr + PAGE_SIZE;
+   if (pte_none(*pte))
+   continue;
+   /* This is base size page */
+   size = PAGE_SIZE;
+   pfn = pte_pfn(*pte);
+   }
+   }
+   memset(phys_to_virt(PFN_PHYS(pfn)), 0, size);
+   }
+   return ret;
+}
+
 /*
  * The p*d_populate functions call virt_to_phys implicitly so they can't be 
used
  * directly on kernel symbols (bm_p*d). All the early functions are called too
@@ -161,11 +221,11 @@ void __init kasan_init(void)
 
clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);
 
-   vmemmap_populate(kimg_shadow_start, kimg_shadow_end,
-pfn_to_nid(virt_to_pfn(lm_alias(_text;
+   kasan_map_populate(kimg_shadow_start, kimg_shadow_end,
+  pfn_to_nid(virt_to_pfn(lm_alias(_text;
 
/*
-* vmemmap_populate() has populated the shadow region that covers the
+* kasan_map_populate() has populated the shadow region that covers the
 * kernel image with SWAPPER_BLOCK_SIZE mappings, so we have to round
 * the start and end addresses to SWAPPER_BLOCK_SIZE as well, to prevent
 * kasan_populate_zero_shadow() from replacing the page table entries
@@ -191,9 +251,9 @@ void __init kasan_init(void)
if (start >= end)
break;
 
-   vmemmap_populate((unsigned long)kasan_mem_to_shadow(start),
-   (unsigned long)kasan_mem_to_shadow(end),
-   pfn_to_nid(virt_to_pfn(start)));
+   kasan_map_populate((unsigned long)kasan_mem_to_shadow(start),
+  (unsigned long)kasan_mem_to_shadow(end),
+  pfn_to_nid(virt_to_pfn(start)));
}
 
/*
-- 
2.14.2



[PATCH v11 3/9] sparc64: simplify vmemmap_populate

2017-10-09 Thread Pavel Tatashin
Remove duplicating code by using common functions
vmemmap_pud_populate and vmemmap_pgd_populate.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
Acked-by: David S. Miller 
Acked-by: Michal Hocko 
---
 arch/sparc/mm/init_64.c | 23 ++-
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index caed495544e9..6839db3ffe1d 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2652,30 +2652,19 @@ int __meminit vmemmap_populate(unsigned long vstart, 
unsigned long vend,
vstart = vstart & PMD_MASK;
vend = ALIGN(vend, PMD_SIZE);
for (; vstart < vend; vstart += PMD_SIZE) {
-   pgd_t *pgd = pgd_offset_k(vstart);
+   pgd_t *pgd = vmemmap_pgd_populate(vstart, node);
unsigned long pte;
pud_t *pud;
pmd_t *pmd;
 
-   if (pgd_none(*pgd)) {
-   pud_t *new = vmemmap_alloc_block(PAGE_SIZE, node);
+   if (!pgd)
+   return -ENOMEM;
 
-   if (!new)
-   return -ENOMEM;
-   pgd_populate(&init_mm, pgd, new);
-   }
-
-   pud = pud_offset(pgd, vstart);
-   if (pud_none(*pud)) {
-   pmd_t *new = vmemmap_alloc_block(PAGE_SIZE, node);
-
-   if (!new)
-   return -ENOMEM;
-   pud_populate(&init_mm, pud, new);
-   }
+   pud = vmemmap_pud_populate(pgd, vstart, node);
+   if (!pud)
+   return -ENOMEM;
 
pmd = pmd_offset(pud, vstart);
-
pte = pmd_val(*pmd);
if (!(pte & _PAGE_VALID)) {
void *block = vmemmap_alloc_block(PMD_SIZE, node);
-- 
2.14.2



[PATCH v11 9/9] sparc64: optimized struct page zeroing

2017-10-09 Thread Pavel Tatashin
Add an optimized mm_zero_struct_page(), so struct page's are zeroed without
calling memset(). We do eight to ten regular stores based on the size of
struct page. Compiler optimizes out the conditions of switch() statement.

SPARC-M6 with 15T of memory, single thread performance:

   BASEFIX  OPTIMIZED_FIX
bootmem_init   28.440467985s   2.305674818s   2.305161615s
free_area_init_nodes  202.845901673s 225.343084508s 172.556506560s
  
Total 231.286369658s 227.648759326s 174.861668175s

BASE:  current linux
FIX:   This patch series without "optimized struct page zeroing"
OPTIMIZED_FIX: This patch series including the current patch.

bootmem_init() is where memory for struct pages is zeroed during
allocation. Note, about two seconds in this function is a fixed time: it
does not increase as memory is increased.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
Acked-by: David S. Miller 
---
 arch/sparc/include/asm/pgtable_64.h | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/arch/sparc/include/asm/pgtable_64.h 
b/arch/sparc/include/asm/pgtable_64.h
index 4fefe3762083..8ed478abc630 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -230,6 +230,36 @@ extern unsigned long _PAGE_ALL_SZ_BITS;
 extern struct page *mem_map_zero;
 #define ZERO_PAGE(vaddr)   (mem_map_zero)
 
+/* This macro must be updated when the size of struct page grows above 80
+ * or reduces below 64.
+ * The idea that compiler optimizes out switch() statement, and only
+ * leaves clrx instructions
+ */
+#definemm_zero_struct_page(pp) do {
\
+   unsigned long *_pp = (void *)(pp);  \
+   \
+/* Check that struct page is either 64, 72, or 80 bytes */ \
+   BUILD_BUG_ON(sizeof(struct page) & 7);  \
+   BUILD_BUG_ON(sizeof(struct page) < 64); \
+   BUILD_BUG_ON(sizeof(struct page) > 80); \
+   \
+   switch (sizeof(struct page)) {  \
+   case 80:\
+   _pp[9] = 0; /* fallthrough */   \
+   case 72:\
+   _pp[8] = 0; /* fallthrough */   \
+   default:\
+   _pp[7] = 0; \
+   _pp[6] = 0; \
+   _pp[5] = 0; \
+   _pp[4] = 0; \
+   _pp[3] = 0; \
+   _pp[2] = 0; \
+   _pp[1] = 0; \
+   _pp[0] = 0; \
+   }   \
+} while (0)
+
 /* PFNs are real physical page numbers.  However, mem_map only begins to record
  * per-page information starting at pfn_base.  This is to handle systems where
  * the first physical page in the machine is at some huge physical address,
-- 
2.14.2



[PATCH v11 4/9] mm: defining memblock_virt_alloc_try_nid_raw

2017-10-09 Thread Pavel Tatashin
* A new variant of memblock_virt_alloc_* allocations:
memblock_virt_alloc_try_nid_raw()
- Does not zero the allocated memory
- Does not panic if request cannot be satisfied

* optimize early system hash allocations

Clients can call alloc_large_system_hash() with flag: HASH_ZERO to specify
that memory that was allocated for system hash needs to be zeroed,
otherwise the memory does not need to be zeroed, and client will initialize
it.

If memory does not need to be zero'd, call the new
memblock_virt_alloc_raw() interface, and thus improve the boot performance.

* debug for raw alloctor

When CONFIG_DEBUG_VM is enabled, this patch sets all the memory that is
returned by memblock_virt_alloc_try_nid_raw() to ones to ensure that no
places excpect zeroed memory.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
Acked-by: Michal Hocko 
---
 include/linux/bootmem.h | 27 ++
 mm/memblock.c   | 60 +++--
 mm/page_alloc.c | 15 ++---
 3 files changed, 87 insertions(+), 15 deletions(-)

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index e223d91b6439..ea30b3987282 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -160,6 +160,9 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
 #define BOOTMEM_ALLOC_ANYWHERE (~(phys_addr_t)0)
 
 /* FIXME: Move to memblock.h at a point where we remove nobootmem.c */
+void *memblock_virt_alloc_try_nid_raw(phys_addr_t size, phys_addr_t align,
+ phys_addr_t min_addr,
+ phys_addr_t max_addr, int nid);
 void *memblock_virt_alloc_try_nid_nopanic(phys_addr_t size,
phys_addr_t align, phys_addr_t min_addr,
phys_addr_t max_addr, int nid);
@@ -176,6 +179,14 @@ static inline void * __init memblock_virt_alloc(
NUMA_NO_NODE);
 }
 
+static inline void * __init memblock_virt_alloc_raw(
+   phys_addr_t size,  phys_addr_t align)
+{
+   return memblock_virt_alloc_try_nid_raw(size, align, BOOTMEM_LOW_LIMIT,
+   BOOTMEM_ALLOC_ACCESSIBLE,
+   NUMA_NO_NODE);
+}
+
 static inline void * __init memblock_virt_alloc_nopanic(
phys_addr_t size, phys_addr_t align)
 {
@@ -257,6 +268,14 @@ static inline void * __init memblock_virt_alloc(
return __alloc_bootmem(size, align, BOOTMEM_LOW_LIMIT);
 }
 
+static inline void * __init memblock_virt_alloc_raw(
+   phys_addr_t size,  phys_addr_t align)
+{
+   if (!align)
+   align = SMP_CACHE_BYTES;
+   return __alloc_bootmem_nopanic(size, align, BOOTMEM_LOW_LIMIT);
+}
+
 static inline void * __init memblock_virt_alloc_nopanic(
phys_addr_t size, phys_addr_t align)
 {
@@ -309,6 +328,14 @@ static inline void * __init 
memblock_virt_alloc_try_nid(phys_addr_t size,
  min_addr);
 }
 
+static inline void * __init memblock_virt_alloc_try_nid_raw(
+   phys_addr_t size, phys_addr_t align,
+   phys_addr_t min_addr, phys_addr_t max_addr, int nid)
+{
+   return ___alloc_bootmem_node_nopanic(NODE_DATA(nid), size, align,
+   min_addr, max_addr);
+}
+
 static inline void * __init memblock_virt_alloc_try_nid_nopanic(
phys_addr_t size, phys_addr_t align,
phys_addr_t min_addr, phys_addr_t max_addr, int nid)
diff --git a/mm/memblock.c b/mm/memblock.c
index 91205780e6b1..1f299fb1eb08 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1327,7 +1327,6 @@ static void * __init memblock_virt_alloc_internal(
return NULL;
 done:
ptr = phys_to_virt(alloc);
-   memset(ptr, 0, size);
 
/*
 * The min_count is set to 0 so that bootmem allocated blocks
@@ -1340,6 +1339,45 @@ static void * __init memblock_virt_alloc_internal(
return ptr;
 }
 
+/**
+ * memblock_virt_alloc_try_nid_raw - allocate boot memory block without zeroing
+ * memory and without panicking
+ * @size: size of memory block to be allocated in bytes
+ * @align: alignment of the region and block's size
+ * @min_addr: the lower bound of the memory region from where the allocation
+ *   is preferred (phys address)
+ * @max_addr: the upper bound of the memory region from where the allocation
+ *   is preferred (phys address), or %BOOTMEM_ALLOC_ACCESSIBLE to
+ *   allocate only from memory limited by memblock.current_limit value
+ * @nid: nid of the free area to find, %NUMA_NO_NODE for any node
+ *
+ * Public function, provides additional debug information (including caller
+ * info), if enabled. Does not zero allocated mem

Re: [PATCH 2/5] trace-cmd: Fix NULL pointer being passed to memcpy

2017-10-09 Thread Steven Rostedt
On Sat, 12 Aug 2017 11:30:44 -0600
Michael Sartain  wrote:

> Signed-off-by: Michael Sartain 
> ---
>  trace-output.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/trace-output.c b/trace-output.c
> index bfe6331..84b21b0 100644
> --- a/trace-output.c
> +++ b/trace-output.c
> @@ -929,7 +929,11 @@ tracecmd_add_option(struct tracecmd_output *handle,
>   free(option);
>   return NULL;
>   }
> - memcpy(option->data, data, size);
> +
> + /* Some IDs (like TRACECMD_OPTION_TRACECLOCK) pass NULL data */
> + if (data)
> + memcpy(option->data, data, size);

Is this a problem, as when this happens, size should be zero. Does it
crash with data=NULL and size=0, or have you seen size not be zero?

-- Steve

> +
>   list_add_tail(&option->list, &handle->options);
>  
>   return option;



Re: [PATCH 2/5] trace-cmd: Fix NULL pointer being passed to memcpy

2017-10-09 Thread Michael Sartain
On Mon, Oct 09, 2017 at 06:24:32PM -0400, Steven Rostedt wrote:
> On Sat, 12 Aug 2017 11:30:44 -0600
> Michael Sartain  wrote:
> 
> > Signed-off-by: Michael Sartain 
> > ---
> >  trace-output.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/trace-output.c b/trace-output.c
> > index bfe6331..84b21b0 100644
> > --- a/trace-output.c
> > +++ b/trace-output.c
> > @@ -929,7 +929,11 @@ tracecmd_add_option(struct tracecmd_output *handle,
> > free(option);
> > return NULL;
> > }
> > -   memcpy(option->data, data, size);
> > +
> > +   /* Some IDs (like TRACECMD_OPTION_TRACECLOCK) pass NULL data */
> > +   if (data)
> > +   memcpy(option->data, data, size);
> 
> Is this a problem, as when this happens, size should be zero. Does it
> crash with data=NULL and size=0, or have you seen size not be zero?

I got an ASAN warning, but you are correct - the size was 0 and it did
not crash.


Re: [PATCH 3/5] trace-cmd: Add ULL suffix to MISSING_EVENTS since ints shouldn't be left shifted by 31

2017-10-09 Thread Steven Rostedt
On Sat, 12 Aug 2017 11:30:45 -0600
Michael Sartain  wrote:


> Signed-off-by: Michael Sartain 
> ---
>  kbuffer-parse.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kbuffer-parse.c b/kbuffer-parse.c
> index 4e6e95e..dde642c 100644
> --- a/kbuffer-parse.c
> +++ b/kbuffer-parse.c
> @@ -24,8 +24,8 @@
>  
>  #include "kbuffer.h"
>  
> -#define MISSING_EVENTS (1 << 31)

Actually, why not? This could also be just UL, because it's fine to
shift 31, that would give us: 0x8000  And that bit is all we care
for.

-- Steve

> -#define MISSING_STORED (1 << 30)
> +#define MISSING_EVENTS (1ULL << 31)
> +#define MISSING_STORED (1ULL << 30)
>  
>  #define COMMIT_MASK ((1 << 27) - 1)
>  



Re: [PATCH v2 0/5] Switch arm64 over to qrwlock

2017-10-09 Thread Jeremy Linton

Hi,

On 10/06/2017 08:34 AM, Will Deacon wrote:

Hi all,

This is version two of the patches I posted yesterday:

   
http://lists.infradead.org/pipermail/linux-arm-kernel/2017-October/534666.html

I'd normally leave it longer before posting again, but Peter had a good
suggestion to rework the layout of the lock word, so I wanted to post a
version that follows that approach.

I've updated my branch if you're after the full patch stack:

   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git qrwlock

As before, all comments (particularly related to testing and performance)
welcome!


I've been doing perf comparisons with the rwlock fairness patch I posted 
last week on a single socket thunderx and the baseline rwlock. For most 
cases where the ratio of read/writers is similar (uncontended 
readers/writers, single writer/lots readers, etc) the absolute number of 
lock acquisitions is very similar (within a few percent one way or the 
other).


In this regard both patches are light years ahead of the current arm64 
rwlock. The unfairness of the current rwlock allows significantly higher 
lock acquisition counts (say 4x at 30Readers:1Writer) at the expense of 
complete writer starvation (or a ~43k:1 ratio at a 30R:1W per 
locktorture). This is untenable.


The qrwlock does an excellent job of maintaining the ratio of 
reader/writer acquisitions proportional to the number of readers/writers 
until the total lockers exceeds the number of cores where the ratios 
start to far exceed the reader/writer ratios (440:1 acquisitions @ 96R:1W)


In comparison the other patch tends to favor the writers more, so at a 
ratio of 48R/1W, the readers are only grabbing the lock at a ratio of 
15:1. This flatter curve continues past the number of cores with the 
readers having a 48:1 advantage with 96R/1W. That said the total lock 
acquisition counts remain very similar (with maybe a slight advantage to 
the non queued patch with 1 writer and 12-30 readers) despite the writer 
acquiring the lock at a higher frequency. OTOH, if the number of writers 
is closer to the number of readers (24R:24W) then the writers have about 
a 1.5X bias over the readers independent of the number of 
reader/writers. This bias seems to be the common multiplier given a 
reader/writer ratio with those patches and doesn't account for possible 
single thread starvation.


Of course, I've been running other tests as well and the system seems to 
be behaving as expected (likely better than the rwlock patches under 
stress). I will continue to test this on a couple other platforms.


In the meantime:

Tested-by: Jeremy Linton 




Cheers,

Will

--->8

Will Deacon (5):
   kernel/locking: Use struct qrwlock instead of struct __qrwlock
   locking/atomic: Add atomic_cond_read_acquire
   kernel/locking: Use atomic_cond_read_acquire when spinning in qrwlock
   arm64: locking: Move rwlock implementation over to qrwlocks
   kernel/locking: Prevent slowpath writers getting held up by fastpath

  arch/arm64/Kconfig  |  17 
  arch/arm64/include/asm/Kbuild   |   1 +
  arch/arm64/include/asm/spinlock.h   | 164 +---
  arch/arm64/include/asm/spinlock_types.h |   6 +-
  include/asm-generic/atomic-long.h   |   3 +
  include/asm-generic/qrwlock.h   |  20 +---
  include/asm-generic/qrwlock_types.h |  15 ++-
  include/linux/atomic.h  |   4 +
  kernel/locking/qrwlock.c|  83 +++-
  9 files changed, 58 insertions(+), 255 deletions(-)





Re: [PATCH 4/5] trace-cmd: Use unsigned values in Hsieh's trace_hash fast hash function

2017-10-09 Thread Steven Rostedt
On Sat, 12 Aug 2017 11:30:46 -0600
Michael Sartain  wrote:

> Signed int values were being used where the original code used uint32_t types:
> 
>   http://www.azillionmonkeys.com/qed/hash.html
> 
> Right shifting negative int values has implementation-defined and left 
> shifting
> has undefined behavior.
> 
> On my platform (x86_64) right shifting was doing sign extension and filling
> high bits with 1s, which is different than the original algorithm.
> 

Heh, nice catch. Although the hash was never used for anything too
important. Mostly just colors of the graph.

> Signed-off-by: Michael Sartain 
> ---
>  trace-hash-local.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/trace-hash-local.h b/trace-hash-local.h
> index b2a1002..b3f9b06 100644
> --- a/trace-hash-local.h
> +++ b/trace-hash-local.h
> @@ -22,7 +22,7 @@
>  
>  static inline unsigned int trace_hash(int val)
>  {
> - int hash, tmp;
> + unsigned int hash, tmp;
>  
>   hash = 12546869;/* random prime */
>  
> @@ -34,7 +34,7 @@ static inline unsigned int trace_hash(int val)
>*/
>  
>   hash += (val & 0x);
> - tmp = (val >> 16) ^ hash;
> + tmp = ((unsigned int)val >> 16) ^ hash;

Why not just have val be unsigned int, and not do the typecast?

-- Steve

>   hash = (hash << 16) ^ tmp;
>   hash += hash >> 11;
>  



Re: [PATCH 5/5] trace-cmd: Remove unused view_width variable

2017-10-09 Thread Steven Rostedt
On Sat, 12 Aug 2017 11:30:47 -0600
Michael Sartain  wrote:


This is an annoying warning, but I've kept from doing this because it
reminds me that I have an idea to actually use that variable
someday. But I've just been procrastinating on doing that. ;-)

-- Steve


> Signed-off-by: Michael Sartain 
> ---
>  trace-graph.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/trace-graph.c b/trace-graph.c
> index 1db342f..2c49549 100644
> --- a/trace-graph.c
> +++ b/trace-graph.c
> @@ -1263,7 +1263,6 @@ static void draw_info_box(struct graph_info *ginfo, 
> const gchar *buffer,
>   gint width, height;
>   GdkPixmap *pix;
>   static GdkGC *pix_bg;
> - gint view_width;
>   gint view_start;
>  
>   if (!pix_bg) {
> @@ -1284,7 +1283,6 @@ static void draw_info_box(struct graph_info *ginfo, 
> const gchar *buffer,
>   height += PLOT_BOARDER * 2;
>  
>   view_start = gtk_adjustment_get_value(ginfo->hadj);
> - view_width = gtk_adjustment_get_page_size(ginfo->hadj);
>   if (x > view_start + width)
>   x -= width;
>  



Re: [PATCH 2/5] trace-cmd: Fix NULL pointer being passed to memcpy

2017-10-09 Thread Steven Rostedt
On Mon, 9 Oct 2017 16:27:10 -0600
Michael Sartain  wrote:

> > > - memcpy(option->data, data, size);
> > > +
> > > + /* Some IDs (like TRACECMD_OPTION_TRACECLOCK) pass NULL data */
> > > + if (data)
> > > + memcpy(option->data, data, size);  
> > 
> > Is this a problem, as when this happens, size should be zero. Does it
> > crash with data=NULL and size=0, or have you seen size not be zero?  
> 
> I got an ASAN warning, but you are correct - the size was 0 and it did
> not crash.

OK, but it's almost like dividing zero from zero. Can you send another
patch, but this time check if (size) instead of if (data).

-- Steve


[PATCH 1/3] Input: factor out and export input_device_id matching code

2017-10-09 Thread Dmitry Torokhov
Factor out and export input_match_device_id() so that modules may use it.
It will be needed by joydev to blacklist accelerometers in composite
devices.

Signed-off-by: Dmitry Torokhov 
---
 drivers/input/input.c | 83 +++
 include/linux/input.h |  3 ++
 2 files changed, 41 insertions(+), 45 deletions(-)

diff --git a/drivers/input/input.c b/drivers/input/input.c
index f157ad5784a1..a8d4f1653588 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -938,58 +938,51 @@ int input_set_keycode(struct input_dev *dev,
 }
 EXPORT_SYMBOL(input_set_keycode);
 
+bool input_match_device_id(const struct input_dev *dev,
+  const struct input_device_id *id)
+{
+   if (id->flags & INPUT_DEVICE_ID_MATCH_BUS)
+   if (id->bustype != dev->id.bustype)
+   return false;
+
+   if (id->flags & INPUT_DEVICE_ID_MATCH_VENDOR)
+   if (id->vendor != dev->id.vendor)
+   return false;
+
+   if (id->flags & INPUT_DEVICE_ID_MATCH_PRODUCT)
+   if (id->product != dev->id.product)
+   return false;
+
+   if (id->flags & INPUT_DEVICE_ID_MATCH_VERSION)
+   if (id->version != dev->id.version)
+   return false;
+
+   if (!bitmap_subset(id->evbit, dev->evbit, EV_MAX) ||
+   !bitmap_subset(id->keybit, dev->keybit, KEY_MAX) ||
+   !bitmap_subset(id->relbit, dev->relbit, REL_MAX) ||
+   !bitmap_subset(id->absbit, dev->absbit, ABS_MAX) ||
+   !bitmap_subset(id->mscbit, dev->mscbit, MSC_MAX) ||
+   !bitmap_subset(id->ledbit, dev->ledbit, LED_MAX) ||
+   !bitmap_subset(id->sndbit, dev->sndbit, SND_MAX) ||
+   !bitmap_subset(id->ffbit, dev->ffbit, FF_MAX) ||
+   !bitmap_subset(id->swbit, dev->swbit, SW_MAX)) {
+   return false;
+   }
+
+   return true;
+}
+EXPORT_SYMBOL(input_match_device_id);
+
 static const struct input_device_id *input_match_device(struct input_handler 
*handler,
struct input_dev *dev)
 {
const struct input_device_id *id;
 
for (id = handler->id_table; id->flags || id->driver_info; id++) {
-
-   if (id->flags & INPUT_DEVICE_ID_MATCH_BUS)
-   if (id->bustype != dev->id.bustype)
-   continue;
-
-   if (id->flags & INPUT_DEVICE_ID_MATCH_VENDOR)
-   if (id->vendor != dev->id.vendor)
-   continue;
-
-   if (id->flags & INPUT_DEVICE_ID_MATCH_PRODUCT)
-   if (id->product != dev->id.product)
-   continue;
-
-   if (id->flags & INPUT_DEVICE_ID_MATCH_VERSION)
-   if (id->version != dev->id.version)
-   continue;
-
-   if (!bitmap_subset(id->evbit, dev->evbit, EV_MAX))
-   continue;
-
-   if (!bitmap_subset(id->keybit, dev->keybit, KEY_MAX))
-   continue;
-
-   if (!bitmap_subset(id->relbit, dev->relbit, REL_MAX))
-   continue;
-
-   if (!bitmap_subset(id->absbit, dev->absbit, ABS_MAX))
-   continue;
-
-   if (!bitmap_subset(id->mscbit, dev->mscbit, MSC_MAX))
-   continue;
-
-   if (!bitmap_subset(id->ledbit, dev->ledbit, LED_MAX))
-   continue;
-
-   if (!bitmap_subset(id->sndbit, dev->sndbit, SND_MAX))
-   continue;
-
-   if (!bitmap_subset(id->ffbit, dev->ffbit, FF_MAX))
-   continue;
-
-   if (!bitmap_subset(id->swbit, dev->swbit, SW_MAX))
-   continue;
-
-   if (!handler->match || handler->match(handler, dev))
+   if (input_match_device_id(dev, id) &&
+   (!handler->match || handler->match(handler, dev))) {
return id;
+   }
}
 
return NULL;
diff --git a/include/linux/input.h b/include/linux/input.h
index 9b03f34807a7..066093fb8b3c 100644
--- a/include/linux/input.h
+++ b/include/linux/input.h
@@ -477,6 +477,9 @@ int input_get_keycode(struct input_dev *dev, struct 
input_keymap_entry *ke);
 int input_set_keycode(struct input_dev *dev,
  const struct input_keymap_entry *ke);
 
+bool input_match_device_id(const struct input_dev *dev,
+  const struct input_device_id *id);
+
 void input_enable_softrepeat(struct input_dev *dev, int delay, int period);
 
 extern struct class input_class;
-- 
2.14.2.920.gcf0c67979c-goog



[PATCH 3/3] Input: joydev - blacklist ds3/ds4/udraw motion sensors

2017-10-09 Thread Dmitry Torokhov
From: Roderick Colenbrander 

Introduce a device table used for blacklisting devices. We currently
blacklist the motion sensor subdevice of THQ Udraw and Sony ds3/ds4.

Signed-off-by: Roderick Colenbrander 
Signed-off-by: Dmitry Torokhov 
---
 drivers/input/joydev.c | 70 +-
 1 file changed, 64 insertions(+), 6 deletions(-)

diff --git a/drivers/input/joydev.c b/drivers/input/joydev.c
index 29d677c714d2..7b29a8944039 100644
--- a/drivers/input/joydev.c
+++ b/drivers/input/joydev.c
@@ -747,6 +747,68 @@ static void joydev_cleanup(struct joydev *joydev)
input_close_device(handle);
 }
 
+/*
+ * These codes are copied from from hid-ids.h, unfortunately there is no common
+ * usb_ids/bt_ids.h header.
+ */
+#define USB_VENDOR_ID_SONY 0x054c
+#define USB_DEVICE_ID_SONY_PS3_CONTROLLER  0x0268
+#define USB_DEVICE_ID_SONY_PS4_CONTROLLER  0x05c4
+#define USB_DEVICE_ID_SONY_PS4_CONTROLLER_20x09cc
+#define USB_DEVICE_ID_SONY_PS4_CONTROLLER_DONGLE   0x0ba0
+
+#define USB_VENDOR_ID_THQ  0x20d6
+#define USB_DEVICE_ID_THQ_PS3_UDRAW0xcb17
+
+#define ACCEL_DEV(vnd, prd)\
+   {   \
+   .flags = INPUT_DEVICE_ID_MATCH_VENDOR | \
+   INPUT_DEVICE_ID_MATCH_PRODUCT | \
+   INPUT_DEVICE_ID_MATCH_PROPBIT,  \
+   .vendor = (vnd),\
+   .product = (prd),   \
+   .propbit = { BIT_MASK(INPUT_PROP_ACCELEROMETER) },  \
+   }
+
+static const struct input_device_id joydev_blacklist[] = {
+   /* Avoid touchpads and touchscreens */
+   {
+   .flags = INPUT_DEVICE_ID_MATCH_EVBIT |
+   INPUT_DEVICE_ID_MATCH_KEYBIT,
+   .evbit = { BIT_MASK(EV_KEY) },
+   .keybit = { [BIT_WORD(BTN_TOUCH)] = BIT_MASK(BTN_TOUCH) },
+   },
+   /* Avoid tablets, digitisers and similar devices */
+   {
+   .flags = INPUT_DEVICE_ID_MATCH_EVBIT |
+   INPUT_DEVICE_ID_MATCH_KEYBIT,
+   .evbit = { BIT_MASK(EV_KEY) },
+   .keybit = { [BIT_WORD(BTN_DIGI)] = BIT_MASK(BTN_DIGI) },
+   },
+   /* Disable accelerometers on composite devices */
+   ACCEL_DEV(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_PS3_CONTROLLER),
+   ACCEL_DEV(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_PS4_CONTROLLER),
+   ACCEL_DEV(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_PS4_CONTROLLER_2),
+   ACCEL_DEV(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_PS4_CONTROLLER_DONGLE),
+   ACCEL_DEV(USB_VENDOR_ID_THQ, USB_DEVICE_ID_THQ_PS3_UDRAW),
+   { /* sentinel */ }
+};
+
+static bool joydev_dev_is_blacklisted(struct input_dev *dev)
+{
+   const struct input_device_id *id;
+
+   for (id = joydev_blacklist; id->flags; id++) {
+   if (input_match_device_id(dev, id)) {
+   dev_dbg(&dev->dev,
+   "joydev: blacklisting '%s'\n", dev->name);
+   return true;
+   }
+   }
+
+   return false;
+}
+
 static bool joydev_dev_is_absolute_mouse(struct input_dev *dev)
 {
DECLARE_BITMAP(jd_scratch, KEY_CNT);
@@ -807,12 +869,8 @@ static bool joydev_dev_is_absolute_mouse(struct input_dev 
*dev)
 
 static bool joydev_match(struct input_handler *handler, struct input_dev *dev)
 {
-   /* Avoid touchpads and touchscreens */
-   if (test_bit(EV_KEY, dev->evbit) && test_bit(BTN_TOUCH, dev->keybit))
-   return false;
-
-   /* Avoid tablets, digitisers and similar devices */
-   if (test_bit(EV_KEY, dev->evbit) && test_bit(BTN_DIGI, dev->keybit))
+   /* Disable blacklisted devices */
+   if (joydev_dev_is_blacklisted(dev))
return false;
 
/* Avoid absolute mice */
-- 
2.14.2.920.gcf0c67979c-goog



[PATCH 2/3] Input: allow matching device IDs on property bits

2017-10-09 Thread Dmitry Torokhov
Let's allow matching input devices on their property bits, both in-kernel
and when generating module aliases.

Signed-off-by: Dmitry Torokhov 
---
 drivers/input/input.c | 3 ++-
 include/linux/input.h | 4 
 include/linux/mod_devicetable.h   | 3 +++
 scripts/mod/devicetable-offsets.c | 1 +
 scripts/mod/file2alias.c  | 6 +-
 5 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/input/input.c b/drivers/input/input.c
index a8d4f1653588..58c0b7344fdf 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -965,7 +965,8 @@ bool input_match_device_id(const struct input_dev *dev,
!bitmap_subset(id->ledbit, dev->ledbit, LED_MAX) ||
!bitmap_subset(id->sndbit, dev->sndbit, SND_MAX) ||
!bitmap_subset(id->ffbit, dev->ffbit, FF_MAX) ||
-   !bitmap_subset(id->swbit, dev->swbit, SW_MAX)) {
+   !bitmap_subset(id->swbit, dev->swbit, SW_MAX) ||
+   !bitmap_subset(id->propbit, dev->propbit, INPUT_PROP_MAX)) {
return false;
}
 
diff --git a/include/linux/input.h b/include/linux/input.h
index 066093fb8b3c..9e27e6d1d7d2 100644
--- a/include/linux/input.h
+++ b/include/linux/input.h
@@ -238,6 +238,10 @@ struct input_dev {
 #error "SW_MAX and INPUT_DEVICE_ID_SW_MAX do not match"
 #endif
 
+#if INPUT_PROP_MAX != INPUT_DEVICE_ID_PROP_MAX
+#error "INPUT_PROP_MAX and INPUT_DEVICE_ID_PROP_MAX do not match"
+#endif
+
 #define INPUT_DEVICE_ID_MATCH_DEVICE \
(INPUT_DEVICE_ID_MATCH_BUS | INPUT_DEVICE_ID_MATCH_VENDOR | 
INPUT_DEVICE_ID_MATCH_PRODUCT)
 #define INPUT_DEVICE_ID_MATCH_DEVICE_AND_VERSION \
diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
index 3f74ef2281e8..72f0b7f19c59 100644
--- a/include/linux/mod_devicetable.h
+++ b/include/linux/mod_devicetable.h
@@ -293,6 +293,7 @@ struct pcmcia_device_id {
 #define INPUT_DEVICE_ID_SND_MAX0x07
 #define INPUT_DEVICE_ID_FF_MAX 0x7f
 #define INPUT_DEVICE_ID_SW_MAX 0x0f
+#define INPUT_DEVICE_ID_PROP_MAX   0x1f
 
 #define INPUT_DEVICE_ID_MATCH_BUS  1
 #define INPUT_DEVICE_ID_MATCH_VENDOR   2
@@ -308,6 +309,7 @@ struct pcmcia_device_id {
 #define INPUT_DEVICE_ID_MATCH_SNDBIT   0x0400
 #define INPUT_DEVICE_ID_MATCH_FFBIT0x0800
 #define INPUT_DEVICE_ID_MATCH_SWBIT0x1000
+#define INPUT_DEVICE_ID_MATCH_PROPBIT  0x2000
 
 struct input_device_id {
 
@@ -327,6 +329,7 @@ struct input_device_id {
kernel_ulong_t sndbit[INPUT_DEVICE_ID_SND_MAX / BITS_PER_LONG + 1];
kernel_ulong_t ffbit[INPUT_DEVICE_ID_FF_MAX / BITS_PER_LONG + 1];
kernel_ulong_t swbit[INPUT_DEVICE_ID_SW_MAX / BITS_PER_LONG + 1];
+   kernel_ulong_t propbit[INPUT_DEVICE_ID_PROP_MAX / BITS_PER_LONG + 1];
 
kernel_ulong_t driver_info;
 };
diff --git a/scripts/mod/devicetable-offsets.c 
b/scripts/mod/devicetable-offsets.c
index e4d90e50f6fe..812657ab5aa3 100644
--- a/scripts/mod/devicetable-offsets.c
+++ b/scripts/mod/devicetable-offsets.c
@@ -105,6 +105,7 @@ int main(void)
DEVID_FIELD(input_device_id, sndbit);
DEVID_FIELD(input_device_id, ffbit);
DEVID_FIELD(input_device_id, swbit);
+   DEVID_FIELD(input_device_id, propbit);
 
DEVID(eisa_device_id);
DEVID_FIELD(eisa_device_id, sig);
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
index 29d6699d5a06..bc25898f6df0 100644
--- a/scripts/mod/file2alias.c
+++ b/scripts/mod/file2alias.c
@@ -761,7 +761,7 @@ static void do_input(char *alias,
sprintf(alias + strlen(alias), "%X,*", i);
 }
 
-/* input:b0v0p0e0-eXkXrXaXmXlXsXfXwX where X is comma-separated %02X. */
+/* input:b0v0p0e0-eXkXrXaXmXlXsXfXwXprX where X is comma-separated %02X. */
 static int do_input_entry(const char *filename, void *symval,
  char *alias)
 {
@@ -779,6 +779,7 @@ static int do_input_entry(const char *filename, void 
*symval,
DEF_FIELD_ADDR(symval, input_device_id, sndbit);
DEF_FIELD_ADDR(symval, input_device_id, ffbit);
DEF_FIELD_ADDR(symval, input_device_id, swbit);
+   DEF_FIELD_ADDR(symval, input_device_id, propbit);
 
sprintf(alias, "input:");
 
@@ -816,6 +817,9 @@ static int do_input_entry(const char *filename, void 
*symval,
sprintf(alias + strlen(alias), "w*");
if (flags & INPUT_DEVICE_ID_MATCH_SWBIT)
do_input(alias, *swbit, 0, INPUT_DEVICE_ID_SW_MAX);
+   sprintf(alias + strlen(alias), "pr*");
+   if (flags & INPUT_DEVICE_ID_MATCH_PROPBIT)
+   do_input(alias, *propbit, 0, INPUT_DEVICE_ID_PROP_MAX);
return 1;
 }
 ADD_TO_DEVTABLE("input", input_device_id, do_input_entry);
-- 
2.14.2.920.gcf0c67979c-goog



Re: [PATCH] bitfield: Use __ffs64(x) to fix missing __ffsdi2()

2017-10-09 Thread kbuild test robot
Hi Geert,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.14-rc4]
[cannot apply to next-20170929]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Geert-Uytterhoeven/bitfield-Use-__ffs64-x-to-fix-missing-__ffsdi2/20171010-050009
config: x86_64-allyesdebian (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

   In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from include/linux/list.h:4,
from include/linux/module.h:9,
from net/mac80211/sta_info.c:12:
   In function 'sta_stats_decode_rate',
   inlined from 'sta_set_rate_info_rx' at net/mac80211/sta_info.c:2016:2,
   inlined from 'sta_set_sinfo' at net/mac80211/sta_info.c:2218:7:
>> include/linux/compiler.h:576:38: error: call to '__compiletime_assert_1971' 
>> declared with attribute error: BUILD_BUG_ON failed: ((~0UL) - (1UL << 
>> (8)) + 1) & (~0UL >> (64 - 1 - (11) + (1ULL << __ffs64~0UL) - (1UL 
>> << (8)) + 1) & (~0UL >> (64 - 1 - (11))) & ((~0UL) - (1UL << (8)) + 
>> 1) & (~0UL >> (64 - 1 - (11) + (1ULL << __ffs64~0UL) - (1UL << (8)) 
>> + 1) & (~0UL >> (64 - 1 - (11))) - 1)) != 0
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^
   include/linux/compiler.h:556:4: note: in definition of macro 
'__compiletime_assert'
   prefix ## suffix();\
   ^~
   include/linux/compiler.h:576:2: note: in expansion of macro 
'_compiletime_assert'
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^~~
   include/linux/build_bug.h:46:37: note: in expansion of macro 
'compiletime_assert'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~
   include/linux/build_bug.h:70:2: note: in expansion of macro 
'BUILD_BUG_ON_MSG'
 BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
 ^~~~
>> include/linux/build_bug.h:19:2: note: in expansion of macro 'BUILD_BUG_ON'
 BUILD_BUG_ON(((n) & ((n) - 1)) != 0)
 ^~~~
>> include/linux/bitfield.h:62:3: note: in expansion of macro 
>> '__BUILD_BUG_ON_NOT_POWER_OF_2'
  __BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +   \
  ^
>> include/linux/bitfield.h:103:3: note: in expansion of macro 
>> '__BF_FIELD_CHECK'
  __BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \
  ^~~~
>> net/mac80211/sta_info.h:764:32: note: in expansion of macro 'FIELD_GET'
#define STA_STATS_GET(_n, _v)  FIELD_GET(STA_STATS_FIELD_ ## _n, _v)
   ^
>> net/mac80211/sta_info.c:1971:14: note: in expansion of macro 'STA_STATS_GET'
 rinfo->bw = STA_STATS_GET(BW, rate);
 ^
   include/linux/compiler.h:576:38: error: call to '__compiletime_assert_1973' 
declared with attribute error: BUILD_BUG_ON failed: ((~0UL) - (1UL << (13)) 
+ 1) & (~0UL >> (64 - 1 - (15) + (1ULL << __ffs64~0UL) - (1UL << (13)) 
+ 1) & (~0UL >> (64 - 1 - (15))) & ((~0UL) - (1UL << (13)) + 1) & (~0UL 
>> (64 - 1 - (15) + (1ULL << __ffs64~0UL) - (1UL << (13)) + 1) & (~0UL 
>> (64 - 1 - (15))) - 1)) != 0
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^
   include/linux/compiler.h:556:4: note: in definition of macro 
'__compiletime_assert'
   prefix ## suffix();\
   ^~
   include/linux/compiler.h:576:2: note: in expansion of macro 
'_compiletime_assert'
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^~~
   include/linux/build_bug.h:46:37: note: in expansion of macro 
'compiletime_assert'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~
   include/linux/build_bug.h:70:2: note: in expansion of macro 
'BUILD_BUG_ON_MSG'
 BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
 ^~~~
>> include/linux/build_bug.h:19:2: note: in expansion of macro 'BUILD_BUG_ON'
 BUILD_BUG_ON(((n) & ((n) - 1)) != 0)
 ^~~~
>> include/linux/bitfield.h:62:3: note: in expansion of macro 
>> '__BUILD_BUG_ON_NOT_POWER_OF_2'
  __BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +   \
  ^
>> include/linux/bitfield.h:103:3: note: in expansion of macro 
>> '__BF_FIELD_CHECK'
 

Re: [PATCH] mm/page-writeback.c: fix bug caused by disable periodic writeback

2017-10-09 Thread Andrew Morton
On Sat,  7 Oct 2017 06:58:04 +0800 Yafang Shao  wrote:

> After disable periodic writeback by writing 0 to
> dirty_writeback_centisecs, the handler wb_workfn() will not be
> entered again until the dirty background limit reaches or
> sync syscall is executed or no enough free memory available or
> vmscan is triggered.
> So the periodic writeback can't be enabled by writing a non-zero
> value to dirty_writeback_centisecs
> As it can be disabled by sysctl, it should be able to enable by 
> sysctl as well.
> 
> ...
>
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1972,7 +1972,13 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb)
>  int dirty_writeback_centisecs_handler(struct ctl_table *table, int write,
>   void __user *buffer, size_t *length, loff_t *ppos)
>  {
> - proc_dointvec(table, write, buffer, length, ppos);
> + unsigned int old_interval = dirty_writeback_interval;
> + int ret;
> +
> + ret = proc_dointvec(table, write, buffer, length, ppos);
> + if (!ret && !old_interval && dirty_writeback_interval)
> + wakeup_flusher_threads(0, WB_REASON_PERIODIC);
> +
>   return 0;

We could do with a code comment here, explaining why this code exists.

And...  I'm not sure it works correctly?  For example, if a device
doesn't presently have bdi_has_dirty_io() then wakeup_flusher_threads()
will skip it and the periodic writeback still won't be started?

(why does the dirty_writeback_interval==0 special case exist, btw? 
Seems to be a strange thing to do).

(and what happens if the interval was set to 1 hour and the user
rewrites that to 1 second?  Does that change take 1 hour to take
effect?)



Re: Warning on __wait_on_bit

2017-10-09 Thread Cong Wang
Hi, Tom

On Sat, Sep 30, 2017 at 1:55 AM, Tom Green  wrote:
> Hi Tejun Cong Shaohua,
>
> I just observed the same problem in https://lkml.org/lkml/2016/12/9/373. It
> seems to be exactly same as decribed in https://lwn.net/Articles/628628/.
> Any progress? Thanks.

Sorry for the delay, I was on vacation.

Could you try the attached patch (against the latest Linus tree)?
I did a quick boot test, it runs well.

Thanks!
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 134d9f560240..211a773cb39f 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -71,23 +71,28 @@ nfs_fattr_to_ino_t(struct nfs_fattr *fattr)
return nfs_fileid_to_ino_t(fattr->fileid);
 }
 
-static int nfs_wait_killable(int mode)
+static int nfs_wait_killable(int mode, struct wait_queue_entry *wait)
 {
-   freezable_schedule_unsafe();
+   freezer_do_not_count();
+   if (wait)
+   wait_woken(wait, mode, MAX_SCHEDULE_TIMEOUT);
+   else
+   schedule();
+   freezer_count_unsafe();
if (signal_pending_state(mode, current))
return -ERESTARTSYS;
return 0;
 }
 
-int nfs_wait_bit_killable(struct wait_bit_key *key, int mode)
+int nfs_wait_bit_killable(struct wait_bit_key *key, int mode, struct 
wait_queue_entry *wait)
 {
-   return nfs_wait_killable(mode);
+   return nfs_wait_killable(mode, wait);
 }
 EXPORT_SYMBOL_GPL(nfs_wait_bit_killable);
 
 int nfs_wait_atomic_killable(atomic_t *p)
 {
-   return nfs_wait_killable(TASK_KILLABLE);
+   return nfs_wait_killable(TASK_KILLABLE, NULL);
 }
 
 /**
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 5bdf952f414b..a9144d311a50 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -386,7 +386,7 @@ extern void nfs_clear_inode(struct inode *);
 extern void nfs_evict_inode(struct inode *);
 void nfs_zap_acl_cache(struct inode *inode);
 extern bool nfs_check_cache_invalid(struct inode *, unsigned long);
-extern int nfs_wait_bit_killable(struct wait_bit_key *key, int mode);
+extern int nfs_wait_bit_killable(struct wait_bit_key *key, int mode, struct 
wait_queue_entry *wait);
 extern int nfs_wait_atomic_killable(atomic_t *p);
 
 /* super.c */
diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index dd03e837ebb7..1d197994b181 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -272,6 +272,7 @@ static inline bool try_to_freeze(void) { return false; }
 
 static inline void freezer_do_not_count(void) {}
 static inline void freezer_count(void) {}
+static inline void freezer_count_unsafe(void) {}
 static inline int freezer_should_skip(struct task_struct *p) { return 0; }
 static inline void set_freezable(void) {}
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 26a7df4e558c..1fd387e6c554 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -184,6 +184,9 @@ extern void io_schedule_finish(int token);
 extern long io_schedule_timeout(long timeout);
 extern void io_schedule(void);
 
+struct wait_queue_entry;
+extern void io_wait_timeout(struct wait_queue_entry *wait, int mode, long 
timeout);
+
 /**
  * struct prev_cputime - snapshot of system and user cputime
  * @utime: time spent in user mode
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 87c4641023fb..5a320e1c215e 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -987,6 +987,7 @@ void finish_wait(struct wait_queue_head *wq_head, struct 
wait_queue_entry *wq_en
 long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long 
timeout);
 int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int 
sync, void *key);
 int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, 
int sync, void *key);
+int autoremove_woken_wake_function(struct wait_queue_entry *wq_entry, unsigned 
mode, int sync, void *key);
 
 #define DEFINE_WAIT_FUNC(name, function)   
\
struct wait_queue_entry name = {
\
diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index 12b26660d7e9..f7a98ba7100c 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -24,7 +24,7 @@ struct wait_bit_queue_entry {
 #define __WAIT_ATOMIC_T_KEY_INITIALIZER(p) 
\
{ .flags = p, .bit_nr = WAIT_ATOMIC_T_BIT_NR, }
 
-typedef int wait_bit_action_f(struct wait_bit_key *key, int mode);
+typedef int wait_bit_action_f(struct wait_bit_key *key, int mode, struct 
wait_queue_entry *wait);
 void __wake_up_bit(struct wait_queue_head *wq_head, void *word, int bit);
 int __wait_on_bit(struct wait_queue_head *wq_head, struct wait_bit_queue_entry 
*wbq_entry, wait_bit_action_f *action, unsigned int mode);
 int __wait_on_bit_lock(struct wait_queue_head *wq_head, struct 
wait_bit_queue_entry *wbq_entry, wait_bit_action_f *action, unsigned int mode);
@@ -50,10 +50,10 @@ int wake_bit_function(struct wait_queue_entry *wq_entry, 
unsigned mode, int sync
 

[PATCH v2] Userfaultfd: Add description for UFFD_FEATURE_SIGBUS

2017-10-09 Thread Prakash Sangappa
Userfaultfd feature UFFD_FEATURE_SIGBUS was merged recently and should
be available in Linux 4.14 release. This patch is for the manpage
changes documenting this API.

Documents the following commit:

commit 2d6d6f5a09a96cc1fec7ed992b825e05f64cb50e
Author: Prakash Sangappa 
Date: Wed Sep 6 16:23:39 2017 -0700

 mm: userfaultfd: add feature to request for a signal delivery

Signed-off-by: Prakash Sangappa 
---
v2: Incorporated review feedback changes.
---
 man2/ioctl_userfaultfd.2 |  9 +
 man2/userfaultfd.2   | 23 +++
 2 files changed, 32 insertions(+)

diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2
index 60fd29b..32f0744 100644
--- a/man2/ioctl_userfaultfd.2
+++ b/man2/ioctl_userfaultfd.2
@@ -196,6 +196,15 @@ with the
 flag set,
 .BR memfd_create (2),
 and so on.
+.TP
+.B UFFD_FEATURE_SIGBUS
+Since Linux 4.14, If this feature bit is set, no page-fault events
+.B (UFFD_EVENT_PAGEFAULT)
+will be delivered, instead a
+.B SIGBUS
+signal will be sent to the faulting process. Applications using this
+feature will not require the use of a userfaultfd monitor for processing
+memory accesses to the regions registered with userfaultfd.
 .IP
 The returned
 .I ioctls
diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2
index 1741ee3..3c5b9c0 100644
--- a/man2/userfaultfd.2
+++ b/man2/userfaultfd.2
@@ -172,6 +172,29 @@ or
 .BR ioctl (2)
 operations to resolve the page fault.
 .PP
+Starting from Linux 4.14, if application sets
+.B UFFD_FEATURE_SIGBUS
+feature bit using
+.B UFFDIO_API
+.BR ioctl (2),
+no page fault notification will be forwarded to
+the user-space, instead a
+.B SIGBUS
+signal is delivered to the faulting process. With this feature,
+userfaultfd can be used for robustness purpose to simply catch
+any access to areas within the registered address range that do not
+have pages allocated, without having to listen to userfaultfd events.
+No userfaultfd monitor will be required for dealing with such memory
+accesses. For example, this feature can be useful for applications that
+want to prevent the kernel from automatically allocating pages and filling
+holes in sparse files when the hole is accessed thru mapped address.
+.PP
+The
+.B UFFD_FEATURE_SIGBUS
+feature is implicitly inherited through fork() if used in combination with
+.BR UFFD_FEATURE_FORK .
+
+.PP
 Details of the various
 .BR ioctl (2)
 operations can be found in
-- 
2.7.4



Re: [PATCH 0/3] arm64: cpuinfo: make /proc/cpuinfo more human-readable

2017-10-09 Thread Al Stone
On 09/27/2017 04:34 AM, Mark Rutland wrote:
> On Tue, Sep 26, 2017 at 04:23:21PM -0600, Al Stone wrote:
>> As ARMv8 servers get deployed, I keep getting the same set of questions
>> from end-users of those systems: what do all the hex numbers mean in 
>> /proc/cpuinfo and could you make them so I don't have to carry a cheat
>> sheet with me all the time?
> 
> I appreciate that /proc/cpuinfo can be opaque to end users, but I do not
> believe that this is the right solution to that problem.
> 
> There are a number of issues stemming from the face that /proc/cpuinfo
> is ill-defined and overused for a number of cases. Changes to it almost
> certainly violate brittle de-facto ABI details people are relying on,
> and there's a very long tail on fallout resulting from this. In
> addition, many niceties come at an excessive maintenance cost, and are
> simply unreliable as an ABI.

Instead of dealing with the replies to the specific patches, I'd like to deal
with the real gist of the matter first -- details can be worked out later.

While I don't disagree -- cpuinfo is enormously fragile -- what _is_ a good
solution, then?  Right now, things are not terribly useful.  Yes, it is true
/sys/firmware/dmi has info in it, but again it is all numeric (or in a file
named 'raw').  Any one who wishes to read the values will need to put some
interpretation on them somehow.  My personal preference is that the kernel
control that interpretation instead of random admin tools scattered over a raft
of different companies.

> So, as with all other patches modifying /proc/cpuinfo, I must NAK this
> series.

Hrm.  I suggest that policy needs to be rethought.  I understand the reasoning,
but it puts my end-users in a very difficult position; they now have to justify
modifying management software that is already in operation in order to purchase
and integrate ARM servers.  And while I really don't want distro-specific code,
I'd have to consider it as a possibility.

To be completely fair, though, I'm not completely fixated on cpuinfo as the only
possible answer.  It has been the most requested, however.

> For the MPIDR and product info, I think we can expose this in a more
> structured way (e.g. under sysfs). IIUC the product info is all derived
> from DMI -- do we not expose that already?

Agreed; the MPIDR I think could just as easily be in /sys/devices/cpu/* as
a hex value.  It doesn't have the value to end-users that the other info does,
either -- for debugging, definitely, but not necessarily sysadmin.  I'll put
something together for this.

The DMI info is in sysfs but it is not in a fashion normally consumed, hence
the questions from end-users, in a way.  They're not expecting to have to do
the interpretation, and they are expecting the platform to be able to tell
them what "Model: 2" means.

> For the human-readable names I must NAK such patches. This is an
> extremely brittle ABI that cannot be forwards compatible, and comes with
> hilarious political problems. This should be managed in userspace alone.
> 
> I thought tools like lscpu and lshw were intended to gather such
> information for users. What are these missing today?

All of the above human-readable parts -- lscpu reads model information from
/proc/cpuinfo, for example, and would require new code to read this info from
somewhere else, and then figure out how to interpret it.  All I get is a lovely
'2' for 'model' :).  We could obviously change the userspace tools, but I'm
guessing it just moves the political hilarity from the kernel to userspace,
correct?  It also runs the risk of even greater silliness, like perhaps lscpu
gets a product name right, but lshw calls it something different, and so on.

The other approach that occurs to me is to blend what x86 does and sysfs; i.e.,
would it be acceptable to modify cputype.h to include comments (as x86 does)
that are then munged by a script somewhat like arch/x86/kernel/cpu/mkcapflags.sh
that then produces a header file with proper human-readable content that can
then be exposed in sysfs?  I don't think we can ever remove the brittleness
completely, but perhaps this would at least mitigate it.  This would still
require userspace modifications, too, but they could at least rely on a uniform
source of data -- and perhaps a single place to focus when hilarity ensues :).
I'll bodge some patches together and maybe start over as an RFC, if there's a
chance this would be accepted.

-- 
ciao,
al
---
Al Stone
Software Engineer
Red Hat, Inc.
a...@redhat.com
---


[PATCH v3 4/5] uio: uio_pdrv_genirq: Add st,stm32f100 to dt compatible list

2017-10-09 Thread Chris Packham
The stm32f100 is a general purpose micro controller. Allow users to
interact with one as a uio device.

Signed-off-by: Chris Packham 
---
Changes in v3:
- (replaces "uio: add default compatible string to uio_pdrv_genirq")
- remove superfluous change suggested by checkpatch.pl
- As suggested by Rob, add specific dt bindings for the hardware I'm
  supporting. uio_pdrv_genirq is one potential implementor of a compatible
  driver.

 drivers/uio/uio_pdrv_genirq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/uio/uio_pdrv_genirq.c b/drivers/uio/uio_pdrv_genirq.c
index f598ecddc8a7..532de98034c2 100644
--- a/drivers/uio/uio_pdrv_genirq.c
+++ b/drivers/uio/uio_pdrv_genirq.c
@@ -253,11 +253,12 @@ static const struct dev_pm_ops uio_pdrv_genirq_dev_pm_ops 
= {
 
 #ifdef CONFIG_OF
 static struct of_device_id uio_of_genirq_match[] = {
+   { .compatible = "st,stm32f100" },
{ /* This is filled with module_parm */ },
{ /* Sentinel */ },
 };
 MODULE_DEVICE_TABLE(of, uio_of_genirq_match);
-module_param_string(of_id, uio_of_genirq_match[0].compatible, 128, 0);
+module_param_string(of_id, uio_of_genirq_match[1].compatible, 128, 0);
 MODULE_PARM_DESC(of_id, "Openfirmware id of the device to be handled by uio");
 #endif
 
-- 
2.14.2



[PATCH v3 0/5] using uio_pdrv_genirq without module param

2017-10-09 Thread Chris Packham
I found myself about to add a driver that was a sub-optimal clone of
uio_pdrv_genirq the only difference was that I didn't want to modify the args
passed to the kernel by my bootloader. If uio_pdrv_genirq had a default
of_match entry I could simply use that. This series attempts to implement this.

I wonder if it is worth having a catch-all compat string for this driver
"generic-dev" or something. People seem to be ok with spidev not having any
such fallback so perhaps there is no-need (other than the 2 devices I want to
support with userspace drivers).

Changes in v2:
- added a better commit message to 1/2
- remove bogus checkpatch fix in 2/2

Changes in v3:
- As suggested by Rob document the bindings for the hardware and treat
  uio_pdrv_genirq.c as one potential implementer of a compatible driver.
  Following the model used by spidev.c
- Patches 1-3 replace patch 1 from v2 of this series
- Patches 4-5 replace patch 2 from v2

Chris Packham (5):
  uio: dt-bindings: document existing binding for uio-pdrv-genirq
  uio: dt-bindings: Add binding for "st,stm32f100"
  uio: dt-bindings: Add binding for "marvell,88e2040"
  uio: uio_pdrv_genirq: Add st,stm32f100 to dt compatible list
  uio: uio_pdrv_genirq: Add "marvell,88e2040" to dt compatible list

 .../devicetree/bindings/uio/uio-pdrv-genirq.txt| 28 ++
 drivers/uio/uio_pdrv_genirq.c  |  4 +++-
 2 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt

-- 
2.14.2



[PATCH v3 3/5] uio: dt-bindings: Add binding for "marvell,88e2040"

2017-10-09 Thread Chris Packham
The marvell,88e2040 is a N-BaseT Ethernet PHY which requires a userspace
driver. The only kernel presence is an interrupt handler which
uio_pdrv_genirq can provide.

Signed-off-by: Chris Packham 
---
Changes in v3:
- (replaces "uio: dt-bindings: document binding for uio-pdrv-genirq")
- split the bindings up into the devices I'm actually wanting to support. The
  second is a marvell N-BaseT Ethernet PHY

 Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt 
b/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt
index fb4c1b5059f2..7b28bead4016 100644
--- a/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt
+++ b/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt
@@ -1,8 +1,8 @@
 *Generic UIO platform driver with interrupts
 
 Required properties:
-- compatible: Should be a value set with the of_id module parameter or the
-  built in value "st,stm32f100".
+- compatible: Should be a value set with the of_id module parameter or one of
+  the built in values "st,stm32f100", "marvell,88e2040".
 
 Optional properties:
 - interrupts: Should contain the interrupt to be associated with this device
-- 
2.14.2



[PATCH v3 2/5] uio: dt-bindings: Add binding for "st,stm32f100"

2017-10-09 Thread Chris Packham
The stm32f100 is a general purpose micro controller. Document a binding
that allows a user-space driver to be implemented for these devices.

Signed-off-by: Chris Packham 
---
Changes in v3:
- (replaces "uio: dt-bindings: document binding for uio-pdrv-genirq")
- split the bindings up into the devices I'm actually wanting to support. The
  first is a stm32f100 micro controller.

 Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt 
b/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt
index 5a92b1f8825f..fb4c1b5059f2 100644
--- a/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt
+++ b/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt
@@ -1,7 +1,8 @@
 *Generic UIO platform driver with interrupts
 
 Required properties:
-- compatible: Should be a value set with the of_id module parameter.
+- compatible: Should be a value set with the of_id module parameter or the
+  built in value "st,stm32f100".
 
 Optional properties:
 - interrupts: Should contain the interrupt to be associated with this device
-- 
2.14.2



[PATCH v3 1/5] uio: dt-bindings: document existing binding for uio-pdrv-genirq

2017-10-09 Thread Chris Packham
Document the device tree bindings for the uio-prv-genirq driver. Provide
some examples on how it can be used.

Signed-off-by: Chris Packham 
---
Changes in v3:
- New. Document existing binding first.

 .../devicetree/bindings/uio/uio-pdrv-genirq.txt| 27 ++
 1 file changed, 27 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt

diff --git a/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt 
b/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt
new file mode 100644
index ..5a92b1f8825f
--- /dev/null
+++ b/Documentation/devicetree/bindings/uio/uio-pdrv-genirq.txt
@@ -0,0 +1,27 @@
+*Generic UIO platform driver with interrupts
+
+Required properties:
+- compatible: Should be a value set with the of_id module parameter.
+
+Optional properties:
+- interrupts: Should contain the interrupt to be associated with this device
+  (only a single interrupt is supported per device).
+- interrupt-parent: Specifies the phandle to the parent interrupt controller.
+- reg: Should specify the physical address spaces used by this device.
+
+Example:
+
+/* Device with MM IO and interrupt assuming of_id="me,my-device" */
+my-device@1 {
+   compatible = "me,my-device";
+   reg = <0x1 0x40>;
+   interrupts = <4 IRQ_TYPE_EDGE_BOTH>;
+   interrupt-parent = <&gic>;
+};
+
+/* Device with interrupt only assuming of_id="me,my-device" */
+my-int {
+   compatible = "me,my-device";
+   interrupts = <6 IRQ_TYPE_EDGE_BOTH>;
+   interrupt-parent = <&gic>;
+};
-- 
2.14.2



[PATCH v3 5/5] uio: uio_pdrv_genirq: Add "marvell,88e2040" to dt compatible list

2017-10-09 Thread Chris Packham
The marvell,88e2040 is a N-BaseT Ethernet PHY which requires a userspace
driver. uio_pdrv_genirq can be used to provide interrupts to such a
userspace driver.

Signed-off-by: Chris Packham 
---
Changes in v3:
- (replaces "uio: add default compatible string to uio_pdrv_genirq")
- As suggested by Rob, add specific dt bindings for the hardware I'm
  supporting. uio_pdrv_genirq is one potential implementor of a compatible
  driver.

 drivers/uio/uio_pdrv_genirq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/uio/uio_pdrv_genirq.c b/drivers/uio/uio_pdrv_genirq.c
index 532de98034c2..4bcab82b899e 100644
--- a/drivers/uio/uio_pdrv_genirq.c
+++ b/drivers/uio/uio_pdrv_genirq.c
@@ -254,11 +254,12 @@ static const struct dev_pm_ops uio_pdrv_genirq_dev_pm_ops 
= {
 #ifdef CONFIG_OF
 static struct of_device_id uio_of_genirq_match[] = {
{ .compatible = "st,stm32f100" },
+   { .compatible = "marvell,88e2040" },
{ /* This is filled with module_parm */ },
{ /* Sentinel */ },
 };
 MODULE_DEVICE_TABLE(of, uio_of_genirq_match);
-module_param_string(of_id, uio_of_genirq_match[1].compatible, 128, 0);
+module_param_string(of_id, uio_of_genirq_match[2].compatible, 128, 0);
 MODULE_PARM_DESC(of_id, "Openfirmware id of the device to be handled by uio");
 #endif
 
-- 
2.14.2



[PATCH v6 02/14] platform/x86: dell-wmi: increase severity of some failures

2017-10-09 Thread Mario Limonciello
There is a lot of error checking in place for the format of the WMI
descriptor buffer, but some of the potentially raised issues should
be considered critical failures.

If the buffer size or header don't match, this is a good indication
that the buffer format changed in a way that the rest of the data
should not be relied upon.

For the remaining data set vectors, continue to notate a warning
in undefined results, but as those are fields that the descriptor
intended to refer to other applications, don't fail if they're new
values.

Signed-off-by: Mario Limonciello 
---
 drivers/platform/x86/dell-wmi.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/platform/x86/dell-wmi.c b/drivers/platform/x86/dell-wmi.c
index 1fbef560ca67..2cfaaa8faf0a 100644
--- a/drivers/platform/x86/dell-wmi.c
+++ b/drivers/platform/x86/dell-wmi.c
@@ -657,17 +657,18 @@ static int dell_wmi_check_descriptor_buffer(struct 
wmi_device *wdev)
dev_err(&wdev->dev,
"Dell descriptor buffer has invalid length (%d)\n",
obj->buffer.length);
-   if (obj->buffer.length < 16) {
-   ret = -EINVAL;
-   goto out;
-   }
+   ret = -EINVAL;
+   goto out;
}
 
buffer = (u32 *)obj->buffer.pointer;
 
-   if (buffer[0] != 0x4C4C4544 && buffer[1] != 0x494D5720)
-   dev_warn(&wdev->dev, "Dell descriptor buffer has invalid 
signature (%*ph)\n",
+   if (buffer[0] != 0x4C4C4544 && buffer[1] != 0x494D5720) {
+   dev_err(&wdev->dev, "Dell descriptor buffer has invalid 
signature (%*ph)\n",
8, buffer);
+   ret = -EINVAL;
+   goto out;
+   }
 
if (buffer[2] != 0 && buffer[2] != 1)
dev_warn(&wdev->dev, "Dell descriptor buffer has unknown 
version (%d)\n",
-- 
2.14.1



[PATCH v6 00/14] Introduce support for Dell SMBIOS over WMI

2017-10-09 Thread Mario Limonciello
The existing way that the dell-smbios helper module and associated
other drivers (dell-laptop, dell-wmi) communicate with the platform
really isn't secure.  It requires creating a buffer in physical
DMA32 memory space and passing that to the platform via SMM.

Since the platform got a physical memory pointer, you've just got
to trust that the platform has only modified (and accessed) memory
within that buffer.

Dell Platform designers recognize this security risk and offer a
safer way to communicate with the platform over ACPI.  This is
in turn exposed via a WMI interface to the OS.

When communicating over WMI-ACPI the communication doesn't occur
with physical memory pointers.  When the ASL is invoked, the fixed
length ACPI buffer is copied to a small operating region.  The ASL
will invoke the SMI, and SMM will only have access to this operating
region.  When the ASL returns the buffer is copied back for the OS
to process.

This method of communication should also deprecate the usage of the
dcdbas kernel module and software dependent upon it's interface.
Instead offer a character device interface for communicating with this
ASL method to allow userspace to use instead.

To faciliate that this patch series introduces a generic way for WMI
drivers to be able to create discoverable character devices with
a predictable IOCTL interface through the WMI bus when desired.
Requiring WMI drivers to explicitly ask for this functionality will
act as an effective vendor whitelist to character device creation.

Some of this work is the basis for what will be a proper interpreter
of MOF in the kernel and controls for what drivers will be able to
do with that MOF.

NOTE: This patch series is intended to go on top of platform-drivers-x86
linux-next.

For convenience the entire series including those is also available here:
https://github.com/superm1/linux/tree/wmi-smbios
changes between v5 and v6:
 * Adjust Kconfig for dell-smbios to not depend on anything.
   - SMM and WMI drivers will both select dell-smbios
   - Technically the module can run on it's own (it's just not useful)
 * Add a new tokens sysfs interface
 * Rework blacklisting into easily expandable structures
 * Lock modules during ioctl call from WMI bus
 * drop references to compat ioctl in both WMI and dell-smbios-wmi
   drivers. (Made sure ioctl worked in both 32 and 64 userspace though)
 * dell-smbios-wmi 
   - s/buffer_size/req_buf_size/ to make it clearer that userspace
 doesn't get to set this, it's set internally.
   - read just buffer length before reading whole structure from 
 userspace
   - if buffer length is too big, show a warning
   - I tried to rename argument for unlocked_ioctl, but this caused
 problems in matching initialization lists, so still casting.
   - Add comments clarifying everything happening in ioctl
changes between v4 and v5:
 * Remove Andy's S suggested by in sysfs tokens patch
 * Make some output in dell-wmi-descriptor debug only
 * Adjust various Kconfig dependencies as recommended by Darren
 * Drop patch to set dell-smbios to default on ACPI_WMI,
   it's not needed after the Kconfig dependencies rework
 * Move WSMT check patch to after WMI driver is introduced.
 * Make common smbios call return value int as there could be
   errors now with drivers not being loaded.
 * Make SMBIOS call methods for all drivers return status
 * Reorder patches 2 and 4.
 * Don't export symbols for calling functions on dispatchers
 * wmi patch:
   - use sprintf instead of strcpy
   - remove needless bool for tracking found
   - adjust logic to look for instance_count - 1, it's zero
 based not 1 based.
   - Pass a callback to unlocked_ioctl instead of full file
 operations object
   - ioctl: Don't fail on no bound WMI driver
   - Add missing header for uapi
   - Make helper macros include data types
   - add compat ioctl
 * dell-smbios:
   - Add filtering functionality for SMBIOS calling interface
   - Use dev_dbg rather than pr_debug where possible
 * dell-smbios-wmi:
   - test for handle on b1 table
   - correct misc flags comment
   - drop access checks
   - use dev_dbg instead of pr_* calls
   - use filtering functionality
   - add mutexes around list add/remove
   - switch from get_first_device to get_first_priv and inline
   - add mutex locking to prevent unloading mid-call.
   - update to new ioctl passing
   - fix userspace uapi to use __u32 instead of u32
   - Don't use a header file for internal only use
   - make sure it works with compat ioctl
 * dell-laptop: make dell_smbios_send_request local function
   for boilerplate calls.
 * ioctl patch
   - Change API to have a simpler structure to pass back and
 forth
   - Rename header file
   - Export to sysfs properly
   - Add the size of the length variable into the requested
 buffersize to sysfs, do math in the driver when copying
 data around.
changes between v3 and v4:
 * Make Dell WMI notifications driver fail notifications fail
   when WMI descriptor 

Re: [PATCH] bcache: mark expected switch fall-throughs in STRTO_H

2017-10-09 Thread Michael Lyle

On 10/09/2017 02:54 PM, Gustavo A. R. Silva wrote:

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Cc: Kent Overstreet 
Cc: Shaohua Li 
Cc: linux-bca...@vger.kernel.org
Cc: linux-r...@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva 
Reviewed-by: Michael Lyle 

Looks good to me.

Thanks,

Mike


[PATCH v6 01/14] platform/x86: wmi: Add new method wmidev_evaluate_method

2017-10-09 Thread Mario Limonciello
Drivers properly using the wmibus can pass their wmi_device
pointer rather than the GUID back to the WMI bus to evaluate
the proper methods.

Any "new" drivers added that use the WMI bus should use this
rather than the old wmi_evaluate_method that would take the
GUID.

Signed-off-by: Mario Limonciello 
---
 drivers/platform/x86/wmi.c | 28 
 include/linux/wmi.h|  6 ++
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c
index 7a05843aff19..4d73a87c2ddf 100644
--- a/drivers/platform/x86/wmi.c
+++ b/drivers/platform/x86/wmi.c
@@ -200,6 +200,28 @@ static acpi_status wmi_method_enable(struct wmi_block 
*wblock, int enable)
  */
 acpi_status wmi_evaluate_method(const char *guid_string, u8 instance,
 u32 method_id, const struct acpi_buffer *in, struct acpi_buffer *out)
+{
+   struct wmi_block *wblock = NULL;
+
+   if (!find_guid(guid_string, &wblock))
+   return AE_ERROR;
+   return wmidev_evaluate_method(&wblock->dev, instance, method_id,
+ in, out);
+}
+EXPORT_SYMBOL_GPL(wmi_evaluate_method);
+
+/**
+ * wmidev_evaluate_method - Evaluate a WMI method
+ * @wdev: A wmi bus device from a driver
+ * @instance: Instance index
+ * @method_id: Method ID to call
+ * &in: Buffer containing input for the method call
+ * &out: Empty buffer to return the method results
+ *
+ * Call an ACPI-WMI method
+ */
+acpi_status wmidev_evaluate_method(struct wmi_device *wdev, u8 instance,
+   u32 method_id, const struct acpi_buffer *in, struct acpi_buffer *out)
 {
struct guid_block *block = NULL;
struct wmi_block *wblock = NULL;
@@ -209,9 +231,7 @@ u32 method_id, const struct acpi_buffer *in, struct 
acpi_buffer *out)
union acpi_object params[3];
char method[5] = "WM";
 
-   if (!find_guid(guid_string, &wblock))
-   return AE_ERROR;
-
+   wblock = container_of(wdev, struct wmi_block, dev);
block = &wblock->gblock;
handle = wblock->acpi_device->handle;
 
@@ -246,7 +266,7 @@ u32 method_id, const struct acpi_buffer *in, struct 
acpi_buffer *out)
 
return status;
 }
-EXPORT_SYMBOL_GPL(wmi_evaluate_method);
+EXPORT_SYMBOL_GPL(wmidev_evaluate_method);
 
 static acpi_status __query_block(struct wmi_block *wblock, u8 instance,
 struct acpi_buffer *out)
diff --git a/include/linux/wmi.h b/include/linux/wmi.h
index cd0d7734dc49..2cd10c3b89e9 100644
--- a/include/linux/wmi.h
+++ b/include/linux/wmi.h
@@ -26,6 +26,12 @@ struct wmi_device {
bool setable;
 };
 
+/* evaluate the ACPI method associated with this device */
+extern acpi_status wmidev_evaluate_method(struct wmi_device *wdev,
+ u8 instance, u32 method_id,
+ const struct acpi_buffer *in,
+ struct acpi_buffer *out);
+
 /* Caller must kfree the result. */
 extern union acpi_object *wmidev_block_query(struct wmi_device *wdev,
 u8 instance);
-- 
2.14.1



[PATCH v6 14/14] platform/x86: dell-smbios-wmi: introduce userspace interface

2017-10-09 Thread Mario Limonciello
It's important for the driver to provide a R/W ioctl to ensure that
two competing userspace processes don't race to provide or read each
others data.

This userspace character device will be used to perform SMBIOS calls
from any applications.

It provides an ioctl that will allow passing the WMI calling
interface buffer between userspace and kernel space.

This character device is intended to deprecate the dcdbas kernel module
and the interface that it provides to userspace.

To use the character device the buffer needed for the machine will
also be needed.  This information is exported to a sysfs attribute.

The API for interacting with this interface is defined in documentation
as well as a uapi header provides the format of the structures.

Signed-off-by: Mario Limonciello 
---
 Documentation/ABI/testing/dell-smbios-wmi  |  41 
 .../ABI/testing/sysfs-platform-dell-smbios-wmi |  10 ++
 MAINTAINERS|   1 +
 drivers/platform/x86/dell-smbios-wmi.c | 107 ++---
 drivers/platform/x86/dell-smbios.h |  11 +--
 include/uapi/linux/dell-smbios.h   |  42 
 6 files changed, 191 insertions(+), 21 deletions(-)
 create mode 100644 Documentation/ABI/testing/dell-smbios-wmi
 create mode 100644 Documentation/ABI/testing/sysfs-platform-dell-smbios-wmi
 create mode 100644 include/uapi/linux/dell-smbios.h

diff --git a/Documentation/ABI/testing/dell-smbios-wmi 
b/Documentation/ABI/testing/dell-smbios-wmi
new file mode 100644
index ..e067e955fcc9
--- /dev/null
+++ b/Documentation/ABI/testing/dell-smbios-wmi
@@ -0,0 +1,41 @@
+What:  /dev/wmi/dell-smbios
+Date:  November 2017
+KernelVersion: 4.15
+Contact:   "Mario Limonciello" 
+Description:
+   Perform SMBIOS calls on supported Dell machines.
+   through the Dell ACPI-WMI interface.
+
+   IOCTL's and buffer formats are defined in:
+   
+
+   1) To perform a call from userspace, you'll need to first
+   determine the minimum size of the calling interface buffer
+   for your machine.
+   Platforms that contain larger buffers can return larger
+   objects from the system firmware.
+   Commonly this size is either 4k or 32k.
+
+   To determine the size of the buffer, refer to:
+   sysfs-platform-dell-smbios-wmi
+
+   2) After you've determined the minimum size of the calling
+   interface buffer, you can allocate a structure that represents
+   the structure documented above.
+
+   3) In the 'length' object store the size of the buffer you
+   determined above and allocated.
+
+   4) In this buffer object, prepare as necessary for the SMBIOS
+   call you're interested in.  Typically SMBIOS buffers have
+   "class", "select", and "input" defined to values that coincide
+   with the data you are interested in.
+   Documenting class/select/input values is outside of the scope
+   of this documentation. Check with the libsmbios project for
+   further documentation on these values.
+
+   6) Run the call by using ioctl() as described in the header.
+
+   7) The output will be returned in the buffer object.
+
+   8) Be sure to free up your allocated object.
diff --git a/Documentation/ABI/testing/sysfs-platform-dell-smbios-wmi 
b/Documentation/ABI/testing/sysfs-platform-dell-smbios-wmi
new file mode 100644
index ..33f0fb73f785
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-platform-dell-smbios-wmi
@@ -0,0 +1,10 @@
+What:  /sys/devices/platform//required_buffer_size
+Date:  November 2017
+KernelVersion: 4.15
+Contact:   "Mario Limonciello" 
+Description:
+   A read-only description of the size of a calling
+   interface buffer that can be passed to Dell
+   firmware.
+
+   Commonly this size is either 4k or 32k.
diff --git a/MAINTAINERS b/MAINTAINERS
index 2a99ee9fd883..4940f3c7481b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3986,6 +3986,7 @@ M:Mario Limonciello 
 L: platform-driver-...@vger.kernel.org
 S: Maintained
 F: drivers/platform/x86/dell-smbios-wmi.c
+F: include/uapi/linux/dell-smbios.h
 
 DELL LAPTOP DRIVER
 M: Matthew Garrett 
diff --git a/drivers/platform/x86/dell-smbios-wmi.c 
b/drivers/platform/x86/dell-smbios-wmi.c
index e8a2eb858677..ec38fbf945d5 100644
--- a/drivers/platform/x86/dell-smbios-wmi.c
+++ b/drivers/platform/x86/dell-smbios-wmi.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "dell-smbios.h"
 #include "dell-wmi-descriptor.h"
 static DEFINE_MUTEX(call_mutex);
@@ -29,19 +30,9 @@ struct misc_bios_flags_structure {
 
 #define DELL_WMI_SMBIOS_GUID "A805

[PATCH v6 13/14] platform/x86: wmi: create character devices when requested by drivers

2017-10-09 Thread Mario Limonciello
For WMI operations that are only Set or Query read or write sysfs
attributes created by WMI vendor drivers make sense.

For other WMI operations that are run on Method, there needs to be a
way to guarantee to userspace that the results from the method call
belong to the data request to the method call.  Sysfs attributes don't
work well in this scenario because two userspace processes may be
competing at reading/writing an attribute and step on each other's
data.

When a WMI vendor driver declares an ioctl callback in the wmi_driver
the WMI bus driver will create a character device that maps to that
function.

That character device will correspond to this path:
/dev/wmi/$driver

The WMI bus driver will interpret the IOCTL calls, test them for
a valid instance and pass them on to the vendor driver to run.

This creates an implicit policy that only driver per character
device.  If a module matches multiple GUID's, the wmi_devices
will need to be all handled by the same wmi_driver if the same
character device is used.

The WMI vendor drivers will be responsible for managing access to
this character device and proper locking on it.

When a WMI vendor driver is unloaded the WMI bus driver will clean
up the character device.

Signed-off-by: Mario Limonciello 
---
 MAINTAINERS|  1 +
 drivers/platform/x86/wmi.c | 69 ++
 include/linux/wmi.h|  3 ++
 include/uapi/linux/wmi.h   | 19 +
 4 files changed, 92 insertions(+)
 create mode 100644 include/uapi/linux/wmi.h

diff --git a/MAINTAINERS b/MAINTAINERS
index e7514b616e13..2a99ee9fd883 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -372,6 +372,7 @@ ACPI WMI DRIVER
 L: platform-driver-...@vger.kernel.org
 S: Orphan
 F: drivers/platform/x86/wmi.c
+F: include/uapi/linux/wmi.h
 
 AD1889 ALSA SOUND DRIVER
 M: Thibaut Varene 
diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c
index bcb41c1c7f52..a22b0d333c8f 100644
--- a/drivers/platform/x86/wmi.c
+++ b/drivers/platform/x86/wmi.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -69,6 +70,7 @@ struct wmi_block {
struct wmi_device dev;
struct list_head list;
struct guid_block gblock;
+   struct miscdevice misc_dev;
struct acpi_device *acpi_device;
wmi_notify_handler handler;
void *handler_data;
@@ -765,12 +767,74 @@ static int wmi_dev_match(struct device *dev, struct 
device_driver *driver)
return 0;
 }
 
+static long wmi_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+   struct wmi_driver *wdriver = NULL;
+   struct wmi_block *wblock = NULL;
+   const char *driver_name;
+   struct list_head *p;
+   int ret;
+
+   if (_IOC_TYPE(cmd) != WMI_IOC)
+   return -ENOTTY;
+
+   driver_name = filp->f_path.dentry->d_iname;
+
+   list_for_each(p, &wmi_block_list) {
+   wblock = list_entry(p, struct wmi_block, list);
+   wdriver = container_of(wblock->dev.dev.driver,
+   struct wmi_driver, driver);
+   if (!wdriver)
+   continue;
+   if (strcmp(driver_name, wdriver->driver.name) == 0)
+   break;
+   }
+
+   if (!wdriver)
+   return -ENODEV;
+
+   /* make sure we're not calling a higher instance than exists*/
+   if (_IOC_NR(cmd) > wblock->gblock.instance_count - 1)
+   return -EINVAL;
+
+   if (!try_module_get(wdriver->driver.owner))
+   return -EBUSY;
+
+   ret = wdriver->unlocked_ioctl(&wblock->dev, cmd, arg);
+   module_put(wdriver->driver.owner);
+
+   return ret;
+}
+
+static const struct file_operations wmi_fops = {
+   .owner  = THIS_MODULE,
+   .unlocked_ioctl = wmi_ioctl,
+};
+
 static int wmi_dev_probe(struct device *dev)
 {
struct wmi_block *wblock = dev_to_wblock(dev);
struct wmi_driver *wdriver =
container_of(dev->driver, struct wmi_driver, driver);
int ret = 0;
+   char *buf;
+
+   /* driver wants a character device made */
+   if (wdriver->unlocked_ioctl) {
+   buf = kmalloc(strlen(wdriver->driver.name) + 4, GFP_KERNEL);
+   if (!buf)
+   return -ENOMEM;
+   sprintf(buf, "wmi/%s", wdriver->driver.name);
+   wblock->misc_dev.minor = MISC_DYNAMIC_MINOR;
+   wblock->misc_dev.name = buf;
+   wblock->misc_dev.fops = &wmi_fops;
+   ret = misc_register(&wblock->misc_dev);
+   if (ret) {
+   dev_warn(dev, "failed to register char dev: %d", ret);
+   kfree(buf);
+   return -ENOMEM;
+   }
+   }
 
if (ACPI_FAILURE(wmi_method_enable(wblock, 1)))
dev_warn(dev, "failed to enable device -- 

[PATCH v6 09/14] platform/x86: dell-smbios: Introduce dispatcher for SMM calls

2017-10-09 Thread Mario Limonciello
This splits up the dell-smbios driver into two drivers:
* dell-smbios
* dell-smbios-smm

dell-smbios can operate with multiple different dispatcher drivers to
perform SMBIOS operations.

Also modify the interface that dell-laptop and dell-wmi use align to this
model more closely.  Rather than a single global buffer being allocated
for all drivers, each driver will allocate and be responsible for it's own
buffer. The pointer will be passed to the calling function and each
dispatcher driver will then internally copy it to the proper location to
perform it's call.

Signed-off-by: Mario Limonciello 
---
 MAINTAINERS|   6 +
 drivers/platform/x86/Kconfig   |  15 +-
 drivers/platform/x86/Makefile  |   1 +
 drivers/platform/x86/dell-laptop.c | 255 -
 drivers/platform/x86/dell-smbios-smm.c | 136 ++
 drivers/platform/x86/dell-smbios.c | 113 +--
 drivers/platform/x86/dell-smbios.h |  13 +-
 drivers/platform/x86/dell-wmi.c|  11 +-
 8 files changed, 329 insertions(+), 221 deletions(-)
 create mode 100644 drivers/platform/x86/dell-smbios-smm.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 2e3f2aea0370..8faf08ebcfee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3974,6 +3974,12 @@ L:   platform-driver-...@vger.kernel.org
 S: Maintained
 F: drivers/platform/x86/dell-smbios.*
 
+DELL SMBIOS SMM DRIVER
+M: Mario Limonciello 
+L: platform-driver-...@vger.kernel.org
+S: Maintained
+F: drivers/platform/x86/dell-smbios-smm.c
+
 DELL LAPTOP DRIVER
 M: Matthew Garrett 
 M: Pali Rohár 
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 7722923c968c..53a2de781de6 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -93,12 +93,19 @@ config ASUS_LAPTOP
 
 config DELL_SMBIOS
tristate
-   select DCDBAS
+
+config DELL_SMBIOS_SMM
+   tristate "Dell SMBIOS calling interface (SMM implementation)"
+   depends on DCDBAS
+   default DCDBAS
+   select DELL_SMBIOS
---help---
-   This module provides common functions for kernel modules using
-   Dell SMBIOS.
+   This provides an implementation for the Dell SMBIOS calling interface
+   communicated over SMI/SMM.
 
-   If you have a Dell laptop, say Y or M here.
+   If you have a Dell computer from <=2017 you should say Y or M here.
+   If you aren't sure and this module doesn't work for your computer
+   it just won't load.
 
 config DELL_LAPTOP
tristate "Dell Laptop Extras"
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 8636f5d3424f..e743615241f8 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_MSI_LAPTOP)  += msi-laptop.o
 obj-$(CONFIG_ACPI_CMPC)+= classmate-laptop.o
 obj-$(CONFIG_COMPAL_LAPTOP)+= compal-laptop.o
 obj-$(CONFIG_DELL_SMBIOS)  += dell-smbios.o
+obj-$(CONFIG_DELL_SMBIOS_SMM)  += dell-smbios-smm.o
 obj-$(CONFIG_DELL_LAPTOP)  += dell-laptop.o
 obj-$(CONFIG_DELL_WMI) += dell-wmi.o
 obj-$(CONFIG_DELL_WMI_DESCRIPTOR)  += dell-wmi-descriptor.o
diff --git a/drivers/platform/x86/dell-laptop.c 
b/drivers/platform/x86/dell-laptop.c
index f42159fd2031..2f65f5c15d53 100644
--- a/drivers/platform/x86/dell-laptop.c
+++ b/drivers/platform/x86/dell-laptop.c
@@ -85,6 +85,7 @@ static struct platform_driver platform_driver = {
}
 };
 
+static struct calling_interface_buffer *buffer;
 static struct platform_device *platform_device;
 static struct backlight_device *dell_backlight_device;
 static struct rfkill *wifi_rfkill;
@@ -283,6 +284,23 @@ static const struct dmi_system_id dell_quirks[] 
__initconst = {
{ }
 };
 
+int dell_send_request(u16 class, u16 select, u32 arg0, u32 arg1, u32 arg2,
+  u32 arg3)
+{
+   int ret;
+
+   buffer->class = class;
+   buffer->select = select;
+   buffer->input[0] = arg0;
+   buffer->input[1] = arg1;
+   buffer->input[2] = arg2;
+   buffer->input[3] = arg3;
+   ret = dell_smbios_call(buffer);
+   if (ret != 0)
+   return ret;
+   return dell_smbios_error(buffer->output[0]);
+}
+
 /*
  * Derived from information in smbios-wireless-ctl:
  *
@@ -405,7 +423,6 @@ static const struct dmi_system_id dell_quirks[] __initconst 
= {
 
 static int dell_rfkill_set(void *data, bool blocked)
 {
-   struct calling_interface_buffer *buffer;
int disable = blocked ? 1 : 0;
unsigned long radio = (unsigned long)data;
int hwswitch_bit = (unsigned long)data - 1;
@@ -413,20 +430,14 @@ static int dell_rfkill_set(void *data, bool blocked)
int status;
int ret;
 
-   buffer = dell_smbios_get_buffer();
-
-   dell_smbios_send_request(17, 11);
-   ret = buffer->output[0];
+   ret = dell_send_request(17, 11, 0, 0, 0, 0);
+   if (

[PATCH v3 2/2] KVM: VMX: Don't expose unrestricted_guest is enabled if ept is disabled

2017-10-09 Thread Wanpeng Li
From: Wanpeng Li 

SDM mentioned:

 "If either the “unrestricted guest” VM-execution control or the “mode-based
  execute control for EPT” VM- execution control is 1, the “enable EPT”
  VM-execution control must also be 1."

However, we can still observe unrestricted_guest is Y after inserting the 
kvm-intel.ko
w/ ept=N. It depends on later starts a guest in order that the function
vmx_compute_secondary_exec_control() can be executed, then both the module 
parameter
and exec control fields will be amended.

This patch fixes it by amending module parameter immediately during vmcs data 
setup.

Reviewed-by: Jim Mattson 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: Jim Mattson 
Signed-off-by: Wanpeng Li 
---
v1 -> v2: 
 * reduce to just "enable_ept = 0"

 arch/x86/kvm/vmx.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ab0f16e..dedff59 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6730,16 +6730,13 @@ static __init int hardware_setup(void)
if (!cpu_has_vmx_ept() ||
!cpu_has_vmx_ept_4levels() ||
!cpu_has_vmx_ept_mt_wb() ||
-   !cpu_has_vmx_invept_global()) {
+   !cpu_has_vmx_invept_global())
enable_ept = 0;
-   enable_unrestricted_guest = 0;
-   enable_ept_ad_bits = 0;
-   }
 
if (!cpu_has_vmx_ept_ad_bits() || !enable_ept)
enable_ept_ad_bits = 0;
 
-   if (!cpu_has_vmx_unrestricted_guest())
+   if (!cpu_has_vmx_unrestricted_guest() || !enable_ept)
enable_unrestricted_guest = 0;
 
if (!cpu_has_vmx_flexpriority())
-- 
2.7.4



[PATCH v3 1/2] KVM: X86: Processor States following Reset or INIT

2017-10-09 Thread Wanpeng Li
From: Wanpeng Li 

- XCR0 is reset to 1 by RESET but not INIT
- XSS is zeroed by both RESET and INIT
- BNDCFGU, BND0-BND3, BNDCFGS are zeroed by both RESET and INIT

This patch does this according to SDM.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: Jim Mattson 
Signed-off-by: Wanpeng Li 
---
v2 -> v3:
 * fix null pointer deference
 * fix patch description
v1 -> v2:
 * XCR0 is not zeroed by INIT
 * XSS, BNDCFGU, BND0-BND3, BNDCFGS are zeroed by both RESET and INIT

 arch/x86/kvm/vmx.c |  2 ++
 arch/x86/kvm/x86.c | 15 +++
 2 files changed, 17 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 244e366..ab0f16e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5581,6 +5581,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool 
init_event)
vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, 0);
vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, 0);
+   if (kvm_mpx_supported())
+   vmcs_write64(GUEST_BNDCFGS, 0);
 
setup_msrs(vmx);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b0d2915..b1c1755 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7804,18 +7804,33 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool 
init_event)
kvm_async_pf_hash_reset(vcpu);
vcpu->arch.apf.halted = false;
 
+   if (kvm_mpx_supported()) {
+   void *mpx_state_buffer;
+
+   mpx_state_buffer = 
get_xsave_addr(&vcpu->arch.guest_fpu.state.xsave, XFEATURE_MASK_BNDREGS);
+   if (mpx_state_buffer)
+   memset(mpx_state_buffer, 0, sizeof(struct 
mpx_bndreg_state));
+   mpx_state_buffer = 
get_xsave_addr(&vcpu->arch.guest_fpu.state.xsave, XFEATURE_MASK_BNDCSR);
+   if (mpx_state_buffer)
+   memset(mpx_state_buffer, 0, sizeof(u64));
+   }
+
if (!init_event) {
kvm_pmu_reset(vcpu);
vcpu->arch.smbase = 0x3;
 
vcpu->arch.msr_platform_info = MSR_PLATFORM_INFO_CPUID_FAULT;
vcpu->arch.msr_misc_features_enables = 0;
+
+   vcpu->arch.xcr0 = XFEATURE_MASK_FP;
}
 
memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
vcpu->arch.regs_avail = ~0;
vcpu->arch.regs_dirty = ~0;
 
+   vcpu->arch.ia32_xss = 0;
+
kvm_x86_ops->vcpu_reset(vcpu, init_event);
 }
 
-- 
2.7.4



Re: [PATCH] bitfield: Use __ffs64(x) to fix missing __ffsdi2()

2017-10-09 Thread Jakub Kicinski
On Mon,  9 Oct 2017 10:40:49 +0200, Geert Uytterhoeven wrote:
> On most architectures[*], gcc turns __builtin_ffsll() into a call to
> __ffsdi2(), which is not provided by any architecture, leading to
> failures like:
> 
> rcar-gen3-cpg.c:(.text+0x289): undefined reference to `__ffsdi2'
> 
> To fix this, use __ffs64() instead, which is available on all
> architectures.
> 
> [*] Known exceptions are some 64-bit architectures like amd64, arm64,
> ia64, powerpc64, and tilegx.
> 
> Reported-by: kbuild test robot 
> Fixes: 3e9b3112ec74f192 ("add basic register-field manipulation macros")
> Signed-off-by: Geert Uytterhoeven 
> ---
>  include/linux/bitfield.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h
> index 8b9d6fff002db113..0a827677372756fa 100644
> --- a/include/linux/bitfield.h
> +++ b/include/linux/bitfield.h
> @@ -15,6 +15,7 @@
>  #ifndef _LINUX_BITFIELD_H
>  #define _LINUX_BITFIELD_H
>  
> +#include 
>  #include 
>  
>  /*
> @@ -46,7 +47,7 @@
>   *  reg |= FIELD_PREP(REG_FIELD_C, c);
>   */
>  
> -#define __bf_shf(x) (__builtin_ffsll(x) - 1)
> +#define __bf_shf(x) __ffs64(x)

Hm.  The build bot failure made me think.  I think rcar-gen3-cpg.c may
be doing something wrong here, could you point me at the patch in
question?  I don't see any FIELD_* there in Linus's tree.

__bf_shf() is supposed to be used with constant masks only, therefore
the call must be optimized away completely.

>  #define __BF_FIELD_CHECK(_mask, _reg, _val, _pfx)\
>   ({  \



[PATCH v6 11/14] platform/x86: dell-smbios-wmi: Add new WMI dispatcher driver

2017-10-09 Thread Mario Limonciello
The dell-smbios stack only currently uses an SMI interface which grants
direct access to physical memory to the firmware SMM methods via a pointer.

This dispatcher driver adds a WMI-ACPI interface that is detected by WMI
probe and preferred over the SMI interface in dell-smbios.

Changing this to operate over WMI-ACPI will use an ACPI OperationRegion
for a buffer of data storage when SMM calls are performed.

This is a safer approach to use in kernel drivers as the SMM will
only have access to that OperationRegion.

Signed-off-by: Mario Limonciello 
---
 MAINTAINERS|   6 +
 drivers/platform/x86/Kconfig   |  14 ++
 drivers/platform/x86/Makefile  |   1 +
 drivers/platform/x86/dell-smbios-wmi.c | 229 +
 4 files changed, 250 insertions(+)
 create mode 100644 drivers/platform/x86/dell-smbios-wmi.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 8faf08ebcfee..e7514b616e13 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3980,6 +3980,12 @@ L:   platform-driver-...@vger.kernel.org
 S: Maintained
 F: drivers/platform/x86/dell-smbios-smm.c
 
+DELL SMBIOS WMI DRIVER
+M: Mario Limonciello 
+L: platform-driver-...@vger.kernel.org
+S: Maintained
+F: drivers/platform/x86/dell-smbios-wmi.c
+
 DELL LAPTOP DRIVER
 M: Matthew Garrett 
 M: Pali Rohár 
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 53a2de781de6..83aee9068c64 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -94,6 +94,20 @@ config ASUS_LAPTOP
 config DELL_SMBIOS
tristate
 
+config DELL_SMBIOS_WMI
+   tristate "Dell SMBIOS calling interface (WMI implementation)"
+   depends on ACPI_WMI
+   select DELL_WMI_DESCRIPTOR
+   default ACPI_WMI
+   select DELL_SMBIOS
+   ---help---
+   This provides an implementation for the Dell SMBIOS calling interface
+   communicated over ACPI-WMI.
+
+   If you have a Dell computer from >2007 you should say Y or M here.
+   If you aren't sure and this module doesn't work for your computer
+   it just won't load.
+
 config DELL_SMBIOS_SMM
tristate "Dell SMBIOS calling interface (SMM implementation)"
depends on DCDBAS
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index e743615241f8..1c4234861de0 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_MSI_LAPTOP)  += msi-laptop.o
 obj-$(CONFIG_ACPI_CMPC)+= classmate-laptop.o
 obj-$(CONFIG_COMPAL_LAPTOP)+= compal-laptop.o
 obj-$(CONFIG_DELL_SMBIOS)  += dell-smbios.o
+obj-$(CONFIG_DELL_SMBIOS_WMI)  += dell-smbios-wmi.o
 obj-$(CONFIG_DELL_SMBIOS_SMM)  += dell-smbios-smm.o
 obj-$(CONFIG_DELL_LAPTOP)  += dell-laptop.o
 obj-$(CONFIG_DELL_WMI) += dell-wmi.o
diff --git a/drivers/platform/x86/dell-smbios-wmi.c 
b/drivers/platform/x86/dell-smbios-wmi.c
new file mode 100644
index ..e8a2eb858677
--- /dev/null
+++ b/drivers/platform/x86/dell-smbios-wmi.c
@@ -0,0 +1,229 @@
+/*
+ *  WMI methods for use with dell-smbios
+ *
+ *  Copyright (c) 2017 Dell Inc.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License version 2 as
+ *  published by the Free Software Foundation.
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "dell-smbios.h"
+#include "dell-wmi-descriptor.h"
+static DEFINE_MUTEX(call_mutex);
+static DEFINE_MUTEX(list_mutex);
+static int wmi_supported;
+
+struct misc_bios_flags_structure {
+   struct dmi_header header;
+   u16 flags0;
+} __packed;
+#define FLAG_HAS_ACPI_WMI 0x02
+
+#define DELL_WMI_SMBIOS_GUID "A80593CE-A997-11DA-B012-B622A1EF5492"
+
+struct wmi_extensions {
+   __u32 argattrib;
+   __u32 blength;
+   __u8 data[];
+} __packed;
+
+struct wmi_smbios_buffer {
+   struct calling_interface_buffer std;
+   struct wmi_extensions ext;
+} __packed;
+
+struct wmi_smbios_priv {
+   struct wmi_smbios_buffer *buf;
+   struct list_head list;
+   struct wmi_device *wdev;
+   struct device *child;
+   u32 req_buf_size;
+};
+static LIST_HEAD(wmi_list);
+
+static inline struct wmi_smbios_priv *get_first_smbios_priv(void)
+{
+   return list_first_entry_or_null(&wmi_list,
+   struct wmi_smbios_priv,
+   list);
+}
+
+static int run_smbios_call(struct wmi_device *wdev)
+{
+   struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL};
+   struct wmi_smbios_priv *priv;
+   struct acpi_buffer input;
+   union acpi_object *obj;
+   acpi_status status;
+
+   priv = dev_get_drvdata(&wdev->dev);
+   input.length = priv->req_buf_size;
+   input.pointer = &priv->buf->std;
+
+   dev_dbg(&wdev->dev, "evaluating: %x/%x [%x,%x,%x,

Re: [PATCH v2] Userfaultfd: Add description for UFFD_FEATURE_SIGBUS

2017-10-09 Thread Andrea Arcangeli
On Mon, Oct 09, 2017 at 03:45:51PM -0700, Prakash Sangappa wrote:
> Userfaultfd feature UFFD_FEATURE_SIGBUS was merged recently and should
> be available in Linux 4.14 release. This patch is for the manpage
> changes documenting this API.
> 
> Documents the following commit:
> 
> commit 2d6d6f5a09a96cc1fec7ed992b825e05f64cb50e
> Author: Prakash Sangappa 
> Date: Wed Sep 6 16:23:39 2017 -0700
> 
>  mm: userfaultfd: add feature to request for a signal delivery
> 
> Signed-off-by: Prakash Sangappa 

Reviewed-by: Andrea Arcangeli 


[PATCH v6 08/14] platform/x86: dell-smbios: Add a sysfs interface for SMBIOS tokens

2017-10-09 Thread Mario Limonciello
Currently userspace tools can access system tokens via the dcdbas
kernel module and a SMI call that will cause the platform to execute
SMM code.

With a goal in mind of deprecating the dcdbas kernel module a different
method for accessing these tokens from userspace needs to be created.

This is intentionally marked to only be readable as root as it can
contain sensitive information about the platform's configuration.

While adding this interface I found that some tokens are duplicated.
These need to be ignored from sysfs to avoid duplicate files.

MAINTAINERS was missing for this driver.  Add myself and Pali to
maintainers list for it.

Signed-off-by: Mario Limonciello 
---
 .../ABI/testing/sysfs-platform-dell-smbios |  20 ++
 MAINTAINERS|   7 +
 drivers/platform/x86/dell-smbios.c | 205 -
 3 files changed, 231 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/ABI/testing/sysfs-platform-dell-smbios

diff --git a/Documentation/ABI/testing/sysfs-platform-dell-smbios 
b/Documentation/ABI/testing/sysfs-platform-dell-smbios
new file mode 100644
index ..6700189832f5
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-platform-dell-smbios
@@ -0,0 +1,20 @@
+What:  /sys/devices/platform//tokens/*
+Date:  November 2017
+KernelVersion: 4.15
+Contact:   "Mario Limonciello" 
+Description:
+   A read-only description of Dell platform tokens
+   available on the machine.
+
+   Each token attribute is available as a pair of
+   sysfs attributes readable by root.
+
+   For example the token ID "5" would be available
+   as the following attributes:
+
+   0005_location
+   0005_value
+
+   Tokens will vary from machine to machine, and
+   only tokens available on that machine will be
+   displayed.
diff --git a/MAINTAINERS b/MAINTAINERS
index 659dbeec4191..2e3f2aea0370 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3967,6 +3967,13 @@ M:   "Maciej W. Rozycki" 
 S: Maintained
 F: drivers/net/fddi/defxx.*
 
+DELL SMBIOS DRIVER
+M: Pali Rohár 
+M: Mario Limonciello 
+L: platform-driver-...@vger.kernel.org
+S: Maintained
+F: drivers/platform/x86/dell-smbios.*
+
 DELL LAPTOP DRIVER
 M: Matthew Garrett 
 M: Pali Rohár 
diff --git a/drivers/platform/x86/dell-smbios.c 
b/drivers/platform/x86/dell-smbios.c
index 7e779278d054..ade248646578 100644
--- a/drivers/platform/x86/dell-smbios.c
+++ b/drivers/platform/x86/dell-smbios.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include "../../firmware/dcdbas.h"
+#include 
 #include "dell-smbios.h"
 
 struct calling_interface_structure {
@@ -39,7 +40,11 @@ static DEFINE_MUTEX(buffer_mutex);
 static int da_command_address;
 static int da_command_code;
 static int da_num_tokens;
+static struct platform_device *platform_device;
 static struct calling_interface_token *da_tokens;
+static struct device_attribute *token_location_attrs;
+static struct device_attribute *token_value_attrs;
+static struct attribute **token_attrs;
 
 int dell_smbios_error(int value)
 {
@@ -157,6 +162,26 @@ static void __init parse_da_table(const struct dmi_header 
*dm)
da_num_tokens += tokens;
 }
 
+static void zero_duplicates(struct device *dev)
+{
+   int i, j;
+
+   for (i = 0; i < da_num_tokens; i++) {
+   for (j = i+1; j < da_num_tokens; j++) {
+   if (da_tokens[i].tokenID == 0 ||
+   da_tokens[j].tokenID == 0)
+   continue;
+   if (da_tokens[i].tokenID == da_tokens[j].tokenID) {
+   dev_dbg(dev, "Zeroing dup token ID %x(%x/%x)\n",
+   da_tokens[j].tokenID,
+   da_tokens[j].location,
+   da_tokens[j].value);
+   da_tokens[j].tokenID = 0;
+   }
+   }
+   }
+}
+
 static void __init find_tokens(const struct dmi_header *dm, void *dummy)
 {
switch (dm->type) {
@@ -170,6 +195,148 @@ static void __init find_tokens(const struct dmi_header 
*dm, void *dummy)
}
 }
 
+static int match_attribute(struct device *dev,
+  struct device_attribute *attr)
+{
+   int i;
+
+   for (i = 0; i < da_num_tokens * 2; i++) {
+   if (!token_attrs[i])
+   continue;
+   if (strcmp(token_attrs[i]->name, attr->attr.name) == 0)
+   return i/2;
+   }
+   dev_dbg(dev, "couldn't match: %s\n", attr->attr.name);
+   return -EINVAL;
+}
+
+static ssize_t location_show(struct device *dev,
+struct device_attribute *attr, char *buf)
+{
+   int i;
+
+   i = match_attribute(dev, attr);
+   if (

[PATCH v6 12/14] platform/x86: dell-smbios-smm: test for WSMT

2017-10-09 Thread Mario Limonciello
WSMT is as an attestation to the OS that the platform won't
modify memory outside of pre-defined areas.

If a platform has WSMT enabled in BIOS setup, SMM calls through
dcdbas will fail.  The only way to access platform data in these
instances is through the WMI SMBIOS calling interface.

Signed-off-by: Mario Limonciello 
---
 drivers/platform/x86/dell-smbios-smm.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/drivers/platform/x86/dell-smbios-smm.c 
b/drivers/platform/x86/dell-smbios-smm.c
index 223531e43fea..ba315753e847 100644
--- a/drivers/platform/x86/dell-smbios-smm.c
+++ b/drivers/platform/x86/dell-smbios-smm.c
@@ -25,6 +25,8 @@ static struct calling_interface_buffer *buffer;
 struct platform_device *platform_device;
 static DEFINE_MUTEX(smm_mutex);
 
+#define WSMT_EN_TOKEN  0x04EC
+
 static const struct dmi_system_id dell_device_table[] __initconst = {
{
.ident = "Dell laptop",
@@ -76,6 +78,30 @@ int dell_smbios_smm_call(struct calling_interface_buffer 
*input)
return 0;
 }
 
+static int test_wsmt_enabled(void)
+{
+   struct calling_interface_token *token;
+
+   /* if token doesn't exist, SMM will work */
+   token = dell_smbios_find_token(WSMT_EN_TOKEN);
+   if (!token)
+   return 0;
+
+   /* if token exists, try to access over SMM */
+   buffer->class = 0;
+   buffer->select = 0;
+   memset(buffer, 0, sizeof(struct calling_interface_buffer));
+   buffer->input[0] = token->location;
+   dell_smbios_smm_call(buffer);
+
+   /* if lookup failed, we know WSMT was enabled */
+   if (buffer->output[0] != 0)
+   return 1;
+
+   /* query token status if it didn't fail */
+   return (buffer->output[1] == token->value);
+}
+
 static int __init dell_smbios_smm_init(void)
 {
int ret;
@@ -88,6 +114,13 @@ static int __init dell_smbios_smm_init(void)
return -ENOMEM;
dell_smbios_get_smm_address(&da_command_address, &da_command_code);
 
+   ret = test_wsmt_enabled();
+   pr_debug("WSMT enable test: %d\n", ret);
+   if (ret) {
+   ret = -ENODEV;
+   goto fail_wsmt;
+   }
+
platform_device = platform_device_alloc("dell-smbios", 1);
if (!platform_device) {
ret = -ENOMEM;
@@ -111,6 +144,7 @@ static int __init dell_smbios_smm_init(void)
 fail_platform_device_add:
platform_device_put(platform_device);
 
+fail_wsmt:
 fail_platform_device_alloc:
free_page((unsigned long)buffer);
return ret;
-- 
2.14.1



[PATCH v6 07/14] platform/x86: dell-smbios: only run if proper oem string is detected

2017-10-09 Thread Mario Limonciello
The proper way to indicate that a system is a 'supported' Dell System
is by the presence of this string in OEM strings.

Allowing the driver to load on non-Dell systems will have undefined
results.

Signed-off-by: Mario Limonciello 
---
 drivers/platform/x86/dell-smbios.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/platform/x86/dell-smbios.c 
b/drivers/platform/x86/dell-smbios.c
index e9b1ca07c872..7e779278d054 100644
--- a/drivers/platform/x86/dell-smbios.c
+++ b/drivers/platform/x86/dell-smbios.c
@@ -172,8 +172,15 @@ static void __init find_tokens(const struct dmi_header 
*dm, void *dummy)
 
 static int __init dell_smbios_init(void)
 {
+   const struct dmi_device *valid;
int ret;
 
+   valid = dmi_find_device(DMI_DEV_TYPE_OEM_STRING, "Dell System", NULL);
+   if (!valid) {
+   pr_err("Unable to run on non-Dell system\n");
+   return -ENODEV;
+   }
+
dmi_walk(find_tokens, NULL);
 
if (!da_tokens)  {
-- 
2.14.1



[PATCH v6 10/14] platform/x86: dell-smbios: add filtering capability for requests

2017-10-09 Thread Mario Limonciello
There are some categories of tokens and SMBIOS calls that it makes
sense to protect userspace from accessing.  These are calls that
may write to one time use fields or activate hardware debugging
capabilities.  They are not intended for general purpose use.

This same functionality may be be later extended to also intercept
calls that may cause kernel functionality to get out of sync if
the same functions are used by other drivers.

Signed-off-by: Mario Limonciello 
---
 drivers/platform/x86/dell-smbios.c | 123 +
 drivers/platform/x86/dell-smbios.h |   2 +
 2 files changed, 125 insertions(+)

diff --git a/drivers/platform/x86/dell-smbios.c 
b/drivers/platform/x86/dell-smbios.c
index 386130811a34..3279d3f07be8 100644
--- a/drivers/platform/x86/dell-smbios.c
+++ b/drivers/platform/x86/dell-smbios.c
@@ -32,6 +32,7 @@ struct calling_interface_structure {
struct calling_interface_token tokens[];
 } __packed;
 
+static u32 da_supported_commands;
 static int da_command_address;
 static int da_command_code;
 static int da_num_tokens;
@@ -48,6 +49,54 @@ struct smbios_device {
int (*call_fn)(struct calling_interface_buffer *);
 };
 
+
+/* calls that should be blacklisted. may contain diagnostics,
+ * debugging information or are write once functions
+ */
+struct smbios_call {
+   int class;
+   int select;
+};
+
+static struct smbios_call call_blacklist[] = {
+   {01, 07},
+   {06, 05},
+   {11, 03},
+   {11, 07},
+   {11, 11},
+   {19, -1},
+};
+
+/* tokens corresponding to diagnostics or write once locations */
+struct token_range {
+   u16 exact_value;
+   u16 min;
+   u16 max;
+};
+
+static struct token_range token_blacklist[] = {
+   {0x, 0x0175, 0x0176},
+   {0x, 0x0195, 0x0197},
+   {0x, 0x01DC, 0x01DD},
+   {0x, 0x027D, 0x0284},
+   {0x02E3, 0x, 0x},
+   {0x02FF, 0x, 0x},
+   {0x, 0x0300, 0x0302},
+   {0x, 0x0325, 0x0326},
+   {0x, 0x0332, 0x0335},
+   {0x0350, 0x, 0x},
+   {0x0363, 0x, 0x},
+   {0x0368, 0x, 0x},
+   {0x, 0x03F6, 0x03F7},
+   {0x, 0x049E, 0x049F},
+   {0x, 0x04A0, 0x04A3},
+   {0x, 0x04E6, 0x04E7},
+   {0x, 0x4000, 0x7FFF},
+   {0x, 0x9000, 0x9001},
+   {0x, 0xA000, 0xBFFF},
+   {0x, 0xEFF0, 0xEFFF},
+};
+
 static LIST_HEAD(smbios_device_list);
 
 void dell_smbios_get_smm_address(int *address, int *code)
@@ -107,6 +156,72 @@ void dell_smbios_unregister_device(struct device *d)
 }
 EXPORT_SYMBOL_GPL(dell_smbios_unregister_device);
 
+int dell_smbios_call_filter(struct device *d,
+   struct calling_interface_buffer *buffer)
+{
+   int i;
+   u16 t = 0;
+
+   /* can't make calls over 30 */
+   if (buffer->class > 30) {
+   dev_dbg(d, "buffer->class too big: %d\n", buffer->class);
+   return -EINVAL;
+   }
+
+   /* supported calls on the particular system */
+   if (!(da_supported_commands & (1 << buffer->class))) {
+   dev_dbg(d, "invalid command, supported commands: 0x%8x\n",
+   da_supported_commands);
+   return -EINVAL;
+   }
+
+   /* match against call blacklist  */
+   for (i = 0; i < ARRAY_SIZE(call_blacklist); i++) {
+   if (buffer->class != call_blacklist[i].class)
+   continue;
+   if (buffer->select != call_blacklist[i].select &&
+   call_blacklist[i].select != -1)
+   continue;
+   dev_dbg(d, "blacklisted command: %d/%d\n",
+   buffer->class, buffer->select);
+   return -EINVAL;
+   }
+
+   /* if a token call, find token ID */
+   if ((buffer->class == 0 && buffer->select < 3) ||
+   (buffer->class == 1 && buffer->select < 3)) {
+   for (i = 0; i < da_num_tokens; i++) {
+   /* find the matching token ID */
+   if (da_tokens[i].location != buffer->input[0])
+   continue;
+   t = da_tokens[i].tokenID;
+   break;
+   }
+   /* not a token call */
+   } else
+   return 0;
+
+   /* token call; but token didn't exist */
+   if (!t) {
+   dev_dbg(d, "token at location %u doesn't exist\n",
+   buffer->input[0]);
+   return -EINVAL;
+   }
+   /* match against token blacklist */
+   for (i = 0; i < ARRAY_SIZE(token_blacklist); i++) {
+   if (token_blacklist[i].exact_value &&
+   t == token_blacklist[i].exact_value)
+   return -EINVAL;
+   if (!token_blacklist[i].min || !token_blacklist[i].max)
+   continue;
+   if (t >= token_blacklist[i].min && t <=

[PATCH v6 06/14] platform/x86: wmi: Don't allow drivers to get each other's GUIDs

2017-10-09 Thread Mario Limonciello
The only driver using this was dell-wmi, and it really was a hack.
The driver was getting a data attribute from another driver and this
type of action should not be encouraged.

Rather drivers that need to interact with one another should pass
data back and forth via exported functions.

Signed-off-by: Mario Limonciello 
---
 drivers/platform/x86/wmi.c | 17 -
 include/linux/wmi.h|  4 
 2 files changed, 21 deletions(-)

diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c
index 4d73a87c2ddf..bcb41c1c7f52 100644
--- a/drivers/platform/x86/wmi.c
+++ b/drivers/platform/x86/wmi.c
@@ -368,23 +368,6 @@ union acpi_object *wmidev_block_query(struct wmi_device 
*wdev, u8 instance)
 }
 EXPORT_SYMBOL_GPL(wmidev_block_query);
 
-struct wmi_device *wmidev_get_other_guid(struct wmi_device *wdev,
-const char *guid_string)
-{
-   struct wmi_block *this_wb = container_of(wdev, struct wmi_block, dev);
-   struct wmi_block *other_wb;
-
-   if (!find_guid(guid_string, &other_wb))
-   return NULL;
-
-   if (other_wb->acpi_device != this_wb->acpi_device)
-   return NULL;
-
-   get_device(&other_wb->dev.dev);
-   return &other_wb->dev;
-}
-EXPORT_SYMBOL_GPL(wmidev_get_other_guid);
-
 /**
  * wmi_set_block - Write to a WMI block
  * @guid_string: 36 char string of the form 
fa50ff2b-f2e8-45de-83fa-65417f2f49ba
diff --git a/include/linux/wmi.h b/include/linux/wmi.h
index 2cd10c3b89e9..ddee427e0721 100644
--- a/include/linux/wmi.h
+++ b/include/linux/wmi.h
@@ -36,10 +36,6 @@ extern acpi_status wmidev_evaluate_method(struct wmi_device 
*wdev,
 extern union acpi_object *wmidev_block_query(struct wmi_device *wdev,
 u8 instance);
 
-/* Gets another device on the same bus.  Caller must put_device the result. */
-extern struct wmi_device *wmidev_get_other_guid(struct wmi_device *wdev,
-   const char *guid_string);
-
 struct wmi_device_id {
const char *guid_string;
 };
-- 
2.14.1



[PATCH v6 05/14] platform/x86: dell-wmi-descriptor: split WMI descriptor into it's own driver

2017-10-09 Thread Mario Limonciello
All communication on individual GUIDs should occur in separate drivers.
Allowing a driver to communicate with the bus to another GUID is just
a hack that discourages drivers to adopt the bus model.

The information found from the WMI descriptor driver is now exported
for use by other drivers.

Signed-off-by: Mario Limonciello 
---
 MAINTAINERS|   5 +
 drivers/platform/x86/Kconfig   |   5 +
 drivers/platform/x86/Makefile  |   1 +
 drivers/platform/x86/dell-wmi-descriptor.c | 162 +
 drivers/platform/x86/dell-wmi-descriptor.h |  18 
 drivers/platform/x86/dell-wmi.c|  89 ++--
 6 files changed, 198 insertions(+), 82 deletions(-)
 create mode 100644 drivers/platform/x86/dell-wmi-descriptor.c
 create mode 100644 drivers/platform/x86/dell-wmi-descriptor.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 08b96f77f618..659dbeec4191 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4002,6 +4002,11 @@ M:   Pali Rohár 
 S: Maintained
 F: drivers/platform/x86/dell-wmi.c
 
+DELL WMI DESCRIPTOR DRIVER
+M: Mario Limonciello 
+S: Maintained
+F: drivers/platform/x86/dell-wmi-descriptor.c
+
 DELTA ST MEDIA DRIVER
 M: Hugues Fruchet 
 L: linux-me...@vger.kernel.org
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 1f7959ff055c..7722923c968c 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -121,6 +121,7 @@ config DELL_WMI
depends on DMI
depends on INPUT
depends on ACPI_VIDEO || ACPI_VIDEO = n
+   select DELL_WMI_DESCRIPTOR
select DELL_SMBIOS
select INPUT_SPARSEKMAP
---help---
@@ -129,6 +130,10 @@ config DELL_WMI
  To compile this driver as a module, choose M here: the module will
  be called dell-wmi.
 
+config DELL_WMI_DESCRIPTOR
+   tristate
+   depends on ACPI_WMI
+
 config DELL_WMI_AIO
tristate "WMI Hotkeys for Dell All-In-One series"
depends on ACPI_WMI
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 2b315d0df3b7..8636f5d3424f 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_COMPAL_LAPTOP)   += compal-laptop.o
 obj-$(CONFIG_DELL_SMBIOS)  += dell-smbios.o
 obj-$(CONFIG_DELL_LAPTOP)  += dell-laptop.o
 obj-$(CONFIG_DELL_WMI) += dell-wmi.o
+obj-$(CONFIG_DELL_WMI_DESCRIPTOR)  += dell-wmi-descriptor.o
 obj-$(CONFIG_DELL_WMI_AIO) += dell-wmi-aio.o
 obj-$(CONFIG_DELL_WMI_LED) += dell-wmi-led.o
 obj-$(CONFIG_DELL_SMO8800) += dell-smo8800.o
diff --git a/drivers/platform/x86/dell-wmi-descriptor.c 
b/drivers/platform/x86/dell-wmi-descriptor.c
new file mode 100644
index ..72e317cf0365
--- /dev/null
+++ b/drivers/platform/x86/dell-wmi-descriptor.c
@@ -0,0 +1,162 @@
+/*
+ * Dell WMI descriptor driver
+ *
+ * Copyright (C) 2017 Dell Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License version 2 as published
+ *  by the Free Software Foundation.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include "dell-wmi-descriptor.h"
+
+#define DELL_WMI_DESCRIPTOR_GUID "8D9DDCBC-A997-11DA-B012-B622A1EF5492"
+
+struct descriptor_priv {
+   struct list_head list;
+   u32 interface_version;
+   u32 size;
+};
+static LIST_HEAD(wmi_list);
+
+bool dell_wmi_get_interface_version(u32 *version)
+{
+   struct descriptor_priv *priv;
+
+   priv = list_first_entry_or_null(&wmi_list,
+   struct descriptor_priv,
+   list);
+   if (!priv)
+   return false;
+   *version = priv->interface_version;
+   return true;
+}
+EXPORT_SYMBOL_GPL(dell_wmi_get_interface_version);
+
+bool dell_wmi_get_size(u32 *size)
+{
+   struct descriptor_priv *priv;
+
+   priv = list_first_entry_or_null(&wmi_list,
+   struct descriptor_priv,
+   list);
+   if (!priv)
+   return false;
+   *size = priv->size;
+   return true;
+}
+EXPORT_SYMBOL_GPL(dell_wmi_get_size);
+
+/*
+ * Descriptor buffer is 128 byte long and contains:
+ *
+ *   Name Offset  Length  Value
+ * Vendor Signature  0   4"DELL"
+ * Object Signature  4   4" WMI"
+ * WMI Interface Version 8   4
+ * WMI buffer length12   44096 or 32768
+ */
+static int dell_wmi_descriptor_probe(struct wmi_device *wdev)
+{
+  

[PATCH v6 03/14] platform/x86: dell-wmi: clean up wmi descriptor check

2017-10-09 Thread Mario Limonciello
Some cases the wrong type was used for errors and checks can be
done more cleanly.

Signed-off-by: Mario Limonciello 
Reviewed-by: Edward O'Callaghan 
Suggested-by: Andy Shevchenko 
---
 drivers/platform/x86/dell-wmi.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/platform/x86/dell-wmi.c b/drivers/platform/x86/dell-wmi.c
index 2cfaaa8faf0a..ece2fe341f01 100644
--- a/drivers/platform/x86/dell-wmi.c
+++ b/drivers/platform/x86/dell-wmi.c
@@ -663,19 +663,19 @@ static int dell_wmi_check_descriptor_buffer(struct 
wmi_device *wdev)
 
buffer = (u32 *)obj->buffer.pointer;
 
-   if (buffer[0] != 0x4C4C4544 && buffer[1] != 0x494D5720) {
-   dev_err(&wdev->dev, "Dell descriptor buffer has invalid 
signature (%*ph)\n",
-   8, buffer);
+   if (strncmp(obj->string.pointer, "DELL WMI", 8) != 0) {
+   dev_err(&wdev->dev, "Dell descriptor buffer has invalid 
signature (%8ph)\n",
+   buffer);
ret = -EINVAL;
goto out;
}
 
if (buffer[2] != 0 && buffer[2] != 1)
-   dev_warn(&wdev->dev, "Dell descriptor buffer has unknown 
version (%d)\n",
+   dev_warn(&wdev->dev, "Dell descriptor buffer has unknown 
version (%u)\n",
buffer[2]);
 
if (buffer[3] != 4096)
-   dev_warn(&wdev->dev, "Dell descriptor buffer has invalid buffer 
length (%d)\n",
+   dev_warn(&wdev->dev, "Dell descriptor buffer has invalid buffer 
length (%u)\n",
buffer[3]);
 
priv->interface_version = buffer[2];
-- 
2.14.1



[PATCH v6 04/14] platform/x86: dell-wmi: allow 32k return size in the descriptor

2017-10-09 Thread Mario Limonciello
Some platforms this year will be adopting 32k WMI buffer, so don't
complain when encountering those.

Signed-off-by: Mario Limonciello 
---
 drivers/platform/x86/dell-wmi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/dell-wmi.c b/drivers/platform/x86/dell-wmi.c
index ece2fe341f01..c8c7f4f9326c 100644
--- a/drivers/platform/x86/dell-wmi.c
+++ b/drivers/platform/x86/dell-wmi.c
@@ -624,7 +624,7 @@ static void dell_wmi_input_destroy(struct wmi_device *wdev)
  * Vendor Signature  0   4"DELL"
  * Object Signature  4   4" WMI"
  * WMI Interface Version 8   4
- * WMI buffer length12   44096
+ * WMI buffer length12   44096 or 32768
  */
 static int dell_wmi_check_descriptor_buffer(struct wmi_device *wdev)
 {
@@ -674,7 +674,7 @@ static int dell_wmi_check_descriptor_buffer(struct 
wmi_device *wdev)
dev_warn(&wdev->dev, "Dell descriptor buffer has unknown 
version (%u)\n",
buffer[2]);
 
-   if (buffer[3] != 4096)
+   if (desc_buffer[3] != 4096 && desc_buffer[3] != 32768)
dev_warn(&wdev->dev, "Dell descriptor buffer has invalid buffer 
length (%u)\n",
buffer[3]);
 
-- 
2.14.1



Re: [PATCH v3 03/22] dt-bindings: arm: scmi: add ARM MHU specific mailbox client bindings

2017-10-09 Thread Rob Herring
On Mon, Oct 9, 2017 at 9:46 AM, Jassi Brar  wrote:
> On Mon, Oct 9, 2017 at 7:22 PM, Rob Herring  wrote:
>> On Fri, Oct 6, 2017 at 9:26 PM, Jassi Brar  wrote:
>>> On Fri, Oct 6, 2017 at 9:24 PM, Rob Herring  wrote:
 On Fri, Oct 6, 2017 at 6:01 AM, Jassi Brar  
 wrote:
> On Fri, Oct 6, 2017 at 4:50 AM, Rob Herring  wrote:
>> On Thu, Sep 28, 2017 at 02:11:27PM +0100, Sudeep Holla wrote:
>>>
>>
>>> +- mbox-data : For each phandle listed in mboxes property, an unsigned 
>>> 32-bit
>>> +   data as expected by the mailbox controller
>>
>> Shouldn't that be cells as part of mboxes property?
>>
> A MHU client can send any number of commands (such u32 values) over a 
> channel.
> This client (SCMI) sends just one command over a channel, but other
> clients may/do send two or more.
>>
>> The above definition doesn't support 2 or more as it is 1-1 with channels.
>>
> I thought you suggested to make controller driver accept the command
> as another cell in client's mboxes property.
> Which we can't do.

Yes, agreed. But I'm wondering since a client may need more than one,
how would that be expressed?

 Okay, then I guess I don't understand why this is in DT.

>>> Yeah the client has to provide code (u32 value) for the commands it
>>> sends, and that value is going to be platform specific. For example,
>>> on Juno the ITS_AN_SCMI_COMMAND may be defined as BIT(7) while on my
>>> platform it may be 0x4567
>>>
>>> For MHU based platforms, it becomes easy if the u32 is passed from DT.
>>> And that should be ok since that is like a h/w parameter - a value
>>> chosen/expected by the remote firmware.
>>
>> Could it ever be more than 1 cell?
>>
> SCMI sends sub-commands via SHMEM, so it is always going to be 1cell for 
> _scmi_.
> However many firmwares are unlikely to use just one command over a
> channel - say, the protocol is trivial or the linux and remote have no
> SHMEM.

I'd hope the normal case is not enumerating commands and sub-commands
in DT and this is special case of a "generic" protocol with platform
specific aspects. It could be solved with a specific compatible for
each platform/implementation. We'll probably regret not doing that,
but I'm going to pretend that this time SoC vendors won't mess it up.

>> I guess being in DT is fine, but I'm still not sure about the naming.
>> The current name suggests it is part of the mbox binding. Do we want
>> that or should it be SCMI specific? Then "data" is vague. Perhaps
>> "scmi-commands"?
>>
> Sure. I have no problem with whatever we wanna call it.

Okay. That should have an "arm" prefix too BTW.

Rob


Re: [PATCH] bitfield: Use __ffs64(x) to fix missing __ffsdi2()

2017-10-09 Thread kbuild test robot
Hi Geert,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.14-rc4]
[cannot apply to next-20171009]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Geert-Uytterhoeven/bitfield-Use-__ffs64-x-to-fix-missing-__ffsdi2/20171010-050009
config: x86_64-allyesdebian (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

   In file included from arch/x86/include/asm/current.h:4:0,
from include/linux/sched.h:11,
from drivers/net/wireless/intel/iwlwifi/pcie/rx.c:31:
   In function 'iwl_pcie_rx_mq_hw_init',
   inlined from 'iwl_pcie_rx_init' at 
drivers/net/wireless/intel/iwlwifi/pcie/rx.c:976:3:
>> include/linux/compiler.h:576:38: error: call to '__compiletime_assert_857' 
>> declared with attribute error: BUILD_BUG_ON failed: (((0xF00) + (1ULL << 
>> __ffs64(0xF00))) & (((0xF00) + (1ULL << __ffs64(0xF00))) - 1)) != 0
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^
   include/linux/compiler.h:556:4: note: in definition of macro 
'__compiletime_assert'
   prefix ## suffix();\
   ^~
   include/linux/compiler.h:576:2: note: in expansion of macro 
'_compiletime_assert'
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^~~
   include/linux/build_bug.h:46:37: note: in expansion of macro 
'compiletime_assert'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~
   include/linux/build_bug.h:70:2: note: in expansion of macro 
'BUILD_BUG_ON_MSG'
 BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
 ^~~~
   include/linux/build_bug.h:19:2: note: in expansion of macro 'BUILD_BUG_ON'
 BUILD_BUG_ON(((n) & ((n) - 1)) != 0)
 ^~~~
   include/linux/bitfield.h:62:3: note: in expansion of macro 
'__BUILD_BUG_ON_NOT_POWER_OF_2'
  __BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +   \
  ^
   include/linux/bitfield.h:89:3: note: in expansion of macro '__BF_FIELD_CHECK'
  __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: "); \
  ^~~~
>> drivers/net/wireless/intel/iwlwifi/iwl-fh.h:487:34: note: in expansion of 
>> macro 'FIELD_PREP'
#define RFH_GEN_CFG_VAL(_n, _v)  FIELD_PREP(RFH_GEN_CFG_ ## _n, _v)
 ^~
>> drivers/net/wireless/intel/iwlwifi/pcie/rx.c:857:11: note: in expansion of 
>> macro 'RFH_GEN_CFG_VAL'
  RFH_GEN_CFG_VAL(DEFAULT_RXQ_NUM, 0) |
  ^~~
>> include/linux/compiler.h:576:38: error: call to '__compiletime_assert_862' 
>> declared with attribute error: BUILD_BUG_ON failed: 1UL << (4))) + (1ULL 
>> << __ffs64((1UL << (4) & 1UL << (4))) + (1ULL << __ffs64((1UL << 
>> (4) - 1)) != 0
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^
   include/linux/compiler.h:556:4: note: in definition of macro 
'__compiletime_assert'
   prefix ## suffix();\
   ^~
   include/linux/compiler.h:576:2: note: in expansion of macro 
'_compiletime_assert'
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^~~
   include/linux/build_bug.h:46:37: note: in expansion of macro 
'compiletime_assert'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~
   include/linux/build_bug.h:70:2: note: in expansion of macro 
'BUILD_BUG_ON_MSG'
 BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
 ^~~~
   include/linux/build_bug.h:19:2: note: in expansion of macro 'BUILD_BUG_ON'
 BUILD_BUG_ON(((n) & ((n) - 1)) != 0)
 ^~~~
   include/linux/bitfield.h:62:3: note: in expansion of macro 
'__BUILD_BUG_ON_NOT_POWER_OF_2'
  __BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +   \
  ^
   include/linux/bitfield.h:89:3: note: in expansion of macro '__BF_FIELD_CHECK'
  __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: "); \
  ^~~~
>> drivers/net/wireless/intel/iwlwifi/iwl-fh.h:487:34: note: in expansion of 
>> macro 'FIELD_PREP'
#define RFH_GEN_CFG_VAL(_n, _v)  FIELD_PREP(RFH_GEN_CF

Re: [PATCH] cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan

2017-10-09 Thread David Miller
From: Aleksander Morgado 
Date: Mon,  9 Oct 2017 14:05:12 +0200

> The u-blox TOBY-L2 is a LTE Cat 4 module with HSPA+ and 2G fallback.
> This module allows switching to different USB profiles with the
> 'AT+UUSBCONF' command, and provides a ECM network interface when the
> 'AT+UUSBCONF=2' profile is selected.
> 
> The u-blox SARA-U2 is a HSPA module with 2G fallback. The default USB
> configuration includes a ECM network interface.
> 
> Both these modules are controlled via AT commands through one of the
> TTYs exposed. Connecting these modules may be done just by activating
> the desired PDP context with 'AT+CGACT=1,' and then running DHCP
> on the ECM interface.
> 
> Signed-off-by: Aleksander Morgado 

Applied, thank you.


Re: linux-next: manual merge of the cgroup tree with the net-next tree

2017-10-09 Thread Alexei Starovoitov
On Mon, Oct 09, 2017 at 07:38:36PM +0100, Mark Brown wrote:
> Hi Tejun,
> 
> Today's linux-next merge of the cgroup tree got a conflict in:
> 
>   kernel/cgroup/cgroup.c
> 
> between commit:
> 
>   324bda9e6c5ad ("bpf: multi program support for cgroup+bpf")
> 
> from the net-next tree and commit:
> 
>   041cd640b2f3c ("cgroup: Implement cgroup2 basic CPU usage accounting")
> 
> from the cgroup tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> diff --cc kernel/cgroup/cgroup.c
> index 00f5b358aeac,c3421ee0d230..
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@@ -4765,8 -4785,9 +4788,11 @@@ static struct cgroup *cgroup_create(str
>   
>   return cgrp;
>   
>  +out_idr_free:
>  +cgroup_idr_remove(&root->cgroup_idr, cgrp->id);
> + out_stat_exit:
> + if (cgroup_on_dfl(parent))
> + cgroup_stat_exit(cgrp);

thanks. I did the same merge conflict resolution for our combined tree.



Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.

2017-10-09 Thread Alexei Starovoitov
On Mon, Oct 09, 2017 at 08:26:34PM +, Levin, Alexander (Sasha Levin) wrote:
> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
> >> Fix BUG() calls to use BUG_ON(conditional) macros.
> >>
> >> This was found using make coccicheck M=net/core on linux next
> >> tag next-2017092
> >>
> >> Signed-off-by: Tim Hansen 
> >> ---
> >>  net/core/skbuff.c | 15 ++-
> >>  1 file changed, 6 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
> >> --- a/net/core/skbuff.c
> >> +++ b/net/core/skbuff.c
> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, 
> >> gfp_t gfp_mask)
> >>/* Set the tail pointer and length */
> >>skb_put(n, skb->len);
> >>
> >> -  if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
> >> -  BUG();
> >> +  BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
> >
> >I'm concerned with this change.
> >1. Calling non-trivial bit of code inside the macro is a poor coding style 
> >(imo)
> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and 
> >implementation
> >of BUG and BUG_ON look quite different.
> 
> For these archs, wouldn't it then be more efficient to use BUG_ON rather than 
> BUG()?

why more efficient? any data to prove that?
I'm pointing that the change is not equivalent and
this code has been around forever (pre-git days), so I see
no reason to risk changing it.



Re: [PATCH 02/13] EDAC, altera: kill off ACCESS_ONCE()

2017-10-09 Thread Thor Thayer

On 10/09/2017 01:28 PM, Mark Rutland wrote:

For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't currently harmful.

However, for some features it is necessary to instrument reads and
writes separately, which is not possible with ACCESS_ONCE(). This
distinction is critical to correct operation.

It's possible to transform the bulk of kernel code using the Coccinelle
script below. However, this doesn't handle comments, leaving references
to ACCESS_ONCE() instances which have been removed. As a preparatory
step, this patch converts the Altera EDAC code and comments to use
{READ,WRITE}_ONCE() consistently.


virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)


Signed-off-by: Mark Rutland 
Cc: Borislav Petkov 
Cc: Thor Thayer 
---
  drivers/edac/altera_edac.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c
index 346c498..11d6419 100644
--- a/drivers/edac/altera_edac.c
+++ b/drivers/edac/altera_edac.c
@@ -175,11 +175,11 @@ static ssize_t altr_sdr_mc_err_inject_write(struct file 
*file,
/*
 * To trigger the error, we need to read the data back
 * (the data was written with errors above).
-* The ACCESS_ONCE macros and printk are used to prevent the
+* The READ_ONCE macros and printk are used to prevent the
 * the compiler optimizing these reads out.
 */
-   reg = ACCESS_ONCE(ptemp[0]);
-   read_reg = ACCESS_ONCE(ptemp[1]);
+   reg = READ_ONCE(ptemp[0]);
+   read_reg = READ_ONCE(ptemp[1]);
/* Force Read */
rmb();
  
@@ -618,7 +618,7 @@ static ssize_t altr_edac_device_trig(struct file *file,

for (i = 0; i < (priv->trig_alloc_sz / sizeof(*ptemp)); i++) {
/* Read data so we're in the correct state */
rmb();
-   if (ACCESS_ONCE(ptemp[i]))
+   if (READ_ONCE(ptemp[i]))
result = -1;
/* Toggle Error bit (it is latched), leave ECC enabled */
writel(error_mask, (drvdata->base + priv->set_err_ofst));
@@ -635,7 +635,7 @@ static ssize_t altr_edac_device_trig(struct file *file,
  
  	/* Read out written data. ECC error caused here */

for (i = 0; i < ALTR_TRIGGER_READ_WRD_CNT; i++)
-   if (ACCESS_ONCE(ptemp[i]) != i)
+   if (READ_ONCE(ptemp[i]) != i)
edac_printk(KERN_ERR, EDAC_DEVICE,
"Read doesn't match written data\n");
  



Tested on Cyclone5 and Arria10. Thanks!
Acked-by: Thor Thayer 


[GIT] Networking

2017-10-09 Thread David Miller

1) Fix object leak on IPSEC offload failure, from Steffen Klassert.

2) Fix range checks in ipset address range addition operations,
   from Jozsef Kadlecsik.

3) Fix pernet ops unregistration order in ipset, from Florian
   Westphal.

4) Add missing netlink attribute policy for nl80211 packet pattern
   attrs, from Peng Xu.

5) Fix PPP device destruction race, from Guillaume Nault.

6) Write marks get lost when BPF verifier processes R1=R2 register
   assignments, causing incorrect liveness information and less
   state pruning.  Fix from Alexei Starovoitov.

7) Fix blockhole routes so that they are marked dead and therefore
   not cached in sockets, otherwise IPSEC stops working.  From
   Steffen Klassert.

8) Fix broadcast handling of UDP socket early demux, from Paolo
   Abeni.

Please pull, thanks a lot!

The following changes since commit 7a92616c0bac849e790283723b36c399668a1d9f:

  Merge tag 'pm-4.14-rc4' of 
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm (2017-10-05 
15:51:37 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to fdfbad3256918fc5736d68384331d2dbf45ccbd6:

  cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan (2017-10-09 16:03:32 
-0700)


Aleksander Morgado (1):
  cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan

Alexei Starovoitov (1):
  bpf: fix liveness marking

Alexey Kodanev (2):
  vti: fix NULL dereference in xfrm_input()
  gso: fix payload length when gso_size is zero

Artem Savkov (2):
  xfrm: don't call xfrm_policy_cache_flush under xfrm_state_lock
  netfilter: ebtables: fix race condition in frame_filter_net_init()

Arvind Yadav (1):
  netfilter: nf_tables: Release memory obtained by kasprintf

Axel Beckert (1):
  doc: Fix typo "8023.ad" in bonding documentation

Dan Carpenter (1):
  selftests/net: rxtimestamp: Fix an off by one

David S. Miller (4):
  Merge branch 'master' of git://git.kernel.org/.../klassert/ipsec
  Merge tag 'mac80211-for-davem-2017-10-09' of 
git://git.kernel.org/.../jberg/mac80211
  Merge branch '10GbE' of git://git.kernel.org/.../jkirsher/net-queue
  Merge git://git.kernel.org/.../pablo/nf

Ding Tianhong (2):
  Revert commit 1a8b6d76dc5b ("net:add one common config...")
  net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

Eric Dumazet (1):
  netfilter: x_tables: avoid stack-out-of-bounds read in 
xt_copy_counters_from_user

Florian Westphal (1):
  netfilter: ipset: pernet ops must be unregistered last

Guillaume Nault (1):
  ppp: fix race in ppp device destruction

Gustavo A. R. Silva (1):
  net: thunderx: mark expected switch fall-throughs in nicvf_main()

Ido Schimmel (1):
  mlxsw: spectrum_router: Avoid expensive lookup during route removal

Jason A. Donenfeld (1):
  netlink: do not set cb_running if dump's start() errs

JingPiao Chen (1):
  netfilter: nf_tables: fix update chain error

John Fastabend (1):
  ixgbe: incorrect XDP ring accounting in ethtool tx_frame param

Jon Maloy (2):
  tipc: correct initialization of skb list
  tipc: Unclone message at secondary destination lookup

Jozsef Kadlecsik (1):
  netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 
addresses

Lin Zhang (1):
  netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook

Mark D Rustad (1):
  ixgbe: Return error when getting PHY address if PHY access is not 
supported

Matteo Croce (1):
  ipv6: fix net.ipv6.conf.all.accept_dad behaviour for real

Pablo Neira Ayuso (1):
  netfilter: nf_tables: do not dump chain counters if not enabled

Paolo Abeni (1):
  udp: fix bcast packet reception

Peng Xu (1):
  nl80211: Define policy for packet pattern attributes

Ross Lagerwall (1):
  netfilter: ipset: Fix race between dump and swap

Sabrina Dubroca (1):
  ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register

Shmulik Ladkani (1):
  netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'

Steffen Klassert (4):
  xfrm: Fix deletion of offloaded SAs on failure.
  xfrm: Fix negative device refcount on offload failure.
  ipv6: Fix traffic triggered IPsec connections.
  ipv4: Fix traffic triggered IPsec connections.

Subash Abhinov Kasiviswanathan (1):
  netfilter: xt_socket: Restore mark from full sockets only

Vadim Fedorenko (1):
  netfilter: ipvs: full-functionality option for ECN encapsulation in tunnel

 Documentation/networking/bonding.txt  |  2 +-
 arch/Kconfig  |  3 ---
 arch/sparc/Kconfig|  1 -
 drivers/net/ethernet/cavium/thunder/nicvf_main.c  |  2 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c| 22 
--
 drivers/net/et

Re: [PATCH tip/core/rcu 1/9] rcu: Provide GP ordering in face of migrations and delays

2017-10-09 Thread Paul E. McKenney
On Mon, Oct 09, 2017 at 10:16:37AM +0200, Peter Zijlstra wrote:
> On Sat, Oct 07, 2017 at 11:28:57AM -0700, Paul E. McKenney wrote:
> > But if you are saying that it would be good to have wait_for_completion()
> > and complete() directly modeled at some point, no argument.  In addition,
> > I hope that the memory model is applied to other tools that analyze kernel
> > code.
> 
> > > I'm not sure I got the point across; so I'll try once more. Without
> > > providing this ordering the completion would be fundamentally broken. It
> > > _must_ provide this ordering.
> > 
> > OK, I now understand what you are getting at, and I do very much like
> > that guarantee.
> 
> Right, so maybe we should update the completion comments a bit to call
> out this property, because I'm not sure its there.

That would be very good!

> Also, with this, I think the smp_store_release() / smp_load_acquire() is
> a perfectly fine abstraction of it, I don't think the model needs to be
> taught about the completion interface.

Agreed, if we pull too much stuff into the model it might well become
difficult to maintain in and of itself.  So we do need to choose
carefully based on model performance and ease of use.

For a performance example, note that locking can be easily modeled with
xchg_acquire() for lock acquisition and smp_store_release() for lock
release.  But here is the resulting performance compared to directly
modeling locking within the cat code:

xchg_acquire() and
# Processes smp_store_release() Direct modeling

2   0.061 s 0.007 s

3   3.542 s 0.039 s

4 569.189 s 0.397 s

5   > 113,853.000 s 5.187 s

Combinatorial explosion is a real pain...  The problem is that the
xchg_acquire() / smp_store_release() approach requires the herd7 tool
to consider and then reject all the scenarios where the lock critical
sections overlap.  In contrast, when direct modeling, those overlapping
critical-section scenarios are implicitly rejected by the cat-code that
models locking.  Don't get me wrong, there is still combinatorial
explosion, as you can see from the rightmost column of numbers.
After all, the herd7 tool still needs to consider all possible
lock-acquisition orders.  But the explosion is much less explosive.  ;-)

Given that we expect locking to be very heavily used, it makes a lot
of sense to directly model it, which we (mostly Alan, Andrea, and Luc)
have done.  Direct modeling of RCU gets similar speedups over emulation,
plus the fact that accurate emulation of RCU requires some surprisingly
unnatural acts.  ("Ghost" accesses unaffected by normal memory-ordering
primitives, and "ghost-buster memory barriers" that affect both "ghost"
and normal accesses.  Plus the transformation is complex, so much so
that I broke down and wrote a script, which was used to validate the
early versions of the RCU model.)

In constrast, I have no reason to believe that direct modeling could make
wait_for_completion() / complete() go any faster than the equivalent
setup using smp_store_release() / smp_load_acquire(), that is, I don't
see anything that could be optimized away byt cat-code.  So the only
reason to directly model wait_for_completion() / complete() would be
to get rid of the need to add the extra term to the "exists" clause.
And agreed, we should wait until people are actually being significantly
inconvenienced by this, which I do not expect any time soon.

> > > Why not? In what sort of cases does it go wobbly?
> > 
> > For one, when it conflicts with maintainability.  For example, it would
> > probably be OK for some of RCU's rcu_node ->lock acquisitions to skip the
> > smp_mb__after_unlock_lock() invocations.  But those are slowpaths, and the
> > small speedup on only one architecture is just not worth the added pain.
> > Especially given the nice wrapper functions that you provided.
> > 
> > But of course if this were instead (say) rcu_read_lock() or common-case
> > rcu_read_unlock(), I would be willing to undergo much more pain.  On the
> > other hand, for that exact reason, that common-case code path doesn't
> > acquire locks in the first place.  ;-)
> 
> Ah, so for models I would go absolutely minimal; it helps understand
> what the strict requirements are and where we over-provide etc..

Agreed, we should be careful how much we promise lest we get locked
into needlessly heavyweight implementations.

> For actual code you're entirely right, there's no point in trying to be
> cute with the rcu-node locks. Simplicity rules etc..

Hear, hear!  ;-)

Thanx, Paul



Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.

2017-10-09 Thread Levin, Alexander (Sasha Levin)
On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote:
>On Mon, Oct 09, 2017 at 08:26:34PM +, Levin, Alexander (Sasha Levin) wrote:
>> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
>> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
>> >> Fix BUG() calls to use BUG_ON(conditional) macros.
>> >>
>> >> This was found using make coccicheck M=net/core on linux next
>> >> tag next-2017092
>> >>
>> >> Signed-off-by: Tim Hansen 
>> >> ---
>> >>  net/core/skbuff.c | 15 ++-
>> >>  1 file changed, 6 insertions(+), 9 deletions(-)
>> >>
>> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
>> >> --- a/net/core/skbuff.c
>> >> +++ b/net/core/skbuff.c
>> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, 
>> >> gfp_t gfp_mask)
>> >>   /* Set the tail pointer and length */
>> >>   skb_put(n, skb->len);
>> >>
>> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
>> >> - BUG();
>> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
>> >
>> >I'm concerned with this change.
>> >1. Calling non-trivial bit of code inside the macro is a poor coding style 
>> >(imo)
>> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and 
>> >implementation
>> >of BUG and BUG_ON look quite different.
>>
>> For these archs, wouldn't it then be more efficient to use BUG_ON rather 
>> than BUG()?
>
>why more efficient? any data to prove that?

Just guessing.

Either way, is there a particular reason for not using BUG_ON() here
besides that it's implementation is "quite different"?

>I'm pointing that the change is not equivalent and
>this code has been around forever (pre-git days), so I see
>no reason to risk changing it.

Do you know that BUG_ON() is broken on any archs?

If not, "this code has been around forever" is really not an excuse to
not touch code.

If BUG_ON() behavior is broken somewhere, then it needs to get fixed.

-- 

Thanks,
Sasha

Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.

2017-10-09 Thread David Miller
From: Alexei Starovoitov 
Date: Mon, 9 Oct 2017 16:06:20 -0700

>> For these archs, wouldn't it then be more efficient to use BUG_ON
>> rather than BUG()?
> 
> why more efficient? any data to prove that?

It can completely eliminate a branch.

For example on powerpc if you use BUG() then the code generated is:

test condition
branch_not_true 1f
unconditional_trap
1:

Whereas with BUG_ON() it's just:

test condition
trap_if_true

Which is a lot better even when the branches in the first case are
well predicted.


Re: [PATCH v4 1/2] pid: Replace pid bitmap implementation with IDR API

2017-10-09 Thread Andrew Morton
On Mon,  9 Oct 2017 17:13:43 -0400 Gargi Sharma  wrote:

> This patch replaces the current bitmap implemetation for
> Process ID allocation. Functions that are no longer required,
> for example, free_pidmap(), alloc_pidmap(), etc. are removed.
> The rest of the functions are modified to use the IDR API.
> The change was made to make the PID allocation less complex by
> replacing custom code with calls to generic API.
> 

I can't say I really understand the locking here.

>
> ...
>
> @@ -240,17 +230,11 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
>*
>*/
>   read_lock(&tasklist_lock);
> - nr = next_pidmap(pid_ns, 1);
> - while (nr > 0) {
> - rcu_read_lock();
> -
> - task = pid_task(find_vpid(nr), PIDTYPE_PID);
> + nr = 2;
> + idr_for_each_entry_continue(&pid_ns->idr, pid, nr) {
> + task = pid_task(pid, PIDTYPE_PID);
>   if (task && !__fatal_signal_pending(task))
>   send_sig_info(SIGKILL, SEND_SIG_FORCED, task);
> -
> - rcu_read_unlock();
> -
> - nr = next_pidmap(pid_ns, nr);
>   }
>   read_unlock(&tasklist_lock);

Especially here.  I don't think pidmap_lock is held.  Is that IDR
iteration safe?




Re: linux-next: manual merge of the drivers-x86 tree with the net-next tree

2017-10-09 Thread Darren Hart
On Mon, Oct 09, 2017 at 08:56:34PM +0100, Mark Brown wrote:
> On Mon, Oct 09, 2017 at 10:43:01PM +0300, Mika Westerberg wrote:
> 
> > If possible, I would rather move this chapter to be before "Networking
> > over Thunderbolt cable". Reason is that it then follows NVM flashing
> > chapter which is typically where you need to force power in the first
> > place.
> 

Agreed.

> I guess that's something best sorted out either in the relevant trees or
> during the merge window?

I'm not sure how we would deal with it in the trees. Best to note this during
the merge window - whichever goes in second. Test merge will identify the merge
conflict, and we can include a note to Linus on the preference.

-- 
Darren Hart
VMware Open Source Technology Center


Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.

2017-10-09 Thread Alexei Starovoitov
On Mon, Oct 09, 2017 at 11:15:40PM +, Levin, Alexander (Sasha Levin) wrote:
> On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote:
> >On Mon, Oct 09, 2017 at 08:26:34PM +, Levin, Alexander (Sasha Levin) 
> >wrote:
> >> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
> >> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
> >> >> Fix BUG() calls to use BUG_ON(conditional) macros.
> >> >>
> >> >> This was found using make coccicheck M=net/core on linux next
> >> >> tag next-2017092
> >> >>
> >> >> Signed-off-by: Tim Hansen 
> >> >> ---
> >> >>  net/core/skbuff.c | 15 ++-
> >> >>  1 file changed, 6 insertions(+), 9 deletions(-)
> >> >>
> >> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
> >> >> --- a/net/core/skbuff.c
> >> >> +++ b/net/core/skbuff.c
> >> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff 
> >> >> *skb, gfp_t gfp_mask)
> >> >> /* Set the tail pointer and length */
> >> >> skb_put(n, skb->len);
> >> >>
> >> >> -   if (skb_copy_bits(skb, -headerlen, n->head, headerlen + 
> >> >> skb->len))
> >> >> -   BUG();
> >> >> +   BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + 
> >> >> skb->len));
> >> >
> >> >I'm concerned with this change.
> >> >1. Calling non-trivial bit of code inside the macro is a poor coding 
> >> >style (imo)
> >> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and 
> >> >implementation
> >> >of BUG and BUG_ON look quite different.
> >>
> >> For these archs, wouldn't it then be more efficient to use BUG_ON rather 
> >> than BUG()?
> >
> >why more efficient? any data to prove that?
> 
> Just guessing.
> 
> Either way, is there a particular reason for not using BUG_ON() here
> besides that it's implementation is "quite different"?
> 
> >I'm pointing that the change is not equivalent and
> >this code has been around forever (pre-git days), so I see
> >no reason to risk changing it.
> 
> Do you know that BUG_ON() is broken on any archs?
> 
> If not, "this code has been around forever" is really not an excuse to
> not touch code.
> 
> If BUG_ON() behavior is broken somewhere, then it needs to get fixed.

no idea whether it's broken. My main objection is #1.
imo it's a very poor coding style to put functions with
side-effects into macros. Especially debug/bug/warn-like.
For example llvm has DEBUG() macro and everything inside
will disappear depending on compilation flags.
I wouldn't be surprised if somebody for the name
of security (to avoid crash on BUG_ON) will replace
BUG/BUG_ON with some other implementation or nop
and will have real bugs, since skb_copy_bits() is somehow
not called or called in different context.



Re: [locking] 892ad5acca [ 12.849409] WARNING: bad unlock balance detected!

2017-10-09 Thread Linus Torvalds
On Mon, Oct 9, 2017 at 8:55 AM, Fengguang Wu  wrote:
>
> 0day kernel testing robot got the below dmesg and the first bad commit is

This makes no sense.

> [   12.849409] WARNING: bad unlock balance detected!
> [   12.850157] 4.12.0-00420-g892ad5a #1 Not tainted
> [   12.850870] -
> [   12.851574] rhashtable_thra/109 is trying to release lock 
> (rcu_preempt_state) at:
> [   12.852701] [] rcu_read_unlock_special+0x3de/0x4a7
> [   12.853707] but there are no more locks to release!

The only RCU release is in the inlined rhashtable_lookup_fast(), but
that one looks trivially and obviously correct. It literally just does

rcu_read_lock();
obj = rhashtable_lookup(ht, key, params);
rcu_read_unlock();

> [   12.864927]  rcu_read_unlock_special+0x3de/0x4a7
> [   12.865660]  __rcu_read_unlock+0x62/0x9f
> [   12.866262]  thread_lookup_test+0x400/0x50b
> [   12.866911]  threadfunc+0x84f/0x8e0

.. so I'm wondering if there's something wrong in the test harness, or
some *serious* issue that causes RCU state to do the wrong thing over
preemption or with unlucky interrupt timing.

Because that thread_lookup_test() thing really is so trivial as to say
the problem is not in the caller.

Adding Paul to the cc list. Was there some known problem during the
4.13 merge window that could have caused this?

The fact that it came in with the locking merge might also just mean
that it's a bug in the locking balance detection, presumably
interacting badly with something else.

There *is* another email (from Sebastian Andrzej Siewior) that shows
that same "WARNING: bad unlock balance detected", with the subject
line "RCU boosting and lockdep are not playing nice", and that one
blames commit cde50a67397c ("locking/rtmutex: Don't initialize lockdep
when not required").

Maybe this was already fixed elsewhere, but I'm not seeing any replies
to that other email either (it was send back in mid-August).

Oh, and I see that the 0day robot also fingered that commit
cde50a67397c in _another_ report, so it's just this one that seems to
have oddly bisected to the locking merge. Presumably it's all the same
thing, just in two different guises.

Linus


Re: [PATCH 2/6] ARM: sun8i: r40: add USB host port nodes for R40

2017-10-09 Thread Icenowy Zheng


于 2017年10月10日 GMT+08:00 上午5:03:40, Maxime Ripard 
 写到:
>On Sun, Oct 08, 2017 at 04:29:02AM +, Icenowy Zheng wrote:
>> From: Icenowy Zheng 
>> 
>> Allwinner R40 SoC features a USB OTG port and two USB HOST ports.
>> 
>> Add support for the host ports in the DTSI file.
>> 
>> The OTG controller still cannot work with existing compatibles, and
>needs
>> more investigation. So it's not added yet.
>> 
>> Signed-off-by: Icenowy Zheng 
>> ---
>>  arch/arm/boot/dts/sun8i-r40.dtsi | 78
>
>>  1 file changed, 78 insertions(+)
>> 
>> diff --git a/arch/arm/boot/dts/sun8i-r40.dtsi
>b/arch/arm/boot/dts/sun8i-r40.dtsi
>> index d5a6745409ae..f6c917cbbaac 100644
>> --- a/arch/arm/boot/dts/sun8i-r40.dtsi
>> +++ b/arch/arm/boot/dts/sun8i-r40.dtsi
>> @@ -173,6 +173,84 @@
>>  #size-cells = <0>;
>>  };
>>  
>> +usbphy: phy@1c13400 {
>> +compatible = "allwinner,sun8i-r40-usb-phy";
>> +reg = <0x01c13400 0x14>,
>> +  <0x01c14800 0x4>,
>> +  <0x01c19800 0x4>,
>> +  <0x01c1c800 0x4>;
>> +reg-names = "phy_ctrl",
>> +"pmu0",
>> +"pmu1",
>> +"pmu2";
>> +clocks = <&ccu CLK_USB_PHY0>,
>> + <&ccu CLK_USB_PHY1>,
>> + <&ccu CLK_USB_PHY2>;
>> +clock-names = "usb0_phy",
>> +  "usb1_phy",
>> +  "usb2_phy";
>> +resets = <&ccu RST_USB_PHY0>,
>> + <&ccu RST_USB_PHY1>,
>> + <&ccu RST_USB_PHY2>;
>> +reset-names = "usb0_reset",
>> +  "usb1_reset",
>> +  "usb2_reset";
>> +status = "disabled";
>> +#phy-cells = <1>;
>> +};
>> +
>> +ehci1: usb@1c19000 {
>> +compatible = "allwinner,sun8i-r40-ehci", "generic-ehci";
>> +reg = <0x01c19000 0x100>;
>
>What is the actual size here?

The OHCI/EHCI/PHY-PHY three parts are listed in the user manual
as one MMIO zone.

The size can be at most 0x400, as the OHCI is at offset 0x400.

>
>> +interrupts = ;
>> +clocks = <&ccu CLK_BUS_OHCI1>,
>> + <&ccu CLK_BUS_EHCI1>,
>> + <&ccu CLK_USB_OHCI1>;
>> +resets = <&ccu RST_BUS_OHCI1>,
>> + <&ccu RST_BUS_EHCI1>;
>
>Why do you need to take the OHCI resources too?

AW's strange design -- without OHCI resources taken EHCI
won't work.

>
>Maxime


Re: [PATCH 3/6] ARM: sun8i: r40: add 5V regulator for Banana Pi M2 Ultra

2017-10-09 Thread Icenowy Zheng


于 2017年10月10日 GMT+08:00 上午5:04:07, Maxime Ripard 
 写到:
>On Sun, Oct 08, 2017 at 04:29:03AM +, Icenowy Zheng wrote:
>> On newer revisions of the Banana Pi M2 Ultra boards, the 5V power
>output
>> (used by HDMI, SATA and USB) is controller via a GPIO.
>> 
>> Add the regulator node for it.
>> 
>> Older revisions just have the 5V power output always on, and the GPIO
>is
>> reserved on these boards. So it won't affect the older revisions.
>> 
>> Signed-off-by: Icenowy Zheng 
>> ---
>>  arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts | 9 +
>>  1 file changed, 9 insertions(+)
>> 
>> diff --git a/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
>b/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
>> index 7b52608cebe6..035599d870b9 100644
>> --- a/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
>> +++ b/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
>> @@ -78,6 +78,15 @@
>>  };
>>  };
>>  
>> +reg_vcc5v0: vcc5v0 {
>> +compatible = "regulator-fixed";
>> +regulator-name = "vcc5v0";
>> +regulator-min-microvolt = <500>;
>> +regulator-max-microvolt = <500>;
>> +gpio = <&pio 7 23 GPIO_ACTIVE_HIGH>; /* PH23 */
>> +enable-active-high;
>
>This is redundant with the GPIO flag

I remember someone suggested me to add this
property (maybe wens). It's said that the driver may not
process the GPIO_ACTIVE_HIGH correctly.

>
>Maxime


  1   2   3   4   5   6   7   8   9   10   >