date:20160214

Re: [PATCH 1/3] tpm: Hold the kref during tpm_chip_find_get

2016-02-14 Thread Jarkko Sakkinen

On Sat, Feb 13, 2016 at 11:50:08PM -0700, Jason Gunthorpe wrote:
> On Sun, Feb 14, 2016 at 06:55:12AM +0200, Jarkko Sakkinen wrote:
> > On Fri, Feb 12, 2016 at 05:04:29PM -0700, Jason Gunthorpe wrote:
> > > This was missed during the struct device conversion, we
> > > need to hold a kref on the chip to make sure it isn't freed.
> > > 
> > > Signed-off-by: Jason Gunthorpe 
> > 
> > I'm a bit confused about this patch. What is the regression if this
> > needs
> 
> The patch is simply totally broken, the placement of the get_device is
> wrong:
> 
> > > @@ -53,6 +53,8 @@ struct tpm_chip *tpm_chip_find_get(int chip_num)
> > >   chip = pos;
> > >   break;
> > >   }
> > > +
> > > + get_device(&chip->dev);
> 
> It needs to be moved up two lines before the break, into the if
> statement.

Right.

> As for the urgency - today the tpm core relies on module locking to
> try and prevent tpm_chip_unregister from racing with stuff like the
> above. That is totally broken in modern kernels, but it is what the
> core tries to do. Within that framework the get/put are not needed
> because of the module locking.

Right, because that guarantees that the device has a refcount of
at least one.

> The only time these additional get/put do anything is when we are
> racing with tpm_unregister, but if we are racing with unregister then
> there are much bigger problems and things will crash anyhow.
> 
> So, this patch is just a tiny step.
> 
> The revised version of this patch with the rw_sem attempts to address
> the complete race.

Got it. Yeah, I'll drop this from my next pull request. Thanks for
the explanation.

> Jason

/Jarkko
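
To illustrate the placement issue discussed in this thread, here is a small userspace sketch. It is not the TPM code: `struct sketch_chip`, `sketch_get()` and `sketch_find_get()` are hypothetical stand-ins for `struct tpm_chip`, `get_device()` and `tpm_chip_find_get()`. It assumes the reason for moving the call into the `if` is that the chip pointer can still be NULL after the loop when nothing matched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct tpm_chip; refs models the device refcount. */
struct sketch_chip {
	int dev_num;
	int refs;
};

/* Stand-in for get_device(): must never be called with a NULL chip. */
static void sketch_get(struct sketch_chip *chip)
{
	assert(chip != NULL);
	chip->refs++;
}

/*
 * Look up a chip by number and return it with a reference held.
 * The reference is taken inside the match branch, before the break,
 * so a failed lookup never reaches sketch_get() with a NULL pointer.
 */
static struct sketch_chip *sketch_find_get(struct sketch_chip *chips,
					   size_t n, int chip_num)
{
	struct sketch_chip *chip = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (chips[i].dev_num == chip_num) {
			chip = &chips[i];
			sketch_get(chip);
			break;
		}
	}
	return chip;
}
```

With the call placed after the loop instead, a lookup for a number that is not in the table would call `sketch_get(NULL)`.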

Re: [PATCH 2/3] tpm: Get rid of chip->pdev

2016-02-14 Thread Jarkko Sakkinen

On Sat, Feb 13, 2016 at 11:57:24PM -0700, Jason Gunthorpe wrote:
> On Sun, Feb 14, 2016 at 07:24:14AM +0200, Jarkko Sakkinen wrote:
> > > This should take care of it for all drivers including vtpm.
> > > 
> > > https://github.com/jgunthorpe/linux/commits/for-jarkko
> > > 
> > > At the very least this turns silent use after free into a null pointer
> > > oops.
> > > 
> > > We should also discuss if we want to continue to have the driver
> > > module locked while /dev/tpmX is open, that is no longer needed for
> > > correctness.
> > 
> > I'm happy with the patch that was sent before, although I didn't give it
> > Reviewed-by because it had a couple of style errors. If those two
> > style errors are the *only* issues, I can fix them up.
> 
> This patch replaces the get/put_device patch entirely, if Stefan is
> happy with it we can just go ahead in this direction for 4.6
> 
> There was also a 0day build error on the devname patch, so the whole
> series will be reposted.

Perfect, thank you.

> Jason

/Jarkko

Re: [PATCH 1/2] clk: add device tree binding for artpec-6 pll1 clock

2016-02-14 Thread Lars Persson




On 02/12/2016 05:39 PM, Rob Herring wrote:
> On Thu, Feb 11, 2016 at 05:01:03PM +0100, Lars Persson wrote:
> > Add device tree documentation for the main PLL in the Artpec-6 SoC.
>
> Roughly how many clocks does this SoC have?

It will have 17 clocks declared in the device tree and three
SoC-specific clock drivers.





Signed-off-by: Lars Persson 
---
  Documentation/devicetree/bindings/clock/artpec6.txt | 16 
  1 file changed, 16 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/clock/artpec6.txt

diff --git a/Documentation/devicetree/bindings/clock/artpec6.txt 
b/Documentation/devicetree/bindings/clock/artpec6.txt
new file mode 100644
index 000..521fec8
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/artpec6.txt
@@ -0,0 +1,16 @@
+* Clock bindings for Axis ARTPEC-6 chip
+
+Required properties:
+- #clock-cells: Should be <0>
+- compatible: Should be "axis,artpec6-pll1-clock"
+- reg: Address and length of the DEVSTAT register.
+- clocks: The PLL's input clock.
+
+Examples:
+
+pll1_clk: pll1_clk {
+   #clock-cells = <0>;
+   compatible = "axis,artpec6-pll1-clock";
+   reg = <0xf800 4>;
+   clocks = <&ext_clk>;
+};
--
2.1.4

Re: [PATCH V3 06/10] irqchip, GICv3, ITS: Refator ITS dt init code to prepare for ACPI.

2016-02-14 Thread Hanjun Guo


On 2016/2/10 18:47, Marc Zyngier wrote:
> On 19/01/16 13:11, Tomasz Nowicki wrote:
> > Similarly to GICv3 core, we need to extract common code before adding
> > ACPI support. No functional changes.

Signed-off-by: Hanjun Guo 
Signed-off-by: Tomasz Nowicki 
---
  drivers/irqchip/irq-gic-v3-its.c   | 82 +++---
  drivers/irqchip/irq-gic-v3.c   |  6 +--
  include/linux/irqchip/arm-gic-v3.h |  2 +-
  3 files changed, 52 insertions(+), 38 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 2bbed18..fecb7a6 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -813,7 +813,7 @@ static void its_free_tables(struct its_node *its)
}
  }

-static int its_alloc_tables(const char *node_name, struct its_node *its)
+static int its_alloc_tables(struct its_node *its)
  {
int err;
int i;
@@ -868,8 +868,8 @@ static int its_alloc_tables(const char *node_name, struct 
its_node *its)
order);
if (order >= MAX_ORDER) {
order = MAX_ORDER - 1;
-   pr_warn("%s: Device Table too large, reduce its page 
order to %u\n",
-   node_name, order);
+   pr_warn("ITS@0x%lx: Device Table too large, reduce 
its page order to %u\n",
+   its->phys_base, order);
}
}

@@ -878,8 +878,8 @@ static int its_alloc_tables(const char *node_name, struct 
its_node *its)
if (alloc_pages > GITS_BASER_PAGES_MAX) {
alloc_pages = GITS_BASER_PAGES_MAX;
order = get_order(GITS_BASER_PAGES_MAX * psz);
-   pr_warn("%s: Device Table too large, reduce its page order 
to %u (%u pages)\n",
-   node_name, order, alloc_pages);
+   pr_warn("ITS@0x%lx: Device Table too large, reduce its page 
order to %u (%u pages)\n",
+   its->phys_base, order, alloc_pages);
}

base = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
@@ -948,8 +948,8 @@ retry_baser:
}

if (val != tmp) {
-   pr_err("ITS: %s: GITS_BASER%d doesn't stick: %lx %lx\n",
-  node_name, i,
+   pr_err("ITS@0x%lx: GITS_BASER%d doesn't stick: %lx 
%lx\n",
+  its->phys_base, i,
   (unsigned long) val, (unsigned long) tmp);
err = -ENXIO;
goto out_free;
@@ -1424,10 +1424,11 @@ static void its_enable_quirks(struct its_node *its)
gic_enable_quirks(iidr, its_quirks, its);
  }

-static int __init its_probe(struct device_node *node,
-   struct irq_domain *parent)
+static int __init its_probe_one(phys_addr_t phys_base, unsigned long size,
+   struct irq_domain *parent,
+   bool is_msi_controller,


> I really question the fact that you are keeping this msi_controller
> thing. Let's face it: if this is not an MSI controller, then the whole
> thing is absolutely pointless.
>
> So I'd rather you simplify the whole thing in a separate patch, and just
> don't bother initializing the ITS if it cannot be used for MSIs.

Agreed, that will simplify the code a lot.

Thanks
Hanjun

Re: [lkp] [gpio] 3c702e9987: kmsg.user_verbs:couldn't_register_device_number

2016-02-14 Thread Michael Welling

On Sun, Feb 14, 2016 at 02:59:06PM +0800, kernel test robot wrote:
> FYI, we noticed the below changes on
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git chardev
> commit 3c702e9987e261042a07e43460a8148be254412e ("gpio: add a userspace 
> chardev ABI for GPIOs")
> 
> 
> [1.951191] user_verbs: couldn't register device number

Looks like user_verbs is using a static device node setup.

enum {
IB_UVERBS_MAJOR   = 231,
IB_UVERBS_BASE_MINOR  = 192,
IB_UVERBS_MAX_DEVICES = 32
};

#define IB_UVERBS_BASE_DEV  MKDEV(IB_UVERBS_MAJOR, IB_UVERBS_BASE_MINOR)

Something tells me that a new GPIO chardev is taking this spot.

It looks like the device is documented to be using the range:
https://www.kernel.org/doc/Documentation/devices.txt

Could you run cat /proc/devices?
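
As a rough illustration of why a static device number range can collide with another registration, the sketch below rebuilds the kernel's `dev_t` packing (20 minor bits, as in `include/linux/kdev_t.h`) in userspace and checks a candidate range against the uverbs range from the enum above. The `SKETCH_*` macros and `sketch_overlaps_uverbs()` are names made up for this example; the ranges passed to it are arbitrary test values, not what the GPIO chardev actually requests.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace copies of the kernel's dev_t helpers (20 minor bits). */
#define SKETCH_MINORBITS 20
#define SKETCH_MINORMASK ((1U << SKETCH_MINORBITS) - 1)
#define SKETCH_MKDEV(ma, mi) \
	(((unsigned int)(ma) << SKETCH_MINORBITS) | (unsigned int)(mi))
#define SKETCH_MAJOR(dev) ((unsigned int)((dev) >> SKETCH_MINORBITS))
#define SKETCH_MINOR(dev) ((unsigned int)((dev) & SKETCH_MINORMASK))

enum {
	IB_UVERBS_MAJOR       = 231,
	IB_UVERBS_BASE_MINOR  = 192,
	IB_UVERBS_MAX_DEVICES = 32
};

/*
 * True if the range [first_minor, first_minor + count) on the given major
 * overlaps the static uverbs range, i.e. minors 192..223 on major 231.
 */
static bool sketch_overlaps_uverbs(unsigned int major,
				   unsigned int first_minor,
				   unsigned int count)
{
	unsigned int base = (unsigned int)IB_UVERBS_BASE_MINOR;
	unsigned int end = base + (unsigned int)IB_UVERBS_MAX_DEVICES;

	assert(count > 0);
	if (major != (unsigned int)IB_UVERBS_MAJOR)
		return false;
	return first_minor < end && base < first_minor + count;
}
```

If another driver ends up owning any part of that range on major 231, the static registration for user_verbs would be expected to fail in the way the log shows.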

> [1.952527] ucm: couldn't register device number
> 
> 
> Thanks,
> Ying Huang

> #
> # Automatically generated file; DO NOT EDIT.
> # Linux/x86_64 4.5.0-rc1 Kernel Configuration
> #
> CONFIG_64BIT=y
> CONFIG_X86_64=y
> CONFIG_X86=y
> CONFIG_INSTRUCTION_DECODER=y
> CONFIG_PERF_EVENTS_INTEL_UNCORE=y
> CONFIG_OUTPUT_FORMAT="elf64-x86-64"
> CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_MMU=y
> CONFIG_ARCH_MMAP_RND_BITS_MIN=28
> CONFIG_ARCH_MMAP_RND_BITS_MAX=32
> CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
> CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
> CONFIG_NEED_DMA_MAP_STATE=y
> CONFIG_NEED_SG_DMA_LENGTH=y
> CONFIG_GENERIC_ISA_DMA=y
> CONFIG_GENERIC_BUG=y
> CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
> CONFIG_GENERIC_HWEIGHT=y
> CONFIG_ARCH_MAY_HAVE_PC_FDC=y
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y
> CONFIG_GENERIC_CALIBRATE_DELAY=y
> CONFIG_ARCH_HAS_CPU_RELAX=y
> CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
> CONFIG_HAVE_SETUP_PER_CPU_AREA=y
> CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
> CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
> CONFIG_ARCH_HIBERNATION_POSSIBLE=y
> CONFIG_ARCH_SUSPEND_POSSIBLE=y
> CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
> CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
> CONFIG_ZONE_DMA32=y
> CONFIG_AUDIT_ARCH=y
> CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
> CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
> CONFIG_X86_64_SMP=y
> CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi 
> -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 
> -fcall-saved-r10 -fcall-saved-r11"
> CONFIG_ARCH_SUPPORTS_UPROBES=y
> CONFIG_FIX_EARLYCON_MEM=y
> CONFIG_PGTABLE_LEVELS=4
> CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
> CONFIG_IRQ_WORK=y
> CONFIG_BUILDTIME_EXTABLE_SORT=y
> 
> #
> # General setup
> #
> CONFIG_INIT_ENV_ARG_LIMIT=32
> CONFIG_CROSS_COMPILE=""
> # CONFIG_COMPILE_TEST is not set
> CONFIG_LOCALVERSION=""
> CONFIG_LOCALVERSION_AUTO=y
> CONFIG_HAVE_KERNEL_GZIP=y
> CONFIG_HAVE_KERNEL_BZIP2=y
> CONFIG_HAVE_KERNEL_LZMA=y
> CONFIG_HAVE_KERNEL_XZ=y
> CONFIG_HAVE_KERNEL_LZO=y
> CONFIG_HAVE_KERNEL_LZ4=y
> CONFIG_KERNEL_GZIP=y
> # CONFIG_KERNEL_BZIP2 is not set
> # CONFIG_KERNEL_LZMA is not set
> # CONFIG_KERNEL_XZ is not set
> # CONFIG_KERNEL_LZO is not set
> # CONFIG_KERNEL_LZ4 is not set
> CONFIG_DEFAULT_HOSTNAME="(none)"
> CONFIG_SWAP=y
> CONFIG_SYSVIPC=y
> CONFIG_SYSVIPC_SYSCTL=y
> CONFIG_POSIX_MQUEUE=y
> CONFIG_POSIX_MQUEUE_SYSCTL=y
> CONFIG_CROSS_MEMORY_ATTACH=y
> CONFIG_FHANDLE=y
> # CONFIG_USELIB is not set
> # CONFIG_AUDIT is not set
> CONFIG_HAVE_ARCH_AUDITSYSCALL=y
> 
> #
> # IRQ subsystem
> #
> CONFIG_GENERIC_IRQ_PROBE=y
> CONFIG_GENERIC_IRQ_SHOW=y
> CONFIG_GENERIC_PENDING_IRQ=y
> CONFIG_GENERIC_IRQ_CHIP=y
> CONFIG_IRQ_DOMAIN=y
> CONFIG_IRQ_DOMAIN_HIERARCHY=y
> CONFIG_GENERIC_MSI_IRQ=y
> CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
> CONFIG_IRQ_DOMAIN_DEBUG=y
> CONFIG_IRQ_FORCED_THREADING=y
> CONFIG_SPARSE_IRQ=y
> CONFIG_CLOCKSOURCE_WATCHDOG=y
> CONFIG_ARCH_CLOCKSOURCE_DATA=y
> CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
> CONFIG_GENERIC_TIME_VSYSCALL=y
> CONFIG_GENERIC_CLOCKEVENTS=y
> CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
> CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
> CONFIG_GENERIC_CMOS_UPDATE=y
> 
> #
> # Timers subsystem
> #
> CONFIG_HZ_PERIODIC=y
> # CONFIG_NO_HZ_IDLE is not set
> # CONFIG_NO_HZ_FULL is not set
> CONFIG_NO_HZ=y
> # CONFIG_HIGH_RES_TIMERS is not set
> 
> #
> # CPU/Task time and stats accounting
> #
> # CONFIG_TICK_CPU_ACCOUNTING is not set
> # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
> CONFIG_IRQ_TIME_ACCOUNTING=y
> # CONFIG_BSD_PROCESS_ACCT is not set
> # CONFIG_TASKSTATS is not set
> 
> #
> # RCU Subsystem
> #
> CONFIG_TREE_RCU=y
> # CONFIG_RCU_EXPERT is not set
> CONFIG_SRCU=y
> # CONFIG_TASKS_RCU is not set
> CONFIG_RCU_STALL_COMMON=y
> # CONFIG_TREE_RCU_TRACE is not set
> # CONFIG_RCU_EXPEDITE_BOOT is not set
> CONFIG_BUILD_BIN2C=y
> CONFIG_IKCONFIG=y
> # CONFIG_IKCONFIG_PROC is not set
> CONFIG_LOG_BUF_SHIFT=17
> CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
> CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
> CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
> CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
> CONFIG_ARCH_SUPPORTS_INT128=y
> # CONFIG_NUMA

Re: [PATCH v3 3/7] debugfs: add support for self-protecting attribute file fops

2016-02-14 Thread Nicolai Stange

Julia Lawall  writes:

> On Sun, 14 Feb 2016, Nicolai Stange wrote:
>
>> In order to protect them against file removal issues, debugfs_create_file()
>> creates a lifetime managing proxy around each struct file_operations
>> handed in.
>> 
>> In cases where this struct file_operations is able to manage file lifetime
>> by itself already, the proxy created by debugfs is a waste of resources.
>> 
>> The most common class of struct file_operations given to debugfs are those
>> defined by means of the DEFINE_SIMPLE_ATTRIBUTE() macro.
>> 
>> Introduce a DEFINE_DEBUGFS_ATTRIBUTE() macro to allow any
>> struct file_operations of this class to be easily made file lifetime aware
>> and thus, to be operated unproxied.
>> 
>> Specifically, introduce debugfs_attr_read() and debugfs_attr_write()
>> which wrap simple_attr_read() and simple_attr_write() under the protection
>> of a debugfs_use_file_start()/debugfs_use_file_finish() pair.
>> 
>> Make DEFINE_DEBUGFS_ATTRIBUTE() set the defined struct file_operations'
>> ->read() and ->write() members to these wrappers.
>> 
>> Export debugfs_create_file_unsafe() in order to allow debugfs users to
>> create their files in non-proxying operation mode.
>> 
>> Finally, add a Coccinelle script chasing down possible candidates
>> for a DEFINE_SIMPLE_ATTRIBUTE()/debugfs_create_file() to
>> DEFINE_DEBUGFS_ATTRIBUTE()/debugfs_create_file_unsafe() migration.
>> 
>> Signed-off-by: Nicolai Stange 
>> ---
>>  fs/debugfs/file.c  | 28 +
>>  fs/debugfs/inode.c | 28 +
>>  include/linux/debugfs.h| 26 +
>>  .../api/debugfs/debugfs_simple_attr.cocci  | 68 
>> ++
>
> Shouldn't the .cocci file be in a different patch, since it has a 
> different maintainer?

Certainly. I'll split this off in v4. Before resending I'll wait for
other reviews though.

Is the .cocci file itself Ok in that it matches the expected
style/conventions?

Thank you,

Nicolai

>>  4 files changed, 150 insertions(+)
>>  create mode 100644 scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci
>> 
>> diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
>> index f638dbc..2da5fb0 100644
>> --- a/fs/debugfs/file.c
>> +++ b/fs/debugfs/file.c
>> @@ -285,6 +285,34 @@ const struct file_operations 
>> debugfs_full_proxy_file_operations = {
>>  .open = full_proxy_open,
>>  };
>>  
>> +ssize_t debugfs_attr_read(struct file *file, char __user *buf,
>> +size_t len, loff_t *ppos)
>> +{
>> +ssize_t ret;
>> +int srcu_idx;
>> +
>> +ret = debugfs_use_file_start(F_DENTRY(file), &srcu_idx);
>> +if (likely(!ret))
>> +ret = simple_attr_read(file, buf, len, ppos);
>> +debugfs_use_file_finish(srcu_idx);
>> +return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(debugfs_attr_read);
>> +
>> +ssize_t debugfs_attr_write(struct file *file, const char __user *buf,
>> + size_t len, loff_t *ppos)
>> +{
>> +ssize_t ret;
>> +int srcu_idx;
>> +
>> +ret = debugfs_use_file_start(F_DENTRY(file), &srcu_idx);
>> +if (likely(!ret))
>> +ret = simple_attr_write(file, buf, len, ppos);
>> +debugfs_use_file_finish(srcu_idx);
>> +return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(debugfs_attr_write);
>> +
>>  static struct dentry *debugfs_create_mode(const char *name, umode_t mode,
>>struct dentry *parent, void *value,
>>const struct file_operations *fops,
>> diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
>> index 42a9b34..f95e355 100644
>> --- a/fs/debugfs/inode.c
>> +++ b/fs/debugfs/inode.c
>> @@ -368,6 +368,33 @@ struct dentry *debugfs_create_file(const char *name, 
>> umode_t mode,
>>  }
>>  EXPORT_SYMBOL_GPL(debugfs_create_file);
>>  
>> +/**
>> + * debugfs_create_file_unsafe - create a file in the debugfs filesystem
>> + * @name: a pointer to a string containing the name of the file to create.
>> + * @mode: the permission that the file should have.
>> + * @parent: a pointer to the parent dentry for this file.  This should be a
>> + *  directory dentry if set.  If this parameter is NULL, then the
>> + *  file will be created in the root of the debugfs filesystem.
>> + * @data: a pointer to something that the caller will want to get to later
>> + *on.  The inode.i_private pointer will point to this value on
>> + *the open() call.
>> + * @fops: a pointer to a struct file_operations that should be used for
>> + *this file.
>> + *
>> + * debugfs_create_file_unsafe() is completely analogous to
>> + * debugfs_create_file(), the only difference being that the fops
>> + * handed it will not get protected against file removals by the
>> + * debugfs core.
>> + *
>> + * It is your responsibility to protect your struct file_operation
>> + * methods against file removals by means of debugfs_use_file_start()
>> + * and debugfs_

Re: [PATCH v2] ARM: dts: vfxxx: Add iio_hwmon node for ADC temperature channel

2016-02-14 Thread Shawn Guo

On Fri, Feb 12, 2016 at 05:53:00PM +0530, Sanchayan Maity wrote:
> Add iio_hwmon node to expose the temperature channel on Vybrid as
> hardware monitor device using the iio_hwmon driver.
> 
> Signed-off-by: Sanchayan Maity 
> ---
> 
> Hello,
> 
> The first version of the patch was sent quite a while ago.
> https://lkml.org/lkml/2015/9/16/932
> 
> Shawn you had requested that hyphen rather than underscore should
> be used in node name. I looked into that.
> 
> The iio_hwmon driver calls hwmon_device_register_with_groups inside
> hwmon.c and this
> http://lxr.free-electrons.com/source/drivers/hwmon/hwmon.c#L103
> 
> does not allow hyphen in hwmon name attribute. I was not aware of
> this but while trying to test the change, the device probe failed
> with EINVAL. I think we should stick to the existing use of the
> bindings or we need to change the hwmon code as well along with the
> existing device tree files and binding documentation.

I disagree.

If a hyphen is not valid in the hwmon name attribute, the following
code in iio_hwmon_probe() is plain wrong, because a hyphen is perfectly
valid in device tree node names.

if (dev->of_node && dev->of_node->name)
name = dev->of_node->name;

Shawn
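
A userspace sketch of the mismatch described above follows. It assumes the hwmon core rejects a name that is empty or contains any of the characters `-`, `*`, space, tab or newline; that exact character set is an assumption here, since the thread only confirms the hyphen. Mapping `-` to `_` is just one possible way to derive a valid name from a node name and is not something proposed in this thread. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed rule: reject empty names and names containing any of "-* \t\n". */
static bool sketch_hwmon_name_ok(const char *name)
{
	assert(name != NULL);
	return name[0] != '\0' && strpbrk(name, "-* \t\n") == NULL;
}

/*
 * Copy a device tree node name into buf, mapping '-' to '_'.
 * Returns false if the name (including the terminator) does not fit.
 */
static bool sketch_hwmon_name_from_node(const char *node_name,
					char *buf, size_t len)
{
	size_t i, n;

	assert(node_name != NULL && buf != NULL);
	n = strlen(node_name);
	if (n + 1 > len)
		return false;
	for (i = 0; i <= n; i++)
		buf[i] = (node_name[i] == '-') ? '_' : node_name[i];
	return true;
}
```

Under these assumptions a node named `iio-hwmon` is a valid device tree name but fails the name check, while the mapped name `iio_hwmon` passes.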

> 
> Changes since v1:
> 1. Expose ADC1 temperature channel as well
> 2. Move the entry outside of the aips1 bus node
> 
> Best Regards,
> Sanchayan Maity.
> ---
>  arch/arm/boot/dts/vfxxx.dtsi | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/vfxxx.dtsi b/arch/arm/boot/dts/vfxxx.dtsi
> index a5f07e3..8ed8e47 100644
> --- a/arch/arm/boot/dts/vfxxx.dtsi
> +++ b/arch/arm/boot/dts/vfxxx.dtsi
> @@ -673,5 +673,10 @@
>   status = "disabled";
>   };
>   };
> +
> + iio_hwmon {
> + compatible = "iio-hwmon";
> + io-channels = <&adc0 16>, <&adc1 16>;
> + };
>   };
>  };
> -- 
> 2.7.1
> 
>

Re: [PATCH] ARM: dts: ls2080a: Add quirk for Erratum A009116

2016-02-14 Thread Shawn Guo

On Tue, Feb 09, 2016 at 05:08:07PM -0600, Lijun Pan wrote:
> Add "snps,quirk-frame-length-adjustment" property to
> USB3 node for erratum A009116. This property provides
> value of GFLADJ_30MHZ for post silicon frame length
> adjustment.
> 
> Signed-off-by: Lijun Pan 
> ---
>  arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi |2 ++
>  1 file changed, 2 insertions(+)

Forgot to mention that arm64 patches use a different subject prefix from
ARM ones.  I changed the patch subject to "arm64: dts: ..." when
applying.

Shawn

> 
> diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi 
> b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
> index 6e9e033..fa506f5 100644
> --- a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
> +++ b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
> @@ -501,6 +501,7 @@
>   reg = <0x0 0x310 0x0 0x1>;
>   interrupts = <0 80 0x4>; /* Level high type */
>   dr_mode = "host";
> + snps,quirk-frame-length-adjustment = <0x20>;
>   };
>  
>   usb1: usb3@311 {
> @@ -509,6 +510,7 @@
>   reg = <0x0 0x311 0x0 0x1>;
>   interrupts = <0 81 0x4>; /* Level high type */
>   dr_mode = "host";
> + snps,quirk-frame-length-adjustment = <0x20>;
>   };
>  
>   ccn@400 {
> -- 
> 1.7.9.5
> 
>

Re: Another proposal for DAX fault locking

2016-02-14 Thread Boaz Harrosh

On 02/11/2016 12:38 PM, Jan Kara wrote:
> On Wed 10-02-16 19:38:21, Boaz Harrosh wrote:
>> On 02/09/2016 07:24 PM, Jan Kara wrote:
>>> Hello,
>>>
<>
>>>
>>> DAX will have an array of mutexes (the array can be made per device but
>>> initially a global one should be OK). We will use mutexes in the array as a
>>> replacement for page lock - we will use hashfn(mapping, index) to get
>>> particular mutex protecting our offset in the mapping. On fault / page
>>> mkwrite, we'll grab the mutex similarly to page lock and release it once we
>>> are done updating page tables. This deals with races in [1]. When flushing
>>> caches we grab the mutex before clearing writeable bit in page tables
>>> and clearing dirty bit in the radix tree and drop it after we have flushed
>>> caches for the pfn. This deals with races in [2].
>>>
>>> Thoughts?
>>>
>>
>> You could also use one of the radix-tree's special-bits as a bit lock.
>> So no need for any extra allocations.
> 
> Yes and I've suggested that once as well. But since we need sleeping
> locks, you need some wait queues somewhere as well. So some allocations are
> going to be needed anyway. 

They are already sleeping locks and there are all the proper "wait queues"
in place. I'm talking about
   lock:
err = wait_on_bit_lock(&some_long, SOME_BIT_LOCK, ...);
and
   unlock:
WARN_ON(!test_and_clear_bit(SOME_BIT_LOCK, &some_long));
wake_up_bit(&some_long, SOME_BIT_LOCK);

> And mutexes have much better properties than

Just saying that page-locks are implemented just this way these days
so it is the performance and characteristics we already know.
(You are replacing page locks, no?)

> bit-locks so I prefer mutexes over cramming bit locks into radix tree. Plus
> you'd have to be careful so that someone doesn't remove the bit from the
> radix tree while you are working with it.
> 

Sure! need to be careful, is our middle name.

That said, it's your call. Thank you for working on this. Your plan sounds
very good as well, and is very much needed, because DAX's mmap performance
success right now.
[Maybe one small enhancement: allocate an array of mutexes per NUMA
 node and access the proper array through numa_node_id()]

>   Honza
> 

Thanks
Boaz
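
The mutex-array proposal quoted above can be sketched in userspace as follows. This is only an illustration with pthread mutexes: the table size, the mixing constant and the `sketch_*` names are assumptions, and the thread does not specify what `hashfn(mapping, index)` actually computes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_LOCKS 64 /* power of two; size chosen arbitrarily */

static pthread_mutex_t sketch_locks[SKETCH_NR_LOCKS];

/* Initialise the global lock table once, before any fault handling. */
static void sketch_locks_init(void)
{
	size_t i;

	for (i = 0; i < SKETCH_NR_LOCKS; i++) {
		int err = pthread_mutex_init(&sketch_locks[i], NULL);

		assert(err == 0);
		(void)err;
	}
}

/* Hypothetical hashfn(mapping, index): mix both inputs, fold into the table. */
static size_t sketch_hash(const void *mapping, unsigned long index)
{
	uint64_t h = (uint64_t)(uintptr_t)mapping ^
		     ((uint64_t)index * 0x9E3779B97F4A7C15ULL);

	h ^= h >> 32;
	return (size_t)(h & (SKETCH_NR_LOCKS - 1));
}

/* The mutex that stands in for the page lock of (mapping, index). */
static pthread_mutex_t *sketch_entry_lock(const void *mapping,
					  unsigned long index)
{
	return &sketch_locks[sketch_hash(mapping, index)];
}
```

A fault path would take `sketch_entry_lock(mapping, index)` before updating page tables and drop it afterwards. Unrelated offsets may hash to the same mutex, which costs some contention but not correctness.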

Re: [PATCH] ARM: exynos: clarify KEYBOARD_SAMSUNG selection

2016-02-14 Thread Krzysztof Kozlowski

On 12.02.2016 at 00:30, Arnd Bergmann wrote:
> The samsung-keypad driver is implicitly selected by ARCH_EXYNOS4 (why?),
> but this fails if CONFIG_INPUT is a loadable module:

How about removing the select entirely and adding it in defconfigs? It
was introduced by 49b999711ee7 ("ARM: EXYNOS: change HAVE_SAMSUNG_KEYPAD
to KEYBOARD_SAMSUNG") which looks like a mistake. The intention was to
indicate a HAVE, not to select a driver.

Moreover, the Exynos3250 also has a keypad but it is not selected.

Can you send a patch removing select and changing exynos+multi_v7
defconfigs?

Best regards,
Krzysztof

> 
> drivers/input/built-in.o: In function `samsung_keypad_remove':
> drivers/input/keyboard/samsung-keypad.c:461: undefined reference to 
> `input_unregister_device'
> drivers/input/built-in.o: In function `samsung_keypad_irq':
> drivers/input/keyboard/samsung-keypad.c:137: undefined reference to 
> `input_event'
> drivers/input/built-in.o: In function `samsung_keypad_irq':
> include/linux/input.h:389: undefined reference to `input_event'
> drivers/input/built-in.o: In function `samsung_keypad_probe':
> drivers/input/keyboard/samsung-keypad.c:358: undefined reference to 
> `devm_input_allocate_device'
> drivers/input/built-in.o:(.debug_addr+0x34): undefined reference to 
> `input_set_capability'
> 
> This changes the 'select' statement so we don't do it if CONFIG_INPUT=m.
> The problem does not happen on mainline kernels, as we don't normally
> build built-in input drivers when CONFIG_INPUT=m, but I am experimenting
> with a patch to change this, and the samsung keypad driver showed up
> as one example that was silently broken before.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/arm/mach-exynos/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/mach-exynos/Kconfig b/arch/arm/mach-exynos/Kconfig
> index 8434a0f6334c..b63e64581c24 100644
> --- a/arch/arm/mach-exynos/Kconfig
> +++ b/arch/arm/mach-exynos/Kconfig
> @@ -59,7 +59,7 @@ config ARCH_EXYNOS4
>   select CLKSRC_SAMSUNG_PWM if CPU_EXYNOS4210
>   select CPU_EXYNOS4210
>   select GIC_NON_BANKED
> - select KEYBOARD_SAMSUNG if INPUT_KEYBOARD
> + select KEYBOARD_SAMSUNG if INPUT=y && INPUT_KEYBOARD
>   select MIGHT_HAVE_CACHE_L2X0
>   help
> Samsung EXYNOS4 (Cortex-A9) SoC based systems
>

[PATCH RFC] Introduce atomic and per-cpu add-max and sub-min operations

2016-02-14 Thread Konstantin Khlebnikov

bool atomic_add_max(atomic_t *var, int add, int max);
bool atomic_sub_min(atomic_t *var, int sub, int min);

bool this_cpu_add_max(var, add, max);
bool this_cpu_sub_min(var, sub, min);

They add/subtract only if the result will be no bigger than max / no lower than min.
They return true if the operation was done and false otherwise.

Inside, they check that (add <= max - var) and (sub <= var - min). Signed
operations work if all possible values fit into a range whose length fits
into the non-negative range of that type: 0..INT_MAX, INT_MIN+1..0, -1000..1000.
Unsigned operations work if the value is always in the valid range: min <= var <= max.
Char and short are automatically cast to int, so they never overflow.
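
For readers who want to see the check in isolation, here is a userspace sketch of the two int operations using C11 atomics and a compare-exchange loop. It is not the implementation from this patch: the `sketch_*` names are made up, and the kernel version is built on its own atomic primitives. The signed-range caveat described above applies, since `max - old` and `old - min` must not overflow.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Add 'add' to *var only if the result stays <= max. */
static bool sketch_atomic_add_max(atomic_int *var, int add, int max)
{
	int old;

	assert(var != NULL);
	old = atomic_load(var);
	do {
		/* Same check as in the description: add <= max - var. */
		if (add > max - old)
			return false;
		/* On failure 'old' is refreshed and the check is redone. */
	} while (!atomic_compare_exchange_weak(var, &old, old + add));
	return true;
}

/* Subtract 'sub' from *var only if the result stays >= min. */
static bool sketch_atomic_sub_min(atomic_int *var, int sub, int min)
{
	int old;

	assert(var != NULL);
	old = atomic_load(var);
	do {
		/* Same check as in the description: sub <= var - min. */
		if (sub > old - min)
			return false;
	} while (!atomic_compare_exchange_weak(var, &old, old - sub));
	return true;
}
```

The variable is left untouched whenever the function returns false, which is what lets callers fall back to a slow path.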

The patch adds the same for atomic_long_t, atomic64_t, local_t, local64_t,
plus unsigned variants: atomic_u32_add_max and atomic_u32_sub_min for atomic_t,
atomic_u64_add_max and atomic_u64_sub_min for atomic64_t.

The patch comes with tests which hopefully cover all possible corner cases,
see CONFIG_ATOMIC64_SELFTEST and CONFIG_PERCPU_TEST.

All this makes it possible to build any kind of counter in a few lines:

- Simple atomic resource counter

	atomic_t usage;
	int limit;

	result = atomic_add_max(&usage, charge, limit);

	atomic_sub(uncharge, &usage);

- Event counter with per-cpu batch

	atomic_t events;
	DEFINE_PER_CPU(int, cpu_events);
	int batch;

	if (!this_cpu_add_max(cpu_events, count, batch))
		atomic_add(this_cpu_xchg(cpu_events, 0) + count, &events);

- Object counter with per-cpu part

	atomic_t objects;
	DEFINE_PER_CPU(int, cpu_objects);
	int batch;

	if (!this_cpu_add_max(cpu_objects, 1, batch))
		atomic_add(this_cpu_xchg(cpu_objects, 0) + 1, &objects);

	if (!this_cpu_sub_min(cpu_objects, 1, -batch))
		atomic_add(this_cpu_xchg(cpu_objects, 0) - 1, &objects);

- Positive object counter with negative per-cpu parts

	atomic_t objects;
	DEFINE_PER_CPU(int, cpu_objects);
	int batch;

	if (!this_cpu_add_max(cpu_objects, 1, 0))
		atomic_add(this_cpu_xchg(cpu_objects, -batch / 2) + 1, &objects);

	if (!this_cpu_sub_min(cpu_objects, 1, -batch))
		atomic_add(this_cpu_xchg(cpu_objects, -batch / 2) - 1, &objects);

- Resource counter with per-cpu precharge

	atomic_t usage;
	int limit;
	DEFINE_PER_CPU(int, precharge);
	int batch;

	result = this_cpu_sub_min(precharge, charge, 0);
	if (!result) {
		preempt_disable();
		charge += batch / 2 - __this_cpu_read(precharge);
		result = atomic_add_max(&usage, charge, limit);
		if (result)
			__this_cpu_write(precharge, batch / 2);
		preempt_enable();
	}

	if (!this_cpu_add_max(precharge, uncharge, batch)) {
		preempt_disable();
		if (__this_cpu_read(precharge) > batch / 2) {
			uncharge += __this_cpu_read(precharge) - batch / 2;
			__this_cpu_write(precharge, batch / 2);
		}
		atomic_sub(uncharge, &usage);
		preempt_enable();
	}

- Each operation is easily split into a static-inline per-cpu fast path and
  an atomic slow path, which could be hidden in a separate function that
  performs resource reclaim, logging, etc.
- Types of the global atomic part and the per-cpu part might differ: for example,
  as in vmstat counters, an atomic_long_t global part and an s8 local part.
- Resources can be counted upwards to the limit or downwards to zero.
- Bounds min=INT_MIN/max=INT_MAX can be used for catching underflows/overflows.
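
The "event counter with per-cpu batch" recipe above can be simulated in userspace to watch the fast path and slow path interact. This sketch models a single CPU with a plain int in place of the per-cpu variable; `struct sketch_counter` and the `sketch_*` functions are hypothetical and only mirror the shape of the kernel snippet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Non-atomic stand-in for this_cpu_add_max(): only the owner touches *var. */
static bool sketch_local_add_max(int *var, int add, int max)
{
	if (add > max - *var)
		return false;
	*var += add;
	return true;
}

struct sketch_counter {
	atomic_int events; /* global part */
	int cpu_events;    /* one "per-cpu" slot; real code has one per CPU */
	int batch;         /* largest value the local slot may hold */
};

/* Fast path: bump the local slot. Slow path: flush it to the global part. */
static void sketch_count(struct sketch_counter *c, int count)
{
	assert(c != NULL);
	if (!sketch_local_add_max(&c->cpu_events, count, c->batch)) {
		/* Mirrors atomic_add(this_cpu_xchg(cpu_events, 0) + count, &events). */
		int pending = c->cpu_events;

		c->cpu_events = 0;
		atomic_fetch_add(&c->events, pending + count);
	}
}

/* Exact total as seen by this single simulated CPU. */
static int sketch_total(struct sketch_counter *c)
{
	assert(c != NULL);
	return atomic_load(&c->events) + c->cpu_events;
}
```

The global atomic is touched only when the local slot would exceed `batch`, so most updates stay local.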

Signed-off-by: Konstantin Khlebnikov 
---
 arch/x86/include/asm/local.h  |2 +
 include/asm-generic/local.h   |2 +
 include/asm-generic/local64.h |4 ++
 include/linux/atomic.h|   52 +
 include/linux/percpu-defs.h   |   56 +++
 lib/atomic64_test.c   |   49 
 lib/percpu_test.c |   84 +
 7 files changed, 249 insertions(+)

diff --git a/arch/x86/include/asm/local.h b/arch/x86/include/asm/local.h
index 4ad6560847b1..c97e0c0b3f48 100644
--- a/arch/x86/include/asm/local.h
+++ b/arch/x86/include/asm/local.h
@@ -149,6 +149,8 @@ static inline long local_sub_return(long i, local_t *l)
 })
 #define local_inc_not_zero(l) local_add_unless((l), 1, 0)
 
+ATOMIC_MINMAX_OP(local, local, long)
+
 /* On x86_32, these are no better than the atomic variants.
  * On x86-64 these are better than the atomic variants on SMP kernels
  * because they dont use a lock prefix.
diff --git a/include/asm-generic/local.h b/include/asm-generic/local.h
index 9ceb03b4f466..e46d9dfb7c21 100644
--- a/include/asm-generic/local.h
+++ b/include/asm-generic/local.h
@@ -44,6 +44,8 @@ typedef struct
 #define local_xchg(l, n) atomic_long_xchg((&(l)->a), (n))
 #define local_add_unless(l, _a, u) atomic_long_add_unless((&(l)->a), (_a), (u))
 #define local_inc_not_zero(l) atomic_long_inc_not_zero(&(l)->a)
+#define local_add_max(l, add, max) atomic_long_add_max(&(l)->a, add, max)
+#define local_sub_min(l, sub, min) atomic_long_sub_min(&(l)->a, sub, min)
 
 /* Non-atomic variants, ie. preemption disabled and won't be to

[GIT pull] x86 updates for 4.5

2016-02-14 Thread Thomas Gleixner

Linus,

please pull the latest x86-urgent-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
x86-urgent-for-linus

Two small fixlets for x86:

 - Prevent a KASAN false positive in thread_saved_pc()

 - Fix a 32-bit truncation problem in the x86 numa code

Thanks,

tglx
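
As background for the second fix, the sketch below shows what kind of truncation a 32-bit `unsigned long` causes when it holds a physical address above 4 GiB. It models the two types with fixed-width integers; the `sketch_*` names are made up, and the addresses used in any test are arbitrary examples rather than values from the bug report.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t sketch_ulong32;   /* unsigned long on a 32-bit kernel */
typedef uint64_t sketch_phys_addr; /* phys_addr_t with 64-bit physical addresses */

/* Store a physical address in a 32-bit unsigned long: the top 32 bits are lost. */
static sketch_ulong32 sketch_store_in_ulong(sketch_phys_addr addr)
{
	return (sketch_ulong32)addr;
}

/* True if the address does not survive a round trip through 32 bits. */
static int sketch_is_truncated(sketch_phys_addr addr)
{
	return (sketch_phys_addr)sketch_store_in_ulong(addr) != addr;
}

/* Length of a region when start and end are kept at full width. */
static sketch_phys_addr sketch_region_len(sketch_phys_addr start,
					  sketch_phys_addr end)
{
	assert(end >= start);
	return end - start;
}
```

A region starting at 4 GiB would appear to start at address 0 once truncated, which is the class of problem that declaring `start` and `end` as `phys_addr_t` avoids.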

-->
Dmitry Vyukov (1):
  x86: Fix KASAN false positives in thread_saved_pc()

Ingo Molnar (1):
  x86/mm/numa: Fix 32-bit memblock range truncation bug on 32-bit NUMA 
kernels


 arch/x86/include/asm/processor.h | 2 +-
 arch/x86/mm/numa.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2d5a50cb61a2..20c11d1aa4cc 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -766,7 +766,7 @@ extern unsigned long thread_saved_pc(struct task_struct 
*tsk);
  * Return saved PC of a blocked thread.
  * What is this good for? it will be always the scheduler or ret_from_fork.
  */
-#define thread_saved_pc(t) (*(unsigned long *)((t)->thread.sp - 8))
+#define thread_saved_pc(t) READ_ONCE_NOCHECK(*(unsigned long 
*)((t)->thread.sp - 8))
 
 #define task_pt_regs(tsk)  ((struct pt_regs *)(tsk)->thread.sp0 - 1)
 extern unsigned long KSTK_ESP(struct task_struct *task);
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index c3b3f653ed0c..d04f8094bc23 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -469,7 +469,7 @@ static void __init numa_clear_kernel_node_hotplug(void)
 {
int i, nid;
nodemask_t numa_kernel_nodes = NODE_MASK_NONE;
-   unsigned long start, end;
+   phys_addr_t start, end;
struct memblock_region *r;
 
/*

[PATCH v7] mtd: spi-nor: add hisilicon spi-nor flash controller driver

2016-02-14 Thread Jiancheng Xue

Add hisilicon spi-nor flash controller driver

Signed-off-by: Binquan Peng 
Signed-off-by: Jiancheng Xue 
Acked-by: Rob Herring 
Reviewed-by: Ezequiel Garcia 
---
change log
v7:
Rebased to v4.5-rc3.
Fixed issues pointed out by Ezequiel Garcia.
v6:
Based on v4.5-rc2.
Fixed issues pointed out by Ezequiel Garcia.
v5:
Fixed a compile error.
v4:
Rebased to v4.5-rc1.
v3:
Added a compatible string "hisilicon,hi3519-sfc".
v2:
Fixed some compile warnings.

 .../devicetree/bindings/spi/spi-hisi-sfc.txt   |  25 ++
This file has been acked by Rob Herring.
 drivers/mtd/spi-nor/Kconfig|   6 +
 drivers/mtd/spi-nor/Makefile   |   1 +
 drivers/mtd/spi-nor/hisi-sfc.c | 494 +
 4 files changed, 526 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/spi/spi-hisi-sfc.txt
 create mode 100644 drivers/mtd/spi-nor/hisi-sfc.c

diff --git a/Documentation/devicetree/bindings/spi/spi-hisi-sfc.txt 
b/Documentation/devicetree/bindings/spi/spi-hisi-sfc.txt
new file mode 100644
index 000..7407147
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/spi-hisi-sfc.txt
@@ -0,0 +1,25 @@
+HiSilicon SPI-NOR Flash Controller
+
+Required properties:
+- compatible : Should be "hisilicon,hisi-sfc" and one of the following strings:
+   "hisilicon,hi3519-sfc"
+- address-cells : number of cells required to define a chip select
+address on the SPI bus. Should be set to 1. See spi-bus.txt.
+- size-cells : Should be 0.
+- reg : Offset and length of the register set for the controller device.
+- reg-names : Must include the following two entries: "control", "memory".
+- clocks : handle to spi-nor flash controller clock.
+
+Example:
+spi-nor-controller@1000 {
+   compatible = "hisilicon,hi3519-sfc", "hisilicon,hisi-sfc";
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0x1000 0x1000>, <0x1400 0x100>;
+   reg-names = "control", "memory";
+   clocks = <&clock HI3519_FMC_CLK>;
+   spi-nor@0 {
+   compatible = "jedec,spi-nor";
+   reg = <0>;
+   };
+};
diff --git a/drivers/mtd/spi-nor/Kconfig b/drivers/mtd/spi-nor/Kconfig
index 0dc9275..c86d7cf 100644
--- a/drivers/mtd/spi-nor/Kconfig
+++ b/drivers/mtd/spi-nor/Kconfig
@@ -37,6 +37,12 @@ config SPI_FSL_QUADSPI
  This controller does not support generic SPI. It only supports
  SPI NOR.
 
+config SPI_HISI_SFC
+   tristate "Hisilicon SPI-NOR Flash Controller(SFC)"
+   depends on ARCH_HISI || COMPILE_TEST
+   help
+ This enables support for hisilicon SPI-NOR flash controller.
+
 config SPI_NXP_SPIFI
tristate "NXP SPI Flash Interface (SPIFI)"
depends on OF && (ARCH_LPC18XX || COMPILE_TEST)
diff --git a/drivers/mtd/spi-nor/Makefile b/drivers/mtd/spi-nor/Makefile
index 0bf3a7f8..8a6fa69 100644
--- a/drivers/mtd/spi-nor/Makefile
+++ b/drivers/mtd/spi-nor/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_MTD_SPI_NOR)  += spi-nor.o
 obj-$(CONFIG_SPI_FSL_QUADSPI)  += fsl-quadspi.o
+obj-$(CONFIG_SPI_HISI_SFC) += hisi-sfc.o
 obj-$(CONFIG_MTD_MT81xx_NOR)+= mtk-quadspi.o
 obj-$(CONFIG_SPI_NXP_SPIFI)+= nxp-spifi.o
diff --git a/drivers/mtd/spi-nor/hisi-sfc.c b/drivers/mtd/spi-nor/hisi-sfc.c
new file mode 100644
index 000..79baabf
--- /dev/null
+++ b/drivers/mtd/spi-nor/hisi-sfc.c
@@ -0,0 +1,494 @@
+/*
+ * HiSilicon SPI Nor Flash Controller Driver
+ *
+ * Copyright (c) 2015-2016 HiSilicon Technologies Co., Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Hardware register offsets and field definitions */
+#define FMC_CFG0x00
+#define SPI_NOR_ADDR_MODE  BIT(10)
+#define FMC_GLOBAL_CFG 0x04
+#define FMC_GLOBAL_CFG_WP_ENABLE   BIT(6)
+#define FMC_SPI_TIMING_CFG 0x08
+#define TIMING_CFG_TCSH(nr)(((nr) & 0xf) << 8)
+#define TIMING_CFG_TCSS(nr)(((nr) & 0xf) << 4)
+#define TIMING_CFG_TSHSL(nr)   ((nr) & 0xf)
+#define CS_HOLD_TIME   0x6
+#define CS_SETUP_TIME  0x6
+#define CS_DESELECT_TIME   0xf
+#define FMC_INT0x18
+#define FMC_INT_OP_DONEBIT(0)
+#define FMC_INT_CL

Re: [PATCH v11 3/4] ARM64: add SBSA Generic Watchdog device node in amd-seattle-soc.dtsi

2016-02-14 Thread Fu Wei

Hi Suravee,

On 11 February 2016 at 04:56, Suravee Suthikulpanit
 wrote:
> Hi Fu Wei,
>
> On 2/10/16 00:00, fu@linaro.org wrote:
>>
>> From: Fu Wei 
>>
>> This can be a example of adding SBSA Generic Watchdog device node
>> into some dts files for the Soc which contains SBSA Generic Watchdog.
>>
>> Acked-by: Arnd Bergmann 
>> Signed-off-by: Suravee Suthikulpanit 
>> Signed-off-by: Fu Wei 
>> ---
>>   arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi | 9 +
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
>> b/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
>> index 2874d92..67eb636 100644
>> --- a/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
>> +++ b/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
>> @@ -84,6 +84,15 @@
>> clock-names = "uartclk", "apb_pclk";
>> };
>>
>> +   watchdog0: watchdog@e0bb {
>> +   compatible = "arm,sbsa-gwdt";
>> +   reg = <0x0 0xe0bc 0 0x1000>,
>> +   <0x0 0xe0bb 0 0x1000>;
>> +   interrupts = <0 337 4>;
>> +   timeout-sec = <15>;
>> +   status = "disabled";
>
>
> Could you please remove this status line? I do not think it is necessary for
> this one here anymore.

OK, will do
:-)

>
> Thanks,
> Suravee
>
>
>> +   };
>> +
>> spi0: ssp@e102 {
>> status = "disabled";
>> compatible = "arm,pl022", "arm,primecell";
>>
>



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

[GIT pull] irq updates for 4.5

2016-02-14 Thread Thomas Gleixner

Linus,

please pull the latest irq-urgent-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
irq-urgent-for-linus

Another set of ARM SoC related irqchip fixes:

 - Plug a memory leak in gicv3-its

 - Limit features to the root gic interrupt controller

 - Add a missing barrier in the gic-v3 IAR access

 - Another compile test fix for sun4i
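
For readers skimming the gicv3-its change in the diff below: it stores the allocation order next to each table base so that the free path releases as many pages as were allocated. A minimal userspace sketch of that bookkeeping follows. It is not the driver code: `alloc_pages_model()`, `free_pages_model()`, `struct table` and the byte counter are hypothetical stand-ins, and malloc() replaces the page allocator.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES 4096u	/* assumed page size for the model */

/* Bytes currently allocated through the model allocator. */
static size_t bytes_outstanding;

/* Each table remembers its order, mirroring the { base, order } pair. */
struct table {
	void *base;
	unsigned int order;
};

/* Stand-in for a page allocation of 2^order pages. */
static void *alloc_pages_model(unsigned int order)
{
	size_t len = (size_t)PAGE_SIZE_BYTES << order;
	void *p = malloc(len);

	if (p)
		bytes_outstanding += len;
	return p;
}

/* Stand-in for freeing 2^order pages; the order must match the allocation. */
static void free_pages_model(void *p, unsigned int order)
{
	bytes_outstanding -= (size_t)PAGE_SIZE_BYTES << order;
	free(p);
}

static int table_alloc(struct table *t, unsigned int order)
{
	t->base = alloc_pages_model(order);
	if (!t->base)
		return -1;
	t->order = order;	/* recorded so the free path can use it */
	return 0;
}

static void tables_free(struct table *tables, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (tables[i].base) {
			free_pages_model(tables[i].base, tables[i].order);
			tables[i].base = NULL;
		}
	}
}
```

If the free path always assumed order 0, every table allocated with a larger order would leave the remainder accounted as outstanding, which is the leak pattern the fix addresses.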

Thanks,

tglx

-->
Andre Przywara (1):
  irqchip/sun4i: Fix compilation outside of arch/arm

Jon Hunter (2):
  irqchip/gic: Only populate set_affinity for the root controller
  irqchip/gic: Only set the EOImodeNS bit for the root controller

Shanker Donthineni (1):
  irqchip/gicv3-its: Fix memory leak in its_free_tables()

Tirumalesh Chalamarla (1):
  irqchip/gic-v3: Make sure read from ICC_IAR1_EL1 is visible on 
redestributor


 arch/arm64/include/asm/arch_gicv3.h |  1 +
 drivers/irqchip/irq-gic-v3-its.c| 17 +++--
 drivers/irqchip/irq-gic.c   | 13 ++---
 drivers/irqchip/irq-sun4i.c |  1 -
 4 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/arch_gicv3.h 
b/arch/arm64/include/asm/arch_gicv3.h
index 2731d3b25ed2..8ec88e5b290f 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -103,6 +103,7 @@ static inline u64 gic_read_iar_common(void)
u64 irqstat;
 
asm volatile("mrs_s %0, " __stringify(ICC_IAR1_EL1) : "=r" (irqstat));
+   dsb(sy);
return irqstat;
 }
 
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 3447549fcc93..0a73632b28d5 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -66,7 +66,10 @@ struct its_node {
unsigned long   phys_base;
struct its_cmd_block*cmd_base;
struct its_cmd_block*cmd_write;
-   void*tables[GITS_BASER_NR_REGS];
+   struct {
+   void*base;
+   u32 order;
+   } tables[GITS_BASER_NR_REGS];
struct its_collection   *collections;
struct list_headits_device_list;
u64 flags;
@@ -807,9 +810,10 @@ static void its_free_tables(struct its_node *its)
int i;
 
for (i = 0; i < GITS_BASER_NR_REGS; i++) {
-   if (its->tables[i]) {
-   free_page((unsigned long)its->tables[i]);
-   its->tables[i] = NULL;
+   if (its->tables[i].base) {
+   free_pages((unsigned long)its->tables[i].base,
+  its->tables[i].order);
+   its->tables[i].base = NULL;
}
}
 }
@@ -890,7 +894,8 @@ retry_alloc_baser:
goto out_free;
}
 
-   its->tables[i] = base;
+   its->tables[i].base = base;
+   its->tables[i].order = order;
 
 retry_baser:
val = (virt_to_phys(base)|
@@ -940,7 +945,7 @@ retry_baser:
 * something is horribly wrong...
 */
free_pages((unsigned long)base, order);
-   its->tables[i] = NULL;
+   its->tables[i].base = NULL;
 
switch (psz) {
case SZ_16K:
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 911758c056c1..8f9ebf714e2b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -384,9 +384,6 @@ static struct irq_chip gic_chip = {
.irq_unmask = gic_unmask_irq,
.irq_eoi= gic_eoi_irq,
.irq_set_type   = gic_set_type,
-#ifdef CONFIG_SMP
-   .irq_set_affinity   = gic_set_affinity,
-#endif
.irq_get_irqchip_state  = gic_irq_get_irqchip_state,
.irq_set_irqchip_state  = gic_irq_set_irqchip_state,
.flags  = IRQCHIP_SET_TYPE_MASKED |
@@ -400,9 +397,6 @@ static struct irq_chip gic_eoimode1_chip = {
.irq_unmask = gic_unmask_irq,
.irq_eoi= gic_eoimode1_eoi_irq,
.irq_set_type   = gic_set_type,
-#ifdef CONFIG_SMP
-   .irq_set_affinity   = gic_set_affinity,
-#endif
.irq_get_irqchip_state  = gic_irq_get_irqchip_state,
.irq_set_irqchip_state  = gic_irq_set_irqchip_state,
.irq_set_vcpu_affinity  = gic_irq_set_vcpu_affinity,
@@ -443,7 +437,7 @@ static void gic_cpu_if_up(struct gic_chip_data *gic)
u32 bypass = 0;
u32 mode = 0;
 
-   if (static_key_true(&supports_deactivate))
+   if (gic == &gic_data[0] && static_key_true(&supports_deactivate))
mode = GIC_CPU_CTRL_EOImodeNS;
 
/*
@@ -1039,6 +1033,11 @@ static void __init __gic_init_bases(unsigned int gic_nr, 
int irq_start,
gic->chip.

[GIT pull] timer updates for 4.5

2016-02-14 Thread Thomas Gleixner

Linus,

please pull the latest timers-urgent-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
timers-urgent-for-linus

A single fix preventing a 32bit overflow in timespec/val to cputime
conversions on 32bit machines.
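
To see why the cast in the diff below matters, here is a minimal userspace sketch, not the kernel code. It assumes a 32-bit `long`, modelled with `uint32_t` so that the wraparound is well defined, and the helper names `to_ns_truncated()` and `to_ns_widened()` are hypothetical.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u

/*
 * Models the old expression: the multiply and add happen in 32 bits
 * and wrap before the result is widened to 64 bits.
 */
static uint64_t to_ns_truncated(uint32_t sec, uint32_t nsec)
{
	uint32_t ns = sec * NSEC_PER_SEC + nsec;	/* wraps modulo 2^32 */

	return ns;
}

/* Models the fixed expression: widen the seconds first, then multiply. */
static uint64_t to_ns_widened(uint32_t sec, uint32_t nsec)
{
	return (uint64_t)sec * NSEC_PER_SEC + nsec;
}
```

With this model, 4 seconds still fits in 32 bits of nanoseconds, while 5 seconds already wraps in the truncated variant.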

Thanks,

tglx

-->
zengtao (1):
  cputime: Prevent 32bit overflow in time[val|spec]_to_cputime()


 include/asm-generic/cputime_nsecs.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/cputime_nsecs.h 
b/include/asm-generic/cputime_nsecs.h
index 0419485891f2..0f1c6f315cdc 100644
--- a/include/asm-generic/cputime_nsecs.h
+++ b/include/asm-generic/cputime_nsecs.h
@@ -75,7 +75,7 @@ typedef u64 __nocast cputime64_t;
  */
 static inline cputime_t timespec_to_cputime(const struct timespec *val)
 {
-   u64 ret = val->tv_sec * NSEC_PER_SEC + val->tv_nsec;
+   u64 ret = (u64)val->tv_sec * NSEC_PER_SEC + val->tv_nsec;
return (__force cputime_t) ret;
 }
 static inline void cputime_to_timespec(const cputime_t ct, struct timespec 
*val)
@@ -91,7 +91,8 @@ static inline void cputime_to_timespec(const cputime_t ct, 
struct timespec *val)
  */
 static inline cputime_t timeval_to_cputime(const struct timeval *val)
 {
-   u64 ret = val->tv_sec * NSEC_PER_SEC + val->tv_usec * NSEC_PER_USEC;
+   u64 ret = (u64)val->tv_sec * NSEC_PER_SEC +
+   val->tv_usec * NSEC_PER_USEC;
return (__force cputime_t) ret;
 }
 static inline void cputime_to_timeval(const cputime_t ct, struct timeval *val)

Re: [RFC PATCH] SPI/ACPI: DesignWare: Add ACPI support for Designware SPI driver

2016-02-14 Thread Jiang Qiu


Hi Mark,

Many thanks for your review. I'm sorry for the late reply; I was away for the
Chinese New Year holiday. See my replies below.

Best Regards
Jiang

On 2016/2/5 19:09, Mark Brown wrote:

On Fri, Feb 05, 2016 at 03:11:20PM +0800, qiujiang wrote:


This patch added ACPI support for DesignWare SPI mmio driver. It
was based the corresponding DT driver and compatible for this two
way. This patch has been tested on Hisilicon D02 board. It relies
on the GPIO patchset.

Intel are heavy users of this driver on their systems which also use
ACPI.  Have you discussed this binding with them?  I've copied Andy and
Jarkko who've worked on the driver recently.

 I'm going to ask Andy for some ideas on how to use this spi-dw-mmio
 driver with an ACPI binding.


Please use subject lines matching the style for the subsystem.  This
makes it easier for people to identify relevant patches.

Thanks for the reminder, I will fix it in the next version.



+   char propname[32];

That's a magic number, where did it come from and why is it a magic
number?

I'm sorry that this has no comment. The size comes from gpiolib.c, where it is
the maximum length of a GPIO property name. The reference code is at line 1815
of gpiolib.c.
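
One possible way to address the magic-number concern is to name the size and check for truncation when formatting the property name. This is only a sketch in plain userspace C: `CS_PROPNAME_MAX` and `format_cs_propname()` are hypothetical names, not existing kernel identifiers.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical named constant replacing the bare 32 in the patch. */
#define CS_PROPNAME_MAX 32

/*
 * Format the chip-select property name "cs<index>" into buf.
 * Returns 0 on success, -1 if the name did not fit or formatting failed.
 */
static int format_cs_propname(char *buf, size_t len, int index)
{
	int n = snprintf(buf, len, "cs%d", index);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

A caller would declare `char propname[CS_PROPNAME_MAX];` and pass `sizeof(propname)` as the length.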

+   if (ACPI_COMPANION(&pdev->dev)) {
+   for (i = 0; i < dws->num_cs; i++) {
+   snprintf(propname, sizeof(propname), "cs%d", i);
+   gpiod = devm_gpiod_get(&pdev->dev,
+   propname, GPIOD_ASIS);
+   if (IS_ERR(gpiod)) {
+   dev_err(&pdev->dev, "Get gpio desc failed!\n");
+   return PTR_ERR(gpiod);
+   }
+   }
+   }

I'm not seeing anywhere where we store the GPIO in this loop.  It is
therefore unclear to me how the chip select is going to work?

In the DT binding, of_get_named_gpio and devm_gpio_request are used to parse the
GPIO pins defined in the DT and then request them. For ACPI, devm_gpiod_get does
both operations in a single call. It is a unified interface for both the ACPI
and DT bindings.

If the returned gpiod is valid, the corresponding GPIO pin has been requested,
so we do not need to keep the gpiod.

Which GPIO pin is used is recorded in spi_device, in the cs_gpio field, and the
pin is configured in the setup callback of each device. All the SPI master has
to do is request these pins from the GPIO subsystem.

+static const struct acpi_device_id dw_spi_mmio_acpi_match[] = {
+   {"HISI0171", 0},
+   { }
+};
+MODULE_DEVICE_TABLE(acpi, dw_spi_mmio_acpi_match);

I really do wish ACPI had some more sensible system for allocating
device IDs so the tables were a little more legible. :(

This really is a problem; I will pass this feedback on to the ACPI maintainers.

[GIT pull] locking updates for 4.5

2016-02-14 Thread Thomas Gleixner

Linus,

please pull the latest locking-urgent-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
locking-urgent-for-linus

A single fix for the stack trace caching logic in lockdep, where the duplicate
avoidance managed to store no back trace at all.
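
As an illustration only, the toy model below contrasts the two flag schemes visible in the diff. It assumes one scenario that is my reading and not stated in this message: the first call returns early, before it reaches the save. All names (`save_trace_model()`, `add_dep_old()`, `add_dep_new()`, `run_old()`, `run_new()`) are hypothetical.

```c
#include <stdbool.h>

/* Counts how many times a trace was actually saved. */
static int traces_saved;

static bool save_trace_model(void)
{
	traces_saved++;
	return true;
}

/* Old scheme: the caller sets trylock_loop after the first iteration. */
static bool add_dep_old(bool already_known, int trylock_loop)
{
	if (already_known)
		return true;	/* early return, nothing saved */
	if (!trylock_loop && !save_trace_model())
		return false;
	return true;
}

/* New scheme: the flag is set only once a trace was really saved. */
static bool add_dep_new(bool already_known, int *stack_saved)
{
	if (already_known)
		return true;
	if (!*stack_saved) {
		if (!save_trace_model())
			return false;
		*stack_saved = 1;
	}
	return true;
}

/* Run the old scheme over n dependencies; return the number of saves. */
static int run_old(const bool *known, int n)
{
	int trylock_loop = 0;

	traces_saved = 0;
	for (int i = 0; i < n; i++) {
		add_dep_old(known[i], trylock_loop);
		trylock_loop = 1;	/* set regardless of what happened */
	}
	return traces_saved;
}

/* Run the new scheme over n dependencies; return the number of saves. */
static int run_new(const bool *known, int n)
{
	int stack_saved = 0;

	traces_saved = 0;
	for (int i = 0; i < n; i++)
		add_dep_new(known[i], &stack_saved);
	return traces_saved;
}
```

In this model, when the first dependency is already known, the old scheme saves nothing for the later ones, while the new scheme saves exactly one trace.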

Thanks,

tglx

-->
Dmitry Vyukov (1):
  locking/lockdep: Fix stack trace caching logic


 kernel/locking/lockdep.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 60ace56618f6..c7710e4092ef 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1822,7 +1822,7 @@ check_deadlock(struct task_struct *curr, struct held_lock 
*next,
  */
 static int
 check_prev_add(struct task_struct *curr, struct held_lock *prev,
-  struct held_lock *next, int distance, int trylock_loop)
+  struct held_lock *next, int distance, int *stack_saved)
 {
struct lock_list *entry;
int ret;
@@ -1883,8 +1883,11 @@ check_prev_add(struct task_struct *curr, struct 
held_lock *prev,
}
}
 
-   if (!trylock_loop && !save_trace(&trace))
-   return 0;
+   if (!*stack_saved) {
+   if (!save_trace(&trace))
+   return 0;
+   *stack_saved = 1;
+   }
 
/*
 * Ok, all validations passed, add the new lock
@@ -1907,6 +1910,8 @@ check_prev_add(struct task_struct *curr, struct held_lock 
*prev,
 * Debugging printouts:
 */
if (verbose(hlock_class(prev)) || verbose(hlock_class(next))) {
+   /* We drop graph lock, so another thread can overwrite trace. */
+   *stack_saved = 0;
graph_unlock();
printk("\n new dependency: ");
print_lock_name(hlock_class(prev));
@@ -1929,7 +1934,7 @@ static int
 check_prevs_add(struct task_struct *curr, struct held_lock *next)
 {
int depth = curr->lockdep_depth;
-   int trylock_loop = 0;
+   int stack_saved = 0;
struct held_lock *hlock;
 
/*
@@ -1956,7 +1961,7 @@ check_prevs_add(struct task_struct *curr, struct 
held_lock *next)
 */
if (hlock->read != 2 && hlock->check) {
if (!check_prev_add(curr, hlock, next,
-   distance, trylock_loop))
+   distance, &stack_saved))
return 0;
/*
 * Stop after the first non-trylock entry,
@@ -1979,7 +1984,6 @@ check_prevs_add(struct task_struct *curr, struct 
held_lock *next)
if (curr->held_locks[depth].irq_context !=
curr->held_locks[depth-1].irq_context)
break;
-   trylock_loop = 1;
}
return 1;
 out_bug:

[GIT pull] perf updates for 4.5

2016-02-14 Thread Thomas Gleixner

Linus,

please pull the latest perf-urgent-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
perf-urgent-for-linus

Another round of fixes for the perf tooling side:

  - Prevent a NULL pointer dereference in tracepoint error handling

  - Fix a thread handling bug in the intel_pt error handling code

  - Search both .eh_frame and .debug_frame sections as toolchains seem to have
random choices of storing the CFI information

  - Fix the perf stat interval output values, which got broken when fixing
the overall output

Thanks,

tglx

-->
Adrian Hunter (2):
  perf tools: tracepoint_error() can receive e=NULL, robustify it
  perf tools: Fix thread lifetime related segfaut in intel_pt

Hemant Kumar (1):
  perf probe: Search both .eh_frame and .debug_frame sections for probe 
location

Jiri Olsa (1):
  perf stat: Fix interval output values


 tools/perf/util/intel-pt.c |  9 ++
 tools/perf/util/parse-events.c |  3 ++
 tools/perf/util/probe-finder.c | 62 +-
 tools/perf/util/probe-finder.h |  5 +++-
 tools/perf/util/stat.c | 10 +++
 5 files changed, 63 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 81a2eb77ba7f..05d815851be1 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2068,6 +2068,15 @@ int intel_pt_process_auxtrace_info(union perf_event 
*event,
err = -ENOMEM;
goto err_free_queues;
}
+
+   /*
+* Since this thread will not be kept in any rbtree not in a
+* list, initialize its list node so that at thread__put() the
+* current thread lifetime assuption is kept and we don't segfault
+* at list_del_init().
+*/
+   INIT_LIST_HEAD(&pt->unknown_thread->node);
+
err = thread__set_comm(pt->unknown_thread, "unknown", 0);
if (err)
goto err_delete_thread;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4f7b0efdde2f..813d9b272c81 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -399,6 +399,9 @@ static void tracepoint_error(struct parse_events_error *e, 
int err,
 {
char help[BUFSIZ];
 
+   if (!e)
+   return;
+
/*
 * We get error directly from syscall errno ( > 0),
 * or from encoded pointer's error ( < 0).
diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 2be10fb27172..4ce5c5e18f48 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -686,8 +686,9 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct 
probe_finder *pf)
pf->fb_ops = NULL;
 #if _ELFUTILS_PREREQ(0, 142)
} else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa &&
-  pf->cfi != NULL) {
-   if (dwarf_cfi_addrframe(pf->cfi, pf->addr, &frame) != 0 ||
+  (pf->cfi_eh != NULL || pf->cfi_dbg != NULL)) {
+   if ((dwarf_cfi_addrframe(pf->cfi_eh, pf->addr, &frame) != 0 &&
+(dwarf_cfi_addrframe(pf->cfi_dbg, pf->addr, &frame) != 0)) 
||
dwarf_frame_cfa(frame, &pf->fb_ops, &nops) != 0) {
pr_warning("Failed to get call frame on 0x%jx\n",
   (uintmax_t)pf->addr);
@@ -1015,8 +1016,7 @@ static int pubname_search_cb(Dwarf *dbg, Dwarf_Global 
*gl, void *data)
return DWARF_CB_OK;
 }
 
-/* Find probe points from debuginfo */
-static int debuginfo__find_probes(struct debuginfo *dbg,
+static int debuginfo__find_probe_location(struct debuginfo *dbg,
  struct probe_finder *pf)
 {
struct perf_probe_point *pp = &pf->pev->point;
@@ -1025,27 +1025,6 @@ static int debuginfo__find_probes(struct debuginfo *dbg,
Dwarf_Die *diep;
int ret = 0;
 
-#if _ELFUTILS_PREREQ(0, 142)
-   Elf *elf;
-   GElf_Ehdr ehdr;
-   GElf_Shdr shdr;
-
-   /* Get the call frame information from this dwarf */
-   elf = dwarf_getelf(dbg->dbg);
-   if (elf == NULL)
-   return -EINVAL;
-
-   if (gelf_getehdr(elf, &ehdr) == NULL)
-   return -EINVAL;
-
-   if (elf_section_by_name(elf, &ehdr, &shdr, ".eh_frame", NULL) &&
-   shdr.sh_type == SHT_PROGBITS) {
-   pf->cfi = dwarf_getcfi_elf(elf);
-   } else {
-   pf->cfi = dwarf_getcfi(dbg->dbg);
-   }
-#endif
-
off = 0;
pf->lcache = intlist__new(NULL);
if (!pf->lcache)
@@ -1108,6 +1087,39 @@ found:
return ret;
 }
 
+/* Find probe points from debuginfo */
+static int debuginfo__find_probes(struct debuginfo *dbg,
+ struct probe_finder *pf)
+{
+   int ret = 0;
+
+#if _ELFUTILS_PREREQ(0, 142)
+   Elf *elf;
+   GElf_Ehdr ehdr;
+

Re: [RFC PATCH] SPI/ACPI: DesignWare: Add ACPI support for Designware SPI driver

2016-02-14 Thread Jiang Qiu


Hi Andy,

Sorry for the late reply; I was away for the Chinese New Year holiday. See my replies below.


Best Regards
Jiang

On 2016/2/5 23:55, Andy Shevchenko wrote:

On Fri, Feb 5, 2016 at 9:11 AM, qiujiang  wrote:

This patch added ACPI support for DesignWare SPI mmio driver. It
was based the corresponding DT driver and compatible for this two
way. This patch has been tested on Hisilicon D02 board. It relies
on the GPIO patchset.

My comments below.

As Mark mentioned, I would like to ask how to use this spi-dw-mmio driver with
an ACPI binding. Does it need any extra patchset?



@@ -84,8 +87,6 @@ static int dw_spi_mmio_probe(struct platform_device *pdev)
 dws->num_cs = num_cs;

 if (pdev->dev.of_node) {
-   int i;
-
 for (i = 0; i < dws->num_cs; i++) {
 int cs_gpio = of_get_named_gpio(pdev->dev.of_node,
 "cs-gpios", i);

It seems the driver was never validated with more than one chip select.
Perhaps someone has to switch to use of_spi_register_master() here.

of_spi_register_master() is called from spi_register_master(), but it only
saves cs_gpios in the spi_master and does not use them afterwards.

@@ -104,6 +105,18 @@ static int dw_spi_mmio_probe(struct platform_device *pdev)
 }
 }

+   if (ACPI_COMPANION(&pdev->dev)) {
+   for (i = 0; i < dws->num_cs; i++) {
+   snprintf(propname, sizeof(propname), "cs%d", i);
+   gpiod = devm_gpiod_get(&pdev->dev,
+   propname, GPIOD_ASIS);
+   if (IS_ERR(gpiod)) {
+   dev_err(&pdev->dev, "Get gpio desc failed!\n");
+   return PTR_ERR(gpiod);
+   }
+   }
+   }

Like Mark noticed there is also same issue. Do you indeed check the
configuration with different chip select signals?

As an SPI master driver, it seems it must support multiple chip selects, so I
think this check is necessary.

Re: [PATCH v2 3/3] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous

2016-02-14 Thread zhong jiang

On 2016/2/6 0:11, Joonsoo Kim wrote:
> 2016-02-05 9:49 GMT+09:00 Andrew Morton :
>> On Thu,  4 Feb 2016 15:19:35 +0900 Joonsoo Kim  wrote:
>>
>>> There is a performance drop report due to hugepage allocation and in there
>>> half of cpu time are spent on pageblock_pfn_to_page() in compaction [1].
>>> In that workload, compaction is triggered to make hugepage but most of
>>> pageblocks are un-available for compaction due to pageblock type and
>>> skip bit so compaction usually fails. Most costly operations in this case
>>> is to find valid pageblock while scanning whole zone range. To check
>>> if pageblock is valid to compact, valid pfn within pageblock is required
>>> and we can obtain it by calling pageblock_pfn_to_page(). This function
>>> checks whether pageblock is in a single zone and return valid pfn
>>> if possible. Problem is that we need to check it every time before
>>> scanning pageblock even if we re-visit it and this turns out to
>>> be very expensive in this workload.
>>>
>>> Although we have no way to skip this pageblock check in the system
>>> where hole exists at arbitrary position, we can use cached value for
>>> zone continuity and just do pfn_to_page() in the system where hole doesn't
>>> exist. This optimization considerably speeds up in above workload.
>>>
>>> Before vs After
>>> Max: 1096 MB/s vs 1325 MB/s
>>> Min: 635 MB/s 1015 MB/s
>>> Avg: 899 MB/s 1194 MB/s
>>>
>>> Avg is improved by roughly 30% [2].
>>>
>>> [1]: http://www.spinics.net/lists/linux-mm/msg97378.html
>>> [2]: https://lkml.org/lkml/2015/12/9/23
>>>
>>> ...
>>>
>>> --- a/include/linux/memory_hotplug.h
>>> +++ b/include/linux/memory_hotplug.h
>>> @@ -196,6 +196,9 @@ void put_online_mems(void);
>>>  void mem_hotplug_begin(void);
>>>  void mem_hotplug_done(void);
>>>
>>> +extern void set_zone_contiguous(struct zone *zone);
>>> +extern void clear_zone_contiguous(struct zone *zone);
>>> +
>>>  #else /* ! CONFIG_MEMORY_HOTPLUG */
>>>  /*
>>>   * Stub functions for when hotplug is off
>>
>> Was it really intended that these declarations only exist if
>> CONFIG_MEMORY_HOTPLUG?  Seems unrelated.
> 
> These are called for caching memory layout whether it is contiguous
> or not. So, they are always called in memory initialization. Then,
> hotplug could change memory layout so they should be called
> there, too. So, they are defined in page_alloc.c and exported only
> if CONFIG_MEMORY_HOTPLUG.
> 
>> The i386 allnocofnig build fails in preditable ways so I fixed that up
>> as below, but it seems wrong.
> 
> Yeah, it seems wrong to me. :)
> Here goes fix.
> 
> --->8
>>From ed6add18bc361e00a7ac6746de6eeb62109e6416 Mon Sep 17 00:00:00 2001
> From: Joonsoo Kim 
> Date: Thu, 10 Dec 2015 17:03:54 +0900
> Subject: [PATCH] mm/compaction: speed up pageblock_pfn_to_page() when zone is
>  contiguous
> 
> There is a performance drop report due to hugepage allocation and in there
> half of cpu time are spent on pageblock_pfn_to_page() in compaction [1].
> In that workload, compaction is triggered to make hugepage but most of
> pageblocks are un-available for compaction due to pageblock type and
> skip bit so compaction usually fails. Most costly operations in this case
> is to find valid pageblock while scanning whole zone range. To check
> if pageblock is valid to compact, valid pfn within pageblock is required
> and we can obtain it by calling pageblock_pfn_to_page(). This function
> checks whether pageblock is in a single zone and return valid pfn
> if possible. Problem is that we need to check it every time before
> scanning pageblock even if we re-visit it and this turns out to
> be very expensive in this workload.
> 
> Although we have no way to skip this pageblock check in the system
> where hole exists at arbitrary position, we can use cached value for
> zone continuity and just do pfn_to_page() in the system where hole doesn't
> exist. This optimization considerably speeds up in above workload.
> 
> Before vs After
> Max: 1096 MB/s vs 1325 MB/s
> Min: 635 MB/s 1015 MB/s
> Avg: 899 MB/s 1194 MB/s
> 
> Avg is improved by roughly 30% [2].
> 
> [1]: http://www.spinics.net/lists/linux-mm/msg97378.html
> [2]: https://lkml.org/lkml/2015/12/9/23
> 
> v3
> o remove pfn_valid_within() check for all pages in the pageblock
> because pageblock_pfn_to_page() is only called with pageblock aligned pfn.

I have a question about zone continuity. A hole can exist at an arbitrary
position inside a pageblock, so pageblock_pfn_to_page() alone is insufficient:
whether or not the pfn is pageblock aligned, the pfn_valid_within() check is
still necessary.

e.g. 120M-122M is one pageblock, but 120.5M-121.5M is a hole. A result based
only on pageblock_pfn_to_page() would be inaccurate.

Thanks
zhongjiang
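
The scenario above can be put into a toy model. This is a sketch of the concern only, under the assumption that the block check looks at just the first and last page frame. It says nothing about whether a real memory layout can contain such a hole. `PAGEBLOCK_NR_PAGES`, `block_ends_valid()` and `block_fully_valid()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGEBLOCK_NR_PAGES 8	/* toy size, not the kernel value */

/* Check only the first and last page frame of the block. */
static bool block_ends_valid(const bool *pfn_valid, size_t start)
{
	return pfn_valid[start] &&
	       pfn_valid[start + PAGEBLOCK_NR_PAGES - 1];
}

/* Check every page frame of the block. */
static bool block_fully_valid(const bool *pfn_valid, size_t start)
{
	for (size_t i = 0; i < PAGEBLOCK_NR_PAGES; i++) {
		if (!pfn_valid[start + i])
			return false;
	}
	return true;
}
```

For a block whose interior frames are invalid, the endpoint check passes while the full scan fails.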

> v2
> o checking zone continuity after initialization
> o handle memory-hotplug case
> 
> Reported and Tested-by: Aaron Lu 
> Signed-off-by: Joonsoo Kim 
> ---
>  include/linux/gfp.h|  6 
>  include/linux/memory_hotplug

[PATCH 1/7] f2fs: introduce f2fs_journal struct to wrap journal info

2016-02-14 Thread Chao Yu

Introduce a new structure f2fs_journal to wrap journal info in struct
f2fs_summary_block for readability.

struct f2fs_journal {
union {
__le16 n_nats;
__le16 n_sits;
};
union {
struct nat_journal nat_j;
struct sit_journal sit_j;
struct f2fs_extra_info info;
};
} __packed;

struct f2fs_summary_block {
struct f2fs_summary entries[ENTRIES_IN_SUM];
struct f2fs_journal journal;
struct summary_footer footer;
} __packed;
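
A simplified userspace sketch of how callers reach the counters through the new wrapper, for illustration only. It keeps just the `n_nats` field, stores it in host byte order instead of `__le16`, and uses a made-up capacity `MODEL_NAT_ENTRIES`. The `*_model` names are hypothetical and are not the f2fs definitions.

```c
#include <stdint.h>

#define MODEL_NAT_ENTRIES 4	/* placeholder capacity, not the f2fs value */

/* Reduced journal: host-order counter where the patch uses __le16. */
struct journal_model {
	uint16_t n_nats;
};

/* Reduced summary block: the journal is now a named member. */
struct summary_block_model {
	struct journal_model journal;
};

static int nats_in_journal(const struct journal_model *j)
{
	return j->n_nats;
}

/* Add delta to the counter and return the previous value. */
static int update_nats(struct journal_model *j, int delta)
{
	int before = nats_in_journal(j);

	j->n_nats = (uint16_t)(before + delta);
	return before;
}

/* True if size more entries still fit in the journal. */
static int has_nat_space(const struct journal_model *j, int size)
{
	return size <= MODEL_NAT_ENTRIES - nats_in_journal(j);
}
```

Helpers take the journal pointer directly, so a caller holding a summary block passes `&blk.journal` rather than the whole block.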

Signed-off-by: Chao Yu 
---
 fs/f2fs/checkpoint.c|  2 +-
 fs/f2fs/f2fs.h  | 39 +--
 fs/f2fs/node.c  | 42 --
 fs/f2fs/segment.c   | 54 -
 fs/f2fs/super.c |  2 +-
 include/linux/f2fs_fs.h | 10 ++---
 6 files changed, 77 insertions(+), 72 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 536bec9..3cdcdaf 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1051,7 +1051,7 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct 
cp_control *cpc)
if (sb->s_bdev->bd_part)
kbytes_written += BD_PART_WRITTEN(sbi);
 
-   seg_i->sum_blk->info.kbytes_written = cpu_to_le64(kbytes_written);
+   seg_i->sum_blk->journal.info.kbytes_written = 
cpu_to_le64(kbytes_written);
 
if (__remain_node_summaries(cpc->reason)) {
write_node_summaries(sbi, start_blk);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 0d2b1ba..c4c7bf1 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -183,37 +183,37 @@ struct fsync_inode_entry {
block_t last_inode; /* block address locating the last inode */
 };
 
-#define nats_in_cursum(sum)(le16_to_cpu(sum->n_nats))
-#define sits_in_cursum(sum)(le16_to_cpu(sum->n_sits))
+#define nats_in_cursum(jnl)(le16_to_cpu(jnl->n_nats))
+#define sits_in_cursum(jnl)(le16_to_cpu(jnl->n_sits))
 
-#define nat_in_journal(sum, i) (sum->nat_j.entries[i].ne)
-#define nid_in_journal(sum, i) (sum->nat_j.entries[i].nid)
-#define sit_in_journal(sum, i) (sum->sit_j.entries[i].se)
-#define segno_in_journal(sum, i)   (sum->sit_j.entries[i].segno)
+#define nat_in_journal(jnl, i) (jnl->nat_j.entries[i].ne)
+#define nid_in_journal(jnl, i) (jnl->nat_j.entries[i].nid)
+#define sit_in_journal(jnl, i) (jnl->sit_j.entries[i].se)
+#define segno_in_journal(jnl, i)   (jnl->sit_j.entries[i].segno)
 
-#define MAX_NAT_JENTRIES(sum)  (NAT_JOURNAL_ENTRIES - nats_in_cursum(sum))
-#define MAX_SIT_JENTRIES(sum)  (SIT_JOURNAL_ENTRIES - sits_in_cursum(sum))
+#define MAX_NAT_JENTRIES(jnl)  (NAT_JOURNAL_ENTRIES - nats_in_cursum(jnl))
+#define MAX_SIT_JENTRIES(jnl)  (SIT_JOURNAL_ENTRIES - sits_in_cursum(jnl))
 
-static inline int update_nats_in_cursum(struct f2fs_summary_block *rs, int i)
+static inline int update_nats_in_cursum(struct f2fs_journal *journal, int i)
 {
-   int before = nats_in_cursum(rs);
-   rs->n_nats = cpu_to_le16(before + i);
+   int before = nats_in_cursum(journal);
+   journal->n_nats = cpu_to_le16(before + i);
return before;
 }
 
-static inline int update_sits_in_cursum(struct f2fs_summary_block *rs, int i)
+static inline int update_sits_in_cursum(struct f2fs_journal *journal, int i)
 {
-   int before = sits_in_cursum(rs);
-   rs->n_sits = cpu_to_le16(before + i);
+   int before = sits_in_cursum(journal);
+   journal->n_sits = cpu_to_le16(before + i);
return before;
 }
 
-static inline bool __has_cursum_space(struct f2fs_summary_block *sum, int size,
-   int type)
+static inline bool __has_cursum_space(struct f2fs_journal *journal,
+   int size, int type)
 {
if (type == NAT_JOURNAL)
-   return size <= MAX_NAT_JENTRIES(sum);
-   return size <= MAX_SIT_JENTRIES(sum);
+   return size <= MAX_NAT_JENTRIES(journal);
+   return size <= MAX_SIT_JENTRIES(journal);
 }
 
 /*
@@ -1862,8 +1862,7 @@ void f2fs_wait_on_page_writeback(struct page *, enum 
page_type, bool);
 void f2fs_wait_on_encrypted_page_writeback(struct f2fs_sb_info *, block_t);
 void write_data_summaries(struct f2fs_sb_info *, block_t);
 void write_node_summaries(struct f2fs_sb_info *, block_t);
-int lookup_journal_in_cursum(struct f2fs_summary_block *,
-   int, unsigned int, int);
+int lookup_journal_in_cursum(struct f2fs_journal *, int, unsigned int, int);
 void flush_sit_entries(struct f2fs_sb_info *, struct cp_control *);
 int build_segment_manager(struct f2fs_sb_info *);
 void destroy_segment_manager(struct f2fs_sb_info *);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 150907f..8230e35 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/

[PATCH 2/7] f2fs: split journal cache from curseg cache

2016-02-14 Thread Chao Yu

In the curseg cache, f2fs caches two different parts:
 - data of the current summary block
 - journal info consisting of sparse nat/sit entries
We use the same curseg_mutex lock to protect both caches.

With this approach, on the one hand, it may cause higher lock contention
when accessing/updating either part of the cache, since a single lock,
curseg_mutex, serializes access to both. On the other hand, for now, journal
info is cached in the current summary block after ->mount or the last
checkpoint, so these junk journals can be written back into most summary
blocks, which wastes the remaining space of the summary block.

So, in order to fix above issues, we split curseg cache into two parts:
a) journal cache, protected by r/w semaphore journal_rwsem
b) current summary block, protected by mutex lock curseg_mutex

Signed-off-by: Chao Yu 
---
 fs/f2fs/checkpoint.c |  2 +-
 fs/f2fs/node.c   | 26 
 fs/f2fs/segment.c| 85 +---
 fs/f2fs/segment.h|  2 ++
 4 files changed, 77 insertions(+), 38 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 3cdcdaf..c6d4259 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1051,7 +1051,7 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
if (sb->s_bdev->bd_part)
kbytes_written += BD_PART_WRITTEN(sbi);
 
-   seg_i->sum_blk->journal.info.kbytes_written = cpu_to_le64(kbytes_written);
+   seg_i->journal->info.kbytes_written = cpu_to_le64(kbytes_written);
 
if (__remain_node_summaries(cpc->reason)) {
write_node_summaries(sbi, start_blk);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 8230e35..94b8016 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -354,7 +354,7 @@ void get_node_info(struct f2fs_sb_info *sbi, nid_t nid, struct node_info *ni)
 {
struct f2fs_nm_info *nm_i = NM_I(sbi);
struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
-   struct f2fs_journal *journal = &curseg->sum_blk->journal;
+   struct f2fs_journal *journal = curseg->journal;
nid_t start_nid = START_NID(nid);
struct f2fs_nat_block *nat_blk;
struct page *page = NULL;
@@ -381,13 +381,13 @@ void get_node_info(struct f2fs_sb_info *sbi, nid_t nid, struct node_info *ni)
down_write(&nm_i->nat_tree_lock);
 
/* Check current segment summary */
-   mutex_lock(&curseg->curseg_mutex);
+   down_read(&curseg->journal_rwsem);
i = lookup_journal_in_cursum(journal, NAT_JOURNAL, nid, 0);
if (i >= 0) {
ne = nat_in_journal(journal, i);
node_info_from_raw_nat(ni, &ne);
}
-   mutex_unlock(&curseg->curseg_mutex);
+   up_read(&curseg->journal_rwsem);
if (i >= 0)
goto cache;
 
@@ -1613,7 +1613,7 @@ static void build_free_nids(struct f2fs_sb_info *sbi)
 {
struct f2fs_nm_info *nm_i = NM_I(sbi);
struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
-   struct f2fs_journal *journal = &curseg->sum_blk->journal;
+   struct f2fs_journal *journal = curseg->journal;
int i = 0;
nid_t nid = nm_i->next_scan_nid;
 
@@ -1645,7 +1645,7 @@ static void build_free_nids(struct f2fs_sb_info *sbi)
nm_i->next_scan_nid = nid;
 
/* find free nids from current sum_pages */
-   mutex_lock(&curseg->curseg_mutex);
+   down_read(&curseg->journal_rwsem);
for (i = 0; i < nats_in_cursum(journal); i++) {
block_t addr;
 
@@ -1656,7 +1656,7 @@ static void build_free_nids(struct f2fs_sb_info *sbi)
else
remove_free_nid(nm_i, nid);
}
-   mutex_unlock(&curseg->curseg_mutex);
+   up_read(&curseg->journal_rwsem);
up_read(&nm_i->nat_tree_lock);
 
ra_meta_pages(sbi, NAT_BLOCK_OFFSET(nm_i->next_scan_nid),
@@ -1920,10 +1920,10 @@ static void remove_nats_in_journal(struct f2fs_sb_info *sbi)
 {
struct f2fs_nm_info *nm_i = NM_I(sbi);
struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
-   struct f2fs_journal *journal = &curseg->sum_blk->journal;
+   struct f2fs_journal *journal = curseg->journal;
int i;
 
-   mutex_lock(&curseg->curseg_mutex);
+   down_write(&curseg->journal_rwsem);
for (i = 0; i < nats_in_cursum(journal); i++) {
struct nat_entry *ne;
struct f2fs_nat_entry raw_ne;
@@ -1939,7 +1939,7 @@ static void remove_nats_in_journal(struct f2fs_sb_info *sbi)
__set_nat_cache_dirty(nm_i, ne);
}
update_nats_in_cursum(journal, -i);
-   mutex_unlock(&curseg->curseg_mutex);
+   up_write(&curseg->journal_rwsem);
 }
 
 static void __adjust_nat_entry_set(struct nat_entry_set *nes,
@@ -1964,7 +1964,7 @@ static void __flush_nat_entry_set(struct f2fs_sb_info *sbi,
struct nat_entry_set *set)
 {
struct

[PATCH 3/7] f2fs: reorder nat cache lock in cache_nat_entry

2016-02-14 Thread Chao Yu

In cache_nat_entry, if we fail to hit the nat cache, we try to load nat
entries from the journal of the current segment cache or from NAT pages for
updating. During the whole update, the write lock of nat_tree_lock is
held to avoid inconsistency.

But this is inefficient when updating the nat cache, because it
serializes access to the journal cache and reads of NAT pages.

Here, we reorder the locking and update flow as below:

 - get_node_info
  - down_read(nat_tree_lock)
  - lookup nat cache --- hit -> unlock & return
  - lookup journal cache --- hit -> unlock & goto update
  - up_read(nat_tree_lock)
update:
  - down_write(nat_tree_lock)
  - cache_nat_entry
   - lookup nat cache --- nohit -> update
  - up_write(nat_tree_lock)

Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 94b8016..966176b 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -257,15 +257,20 @@ static struct nat_entry *grab_nat_entry(struct f2fs_nm_info *nm_i, nid_t nid)
return new;
 }
 
-static void cache_nat_entry(struct f2fs_nm_info *nm_i, nid_t nid,
+static void cache_nat_entry(struct f2fs_sb_info *sbi, nid_t nid,
struct f2fs_nat_entry *ne)
 {
+   struct f2fs_nm_info *nm_i = NM_I(sbi);
struct nat_entry *e;
 
e = __lookup_nat_cache(nm_i, nid);
if (!e) {
e = grab_nat_entry(nm_i, nid);
node_info_from_raw_nat(&e->ni, ne);
+   } else {
+   f2fs_bug_on(sbi, nat_get_ino(e) != ne->ino ||
+   nat_get_blkaddr(e) != ne->block_addr ||
+   nat_get_version(e) != ne->version);
}
 }
 
@@ -371,15 +376,12 @@ void get_node_info(struct f2fs_sb_info *sbi, nid_t nid, struct node_info *ni)
ni->ino = nat_get_ino(e);
ni->blk_addr = nat_get_blkaddr(e);
ni->version = nat_get_version(e);
-   }
-   up_read(&nm_i->nat_tree_lock);
-   if (e)
+   up_read(&nm_i->nat_tree_lock);
return;
+   }
 
memset(&ne, 0, sizeof(struct f2fs_nat_entry));
 
-   down_write(&nm_i->nat_tree_lock);
-
/* Check current segment summary */
down_read(&curseg->journal_rwsem);
i = lookup_journal_in_cursum(journal, NAT_JOURNAL, nid, 0);
@@ -398,8 +400,10 @@ void get_node_info(struct f2fs_sb_info *sbi, nid_t nid, struct node_info *ni)
node_info_from_raw_nat(ni, &ne);
f2fs_put_page(page, 1);
 cache:
+   up_read(&nm_i->nat_tree_lock);
/* cache nat entry */
-   cache_nat_entry(NM_I(sbi), nid, &ne);
+   down_write(&nm_i->nat_tree_lock);
+   cache_nat_entry(sbi, nid, &ne);
up_write(&nm_i->nat_tree_lock);
 }
 
-- 
2.7.0.2.g1b0b6dd

[PATCH 4/7] f2fs: enhance IO path with block plug

2016-02-14 Thread Chao Yu

Try to use block plug in more places, as listed below, to let the process
cache bios as much as possible, in order to reduce the lock overhead of the
queue in the IO scheduler.
1) sync_meta_pages
2) ra_meta_pages
3) f2fs_balance_fs_bg

Signed-off-by: Chao Yu 
---
 fs/f2fs/checkpoint.c | 12 
 fs/f2fs/segment.c|  9 +++--
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index c6d4259..359a805 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -143,7 +143,6 @@ bool is_valid_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type)
 int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
int type, bool sync)
 {
-   block_t prev_blk_addr = 0;
struct page *page;
block_t blkno = start;
struct f2fs_io_info fio = {
@@ -152,10 +151,12 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
.rw = sync ? (READ_SYNC | REQ_META | REQ_PRIO) : READA,
.encrypted_page = NULL,
};
+   struct blk_plug plug;
 
if (unlikely(type == META_POR))
fio.rw &= ~REQ_META;
 
+   blk_start_plug(&plug);
for (; nrpages-- > 0; blkno++) {
 
if (!is_valid_blkaddr(sbi, blkno, type))
@@ -174,9 +175,6 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
/* get sit block addr */
fio.blk_addr = current_sit_addr(sbi,
blkno * SIT_ENTRY_PER_BLOCK);
-   if (blkno != start && prev_blk_addr + 1 != fio.blk_addr)
-   goto out;
-   prev_blk_addr = fio.blk_addr;
break;
case META_SSA:
case META_CP:
@@ -201,6 +199,7 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
}
 out:
f2fs_submit_merged_bio(sbi, META, READ);
+   blk_finish_plug(&plug);
return blkno - start;
 }
 
@@ -287,9 +286,12 @@ long sync_meta_pages(struct f2fs_sb_info *sbi, enum page_type type,
struct writeback_control wbc = {
.for_reclaim = 0,
};
+   struct blk_plug plug;
 
pagevec_init(&pvec, 0);
 
+   blk_start_plug(&plug);
+
while (index <= end) {
int i, nr_pages;
nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
@@ -342,6 +344,8 @@ stop:
if (nwritten)
f2fs_submit_merged_bio(sbi, type, WRITE);
 
+   blk_finish_plug(&plug);
+
return nwritten;
 }
 
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 7cb9a54..5d0e6e6 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -370,8 +370,13 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi)
excess_prefree_segs(sbi) ||
excess_dirty_nats(sbi) ||
(is_idle(sbi) && f2fs_time_over(sbi, CP_TIME))) {
-   if (test_opt(sbi, DATA_FLUSH))
+   if (test_opt(sbi, DATA_FLUSH)) {
+   struct blk_plug plug;
+
+   blk_start_plug(&plug);
sync_dirty_inodes(sbi, FILE_INODE);
+   blk_finish_plug(&plug);
+   }
f2fs_sync_fs(sbi->sb, true);
stat_inc_bg_cp_count(sbi->stat_info);
}
@@ -2189,7 +2194,7 @@ static void build_sit_entries(struct f2fs_sb_info *sbi)
int sit_blk_cnt = SIT_BLK_CNT(sbi);
unsigned int i, start, end;
unsigned int readed, start_blk = 0;
-   int nrpages = MAX_BIO_BLOCKS(sbi);
+   int nrpages = MAX_BIO_BLOCKS(sbi) * 8;
 
do {
readed = ra_meta_pages(sbi, start_blk, nrpages, META_SIT, true);
-- 
2.7.0.2.g1b0b6dd

[PATCH 5/7] f2fs crypto: set up encryption info for new inodes in f2fs_inherit_context()

2016-02-14 Thread Chao Yu

This patch syncs f2fs with commit 70295532 ("ext4 crypto: set up
encryption info for new inodes in ext4_inherit_context()") from ext4.

Set up the encryption information for newly created inodes immediately
after they inherit their encryption context from their parent
directories.

Signed-off-by: Theodore Ts'o 
Signed-off-by: Chao Yu 
---
 fs/f2fs/crypto_policy.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/crypto_policy.c b/fs/f2fs/crypto_policy.c
index d4a96af..0fb08b0 100644
--- a/fs/f2fs/crypto_policy.c
+++ b/fs/f2fs/crypto_policy.c
@@ -203,7 +203,10 @@ int f2fs_inherit_context(struct inode *parent, struct inode *child,
F2FS_KEY_DESCRIPTOR_SIZE);
 
get_random_bytes(ctx.nonce, F2FS_KEY_DERIVATION_NONCE_SIZE);
-   return f2fs_setxattr(child, F2FS_XATTR_INDEX_ENCRYPTION,
+   res = f2fs_setxattr(child, F2FS_XATTR_INDEX_ENCRYPTION,
F2FS_XATTR_NAME_ENCRYPTION_CONTEXT, &ctx,
sizeof(ctx), ipage, XATTR_CREATE);
+   if (!res)
+   res = f2fs_get_encryption_info(child);
+   return res;
 }
-- 
2.7.0.2.g1b0b6dd

[PATCH 6/7] f2fs crypto: make sure the encryption info is initialized on opendir(2)

2016-02-14 Thread Chao Yu

This patch syncs f2fs with commit 6bc445e0ff44 ("ext4 crypto: make
sure the encryption info is initialized on opendir(2)") from ext4.

Signed-off-by: Theodore Ts'o 
Signed-off-by: Chao Yu 
---
 fs/f2fs/dir.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index ca41b2a..8f09da0 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -902,11 +902,19 @@ out:
return err;
 }
 
+static int f2fs_dir_open(struct inode *inode, struct file *filp)
+{
+   if (f2fs_encrypted_inode(inode))
+   return f2fs_get_encryption_info(inode) ? -EACCES : 0;
+   return 0;
+}
+
 const struct file_operations f2fs_dir_operations = {
.llseek = generic_file_llseek,
.read   = generic_read_dir,
.iterate= f2fs_readdir,
.fsync  = f2fs_sync_file,
+   .open   = f2fs_dir_open,
.unlocked_ioctl = f2fs_ioctl,
 #ifdef CONFIG_COMPAT
.compat_ioctl   = f2fs_compat_ioctl,
-- 
2.7.0.2.g1b0b6dd

Re: [BUG] ODEBUG: assert_init not available (active state 0)

2016-02-14 Thread Chris Bainbridge

On 5 February 2016 at 02:52, Zheng, Lv  wrote:
>
> So you could wait for just several days to test it again.
> And report here if you can still see issues.

I didn't notice any revert last week, is there something specific that
I should test? I just noticed that there are graphical glitches in
Firefox on Youtube as well.

[PATCH 7/7] f2fs crypto: handle unexpected lack of encryption keys

2016-02-14 Thread Chao Yu

This patch syncs f2fs with commit abdd438b26b4 ("ext4 crypto: handle
unexpected lack of encryption keys") from ext4.

Fix up attempts by users to try to write to a file when they don't
have access to the encryption key.
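
As a rough sketch of the error-handling change this describes (returning
an error instead of hitting BUG_ON when the key is missing), assuming a
simplified, hypothetical struct demo_inode and demo_inherit_context()
that are not part of f2fs:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ENOKEY is a Linux errno; the fallback value is only for other systems. */
#ifndef ENOKEY
#define ENOKEY 126
#endif

/* Hypothetical stand-in for an inode with optional crypto info. */
struct demo_inode {
	const void *crypt_info;	/* NULL when the key is not available */
};

/*
 * Report a missing key to the caller as -ENOKEY rather than treating it
 * as a fatal bug. Returns 0 when the parent's crypto info is present.
 */
static int demo_inherit_context(const struct demo_inode *parent)
{
	if (parent->crypt_info == NULL)
		return -ENOKEY;
	return 0;
}
```

A caller can then propagate the negative errno to userspace instead of
the kernel crashing on the missing key.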

Signed-off-by: Theodore Ts'o 
Signed-off-by: Chao Yu 
---
 fs/f2fs/crypto_policy.c | 3 ++-
 fs/f2fs/file.c  | 6 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/crypto_policy.c b/fs/f2fs/crypto_policy.c
index 0fb08b0..4d7d7ff 100644
--- a/fs/f2fs/crypto_policy.c
+++ b/fs/f2fs/crypto_policy.c
@@ -192,7 +192,8 @@ int f2fs_inherit_context(struct inode *parent, struct inode *child,
return res;
 
ci = F2FS_I(parent)->i_crypt_info;
-   BUG_ON(ci == NULL);
+   if (ci == NULL)
+   return -ENOKEY;
 
ctx.format = F2FS_ENCRYPTION_CONTEXT_FORMAT_V1;
 
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 05f5f2f..8dea195 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -424,6 +424,8 @@ static int f2fs_file_mmap(struct file *file, struct vm_area_struct *vma)
err = f2fs_get_encryption_info(inode);
if (err)
return 0;
+   if (!f2fs_encrypted_inode(inode))
+   return -ENOKEY;
}
 
/* we don't need to use inline_data strictly */
@@ -443,7 +445,9 @@ static int f2fs_file_open(struct inode *inode, struct file *filp)
if (!ret && f2fs_encrypted_inode(inode)) {
ret = f2fs_get_encryption_info(inode);
if (ret)
-   ret = -EACCES;
+   return -EACCES;
+   if (!f2fs_encrypted_inode(inode))
+   return -ENOKEY;
}
return ret;
 }
-- 
2.7.0.2.g1b0b6dd

Regard of thermal power allocator's coefficients

2016-02-14 Thread Leo Yan

Hi there,

I'm trying to upstream IPA patches for 96board Hikey, but so far
there is no standard DT binding for passing IPA coefficients for
power modeling.

So I'd first like to confirm whether we should pass the coefficients
through the device tree. Is anyone working on this?

Another, more straightforward method is to include the power model's
coefficients directly in the thermal sensor driver (such as
drivers/thermal/hisi_thermal.c), but my concern is that this method
puts SoC-specific data in the common thermal sensor driver,
so is this doable?

Any suggestions are welcome.

Thanks,
Leo Yan

Re: [PATCH v6 00/12] arm-cci: PMU driver updates

2016-02-14 Thread Suzuki K. Poulose


On 25/01/16 11:21, Suzuki K. Poulose wrote:

This series includes:

  - Simplified sysfs attribute handling for CCI PMU (Patch 1)
  - Work around for writing to CCI-500/550(introduced later) PMU
counters (Patches 2-10)
  - Support for CCI-550 PMU (11-12) with Acked-bys.



Gentle ping

Cheers
Suzuki

[PATCH v3 09/11] KVM: MMU: simplify mmu_need_write_protect

2016-02-14 Thread Xiao Guangrong

Now that all non-leaf shadow pages are page tracked, if a gfn is not
tracked then no non-leaf shadow page exists for that gfn, so we can
directly mark the shadow pages of the gfn as unsync.

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/mmu.c | 29 +++--
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index e9dbd85..4986615 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2444,7 +2444,7 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_unprotect_page);
 
-static void __kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+static void kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 {
trace_kvm_mmu_unsync_page(sp);
++vcpu->kvm->stat.mmu_unsync;
@@ -2453,39 +2453,24 @@ static void __kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
kvm_mmu_mark_parents_unsync(sp);
 }
 
-static void kvm_unsync_pages(struct kvm_vcpu *vcpu,  gfn_t gfn)
-{
-   struct kvm_mmu_page *s;
-
-   for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn) {
-   if (s->unsync)
-   continue;
-   WARN_ON(s->role.level != PT_PAGE_TABLE_LEVEL);
-   __kvm_unsync_page(vcpu, s);
-   }
-}
-
 static bool mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
   bool can_unsync)
 {
-   struct kvm_mmu_page *s;
-   bool need_unsync = false;
+   struct kvm_mmu_page *sp;
 
if (kvm_page_track_check_mode(vcpu, gfn, KVM_PAGE_TRACK_WRITE))
return true;
 
-   for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn) {
+   for_each_gfn_indirect_valid_sp(vcpu->kvm, sp, gfn) {
if (!can_unsync)
return true;
 
-   if (s->role.level != PT_PAGE_TABLE_LEVEL)
-   return true;
+   if (sp->unsync)
+   continue;
 
-   if (!s->unsync)
-   need_unsync = true;
+   WARN_ON(sp->role.level != PT_PAGE_TABLE_LEVEL);
+   kvm_unsync_page(vcpu, sp);
}
-   if (need_unsync)
-   kvm_unsync_pages(vcpu, gfn);
 
return false;
 }
-- 
1.8.3.1

[PATCH v3 02/11] KVM: MMU: introduce kvm_mmu_gfn_{allow,disallow}_lpage

2016-02-14 Thread Xiao Guangrong

Abstract the common operations from account_shadowed() and
unaccount_shadowed(), then introduce kvm_mmu_gfn_disallow_lpage()
and kvm_mmu_gfn_allow_lpage().

These two functions will be used by page tracking in a later patch.
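
A minimal userspace sketch of the counting scheme, under simplifying
assumptions: struct demo_slot keeps one counter per huge-page level
rather than one per gfn index, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_LEVELS 2	/* hypothetical: two huge-page levels */

/* Simplified stand-in for per-slot large page info. */
struct demo_slot {
	int disallow_lpage[DEMO_LEVELS];
};

/* Shared helper: adjust the counter at every level by @count. */
static void demo_update_disallow_count(struct demo_slot *slot, int count)
{
	int i;

	for (i = 0; i < DEMO_LEVELS; i++) {
		slot->disallow_lpage[i] += count;
		assert(slot->disallow_lpage[i] >= 0);
	}
}

static void demo_disallow_lpage(struct demo_slot *slot)
{
	demo_update_disallow_count(slot, 1);
}

static void demo_allow_lpage(struct demo_slot *slot)
{
	demo_update_disallow_count(slot, -1);
}

/* A large page may be used only while no user has disallowed it. */
static bool demo_lpage_is_disallowed(const struct demo_slot *slot, int level)
{
	return slot->disallow_lpage[level] != 0;
}
```

Because the state is a counter rather than a flag, several independent
users (shadow pages now, page tracking later) can disallow large pages
for the same range without clobbering each other.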

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/mmu.c | 38 +-
 arch/x86/kvm/mmu.h |  3 +++
 2 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index de9e992..e1bb66c 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -776,21 +776,39 @@ static struct kvm_lpage_info *lpage_info_slot(gfn_t gfn,
return &slot->arch.lpage_info[level - 2][idx];
 }
 
+static void update_gfn_disallow_lpage_count(struct kvm_memory_slot *slot,
+   gfn_t gfn, int count)
+{
+   struct kvm_lpage_info *linfo;
+   int i;
+
+   for (i = PT_DIRECTORY_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
+   linfo = lpage_info_slot(gfn, slot, i);
+   linfo->disallow_lpage += count;
+   WARN_ON(linfo->disallow_lpage < 0);
+   }
+}
+
+void kvm_mmu_gfn_disallow_lpage(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   update_gfn_disallow_lpage_count(slot, gfn, 1);
+}
+
+void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   update_gfn_disallow_lpage_count(slot, gfn, -1);
+}
+
 static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
struct kvm_memslots *slots;
struct kvm_memory_slot *slot;
-   struct kvm_lpage_info *linfo;
gfn_t gfn;
-   int i;
 
gfn = sp->gfn;
slots = kvm_memslots_for_spte_role(kvm, sp->role);
slot = __gfn_to_memslot(slots, gfn);
-   for (i = PT_DIRECTORY_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
-   linfo = lpage_info_slot(gfn, slot, i);
-   linfo->disallow_lpage += 1;
-   }
+   kvm_mmu_gfn_disallow_lpage(slot, gfn);
kvm->arch.indirect_shadow_pages++;
 }
 
@@ -798,18 +816,12 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
struct kvm_memslots *slots;
struct kvm_memory_slot *slot;
-   struct kvm_lpage_info *linfo;
gfn_t gfn;
-   int i;
 
gfn = sp->gfn;
slots = kvm_memslots_for_spte_role(kvm, sp->role);
slot = __gfn_to_memslot(slots, gfn);
-   for (i = PT_DIRECTORY_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
-   linfo = lpage_info_slot(gfn, slot, i);
-   linfo->disallow_lpage -= 1;
-   WARN_ON(linfo->disallow_lpage < 0);
-   }
+   kvm_mmu_gfn_allow_lpage(slot, gfn);
kvm->arch.indirect_shadow_pages--;
 }
 
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 55ffb7b..de92bed 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -174,4 +174,7 @@ static inline bool permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 
 void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm);
 void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
+
+void kvm_mmu_gfn_disallow_lpage(struct kvm_memory_slot *slot, gfn_t gfn);
+void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, gfn_t gfn);
 #endif
-- 
1.8.3.1

[PATCH v3 03/11] KVM: MMU: introduce kvm_mmu_slot_gfn_write_protect

2016-02-14 Thread Xiao Guangrong

Split rmap_write_protect() and introduce a function that abstracts write
protection based on the slot.

This function will be used in a later patch.

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/mmu.c | 16 +++-
 arch/x86/kvm/mmu.h |  2 ++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index e1bb66c..edad3c7 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1336,23 +1336,29 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
 }
 
-static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
+bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
+   struct kvm_memory_slot *slot, u64 gfn)
 {
-   struct kvm_memory_slot *slot;
struct kvm_rmap_head *rmap_head;
int i;
bool write_protected = false;
 
-   slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
-
for (i = PT_PAGE_TABLE_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
rmap_head = __gfn_to_rmap(gfn, i, slot);
-   write_protected |= __rmap_write_protect(vcpu->kvm, rmap_head, true);
+   write_protected |= __rmap_write_protect(kvm, rmap_head, true);
}
 
return write_protected;
 }
 
+static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
+{
+   struct kvm_memory_slot *slot;
+
+   slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
+   return kvm_mmu_slot_gfn_write_protect(vcpu->kvm, slot, gfn);
+}
+
 static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
 {
u64 *sptep;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index de92bed..58fe98a 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -177,4 +177,6 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
 
 void kvm_mmu_gfn_disallow_lpage(struct kvm_memory_slot *slot, gfn_t gfn);
 void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, gfn_t gfn);
+bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
+   struct kvm_memory_slot *slot, u64 gfn);
 #endif
-- 
1.8.3.1

[PATCH v3 00/11] KVM: x86: track guest page access

2016-02-14 Thread Xiao Guangrong

Changelog in v3:
- refine the code of mmu_need_write_protect() based on Huang Kai's suggestion
- rebase the patchset against current code

Changelog in v2:
- fix an issue where the track memory of a memslot is freed if we only
  move the memslot or change its flags
- do not track the gfn which is not mapped in memslots
- introduce the nolock APIs at the beginning of the patchset
- use 'unsigned short' as the track counter to reduce memory use, which
  should be enough for shadow page tables and KVMGT

This patchset introduces a feature that allows us to track page
access in the guest. Currently, only write access tracking is implemented
in this version.

Four APIs are introduced:
- kvm_page_track_add_page(kvm, gfn, mode): the single guest page @gfn is
  added to the tracking pool of the guest instance represented by @kvm;
  @mode specifies which kind of access to @gfn is tracked

- kvm_page_track_remove_page(kvm, gfn, mode): the opposite operation
  of kvm_page_track_add_page(), which removes @gfn from the tracking pool.
  @gfn is no longer tracked after its last user is gone

- kvm_page_track_register_notifier(kvm, n): register a notifier so that
  the events triggered by page tracking are received; when an event
  occurs, the n->track_write() callback is called

- kvm_page_track_unregister_notifier(kvm, n): the opposite operation
  of kvm_page_track_register_notifier(), which unlinks the notifier and
  stops receiving tracked events
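
How the four APIs fit together can be sketched with a small userspace
mock. This is not the kernel implementation: the demo_* names, the fixed
page count, the simplified callback signature, and the choice to notify
only for tracked pages are all assumptions made for illustration. It
keeps the 'unsigned short' per-page counter mentioned in the changelog.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NPAGES 8

/* Hypothetical notifier node; the real callback takes more arguments. */
struct demo_notifier {
	void (*track_write)(unsigned long gfn, struct demo_notifier *n);
	struct demo_notifier *next;
	int hits;
};

/* Hypothetical per-guest state: one write-track counter per page. */
struct demo_kvm {
	unsigned short write_track[DEMO_NPAGES];
	struct demo_notifier *notifiers;
};

static void demo_kvm_init(struct demo_kvm *kvm)
{
	memset(kvm, 0, sizeof(*kvm));
}

static void demo_track_add_page(struct demo_kvm *kvm, unsigned long gfn)
{
	assert(gfn < DEMO_NPAGES);
	kvm->write_track[gfn]++;
}

static void demo_track_remove_page(struct demo_kvm *kvm, unsigned long gfn)
{
	assert(gfn < DEMO_NPAGES && kvm->write_track[gfn] > 0);
	kvm->write_track[gfn]--;
}

static void demo_register_notifier(struct demo_kvm *kvm,
				   struct demo_notifier *n)
{
	n->next = kvm->notifiers;
	kvm->notifiers = n;
}

static void demo_unregister_notifier(struct demo_kvm *kvm,
				     struct demo_notifier *n)
{
	struct demo_notifier **pp;

	for (pp = &kvm->notifiers; *pp; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return;
		}
	}
}

/* Emulated guest write; returns 1 if the page was tracked, else 0. */
static int demo_guest_write(struct demo_kvm *kvm, unsigned long gfn)
{
	struct demo_notifier *n;

	assert(gfn < DEMO_NPAGES);
	if (!kvm->write_track[gfn])
		return 0;
	for (n = kvm->notifiers; n; n = n->next)
		n->track_write(gfn, n);
	return 1;
}

/* Example callback: just count the writes it was told about. */
static void demo_count_write(unsigned long gfn, struct demo_notifier *n)
{
	(void)gfn;
	n->hits++;
}
```

Adding the same page twice and removing it once leaves it tracked, which
is the "last user is gone" behaviour described for
kvm_page_track_remove_page().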

The first user of page tracking is non-leaf shadow page tables, as they
are always write protected. It also improves performance because page
tracking speeds up the page fault handler for tracked pages. The results
for a kernel build are as follows:

   before   after
real 461.63   real 455.48
user 4529.55  user 4557.88
sys 1995.39   sys 1922.57

Furthermore, it is the infrastructure for other kinds of shadow page tables,
such as the GPU shadow page table introduced in KVMGT (1) and native nested
IOMMU.

This patchset can be divided into two parts:
- patch 1 ~ patch 7 implement page tracking
- the other patches apply page tracking to non-leaf shadow page tables

(1): http://lkml.iu.edu/hypermail/linux/kernel/1510.3/01562.html

Xiao Guangrong (11):
  KVM: MMU: rename has_wrprotected_page to mmu_gfn_lpage_is_disallowed
  KVM: MMU: introduce kvm_mmu_gfn_{allow,disallow}_lpage
  KVM: MMU: introduce kvm_mmu_slot_gfn_write_protect
  KVM: page track: add the framework of guest page tracking
  KVM: page track: introduce kvm_page_track_{add,remove}_page
  KVM: MMU: let page fault handler be aware tracked page
  KVM: page track: add notifier support
  KVM: MMU: use page track for non-leaf shadow pages
  KVM: MMU: simplify mmu_need_write_protect
  KVM: MMU: clear write-flooding on the fast path of tracked page
  KVM: MMU: apply page track notifier

 Documentation/virtual/kvm/mmu.txt |   6 +-
 arch/x86/include/asm/kvm_host.h   |  12 +-
 arch/x86/include/asm/kvm_page_track.h |  67 +
 arch/x86/kvm/Makefile |   3 +-
 arch/x86/kvm/mmu.c| 209 ++-
 arch/x86/kvm/mmu.h|   5 +
 arch/x86/kvm/page_track.c | 257 ++
 arch/x86/kvm/paging_tmpl.h|   5 +
 arch/x86/kvm/x86.c|  27 ++--
 9 files changed, 512 insertions(+), 79 deletions(-)
 create mode 100644 arch/x86/include/asm/kvm_page_track.h
 create mode 100644 arch/x86/kvm/page_track.c

-- 
1.8.3.1

[PATCH v3 11/11] KVM: MMU: apply page track notifier

2016-02-14 Thread Xiao Guangrong

Register the notifier to receive write track events so that we can update
our shadow page table.

This makes kvm_mmu_pte_write() the callback of the notifier; there is no
functional change.

Signed-off-by: Xiao Guangrong 
---
 arch/x86/include/asm/kvm_host.h |  5 +++--
 arch/x86/kvm/mmu.c  | 19 +--
 arch/x86/kvm/x86.c  |  4 ++--
 3 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 254d103..5246f07 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -696,6 +696,7 @@ struct kvm_arch {
 */
struct list_head active_mmu_pages;
struct list_head zapped_obsolete_pages;
+   struct kvm_page_track_notifier_node mmu_sp_tracker;
struct kvm_page_track_notifier_head track_notifier_head;
 
struct list_head assigned_dev_head;
@@ -994,6 +995,8 @@ void kvm_mmu_module_exit(void);
 void kvm_mmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_mmu_create(struct kvm_vcpu *vcpu);
 void kvm_mmu_setup(struct kvm_vcpu *vcpu);
+void kvm_mmu_init_vm(struct kvm *kvm);
+void kvm_mmu_uninit_vm(struct kvm *kvm);
 void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
u64 dirty_mask, u64 nx_mask, u64 x_mask);
 
@@ -1133,8 +1136,6 @@ void kvm_pic_clear_all(struct kvm_pic *pic, int irq_source_id);
 
 void kvm_inject_nmi(struct kvm_vcpu *vcpu);
 
-void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
-  const u8 *new, int bytes);
 int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn);
 int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
 void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index f924e6c..57cf30b 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4316,8 +4316,8 @@ static u64 *get_written_sptes(struct kvm_mmu_page *sp, gpa_t gpa, int *nspte)
return spte;
 }
 
-void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
-  const u8 *new, int bytes)
+static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
+ const u8 *new, int bytes)
 {
gfn_t gfn = gpa >> PAGE_SHIFT;
struct kvm_mmu_page *sp;
@@ -4531,6 +4531,21 @@ void kvm_mmu_setup(struct kvm_vcpu *vcpu)
init_kvm_mmu(vcpu);
 }
 
+void kvm_mmu_init_vm(struct kvm *kvm)
+{
+   struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
+
+   node->track_write = kvm_mmu_pte_write;
+   kvm_page_track_register_notifier(kvm, node);
+}
+
+void kvm_mmu_uninit_vm(struct kvm *kvm)
+{
+   struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
+
+   kvm_page_track_unregister_notifier(kvm, node);
+}
+
 /* The return value indicates if tlb flush on all vcpus is needed. */
 typedef bool (*slot_level_handler) (struct kvm *kvm, struct kvm_rmap_head *rmap_head);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 98019b6..319d572 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4369,7 +4369,6 @@ int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
ret = kvm_vcpu_write_guest(vcpu, gpa, val, bytes);
if (ret < 0)
return 0;
-   kvm_mmu_pte_write(vcpu, gpa, val, bytes);
kvm_page_track_write(vcpu, gpa, val, bytes);
return 1;
 }
@@ -4628,7 +4627,6 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
return X86EMUL_CMPXCHG_FAILED;
 
kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
-   kvm_mmu_pte_write(vcpu, gpa, new, bytes);
kvm_page_track_write(vcpu, gpa, new, bytes);
 
return X86EMUL_CONTINUE;
@@ -7751,6 +7749,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
INIT_DELAYED_WORK(&kvm->arch.kvmclock_sync_work, kvmclock_sync_fn);
 
kvm_page_track_init(kvm);
+   kvm_mmu_init_vm(kvm);
 
return 0;
 }
@@ -7878,6 +7877,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
kfree(kvm->arch.vioapic);
kvm_free_vcpus(kvm);
kfree(rcu_dereference_check(kvm->arch.apic_map, 1));
+   kvm_mmu_uninit_vm(kvm);
 }
 
 void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free,
-- 
1.8.3.1

[PATCH v3 10/11] KVM: MMU: clear write-flooding on the fast path of tracked page

2016-02-14 Thread Xiao Guangrong

If the page fault is caused by a write access to a write-tracked page, the
real shadow page walk is skipped, and we lose the chance to clear write
flooding for the page structure the current vcpu is using.

Fix it by locklessly walking the shadow page table to clear write flooding
on the shadow page structure outside of mmu-lock. Because of that, the
count is changed to atomic_t.
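
A userspace sketch of a flooding counter that tolerates lockless clears,
assuming C11 atomics in place of the kernel atomic_t; struct demo_sp and
the demo_* helpers are hypothetical. Note one deliberate difference: the
patch increments and then reads the counter in two steps, while this
sketch uses a single atomic_fetch_add() so the increment and the
threshold test see the same value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shadow page structure. */
struct demo_sp {
	atomic_int write_flooding_count;
};

static void demo_sp_init(struct demo_sp *sp)
{
	atomic_init(&sp->write_flooding_count, 0);
}

/* May be called without any lock held, e.g. from a lockless walk. */
static void demo_clear_flooding(struct demo_sp *sp)
{
	atomic_store(&sp->write_flooding_count, 0);
}

/* Count one emulated write; true once three writes have accumulated. */
static bool demo_detect_flooding(struct demo_sp *sp)
{
	return atomic_fetch_add(&sp->write_flooding_count, 1) + 1 >= 3;
}
```

With a plain int, a clear racing with an increment could be lost or
torn; the atomic type only guarantees that each individual operation is
indivisible, not that a clear and a detection are ordered in any
particular way.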

Signed-off-by: Xiao Guangrong 
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu.c  | 22 --
 arch/x86/kvm/paging_tmpl.h  |  4 +++-
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 282bc2f..254d103 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -277,7 +277,7 @@ struct kvm_mmu_page {
 #endif
 
/* Number of writes since the last time traversal visited this page.  */
-   int write_flooding_count;
+   atomic_t write_flooding_count;
 };
 
 struct kvm_pio_request {
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4986615..f924e6c 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2073,7 +2073,7 @@ static void mmu_sync_children(struct kvm_vcpu *vcpu,
 
 static void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp)
 {
-   sp->write_flooding_count = 0;
+   atomic_set(&sp->write_flooding_count,  0);
 }
 
 static void clear_sp_write_flooding_count(u64 *spte)
@@ -3407,6 +3407,23 @@ static bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
return false;
 }
 
+static void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr)
+{
+   struct kvm_shadow_walk_iterator iterator;
+   u64 spte;
+
+   if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
+   return;
+
+   walk_shadow_page_lockless_begin(vcpu);
+   for_each_shadow_entry_lockless(vcpu, addr, iterator, spte) {
+   clear_sp_write_flooding_count(iterator.sptep);
+   if (!is_shadow_present_pte(spte))
+   break;
+   }
+   walk_shadow_page_lockless_end(vcpu);
+}
+
 static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
u32 error_code, bool prefault)
 {
@@ -4236,7 +4253,8 @@ static bool detect_write_flooding(struct kvm_mmu_page *sp)
if (sp->role.level == PT_PAGE_TABLE_LEVEL)
return false;
 
-   return ++sp->write_flooding_count >= 3;
+   atomic_inc(&sp->write_flooding_count);
+   return atomic_read(&sp->write_flooding_count) >= 3;
 }
 
 /*
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index c3a30c2..5985156 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -735,8 +735,10 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr, u32 error_code,
return 0;
}
 
-   if (page_fault_handle_page_track(vcpu, error_code, walker.gfn))
+   if (page_fault_handle_page_track(vcpu, error_code, walker.gfn)) {
+   shadow_page_table_clear_flood(vcpu, addr);
return 1;
+   }
 
vcpu->arch.write_fault_to_shadow_pgtable = false;
 
-- 
1.8.3.1
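
For readers who want to try the counter change in the patch above outside
the kernel tree, here is a minimal user-space sketch. It assumes C11
<stdatomic.h> instead of the kernel's atomic_t API, and struct shadow_page,
clear_write_flooding() and note_write_and_check_flood() are hypothetical
names, not code from this series. It also folds the increment and the
threshold test into one atomic operation, which is a variation on the
patch rather than what the patch itself does.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one field of struct kvm_mmu_page used here. */
struct shadow_page {
	atomic_int write_flooding_count;
};

/* Reset the counter; safe to call without holding any lock. */
static void clear_write_flooding(struct shadow_page *sp)
{
	atomic_store(&sp->write_flooding_count, 0);
}

/*
 * Count one emulated write and report whether the threshold of 3 used by
 * detect_write_flooding() has been reached. atomic_fetch_add() returns the
 * old value, so old + 1 is the value this caller produced, even if another
 * thread clears the counter right afterwards.
 */
static bool note_write_and_check_flood(struct shadow_page *sp)
{
	int old = atomic_fetch_add(&sp->write_flooding_count, 1);

	return old + 1 >= 3;
}
```

The third consecutive call to note_write_and_check_flood() reports
flooding; a call to clear_write_flooding() in between starts the count
over.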

[PATCH v3 05/11] KVM: page track: introduce kvm_page_track_{add,remove}_page

2016-02-14 Thread Xiao Guangrong

These two functions are the user APIs:
- kvm_page_track_add_page(): add the page to the tracking pool; after
  that, the specified access on that page will be tracked

- kvm_page_track_remove_page(): remove the page from the tracking pool;
  the specified access on the page is no longer tracked once the last
  user is gone

Both of these must be called under the protection of kvm->srcu or
kvm->slots_lock.
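
The add/remove counting described above can be sketched in user space as
follows. This is only an illustration under stated assumptions:
track_count_adjust() and track_count_active() are hypothetical names, and
the plain unsigned short stands in for one per-gfn tracking count. The sum
is computed in int so that the range check itself cannot wrap.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Adjust a 16-bit tracking count by delta (+1 to add a tracker, -1 to
 * remove one). The sum is computed in int, which is wider than
 * unsigned short on common ABIs, so the range check cannot wrap.
 * Returns false and leaves the count unchanged if the result would not fit.
 */
static bool track_count_adjust(unsigned short *count, short delta)
{
	int next = (int)*count + delta;

	if (next < 0 || next > USHRT_MAX)
		return false;

	*count = (unsigned short)next;
	return true;
}

/* A page stays tracked while at least one user still holds it. */
static bool track_count_active(const unsigned short *count)
{
	return *count != 0;
}
```

With two users added and one removed the page is still tracked; only the
removal by the last user makes track_count_active() return false.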

Signed-off-by: Xiao Guangrong 
---
 arch/x86/include/asm/kvm_page_track.h |  13 
 arch/x86/kvm/page_track.c | 124 ++
 2 files changed, 137 insertions(+)

diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
index 55200406..c010124 100644
--- a/arch/x86/include/asm/kvm_page_track.h
+++ b/arch/x86/include/asm/kvm_page_track.h
@@ -10,4 +10,17 @@ void kvm_page_track_free_memslot(struct kvm_memory_slot *free,
 struct kvm_memory_slot *dont);
 int kvm_page_track_create_memslot(struct kvm_memory_slot *slot,
  unsigned long npages);
+
+void
+kvm_slot_page_track_add_page_nolock(struct kvm *kvm,
+   struct kvm_memory_slot *slot, gfn_t gfn,
+   enum kvm_page_track_mode mode);
+void kvm_page_track_add_page(struct kvm *kvm, gfn_t gfn,
+enum kvm_page_track_mode mode);
+void kvm_slot_page_track_remove_page_nolock(struct kvm *kvm,
+   struct kvm_memory_slot *slot,
+   gfn_t gfn,
+   enum kvm_page_track_mode mode);
+void kvm_page_track_remove_page(struct kvm *kvm, gfn_t gfn,
+   enum kvm_page_track_mode mode);
 #endif
diff --git a/arch/x86/kvm/page_track.c b/arch/x86/kvm/page_track.c
index 8c396d0..e17efe9 100644
--- a/arch/x86/kvm/page_track.c
+++ b/arch/x86/kvm/page_track.c
@@ -50,3 +50,127 @@ track_free:
kvm_page_track_free_memslot(slot, NULL);
return -ENOMEM;
 }
+
+static bool check_mode(enum kvm_page_track_mode mode)
+{
+   if (mode < 0 || mode >= KVM_PAGE_TRACK_MAX)
+   return false;
+
+   return true;
+}
+
+static void update_gfn_track(struct kvm_memory_slot *slot, gfn_t gfn,
+enum kvm_page_track_mode mode, short count)
+{
+   int index;
+   unsigned short val;
+
+   index = gfn_to_index(gfn, slot->base_gfn, PT_PAGE_TABLE_LEVEL);
+
+   val = slot->arch.gfn_track[mode][index];
+
+   /* does tracking count wrap? */
+   WARN_ON((count > 0) && (val > USHRT_MAX - count));
+   /* the last tracker has already gone? */
+   WARN_ON((count < 0) && (val < -count));
+
+   slot->arch.gfn_track[mode][index] += count;
+}
+
+void
+kvm_slot_page_track_add_page_nolock(struct kvm *kvm,
+   struct kvm_memory_slot *slot, gfn_t gfn,
+   enum kvm_page_track_mode mode)
+{
+
+   WARN_ON(!check_mode(mode));
+
+   update_gfn_track(slot, gfn, mode, 1);
+
+   /*
+* new track stops large page mapping for the
+* tracked page.
+*/
+   kvm_mmu_gfn_disallow_lpage(slot, gfn);
+
+   if (mode == KVM_PAGE_TRACK_WRITE)
+   if (kvm_mmu_slot_gfn_write_protect(kvm, slot, gfn))
+   kvm_flush_remote_tlbs(kvm);
+}
+
+/*
+ * add guest page to the tracking pool so that corresponding access on that
+ * page will be intercepted.
+ *
+ * It should be called under the protection of kvm->srcu or kvm->slots_lock
+ *
+ * @kvm: the guest instance we are interested in.
+ * @gfn: the guest page.
+ * @mode: tracking mode, currently only write track is supported.
+ */
+void kvm_page_track_add_page(struct kvm *kvm, gfn_t gfn,
+enum kvm_page_track_mode mode)
+{
+   struct kvm_memslots *slots;
+   struct kvm_memory_slot *slot;
+   int i;
+
+   for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
+   slots = __kvm_memslots(kvm, i);
+
+   slot = __gfn_to_memslot(slots, gfn);
+   if (!slot)
+   continue;
+
+   spin_lock(&kvm->mmu_lock);
+   kvm_slot_page_track_add_page_nolock(kvm, slot, gfn, mode);
+   spin_unlock(&kvm->mmu_lock);
+   }
+}
+
+void kvm_slot_page_track_remove_page_nolock(struct kvm *kvm,
+   struct kvm_memory_slot *slot,
+   gfn_t gfn,
+   enum kvm_page_track_mode mode)
+{
+   WARN_ON(!check_mode(mode));
+
+   update_gfn_track(slot, gfn, mode, -1);
+
+   /*
+* allow large page mapping for the tracked page
+* after the tracker is gone.
+*/
+   kvm_mmu_gfn_allow_lpage(slot, gfn);
+}
+
+/*
+ * remove the guest page from the tracking pool which stops the interception
+ * of co

[PATCH v3 08/11] KVM: MMU: use page track for non-leaf shadow pages

2016-02-14 Thread Xiao Guangrong

Non-leaf shadow pages are always write-protected, so they can be users
of page track.

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/mmu.c | 26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bd9c278..e9dbd85 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -806,11 +806,17 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
struct kvm_memory_slot *slot;
gfn_t gfn;
 
+   kvm->arch.indirect_shadow_pages++;
gfn = sp->gfn;
slots = kvm_memslots_for_spte_role(kvm, sp->role);
slot = __gfn_to_memslot(slots, gfn);
+
+   /* the non-leaf shadow pages are kept readonly. */
+   if (sp->role.level > PT_PAGE_TABLE_LEVEL)
+   return kvm_slot_page_track_add_page_nolock(kvm, slot, gfn,
+   KVM_PAGE_TRACK_WRITE);
+
kvm_mmu_gfn_disallow_lpage(slot, gfn);
-   kvm->arch.indirect_shadow_pages++;
 }
 
 static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
@@ -819,11 +825,15 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
struct kvm_memory_slot *slot;
gfn_t gfn;
 
+   kvm->arch.indirect_shadow_pages--;
gfn = sp->gfn;
slots = kvm_memslots_for_spte_role(kvm, sp->role);
slot = __gfn_to_memslot(slots, gfn);
+   if (sp->role.level > PT_PAGE_TABLE_LEVEL)
+   return kvm_slot_page_track_remove_page_nolock(kvm, slot, gfn,
+   KVM_PAGE_TRACK_WRITE);
+
kvm_mmu_gfn_allow_lpage(slot, gfn);
-   kvm->arch.indirect_shadow_pages--;
 }
 
 static bool __mmu_gfn_lpage_is_disallowed(gfn_t gfn, int level,
@@ -2132,12 +2142,18 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
hlist_add_head(&sp->hash_link,
&vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]);
if (!direct) {
-   if (rmap_write_protect(vcpu, gfn))
+   /*
+* we should do write protection before syncing pages
+* otherwise the content of the synced shadow page may
+* be inconsistent with guest page table.
+*/
+   account_shadowed(vcpu->kvm, sp);
+
+   if (level == PT_PAGE_TABLE_LEVEL &&
+ rmap_write_protect(vcpu, gfn))
kvm_flush_remote_tlbs(vcpu->kvm);
if (level > PT_PAGE_TABLE_LEVEL && need_sync)
kvm_sync_pages(vcpu, gfn);
-
-   account_shadowed(vcpu->kvm, sp);
}
sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen;
clear_page(sp->spt);
-- 
1.8.3.1

[PATCH v3 04/11] KVM: page track: add the framework of guest page tracking

2016-02-14 Thread Xiao Guangrong

The array gfn_track[mode][gfn] is introduced in the memory slot for every
guest page; it holds the tracking count for the guest page in each mode.
The count is increased when the page gains a tracker, and the page is no
longer tracked once the count drops to zero.

We use 'unsigned short' as the tracking count, which should be enough as
shadow pages can use at most 2^14 of it (2^3 for level, 2^1 for cr4_pae,
2^2 for quadrant, 2^3 for access, 2^1 for nxe, 2^1 for cr0_wp, 2^1 for
smep_andnot_wp, 2^1 for smap_andnot_wp, and 2^1 for smm), so there is
enough room for other trackers.

Two callbacks, kvm_page_track_create_memslot() and
kvm_page_track_free_memslot(), are implemented in this patch; they are
used internally to initialize and reclaim the memory of the array.

Currently, only write track mode is supported.
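
The 2^14 bound above is just the sum of the role field widths. A minimal
sketch that redoes the arithmetic, with the widths copied from this commit
message and hypothetical helper names (role_bits_total() and
max_shadow_trackers() are not kernel functions):

```c
#include <stddef.h>

/* Bit widths of the shadow page role fields, as listed in the commit message. */
static const unsigned int role_field_bits[] = {
	3, /* level */
	1, /* cr4_pae */
	2, /* quadrant */
	3, /* access */
	1, /* nxe */
	1, /* cr0_wp */
	1, /* smep_andnot_wp */
	1, /* smap_andnot_wp */
	1, /* smm */
};

/* Total number of role bits that can distinguish shadow pages of one gfn. */
static unsigned int role_bits_total(void)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < sizeof(role_field_bits) / sizeof(role_field_bits[0]); i++)
		total += role_field_bits[i];

	return total;
}

/* Upper bound on the number of shadow pages that can track one gfn. */
static unsigned long max_shadow_trackers(void)
{
	return 1UL << role_bits_total();
}
```

The widths add up to 14 bits, so the bound is 16384, which leaves most of
the 65535 range of a 16-bit unsigned short free for other trackers.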

Signed-off-by: Xiao Guangrong 
---
 arch/x86/include/asm/kvm_host.h   |  2 ++
 arch/x86/include/asm/kvm_page_track.h | 13 +
 arch/x86/kvm/Makefile |  3 +-
 arch/x86/kvm/page_track.c | 52 +++
 arch/x86/kvm/x86.c|  5 
 5 files changed, 74 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/kvm_page_track.h
 create mode 100644 arch/x86/kvm/page_track.c

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e1c1f57..d8931d0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define KVM_MAX_VCPUS 255
 #define KVM_SOFT_MAX_VCPUS 160
@@ -650,6 +651,7 @@ struct kvm_lpage_info {
 struct kvm_arch_memory_slot {
struct kvm_rmap_head *rmap[KVM_NR_PAGE_SIZES];
struct kvm_lpage_info *lpage_info[KVM_NR_PAGE_SIZES - 1];
+   unsigned short *gfn_track[KVM_PAGE_TRACK_MAX];
 };
 
 /*
diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
new file mode 100644
index 000..55200406
--- /dev/null
+++ b/arch/x86/include/asm/kvm_page_track.h
@@ -0,0 +1,13 @@
+#ifndef _ASM_X86_KVM_PAGE_TRACK_H
+#define _ASM_X86_KVM_PAGE_TRACK_H
+
+enum kvm_page_track_mode {
+   KVM_PAGE_TRACK_WRITE,
+   KVM_PAGE_TRACK_MAX,
+};
+
+void kvm_page_track_free_memslot(struct kvm_memory_slot *free,
+struct kvm_memory_slot *dont);
+int kvm_page_track_create_memslot(struct kvm_memory_slot *slot,
+ unsigned long npages);
+#endif
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index a1ff508..464fa47 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -13,9 +13,10 @@ kvm-$(CONFIG_KVM_ASYNC_PF)   += $(KVM)/async_pf.o
 
 kvm-y  += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
   i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
-  hyperv.o
+  hyperv.o page_track.o
 
 kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT)+= assigned-dev.o iommu.o
+
 kvm-intel-y+= vmx.o pmu_intel.o
 kvm-amd-y  += svm.o pmu_amd.o
 
diff --git a/arch/x86/kvm/page_track.c b/arch/x86/kvm/page_track.c
new file mode 100644
index 000..8c396d0
--- /dev/null
+++ b/arch/x86/kvm/page_track.c
@@ -0,0 +1,52 @@
+/*
+ * Support KVM guest page tracking
+ *
+ * This feature allows us to track page access in guest. Currently, only
+ * write access is tracked.
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *   Xiao Guangrong 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include 
+#include 
+#include 
+
+#include "mmu.h"
+
+void kvm_page_track_free_memslot(struct kvm_memory_slot *free,
+struct kvm_memory_slot *dont)
+{
+   int i;
+
+   for (i = 0; i < KVM_PAGE_TRACK_MAX; i++)
+   if (!dont || free->arch.gfn_track[i] !=
+ dont->arch.gfn_track[i]) {
+   kvfree(free->arch.gfn_track[i]);
+   free->arch.gfn_track[i] = NULL;
+   }
+}
+
+int kvm_page_track_create_memslot(struct kvm_memory_slot *slot,
+ unsigned long npages)
+{
+   int  i;
+
+   for (i = 0; i < KVM_PAGE_TRACK_MAX; i++) {
+   slot->arch.gfn_track[i] = kvm_kvzalloc(npages *
+   sizeof(*slot->arch.gfn_track[i]));
+   if (!slot->arch.gfn_track[i])
+   goto track_free;
+   }
+
+   return 0;
+
+track_free:
+   kvm_page_track_free_memslot(slot, NULL);
+   return -ENOMEM;
+}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f448e64..e25ebb7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7895,6 +7895,8 @@ void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free,
free->arch.lpage_info[i - 1] = NULL;
}
}
+
+   kvm_pa

[PATCH v3 06/11] KVM: MMU: let page fault handler be aware tracked page

2016-02-14 Thread Xiao Guangrong

A page fault caused by a write access to a write-tracked page cannot be
fixed; it always needs to be emulated. page_fault_handle_page_track()
is the fast path we introduce here to skip holding mmu-lock and walking
the shadow page table.

However, if the page table entry is not present, it is worth making it
present and readonly so that read accesses can still be served.

mmu_need_write_protect() needs to be adjusted to avoid the page becoming
writable when making the page table entry present or when syncing or
prefetching shadow page table entries.
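
The fast-path decision described above can be sketched as a pure function
of the error code. This is a user-space illustration, not the patch's
code: fault_needs_emulation() is a hypothetical name, and the tracking
lookup is replaced by a boolean argument. The mask values are the
architectural x86 page fault error code bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page fault error code bits. */
#define PFERR_PRESENT_MASK (1U << 0) /* fault on a present entry */
#define PFERR_WRITE_MASK   (1U << 1) /* fault caused by a write */
#define PFERR_RSVD_MASK    (1U << 3) /* reserved bit set in an entry */

/*
 * Return true when the fault cannot be fixed and must be emulated:
 * a write to a present mapping of a write-tracked gfn. Reserved-bit
 * faults, not-present faults and read faults take the normal path.
 */
static bool fault_needs_emulation(uint32_t error_code, bool gfn_write_tracked)
{
	if (error_code & PFERR_RSVD_MASK)
		return false;

	if (!(error_code & PFERR_PRESENT_MASK) ||
	    !(error_code & PFERR_WRITE_MASK))
		return false;

	return gfn_write_tracked;
}
```

Only the combination present + write + tracked short-circuits the
handler; a not-present fault on a tracked page still goes through the
normal path so the entry can be made present and readonly.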

Signed-off-by: Xiao Guangrong 
---
 arch/x86/include/asm/kvm_page_track.h |  2 ++
 arch/x86/kvm/mmu.c| 44 +--
 arch/x86/kvm/page_track.c | 14 +++
 arch/x86/kvm/paging_tmpl.h|  3 +++
 4 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
index c010124..97ac9c3 100644
--- a/arch/x86/include/asm/kvm_page_track.h
+++ b/arch/x86/include/asm/kvm_page_track.h
@@ -23,4 +23,6 @@ void kvm_slot_page_track_remove_page_nolock(struct kvm *kvm,
enum kvm_page_track_mode mode);
 void kvm_page_track_remove_page(struct kvm *kvm, gfn_t gfn,
enum kvm_page_track_mode mode);
+bool kvm_page_track_check_mode(struct kvm_vcpu *vcpu, gfn_t gfn,
+  enum kvm_page_track_mode mode);
 #endif
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index edad3c7..bd9c278 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * When setting this variable to true it enables Two-Dimensional-Paging
@@ -2448,25 +2449,29 @@ static void kvm_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn)
}
 }
 
-static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
- bool can_unsync)
+static bool mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
+  bool can_unsync)
 {
struct kvm_mmu_page *s;
bool need_unsync = false;
 
+   if (kvm_page_track_check_mode(vcpu, gfn, KVM_PAGE_TRACK_WRITE))
+   return true;
+
for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn) {
if (!can_unsync)
-   return 1;
+   return true;
 
if (s->role.level != PT_PAGE_TABLE_LEVEL)
-   return 1;
+   return true;
 
if (!s->unsync)
need_unsync = true;
}
if (need_unsync)
kvm_unsync_pages(vcpu, gfn);
-   return 0;
+
+   return false;
 }
 
 static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
@@ -3381,10 +3386,30 @@ int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 }
 EXPORT_SYMBOL_GPL(handle_mmio_page_fault);
 
+static bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
+u32 error_code, gfn_t gfn)
+{
+   if (unlikely(error_code & PFERR_RSVD_MASK))
+   return false;
+
+   if (!(error_code & PFERR_PRESENT_MASK) ||
+ !(error_code & PFERR_WRITE_MASK))
+   return false;
+
+   /*
+* guest is writing the page which is write tracked which can
+* not be fixed by page fault handler.
+*/
+   if (kvm_page_track_check_mode(vcpu, gfn, KVM_PAGE_TRACK_WRITE))
+   return true;
+
+   return false;
+}
+
 static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
u32 error_code, bool prefault)
 {
-   gfn_t gfn;
+   gfn_t gfn = gva >> PAGE_SHIFT;
int r;
 
pgprintk("%s: gva %lx error %x\n", __func__, gva, error_code);
@@ -3396,13 +3421,15 @@ static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
return r;
}
 
+   if (page_fault_handle_page_track(vcpu, error_code, gfn))
+   return 1;
+
r = mmu_topup_memory_caches(vcpu);
if (r)
return r;
 
MMU_WARN_ON(!VALID_PAGE(vcpu->arch.mmu.root_hpa));
 
-   gfn = gva >> PAGE_SHIFT;
 
return nonpaging_map(vcpu, gva & PAGE_MASK,
 error_code, gfn, prefault);
@@ -3486,6 +3513,9 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
return r;
}
 
+   if (page_fault_handle_page_track(vcpu, error_code, gfn))
+   return 1;
+
r = mmu_topup_memory_caches(vcpu);
if (r)
return r;
diff --git a/arch/x86/kvm/page_track.c b/arch/x86/kvm/page_track.c
index e17efe9..de9b32f 100644
--- a/arch/x86/kvm/page_track.c
+++ b/arch/x86/kvm/page_track.c
@@ -174,3 +174,17 @@ void kvm_page_track_remove_page(struct kvm *kvm, gfn_t gfn,
spin_unlock(&kvm->mmu_lock);
}
 }
+
+/*
+

[PATCH v3 07/11] KVM: page track: add notifier support

2016-02-14 Thread Xiao Guangrong

A notifier list is introduced so that any node that wants to receive
track events can register on the list.

Two APIs are introduced here:
- kvm_page_track_register_notifier(): register the notifier to receive
  track events

- kvm_page_track_unregister_notifier(): stop receiving track events by
  unregistering the notifier

The callback, node->track_write(), is called when a write access to a
write-tracked page happens.
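
The register/unregister/notify flow can be sketched in single-threaded
user space as below. All names here (struct track_notifier,
track_register(), byte_counter and so on) are hypothetical, and the sketch
uses a plain singly linked list with no locking, whereas the patch uses an
hlist protected by mmu_lock and SRCU.

```c
#include <stddef.h>

/* Hypothetical user-space notifier node; the callback mirrors track_write(). */
struct track_notifier {
	struct track_notifier *next;
	void (*track_write)(struct track_notifier *n, unsigned long gpa,
			    const unsigned char *data, int bytes);
};

struct track_notifier_head {
	struct track_notifier *first;
};

/* Link the notifier at the front of the list. */
static void track_register(struct track_notifier_head *head,
			   struct track_notifier *n)
{
	n->next = head->first;
	head->first = n;
}

/* Unlink the notifier if it is on the list; otherwise do nothing. */
static void track_unregister(struct track_notifier_head *head,
			     struct track_notifier *n)
{
	struct track_notifier **pp;

	for (pp = &head->first; *pp; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return;
		}
	}
}

/* Call every registered notifier for one emulated write. */
static void track_write_notify(struct track_notifier_head *head,
			       unsigned long gpa, const unsigned char *data,
			       int bytes)
{
	struct track_notifier *n;

	for (n = head->first; n; n = n->next)
		if (n->track_write)
			n->track_write(n, gpa, data, bytes);
}

/* Example user: counts written bytes; the node is the first member. */
struct byte_counter {
	struct track_notifier node;
	long total_bytes;
};

static void byte_counter_write(struct track_notifier *n, unsigned long gpa,
			       const unsigned char *data, int bytes)
{
	/* Valid because node is the first member of struct byte_counter. */
	struct byte_counter *bc = (struct byte_counter *)n;

	(void)gpa;
	(void)data;
	bc->total_bytes += bytes;
}
```

A registered byte_counter sees every write passed to
track_write_notify(); after track_unregister() its total stops changing.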

Signed-off-by: Xiao Guangrong 
---
 arch/x86/include/asm/kvm_host.h   |  1 +
 arch/x86/include/asm/kvm_page_track.h | 39 
 arch/x86/kvm/page_track.c | 67 +++
 arch/x86/kvm/x86.c|  4 +++
 4 files changed, 111 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d8931d0..282bc2f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -696,6 +696,7 @@ struct kvm_arch {
 */
struct list_head active_mmu_pages;
struct list_head zapped_obsolete_pages;
+   struct kvm_page_track_notifier_head track_notifier_head;
 
struct list_head assigned_dev_head;
struct iommu_domain *iommu_domain;
diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
index 97ac9c3..1aae4ef 100644
--- a/arch/x86/include/asm/kvm_page_track.h
+++ b/arch/x86/include/asm/kvm_page_track.h
@@ -6,6 +6,36 @@ enum kvm_page_track_mode {
KVM_PAGE_TRACK_MAX,
 };
 
+/*
+ * The notifier represented by @kvm_page_track_notifier_node is linked into
+ * the head which will be notified when guest is triggering the track event.
+ *
+ * Write access on the head is protected by kvm->mmu_lock, read access
+ * is protected by track_srcu.
+ */
+struct kvm_page_track_notifier_head {
+   struct srcu_struct track_srcu;
+   struct hlist_head track_notifier_list;
+};
+
+struct kvm_page_track_notifier_node {
+   struct hlist_node node;
+
+   /*
+* It is called when guest is writing the write-tracked page
+* and write emulation is finished at that time.
+*
+* @vcpu: the vcpu where the write access happened.
+* @gpa: the physical address written by guest.
+* @new: the data that was written to the address.
+* @bytes: the written length.
+*/
+   void (*track_write)(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
+   int bytes);
+};
+
+void kvm_page_track_init(struct kvm *kvm);
+
 void kvm_page_track_free_memslot(struct kvm_memory_slot *free,
 struct kvm_memory_slot *dont);
 int kvm_page_track_create_memslot(struct kvm_memory_slot *slot,
@@ -25,4 +55,13 @@ void kvm_page_track_remove_page(struct kvm *kvm, gfn_t gfn,
enum kvm_page_track_mode mode);
 bool kvm_page_track_check_mode(struct kvm_vcpu *vcpu, gfn_t gfn,
   enum kvm_page_track_mode mode);
+
+void
+kvm_page_track_register_notifier(struct kvm *kvm,
+struct kvm_page_track_notifier_node *n);
+void
+kvm_page_track_unregister_notifier(struct kvm *kvm,
+  struct kvm_page_track_notifier_node *n);
+void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
+ int bytes);
 #endif
diff --git a/arch/x86/kvm/page_track.c b/arch/x86/kvm/page_track.c
index de9b32f..0692cc6 100644
--- a/arch/x86/kvm/page_track.c
+++ b/arch/x86/kvm/page_track.c
@@ -188,3 +188,70 @@ bool kvm_page_track_check_mode(struct kvm_vcpu *vcpu, gfn_t gfn,
 
return !!ACCESS_ONCE(slot->arch.gfn_track[mode][index]);
 }
+
+void kvm_page_track_init(struct kvm *kvm)
+{
+   struct kvm_page_track_notifier_head *head;
+
+   head = &kvm->arch.track_notifier_head;
+   init_srcu_struct(&head->track_srcu);
+   INIT_HLIST_HEAD(&head->track_notifier_list);
+}
+
+/*
+ * register the notifier so that event interception for the tracked guest
+ * pages can be received.
+ */
+void
+kvm_page_track_register_notifier(struct kvm *kvm,
+struct kvm_page_track_notifier_node *n)
+{
+   struct kvm_page_track_notifier_head *head;
+
+   head = &kvm->arch.track_notifier_head;
+
+   spin_lock(&kvm->mmu_lock);
+   hlist_add_head_rcu(&n->node, &head->track_notifier_list);
+   spin_unlock(&kvm->mmu_lock);
+}
+
+/*
+ * stop receiving the event interception. It is the opposite operation of
+ * kvm_page_track_register_notifier().
+ */
+void
+kvm_page_track_unregister_notifier(struct kvm *kvm,
+  struct kvm_page_track_notifier_node *n)
+{
+   struct kvm_page_track_notifier_head *head;
+
+   head = &kvm->arch.track_notifier_head;
+
+   spin_lock(&kvm->mmu_lock);
+   hlist_del_rcu(&n->node);
+   spin_unlock(&kvm->mmu_lock);
+   synchronize_srcu(&head->track_srcu);
+}
+
+/*
+ * Notify the node that write access is intercepted and

[PATCH v3 01/11] KVM: MMU: rename has_wrprotected_page to mmu_gfn_lpage_is_disallowed

2016-02-14 Thread Xiao Guangrong

kvm_lpage_info->write_count is used to detect whether a large page mapping
for the gfn at the specified level is allowed. Rename it to disallow_lpage
to reflect its purpose, and rename has_wrprotected_page() to
mmu_gfn_lpage_is_disallowed() to make the code clearer.

Later we will extend this mechanism for page tracking: if the gfn is
tracked then a large mapping for that gfn at any level is not allowed.
The new name is more straightforward.
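
How the counter gates the mapping level can be sketched in user space as
follows. This is a simplification under stated assumptions: struct
lpage_state and the helpers are hypothetical names, and one small array
stands in for the per-slot, per-level lpage_info arrays.

```c
#include <stdbool.h>

/* Paging levels, numbered as in the patch: 1 is a 4K page, 2 and 3 are large. */
enum { LEVEL_4K = 1, LEVEL_2M = 2, LEVEL_1G = 3 };

/* Hypothetical per-gfn state: one disallow counter per level (0 and 1 unused). */
struct lpage_state {
	int disallow_lpage[LEVEL_1G + 1];
};

/* One more user forbids large mappings of this gfn at every large level. */
static void lpage_disallow(struct lpage_state *s)
{
	int level;

	for (level = LEVEL_2M; level <= LEVEL_1G; ++level)
		s->disallow_lpage[level] += 1;
}

/* Drop one user; large mappings are allowed again once the count is zero. */
static void lpage_allow(struct lpage_state *s)
{
	int level;

	for (level = LEVEL_2M; level <= LEVEL_1G; ++level)
		s->disallow_lpage[level] -= 1;
}

static bool lpage_is_disallowed(const struct lpage_state *s, int level)
{
	return s->disallow_lpage[level] != 0;
}

/* Largest level up to max_level at which the gfn may be mapped. */
static int pick_mapping_level(const struct lpage_state *s, int max_level)
{
	int level;

	for (level = LEVEL_2M; level <= max_level; ++level)
		if (lpage_is_disallowed(s, level))
			break;

	return level - 1;
}
```

With no users the gfn can be mapped at the maximum level; a single
lpage_disallow() forces it down to the 4K level until the matching
lpage_allow().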

Signed-off-by: Xiao Guangrong 
---
 Documentation/virtual/kvm/mmu.txt |  6 +++---
 arch/x86/include/asm/kvm_host.h   |  2 +-
 arch/x86/kvm/mmu.c| 25 +
 arch/x86/kvm/x86.c| 14 --
 4 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
index daf9c0f..dda2e93 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -391,11 +391,11 @@ To instantiate a large spte, four constraints must be satisfied:
   write-protected pages
 - the guest page must be wholly contained by a single memory slot
 
-To check the last two conditions, the mmu maintains a ->write_count set of
+To check the last two conditions, the mmu maintains a ->disallow_lpage set of
 arrays for each memory slot and large page size.  Every write protected page
-causes its write_count to be incremented, thus preventing instantiation of
+causes its disallow_lpage to be incremented, thus preventing instantiation of
 a large spte.  The frames at the end of an unaligned memory slot have
-artificially inflated ->write_counts so they can never be instantiated.
+artificially inflated ->disallow_lpages so they can never be instantiated.
 
 Zapping all pages (page generation count)
 =
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7dd6d55..e1c1f57 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -644,7 +644,7 @@ struct kvm_vcpu_arch {
 };
 
 struct kvm_lpage_info {
-   int write_count;
+   int disallow_lpage;
 };
 
 struct kvm_arch_memory_slot {
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 95a955d..de9e992 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -789,7 +789,7 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
slot = __gfn_to_memslot(slots, gfn);
for (i = PT_DIRECTORY_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
linfo = lpage_info_slot(gfn, slot, i);
-   linfo->write_count += 1;
+   linfo->disallow_lpage += 1;
}
kvm->arch.indirect_shadow_pages++;
 }
@@ -807,31 +807,32 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
slot = __gfn_to_memslot(slots, gfn);
for (i = PT_DIRECTORY_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
linfo = lpage_info_slot(gfn, slot, i);
-   linfo->write_count -= 1;
-   WARN_ON(linfo->write_count < 0);
+   linfo->disallow_lpage -= 1;
+   WARN_ON(linfo->disallow_lpage < 0);
}
kvm->arch.indirect_shadow_pages--;
 }
 
-static int __has_wrprotected_page(gfn_t gfn, int level,
- struct kvm_memory_slot *slot)
+static bool __mmu_gfn_lpage_is_disallowed(gfn_t gfn, int level,
+ struct kvm_memory_slot *slot)
 {
struct kvm_lpage_info *linfo;
 
if (slot) {
linfo = lpage_info_slot(gfn, slot, level);
-   return linfo->write_count;
+   return !!linfo->disallow_lpage;
}
 
-   return 1;
+   return true;
 }
 
-static int has_wrprotected_page(struct kvm_vcpu *vcpu, gfn_t gfn, int level)
+static bool mmu_gfn_lpage_is_disallowed(struct kvm_vcpu *vcpu, gfn_t gfn,
+   int level)
 {
struct kvm_memory_slot *slot;
 
slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
-   return __has_wrprotected_page(gfn, level, slot);
+   return __mmu_gfn_lpage_is_disallowed(gfn, level, slot);
 }
 
 static int host_mapping_level(struct kvm *kvm, gfn_t gfn)
@@ -897,7 +898,7 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn,
max_level = min(kvm_x86_ops->get_lpage_level(), host_level);
 
for (level = PT_DIRECTORY_LEVEL; level <= max_level; ++level)
-   if (__has_wrprotected_page(large_gfn, level, slot))
+   if (__mmu_gfn_lpage_is_disallowed(large_gfn, level, slot))
break;
 
return level - 1;
@@ -2503,7 +2504,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 * be fixed if guest refault.
 */
if (level > PT_PAGE_TABLE_LEVEL &&
-   has_wrprotected_page(vcpu, gfn, level))
+   mmu_gfn_lpage_is_disallowed(vcpu, gfn, level))

Re: Kernel docs: muddying the waters a bit

2016-02-14 Thread Daniel Vetter

On Sun, Feb 14, 2016 at 1:57 AM, Keith Packard  wrote:
> Jonathan Corbet  writes:
>
>> Asciidoc is a credible solution to the formatted documentation problem,
>> but it's not the only such; I'd like to be sure that we pick the right
>> one.  I worry that asciidoc seems to be aimed mostly at small documents,
>> and that the project itself seems a little lifeless - it's not a good
>> sign when your main page's link to the repository has been dead for a long
>> time.  (Asciidoctor seems more active, with the Github folks behind it,
>> but that means bringing Ruby into the picture).
>
> I was surprised when one of the asciidoctor developers said that
> asciidoc itself was 'in maintenance mode for existing users'. I've tried
> asciidoctor but never got it to the point where I was happy with the
> results. Having two tools using the same nominal format doesn't seem
> like a great idea to me.
>
> It's also clear from my hacking in asciidoc that docbook is the expected
> target for that tool. I've managed to make direct HTML output usable,
> but LaTeX doesn't work at all. Something which focuses on direct HTML
> (and ePub) output would be pretty nice.
>
>> An alternative we haven't really looked at yet is ReStructuredText (or
>> "RST") and the Sphinx system (sphinx-doc.org) built on top of it.  RST is
>> YA simple markup scheme, remarkably similar to Markdown or Asciidoc;
>> Sphinx is a fairly sophisticated documentation system that uses RST.
>
> I've installed debian's python3-sphinx package; it looks like it doesn't
> have a huge dependency chain below it, which is a nice change.
>
> I translated a fairly long document from asciidoc to rst using pandoc by
> using the docbook output from asciidoc -- pandoc doesn't have a native
> asciidoc reader, only a writer. The result didn't totally suck, although
> I haven't messed with fixing the css or using a different theme at all.
>
> http://keithp.com/~keithp/altusmetrum-sphinx/altusmetrum.html
>
> I installed the sphinxcontrib.fulltoc extension so that the whole TOC
> was visible from each section; this made navigating a lot easier. Having
> search included (if you have javascript) seems like a nice feature.
>
>> Like asciidoc, Sphinx is Python-based, so it adds little to the toolchain
>> requirements there.
>
> Having functional native latex output means that even PDF generation is
> lighterweight though.
>
>> It produces integrated, multi-file HTML natively,
>> with a TOC, an index, cross-file cross references, and more.  It can make
>> things like function indexes.  It claims output in epub, docbook, and man
>> (I've not yet messed with those).  The path to PDF is via latex; clearly
>> the docbook path could be used too.
>
> I've tried epub and latex backends; epub seems just fine (it's just
> html, after all). LaTeX works, and generates functional PDF, but I'm
> going to have to spend a bunch of time making it looks nice.
>
> http://keithp.com/~keithp/altusmetrum-sphinx/AltusMetrum.pdf
>
>> So can we discuss?  I'm not saying we have to use Sphinx, but, should we
>> choose not to, we should do so with open eyes and good reasons for the
>> course we do take.  What do you all think?
>
> Having spent the afternoon playing with it, I'm definitely
> impressed. I've spent a ton of time getting asciidoc to generate html
> and pdf that I can tolerate; far too much of that involved hacking XML
> files related to the docbook backend.
>
> Pros
>
>  * Credible HTML output without docbook
>
>  * Credible PDF output without docbook.
>
>  * Constructs a unified set of documents across
>multiple files.
>
>  * Written in Python (2 or 3)
>
>  * PanDoc already supports rst for both input and output. So, if we get
>bored with RST, we've got a way out.
>
> Cons
>
>  * Table formatting doesn't seem as sophisticated as asciidoc
>
> Questions
>
>  * Conditional text appears to be harder to manage (I haven't managed to
>make it work at all).
>
>  * Takes over a directory making building more than one
>document in a directory hard/impossible? The config file must be
>named 'conf.py'?

One open concern I have for the pros/cons is the hyperlinks from kerneldoc
comments. Currently we have the postproc hack; iirc Jani's patches
generated links natively when extracting the kerneldoc. What's the
solution with sphinx?

The other one is graphs - Keith showed me some neat stuff that
asciidoc can do, and I definitely wanted to integrate something like
that as a follow-up into the kerneldoc toolchain. Often a diagram is a
lot more helpful than lots of words. Can sphinx give us that too?

Wrt reformatting: I'm not going to like it, but I hope that with a bit
of sed we can easily fix up any of the asciidoc comments we already
have - right now we don't use much of the more sophisticated markup
yet. So much better to change now than 1 year down the road.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[PATCH] x86/boot: remove unused is_big_kernel variable

2016-02-14 Thread Nicolas Iooss

Variable is_big_kernel is defined in arch/x86/boot/tools/build.c but
never used anywhere.

Signed-off-by: Nicolas Iooss 
---
 arch/x86/boot/tools/build.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c
index a7661c430cd9..0702d2531bc7 100644
--- a/arch/x86/boot/tools/build.c
+++ b/arch/x86/boot/tools/build.c
@@ -49,7 +49,6 @@ typedef unsigned int   u32;
 
 /* This must be large enough to hold the entire setup */
 u8 buf[SETUP_SECT_MAX*512];
-int is_big_kernel;
 
 #define PECOFF_RELOC_RESERVE 0x20
 
-- 
2.7.1

[PATCH] mfd: stmpe: add the proper PWM resources

2016-02-14 Thread Linus Walleij

This adds the PWM resources to the STMPE MFD driver, so that
it can properly grab and use them.

Cc: linux-...@vger.kernel.org
Cc: Thierry Reding 
Signed-off-by: Linus Walleij 
---
ChangeLog: split this patch off from the PWM driver and sent
separately. This can be merged separately too so we need no
criss-cross between MFD and PWM anymore, Lee: this one is for
you if it looks all right, excess newlines are gone too.
---
 drivers/mfd/stmpe.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/drivers/mfd/stmpe.c b/drivers/mfd/stmpe.c
index 8222e374e4b1..fb8f9e8b75df 100644
--- a/drivers/mfd/stmpe.c
+++ b/drivers/mfd/stmpe.c
@@ -334,6 +334,31 @@ static const struct mfd_cell stmpe_keypad_cell = {
 };
 
 /*
+ * PWM (1601, 2401, 2403)
+ */
+static struct resource stmpe_pwm_resources[] = {
+   {
+   .name   = "PWM0",
+   .flags  = IORESOURCE_IRQ,
+   },
+   {
+   .name   = "PWM1",
+   .flags  = IORESOURCE_IRQ,
+   },
+   {
+   .name   = "PWM2",
+   .flags  = IORESOURCE_IRQ,
+   },
+};
+
+static const struct mfd_cell stmpe_pwm_cell = {
+   .name   = "stmpe-pwm",
+   .of_compatible  = "st,stmpe-pwm",
+   .resources  = stmpe_pwm_resources,
+   .num_resources  = ARRAY_SIZE(stmpe_pwm_resources),
+};
+
+/*
  * STMPE801
  */
 static const u8 stmpe801_regs[] = {
@@ -537,6 +562,11 @@ static struct stmpe_variant_block stmpe1601_blocks[] = {
.irq= STMPE1601_IRQ_KEYPAD,
.block  = STMPE_BLOCK_KEYPAD,
},
+   {
+   .cell   = &stmpe_pwm_cell,
+   .irq= STMPE1601_IRQ_PWM0,
+   .block  = STMPE_BLOCK_PWM,
+   },
 };
 
 /* supported autosleep timeout delay (in msecs) */
@@ -771,6 +801,11 @@ static struct stmpe_variant_block stmpe24xx_blocks[] = {
.irq= STMPE24XX_IRQ_KEYPAD,
.block  = STMPE_BLOCK_KEYPAD,
},
+   {
+   .cell   = &stmpe_pwm_cell,
+   .irq= STMPE24XX_IRQ_PWM0,
+   .block  = STMPE_BLOCK_PWM,
+   },
 };
 
 static int stmpe24xx_enable(struct stmpe *stmpe, unsigned int blocks,
-- 
2.4.3

Re: [PATCH v4 1/2] ARM: dts: imx6: Add support for Toradex Apalis iMX6Q/D SoM

2016-02-14 Thread Linus Walleij

On Thu, Feb 4, 2016 at 12:02 AM, Marcel Ziswiler wrote:
> On Wed, 2016-02-03 at 23:55 +0100, Marcel Ziswiler wrote:
>> On Mon, 2016-02-01 at 21:29 +0800, Shawn Guo wrote:
>>
>> [snip]
>>
>> > +   /* STMPE811 touch screen controller */
>> > > + stmpe811@41 {
>> > > + compatible = "st,stmpe811";
>> >
>> > The compatible seems to be used by kernel without a binding
>> > document.
>>
>> Indeed and looking at git logs Linus himself introduced it (;-p):
>>
>> commit 5a826feedc331a2d5ced2afd832199a70b3af891
>> Author: Linus Walleij 
>> Date:   Wed Apr 23 23:35:58 2014 +0200
>> Subject: mfd: stmpe: Probe properly from the Device Tree
>
> Looking more closely there is actually a proper binding document which
> got added by Lee Jones:
>
> commit 84e6de813b2d1bdb127943d3c8edf1c1afaa90da
> Author: Lee Jones 
> Date:   Mon Nov 5 16:10:34 2012 +0100
> Subject: Documentation: Describe bindings for STMPE Multi-Functional
> Device driver
>
> Documentation/devicetree/bindings/mfd/stmpe.txt
>
> Required properties:
>  - compatible   : "st,stmpe[811|1601|2401|2403]"
>
> Sorry for the noise.

Yeah, the STMPE driver was joint work between Rabin and
Luotao Fu, who contributed the 811 with touchscreen support.

Then Lee added the device tree stuff.

I know it's a bit confusing that the DT bindings for the whole shebang
are in bindings/mfd, but it would also be confusing to split them up. I
don't know what is best really :(

Yours,
Linus Walleij

next: sparc64 crashes due to 'blk-mq: dynamic h/w context count'

2016-02-14 Thread Guenter Roeck

Hi,

my runtime tests of linux-next crash for sparc64 due to commit 'blk-mq: dynamic
h/w context count'. Reverting the patch fixes the problem. Bisect log is
attached below. Full crash log is available at http://kerneltests.org/builders,
in the table with qemu test results.

Guenter

---
crash log:

[2.470860] Unable to handle kernel paging request at virtual address e000
[2.471099] tsk->{mm,active_mm}->context = 
[2.471263] tsk->{mm,active_mm}->pgd = f8402000
[2.471416]   \|/  \|/
[2.471416]   "@'/ .. \`@"
[2.471416]   /_| \__/ |_\
[2.471416]  \__U_/
[2.471848] bioset(350): Oops [#1]
[2.472072] CPU: 0 PID: 350 Comm: bioset Not tainted 4.5.0-rc3-next-20160212 #1
[2.472418] task: f8001f2369e0 ti: f8001f354000 task.ti: f8001f354000
[2.472641] TSTATE: 008080e01603 TPC: 00470d70 TNPC: 00470d74 Y: 0016Not tainted
[2.472933] TPC: 
[2.473062] g0: f8001f2369e0 g1:  g2: 04208060 g3: 
[2.473304] g4: f8001f2369e0 g5:  g6: f8001f354000 g7: 0400
[2.473546] o0:  o1: ffec o2: 0008 o3: 00015ab9
[2.473788] o4: 00a11800 o5:  sp: f8001f3574e1 ret_pc: 00470d4c
[2.474039] RPC: 
[2.474176] l0: f8001f213800 l1: f8001f213870 l2: 00a16000 l3: 
[2.474426] l4: 0001 l5: 00a66000 l6: 0001 l7: 0096c990
[2.474671] i0: f8001f3328d0 i1:  i2: 00abab10 i3: 0082
[2.474915] i4: 00a25e58 i5: f8001f3328a0 i6: f8001f3575a1 i7: 00475b48
[2.475171] I7: 
[2.475292] Call Trace:
[2.475400]  [00475b48] kthread+0xa8/0xe0
[2.475547]  [00405fa4] ret_from_fork+0x1c/0x2c
[2.475729]  []   (null)
[2.475879] Disabling lock debugging due to kernel taint
[2.476063] Caller[00475b48]: kthread+0xa8/0xe0
[2.476228] Caller[00405fa4]: ret_from_fork+0x1c/0x2c
[2.476392] Caller[]:   (null)
[2.476545] Instruction DUMP: 02600070  0100  f25c2070  b6067f80  c071  c25e6008  c45e4000  c270a008
[2.477010] Unable to handle kernel paging request at virtual address e000
[2.477233] tsk->{mm,active_mm}->context = 
[2.477388] tsk->{mm,active_mm}->pgd = f8402000
[2.477533]   \|/  \|/
[2.477533]   "@'/ .. \`@"
[2.477533]   /_| \__/ |_\
[2.477533]  \__U_/
[2.477941] bioset(350): Oops [#2]
[2.478085] CPU: 0 PID: 350 Comm: bioset Tainted: G  D 4.5.0-rc3-next-20160212 #1
[2.478333] task: f8001f2369e0 ti: f8001f354000 task.ti: f8001f354000
[2.478550] TSTATE: 11e01603 TPC: 00476148 TNPC: 004717b0 Y: 0190Tainted: G  D
[2.478854] TPC: 
[2.478969] g0: f8001f357790 g1:  g2: 0420806c g3: 0004
[2.479207] g4: f8001f2369e0 g5:  g6: f8001f354000 g7: ffd23940
[2.479449] o0: f8001f2369e0 o1: f8001f2369e0 o2: 00a11800 o3: 000166b9
[2.479687] o4: 00a11800 o5: 00a11a18 sp: f8001f356e71 ret_pc: 004717a8
[2.479937] RPC: 
[2.480064] l0: 007b l1: 00abceb0 l2: 0080 l3: 0005
[2.480305] l4: 2290 l5: 00afac00 l6:  l7: 
[2.480554] i0:  i1:  i2: 0001 i3: 015e
[2.480795] i4: 000e i5: 000e i6: f8001f356f21 i7: 008b7e4c
[2.481039] I7: 
[2.481154] Call Trace:
[2.481231]  [008b7e4c] switch_to_pc+0xa0/0x394
[2.481380]  [008b825c] schedule+0x1c/0xa0
[2.481519]  [0045e218] do_exit+0x578/0x9a0
[2.481659]  [00427b78] die_if_kernel+0x198/0x320
[2.481813]  [008bb848] unhandled_fault+0x8c/0xa4
[2.481968]  [008bbe68] do_sparc64_fault+0x608/0x720
[2.482128]  [00407ac4] sparc64_realfault_common+0x10/0x20
[2.482316]  [00470d70] rescuer_thread+0x70/0x2c0
[2.482477]  [00475b48] kthread+0xa8/0xe0
[2.482613]  [00405fa4] ret_from_fork+0x1c/0x2c
[2.482759]  []   (null)
[2.482897] Caller[008b7e4c]: switch_to_pc+0xa0/0x394
[2.483064] Caller[008b825c]: schedule+0x1c/0xa0
[2.483218] Caller[0045e218]: do_exit+0x578/0x9a0
[2.483374] Caller[00427b78]: die_if_kernel+0x198/0x320
[2.483544] Caller[008bb848]: unhandled_fault+0x8c/0x

Re: [PATCH v2] gpio: Add driver for TI TPIC2810

2016-02-14 Thread Linus Walleij

On Wed, Feb 10, 2016 at 3:29 PM, Andy Shevchenko wrote:
> On Wed, Feb 10, 2016 at 4:21 PM, Linus Walleij wrote:
>> On Sun, Jan 31, 2016 at 11:52 PM, Andy Shevchenko wrote:
>>
>>> It reminds me how 12 channel PWM chip is used on Intel Galileo Gen 2.
>>> Half pins are PWM, the other half is GPIO used for discrete based pin
>>> muxing and control. Nevertheless I think it's a userspace issue for
>>> now, otherwise we have to provide some 'semi-virtual' way of
>>> presenting pins as GPIO lines.
>>
>> That sounds like an MFD spawning a GPIO and a PWM cell.
>> That it is called "a PWM chip" is no big deal, it should be
>> modeled according to what it is, not what it claims to be.
>
> Although I agree with model I barely imagine how in this case drivers
> should access PWM chip registers in non-race way (take into account
> that PWM itself is connected to i2c bus).

There is a pattern for that. You add a set of accessor functions that
perform the I2C traffic in the MFD layer.

The accessor functions take a mutex. Since this is all slowpath,
sleeping or being preempted on a mutex is perfectly fine for all
subdrivers.

Look at this:

/**
 * stmpe_reg_write() - write a single STMPE register
 * @stmpe:	Device to write to
 * @reg:	Register to write
 * @val:	Value to write
 */
int stmpe_reg_write(struct stmpe *stmpe, u8 reg, u8 val)
{
	int ret;

	mutex_lock(&stmpe->lock);
	ret = __stmpe_reg_write(stmpe, reg, val);
	mutex_unlock(&stmpe->lock);

	return ret;
}
}
EXPORT_SYMBOL_GPL(stmpe_reg_write);
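
Editor's sketch: the accessor pattern can be tried out in userspace. This is
only an illustration under stated assumptions: a pthread mutex stands in for
the kernel mutex, a plain array stands in for the I2C register traffic, and
every sketch_* name is hypothetical rather than part of the STMPE driver. The
read-modify-write helper shows why the lock lives in the shared layer: a GPIO
and a PWM subdriver touching the same register cannot interleave.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared MFD device state. */
struct sketch_chip {
	pthread_mutex_t lock;	/* serializes all register traffic */
	uint8_t regs[256];	/* fake register file instead of real I2C */
};

/* Unlocked helpers: the caller must hold chip->lock. */
static int sketch_reg_read_unlocked(struct sketch_chip *chip, uint8_t reg)
{
	return chip->regs[reg];
}

static int sketch_reg_write_unlocked(struct sketch_chip *chip, uint8_t reg,
				     uint8_t val)
{
	chip->regs[reg] = val;
	return 0;
}

/* Locked accessors: these are what the subdrivers would call. */
int sketch_reg_read(struct sketch_chip *chip, uint8_t reg)
{
	int ret;

	pthread_mutex_lock(&chip->lock);
	ret = sketch_reg_read_unlocked(chip, reg);
	pthread_mutex_unlock(&chip->lock);

	return ret;
}

int sketch_reg_write(struct sketch_chip *chip, uint8_t reg, uint8_t val)
{
	int ret;

	pthread_mutex_lock(&chip->lock);
	ret = sketch_reg_write_unlocked(chip, reg, val);
	pthread_mutex_unlock(&chip->lock);

	return ret;
}

/* Read-modify-write done under a single lock hold. */
int sketch_set_bits(struct sketch_chip *chip, uint8_t reg, uint8_t mask,
		    uint8_t val)
{
	int ret;

	pthread_mutex_lock(&chip->lock);
	ret = sketch_reg_read_unlocked(chip, reg);
	if (ret >= 0)
		ret = sketch_reg_write_unlocked(chip, reg,
				(uint8_t)((ret & ~mask) | (val & mask)));
	pthread_mutex_unlock(&chip->lock);

	return ret;
}
```

A subdriver would only ever call the locked functions; the unlocked helpers
stay private to the shared layer.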

Yours,
Linus Walleij

Re: About support XZ-compressed kernel on x86

2016-02-14 Thread Baoquan He

On 02/13/16 at 08:57pm, Lasse Collin wrote:
> On 2016-02-12 Baoquan He wrote:
> > Now I have a question about the commit from you:
> > 
> > commit 303148045aac34b70db722a54e5ad94a3a6625c6
> > Author: Lasse Collin 
> > Date:   Wed Jan 12 17:01:24 2011 -0800
> > 
> > x86: support XZ-compressed kernel
> > 
> > 
> > In this commit for adding support of XZ-compressed kernel on x86, you
> > add extra 32K to the extract_offset. In commit log you said this is
> > because "The XZ decompressor needs around 30 KiB of heap, so the heap
> > size is increased to 32 KiB on both x86-32 and x86-64." With my
> > understanding decompression is done in decompression stage and it uses
> > boot_heap in arch/x86/boot/compressed/head_64.S, and boot_heap is
> > assigned to free_mem_ptr which is used for decompression heap malloc.
> > During this decompressio stage it's still in copied ZO space, why did
> > you add extra 32K space to extract_offset?  If you want to increase
> > the decompression heap space shouldn't you decrease the
> > extract_offset? Do I misunderstand anything or miss things?
> 
> The reason to increase the heap size in arch/x86/include/asm/boot.h is
> unrelated to the reason why the offset was changed in
> arch/x86/boot/compressed/mkpiggy.c.
> 
> The long comment in arch/x86/boot/compressed/misc.c explains the need
> for the offset for gzip/Deflate. A similar comment in
> lib/decompress_unxz.c explains it for XZ/LZMA2.

Thank you so much, Lasse. You pinpointed my confusion exactly.
Yeah, I didn't understand it well. Your description for xz in
lib/decompress_unxz.c is very helpful. The 64K is the maximum payload in
one chunk. Adding this 64K covers the worst case, where a very small
payload can represent 64K of uncompressed output data. As I understand
it, that could be a chunk whose content is entirely duplicated, like
all "0" or something similar?

Thanks
Baoquan

> 
> Smaller safety-margins can work in practice since the calculated
> margins are for the worst case. I'm not even sure if such calculations
> have been done for the other decompressors in Linux.
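
Editor's sketch: the in-place safety margin discussed above can be written
out as arithmetic. The constants below are an assumption: they follow the
formula as I recall it from the comment in lib/decompress_unxz.c (128 bytes
of fixed overhead, 8 bytes per 32 KiB of uncompressed data, plus 64 KiB), so
check that file before relying on them. The function name is hypothetical.

```c
#include <stddef.h>

/*
 * Worst-case extra room for in-place XZ decompression.
 * Assumed formula (verify against lib/decompress_unxz.c):
 *   128 + uncompressed_size * 8 / 32768 + 65536
 */
size_t sketch_xz_safety_margin(size_t uncompressed_size)
{
	return 128 + (uncompressed_size >> 12) + 65536;
}
```

Under this assumption the margin grows very slowly with kernel size: the
64 KiB term dominates for any realistic image.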

[PATCH] irqchip/ts4800: Make ts4800_ic_ops static const

2016-02-14 Thread Axel Lin

ts4800_ic_ops is only referenced in this driver, so make it static.
In addition, it is never modified, so make it const as well.

Signed-off-by: Axel Lin 
---
 drivers/irqchip/irq-ts4800.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-ts4800.c b/drivers/irqchip/irq-ts4800.c
index 4192bdc..2325fb3 100644
--- a/drivers/irqchip/irq-ts4800.c
+++ b/drivers/irqchip/irq-ts4800.c
@@ -59,7 +59,7 @@ static int ts4800_irqdomain_map(struct irq_domain *d, unsigned int irq,
return 0;
 }
 
-struct irq_domain_ops ts4800_ic_ops = {
+static const struct irq_domain_ops ts4800_ic_ops = {
.map = ts4800_irqdomain_map,
.xlate = irq_domain_xlate_onecell,
 };
-- 
2.1.4

Re: [PATCH] x86/boot: remove unused is_big_kernel variable

2016-02-14 Thread Borislav Petkov

On Sun, Feb 14, 2016 at 01:35:58PM +0100, Nicolas Iooss wrote:
> Variable is_big_kernel is defined in arch/x86/boot/tools/build.c but
> never used anywhere.
> 
> Signed-off-by: Nicolas Iooss 
> ---
>  arch/x86/boot/tools/build.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c
> index a7661c430cd9..0702d2531bc7 100644
> --- a/arch/x86/boot/tools/build.c
> +++ b/arch/x86/boot/tools/build.c
> @@ -49,7 +49,6 @@ typedef unsigned int   u32;
>  
>  /* This must be large enough to hold the entire setup */
>  u8 buf[SETUP_SECT_MAX*512];
> -int is_big_kernel;
>  
>  #define PECOFF_RELOC_RESERVE 0x20
>  
> -- 

Yap, that went out with 5e47c478b0b6 ("x86: remove zImage support")

Reviewed-by: Borislav Petkov 

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

Re: [PATCH v2] x86/kernel: use pr_() and dev_

2016-02-14 Thread Borislav Petkov

On Sun, Feb 14, 2016 at 12:10:47PM +0800, Chen Yucong wrote:
> arch/x86/kernel/* use a mixture of printk(KERN_<LEVEL> ) and pr_<level>().
> This patch converts the bulk of printk(KERN_<LEVEL> ) to pr_<level>() and
> uses dev_dbg() instead of the dev_printk(KERN_DEBUG,). All pr_warning()
> calls have been replaced with pr_warn().
> 
> Not sure what to do about the printk(KERN_DEFAULT) and printk() without a
> log level.

...

> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 25f9093..0ecb579 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -59,7 +59,7 @@ __setup("noreplace-paravirt", setup_noreplace_paravirt);
>  #define DPRINTK(fmt, args...)
> \
>  do { \
>   if (debug_alternative)  \
> - printk(KERN_DEBUG "%s: " fmt "\n", __func__, ##args);   \
> + pr_debug("%s: " fmt "\n", __func__, ##args);\
>  } while (0)
>  
>  #define DUMP_BYTES(buf, len, fmt, args...)   \
> @@ -70,10 +70,10 @@ do {  
> \
>   if (!(len)) \
>   break;  \
>   \
> - printk(KERN_DEBUG fmt, ##args); \
> + pr_debug(fmt, ##args);  \
>   for (j = 0; j < (len) - 1; j++) \
> - printk(KERN_CONT "%02hhx ", buf[j]);\
> - printk(KERN_CONT "%02hhx\n", buf[j]);   \
> + pr_cont("%02hhx ", buf[j]); \
> + pr_cont("%02hhx\n", buf[j]);\
>   }   \
>  } while (0)
>  

NAK the hell out of that hunk!

Did you actually look at how pr_debug() is defined?

Yeah, I don't think so. With your change, when I boot with
"debug-alternative" I get:

...
[0.064005] Last level dTLB entries: 4KB 512, 2MB 255, 4MB 127, 1GB 0
[0.068005] e9 d5 3e d3 00
[0.072004] e9 e8 92 21 ff
[0.075003] eb 11 0f 1f 00
[0.077906] e8 c5 b6 30 00
[0.084009] f3 48 0f b8 c7
[0.091241] f3 48 0f b8 c7
[0.094259] e8 d9 b5 30 00
[0.097611] f3 48 0f b8 c7
[0.14] f3 48 0f b8 c7
[0.103067] 90 90 90
[0.105575] 0f ae f0
[0.108004] 0f ae f0
[0.112007] 90 90 90
[0.114331] 0f ae f0
[0.116004] 0f ae f0
[0.118365] e8 07 21 62 ff

How is that useful?!
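
Editor's sketch: the log above is what one would expect if the pr_debug()
prefix line is dropped while the pr_cont() byte dump still prints. In the
kernel, pr_debug() produces no output unless DEBUG is defined or the call
site is enabled through dynamic debug, whereas pr_cont() always prints. The
userspace model below is only an assumption-laden illustration: the sketch_*
names are hypothetical, and output goes to a buffer instead of the kernel
log.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

static char sketch_log[256];
static size_t sketch_log_len;

/* Append formatted text to the fake log buffer. */
static void sketch_emit(const char *fmt, ...)
{
	va_list ap;
	int n;

	if (sketch_log_len >= sizeof(sketch_log) - 1)
		return;

	va_start(ap, fmt);
	n = vsnprintf(sketch_log + sketch_log_len,
		      sizeof(sketch_log) - sketch_log_len, fmt, ap);
	va_end(ap);

	if (n > 0) {
		sketch_log_len += (size_t)n;
		if (sketch_log_len > sizeof(sketch_log) - 1)
			sketch_log_len = sizeof(sketch_log) - 1;
	}
}

/* Debug output vanishes unless SKETCH_DEBUG is defined at build time. */
#ifdef SKETCH_DEBUG
#define sketch_pr_debug(...) sketch_emit(__VA_ARGS__)
#else
#define sketch_pr_debug(...) do { if (0) sketch_emit(__VA_ARGS__); } while (0)
#endif

/* Continuation output is unconditional. */
#define sketch_pr_cont(...) sketch_emit(__VA_ARGS__)

/* Same shape as the converted DUMP_BYTES() macro from the quoted hunk. */
void sketch_dump_bytes(const char *prefix, const unsigned char *buf,
		       size_t len)
{
	size_t j;

	if (!len)
		return;

	sketch_pr_debug("%s", prefix);		/* silently dropped */
	for (j = 0; j < len - 1; j++)
		sketch_pr_cont("%02hhx ", buf[j]);
	sketch_pr_cont("%02hhx\n", buf[j]);	/* still printed */
}
```

Built without SKETCH_DEBUG, the buffer ends up holding only the hex bytes
with no indication of what they belong to, which mirrors the boot log quoted
in the mail.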

Please stop for a second with those senseless conversions and think
first. Try the change you're doing in kvm, take a look at what it
affects and think hard whether it makes any sense at all. Only if it
does, *then* send out the patch.

I'm willing to bet that *all* pr_debug* conversions below are wrong too.

Geez :(

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

Re: [PATCH v42 5/6] clk: clk_put WARNs if user has not disabled clk

2016-02-14 Thread Geert Uytterhoeven

Hi Stephen, Mike,

On Sat, Feb 13, 2016 at 2:18 AM, Stephen Boyd  wrote:
> On 02/11, Michael Turquette wrote:
>> From the clk_put kerneldoc in include/linux/clk.h:
>>
>> """
>> Note: drivers must ensure that all clk_enable calls made on this clock
>> source are balanced by clk_disable calls prior to calling this function.
>> """
>>
>> The common clock framework implementation of the clk.h api has per-user
>> reference counts for calls to clk_prepare and clk_enable. As such it
>> can enforce the requirement to properly call clk_disable and
>> clk_unprepare before calling clk_put.
>>
>> Because this requirement is probably violated in many places, this patch
>> starts with a simple warning. Once offending code has been fixed this
>> check could additionally release the reference counts automatically.
>
> Do we have any fixes for pm code in the works? I'm worried we're
> going to be giving a warning and nobody will fix them or has a
> plan to fix them.
drivers/base/power/clock_ops.c
AFAIK not.

I've been running with the above patch for several months, and I had
to remove the two clk_put() calls in {en,dis}able_clock() in
drivers/base/power/clock_ops.c to get rid of the warnings when using the
legacy clock domain.

Fixing drivers/base/power/clock_ops.c is non-trivial though, as you need a
place to store the clk's reference obtained in enable_clock(), for later use in
disable_clock().
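
Editor's sketch: the "place to store the reference" could look roughly like
the following. This is a toy model, not the clock_ops.c code: the sketch_*
types and functions are hypothetical stand-ins for struct clk, clk_get() and
clk_put(), and the counters are simplified to one per clock. The point is
only that the handle obtained when enabling is kept in a slot, so that the
disable path can balance the enable before the put.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct clk; not the kernel type. */
struct sketch_clk {
	int users;		/* outstanding get references */
	int enable_count;	/* outstanding enable calls */
};

static struct sketch_clk *sketch_clk_get(struct sketch_clk *clk)
{
	clk->users++;
	return clk;
}

/* Returns 1 where a checking clk_put() would warn: still enabled on put. */
static int sketch_clk_put(struct sketch_clk *clk)
{
	int would_warn = clk->enable_count > 0;

	clk->users--;
	return would_warn;
}

/* Hypothetical per-device slot that keeps the reference between calls. */
struct sketch_clock_slot {
	struct sketch_clk *clk;
};

int sketch_enable_clock(struct sketch_clock_slot *slot,
			struct sketch_clk *source)
{
	slot->clk = sketch_clk_get(source);	/* keep the handle */
	slot->clk->enable_count++;
	return 0;
}

int sketch_disable_clock(struct sketch_clock_slot *slot)
{
	int would_warn;

	if (!slot->clk)
		return 0;

	slot->clk->enable_count--;		/* balance the enable first */
	would_warn = sketch_clk_put(slot->clk);	/* then drop the reference */
	slot->clk = NULL;

	return would_warn;
}
```

Dropping the reference right after enabling, without such a slot, is the
case in which the put happens while the enable count is still non-zero.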

However, the plan is to make CONFIG_PM=y mandatory for Renesas ARM
SoCs with clock domains, which makes us no longer users of the legacy clock
domain.

Legacy SH and Davinci/Keystone/OMAP1 users may care, though...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: next: sparc64 crashes due to 'blk-mq: dynamic h/w context count'

2016-02-14 Thread Ming Lei

On Sun, Feb 14, 2016 at 9:17 PM, Guenter Roeck  wrote:
> Hi,
>
> my runtime tests of linux-next crash for sparc64 due to commit 'blk-mq:
> dynamic h/w context count'. Reverting the patch fixes the problem. Bisect
> log is attached below. Full crash log is available at
> http://kerneltests.org/builders, in the table with qemu test results.

Guenter, could you test the patch at the following link to see whether it
fixes the problem?

http://marc.info/?l=linux-kernel&m=145526562410555&w=2

Thanks,


[PATCH 0/5] perf tools: Store CPU cache details under perf data

2016-02-14 Thread Jiri Olsa

hi,
this series adds support for storing CPU cache details in perf data.

  $ perf report --header-only -I
  ...
  # cache info:
  #  L1 Data 32K [0-1]
  #  L1 Instruction 32K [0-1]
  #  L1 Data 32K [2-3]
  #  L1 Instruction 32K [2-3]
  #  L2 Unified 256K [0-1]
  #  L2 Unified 256K [2-3]
  #  L3 Unified 4096K [0-3]
  ...

Plus some libapi additions.

Also available in here:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/cache

thanks,
jirka


---
Jiri Olsa (5):
  tools lib api: Add debug output support
  tools lib api fs: Move filename__read_str into api/fs/fs.c
  tools lib api fs: Add sysfs__read_str function
  perf tools: Initialize libapi debug output
  perf tools: Add perf data cache feature

 tools/lib/api/Build|   1 +
 tools/lib/api/Makefile |   1 +
 tools/lib/api/debug-internal.h |  20 +
 tools/lib/api/debug.c  |  28 +
 tools/lib/api/debug.h  |  10 +
 tools/lib/api/fs/fs.c  |  64 +
 tools/lib/api/fs/fs.h  |   3 ++
 tools/perf/perf.c  |   2 +
 tools/perf/util/debug.c|  21 ++
 tools/perf/util/debug.h|   1 +
 tools/perf/util/env.c  |  13 ++
 tools/perf/util/env.h  |  15 +++
 tools/perf/util/header.c   | 265
 tools/perf/util/header.h   |   1 +
 tools/perf/util/trace-event.c  |   1 +
 tools/perf/util/util.c |  48 --
 tools/perf/util/util.h |   1 -
 17 files changed, 446 insertions(+), 49 deletions(-)
 create mode 100644 tools/lib/api/debug-internal.h
 create mode 100644 tools/lib/api/debug.c
 create mode 100644 tools/lib/api/debug.h
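
Editor's sketch: the cache listing in this cover letter shows each distinct
cache once, ordered by level. One way such a list could be built is sketched
below. It is not the code from patch 5/5: the struct is trimmed to four
fields, all sketch_* names are hypothetical, and the comparator uses a
three-way comparison rather than a subtraction so that unsigned levels can
never wrap.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Trimmed, hypothetical version of a per-cache record. */
struct sketch_cache {
	unsigned int level;
	const char *type;	/* e.g. "Data", "Unified" */
	const char *size;	/* e.g. "32K" */
	const char *map;	/* CPUs sharing the cache, e.g. "0-1" */
};

static int sketch_cache_equal(const struct sketch_cache *a,
			      const struct sketch_cache *b)
{
	return a->level == b->level &&
	       !strcmp(a->type, b->type) &&
	       !strcmp(a->size, b->size) &&
	       !strcmp(a->map, b->map);
}

/*
 * Append c unless an equal entry is already present. Returns the new
 * count. The caller guarantees that list has room for one more entry.
 */
size_t sketch_cache_add(struct sketch_cache *list, size_t cnt,
			const struct sketch_cache *c)
{
	size_t i;

	for (i = 0; i < cnt; i++)
		if (sketch_cache_equal(&list[i], c))
			return cnt;

	list[cnt] = *c;
	return cnt + 1;
}

static int sketch_cache_cmp_level(const void *a, const void *b)
{
	const struct sketch_cache *ca = a;
	const struct sketch_cache *cb = b;

	return (ca->level > cb->level) - (ca->level < cb->level);
}

void sketch_cache_sort(struct sketch_cache *list, size_t cnt)
{
	qsort(list, cnt, sizeof(*list), sketch_cache_cmp_level);
}
```

CPUs that share a cache report identical records, so walking every CPU and
adding each record through sketch_cache_add() leaves one entry per cache.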

[PATCH 5/5] perf tools: Add perf data cache feature

2016-02-14 Thread Jiri Olsa

Store CPU cache details in perf data. They are stored
as a new HEADER_CACHE feature and displayed in the
header info with the -I option:

  $ perf report --header-only -I
  ...
  # cache info:
  #  L1 Data 32K [0-1]
  #  L1 Instruction 32K [0-1]
  #  L1 Data 32K [2-3]
  #  L1 Instruction 32K [2-3]
  #  L2 Unified 256K [0-1]
  #  L2 Unified 256K [2-3]
  #  L3 Unified 4096K [0-3]
  ...

All distinct caches are stored/displayed.

Link: http://lkml.kernel.org/n/tip-byxl1gwto8z9d5hyozprt...@git.kernel.org
Signed-off-by: Jiri Olsa 
---
 tools/perf/util/env.c|  13 +++
 tools/perf/util/env.h|  15 +++
 tools/perf/util/header.c | 265 +++
 tools/perf/util/header.h |   1 +
 4 files changed, 294 insertions(+)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 7dd5939dea2e..02a6970f2495 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -6,6 +6,8 @@ struct perf_env perf_env;
 
 void perf_env__exit(struct perf_env *env)
 {
+   int i;
+
zfree(&env->hostname);
zfree(&env->os_release);
zfree(&env->version);
@@ -19,6 +21,10 @@ void perf_env__exit(struct perf_env *env)
zfree(&env->numa_nodes);
zfree(&env->pmu_mappings);
zfree(&env->cpu);
+
+   for (i = 0; i < env->caches_cnt; i++)
+   cache_level__free(&env->caches[i]);
+   zfree(&env->caches);
 }
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
@@ -75,3 +81,10 @@ int perf_env__read_cpu_topology_map(struct perf_env *env)
env->nr_cpus_avail = nr_cpus;
return 0;
 }
+
+void cache_level__free(struct cache_level *cache)
+{
+   free(cache->type);
+   free(cache->map);
+   free(cache->size);
+}
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 0132b9557c02..9963499820e7 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -1,11 +1,23 @@
 #ifndef __PERF_ENV_H
 #define __PERF_ENV_H
 
+#include 
+
 struct cpu_topology_map {
int socket_id;
int core_id;
 };
 
+struct cache_level {
+   u32 level;
+   u32 line_size;
+   u32 sets;
+   u32 ways;
+   char*type;
+   char*size;
+   char*map;
+};
+
 struct perf_env {
char*hostname;
char*os_release;
@@ -31,6 +43,8 @@ struct perf_env {
char*numa_nodes;
char*pmu_mappings;
struct cpu_topology_map *cpu;
+   struct cache_level  *caches;
+   int  caches_cnt;
 };
 
 extern struct perf_env perf_env;
@@ -41,4 +55,5 @@ int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]);
 
 int perf_env__read_cpu_topology_map(struct perf_env *env);
 
+void cache_level__free(struct cache_level *cache);
 #endif /* __PERF_ENV_H */
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f50b7235ecb6..55d5a7ff25f6 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -23,6 +23,8 @@
 #include "strbuf.h"
 #include "build-id.h"
 #include "data.h"
+#include 
+#include "asm/bug.h"
 
 /*
  * magic2 = "PERFILE2"
@@ -868,6 +870,197 @@ static int write_auxtrace(int fd, struct perf_header *h,
return err;
 }
 
+static int cache_level__sort(const void *a, const void *b)
+{
+   struct cache_level *cache_a = (struct cache_level *) a;
+   struct cache_level *cache_b = (struct cache_level *) b;
+
+   return cache_a->level - cache_b->level;
+}
+
+static bool cache_level__cmp(struct cache_level *a, struct cache_level *b)
+{
+   if (a->level != b->level)
+   return false;
+
+   if (a->line_size != b->line_size)
+   return false;
+
+   if (a->sets != b->sets)
+   return false;
+
+   if (a->ways != b->ways)
+   return false;
+
+   if (strcmp(a->type, b->type))
+   return false;
+
+   if (strcmp(a->size, b->size))
+   return false;
+
+   if (strcmp(a->map, b->map))
+   return false;
+
+   return true;
+}
+
+static int cache_level__read(struct cache_level *cache, u32 cpu, u16 level)
+{
+   char path[PATH_MAX], file[PATH_MAX];
+   struct stat st;
+   size_t len;
+
+   scnprintf(path, PATH_MAX, "devices/system/cpu/cpu%d/cache/index%d/", cpu, level);
+   scnprintf(file, PATH_MAX, "%s/%s", sysfs__mountpoint(), path);
+
+   if (stat(file, &st))
+   return 1;
+
+   scnprintf(file, PATH_MAX, "%s/level", path);
+   if (sysfs__read_int(file, (int *) &cache->level))
+   return -1;
+
+   scnprintf(file, PATH_MAX, "%s/coherency_line_size", path);
+   if (sysfs__read_int(file, (int *) &cache->line_size))
+   return -1;
+
+   scnprintf(file, PATH_MAX, "%s/number_of_sets", path);
+   if (sysfs__read_int(file, (int *) &cache->sets))
+   return -1;
+
+   scnprin

[PATCH 4/5] perf tools: Initialize libapi debug output

2016-02-14 Thread Jiri Olsa

Set the libapi debug output functions
to use the perf output functions.

Link: http://lkml.kernel.org/n/tip-apyen6qfzzwkqjhi1tzv7...@git.kernel.org
Signed-off-by: Jiri Olsa 
---
 tools/perf/perf.c   |  2 ++
 tools/perf/util/debug.c | 21 +
 tools/perf/util/debug.h |  1 +
 3 files changed, 24 insertions(+)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index a929618b8eb6..144047c396f0 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -613,6 +613,8 @@ int main(int argc, const char **argv)
 */
pthread__block_sigwinch();
 
+   perf_debug_setup();
+
while (1) {
static int done_help;
int was_alias = run_argv(&argc, &argv);
diff --git a/tools/perf/util/debug.c b/tools/perf/util/debug.c
index 86d9c7302598..152c2e32975f 100644
--- a/tools/perf/util/debug.c
+++ b/tools/perf/util/debug.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cache.h"
 #include "color.h"
@@ -192,3 +193,23 @@ int perf_debug_option(const char *str)
free(s);
return 0;
 }
+
+#define FUNC(__n, __l) \
+static int pr_ ## __n ## _func(const char *fmt, ...)   \
+{  \
+   va_list args;   \
+   int ret;\
+   \
+   va_start(args, fmt);\
+   ret = _eprintf(__l, verbose, fmt, args);\
+   va_end(args);   \
+   return ret; \
+}
+
+FUNC(warning, 0);
+FUNC(debug, 1);
+
+void perf_debug_setup(void)
+{
+   libapi_set_print(pr_warning_func, pr_warning_func, pr_debug_func);
+}
diff --git a/tools/perf/util/debug.h b/tools/perf/util/debug.h
index 8b9a088c32ab..14bafda79eda 100644
--- a/tools/perf/util/debug.h
+++ b/tools/perf/util/debug.h
@@ -53,5 +53,6 @@ int eprintf_time(int level, int var, u64 t, const char *fmt, ...) __attribute__(
 int veprintf(int level, int var, const char *fmt, va_list args);
 
 int perf_debug_option(const char *str);
+void perf_debug_setup(void);
 
 #endif /* __PERF_DEBUG_H */
-- 
2.4.3
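
Editor's sketch: the FUNC() macro in this patch stamps out variadic wrappers
that forward their arguments as a va_list. The same technique in a
standalone form is sketched below. Everything named sketch_* is
hypothetical; in particular sketch_veprintf() is a stand-in that formats
into a buffer when the verbosity is at least the message level, which is an
assumption about how the perf function behaves.

```c
#include <stdarg.h>
#include <stdio.h>

static int sketch_verbose;	/* stand-in for perf's verbosity setting */
static char sketch_out[128];	/* last message, instead of stderr */

/* Print only when the verbosity reaches the message level. */
static int sketch_veprintf(int level, int var, const char *fmt, va_list args)
{
	if (var < level)
		return 0;

	return vsnprintf(sketch_out, sizeof(sketch_out), fmt, args);
}

/* Generate int sketch_pr_<name>_func(const char *fmt, ...) at a level. */
#define SKETCH_FUNC(name, level)					\
static int sketch_pr_##name##_func(const char *fmt, ...)		\
{									\
	va_list args;							\
	int ret;							\
									\
	va_start(args, fmt);						\
	ret = sketch_veprintf(level, sketch_verbose, fmt, args);	\
	va_end(args);							\
	return ret;							\
}

SKETCH_FUNC(warning, 0)
SKETCH_FUNC(debug, 1)
```

The generated functions all share one signature, so they can be handed to a
registration call that takes plain function pointers, which is how the patch
passes its wrappers to libapi_set_print().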

[PATCH 3/5] tools lib api fs: Add sysfs__read_str function

2016-02-14 Thread Jiri Olsa

Add the sysfs__read_str function to simplify reading
string files from sysfs. The new interface is:

  int sysfs__read_str(const char *entry, char **buf, size_t *sizep);

Link: http://lkml.kernel.org/n/tip-em9nb2r0dnbggdxrnyf6w...@git.kernel.org
Signed-off-by: Jiri Olsa 
---
 tools/lib/api/fs/fs.c | 13 +
 tools/lib/api/fs/fs.h |  1 +
 2 files changed, 14 insertions(+)

diff --git a/tools/lib/api/fs/fs.c b/tools/lib/api/fs/fs.c
index 2cbf6773ca5d..ef78c22ff44d 100644
--- a/tools/lib/api/fs/fs.c
+++ b/tools/lib/api/fs/fs.c
@@ -377,6 +377,19 @@ int sysfs__read_int(const char *entry, int *value)
return filename__read_int(path, value);
 }
 
+int sysfs__read_str(const char *entry, char **buf, size_t *sizep)
+{
+   char path[PATH_MAX];
+   const char *sysfs = sysfs__mountpoint();
+
+   if (!sysfs)
+   return -1;
+
+   snprintf(path, sizeof(path), "%s/%s", sysfs, entry);
+
+   return filename__read_str(path, buf, sizep);
+}
+
 int sysctl__read_int(const char *sysctl, int *value)
 {
char path[PATH_MAX];
diff --git a/tools/lib/api/fs/fs.h b/tools/lib/api/fs/fs.h
index 858922b61141..9f6598098dc5 100644
--- a/tools/lib/api/fs/fs.h
+++ b/tools/lib/api/fs/fs.h
@@ -32,4 +32,5 @@ int filename__read_str(const char *filename, char **buf, size_t *sizep);
 int sysctl__read_int(const char *sysctl, int *value);
 int sysfs__read_int(const char *entry, int *value);
 int sysfs__read_ull(const char *entry, unsigned long long *value);
+int sysfs__read_str(const char *entry, char **buf, size_t *sizep);
 #endif /* __API_FS__ */
-- 
2.4.3
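
Editor's sketch: a caller of this interface gets back a heap buffer plus its
size. Judging by the read loop in filename__read_str, the buffer holds the
raw file contents, so it appears not to be NUL-terminated and sysfs values
usually end in a newline. Both points are assumptions drawn from reading the
series, and sketch_buf_to_cstr() is a hypothetical helper, not part of
libapi. It turns such a (buf, size) pair into a trimmed C string.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy size bytes from buf into a fresh NUL-terminated string and strip
 * trailing whitespace. Returns NULL on allocation failure. The caller
 * frees the result and, separately, the original buffer.
 */
char *sketch_buf_to_cstr(const char *buf, size_t size)
{
	char *s = malloc(size + 1);

	if (!s)
		return NULL;

	if (size)
		memcpy(s, buf, size);
	s[size] = '\0';

	while (size > 0 && isspace((unsigned char)s[size - 1]))
		s[--size] = '\0';

	return s;
}
```

Copying into a buffer of size + 1 avoids writing a terminator one byte past
the end of the original allocation.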

[PATCH 2/5] tools lib api fs: Move filename__read_str into api/fs/fs.c

2016-02-14 Thread Jiri Olsa

Similar functions have already been moved here,
and this one will also be needed by the sysfs__read_str
addition in the following patch.

Link: http://lkml.kernel.org/n/tip-azxqdrdsw6fpfpk9ycmyp...@git.kernel.org
Signed-off-by: Jiri Olsa 
---
 tools/lib/api/fs/fs.c | 51 +++
 tools/lib/api/fs/fs.h |  2 ++
 tools/perf/util/trace-event.c |  1 +
 tools/perf/util/util.c| 48 
 tools/perf/util/util.h|  1 -
 5 files changed, 54 insertions(+), 49 deletions(-)

diff --git a/tools/lib/api/fs/fs.c b/tools/lib/api/fs/fs.c
index 459599d1b6c4..2cbf6773ca5d 100644
--- a/tools/lib/api/fs/fs.c
+++ b/tools/lib/api/fs/fs.c
@@ -13,6 +13,7 @@
 #include 
 
 #include "fs.h"
+#include "debug-internal.h"
 
 #define _STR(x) #x
 #define STR(x) _STR(x)
@@ -300,6 +301,56 @@ int filename__read_ull(const char *filename, unsigned long long *value)
return err;
 }
 
+#define STRERR_BUFSIZE  128 /* For the buffer size of strerror_r */
+
+int filename__read_str(const char *filename, char **buf, size_t *sizep)
+{
+   size_t size = 0, alloc_size = 0;
+   void *bf = NULL, *nbf;
+   int fd, n, err = 0;
+   char sbuf[STRERR_BUFSIZE];
+
+   fd = open(filename, O_RDONLY);
+   if (fd < 0)
+   return -errno;
+
+   do {
+   if (size == alloc_size) {
+   alloc_size += BUFSIZ;
+   nbf = realloc(bf, alloc_size);
+   if (!nbf) {
+   err = -ENOMEM;
+   break;
+   }
+
+   bf = nbf;
+   }
+
+   n = read(fd, bf + size, alloc_size - size);
+   if (n < 0) {
+   if (size) {
+   pr_warning("read failed %d: %s\n", errno,
+strerror_r(errno, sbuf, sizeof(sbuf)));
+   err = 0;
+   } else
+   err = -errno;
+
+   break;
+   }
+
+   size += n;
+   } while (n > 0);
+
+   if (!err) {
+   *sizep = size;
+   *buf   = bf;
+   } else
+   free(bf);
+
+   close(fd);
+   return err;
+}
+
 int sysfs__read_ull(const char *entry, unsigned long long *value)
 {
char path[PATH_MAX];
diff --git a/tools/lib/api/fs/fs.h b/tools/lib/api/fs/fs.h
index d024a7f682f6..858922b61141 100644
--- a/tools/lib/api/fs/fs.h
+++ b/tools/lib/api/fs/fs.h
@@ -2,6 +2,7 @@
 #define __API_FS__
 
 #include 
+#include 
 
 /*
  * On most systems  would have given us this, but  not on some 
systems
@@ -26,6 +27,7 @@ FS(tracefs)
 
 int filename__read_int(const char *filename, int *value);
 int filename__read_ull(const char *filename, unsigned long long *value);
+int filename__read_str(const char *filename, char **buf, size_t *sizep);
 
 int sysctl__read_int(const char *sysctl, int *value);
 int sysfs__read_int(const char *entry, int *value);
diff --git a/tools/perf/util/trace-event.c b/tools/perf/util/trace-event.c
index 802bb868d446..8ae051e0ec79 100644
--- a/tools/perf/util/trace-event.c
+++ b/tools/perf/util/trace-event.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "trace-event.h"
 #include "machine.h"
 #include "util.h"
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index b9e2843cfbe7..35b20dd454de 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -507,54 +507,6 @@ int parse_callchain_record(const char *arg, struct callchain_param *param)
return ret;
 }
 
-int filename__read_str(const char *filename, char **buf, size_t *sizep)
-{
-   size_t size = 0, alloc_size = 0;
-   void *bf = NULL, *nbf;
-   int fd, n, err = 0;
-   char sbuf[STRERR_BUFSIZE];
-
-   fd = open(filename, O_RDONLY);
-   if (fd < 0)
-   return -errno;
-
-   do {
-   if (size == alloc_size) {
-   alloc_size += BUFSIZ;
-   nbf = realloc(bf, alloc_size);
-   if (!nbf) {
-   err = -ENOMEM;
-   break;
-   }
-
-   bf = nbf;
-   }
-
-   n = read(fd, bf + size, alloc_size - size);
-   if (n < 0) {
-   if (size) {
-   pr_warning("read failed %d: %s\n", errno,
-strerror_r(errno, sbuf, sizeof(sbuf)));
-   err = 0;
-   } else
-   err = -errno;
-
-   break;
-   }
-
-   size += n;
-   } while (n > 0);
-
-   if (!err) {
-   *sizep = size;
-   *buf   = bf;
-   } else
-   free(bf);
-
-   close

[PATCH 1/5] tools lib api: Add debug output support

2016-02-14 Thread Jiri Olsa

Adding support for warning/info/debug output
within libapi code. Adding following macros:
  pr_warning(fmt, ...)
  pr_info(fmt, ...)
  pr_debug(fmt, ...)

Also adding libapi_set_print function to set
above functions. This will be used in perf
to set standard debug handlers for libapi.

Adding 2 header files:
  debug.h
- to be used outside libapi, contains
  libapi_set_print interface

  debug-internal.h
- to be used within libapi, contains
  pr_warning/pr_info/pr_debug definitions

Link: http://lkml.kernel.org/n/tip-ul9hftog2mwuwsnyx7gm0...@git.kernel.org
Signed-off-by: Jiri Olsa 
---
 tools/lib/api/Build|  1 +
 tools/lib/api/Makefile |  1 +
 tools/lib/api/debug-internal.h | 20 
 tools/lib/api/debug.c  | 28 
 tools/lib/api/debug.h  | 10 ++
 5 files changed, 60 insertions(+)
 create mode 100644 tools/lib/api/debug-internal.h
 create mode 100644 tools/lib/api/debug.c
 create mode 100644 tools/lib/api/debug.h

diff --git a/tools/lib/api/Build b/tools/lib/api/Build
index e8b8a23b9bf4..954c644f7ad9 100644
--- a/tools/lib/api/Build
+++ b/tools/lib/api/Build
@@ -1,3 +1,4 @@
 libapi-y += fd/
 libapi-y += fs/
 libapi-y += cpu.o
+libapi-y += debug.o
diff --git a/tools/lib/api/Makefile b/tools/lib/api/Makefile
index d85904dc9b38..bbc82c614bee 100644
--- a/tools/lib/api/Makefile
+++ b/tools/lib/api/Makefile
@@ -18,6 +18,7 @@ LIBFILE = $(OUTPUT)libapi.a
 CFLAGS := $(EXTRA_WARNINGS) $(EXTRA_CFLAGS)
 CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -Werror -O6 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fPIC
 CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
+CFLAGS += -I$(srctree)/tools/lib/api
 
 RM = rm -f
 
diff --git a/tools/lib/api/debug-internal.h b/tools/lib/api/debug-internal.h
new file mode 100644
index ..188f7880eafe
--- /dev/null
+++ b/tools/lib/api/debug-internal.h
@@ -0,0 +1,20 @@
+#ifndef __API_DEBUG_INTERNAL_H__
+#define __API_DEBUG_INTERNAL_H__
+
+#include "debug.h"
+
+#define __pr(func, fmt, ...)   \
+do {   \
+   if ((func)) \
+   (func)("libapi: " fmt, ##__VA_ARGS__); \
+} while (0)
+
+extern libapi_print_fn_t __pr_warning;
+extern libapi_print_fn_t __pr_info;
+extern libapi_print_fn_t __pr_debug;
+
+#define pr_warning(fmt, ...)   __pr(__pr_warning, fmt, ##__VA_ARGS__)
+#define pr_info(fmt, ...)  __pr(__pr_info, fmt, ##__VA_ARGS__)
+#define pr_debug(fmt, ...) __pr(__pr_debug, fmt, ##__VA_ARGS__)
+
+#endif /* __API_DEBUG_INTERNAL_H__ */
diff --git a/tools/lib/api/debug.c b/tools/lib/api/debug.c
new file mode 100644
index ..5fa5cf500a1f
--- /dev/null
+++ b/tools/lib/api/debug.c
@@ -0,0 +1,28 @@
+#include 
+#include 
+#include "debug.h"
+#include "debug-internal.h"
+
+static int __base_pr(const char *format, ...)
+{
+   va_list args;
+   int err;
+
+   va_start(args, format);
+   err = vfprintf(stderr, format, args);
+   va_end(args);
+   return err;
+}
+
+libapi_print_fn_t __pr_warning = __base_pr;
+libapi_print_fn_t __pr_info= __base_pr;
+libapi_print_fn_t __pr_debug;
+
+void libapi_set_print(libapi_print_fn_t warn,
+ libapi_print_fn_t info,
+ libapi_print_fn_t debug)
+{
+   __pr_warning = warn;
+   __pr_info= info;
+   __pr_debug   = debug;
+}
diff --git a/tools/lib/api/debug.h b/tools/lib/api/debug.h
new file mode 100644
index ..a0872f68fc56
--- /dev/null
+++ b/tools/lib/api/debug.h
@@ -0,0 +1,10 @@
+#ifndef __API_DEBUG_H__
+#define __API_DEBUG_H__
+
+typedef int (*libapi_print_fn_t)(const char *, ...);
+
+void libapi_set_print(libapi_print_fn_t warn,
+ libapi_print_fn_t info,
+ libapi_print_fn_t debug);
+
+#endif /* __API_DEBUG_H__ */
-- 
2.4.3
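
For illustration only, here is a minimal userspace sketch of the handler
mechanism described in the patch above: a settable function pointer per
level, and a macro that prefixes "libapi: " and skips the call when the
pointer is NULL. Every sketch_* name is hypothetical and not part of the
patch. The macro pastes the prefix onto the format string using plain
__VA_ARGS__ rather than the ##__VA_ARGS__ form the patch uses, so the
format must be a string literal.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Same shape as the typedef in the patch's debug.h. */
typedef int (*libapi_print_fn_t)(const char *, ...);

/* Hypothetical stand-ins for the library-side handler pointers. */
static libapi_print_fn_t sketch_warning_fn;
static libapi_print_fn_t sketch_debug_fn;

/* Hypothetical two-level analogue of libapi_set_print(). */
static void sketch_set_print(libapi_print_fn_t warn, libapi_print_fn_t debug)
{
	sketch_warning_fn = warn;
	sketch_debug_fn = debug;
}

/* Call the handler only if one is installed; prefix every message. */
#define sketch_pr(func, ...)				\
do {							\
	if ((func))					\
		(func)("libapi: " __VA_ARGS__);		\
} while (0)

static char sketch_last_msg[128];
static int sketch_calls;

/* Example handler: records the formatted message instead of printing it. */
static int sketch_capture(const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(sketch_last_msg, sizeof(sketch_last_msg), fmt, ap);
	va_end(ap);
	sketch_calls++;
	return n;
}

/* Installs a warning handler, leaves debug unset, emits one of each. */
static int sketch_demo(void)
{
	sketch_calls = 0;
	sketch_set_print(sketch_capture, NULL);
	sketch_pr(sketch_warning_fn, "read failed %d\n", 5);
	sketch_pr(sketch_debug_fn, "never printed\n");
	assert(strcmp(sketch_last_msg, "libapi: read failed 5\n") == 0);
	return sketch_calls;
}
```

Calling sketch_demo() invokes the handler once: the warning goes through
the installed handler, and the debug message is dropped because no debug
handler was set.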

Re: Kernel docs: muddying the waters a bit

2016-02-14 Thread Keith Packard

Daniel Vetter  writes:

> The other one is graphs - Keith showed me some neat stuff that
> asciidoc can do, and I definitely wanted to integrate something like
> that as a follow-up into the kerneldoc toolchain. Often a diagram is a
> lot more helpful than lots of words. Can sphinx gives us that too?

.. graphviz::

   digraph foo {
"bar" -> "baz";
   }

Even better than asciidoc -- svg output is supported in both html and
pdf (when using rst2pdf). I had to hack asciidoc to add support for svg
output when using docbook.

> Wrt reformatting: I'm not going to like it, but I hope that with a bit
> of sed we can fix up any of the asciidoc comments we have already
> easily - right now we don't (yet) use much of the more sophisticated
> markup yet. So much better to change now than 1 year down the road.

I used pandoc on the docbook output from asciidoc to get a 100 page
document converted here. It wasn't perfect -- all of the internal links
were busted, and labels for tables were mis-positioned. It might be that
a few minor fixes to pandoc could be done to add 'sphinx'-specific rst
support that could fix this?

I spent (too much) time yesterday playing with sphinx and generated a
new html theme. Here's the result:

http://keithp.com/~keithp/altusmetrum-sphinx/altusmetrum.html

Here's the PDF output from rst2pdf, a python-based PDF output which
doesn't use docbook *or* latex:

http://keithp.com/~keithp/altusmetrum-sphinx/Altus%20Metrum.pdf

I need to spend some quality time building my own PDF theme; the default
provided by rst2pdf isn't great. It does, however, use fontconfig, so
switching fonts is *way* easier than with docbook...

There's currently an incompatibility between the rst2pdf and sphinx
packages in debian (and upstream) which I hacked around to generate that
output, but otherwise I'm using packaged bits.

So, another pro for sphinx appears to be native PDF generation...

-- 
-keith


Re: [PATCH v3 3/7] debugfs: add support for self-protecting attribute file fops

2016-02-14 Thread Julia Lawall

> >> diff --git a/scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci 
> >> b/scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci
> >> new file mode 100644
> >> index 000..bdc418d
> >> --- /dev/null
> >> +++ b/scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci
> >> @@ -0,0 +1,68 @@
> >> +///

Could you drop the above line?

> >> +/// Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE
> >> +/// for debugfs files.
> >> +///
> >> +/// Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file()
> >> +/// imposes some significant overhead as compared to
> >> +/// DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe().

For the above three lines that give more detail, please use //#

thanks,
julia

> >> +// Copyright (C): 2016 Nicolai Stange
> >> +// Options: --no-includes
> >> +//
> >> +
> >> +virtual context
> >> +virtual patch
> >> +virtual org
> >> +virtual report
> >> +
> >> +@dsa@
> >> +declarer name DEFINE_SIMPLE_ATTRIBUTE;
> >> +identifier dsa_fops;
> >> +expression dsa_get, dsa_set, dsa_fmt;
> >> +position p;
> >> +@@
> >> +DEFINE_SIMPLE_ATTRIBUTE@p(dsa_fops, dsa_get, dsa_set, dsa_fmt);
> >> +
> >> +@dcf@
> >> +expression name, mode, parent, data;
> >> +identifier dsa.dsa_fops;
> >> +@@
> >> +debugfs_create_file(name, mode, parent, data, &dsa_fops)
> >> +
> >> +
> >> +@context_dsa depends on context && dcf@
> >> +declarer name DEFINE_DEBUGFS_ATTRIBUTE;
> >> +identifier dsa.dsa_fops;
> >> +expression dsa.dsa_get, dsa.dsa_set, dsa.dsa_fmt;
> >> +@@
> >> +* DEFINE_SIMPLE_ATTRIBUTE(dsa_fops, dsa_get, dsa_set, dsa_fmt);
> >> +
> >> +
> >> +@patch_dcf depends on patch expression@
> >> +expression name, mode, parent, data;
> >> +identifier dsa.dsa_fops;
> >> +@@
> >> +- debugfs_create_file(name, mode, parent, data, &dsa_fops)
> >> ++ debugfs_create_file_unsafe(name, mode, parent, data, &dsa_fops)
> >> +
> >> +@patch_dsa depends on patch_dcf && patch@
> >> +identifier dsa.dsa_fops;
> >> +expression dsa.dsa_get, dsa.dsa_set, dsa.dsa_fmt;
> >> +@@
> >> +- DEFINE_SIMPLE_ATTRIBUTE(dsa_fops, dsa_get, dsa_set, dsa_fmt);
> >> ++ DEFINE_DEBUGFS_ATTRIBUTE(dsa_fops, dsa_get, dsa_set, dsa_fmt);
> >> +
> >> +
> >> +@script:python depends on org && dcf@
> >> +fops << dsa.dsa_fops;
> >> +p << dsa.p;
> >> +@@
> >> +msg="%s should be defined with DEFINE_DEBUGFS_ATTRIBUTE" % (fops)
> >> +coccilib.org.print_todo(p[0], msg)
> >> +
> >> +@script:python depends on report && dcf@
> >> +fops << dsa.dsa_fops;
> >> +p << dsa.p;
> >> +@@
> >> +msg="WARNING: %s should be defined with DEFINE_DEBUGFS_ATTRIBUTE" % (fops)
> >> +coccilib.report.print_report(p[0], msg)
> >> -- 
> >> 2.7.1
> >> 
> >> 
>

Re: [lkp] [ppdev] 138e9d20e6: kernel BUG at drivers/base/driver.c:153!

2016-02-14 Thread Sudip Mukherjee

On Sun, Feb 14, 2016 at 02:24:27PM +0800, kernel test robot wrote:
> FYI, we noticed the below changes on
> 
> https://github.com/0day-ci/linux 
> Sudip-Mukherjee/ppdev-space-prohibited-between-function-name-and-parenthesis/20160212-210833
> commit 138e9d20e69c0d682f584ee1ef0151338eef7499 ("ppdev: use new parport 
> device model")

I am not able to reproduce this as I do not have all your scripts/files.
Can you please check if the below patch solves this in your setup?


diff --git a/drivers/parport/share.c b/drivers/parport/share.c
index 3308427..176b2b6 100644
--- a/drivers/parport/share.c
+++ b/drivers/parport/share.c
@@ -273,6 +273,9 @@ int __parport_register_driver(struct parport_driver *drv, struct module *owner,
/* using device model */
int ret;
 
+   if (!parport_bus_type.p)
+   return -EAGAIN;
+
/* initialize common driver fields */
drv->driver.name = drv->name;
drv->driver.bus = &parport_bus_type;

--
regards
sudip

Re: [PATCH 2/2] tracing/rcu: don't trace rcu_callback on offline CPUs

2016-02-14 Thread Steven Rostedt

On Sat, 13 Feb 2016 21:22:53 +0300
Denis Kirjanov  wrote:

> 
> diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> index ef72c4a..5470f2f 100644
> --- a/include/trace/events/rcu.h
> +++ b/include/trace/events/rcu.h
> @@ -435,6 +435,8 @@ TRACE_EVENT(rcu_callback,
>  
>   TP_ARGS(rcuname, rhp, qlen_lazy, qlen),
>  
> + TP_CONDITION(cpu_online(raw_smp_processor_id())),
> +

Besides the fact that this isn't a TRACE_EVENT_CONDITION, isn't calling
rcu_callback() dangerous from an offline CPU?

Or is calling a callback from an offline CPU OK?

Perhaps it is OK, as it doesn't need to worry about its current CPU,
just the other CPUs.

Paul?

-- Steve

>   TP_STRUCT__entry(
>   __field(const char *, rcuname)
>   __field(void *, rhp)

Re: [PATCH 1/2] tracing/mm: don't trace kfree on offline CPUs

2016-02-14 Thread Steven Rostedt

On Sat, 13 Feb 2016 21:22:52 +0300
Denis Kirjanov  wrote:

> -DEFINE_EVENT(kmem_free, kfree,
> +DEFINE_EVENT_CONDITION(kmem_free, kfree,
>  
>   TP_PROTO(unsigned long call_site, const void *ptr),
>  
> - TP_ARGS(call_site, ptr)
> + TP_ARGS(call_site, ptr),
> +
> + /*
> +  * This trace can be potentially called from an offlined cpu.
> +  * Since trace points use RCU and RCU should not be used from
> +  * offline cpus, filter such calls out.
> +  * While this trace can be called from a preemptable section,
> +  * it has no impact on the condition since tasks can migrate
> +  * only from online cpus to other online cpus. Thus its safe
> +  * to use raw_smp_processor_id.
> +  */
> + TP_CONDITION(cpu_online(raw_smp_processor_id()))

This is starting to become a common occurrence. Perhaps it is best to
just hardcode this into the tracepoint code itself?

-- Steve

>  );
>  
>  DEFINE_EVENT_CONDITION(kmem_free, kmem_cache_free,

arm qemu test failures due to 'driver-core: platform: probe of-devices only using list of compatibles'

2016-02-14 Thread Guenter Roeck

Uwe,

Your patch 'driver-core: platform: probe of-devices only using list of
compatibles' causes the following qemu tests to crash in -next.

arm:vexpress-a9:vexpress_defconfig:vexpress-v2p-ca9
arm:vexpress-a15:vexpress_defconfig:vexpress-v2p-ca15-tc1
arm:vexpress-a9:multi_v7_defconfig:vexpress-v2p-ca9
arm:vexpress-a15:multi_v7_defconfig:vexpress-v2p-ca15-tc1

Crash log:

VFS: Cannot open root device "mmcblk0" or unknown-block(0,0): error -6
Please append a correct "root=" boot option; here are the available partitions:
1f00  131072 mtdblock0  (driver?)
1f01   32768 mtdblock1  (driver?)
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

i.e., the mmc driver no longer instantiates. Reverting the patch fixes the problem.

Bisect log is attached.

Guenter

---
# bad: [64d9a3617b3b8bc0734ba97caeb433b7019c6187] Add linux-next specific files for 20160212
# good: [388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95] Linux 4.5-rc3
git bisect start 'HEAD' 'v4.5-rc3'
# good: [597dc9d36e8bc04941b61b26ac7aa3f8a33aba53] Merge remote-tracking branch 'sound-asoc/for-next'
git bisect good 597dc9d36e8bc04941b61b26ac7aa3f8a33aba53
# bad: [91fe8ea815243ec595753ccf7e14126b6f87f2bf] Merge remote-tracking branch 'usb-chipidea-next/ci-for-usb-next'
git bisect bad 91fe8ea815243ec595753ccf7e14126b6f87f2bf
# good: [1d6796e67f265e835bcb1a19d27ba0433dbd75e4] Merge remote-tracking branch 'tip/auto-latest'
git bisect good 1d6796e67f265e835bcb1a19d27ba0433dbd75e4
# bad: [858163465b53ab87c3939cae9e6fd0ecbeb60bfa] Merge remote-tracking branch 'driver-core/driver-core-next'
git bisect bad 858163465b53ab87c3939cae9e6fd0ecbeb60bfa
# good: [5acd4c7ca23549bf4e480a92efb7d87d988be432] Merge remote-tracking branch 'kvm-arm/next'
git bisect good 5acd4c7ca23549bf4e480a92efb7d87d988be432
# good: [d28003ab55e09323bf1a026e804165c6d371ae6b] Merge remote-tracking branch 'drivers-x86/for-next'
git bisect good d28003ab55e09323bf1a026e804165c6d371ae6b
# good: [f28a8693f4b1eb8b4035167825f2bcd44bd95546] Merge remote-tracking branch 'hsi/for-next'
git bisect good f28a8693f4b1eb8b4035167825f2bcd44bd95546
# good: [d3a7387f8aae81ba0f3687518a9ad7a14bfb165d] Merge remote-tracking branch 'ipmi/for-next'
git bisect good d3a7387f8aae81ba0f3687518a9ad7a14bfb165d
# good: [75f3e8e47f381074801d0034874d20c638d9e3d9] firmware: introduce sysfs driver for QEMU's fw_cfg device
git bisect good 75f3e8e47f381074801d0034874d20c638d9e3d9
# good: [9e5b3d6f7f946a3fb4d83ac2ab6d2bfefcdafffb] drivers: dma-coherent: simplify dma_init_coherent_memory return value
git bisect good 9e5b3d6f7f946a3fb4d83ac2ab6d2bfefcdafffb
# good: [cf68d85529f7dccc24412887d46e364f4b422a5d] driver-core: platform: fix typo in documentation for multi-driver helper
git bisect good cf68d85529f7dccc24412887d46e364f4b422a5d
# bad: [67d02a1bbb334558e9380409a3cd426b36d4578b] driver-core: platform: probe of-devices only using list of compatibles
git bisect bad 67d02a1bbb334558e9380409a3cd426b36d4578b
# first bad commit: [67d02a1bbb334558e9380409a3cd426b36d4578b] driver-core: platform: probe of-devices only using list of compatibles

Re: [PATCH] [media] zl10353: use div_u64 instead of do_div

2016-02-14 Thread Nicolas Pitre

On Sun, 14 Feb 2016, Ard Biesheuvel wrote:

> On 13 February 2016 at 22:57, Nicolas Pitre  wrote:
> > On Sat, 13 Feb 2016, Ard Biesheuvel wrote:
> >
> >> On 12 February 2016 at 22:01, Arnd Bergmann  wrote:
> >> > However, I did stumble over an older patch I did now, which I could
> >> > not remember what it was good for. It does fix the problem, and
> >> > it seems to be a better solution.
> >> >
> >> > Arnd
> >> >
> >> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> >> > index b5acbb404854..b5ff9881bef8 100644
> >> > --- a/include/linux/compiler.h
> >> > +++ b/include/linux/compiler.h
> >> > @@ -148,7 +148,7 @@ void ftrace_likely_update(struct ftrace_branch_data 
> >> > *f, int val, int expect);
> >> >   */
> >> >  #define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
> >> >  #define __trace_if(cond) \
> >> > -   if (__builtin_constant_p((cond)) ? !!(cond) :   \
> >> > +   if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
> >> > ({  \
> >> > int __r;\
> >> > static struct ftrace_branch_data\
> >> >
> >>
> >> I remember seeing this patch, but I don't remember the exact context.
> >> But when you think about it, !!cond can be a build time constant even
> >> if cond is not, as long as you can prove statically that cond != 0. So
> >
> > You're right.  I just tested it and to my surprise gcc is smart enough
> > to figure that case out.
> >
> >> I think this change is obviously correct, and an improvement since it
> >> will remove the profiling overhead of branches that are not true
> >> branches in the first place.
> >
> > Indeed.
> >
> 
> ... and perhaps we should not evaluate cond twice either?

It is not. The value of the argument to __builtin_constant_p() is not 
itself evaluated and therefore does not produce side effects.


Nicolas
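
A small userspace sketch of the point above, assuming a GCC-compatible
compiler. The sketch_* names are hypothetical, and sketch_trace_if() is a
much simplified stand-in for the kernel's __trace_if(), without the branch
profiling. The operand of __builtin_constant_p() is not evaluated, and the
conditional operator evaluates only one of its two arms, so cond runs
exactly once.

```c
#include <assert.h>

static int sketch_evaluations;

/* A condition with a visible side effect: it counts its own calls. */
static int sketch_cond(void)
{
	sketch_evaluations++;
	return 1;
}

/*
 * Simplified stand-in for __trace_if(): cond appears three times in the
 * expansion, but the first use is inside __builtin_constant_p() and only
 * one arm of the ?: is ever executed.
 */
#define sketch_trace_if(cond) \
	if (__builtin_constant_p(!!(cond)) ? !!(cond) : !!(cond))

/* Returns how many times the condition ran for one sketch_trace_if(). */
static int sketch_count_evaluations(void)
{
	int taken = 0;

	sketch_evaluations = 0;
	sketch_trace_if(sketch_cond())
		taken = 1;
	assert(taken);
	return sketch_evaluations;
}

/* The increment inside __builtin_constant_p() never happens. */
static int sketch_operand_untouched(void)
{
	int x = 0;

	(void)__builtin_constant_p(x++);
	return x;
}
```

Here sketch_count_evaluations() reports a single evaluation and
sketch_operand_untouched() returns the initial value of x.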

Re: [PATCH RFC] Introduce atomic and per-cpu add-max and sub-min operations

2016-02-14 Thread Tejun Heo

Hello, Konstantin.

On Sun, Feb 14, 2016 at 12:09:00PM +0300, Konstantin Khlebnikov wrote:
> bool atomic_add_max(atomic_t *var, int add, int max);
> bool atomic_sub_min(atomic_t *var, int sub, int min);
> 
> bool this_cpu_add_max(var, add, max);
> bool this_cpu_sub_min(var, sub, min);
> 
> They add/subtract only if result will be not bigger than max/lower that min.
> Returns true if operation was done and false otherwise.

If I'm reading the code right, all the above functions do is wrap
the corresponding cmpxchg implementations.  Given that most use cases
would build further abstractions on top, I'm not sure how useful
providing another layer of abstraction is.  For the most part, we
introduce new per-cpu operations to take advantage of capabilities of
underlying hardware which can't be utilized in a different way (like
the x86 128bit atomic ops).

Thanks.

-- 
tejun
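
To make the "wrapper around cmpxchg" point concrete, here is a minimal
userspace sketch using C11 atomics rather than the kernel's atomic_t API.
The sketch_* names are hypothetical and this is not the code from the RFC;
it assumes non-negative add/sub values and ignores signed overflow.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Add only if the result stays <= max; report whether the add happened. */
static bool sketch_atomic_add_max(atomic_int *var, int add, int max)
{
	int old = atomic_load(var);

	assert(add >= 0);
	do {
		if (old + add > max)
			return false;
		/* On failure, old is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(var, &old, old + add));

	return true;
}

/* Subtract only if the result stays >= min; report whether it happened. */
static bool sketch_atomic_sub_min(atomic_int *var, int sub, int min)
{
	int old = atomic_load(var);

	assert(sub >= 0);
	do {
		if (old - sub < min)
			return false;
	} while (!atomic_compare_exchange_weak(var, &old, old - sub));

	return true;
}
```

With a counter at 5 and a limit of 10, adding 3 succeeds and a second add
of 3 is refused, leaving the counter at 8.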

Re: [PATCH V2] AHCI: Workaround for ThunderX Errata#22536

2016-02-14 Thread Tejun Heo

Hello,

On Fri, Feb 12, 2016 at 03:20:30PM -0800, tchalama...@caviumnetworks.com wrote:
> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
> index 546a369..76e310e 100644
> --- a/drivers/ata/ahci.c
> +++ b/drivers/ata/ahci.c
> @@ -1560,6 +1560,9 @@ static int ahci_init_one(struct pci_dev *pdev, const 
> struct pci_device_id *ent)
>   if (ahci_broken_devslp(pdev))
>   hpriv->flags |= AHCI_HFLAG_NO_DEVSLP;
>  
> + if (pdev->vendor == 0x177d && pdev->device == 0xa01c)
> + ahci_thunderx_init(&pdev->dev, hpriv);

So, this would make ahci fail to build if thunderx is not configured.
Maybe we should add an additional callback to update hpriv.

>   /* save initial config */
>   ahci_pci_save_initial_config(pdev, hpriv);
>  
> diff --git a/drivers/ata/ahci_thunderx.c b/drivers/ata/ahci_thunderx.c

This driver isn't upstream yet, right?  Maybe just fold this change
together?

Thanks.

-- 
tejun

Re: Regard of thermal power allocator's coefficients

2016-02-14 Thread Eduardo Valentin

Hello Leo,

On Sun, Feb 14, 2016 at 07:00:41PM +0800, Leo Yan wrote:
> Hi there,
> 
> I'm trying to upstreaming IPA patches for 96board Hikey, but so far
> there have no standard DT binding for passing IPA coefficients for
> power modeling.

Thanks for your effort.

> 
> So want to firstly to confirm if should we pass coefficients by using
> device tree? Is someone working on related work for this?

We pass the sustainable power, but not the coefficients. IIRC, the IPA
coefficients were considered (SW) implementation details. Sustainable power,
on the other hand, still describes hardware capabilities.

So, I don't think they will go via DT.

> 
> Here has another more straightforward method is to directly to
> include power model's coefficients in thermal sensor driver (such like
> drivers/thermal/hisi_thermal.c), but my concern is this method will
> include SoC specific data in the common thermal sensor driver,
> so is this doable?

Yeah, unless you are sure that these coefficients are SoC dependent, and
not board dependent, for instance, (or even, use case dependent), I
would prefer you do not leave them configured in the driver.

Keep in mind that power allocator will compute defaults.

Userspace software may also tune these parameters, per thermal
zone.

> 
> Welcome any suggestion.
> 
> Thanks,
> Leo Yan

BR,

Eduardo Valentin

[PATCH] drivers/hwtracing: make coresight-* explicitly non-modular

2016-02-14 Thread Paul Gortmaker

None of the Kconfig currently controlling compilation of any of
the files here are tristate, meaning that none of it currently
is being built as a module by anyone.

We need not be concerned about .remove functions and blocking the
unbind sysfs operations, since that was already done in a recent
commit.

Let's remove any remaining modular references, so that when reading the
drivers there is no doubt they are builtin-only.

All drivers get mostly the same changes, so they are handled in batch.
Changes are (1) convert to builtin_amba_driver, (2) delete module.h
include where unused, and (3) relocate the description into the
comments so we don't need MODULE_DESCRIPTION and associated tags.

The etm3x and etm4x use module_param_named, and have been adjusted
to just include moduleparam.h for that purpose.

In commit f309d4443130bf814e991f836e919dca22df37ae ("platform_device:
better support builtin boilerplate avoidance") we introduced the
builtin_driver macro.

Here we use that support and extend it to amba driver registration,
so where a driver is clearly non-modular and builtin-only, we can
update with the simple mapping of

 module_amba_driver(...)  ---> builtin_amba_driver(...)

Since module_amba_driver() uses the same init level priority as
builtin_amba_driver() the init ordering remains unchanged with
this commit.

Cc: Mathieu Poirier 
Cc: linux-arm-ker...@lists.infradead.org
Signed-off-by: Paul Gortmaker 
---

[This is what I sent back in October[1], but with the .remove and
 .suppress bits stripped out, since that is now in Mathieu's queue.
 And to make this a standalone commit that Mathieu can also add to
 his queue, I've squished the amba/bus.h change[2] into this change.]

[1]  http://www.spinics.net/lists/kernel/msg2096702.html
[2]  http://lkml.iu.edu/hypermail/linux/kernel/1510.1/02464.html

 drivers/hwtracing/coresight/coresight-etb10.c   |  9 +++--
 drivers/hwtracing/coresight/coresight-etm3x.c   | 14 --
 drivers/hwtracing/coresight/coresight-etm4x.c   |  4 +---
 drivers/hwtracing/coresight/coresight-funnel.c  |  9 +++--
 drivers/hwtracing/coresight/coresight-replicator-qcom.c |  4 +---
 drivers/hwtracing/coresight/coresight-replicator.c  |  7 ++-
 drivers/hwtracing/coresight/coresight-tmc.c |  9 +++--
 drivers/hwtracing/coresight/coresight-tpiu.c|  9 +++--
 drivers/hwtracing/coresight/coresight.c |  3 ---
 drivers/hwtracing/coresight/of_coresight.c  |  1 -
 include/linux/amba/bus.h|  9 +
 11 files changed, 33 insertions(+), 45 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c 
b/drivers/hwtracing/coresight/coresight-etb10.c
index 92969dae739d..21f6afd24a66 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -1,5 +1,7 @@
 /* Copyright (c) 2011-2012, The Linux Foundation. All rights reserved.
  *
+ * Description: CoreSight Embedded Trace Buffer driver
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 and
  * only version 2 as published by the Free Software Foundation.
@@ -11,7 +13,6 @@
  */
 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -534,8 +535,4 @@ static struct amba_driver etb_driver = {
.probe  = etb_probe,
.id_table   = etb_ids,
 };
-
-module_amba_driver(etb_driver);
-
-MODULE_LICENSE("GPL v2");
-MODULE_DESCRIPTION("CoreSight Embedded Trace Buffer driver");
+builtin_amba_driver(etb_driver);
diff --git a/drivers/hwtracing/coresight/coresight-etm3x.c 
b/drivers/hwtracing/coresight/coresight-etm3x.c
index aae80e14508d..8750b432ad36 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x.c
@@ -1,5 +1,7 @@
 /* Copyright (c) 2011-2012, The Linux Foundation. All rights reserved.
  *
+ * Description: CoreSight Program Flow Trace driver
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 and
  * only version 2 as published by the Free Software Foundation.
@@ -11,7 +13,7 @@
  */
 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -35,6 +37,10 @@
 
 #include "coresight-etm.h"
 
+/*
+ * not really modular but using module_param is the easiest way to
+ * remain consistent with existing use cases for now.
+ */
 static int boot_enable;
 module_param_named(boot_enable, boot_enable, int, S_IRUGO);
 
@@ -1950,8 +1956,4 @@ static struct amba_driver etm_driver = {
.probe  = etm_probe,
.id_table   = etm_ids,
 };
-
-module_amba_driver(etm_driver);
-
-MODULE_LICENSE("GPL v2");
-MODULE_DESCRIPTION("CoreSight Program Flow Trace driver");
+builtin_amba_driver(etm_driver);
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c 
b/drivers/hwtracing

Re: [PATCH 1/2] tracing/mm: don't trace kfree on offline CPUs

2016-02-14 Thread Denis Kirjanov

On 2/14/16, Steven Rostedt  wrote:
> On Sat, 13 Feb 2016 21:22:52 +0300
> Denis Kirjanov  wrote:
>
>> -DEFINE_EVENT(kmem_free, kfree,
>> +DEFINE_EVENT_CONDITION(kmem_free, kfree,
>>
>>  TP_PROTO(unsigned long call_site, const void *ptr),
>>
>> -TP_ARGS(call_site, ptr)
>> +TP_ARGS(call_site, ptr),
>> +
>> +/*
>> + * This trace can be potentially called from an offlined cpu.
>> + * Since trace points use RCU and RCU should not be used from
>> + * offline cpus, filter such calls out.
>> + * While this trace can be called from a preemptable section,
>> + * it has no impact on the condition since tasks can migrate
>> + * only from online cpus to other online cpus. Thus its safe
>> + * to use raw_smp_processor_id.
>> + */
>> +TP_CONDITION(cpu_online(raw_smp_processor_id()))
>
> This is starting to become a common occurrence. Perhaps it is best to
> just hardcode this into the tracepoint code itself?

Yeah, I was thinking along the same lines, so we can make it generic.

>
> -- Steve
>
>>  );
>>
>>  DEFINE_EVENT_CONDITION(kmem_free, kmem_cache_free,
>
>

Re: [PATCH 2/2] tracing/rcu: don't trace rcu_callback on offline CPUs

2016-02-14 Thread Denis Kirjanov

On 2/14/16, Steven Rostedt  wrote:
> On Sat, 13 Feb 2016 21:22:53 +0300
> Denis Kirjanov  wrote:
>
>>
>> diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
>> index ef72c4a..5470f2f 100644
>> --- a/include/trace/events/rcu.h
>> +++ b/include/trace/events/rcu.h
>> @@ -435,6 +435,8 @@ TRACE_EVENT(rcu_callback,
>>
>>  TP_ARGS(rcuname, rhp, qlen_lazy, qlen),
>>
>> +TP_CONDITION(cpu_online(raw_smp_processor_id())),
>> +
>
> Besides the fact that this isn't a TRACE_EVENT_CONDITION, Isn't calling
> rcu_callback() dangerous from an offline CPU?

That was the wrong patch, I've sent the v2.

>
> Or is calling a callback from an offline CPU OK?
>
> Perhaps it is OK, as it doesn't need to worry about its current CPU,
> just the other CPUs.
>
> Paul?
>
> -- Steve
>
>
>>  TP_STRUCT__entry(
>>  __field(const char *, rcuname)
>>  __field(void *, rhp)
>
>

Re: [PATCH RFC] Introduce atomic and per-cpu add-max and sub-min operations

2016-02-14 Thread Konstantin Khlebnikov

On Sun, Feb 14, 2016 at 7:51 PM, Tejun Heo  wrote:
> Hello, Konstantin.
>
> On Sun, Feb 14, 2016 at 12:09:00PM +0300, Konstantin Khlebnikov wrote:
>> bool atomic_add_max(atomic_t *var, int add, int max);
>> bool atomic_sub_min(atomic_t *var, int sub, int min);
>>
>> bool this_cpu_add_max(var, add, max);
>> bool this_cpu_sub_min(var, sub, min);
>>
>> They add/subtract only if result will be not bigger than max/lower that min.
>> Returns true if operation was done and false otherwise.
>
> If I'm reading the code right, all the above functions do is wrapping
> the corresponding cmpxchg implementations.  Given that most use cases
> would build further abstractions on top, I'm not sure how useful
> providing another layer of abstraction is.  For the most part, we
> introduce new per-cpu operations to take advantage of capabilities of
> underlying hardware which can't be utilized in a different way (like
> the x86 128bit atomic ops).

Yep, they are just an abstraction around cmpxchg, as are half of the atomic
operations. Some architectures could probably implement this differently.

This is a basic building block with a clear interface which performs just one
operation, without managing memory or the logic behind it. Users often already
have per-cpu memory structures, so they don't need high-level abstractions,
which would waste memory on unneeded pointers. I think this new
abstraction could replace a lot of open-coded hacks in a common way.

Re: [lkp] [gpio] 3c702e9987: kmsg.user_verbs:couldn't_register_device_number

2016-02-14 Thread Linus Walleij

Greg, heads-up on this... you'd know if this happened
before.

On Sun, Feb 14, 2016 at 9:06 AM, Michael Welling  wrote:
> On Sun, Feb 14, 2016 at 02:59:06PM +0800, kernel test robot wrote:
>> FYI, we noticed the below changes on
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git chardev
>> commit 3c702e9987e261042a07e43460a8148be254412e ("gpio: add a userspace 
>> chardev ABI for GPIOs")
>>
>>
>> [1.951191] user_verbs: couldn't register device number
>
> Looks like user_verbs is using a static device node setup.
>
> enum {
> IB_UVERBS_MAJOR   = 231,
> IB_UVERBS_BASE_MINOR  = 192,
> IB_UVERBS_MAX_DEVICES = 32
> };
>
> #define IB_UVERBS_BASE_DEV  MKDEV(IB_UVERBS_MAJOR, IB_UVERBS_BASE_MINOR)

That's annoying...
I notice that infiniband is using register_chrdev_region() at
module_init() time, counting on device major 231 to be free.

> Something tells me that a new GPIO chardev is taking this spot.

Yes. Please post the contents of /proc/devices on this system.

If you look at __register_chrdev_region() in fs/char_dev.c,
you can see that dynamic character major numbers are
assigned from 254 downwards in this way:

#define CHRDEV_MAJOR_HASH_SIZE   255
(...)
} *chrdevs[CHRDEV_MAJOR_HASH_SIZE];

        /* temporary */
        if (major == 0) {
                for (i = ARRAY_SIZE(chrdevs)-1; i > 0; i--) {
                        if (chrdevs[i] == NULL)
                                break;
                }

                if (i == 0) {
                        ret = -EBUSY;
                        goto out;
                }
                major = i;
        }

Whereas fixed device numbers are assigned sparsely
from low to high.

I suspect what happens is that in your system there are
already so many dynamically assigned character devices that
they have come far enough down to collide with 232 and 233; you
just didn't notice until this change made them hit 231, which
happened to be in use.

So I would be very interested in what misc stuff you have filling
out 232 through 255, already knocking out other assigned
numbers...

I guess I *could* try to grab a static assignment in the low
range, say recycle character device 8, which is the first
unallocated from the bottom, but I'm afraid the device
core maintainers have worked to get devices to go more
dynamic and would be very unhappy about this.

Yours,
Linus Walleij

Re: [lkp] [gpio] 3c702e9987: kmsg.user_verbs:couldn't_register_device_number

2016-02-14 Thread Linus Walleij

On Sun, Feb 14, 2016 at 9:06 AM, Michael Welling  wrote:

> enum {
> IB_UVERBS_MAJOR   = 231,
> IB_UVERBS_BASE_MINOR  = 192,
> IB_UVERBS_MAX_DEVICES = 32
> };
>
> #define IB_UVERBS_BASE_DEV  MKDEV(IB_UVERBS_MAJOR, IB_UVERBS_BASE_MINOR)
>
> Something tells me that a new GPIO chardev is taking this spot.

I don't think so: since gpio reserves its number in a core_initcall(), it
gets a high chardev major. It's likely that the device that used to
be device 232 (stealing the fingerprint sensor slot) got
pushed down and is now stealing device 231 from infiniband.

Maybe I should make a patch making fs/char_dev.c emit a warning
or fail when it goes below 234... it doesn't currently. It just steals
more numbers. It's a serious architectural issue if it persists.

Yours,
Linus Walleij

Re: [lkp] [gpio] 3c702e9987: kmsg.user_verbs:couldn't_register_device_number

2016-02-14 Thread Greg KH

On Sun, Feb 14, 2016 at 06:42:11PM +0100, Linus Walleij wrote:
> Greg, heads-up on this... you'd know if this happened
> before.
> 
> On Sun, Feb 14, 2016 at 9:06 AM, Michael Welling  wrote:
> > On Sun, Feb 14, 2016 at 02:59:06PM +0800, kernel test robot wrote:
> >> FYI, we noticed the below changes on
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git 
> >> chardev
> >> commit 3c702e9987e261042a07e43460a8148be254412e ("gpio: add a userspace 
> >> chardev ABI for GPIOs")
> >>
> >>
> >> [1.951191] user_verbs: couldn't register device number
> >
> > Looks like user_verbs is using a static device node setup.
> >
> > enum {
> > IB_UVERBS_MAJOR   = 231,
> > IB_UVERBS_BASE_MINOR  = 192,
> > IB_UVERBS_MAX_DEVICES = 32
> > };
> >
> > #define IB_UVERBS_BASE_DEV  MKDEV(IB_UVERBS_MAJOR, IB_UVERBS_BASE_MINOR)
> 
> That's annoying...
> I notice that infiniband is using register_chrdev_region() at
> module_init() time, counting on device major 231 to be free.

That device major is assigned to Infiniband, why shouldn't it be doing
this?

> 
> > Something tells me that a new GPIO chardev is taking this spot.
> 
> Yes. Please post the contents of /proc/devices on this system.
> 
> If you look in fs/char_dev.c this happens in
> __register_chrdev_region() you can see that dynamic
> character major numbers are assigned from 254 and
> downwards in this way:
> 
> #define CHRDEV_MAJOR_HASH_SIZE   255
> (...)
> } *chrdevs[CHRDEV_MAJOR_HASH_SIZE];
> 
> /* temporary */
> if (major == 0) {
> for (i = ARRAY_SIZE(chrdevs)-1; i > 0; i--) {
> if (chrdevs[i] == NULL)
> break;
> }
> 
> if (i == 0) {
> ret = -EBUSY;
> goto out;
> }
> major = i;
> }
> 
> Whereas fixed device numbers are assigned sparsely
> from low to high.
> 
> I suspect what happens is that in your system there are
> already so many dynamically assigned character devices that
> they go down and already collide with 232 and 233, you just
> didn't notice until this make it hit 231 which incidentally
> was in use.
> 
> So I would be very intersted in what misc stuff you have filling
> out 232 thru 255, already knocking out other assigned
> numbers...
> 
> I guess I *could* try to grab a static assignment in the low
> range, say recycle character device 8, which is the first
> unallocated from the bottom, but I'm afraid the device
> core maintainers have worked to get devices to go more
> dynamic and would be very unhappy about this.

Why not just ask for a new reserved one?  We could give you 261 and
everything should be fine, right?

thanks,

greg k-h

Re: [lkp] [gpio] 3c702e9987: kmsg.user_verbs:couldn't_register_device_number

2016-02-14 Thread Linus Walleij

On Sun, Feb 14, 2016 at 6:49 PM, Greg KH  wrote:
> On Sun, Feb 14, 2016 at 06:42:11PM +0100, Linus Walleij wrote:
>> Greg, heads-up on this... you'd know if this happened
>> before.
>>
>> On Sun, Feb 14, 2016 at 9:06 AM, Michael Welling  wrote:
>> > On Sun, Feb 14, 2016 at 02:59:06PM +0800, kernel test robot wrote:
>> >> FYI, we noticed the below changes on
>> >>
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git 
>> >> chardev
>> >> commit 3c702e9987e261042a07e43460a8148be254412e ("gpio: add a userspace 
>> >> chardev ABI for GPIOs")
>> >>
>> >>
>> >> [1.951191] user_verbs: couldn't register device number
>> >
>> > Looks like user_verbs is using a static device node setup.
>> >
>> > enum {
>> > IB_UVERBS_MAJOR   = 231,
>> > IB_UVERBS_BASE_MINOR  = 192,
>> > IB_UVERBS_MAX_DEVICES = 32
>> > };
>> >
>> > #define IB_UVERBS_BASE_DEV  MKDEV(IB_UVERBS_MAJOR, 
>> > IB_UVERBS_BASE_MINOR)
>>
>> That's annoying...
>> I notice that infiniband is using register_chrdev_region() at
>> module_init() time, counting on device major 231 to be free.
>
> That device major is assigned to Infiniband, why shouldn't it be doing
> this?

I mean it's annoying that they collide. (Because of the details I
write below, it's fine that it's using the assigned number.)

> Why not just ask for a new reserved one?  We could give you 261 and
> everything should be fine, right?

Sure I can post a patch for that, but it just mitigates the problem.

The report points to the serious problem that on this system
some dynamic allocations have already stolen major device
numbers 232 thru 255, and 232 and 233 are also assigned.

What do you think about a patch that makes fs/char_dev.c
emit a warning when it starts assigning dynamic numbers
233 and below?
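
A rough userspace sketch of what such a check could look like. This is
not a patch against fs/char_dev.c: the table, the function name and
the LOWEST_DYNAMIC_MAJOR constant are hypothetical and only
illustrate the idea of warning once the downward scan leaves the
234..254 range.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAJOR_TABLE_SIZE     255
#define LOWEST_DYNAMIC_MAJOR 234 /* hypothetical floor for dynamic majors */

/* Hypothetical stand-in for chrdevs[]: true means the major is taken. */
static bool in_use[MAJOR_TABLE_SIZE];

/* Scan from 254 downwards for a free major. Sets *warned and prints a
 * message when the result falls below the dynamic floor. Returns 0 when
 * no major is free (assumption of this sketch). */
int alloc_major_with_warning(bool *warned)
{
	*warned = false;
	for (int i = MAJOR_TABLE_SIZE - 1; i > 0; i--) {
		if (in_use[i])
			continue;
		in_use[i] = true;
		if (i < LOWEST_DYNAMIC_MAJOR) {
			fprintf(stderr,
				"warning: dynamic major %d overlaps the static range\n",
				i);
			*warned = true;
		}
		return i;
	}
	return 0;
}
```

The first 21 dynamic requests (254 down to 234) stay silent; the 22nd
returns 233 and trips the warning.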

Yours,
Linus Walleij

tty: deadlock between tty_buffer_flush/n_tracesink_open

2016-02-14 Thread Dmitry Vyukov

Hello,

I've finally got the tty deadlock report with lockdep stack collection
bug fixed. This is on commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95
(4.5-rc3).

==
[ INFO: possible circular locking dependency detected ]
4.5.0-rc3+ #326 Not tainted
---
syz-executor/337 is trying to acquire lock:
 (&port->buf.lock/1){+.+...}, at: []
tty_buffer_flush+0xbf/0x3c0 drivers/tty/tty_buffer.c:244

but task is already holding lock:
 (writelock){+.+...}, at: []
n_tracesink_open+0x23/0xf0 drivers/tty/n_tracesink.c:78

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (writelock){+.+...}:
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_nested+0xb1/0xa50
kernel/locking/mutex.c:618
   [] n_tracesink_datadrain+0x24/0xc0
drivers/tty/n_tracesink.c:173
   [] n_tracerouter_receivebuf+0x2b/0x40
drivers/tty/n_tracerouter.c:176
   [< inline >] receive_buf drivers/tty/tty_buffer.c:454
   [] flush_to_ldisc+0x584/0x7f0
drivers/tty/tty_buffer.c:517
   [] process_one_work+0x796/0x1440
kernel/workqueue.c:2037
   [] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
   [] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
   [] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #3 (routelock){+.+...}:
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_nested+0xb1/0xa50
kernel/locking/mutex.c:618
   [] n_tracerouter_receivebuf+0x20/0x40
drivers/tty/n_tracerouter.c:175
   [< inline >] receive_buf drivers/tty/tty_buffer.c:454
   [] flush_to_ldisc+0x584/0x7f0
drivers/tty/tty_buffer.c:517
   [] process_one_work+0x796/0x1440
kernel/workqueue.c:2037
   [] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
   [] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
   [] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #2 (&buf->lock){+.+...}:
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_nested+0xb1/0xa50
kernel/locking/mutex.c:618
   [] tty_buffer_flush+0xbf/0x3c0
drivers/tty/tty_buffer.c:244
   [] pty_flush_buffer+0x5c/0x180 drivers/tty/pty.c:225
   [] tty_driver_flush_buffer+0x65/0x80
drivers/tty/tty_ioctl.c:94
   [] isig+0x172/0x2c0 drivers/tty/n_tty.c:1148
   [] n_tty_receive_signal_char+0x22/0xf0
drivers/tty/n_tty.c:1249
   [] n_tty_receive_char_special+0x128d/0x2b30
drivers/tty/n_tty.c:1298
   [< inline >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1618
   [< inline >] __receive_buf drivers/tty/n_tty.c:1652
   [] n_tty_receive_buf_common+0x19a3/0x2400
drivers/tty/n_tty.c:1750
   [] n_tty_receive_buf2+0x33/0x40
drivers/tty/n_tty.c:1785
   [< inline >] receive_buf drivers/tty/tty_buffer.c:450
   [] flush_to_ldisc+0x3bf/0x7f0
drivers/tty/tty_buffer.c:517
   [] process_one_work+0x796/0x1440
kernel/workqueue.c:2037
   [] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
   [] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
   [] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #1 (&o_tty->termios_rwsem/1){..}:
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [] down_read+0x47/0x60 kernel/locking/rwsem.c:22
   [] n_tty_receive_buf_common+0x8d/0x2400
drivers/tty/n_tty.c:1713
   [] n_tty_receive_buf2+0x33/0x40
drivers/tty/n_tty.c:1785
   [< inline >] receive_buf drivers/tty/tty_buffer.c:450
   [] flush_to_ldisc+0x3bf/0x7f0
drivers/tty/tty_buffer.c:517
   [] process_one_work+0x796/0x1440
kernel/workqueue.c:2037
   [] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
   [] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
   [] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #0 (&port->buf.lock/1){+.+...}:
   [< inline >] check_prev_add kernel/locking/lockdep.c:1853
   [< inline >] check_prevs_add kernel/locking/lockdep.c:1963
   [< inline >] validate_chain kernel/locking/lockdep.c:2148
   [] __lock_acquire+0x31e9/0x4700
kernel/locking/lockdep.c:3210
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_nested+0xb1/0xa50
kernel/locking/mutex.c:618
   [] tty_buffer_flush+0xbf/0x3c0
drivers/tty/tty_buffer.c:244
   [] pty_flush_buffer+0x5c/0x180 drivers/tty/pty.c:225
   [] tty_driver_flush_buffer+0x65/0x80
drivers/tty/tty_ioctl.c:94
   [] n_tracesink_open+0x95/0xf0
drivers/tty/n_tracesink.c:85
   [] tty_ldisc_open.isra.2+0x78/0xd0
drivers

Re: [PATCH] Revert "net: phy: turn carrier off on phy attach"

2016-02-14 Thread Clemens Gruber

On Fri, Feb 12, 2016 at 10:56:04AM -0800, Florian Fainelli wrote:
> On 12/02/16 10:01, Clemens Gruber wrote:
> > Commit 113c74d83eef ("net: phy: turn carrier off on phy attach") breaks
> > the eth0 link coming up on all my i.MX6Q boards with a Marvell 88E1510.
> > If I then do a ifconfig eth0 down/up cycle I first get a MDIO read
> > timeout but then the link becomes ready and everything is back to
> > normal.
> > Without this step however, the link stays down forever, an unusually
> > high amount of phy interrupts occur (about 1/second) and kworker/0:2
> > is constantly using over 60% of the CPU.
> > 
> > Reverting it fixes the problems with the link not coming up at boot as
> > well as the high amount of phy interrupts and kworker load in that
> > state.
> 
> You are seeing this with the FEC driver right? We probably want to
> carefully audit the driver and understand what could be going wrong, the
> initial change is correct, so there must be something else going on here.

I think I found the underlying problem!
It was not the fec driver but the marvell phy driver, more specifically
the marvell_of_reg_init call being made too late, which led to the
observed problem on half the boots, where the link never came up.

With the marvell,reg-init device tree parameter, a flag needs to be set
to tell the Marvell 88E1510 that it should enable the interrupt output.
(At a specific pin, in my case LED[2])
If this is not set (or set too late), the phydev->state is set to UP in
phy_start (called from fec_enet_open) but then, the auto-negotiation
never starts.
In comparison, now, after I called marvell_of_reg_init not in
m88e1510_config_aneg but in marvell_probe, everything works again :)
About a second after the fec_enet_open/phy_start calls, the
auto-negotiation starts (m88e1510_config_aneg) and the phy state changes
from UP to AN, to CHANGELINK and finally to RUNNING.

I will send a patch shortly, calling marvell_of_reg_init from a new
m88e1510_probe function instead of the m88e1510_config_aneg function.

Thanks.
Clemens

Re: tty: deadlock between tty_buffer_flush/n_tracesink_open

2016-02-14 Thread Dmitry Vyukov

On Sun, Feb 14, 2016 at 7:20 PM, Dmitry Vyukov  wrote:
> Hello,
>
> I've finally got the tty deadlock report with lockdep stack collection
> bug fixed. This is on commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95
> (4.5-rc3).


The following report is probably the same issue detected at different
stack. But please double-check that it is indeed the same issue.


[ INFO: possible circular locking dependency detected ]
4.5.0-rc3+ #326 Not tainted
---
kworker/u9:1/26 is trying to acquire lock:
 (&buf->lock){+.+...}, at: []
tty_buffer_flush+0xbf/0x3c0 drivers/tty/tty_buffer.c:244

but task is already holding lock:
 (&o_tty->termios_rwsem/1){..}, at: []
isig+0x9b/0x2c0 drivers/tty/n_tty.c:1137

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (&o_tty->termios_rwsem/1){..}:
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [] down_write+0x46/0xa0 kernel/locking/rwsem.c:49
   [] n_tty_flush_buffer+0x20/0x100
drivers/tty/n_tty.c:374
   [] tty_buffer_flush+0x249/0x3c0
drivers/tty/tty_buffer.c:255
   [] pty_flush_buffer+0x5c/0x180 drivers/tty/pty.c:225
   [] tty_driver_flush_buffer+0x65/0x80
drivers/tty/tty_ioctl.c:94
   [] hci_uart_tty_open+0x2b1/0x370
drivers/bluetooth/hci_ldisc.c:466
   [] tty_ldisc_open.isra.2+0x78/0xd0
drivers/tty/tty_ldisc.c:454
   [] tty_set_ldisc+0x292/0x8a0
drivers/tty/tty_ldisc.c:561
   [< inline >] tiocsetd drivers/tty/tty_io.c:2655
   [] tty_ioctl+0xb2e/0x2160 drivers/tty/tty_io.c:2910
   [< inline >] vfs_ioctl fs/ioctl.c:43
   [] do_vfs_ioctl+0x18c/0xfb0 fs/ioctl.c:674
   [< inline >] SYSC_ioctl fs/ioctl.c:689
   [] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:680
   [] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

-> #3 (&port->buf.lock/1){+.+...}:
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_nested+0xb1/0xa50
kernel/locking/mutex.c:618
   [] tty_buffer_flush+0xbf/0x3c0
drivers/tty/tty_buffer.c:244
   [] pty_flush_buffer+0x5c/0x180 drivers/tty/pty.c:225
   [] tty_driver_flush_buffer+0x65/0x80
drivers/tty/tty_ioctl.c:94
   [] n_tracesink_open+0x95/0xf0
drivers/tty/n_tracesink.c:85
   [] tty_ldisc_open.isra.2+0x78/0xd0
drivers/tty/tty_ldisc.c:454
   [] tty_set_ldisc+0x292/0x8a0
drivers/tty/tty_ldisc.c:561
   [< inline >] tiocsetd drivers/tty/tty_io.c:2655
   [] tty_ioctl+0xb2e/0x2160 drivers/tty/tty_io.c:2910
   [< inline >] vfs_ioctl fs/ioctl.c:43
   [] do_vfs_ioctl+0x18c/0xfb0 fs/ioctl.c:674
   [< inline >] SYSC_ioctl fs/ioctl.c:689
   [] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:680
   [] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

-> #2 (writelock){+.+...}:
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_nested+0xb1/0xa50
kernel/locking/mutex.c:618
   [] n_tracesink_datadrain+0x24/0xc0
drivers/tty/n_tracesink.c:173
   [] n_tracerouter_receivebuf+0x2b/0x40
drivers/tty/n_tracerouter.c:176
   [< inline >] receive_buf drivers/tty/tty_buffer.c:454
   [] flush_to_ldisc+0x584/0x7f0
drivers/tty/tty_buffer.c:517
   [] process_one_work+0x796/0x1440
kernel/workqueue.c:2037
   [] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
   [] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
   [] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #1 (routelock){+.+...}:
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_nested+0xb1/0xa50
kernel/locking/mutex.c:618
   [] n_tracerouter_receivebuf+0x20/0x40
drivers/tty/n_tracerouter.c:175
   [< inline >] receive_buf drivers/tty/tty_buffer.c:454
   [] flush_to_ldisc+0x584/0x7f0
drivers/tty/tty_buffer.c:517
   [] process_one_work+0x796/0x1440
kernel/workqueue.c:2037
   [] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
   [] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
   [] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #0 (&buf->lock){+.+...}:
   [< inline >] check_prev_add kernel/locking/lockdep.c:1853
   [< inline >] check_prevs_add kernel/locking/lockdep.c:1963
   [< inline >] validate_chain kernel/locking/lockdep.c:2148
   [] __lock_acquire+0x31e9/0x4700
kernel/locking/lockdep.c:3210
   [] lock_acquire+0x1dc/0x430
kernel/locking/lockdep.c:3589
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_nested+0xb1/0xa50
kernel/locking/mutex.c:618
   [] tty_buffer_flush+0xbf/0x3c0
drivers/tty/tty_buffer.c:244
   [] pty_flush_bu

Re: [PATCH] Revert "net: phy: turn carrier off on phy attach"

2016-02-14 Thread Florian Fainelli

On February 14, 2016 10:25:30 AM PST, Clemens Gruber wrote:
>On Fri, Feb 12, 2016 at 10:56:04AM -0800, Florian Fainelli wrote:
>> On 12/02/16 10:01, Clemens Gruber wrote:
>> > Commit 113c74d83eef ("net: phy: turn carrier off on phy attach") breaks
>> > the eth0 link coming up on all my i.MX6Q boards with a Marvell 88E1510.
>> > If I then do a ifconfig eth0 down/up cycle I first get a MDIO read
>> > timeout but then the link becomes ready and everything is back to
>> > normal.
>> > Without this step however, the link stays down forever, an unusually
>> > high amount of phy interrupts occur (about 1/second) and kworker/0:2
>> > is constantly using over 60% of the CPU.
>> > 
>> > Reverting it fixes the problems with the link not coming up at boot as
>> > well as the high amount of phy interrupts and kworker load in that
>> > state.
>> 
>> You are seeing this with the FEC driver right? We probably want to
>> carefully audit the driver and understand what could be going wrong, the
>> initial change is correct, so there must be something else going on here.
>
>I think I found the underlying problem!
>It was not the fec driver but the marvell phy driver, more specifically
>the marvell_of_reg_init call being made too late, which lead to the
>observed problem at half the boot ups where the link never came up.
>
>With the marvell,reg-init device tree parameter, a flag needs to be set
>to tell the Marvell 88E1510 that it should enable the interrupt output.
>(At a specific pin, in my case LED[2])
>If this is not set (or set too late), the phydev->state is set to UP in
>phy_start (called from fec_enet_open) but then, the auto-negotiation
>never starts.
>In comparison, now, after I called marvell_of_reg_init not in
>m88e1510_config_aneg but in marvell_probe, everything works again :)
>About a second after the fec_enet_open/phy_start calls, the
>auto-negotiation starts (m88e1510_config_aneg) and the phy state changes
>from UP to AN, to CHANGELINK and finally to RUNNING.
>
>I will send a patch shortly, calling marvell_of_reg_init from a new
>m88e1510_probe function instead of the m88e1510_config_aneg function.

config_init is more appropriate here, since this callback will be called
even if there is a software reset (e.g. from phy_init_hw). config_aneg is
definitely too late. Thanks for finding this!

-- 
Florian

[GIT PULL] USB driver fixes for 4.5-rc4

2016-02-14 Thread Greg KH

The following changes since commit 36f90b0a2ddd60823fe193a85e60ff1906c2a9b3:

  Linux 4.5-rc2 (2016-01-31 18:12:16 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/ tags/usb-4.5-rc4

for you to fetch changes up to 6b44d1e9bf3b850b433694d654709b4cbc9bc00e:

  Merge tag 'phy-for-4.5-rc' of 
git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy into usb-linus 
(2016-02-11 20:10:58 -0800)


USB and PHY fixes for 4.5-rc4

Here are a number of USB and PHY driver fixes for 4.5-rc4.

They are the usual gadget and xhci drivers that had reported problems,
as well as a few small phy issues as well.  All have been in linux-next
with no reported issues.

Signed-off-by: Greg Kroah-Hartman 


Chunfeng Yun (2):
  usb: xhci-mtk: fix bpkts value of LS/HS periodic eps not behind TT
  usb: xhci-mtk: fix AHB bus hang up caused by roothubs polling

Felipe Balbi (1):
  MAINTAINERS: fix my email address

Geert Uytterhoeven (1):
  phy: Restrict phy-hi6220-usb to HiSilicon arm64

Greg Kroah-Hartman (2):
  Merge tag 'fixes-for-v4.5-rc3' of git://git.kernel.org/.../balbi/usb into 
usb-linus
  Merge tag 'phy-for-4.5-rc' of git://git.kernel.org/.../kishon/linux-phy 
into usb-linus

Gregory CLEMENT (1):
  usb: host: xhci-plat: fix NULL pointer in probe for device tree case

Jianqiang Tang (1):
  usb: dwc3: gadget: set the OTG flag in dwc3 gadget driver.

Joe Lawrence (1):
  xhci: harden xhci_find_next_ext_cap against device removal

John Youn (2):
  Revert "usb: dwc2: Move reset into dwc2_get_hwparams()"
  usb: dwc2: Fix probe problem on bcm2835

Li Jun (1):
  usb: phy: mxs: declare variable with initialized value

Lu Baolu (4):
  usb: xhci: handle both SSIC ports in PME stuck quirk
  usb: xhci: add a quirk bit for ssic port unused
  usb: xhci: set SSIC port unused only if xhci_suspend succeeds
  usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms

Mathias Nyman (2):
  Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"
  xhci: Fix list corruption in urb dequeue at host removal

Shawn Lin (1):
  phy: core: fix wrong err handle for phy_power_on

Srinivas Kandagatla (1):
  usb: phy: msm: fix error handling in probe.

Tony Lindgren (2):
  phy: twl4030-usb: Relase usb phy on unload
  phy: twl4030-usb: Fix unbalanced pm_runtime_enable on module reload

Ulf Hansson (1):
  usb: musb: ux500: Fix NULL pointer dereference at system PM

 MAINTAINERS  | 10 +++
 drivers/phy/Kconfig  |  1 +
 drivers/phy/phy-core.c   | 16 +++-
 drivers/phy/phy-twl4030-usb.c| 14 ++
 drivers/usb/dwc2/core.c  | 14 --
 drivers/usb/dwc2/platform.c  |  8 +-
 drivers/usb/dwc3/gadget.c|  1 +
 drivers/usb/host/xhci-ext-caps.h |  4 +++
 drivers/usb/host/xhci-mtk-sch.c  | 16 +---
 drivers/usb/host/xhci-mtk.c  | 23 +
 drivers/usb/host/xhci-pci.c  | 56 ++--
 drivers/usb/host/xhci-plat.c |  3 ++-
 drivers/usb/host/xhci-ring.c | 10 ---
 drivers/usb/host/xhci.c  |  4 ++-
 drivers/usb/host/xhci.h  |  1 +
 drivers/usb/musb/ux500.c |  7 +++--
 drivers/usb/phy/phy-msm-usb.c| 37 --
 drivers/usb/phy/phy-mxs-usb.c|  2 +-
 18 files changed, 150 insertions(+), 77 deletions(-)

[GIT PULL] TTY/Serial fixes for 4.5-rc4

2016-02-14 Thread Greg KH

The following changes since commit 36f90b0a2ddd60823fe193a85e60ff1906c2a9b3:

  Linux 4.5-rc2 (2016-01-31 18:12:16 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git/ tags/tty-4.5-rc4

for you to fetch changes up to c8053b58762745d93930826b60a4073854a15ce5:

  Revert "8250: uniphier: allow modular build with 8250 console" (2016-02-07 
18:22:54 -0800)


tty/serial fixes for 4.5-rc4

Here are a number of small tty and serial driver fixes for 4.5-rc4 that
resolve some reported issues.

One of them got reverted as it wasn't correct based on testing, and all
have been in linux-next for a while.

Signed-off-by: Greg Kroah-Hartman 


Arnd Bergmann (2):
  8250: uniphier: allow modular build with 8250 console
  serial/omap: mark wait_for_xmitr as __maybe_unused

Greg Kroah-Hartman (1):
  Revert "8250: uniphier: allow modular build with 8250 console"

Herton R. Krzesinski (2):
  pty: fix possible use after free of tty->driver_data
  pty: make sure super_block is still valid in final /dev/tty close

Jeremy McNicoll (1):
  tty: Add support for PCIe WCH382 2S multi-IO card

Peter Hurley (2):
  tty: Drop krefs for interrupted tty lock
  serial: omap: Prevent DoS using unprivileged ioctl(TIOCSRS485)

 drivers/tty/pty.c  | 21 -
 drivers/tty/serial/8250/8250_pci.c | 21 +
 drivers/tty/serial/omap-serial.c   | 10 +++---
 drivers/tty/tty_io.c   |  3 +--
 drivers/tty/tty_mutex.c|  7 ++-
 fs/devpts/inode.c  | 20 
 include/linux/devpts_fs.h  |  4 
 7 files changed, 79 insertions(+), 7 deletions(-)

[GIT PULL] Driver core fix for 4.5-rc4

2016-02-14 Thread Greg KH

The following changes since commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95:

  Linux 4.5-rc3 (2016-02-07 15:38:30 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/ 
tags/driver-core-4.5-rc4

for you to fetch changes up to 00cd29b799e3449f0c68b1cc77cd4a5f95b42d17:

  klist: fix starting point removed bug in klist iterators (2016-02-07 22:18:47 
-0800)


driver core fix for 4.5-rc4

Here is one driver core, well klist, fix for 4.5-rc4.  It fixes a
problem found in the scsi device list traversal that probably also could
be triggered by other subsystems.

The fix has been in linux-next for a while with no reported problems.

Signed-off-by: Greg Kroah-Hartman 


James Bottomley (1):
  klist: fix starting point removed bug in klist iterators

 lib/klist.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

[GIT PULL] char/misc driver fixes for 4.5-rc4

2016-02-14 Thread Greg KH

The following changes since commit 92e963f50fc74041b5e9e744c330dca48e04f08d:

  Linux 4.5-rc1 (2016-01-24 13:06:47 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/ 
tags/char-misc-4.5-rc4

for you to fetch changes up to 3b2b9ead32142b4cf55ea2793e5e4f7b63c04818:

  nvmem: qfprom: Specify LE device endianness (2016-02-07 23:09:13 -0800)


char/misc driver fixes for 4.5-rc4

Here are 3 fixes for some reported issues.  Two nvmem driver fixes, and
one mei fix.  All have been in linux-next just fine.

Signed-off-by: Greg Kroah-Hartman 


Alexander Usyskin (1):
  mei: validate request value in client notify request ioctl

Srinivas Kandagatla (1):
  nvmem: core: return error for non word aligned access

Stephen Boyd (1):
  nvmem: qfprom: Specify LE device endianness

 drivers/misc/mei/main.c | 6 +-
 drivers/nvmem/core.c| 6 ++
 drivers/nvmem/qfprom.c  | 1 +
 3 files changed, 12 insertions(+), 1 deletion(-)

Re: [lkp] [gpio] 3c702e9987: kmsg.user_verbs:couldn't_register_device_number

2016-02-14 Thread Greg KH

On Sun, Feb 14, 2016 at 06:56:20PM +0100, Linus Walleij wrote:
> On Sun, Feb 14, 2016 at 6:49 PM, Greg KH  wrote:
> > On Sun, Feb 14, 2016 at 06:42:11PM +0100, Linus Walleij wrote:
> >> Greg, heads-up on this... you'd know if this happened
> >> before.
> >>
> >> On Sun, Feb 14, 2016 at 9:06 AM, Michael Welling  wrote:
> >> > On Sun, Feb 14, 2016 at 02:59:06PM +0800, kernel test robot wrote:
> >> >> FYI, we noticed the below changes on
> >> >>
> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git 
> >> >> chardev
> >> >> commit 3c702e9987e261042a07e43460a8148be254412e ("gpio: add a userspace 
> >> >> chardev ABI for GPIOs")
> >> >>
> >> >>
> >> >> [1.951191] user_verbs: couldn't register device number
> >> >
> >> > Looks like user_verbs is using a static device node setup.
> >> >
> >> > enum {
> >> > IB_UVERBS_MAJOR   = 231,
> >> > IB_UVERBS_BASE_MINOR  = 192,
> >> > IB_UVERBS_MAX_DEVICES = 32
> >> > };
> >> >
> >> > #define IB_UVERBS_BASE_DEV  MKDEV(IB_UVERBS_MAJOR, 
> >> > IB_UVERBS_BASE_MINOR)
> >>
> >> That's annoying...
> >> I notice that infiniband is using register_chrdev_region() at
> >> module_init() time, counting on device major 231 to be free.
> >
> > That device major is assigned to Infiniband, why shouldn't it be doing
> > this?
> 
> I mean it's annoying that they collide. (Because of the details I
> write below, it's fine it's using the assigned number.
> 
> > Why not just ask for a new reserved one?  We could give you 261 and
> > everything should be fine, right?
> 
> Sure I can post a patch for that, but it just mitigates the problem.
> 
> The report point to the serious problem that on this system
> some dynamic allocations have already stolen major device
> numbers 232 thru 255, and 232 and 233 are also assigned.
> 
> What do you think about a patch that makes fs/char_dev.c
> emit a warning when it starts assigning dynamic numbers
> 233 and below?

That's fine with me.  I also think maybe we should look into just
switching all char major/minor allocation to be dynamic, starting at the
bottom and moving up.  I think the only tool that might have an issue
with that is the raw device controller, but maybe that has been fixed up
in userspace; I haven't looked at that in many years.

I thought I had an old patch around somewhere that did that, will go
look for it this week and see what breaks with it enabled...

thanks,

greg k-h

Re: [PATCH] [media] zl10353: use div_u64 instead of do_div

2016-02-14 Thread Ard Biesheuvel

On 14 February 2016 at 17:52, Nicolas Pitre  wrote:
> On Sun, 14 Feb 2016, Ard Biesheuvel wrote:
>
>> On 13 February 2016 at 22:57, Nicolas Pitre  wrote:
>> > On Sat, 13 Feb 2016, Ard Biesheuvel wrote:
>> >
>> >> On 12 February 2016 at 22:01, Arnd Bergmann  wrote:
>> >> > However, I did stumble over an older patch I did now, which I could
>> >> > not remember what it was good for. It does fix the problem, and
>> >> > it seems to be a better solution.
>> >> >
>> >> > Arnd
>> >> >
>> >> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
>> >> > index b5acbb404854..b5ff9881bef8 100644
>> >> > --- a/include/linux/compiler.h
>> >> > +++ b/include/linux/compiler.h
>> >> > @@ -148,7 +148,7 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
>> >> >   */
>> >> >  #define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
>> >> >  #define __trace_if(cond) \
>> >> > -   if (__builtin_constant_p((cond)) ? !!(cond) :   \
>> >> > +   if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
>> >> > ({  \
>> >> > int __r;    \
>> >> > static struct ftrace_branch_data    \
>> >> >
>> >>
>> >> I remember seeing this patch, but I don't remember the exact context.
>> >> But when you think about it, !!cond can be a build time constant even
>> >> if cond is not, as long as you can prove statically that cond != 0. So
>> >
>> > You're right.  I just tested it and to my surprise gcc is smart enough
>> > to figure that case out.
>> >
>> >> I think this change is obviously correct, and an improvement since it
>> >> will remove the profiling overhead of branches that are not true
>> >> branches in the first place.
>> >
>> > Indeed.
>> >
>>
>> ... and perhaps we should not evaluate cond twice either?
>
> It is not. The value of the argument to __builtin_constant_p() is not
> itself evaluated and therefore does not produce side effects.
>

Interesting, thanks for clarifying.
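
For anyone who wants to check this locally, a small sketch of the
behaviour Nicolas describes. The helper names are made up for the
example, and it assumes a compiler that provides
__builtin_constant_p() (GCC or Clang).

```c
#include <stdio.h>

/* Counts how many times bump() has actually run. */
static int calls;

/* A function with a visible side effect. */
static int bump(void)
{
	calls++;
	return 1;
}

/* Pass a side-effecting expression to __builtin_constant_p() and report
 * how often that expression was really evaluated. */
int side_effect_calls_seen(void)
{
	calls = 0;
	int is_const = __builtin_constant_p(bump());
	(void)is_const; /* the result itself is not the point here */
	return calls;
}
```

side_effect_calls_seen() returns 0: bump() is never called, so writing
cond once inside __builtin_constant_p() and once in the selected
branch of the macro does not evaluate it twice.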

[PATCH] MIPS: Use CPHYSADDR to implement mips32 __pa

2016-02-14 Thread Paul Burton

Use CPHYSADDR to implement the __pa macro converting from a virtual to a
physical address for MIPS32, much as is already done for MIPS64 (though
without the complication of having both compatibility & XKPHYS
segments).

This allows for __pa to work regardless of whether the address being
translated is in kseg0 or kseg1, unlike the previous subtraction based
approach which only worked for addresses in kseg0. Working for kseg1
addresses is important if __pa is used on addresses allocated by
dma_alloc_coherent, where on systems with non-coherent I/O we provide
addresses in kseg1. If this address is then used with
dma_map_single_attrs then it is provided to virt_to_page, which in turn
calls virt_to_phys which is a wrapper around __pa. The result is that we
end up with a physical address 0x20000000 bytes (ie. the size of kseg0)
too high.

In addition to providing consistency with MIPS64 & fixing the kseg1 case
above this has the added bonus of generating smaller code for systems
implementing MIPS32r2 & beyond, where a single ext instruction can
extract the physical address rather than needing to load an immediate
into a temp register & subtract it. This results in ~1.3KB savings for a
boston_defconfig kernel adjusted to set CONFIG_32BIT=y.

Signed-off-by: Paul Burton 
---

 arch/mips/include/asm/page.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/page.h b/arch/mips/include/asm/page.h
index 21ed715..35c1222 100644
--- a/arch/mips/include/asm/page.h
+++ b/arch/mips/include/asm/page.h
@@ -169,8 +169,7 @@ typedef struct { unsigned long pgprot; } pgprot_t;
 __x < CKSEG0 ? XPHYSADDR(__x) : CPHYSADDR(__x);\
 })
 #else
-#define __pa(x)							\
-    ((unsigned long)(x) - PAGE_OFFSET + PHYS_OFFSET)
+#define __pa(x)		CPHYSADDR(x)
 #endif
 #define __va(x)		((void *)((unsigned long)(x) + PAGE_OFFSET - PHYS_OFFSET))
 #include 
-- 
2.7.1
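
The kseg0/kseg1 difference described in the commit message can be shown with
plain integer arithmetic. This is a userspace sketch, not kernel code: it
assumes the usual MIPS32 segment bases (kseg0 at 0x80000000, kseg1 at
0xa0000000), PAGE_OFFSET equal to the kseg0 base and PHYS_OFFSET equal to 0.
The function names pa_by_subtraction() and pa_by_mask() are hypothetical.

```c
#include <stdint.h>

#define KSEG0_BASE	UINT32_C(0x80000000)	/* cached, unmapped segment */
#define KSEG1_BASE	UINT32_C(0xa0000000)	/* uncached, unmapped segment */

/* Previous approach: subtract the kseg0 base (PHYS_OFFSET assumed to be 0). */
static uint32_t pa_by_subtraction(uint32_t vaddr)
{
	return vaddr - KSEG0_BASE;
}

/* Mask-based approach in the style of CPHYSADDR: keep the low 29 bits. */
static uint32_t pa_by_mask(uint32_t vaddr)
{
	return vaddr & UINT32_C(0x1fffffff);
}
```

Under these assumptions both functions agree for a kseg0 address such as
0x80001000, while for the kseg1 alias 0xa0001000 the subtraction yields
0x20001000 and the mask yields 0x00001000.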

Re: [PATCH] Documentation: Chinese translation of arm64/silicon-errata.txt

2016-02-14 Thread Weiwei Jia

2016-02-14 3:40 GMT+08:00  :
> From: Fu Wei 
>
> This is a Chinese translated version of Documentation/arm64/silicon-errata.txt
>
> Signed-off-by: Fu Wei 

Reviewed-by: Weiwei Jia 

> ---
>  Documentation/zh_CN/arm64/silicon-errata.txt | 74 
> 
>  1 file changed, 74 insertions(+)
>  create mode 100644 Documentation/zh_CN/arm64/silicon-errata.txt
>
> diff --git a/Documentation/zh_CN/arm64/silicon-errata.txt b/Documentation/zh_CN/arm64/silicon-errata.txt
> new file mode 100644
> index 000..0584bd6
> --- /dev/null
> +++ b/Documentation/zh_CN/arm64/silicon-errata.txt
> @@ -0,0 +1,74 @@
> +Chinese translated version of Documentation/arm64/silicon-errata.txt
> +
> +If you have any comment or update to the content, please contact the
> +original document maintainer directly.  However, if you have a problem
> +communicating in English you can also ask the Chinese maintainer for
> +help.  Contact the Chinese maintainer if this translation is outdated
> +or if there is a problem with the translation.
> +
> +M: Will Deacon 
> +zh_CN: Fu Wei 
> +C: e835a65f7ab143acf9aee6f9a98ef1c7afd2a835
> +-
> +Documentation/arm64/silicon-errata.txt 的中文翻译
> +
> +如果想评论或更新本文的内容，请直接联系原文档的维护者。如果你使用英文
> +交流有困难的话，也可以向中文版维护者求助。如果本翻译更新不及时或者翻
> +译存在问题，请联系中文版维护者。
> +
> +英文版维护者： Will Deacon 
> +中文版维护者： 傅炜  Fu Wei 
> +中文版翻译者： 傅炜  Fu Wei 
> +中文版校译者： 傅炜  Fu Wei 
> +本文翻译提交时的 Git 检出点为： e835a65f7ab143acf9aee6f9a98ef1c7afd2a835
> +
> +以下为正文
> +-
> +芯片勘误和软件解决办法
> +==
> +
> +作者: Will Deacon 
> +日期: 2015年11月27日
> +
> +一个不幸的现实：硬件经常带有一些所谓的“错误（errata）”，致使其在
> +某些特定的情况下会违背构架定义的行为。对基于 ARM 的硬件，这些错误
> +大体被分为以下几类：
> +
> +  A 类：无可行解决方法的严重错误。
> +  B 类：有可接受的解决方法的重大或严重错误。
> +  C 类：在正常操作中不会显现的小错误。
> +
> +更多资讯，请在 infocenter.arm.com （需注册）中查阅“软件开发者勘误
> +笔记”（“Software Developers Errata Notice”）文档。
> +
> +对于 Linux 而言，B 类错误可能需要操作系统的某些特别处理。例如，避免
> +一个特殊的代码序列，或是以一种特定的方式配置处理器。在某种不太常见的
> +情况下，为将 A 类错误当作 C 类处理，可能需要用类似手段。这些手段被
> +统称为“软件解决办法”，且仅在少数情况需要（例如，那些需要一个在非安全
> +异常级运行的解决方法 *且* 能被 Linux 触发的情况）。

I think it may be better like this.

（例如，那些需要在一个非安全异常级别运行的解决方法 *并且* 能被 Linux 触发的情况）

> +
> +对于尚在讨论中的可能对未受错误影响的系统产生不利影响的软件解决办法，
> +有一个相应的内核配置（Kconfig）选项被加在 “内核特性（Kernel Features）”
> +- > “基于可选方案框架的 ARM 错误解决办法（ARM errata workarounds via
> +the alternatives framework）"。这些选项被默认开启，若探测到受影响的CPU，
> +补丁将在运行时被打入。对于对系统运行影响较小的解决办法，内核配置选项
> +并不存在，且代码以一种避开错误的方式被构造（带注释为宜）。
> +
> +这种做法对于在任意内核源代码树中准确地判断出哪个错误已被软件方法所解决
> +稍微有点麻烦，所以这个文件在 Linux 内核中作为软件解决办法的注册表，
> +并将在新的软件解决办法被提交和反向移植到稳定内核时被更新。
> +
> +| 实现者         | 受影响的组件    | 勘误编号        | 内核配置                |
> ++----------------+-----------------+-----------------+-------------------------+
> +| ARM            | Cortex-A53      | #826319         | ARM64_ERRATUM_826319    |
> +| ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319    |
> +| ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069    |
> +| ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472    |
> +| ARM            | Cortex-A53      | #845719         | ARM64_ERRATUM_845719    |
> +| ARM            | Cortex-A53      | #843419         | ARM64_ERRATUM_843419    |
> +| ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
> +| ARM            | Cortex-A57      | #852523         | N/A                     |
> +| ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220    |
> +|                |                 |                 |                         |
> +| Cavium         | ThunderX ITS    | #22375, #24313  | CAVIUM_ERRATUM_22375    |
> +| Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154    |
> --
> 2.5.0
>

Re: [lkp] [gpio] 3c702e9987: kmsg.user_verbs:couldn't_register_device_number

2016-02-14 Thread Michael Welling

On Sun, Feb 14, 2016 at 11:05:15AM -0800, Greg KH wrote:
> On Sun, Feb 14, 2016 at 06:56:20PM +0100, Linus Walleij wrote:
> > On Sun, Feb 14, 2016 at 6:49 PM, Greg KH  wrote:
> > > On Sun, Feb 14, 2016 at 06:42:11PM +0100, Linus Walleij wrote:
> > >> Greg, heads-up on this... you'd know if this happened
> > >> before.
> > >>
> > >> On Sun, Feb 14, 2016 at 9:06 AM, Michael Welling wrote:
> > >> > On Sun, Feb 14, 2016 at 02:59:06PM +0800, kernel test robot wrote:
> > >> >> FYI, we noticed the below changes on
> > >> >>
> > >> >> https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git chardev
> > >> >> commit 3c702e9987e261042a07e43460a8148be254412e ("gpio: add a userspace chardev ABI for GPIOs")
> > >> >>
> > >> >>
> > >> >> [1.951191] user_verbs: couldn't register device number
> > >> >
> > >> > Looks like user_verbs is using a static device node setup.
> > >> >
> > >> > enum {
> > >> > IB_UVERBS_MAJOR   = 231,
> > >> > IB_UVERBS_BASE_MINOR  = 192,
> > >> > IB_UVERBS_MAX_DEVICES = 32
> > >> > };
> > >> >
> > >> > #define IB_UVERBS_BASE_DEV  MKDEV(IB_UVERBS_MAJOR, IB_UVERBS_BASE_MINOR)
> > >>
> > >> That's annoying...
> > >> I notice that infiniband is using register_chrdev_region() at
> > >> module_init() time, counting on device major 231 to be free.
> > >
> > > That device major is assigned to Infiniband, why shouldn't it be doing
> > > this?
> > 
> > > I mean it's annoying that they collide. (Because of the details I
> > > write below, it's fine it's using the assigned number.)
> > 
> > > Why not just ask for a new reserved one?  We could give you 261 and
> > > everything should be fine, right?
> > 
> > Sure I can post a patch for that, but it just mitigates the problem.
> > 
> > > The report points to the serious problem that on this system
> > some dynamic allocations have already stolen major device
> > numbers 232 thru 255, and 232 and 233 are also assigned.
> > 
> > What do you think about a patch that makes fs/char_dev.c
> > emit a warning when it starts assigning dynamic numbers
> > 233 and below?
> 
> That's fine with me.  I also think maybe we should look into just
> switching all char major/minor allocation to be dynamic, starting at the
> bottom and moving up.  I think the only tools that might have an issue
> with that is the raw device controller, but maybe that has been fixed up
> in userspace, I haven't looked at that in many years.
>

Is there any reason for the CHRDEV_MAJOR_HASH_SIZE being 255?
If we increase the size to say 511 will it break userspace?

In the future I see a robot building a kernel with more than 255 devices and
having to deal with this kind of collision again.

The handling of large major assignment baffles me.
The major numbers outside of the size of the table are just wrapping around to
the beginning again. This is inherently going to cause collisions.

static inline int major_to_index(unsigned major)
{
	return major % CHRDEV_MAJOR_HASH_SIZE;
}
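
The wrap-around can be tried outside the kernel. This is only a sketch:
major_to_index() is copied from the snippet above with the table size of 255
mentioned in this mail, while majors_share_bucket() is a hypothetical helper
added for illustration. Whether two majors landing in the same slot is an
actual problem depends on how fs/char_dev.c walks the table, which is not
shown here.

```c
#define CHRDEV_MAJOR_HASH_SIZE 255

static inline int major_to_index(unsigned major)
{
	return major % CHRDEV_MAJOR_HASH_SIZE;
}

/* Non-zero when two distinct majors map to the same table slot. */
static int majors_share_bucket(unsigned a, unsigned b)
{
	return a != b && major_to_index(a) == major_to_index(b);
}
```

With this table size, a major of 261 (the number suggested earlier in the
thread) maps to slot 6, the same slot as major 6, and major 255 maps to slot 0.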

> I thought I had an old patch around somewhere that did that, will go
> look for it this week and see what breaks with it enabled...
> 
> thanks,
> 
> greg k-h

Re: arm qemu test failures due to 'driver-core: platform: probe of-devices only using list of compatibles'

2016-02-14 Thread Uwe Kleine-König

[adding lakml and rmk to Cc]

Hello Guenter,

On Sun, Feb 14, 2016 at 08:50:10AM -0800, Guenter Roeck wrote:
> Uwe,
> 
> Your patch 'driver-core: platform: probe of-devices only using list of
> compatibles' causes the following qemu tests to crash in -next.
> 
> arm:vexpress-a9:vexpress_defconfig:vexpress-v2p-ca9
> arm:vexpress-a15:vexpress_defconfig:vexpress-v2p-ca15-tc1
> arm:vexpress-a9:multi_v7_defconfig:vexpress-v2p-ca9
> arm:vexpress-a15:multi_v7_defconfig:vexpress-v2p-ca15-tc1
> 
> Crash log:
> 
> VFS: Cannot open root device "mmcblk0" or unknown-block(0,0): error -6
> Please append a correct "root=" boot option; here are the available partitions:
> 1f00  131072 mtdblock0  (driver?)
> 1f01   32768 mtdblock1  (driver?)
> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
> 
> ie the mmc driver no longer instantiates. Reverting the patch fixes the problem.

The driver is drivers/mmc/host/mmci.c, right? and the relevant device
tree snippet is:

mmci@05000 {
	compatible = "arm,pl180", "arm,primecell";
	...
};

? So the unexpected abnormality here is that even though this device is
instantiated by dt, the driver doesn't provide any compatibles.
Either my expectation is wrong, then 67d02a1bbb33455 should be reverted
(or handle this case in a different way), or the mmci driver should
declare compatibles (but then it needs to be a platform driver and not
an amba driver?).

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |

[PATCH] drivers/platform: make x86/intel_scu_ipc.c explicitly non-modular

2016-02-14 Thread Paul Gortmaker

The Kconfig currently controlling compilation of this code is:

drivers/platform/x86/Kconfig:config INTEL_SCU_IPC
drivers/platform/x86/Kconfig:   bool "Intel SCU IPC Support"

...meaning that it currently is not being built as a module by anyone.

Let's remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

We explicitly disallow a driver unbind, since that doesn't have a
sensible use case anyway, and it allows us to drop the ".remove"
code for non-modular drivers.

Since module_pci_driver() uses the same init level priority as
builtin_pci_driver() the init ordering remains unchanged with
this commit.

Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.

We don't replace module.h with init.h since the file already has that.

We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.

Cc: Darren Hart 
Cc: platform-driver-...@vger.kernel.org
Signed-off-by: Paul Gortmaker 
---
 drivers/platform/x86/intel_scu_ipc.c | 35 ---
 1 file changed, 4 insertions(+), 31 deletions(-)

diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
index f94b730540e2..e81daff65f62 100644
--- a/drivers/platform/x86/intel_scu_ipc.c
+++ b/drivers/platform/x86/intel_scu_ipc.c
@@ -24,7 +24,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -611,28 +610,6 @@ static int ipc_probe(struct pci_dev *pdev, const struct pci_device_id *id)
return 0;
 }
 
-/**
- * ipc_remove  -   remove a bound IPC device
- * @pdev: PCI device
- *
- * In practice the SCU is not removable but this function is also
- * called for each device on a module unload or cleanup which is the
- * path that will get used.
- *
- * Free up the mappings and release the PCI resources
- */
-static void ipc_remove(struct pci_dev *pdev)
-{
-   struct intel_scu_ipc_dev *scu = pci_get_drvdata(pdev);
-
-   mutex_lock(&ipclock);
-   scu->dev = NULL;
-   mutex_unlock(&ipclock);
-
-   iounmap(scu->i2c_base);
-   intel_scu_devices_destroy();
-}
-
 static const struct pci_device_id pci_ids[] = {
{
PCI_VDEVICE(INTEL, PCI_DEVICE_ID_LINCROFT),
@@ -650,17 +627,13 @@ static const struct pci_device_id pci_ids[] = {
0,
}
 };
-MODULE_DEVICE_TABLE(pci, pci_ids);
 
 static struct pci_driver ipc_driver = {
+   .driver = {
+   .suppress_bind_attrs = true,
+   },
.name = "intel_scu_ipc",
.id_table = pci_ids,
.probe = ipc_probe,
-   .remove = ipc_remove,
 };
-
-module_pci_driver(ipc_driver);
-
-MODULE_AUTHOR("Sreedhara DS ");
-MODULE_DESCRIPTION("Intel SCU IPC driver");
-MODULE_LICENSE("GPL");
+builtin_pci_driver(ipc_driver);
-- 
2.6.1

Re: [PATCH 2/2] iio: ina2xx: Remove trace_printk debug statments

2016-02-14 Thread Andrew F. Davis


On 02/13/2016 07:21 AM, Jonathan Cameron wrote:

On 12/02/16 18:34, Andrew F. Davis wrote:

These are generally for development use only, remove these
from performance-critical code, convert to dev_dbg elsewhere.

Signed-off-by: Andrew F. Davis 

Hm... Tracepoints are also somewhat considered to be ABI and
hence it is possible some tooling relies on them.  Also they
are very nearly free when not enabled.

The fundamental reason they are here is to allow checking of whether
the thread is ticking along fast enough to keep up with the incoming data.
Can see this being useful on live platforms to debug sampling issues.



This looks more like development testing statements to see if the delay timer
adjustment algorithm is working well. If one is debugging sampling issues,
then they are always free to add any extra debugging lines they need.


What do others think about this change?

Andrew, what is driving your wish to change this?



v4.5-rc3 include/linux/kernel.h +609:

"This is intended as a debugging tool for the developer only.
Please refrain from leaving trace_printks scattered around in
your code."


Jonathan

---
  drivers/iio/adc/ina2xx-adc.c | 21 +++--
  1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/drivers/iio/adc/ina2xx-adc.c b/drivers/iio/adc/ina2xx-adc.c
index 61e8ae9..ba11b2e 100644
--- a/drivers/iio/adc/ina2xx-adc.c
+++ b/drivers/iio/adc/ina2xx-adc.c
@@ -440,7 +440,6 @@ static int ina2xx_work_buffer(struct iio_dev *indio_dev)
struct ina2xx_chip_info *chip = iio_priv(indio_dev);
unsigned short data[8];
int bit, ret, i = 0;
-   unsigned long buffer_us, elapsed_us;
s64 time_a, time_b;
unsigned int alert;

@@ -464,8 +463,6 @@ static int ina2xx_work_buffer(struct iio_dev *indio_dev)
return ret;

alert &= INA266_CVRF;
-   trace_printk("Conversion ready: %d\n", !!alert);
-
} while (!alert);

/*
@@ -490,14 +487,9 @@ static int ina2xx_work_buffer(struct iio_dev *indio_dev)
iio_push_to_buffers_with_timestamp(indio_dev,
   (unsigned int *)data, time_a);

-   buffer_us = (unsigned long)(time_b - time_a) / 1000;
-   elapsed_us = (unsigned long)(time_a - chip->prev_ns) / 1000;
-
-   trace_printk("uS: elapsed: %lu, buf: %lu\n", elapsed_us, buffer_us);
-
chip->prev_ns = time_a;

-   return buffer_us;
+   return (unsigned long)(time_b - time_a) / 1000;
  };

  static int ina2xx_capture_thread(void *data)
@@ -532,12 +524,13 @@ static int ina2xx_buffer_enable(struct iio_dev *indio_dev)
struct ina2xx_chip_info *chip = iio_priv(indio_dev);
unsigned int sampling_us = SAMPLING_PERIOD(chip);

-   trace_printk("Enabling buffer w/ scan_mask %02x, freq = %d, avg =%u\n",
-(unsigned int)(*indio_dev->active_scan_mask),
-100/sampling_us, chip->avg);
+   dev_dbg(&indio_dev->dev, "Enabling buffer w/ scan_mask %02x, freq = %d, avg =%u\n",
+   (unsigned int)(*indio_dev->active_scan_mask),
+   100 / sampling_us, chip->avg);

-   trace_printk("Expected work period: %u us\n", sampling_us);
-   trace_printk("Async readout mode: %d\n", chip->allow_async_readout);
+   dev_dbg(&indio_dev->dev, "Expected work period: %u us\n", sampling_us);
+   dev_dbg(&indio_dev->dev, "Async readout mode: %d\n",
+   chip->allow_async_readout);

chip->prev_ns = iio_get_time_ns();

Re: next: sparc64 crashes due to 'blk-mq: dynamic h/w context count'

2016-02-14 Thread Guenter Roeck


On 02/14/2016 07:14 AM, Ming Lei wrote:

On Sun, Feb 14, 2016 at 9:17 PM, Guenter Roeck  wrote:

Hi,

my runtime tests of linux-next crash for sparc64 due to commit 'blk-mq: dynamic
h/w context count'. Reverting the patch fixes the problem. Bisect log is
attached below. Full crash log is available at http://kerneltests.org/builders,
in the table with qemu test results.


Guenter, could you test patch in the following link to see if it can be fixed?

http://marc.info/?l=linux-kernel&m=145526562410555&w=2



Yes, that fixes the problem.

Thanks,
Guenter

[PATCH] drivers/rtc: make class.c explicitly non-modular

2016-02-14 Thread Paul Gortmaker

The Makefile/Kconfig currently controlling compilation of this code is:

obj-$(CONFIG_RTC_CLASS) += rtc-core.o
rtc-core-y  := class.o interface.o

drivers/rtc/Kconfig:menuconfig RTC_CLASS
drivers/rtc/Kconfig:bool "Real Time Clock"

...meaning that it currently is not being built as a module by anyone.

Let's remove the modular code that is essentially orphaned, so that
when reading the code there is no doubt it is builtin-only.

We don't replace module.h with init.h since the file does need
to know what a struct module is.

We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.

Cc: Alessandro Zummo 
Cc: Alexandre Belloni 
Cc: rtc-li...@googlegroups.com
Signed-off-by: Paul Gortmaker 
---
 drivers/rtc/class.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index de86578bcd6d..74fd9746aeca 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -361,17 +361,4 @@ static int __init rtc_init(void)
rtc_dev_init();
return 0;
 }
-
-static void __exit rtc_exit(void)
-{
-   rtc_dev_exit();
-   class_destroy(rtc_class);
-   ida_destroy(&rtc_ida);
-}
-
 subsys_initcall(rtc_init);
-module_exit(rtc_exit);
-
-MODULE_AUTHOR("Alessandro Zummo ");
-MODULE_DESCRIPTION("RTC class support");
-MODULE_LICENSE("GPL");
-- 
2.5.0

Re: arm qemu test failures due to 'driver-core: platform: probe of-devices only using list of compatibles'

2016-02-14 Thread Russell King - ARM Linux

On Sun, Feb 14, 2016 at 08:55:01PM +0100, Uwe Kleine-König wrote:
> So the unexpected abnormality here is that even though this device is
> instantiated by dt, the driver doesn't provide any compatibles.
> Either my expectation is wrong, then 67d02a1bbb33455 should be reverted

Your expectation is wrong.  AMBA primecell devices have hardware IDs
and are matched to their drivers by those IDs.  Just like PCI.
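
Matching by hardware ID, as opposed to matching by compatible string, can be
sketched as a masked comparison against a driver's ID table. This is an
illustration only: struct hw_id, match_hw_id() and the values in example_ids
are assumptions made for this sketch and are not taken from the AMBA bus code
or from the mmci driver.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a driver's ID table: the bits selected by mask must equal id. */
struct hw_id {
	uint32_t id;
	uint32_t mask;
};

/* Illustrative values only, not copied from any real driver. */
static const struct hw_id example_ids[] = {
	{ 0x00041180, 0x000fffff },
	{ 0x00041181, 0x000fffff },
};

/* Return the first table entry matching the ID read from the device, or NULL. */
static const struct hw_id *match_hw_id(const struct hw_id *table, size_t n,
				       uint32_t periphid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if ((periphid & table[i].mask) == table[i].id)
			return &table[i];
	return NULL;
}
```

In such a scheme the driver never needs to list a compatible string: the
device tree only says that a device exists, and the ID read back from the
hardware selects the driver.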

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

Re: Computer fails to resume from suspend unless I rmmod jme before initiating the suspend

2016-02-14 Thread Diego Viola

On Sat, Feb 13, 2016 at 6:38 PM, Diego Viola  wrote:
> On Fri, Feb 12, 2016 at 6:17 AM, Diego Viola  wrote:
>> On Wed, Feb 10, 2016 at 7:36 PM, Diego Viola  wrote:
>>> On Wed, Feb 10, 2016 at 2:19 AM, Diego Viola  wrote:
 Hi Guo,

 I have an x86 computer with this network card:

 02:00.0 Ethernet controller: JMicron Technology Corp. JMC260 PCI
 Express Fast Ethernet Controller (rev 03)

 Every time I initiate a suspend (systemctl suspend) the machine hangs
 at resume unless I unload the jme driver.

 Here is a Call Trace I was able to get after it hanged:

 tasklet_action+0xb0/0xd0
 __do_softirq+0xcf/0x290
 irq_exit+0xa3/0xb0
 do_IRQ+0x54/0xd0
 common_interrupt+0x82/0x82

 jme_start_irq+0x84/0xa0 [jme]
 jme_resume+0x12f/0x210 [jme]
 pci_pm_resume+0x64/0xa0
 ? pci_pm_thaw+0x90/0x90
 dpm_run_callback+0x4e/0x130
 device_resume+0xd3/0x1f0
 async_resume+0x1d/0x50
 async_run_entry_fn+0x48/0x150
 process_one_work+0x14b/0x440
 worker_thread+0x48/0x4a0
 ? process_one_work+0x440/0x440
 kthread+0xd8/0xf0
 ? kthread_worker_fn+0x170/0x170
 ret_from_fork+0x3f/0x70
 ? kthread_worker_fn+0x170/0x170

 Please note that I had to type the calltrace above as I don't have a
 serial cable and netconsole didn't work for me for some reason, so
 there could be typos I didn't notice.

 I run Arch Linux (x86-64), my uname is:

 $ uname -a
 Linux myhost 4.4.1-2-ARCH #1 SMP PREEMPT Wed Feb 3 13:12:33 UTC 2016
 x86_64 GNU/Linux

 Please let me know if you have any questions or need any other information.

 Thanks,

 Diego
>>>
>>> I found something interesting, I can suspend/resume just fine when the
>>> module is loaded and when I do this:
>>>
>>> $ ip link set ens34 down
>>>
>>> When I bring the device up again the hang still occurs.
>>>
>>> Diego
>>
>> I have tried to reproduce this problem with the latest git
>> (torvalds/linux.git) and also went back to Linux 3.11 and I still got
>> the hang with both, my plan was to run git bisect, but the problem
>> still occurs.
>>
>> I opened this bug in bugzilla if it's preferred to deal with the problem there:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=112351
>>
>> Thanks,
>> Diego
>
> So I found that disabling async as in:
>
> $ echo 0 > /sys/power/pm_async
>
> Helps with my issue, I can't reproduce the hang anymore, tried
> suspend/resume almost ~15 times.
>
> Diego

Can someone please help?

Re: [PATCH 1/2] tracing/mm: don't trace kfree on offline CPUs

2016-02-14 Thread Denis Kirjanov

On 2/14/16, Steven Rostedt  wrote:
> On Sat, 13 Feb 2016 21:22:52 +0300
> Denis Kirjanov  wrote:
>
>> -DEFINE_EVENT(kmem_free, kfree,
>> +DEFINE_EVENT_CONDITION(kmem_free, kfree,
>>
>>  TP_PROTO(unsigned long call_site, const void *ptr),
>>
>> -TP_ARGS(call_site, ptr)
>> +TP_ARGS(call_site, ptr),
>> +
>> +/*
>> + * This trace can be potentially called from an offlined cpu.
>> + * Since trace points use RCU and RCU should not be used from
>> + * offline cpus, filter such calls out.
>> + * While this trace can be called from a preemptable section,
>> + * it has no impact on the condition since tasks can migrate
>> + * only from online cpus to other online cpus. Thus its safe
>> + * to use raw_smp_processor_id.
>> + */
>> +TP_CONDITION(cpu_online(raw_smp_processor_id()))
>
> This is starting to become a common occurrence. Perhaps it is best to
> just hardcode this into the tracepoint code itself?

Can you take it as a fix for now? I'll then post a follow-up patch
for the RCU and offline CPUs issue.

Thanks!

>
> -- Steve
>
>>  );
>>
>>  DEFINE_EVENT_CONDITION(kmem_free, kmem_cache_free,
>
>
