Re: [PATCH] pstore: fix crypto dependencies without compression

2018-04-06 Thread Arnd Bergmann
On Fri, Apr 6, 2018 at 8:38 AM, Tobias Regnery  wrote:
> Commit 58eb5b670747 ("pstore: fix crypto dependencies") fixed up the crypto
> dependencies but missed the case when no compression is selected.
>
> With CONFIG_PSTORE=y, CONFIG_PSTORE_COMPRESS=n  and CONFIG_CRYPTO=m we see
> the following link error:
>
> fs/pstore/platform.o: In function `pstore_register':
> (.text+0x1b1): undefined reference to `crypto_has_alg'
> (.text+0x205): undefined reference to `crypto_alloc_base'
> fs/pstore/platform.o: In function `pstore_unregister':
> (.text+0x3b0): undefined reference to `crypto_destroy_tfm'
>
> Fix this by selecting CONFIG_CRYPTO unconditionally.
>
> Fixes: 58eb5b670747 ("pstore: fix crypto dependencies")
> Signed-off-by: Tobias Regnery 

Thanks, I wonder how I missed this one. Thanks for fixing it up.
It's a bit unfortunate that it now disallows the otherwise valid
CONFIG_PSTORE=y, CONFIG_PSTORE_COMPRESS=n
and CONFIG_CRYPTO=n configuration, though.

Could we do this by making the calls compile-time configured
in the pstore code instead? Please try the untested version
below.

Arnd

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 1143ef351c58..dc720573fd53 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -258,7 +258,7 @@ static int pstore_decompress(void *in, void *out,

 static void allocate_buf_for_compression(void)
 {
-	if (!zbackend)
+	if (!IS_ENABLED(CONFIG_PSTORE_COMPRESS) || !zbackend)
 		return;
 
 	if (!crypto_has_comp(zbackend->name, 0, 0)) {
@@ -287,7 +287,7 @@ static void allocate_buf_for_compression(void)

 static void free_buf_for_compression(void)
 {
-	if (!IS_ERR_OR_NULL(tfm))
+	if (IS_ENABLED(CONFIG_PSTORE_COMPRESS) && !IS_ERR_OR_NULL(tfm))
 		crypto_free_comp(tfm);
 	kfree(big_oops_buf);
 	big_oops_buf = NULL;
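
The compile-time approach suggested in the reply above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the pstore code. OPTION_ENABLED() is a hypothetical stand-in that mimics how the kernel's IS_ENABLED() yields the constant 1 for an option defined as 1 and 0 for an undefined one. The CONFIG_DEMO_* names and the demo_* helpers are invented for illustration. Because the condition is a constant, an optimizing compiler can drop the guarded call entirely, which is presumably what would let the crypto references disappear when compression is off.

```c
/* Userspace stand-in for IS_ENABLED(): 1 if the option macro is defined
 * as 1, 0 if it is not defined at all. All names here are illustrative. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define OPTION_ENABLED(x) IS_DEFINED_2(x)

#define CONFIG_DEMO_ON 1
/* CONFIG_DEMO_OFF is intentionally left undefined. */

static int demo_crypto_calls;	/* counts calls into the pretend crypto helper */

static void demo_crypto_free(void)
{
	demo_crypto_calls++;
}

/* Mirrors the shape of the patched free_buf_for_compression(): the
 * constant-false option short-circuits before the helper is reached. */
static void demo_free_buf(int have_tfm)
{
	if (OPTION_ENABLED(CONFIG_DEMO_OFF) && have_tfm)
		demo_crypto_free();
}

/* Same guard with an enabled option: the helper is called. */
static void demo_free_buf_enabled(int have_tfm)
{
	if (OPTION_ENABLED(CONFIG_DEMO_ON) && have_tfm)
		demo_crypto_free();
}
```

With the option undefined, demo_free_buf() never reaches the helper regardless of its argument. With the option defined, the runtime check on have_tfm still applies.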


Re: [PATCH 4.4 014/134] perf tools: Make perf_event__synthesize_mmap_events() scale

2018-04-06 Thread Greg Kroah-Hartman
On Thu, Mar 29, 2018 at 05:13:56PM +0100, Ben Hutchings wrote:
> On Mon, 2018-03-19 at 19:04 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Stephane Eranian 
> > 
> > 
> > [ Upstream commit 88b897a30c525c2eee6e7f16e1e8d0f18830845e ]
> > 
> > This patch significantly improves the execution time of
> > perf_event__synthesize_mmap_events() when running perf record on systems
> > where processes have lots of threads.
> > 
> > It just happens that cat /proc/pid/maps support uses an O(N^2) algorithm to
> > generate each map line in the maps file.  If you have 1000 threads, then you
> > necessarily have 1000 stacks.  For each vma, you need to check if it
> > corresponds to a thread's stack.  With a large number of threads, this can
> > take a very long time. I have seen latencies >> 10mn.
> > 
> > As of today, perf does not use the fact that a mapping is a stack,
> > therefore we can work around the issue by using /proc/pid/tasks/pid/maps.
> > This entry does not try to map a vma to stack and is thus much faster with
> > no loss of functionality.
> > 
> > The proc-map-timeout logic is kept in case users still want some upper
> > limit.
> > 
> > In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual
> > /proc/pid/task/pid/maps, tasks -> task.  Thanks Arnaldo for catching this.
> > 
> > Committer note:
> > 
> > This problem seems to have been eliminated in the kernel since commit
> > b18cb64ead40 ("fs/proc: Stop trying to report thread stacks").
> [...]
> 
> I don't think so.  It looks like this was fixed by commit 65376df58217
> ("proc: revert /proc/<pid>/maps [stack:TID] annotation") which we
> already have in 4.4-stable.  But older branches (3.16, 3.18, 4.1) don't
> have that and probably should do.

Now added to 3.18.y

> It looks like commit b18cb64ead40 ("fs/proc: Stop trying to report
> thread stacks") is also a candidate for stable.

Now added to 3.18.y and 4.4.y, thanks!

greg k-h
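
For reference, the corrected per-thread path mentioned in the quoted commit message (/proc/pid/task/pid/maps, with "task" rather than "tasks") can be built as in the minimal sketch below. It is not the perf code: demo_task_maps_path() is a hypothetical helper name, and it assumes the main thread's TID equals the PID, which is why the same number is used twice.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Format the per-thread maps path for @pid into @buf.
 * Returns 0 on success, -1 if the buffer is too small. */
static int demo_task_maps_path(char *buf, size_t len, pid_t pid)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "/proc/%d/task/%d/maps", (int)pid, (int)pid);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The explicit truncation check matters because snprintf() returns the length it would have written, not the length it actually wrote.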


Re: [PATCH 3/8] bindings: PCI: designware: Add support for the EP in designware driver

2018-04-06 Thread Kishon Vijay Abraham I
Hi,

On Tuesday 03 April 2018 06:50 PM, Gustavo Pimentel wrote:
> Hi Kishon,
> 
> On 03/04/2018 11:55, Kishon Vijay Abraham I wrote:
>> Hi,
>>
>> On Tuesday 03 April 2018 04:13 PM, Gustavo Pimentel wrote:
>>> Hi Kishon,
>>>
>>> On 02/04/2018 06:35, Kishon Vijay Abraham I wrote:


>>>> On Wednesday 28 March 2018 05:08 PM, Gustavo Pimentel wrote:
>>>>> Signed-off-by: Gustavo Pimentel
>>>>
>>>> Please add a commit message.
>>>
>>> Ok. I'll add. Thanks for noticing it.
>>>
>>>>> ---
>>>>>  Documentation/devicetree/bindings/pci/designware-pcie.txt | 13 +
>>>>>  1 file changed, 13 insertions(+)
>>>>>
>>>>> diff --git a/Documentation/devicetree/bindings/pci/designware-pcie.txt b/Documentation/devicetree/bindings/pci/designware-pcie.txt
>>>>> index 6300762..4bb2e08 100644
>>>>> --- a/Documentation/devicetree/bindings/pci/designware-pcie.txt
>>>>> +++ b/Documentation/devicetree/bindings/pci/designware-pcie.txt
>>>>> @@ -3,6 +3,7 @@
>>>>>  Required properties:
>>>>>  - compatible:
>>>>>   "snps,dw-pcie" for RC mode;
>>>>> + "snps,dw-pcie-ep" for EP mode;
>>>>>  - reg: Should contain the configuration address space.
>>>>>  - reg-names: Must be "config" for the PCIe configuration space.
>>>>>  (The old way of getting the configuration address space from "ranges"
>>>>> @@ -56,3 +57,15 @@ Example configuration:
>>>>>   #interrupt-cells = <1>;
>>>>>   num-lanes = <1>;
>>>>>   };
>>>>> +or
>>>>> + pcie_ep: pcie_ep@dfc0 {
>>>>> + compatible = "snps,dw-pcie-ep";
>>>>> + reg = <0xdfc0 0x0001000>, /* IP registers 1 */
>>>>> +   <0xdfc01000 0x0001000>, /* IP registers 2 */
>>>>
>>>> Doesn't this have iATU unroll space?
>>>
>>> I don't think EP has it, but I'm no expert on this matter. Can you provide
>>> me some example where having iATU unroll space mapping would be useful in
>>> EP scope?
>>
>> I'm not sure. I thought if the dwc3 core version is 4.80, then it'll have a
>> separate ATU space irrespective of RC mode or EP mode.
> 
> As replied on patch 1, let's leave out any reference of iATU unroll to avoid
> any confusion. Agree?

Mentioning iATU is fine as long as you change the size field in the "reg" 
property.

Thanks
Kishon


Re: [PATCH v2] xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE

2018-04-06 Thread kbuild test robot
Hi Paul,

I love your patch! Yet something to improve:

[auto build test ERROR on xen-tip/linux-next]
[also build test ERROR on v4.16 next-20180405]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Paul-Durrant/xen-privcmd-add-IOCTL_PRIVCMD_MMAP_RESOURCE/20180406-121749
base:   https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git linux-next
config: arm64-defconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm64 

All errors (new ones prefixed by >>):

   drivers/xen/privcmd.o: In function `privcmd_ioctl':
>> privcmd.c:(.text+0x12ac): undefined reference to `xen_remap_domain_mfn_array'
   privcmd.c:(.text+0x12ac): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `xen_remap_domain_mfn_array'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH 4.4 092/134] cpufreq: Fix governor module removal race

2018-04-06 Thread Greg Kroah-Hartman
On Sun, Apr 01, 2018 at 09:56:41PM +0100, Ben Hutchings wrote:
> On Mon, 2018-03-19 at 19:06 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: "Rafael J. Wysocki" 
> > 
> > 
> > [ Upstream commit a8b149d32b663c1a4105273295184b78f53d33cf ]
> [...]
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -551,6 +551,8 @@ static int cpufreq_parse_governor(char *
> >     *governor = t;
> >     err = 0;
> >     }
> > +   if (t && !try_module_get(t->owner))
> > +   t = NULL;
> 
> This won't work because t is dead after this point.  The fix appears to
> depend on:
> 
> commit 045149e6a22119e5bf0d16a0b24a4173a2abb71d
> Author: Rafael J. Wysocki 
> Date:   Thu Nov 23 01:23:16 2017 +0100
> 
> cpufreq: Clean up cpufreq_parse_governor()
> 
> which moves the assignment to *governor further down.

Ick, this also didn't make it into 4.9.y so I'm just reverting it from
everywhere.

thanks for the review!

greg k-h
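
The ordering problem raised in the review above can be shown with a small userspace sketch. It is not the cpufreq code: struct demo_governor, demo_try_get() (standing in for try_module_get()) and both parse functions are hypothetical. The first function follows the order in the 4.4 backport, where the result is published before the module check. The second follows the order described for the upstream cleanup, where the check comes first.

```c
#include <assert.h>
#include <stddef.h>

struct demo_governor {
	int owner_alive;	/* 0 means the owning module is going away */
};

/* Hypothetical stand-in for try_module_get(): fails once the owner is dying. */
static int demo_try_get(const struct demo_governor *g)
{
	return g->owner_alive;
}

/* Backport order: *out is assigned before the check, so clearing the
 * local pointer afterwards changes nothing the caller can see. */
static int demo_parse_publish_first(struct demo_governor *t,
				    struct demo_governor **out)
{
	int err = -1;

	assert(out != NULL);
	if (t) {
		*out = t;
		err = 0;
	}
	if (t && !demo_try_get(t))
		t = NULL;	/* dead store: t is not used again */
	return err;
}

/* Cleanup order: check first, publish only if the reference was taken. */
static int demo_parse_check_first(struct demo_governor *t,
				  struct demo_governor **out)
{
	assert(out != NULL);
	if (t && !demo_try_get(t))
		t = NULL;
	if (!t)
		return -1;
	*out = t;
	return 0;
}
```

With a governor whose owner is going away, the first variant still reports success and hands out the pointer, while the second fails cleanly.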


INFO: rcu detected stall in __snd_pcm_lib_xfer (2)

2018-04-06 Thread syzbot

Hello,

syzbot hit the following crash on upstream commit
e02d37bf55a9a36f22427fd6dd733fe104d817b6 (Thu Apr 5 17:42:07 2018 +)
Merge tag 'sound-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=7e3f31a52646f939c052

So far this crash happened 43 times on upstream.
Unfortunately, I don't have any reproducer for this crash yet.
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5148264345108480
Kernel config: https://syzkaller.appspot.com/x/.config?id=-4805825610197092128
compiler: gcc (GCC) 8.0.1 20180301 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+7e3f31a52646f939c...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.

If you forward the report, please keep this part and the footer.

INFO: rcu_sched self-detected stall on CPU
	1-...!: (124999 ticks this GP) idle=522/1/4611686018427387906 softirq=65432/65432 fqs=169
	 (t=125000 jiffies g=35384 c=35383 q=537)
rcu_sched kthread starved for 124294 jiffies! g35384 c35383 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0

RCU grace-period kthread stack dump:
rcu_sched   R  running task23768 9  2 0x8000
Call Trace:
 context_switch kernel/sched/core.c:2848 [inline]
 __schedule+0x807/0x1e40 kernel/sched/core.c:3490
 schedule+0xef/0x430 kernel/sched/core.c:3549
 schedule_timeout+0x138/0x240 kernel/time/timer.c:1801
 rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231
 kthread+0x345/0x410 kernel/kthread.c:238
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411
NMI backtrace for cpu 1
CPU: 1 PID: 15299 Comm: syz-executor3 Not tainted 4.16.0+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x1b9/0x29f lib/dump_stack.c:53
 nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
 nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
 trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
 rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
 print_cpu_stall kernel/rcu/tree.c:1525 [inline]
 check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
 __rcu_pending kernel/rcu/tree.c:3356 [inline]
 rcu_pending kernel/rcu/tree.c:3401 [inline]
 rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
 update_process_times+0x2d/0x70 kernel/time/timer.c:1636
 tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:171
 tick_sched_timer+0x42/0x130 kernel/time/tick-sched.c:1179
 __run_hrtimer kernel/time/hrtimer.c:1337 [inline]
 __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1399
 hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1457
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
 smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
 </IRQ>
RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:793 [inline]
RIP: 0010:snd_pcm_stream_unlock_irq+0xb7/0xf0 sound/core/pcm_native.c:166
RSP: 0018:8801c3f27560 EFLAGS: 0246 ORIG_RAX: ff13
RAX: 0004 RBX:  RCX: c900038d5000
RDX: 0004 RSI: 859ca760 RDI: 88b172b8
RBP: 8801c3f27568 R08: 8801abf86af8 R09: 0006
R10: 8801abf86280 R11:  R12: 0004
R13: ffe0 R14: 8801d7d9e1c0 R15: 8801ceac6940
 __snd_pcm_lib_xfer+0x739/0x1d10 sound/core/pcm_lib.c:2246
 snd_pcm_oss_write3+0xe9/0x220 sound/core/oss/pcm_oss.c:1236
 io_playback_transfer+0x274/0x310 sound/core/oss/io.c:47
 snd_pcm_plug_write_transfer+0x36c/0x470 sound/core/oss/pcm_plugin.c:619
 snd_pcm_oss_write2+0x25c/0x460 sound/core/oss/pcm_oss.c:1365
 snd_pcm_oss_write1 sound/core/oss/pcm_oss.c:1415 [inline]
 snd_pcm_oss_write+0x764/0xa20 sound/core/oss/pcm_oss.c:2774
 __vfs_write+0x10b/0x880 fs/read_write.c:485
 vfs_write+0x1f8/0x560 fs/read_write.c:549
 ksys_write+0xf9/0x250 fs/read_write.c:598
 SYSC_write fs/read_write.c:610 [inline]
 SyS_write+0x24/0x30 fs/read_write.c:607
 do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x4552d9
RSP: 002b:7f1568de6c68 EFLAGS: 0246 ORIG_RAX: 0001
RAX: ffda RBX: 7f1568de76d4 RCX: 004552d9
RDX: 0060 RSI: 2000 RDI: 0013
RBP: 0072bea0 R08:  R09: 
R10:  R11: 0246 R12: 
R13: 06c0 R14: 006fd2a0 R15: 


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkal...@googlegroups.com.

syzbot will keep track of thi

Re: [PATCH] mm: shmem: enable thp migration (Re: [PATCH v1] mm: consider non-anonymous thp as unmovable page)

2018-04-06 Thread Michal Hocko
On Fri 06-04-18 05:14:53, Naoya Horiguchi wrote:
> On Fri, Apr 06, 2018 at 03:07:11AM +, Horiguchi Naoya(堀口 直也) wrote:
> ...
> > -
> > From e31ec037701d1cc76b26226e4b66d8c783d40889 Mon Sep 17 00:00:00 2001
> > From: Naoya Horiguchi 
> > Date: Fri, 6 Apr 2018 10:58:35 +0900
> > Subject: [PATCH] mm: enable thp migration for shmem thp
> > 
> > My testing for the latest kernel supporting thp migration showed an
> > infinite loop in offlining the memory block that is filled with shmem
> > thps.  We can get out of the loop with a signal, but kernel should
> > return with failure in this case.
> > 
> > What happens in the loop is that scan_movable_pages() repeats returning
> > the same pfn without any progress. That's because page migration always
> > fails for shmem thps.
> > 
> > In memory offline code, memory blocks containing unmovable pages should
> > be prevented from being offline targets by has_unmovable_pages() inside
> > start_isolate_page_range(). So it's possible to change migratability
> > for non-anonymous thps to avoid the issue, but it introduces more complex
> > and thp-specific handling in migration code, so it might not be good.
> > 
> > So this patch is suggesting to fix the issue by enabling thp migration
> > for shmem thp. Both of anon/shmem thp are migratable so we don't need
> > precheck about the type of thps.
> > 
> > Fixes: commit 72b39cfc4d75 ("mm, memory_hotplug: do not fail offlining too early")
> > Signed-off-by: Naoya Horiguchi 
> > Cc: sta...@vger.kernel.org # v4.15+
> 
> ... oh, I don't think this is suitable for stable.
> Michal's fix in another email can come first with "CC: stable",
> then this one.
> Anyway I want to get some feedback on the change of this patch.

My patch is indeed much simpler but it depends on [1], and that doesn't
sound like stable material either because it depends on another 2
patches. Maybe we need some other hack for 4.15 if we really care enough.

[1] http://lkml.kernel.org/r/20180103082555.14592-4-mho...@kernel.org
-- 
Michal Hocko
SUSE Labs


INFO: rcu detected stall in io_playback_transfer

2018-04-06 Thread syzbot

Hello,

syzbot hit the following crash on upstream commit
e02d37bf55a9a36f22427fd6dd733fe104d817b6 (Thu Apr 5 17:42:07 2018 +)
Merge tag 'sound-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=4f2016cf5185da7759dc

Unfortunately, I don't have any reproducer for this crash yet.
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=6609942245015552
Kernel config: https://syzkaller.appspot.com/x/.config?id=-4805825610197092128
compiler: gcc (GCC) 8.0.1 20180301 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+4f2016cf5185da775...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.

If you forward the report, please keep this part and the footer.

INFO: rcu_sched self-detected stall on CPU
	1-: (124999 ticks this GP) idle=212/1/4611686018427387906 softirq=57947/57947 fqs=31225
	 (t=125000 jiffies g=30827 c=30826 q=171)
NMI backtrace for cpu 1
CPU: 1 PID: 13330 Comm: syz-executor4 Not tainted 4.16.0+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x1b9/0x29f lib/dump_stack.c:53
 nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
 nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
 trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
 rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
 print_cpu_stall kernel/rcu/tree.c:1525 [inline]
 check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
 __rcu_pending kernel/rcu/tree.c:3356 [inline]
 rcu_pending kernel/rcu/tree.c:3401 [inline]
 rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
 update_process_times+0x2d/0x70 kernel/time/timer.c:1636
 tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:171
 tick_sched_timer+0x42/0x130 kernel/time/tick-sched.c:1179
 __run_hrtimer kernel/time/hrtimer.c:1337 [inline]
 __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1399
 hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1457
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
 smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
 </IRQ>
RIP: 0010:snd_pcm_oss_write3+0xf6/0x220 sound/core/oss/pcm_oss.c:1236
RSP: 0018:88018f57ed00 EFLAGS: 0282 ORIG_RAX: ff13
RAX: ffe0 RBX: ffe0 RCX: 
RDX:  RSI: ffe0 RDI: ffe0
RBP: 88018f57ed48 R08: 8801b29daaf8 R09: 0006
R10: 8801b29da280 R11:  R12: 0001
R13: 8801ce8c56c0 R14: 8801aa45a300 R15: ffe0
 io_playback_transfer+0x274/0x310 sound/core/oss/io.c:47
 snd_pcm_plug_write_transfer+0x36c/0x470 sound/core/oss/pcm_plugin.c:619
 snd_pcm_oss_write2+0x25c/0x460 sound/core/oss/pcm_oss.c:1365
 snd_pcm_oss_sync1+0x332/0x5a0 sound/core/oss/pcm_oss.c:1606
 snd_pcm_oss_sync.isra.29+0x790/0x980 sound/core/oss/pcm_oss.c:1682
 snd_pcm_oss_release+0x214/0x290 sound/core/oss/pcm_oss.c:2559
 __fput+0x34d/0x890 fs/file_table.c:209
 fput+0x15/0x20 fs/file_table.c:243
 task_work_run+0x1e4/0x290 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x1aee/0x2730 kernel/exit.c:865
 do_group_exit+0x16f/0x430 kernel/exit.c:968
 get_signal+0x886/0x1960 kernel/signal.c:2469
 do_signal+0x90/0x2020 arch/x86/kernel/signal.c:810
 exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
 do_syscall_64+0x792/0x9d0 arch/x86/entry/common.c:292
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x4552d9
RSP: 002b:7f5c2a0e9c68 EFLAGS: 0246 ORIG_RAX: 0023
RAX: fdfc RBX: 7f5c2a0ea6d4 RCX: 004552d9
RDX:  RSI: 21c0 RDI: 2240
RBP: 0072c010 R08:  R09: 
R10:  R11: 0246 R12: 
R13: 0413 R14: 006f9268 R15: 0002


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkal...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause

INFO: rcu detected stall in snd_pcm_oss_write3 (2)

2018-04-06 Thread syzbot

Hello,

syzbot hit the following crash on upstream commit
e02d37bf55a9a36f22427fd6dd733fe104d817b6 (Thu Apr 5 17:42:07 2018 +)
Merge tag 'sound-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=150189c103427d31a053

So far this crash happened 3 times on upstream.
Unfortunately, I don't have any reproducer for this crash yet.
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=6067392849379328
Kernel config: https://syzkaller.appspot.com/x/.config?id=-4805825610197092128
compiler: gcc (GCC) 8.0.1 20180301 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+150189c103427d31a...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.

If you forward the report, please keep this part and the footer.

Buffer I/O error on dev loop0, logical block 6, lost async page write
Buffer I/O error on dev loop0, logical block 7, lost async page write
Buffer I/O error on dev loop0, logical block 8, lost async page write
Buffer I/O error on dev loop0, logical block 9, lost async page write
Buffer I/O error on dev loop0, logical block 10, lost async page write
INFO: rcu_sched self-detected stall on CPU
	1-: (124998 ticks this GP) idle=9b2/1/4611686018427387906 softirq=22733/22733 fqs=31170
	 (t=125000 jiffies g=11599 c=11598 q=1619)
NMI backtrace for cpu 1
CPU: 1 PID: 7184 Comm: syz-executor3 Not tainted 4.16.0+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x1b9/0x29f lib/dump_stack.c:53
 nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
 nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
 trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
 rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
 print_cpu_stall kernel/rcu/tree.c:1525 [inline]
 check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
 __rcu_pending kernel/rcu/tree.c:3356 [inline]
 rcu_pending kernel/rcu/tree.c:3401 [inline]
 rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
 update_process_times+0x2d/0x70 kernel/time/timer.c:1636
 tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:171
 tick_sched_timer+0x42/0x130 kernel/time/tick-sched.c:1179
 __run_hrtimer kernel/time/hrtimer.c:1337 [inline]
 __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1399
 hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1457
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
 smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
 </IRQ>
RIP: 0010:__sanitizer_cov_trace_pc+0x2b/0x50 kernel/kcov.c:101
RSP: 0018:8801cfe77710 EFLAGS: 0246 ORIG_RAX: ff13
RAX: 8801d0108080 RBX: 0004 RCX: 85a1f955
RDX: 0002 RSI: 85a1f95f RDI: 0005
RBP: 8801cfe77710 R08: 8801d0108080 R09: 0006
R10: 8801d0108080 R11:  R12: 0001
R13: 8801ceb2cd80 R14: 8801aaacec00 R15: ffe0
 snd_pcm_oss_write3+0x16f/0x220 sound/core/oss/pcm_oss.c:1224
 io_playback_transfer+0x274/0x310 sound/core/oss/io.c:47
 snd_pcm_plug_write_transfer+0x36c/0x470 sound/core/oss/pcm_plugin.c:619
 snd_pcm_oss_write2+0x25c/0x460 sound/core/oss/pcm_oss.c:1365
 snd_pcm_oss_sync1+0x332/0x5a0 sound/core/oss/pcm_oss.c:1606
 snd_pcm_oss_sync.isra.29+0x790/0x980 sound/core/oss/pcm_oss.c:1682
 snd_pcm_oss_release+0x214/0x290 sound/core/oss/pcm_oss.c:2559
 __fput+0x34d/0x890 fs/file_table.c:209
 fput+0x15/0x20 fs/file_table.c:243
 task_work_run+0x1e4/0x290 kernel/task_work.c:113
 tracehook_notify_resume include/linux/tracehook.h:191 [inline]
 exit_to_usermode_loop+0x2bd/0x310 arch/x86/entry/common.c:166
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
 do_syscall_64+0x792/0x9d0 arch/x86/entry/common.c:292
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x4552d9
RSP: 002b:7f4e5ffe6c68 EFLAGS: 0246 ORIG_RAX: 0003
RAX:  RBX: 7f4e5ffe76d4 RCX: 004552d9
RDX:  RSI:  RDI: 0013
RBP: 0072bea0 R08:  R09: 
R10:  R11: 0246 R12: 
R13: 0052 R14: 006f3850 R15: 
INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 1-... } 127452 jiffies s: 2685 root: 0x2/.

blocking rcu_node structures:
Task dump for CPU 1:
syz-executor3   R  running task24120  7184   4559 0x000c
Call Trace:


---
This bug is generated by a dumb bot. It may contain errors.
See https://

Re: [4.9, 137/145] spi: bcm-qspi: shut up warning about cfi header inclusion

2018-04-06 Thread gre...@linuxfoundation.org
On Tue, Apr 03, 2018 at 10:46:07AM -0700, Florian Fainelli wrote:
> On 02/23/2018 10:27 AM, gre...@linuxfoundation.org wrote:
> > 4.9-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Arnd Bergmann 
> > 
> > When CONFIG_MTD_CFI is disabled, we get a warning for this spi driver:
> > 
> > include/linux/mtd/cfi.h:76:2: #warning No CONFIG_MTD_CFI_Ix selected. No NOR chip support can work. [-Werror=cpp]
> > 
> > The problem here is a layering violation that was fixed in mainline kernels
> > with a larger rework in commit 054e532f8f90 ("spi: bcm-qspi: Remove
> > hardcoded settings and spi-nor.h dependency"). We can't really backport
> > that to stable kernels, so this just adds a Kconfig dependency to make it
> > either build cleanly or force it to be disabled.
> 
> Sorry for noticing so late, but this appears to be bogus, there is no
> MTD_NORFLASH symbol being defined in 4.9, in fact I can't find this
> Kconfig symbol in any kernel version, so this effectively results in the
> driver no longer being selectable, so this sure does silence the warning.
> 
> Arnd, should we just send reverts of this patch for the affected kernel
> or should we be defining MTD_NORFLASH somehow? Am I missing something here?

I'm going to revert this patch for now, thanks.

greg k-h


[GIT PULL] mtd: Changes for 4.17

2018-04-06 Thread Boris Brezillon
Hello Linus,

Here is the MTD PR for 4.17. See below for the list of changes queued
for this release.

Regards,

Boris

The following changes since commit 91ab883eb21325ad80f3473633f794c78ac87f51:

  Linux 4.16-rc2 (2018-02-18 17:29:42 -0800)

are available in the git repository at:

  git://git.infradead.org/linux-mtd.git tags/mtd/for-4.17

for you to fetch changes up to fe5f31a8010a0cb13e72cfb72905fefa2a41730c:

  Merge tag 'v4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into mtd/next (2018-04-04 22:13:35 +0200)


MTD changes:
  Core:
    * Remove support for asynchronous erase (not implemented by any of
      the existing drivers anyway)
    * Remove Cyrille from the list of SPI NOR and MTD maintainers
    * Fix kernel doc headers
    * Allow users to define the partitions parsers they want to test
      through a DT property (compatible of the partitions subnode)
    * Remove the bfin-async-flash driver (the only architecture using
      it has been removed)
    * Fix pagetest test
    * Add extra checks in mtd_erase()
    * Simplify the MTD partition creation logic and get rid of
      mtd_add_device_partitions()

  Drivers:
    * Add endianness information to the physmap DT binding
    * Add Eon EN29LV400A IDs to JEDEC probe logic
    * Use %*ph where appropriate

SPI NOR changes:
  Drivers:
    * Make fsl-quadspi assign different names to MTD devices connected
      to the same QSPI controller
    * Remove an unneeded driver.bus assignment in the fsl-qspi driver

NAND changes:
  Core:
    * Prepare arrival of the SPI NAND subsystem by implementing a
      generic (interface-agnostic) layer to ease manipulation of NAND
      devices
    * Move onenand code base to the drivers/mtd/nand/ dir
    * Rework timing mode selection
    * Provide a generic way for NAND chip drivers to flag a specific
      GET/SET FEATURE operation as supported/unsupported
    * Stop embedding ONFI/JEDEC param page in nand_chip

  Drivers:
    * Rework/cleanup of the mxc driver
    * Various cleanups in the vf610 driver
    * Migrate the fsmc and vf610 to ->exec_op()
    * Get rid of the pxa driver (replaced by marvell_nand)
    * Support ->setup_data_interface() in the GPMI driver
    * Fix probe error path in several drivers
    * Remove support for unused hw_syndrome mode in sunxi_nand
    * Various minor improvements


Alexey Khoroshilov (3):
  mtd: nand: vf610: remove the unnecessary of_node_put()
  mtd: nand: vf610: improve readability of error label
  mtd: nand: vf610: check mtd_device_register() return code

Antonio Cardace (2):
  mtd: st_spi_fsm: use %*ph to print small buffer
  mtd: nftl: use %*ph to print small buffer

Arnd Bergmann (2):
  mtd: rawnand: remove bf5xx_nand driver
  mtd: maps: remove bfin-async-flash driver

Arushi Singhal (1):
  mtd: ftl: Use DIV_ROUND_UP()

Boris Brezillon (23):
  mtd: Make sure the device supports erase operations in mtd_erase()
  mtd: nand: Get rid of comments giving the file path inside the file itself
  mtd: nand: Stop using full path when referring to files placed in the same dir
  mtd: nand: ams-delta: Fix path to toto.c source file
  mtd: nand: State when references to other drivers are no longer valid
  mtd: nand: Add missing copyright information
  mtd: nand: move raw NAND related code to the raw/ subdir
  mtd: nand: Add core infrastructure to deal with NAND devices
  Update Boris Brezillon email address
  Merge tag 'nand/pxa3xx-removal' of git://git.infradead.org/linux-mtd into nand/next
  mtd: onenand: Get rid of comments giving the file path inside the file itself
  mtd: Move onenand code base to drivers/mtd/nand/onenand
  mtd: Initialize ->fail_addr early in mtd_erase()
  mtd: Get rid of unused fields in struct erase_info
  mtd: Stop assuming mtd_erase() is asynchronous
  mtd: Unconditionally update ->fail_addr and ->addr in part_erase()
  mtd: Stop updating erase_info->state and calling mtd_erase_callback()
  mtd: rawnand: sunxi: Stop supporting ECC_HW_SYNDROME mode
  mtd: rawnand: marvell: Rename ->ecc_clk into ->core_clk
  mtd: fsl-quadspi: Remove unneeded driver.bus assignment
  Merge tag 'spi-nor/for-4.17' of git://git.infradead.org/linux-mtd into mtd/next
  Merge tag 'nand/for-4.17' of git://git.infradead.org/linux-mtd into mtd/next
  Merge tag 'v4.16-rc2' of git://git.kernel.org/.../torvalds/linux into mtd/next

Colin Ian King (1):
  mtd: block2mtd: remove redundant initialization of 'bdev'

Cyrille Pitchen (1):
  MAINTAINERS: update maintainers for MTD and SPI NOR subsystems

Fabio Estevam (2):
  mtd: fsl-quadspi: Distinguish the mtd device names
  dt-bindings: fsl-quadspi: Add the example of two SPI NOR

Gregory CLEMENT (1):
  mtd: rawnand: marvell: Fix clock r

Re: [PATCH 3/6] aio: refactor read/write iocb setup

2018-04-06 Thread Christoph Hellwig
On Fri, Apr 06, 2018 at 04:21:46AM +0100, Al Viro wrote:
> On Wed, Mar 28, 2018 at 09:26:36AM +0200, Christoph Hellwig wrote:
> > +   struct inode *inode = file_inode(file);
> > +
> > req->ki_flags |= IOCB_WRITE;
> > file_start_write(file);
> > -   ret = aio_ret(req, call_write_iter(file, req, &iter));
> > +   ret = aio_rw_ret(req, call_write_iter(file, req, &iter));
> > /*
> > -* We release freeze protection in aio_complete().  Fool lockdep
> > -* by telling it the lock got released so that it doesn't
> > -* complain about held lock when we return to userspace.
> > +* We release freeze protection in aio_complete_rw().  Fool
> > +* lockdep by telling it the lock got released so that it
> > +* doesn't complain about held lock when we return to userspace.
> >  */
> > -   if (S_ISREG(file_inode(file)->i_mode))
> > -   __sb_writers_release(file_inode(file)->i_sb, SB_FREEZE_WRITE);
> > +   if (S_ISREG(inode->i_mode))
> 
> ... and that's another use-after-free, since we might've already done fput() of
> that sucker by that point.

Indeed.  Not in any way new in this patch, this is an existing issue
dating way back that needs to be fixed, which will be rather annoying
without taking an extra reference to the inode or at least sb.
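
The ordering problem can be sketched in userspace. This is only a toy model under stated assumptions: toy_file, toy_sb, toy_fput(), toy_submit_write() and toy_aio_write() are hypothetical stand-ins, not kernel APIs, and pinning the sb is just one of the two options mentioned above. The point it tries to show is that anything needed after submission is captured, and the sb pinned, before the call that may drop the last file reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for a superblock and a file; not kernel types. */
struct toy_sb {
	int refs;
	int writers_released;
};

struct toy_file {
	int refs;
	bool is_regular;
	struct toy_sb *sb;
};

static int toy_files_freed;

/* Drop one file reference and free the file when it was the last one. */
static void toy_fput(struct toy_file *f)
{
	assert(f->refs > 0);
	if (--f->refs == 0) {
		toy_files_freed++;
		free(f);
	}
}

/*
 * Models a submission whose completion runs before the submitter's next
 * statement and drops the file reference.
 */
static void toy_submit_write(struct toy_file *f)
{
	toy_fput(f);
}

/*
 * Safe ordering: copy what is needed out of the file and pin the sb
 * before submitting, then touch only the sb afterwards.
 */
static bool toy_aio_write(struct toy_file *f)
{
	struct toy_sb *sb = f->sb;
	bool is_regular = f->is_regular;

	sb->refs++;               /* extra reference keeps sb valid */
	toy_submit_write(f);      /* f may be freed from here on */

	if (is_regular)
		sb->writers_released++;   /* uses sb only, never f */
	sb->refs--;
	return is_regular;
}
```

Reading f->is_regular or f->sb after toy_submit_write() would be the toy equivalent of the use-after-free pointed out in the review.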


INFO: rcu detected stall in n_tty_receive_char_special

2018-04-06 Thread syzbot

Hello,

syzbot hit the following crash on upstream commit
3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 (Sat Mar 31 01:52:36 2018 +)
kernel.h: Retain constant expression output for max()/min()
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=18df353d7540aa6b5467


Unfortunately, I don't have any reproducer for this crash yet.
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=5836679554269184
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=-1647968177339044852

compiler: gcc (GCC) 8.0.1 20180301 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+18df353d7540aa6b5...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.

If you forward the report, please keep this part and the footer.

INFO: rcu_sched detected stalls on CPUs/tasks:
(detected by 1, t=125007 jiffies, g=42488, c=42487, q=11)
All QSes seen, last rcu_sched kthread activity 125014 (4295022441-4294897427), jiffies_till_next_fqs=3, root ->qsmask 0x0

kworker/u4:5R  running task15272  8806  2 0x8008
Workqueue: events_unbound flush_to_ldisc
Call Trace:
 
 sched_show_task.cold.87+0x27a/0x301 kernel/sched/core.c:5325
 print_other_cpu_stall.cold.79+0x92f/0x9d2 kernel/rcu/tree.c:1481
 check_cpu_stall.isra.61+0x706/0xf50 kernel/rcu/tree.c:1599
 __rcu_pending kernel/rcu/tree.c:3356 [inline]
 rcu_pending kernel/rcu/tree.c:3401 [inline]
 rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
 update_process_times+0x2d/0x70 kernel/time/timer.c:1636
 tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:171
 tick_sched_timer+0x42/0x130 kernel/time/tick-sched.c:1179
 __run_hrtimer kernel/time/hrtimer.c:1337 [inline]
 __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1399
 hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1457
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
 smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
 
RIP: 0010:echo_char+0xae/0x2e0 drivers/tty/n_tty.c:915
RSP: 0018:8801d33e71e0 EFLAGS: 0a07 ORIG_RAX: ff13
RAX: dc00 RBX: c90013158000 RCX: 8375b1b7
RDX: 11003ad87636 RSI: 8375b1c6 RDI: 8801d6c3b1b4
RBP: 8801d33e7210 R08: 8801cf482540 R09: f5200262b460
R10: f5200262b460 R11: c9001315a307 R12: 00cb
R13: 8801d6c3ae00 R14: c240f0bb R15: 00bb
 n_tty_receive_char_special+0x13b3/0x31c0 drivers/tty/n_tty.c:1306
 n_tty_receive_buf_fast drivers/tty/n_tty.c:1577 [inline]
 __receive_buf drivers/tty/n_tty.c:1611 [inline]
 n_tty_receive_buf_common+0x20ca/0x2c50 drivers/tty/n_tty.c:1709
 n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744
 tty_ldisc_receive_buf+0xb0/0x190 drivers/tty/tty_buffer.c:456
 tty_port_default_receive_buf+0x110/0x170 drivers/tty/tty_port.c:38
 receive_buf drivers/tty/tty_buffer.c:475 [inline]
 flush_to_ldisc+0x3e9/0x560 drivers/tty/tty_buffer.c:524
 process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
 worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
 kthread+0x345/0x410 kernel/kthread.c:238
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411
rcu_sched kthread starved for 125626 jiffies! g42488 c42487 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0

RCU grace-period kthread stack dump:
rcu_sched   R  running task23592 9  2 0x8000
Call Trace:
 context_switch kernel/sched/core.c:2848 [inline]
 __schedule+0x807/0x1e40 kernel/sched/core.c:3490
 schedule+0xef/0x430 kernel/sched/core.c:3549
 schedule_timeout+0x138/0x240 kernel/time/timer.c:1801
 rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231
 kthread+0x345/0x410 kernel/kthread.c:238
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkal...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug report.

Note: all commands must start from beginning of the line in the email body.


Re: INFO: rcu detected stall in n_tty_receive_char_special

2018-04-06 Thread Dmitry Vyukov
On Fri, Apr 6, 2018 at 9:12 AM, syzbot wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 (Sat Mar 31 01:52:36 2018 +)
> kernel.h: Retain constant expression output for max()/min()
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=18df353d7540aa6b5467
>
> Unfortunately, I don't have any reproducer for this crash yet.
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=5836679554269184
> Kernel config:
> https://syzkaller.appspot.com/x/.config?id=-1647968177339044852
> compiler: gcc (GCC) 8.0.1 20180301 (experimental)
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+18df353d7540aa6b5...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for details.
> If you forward the report, please keep this part and the footer.


This looks somewhat similar to "INFO: rcu detected stall in __process_echoes":
https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40
But I am not sure because stall stacks are somewhat different.


> INFO: rcu_sched detected stalls on CPUs/tasks:
> (detected by 1, t=125007 jiffies, g=42488, c=42487, q=11)
> All QSes seen, last rcu_sched kthread activity 125014 (4295022441-4294897427), jiffies_till_next_fqs=3, root ->qsmask 0x0
> kworker/u4:5R  running task15272  8806  2 0x8008
> Workqueue: events_unbound flush_to_ldisc
> Call Trace:
>  
>  sched_show_task.cold.87+0x27a/0x301 kernel/sched/core.c:5325
>  print_other_cpu_stall.cold.79+0x92f/0x9d2 kernel/rcu/tree.c:1481
>  check_cpu_stall.isra.61+0x706/0xf50 kernel/rcu/tree.c:1599
>  __rcu_pending kernel/rcu/tree.c:3356 [inline]
>  rcu_pending kernel/rcu/tree.c:3401 [inline]
>  rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
>  update_process_times+0x2d/0x70 kernel/time/timer.c:1636
>  tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:171
>  tick_sched_timer+0x42/0x130 kernel/time/tick-sched.c:1179
>  __run_hrtimer kernel/time/hrtimer.c:1337 [inline]
>  __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1399
>  hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1457
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
>  smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
>  
> RIP: 0010:echo_char+0xae/0x2e0 drivers/tty/n_tty.c:915
> RSP: 0018:8801d33e71e0 EFLAGS: 0a07 ORIG_RAX: ff13
> RAX: dc00 RBX: c90013158000 RCX: 8375b1b7
> RDX: 11003ad87636 RSI: 8375b1c6 RDI: 8801d6c3b1b4
> RBP: 8801d33e7210 R08: 8801cf482540 R09: f5200262b460
> R10: f5200262b460 R11: c9001315a307 R12: 00cb
> R13: 8801d6c3ae00 R14: c240f0bb R15: 00bb
>  n_tty_receive_char_special+0x13b3/0x31c0 drivers/tty/n_tty.c:1306
>  n_tty_receive_buf_fast drivers/tty/n_tty.c:1577 [inline]
>  __receive_buf drivers/tty/n_tty.c:1611 [inline]
>  n_tty_receive_buf_common+0x20ca/0x2c50 drivers/tty/n_tty.c:1709
>  n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744
>  tty_ldisc_receive_buf+0xb0/0x190 drivers/tty/tty_buffer.c:456
>  tty_port_default_receive_buf+0x110/0x170 drivers/tty/tty_port.c:38
>  receive_buf drivers/tty/tty_buffer.c:475 [inline]
>  flush_to_ldisc+0x3e9/0x560 drivers/tty/tty_buffer.c:524
>  process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
>  worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
>  kthread+0x345/0x410 kernel/kthread.c:238
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411
> rcu_sched kthread starved for 125626 jiffies! g42488 c42487 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
> RCU grace-period kthread stack dump:
> rcu_sched   R  running task23592 9  2 0x8000
> Call Trace:
>  context_switch kernel/sched/core.c:2848 [inline]
>  __schedule+0x807/0x1e40 kernel/sched/core.c:3490
>  schedule+0xef/0x430 kernel/sched/core.c:3549
>  schedule_timeout+0x138/0x240 kernel/time/timer.c:1801
>  rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231
>  kthread+0x345/0x410 kernel/kthread.c:238
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411
>
>
> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkal...@googlegroups.com.
>
> syzbot will keep track of this bug report.
> If you forgot to add the Reported-by tag, once the fix for this bug is merged
> into any tree, please reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug report.
> Note: all commands must start from beginning of the line in the email body.
>
> --
> You

Re: regression: twl4030 audio/clock stopped working in v4.16

2018-04-06 Thread Peter Ujfalusi


On 2018-04-05 17:43, H. Nikolaus Schaller wrote:
> Hi Peter,
> 
>> Am 05.04.2018 um 13:05 schrieb Peter Ujfalusi :
>>
>> Nikolaus,
>>
>> can you CC me also, I have almost missed this...
> 
> Ah, good. Thanks for the quick response! I was just starting to set up git bisect...

I think it got broken because of these:
7558562a70fb clk: ti: Drop legacy clk-3xxx-legacy code
0ed266d7ae5e clk: ti: omap3: cleanup unnecessary clock aliases

These used to wire the twl's fck to osc_sys_ck, but they are gone in 4.16.

>>
>> On 2018-04-04 21:29, H. Nikolaus Schaller wrote:
>>> Hi Peter,
>>> I just noticed a problem in v4.16 kernels with twl4030 audio and vibra driver no longer working.
>>>
>>> Tracing it back shows that it already did appear in v4.16-rc1 and wasn't fixed up to v4.16.0.
>>> Kernel v4.15.9 (the latest one where I have a binary) works.
>>
>> v4.16 works just fine on beagle-xm (including audio), omap2plus_defconfig.
>>
>>> The symptoms are:
>>>
>>> [1.557342] twl4030-audio 4807.i2c:twl@48:audio: Invalid audio_mclk
>>> [1.564788] twl4030-audio: probe of 4807.i2c:twl@48:audio failed with error -22
>>> [1.839141] TWL4030: HFCLK is not configured
>>
>> Hrm, the order looks a bit weird, it should be
>> TWL4030: HFCLK is not configured
>> twl4030-audio 4807.i2c:twl@48:audio: Invalid audio_mclk
>> twl4030-audio: probe of 4807.i2c:twl@48:audio failed with error -22
> 
> Indeed, but I see it is as listed:
> 
> root@letux:~# dmesg|fgrep -i twl4030
> [1.787200] twl4030_reg 4807.i2c:twl@48:regulator-vmmc2: can't register VMMC2, -22
> [1.795745] twl4030_reg: probe of 4807.i2c:twl@48:regulator-vmmc2 failed with error -22
> [1.840789] TWL4030: HFCLK is not configured
> [1.845977] twl4030-audio 4807.i2c:twl@48:audio: Invalid audio_mclk
> [1.852935] twl4030-audio: probe of 4807.i2c:twl@48:audio failed with error -22
> [6.764160] twl4030_madc 4807.i2c:twl@48:madc: 4807.i2c:twl@48:madc supply vusb3v1 not found, using dummy regulator
> [6.872253] input: twl4030_pwrbutton as /devices/platform/6800.ocp/4807.i2c/i2c-0/0-0048/4807.i2c:twl@48:pwrbutton/input/input2
> [6.997192] twl4030_gpio twl4030-gpio: can't dispatch IRQs from modules
> [7.120666] twl4030_usb 4807.i2c:twl@48:twl4030-usb: Initialized TWL4030 USB module
> [8.176147] omap-twl4030 sound: ASoC: CODEC DAI twl4030-hifi not registered
> [8.183441] omap-twl4030 sound: devm_snd_soc_register_card() failed: -517
> [8.267120] omap-twl4030 sound: ASoC: CODEC DAI twl4030-hifi not registered
> [8.280975] omap-twl4030 sound: devm_snd_soc_register_card() failed: -517
> [8.388366] omap-twl4030 sound: ASoC: CODEC DAI twl4030-hifi not registered
> [8.404113] omap-twl4030 sound: devm_snd_soc_register_card() failed: -517
> [9.250274] omap-twl4030 sound: ASoC: CODEC DAI twl4030-hifi not registered
> [9.264312] omap-twl4030 sound: devm_snd_soc_register_card() failed: -517
> [9.653381] omap-twl4030 sound: ASoC: CODEC DAI twl4030-hifi not registered
> [9.664123] omap-twl4030 sound: devm_snd_soc_register_card() failed: -517
> root@letux:~# 
> 
>>
>> In twl4030_audio_probe() we try to get the HFCLK rate via
>> twl_get_hfclk_rate(), which is reading it with:
>> twl_i2c_read_u8(TWL_MODULE_PM_MASTER, &ctrl, R_CFG_BOOT);
>>
>>> Those are not visible in v4.15.9. And I am not aware of any changes to the gta04 device tree.
>>>
>>> Do you know about this issue and a fix, before I start to bisect?
>>
>> The CFG_BOOT register of twl4030 is not configured correctly for some
>> reason?
>> The TRM of twl4030 states that the SW should program the HFCLK_FREQ
>> during boot sequence.
>>
>> If it is not done, MADC and USB should not work either. And all sorts of
>> other issues might happen.
> 
> Well, USB works for me... Strange.
> 
>>
>> So the boot loader is not configuring the HFCLK_FREQ, for me it does as
>> I have this line in the kernel log:
>> [1.472503] Skipping twl internal clock init and using bootloader
>> value (unknown osc rate)
> 
> root@letux:~# dmesg|fgrep -i Skipping
> [1.691619] Skipping twl internal clock init and using bootloader value (unknown osc rate)
> root@letux:~#
> 
> So I can see this as well.
> 
>>
>> In DT the twl should have the fck clock to not depend on the bootloader
>> for the HFCLK_FREQ settings.
> 
> I am not even modifying the bootloader when trying v4.15 and v4.16. I just
> swap uImage and kernel modules...
> 
> The interesting question is why it did work before (for years) and stopped with v4.16-rc1.
> 
>> We do not have that for beagle-xm for sure.
> 
> I have tried your new patches:
> 
> * first patch alone shows no change
> 
> root@letux:~# dmesg|fgrep -i twl4030
> [1.787322] twl4030_reg 4807.i2c:twl@48:regulator-vmmc2: can't 
> register VMMC2, -22
> [1.795867] twl4030_reg: probe of 4807.i2c:twl@48:regulator-vmmc2 
> failed with error -22
> [1.834228] 

Re: [PATCH] ring-buffer: Add set/clear_current_oom_origin() during allocations

2018-04-06 Thread Zhaoyang Huang
On Fri, Apr 6, 2018 at 7:36 AM, Joel Fernandes  wrote:
> Hi Steve,
>
> On Thu, Apr 5, 2018 at 12:57 PM, Joel Fernandes  wrote:
>> On Thu, Apr 5, 2018 at 6:43 AM, Steven Rostedt  wrote:
>>> On Wed, 4 Apr 2018 16:59:18 -0700
>>> Joel Fernandes  wrote:
>>>
>>>> Happy to try anything else, BTW when the si_mem_available check
>>>> enabled, this doesn't happen and the buffer_size_kb write fails
>>>> normally without hurting anything else.
>>>
>>> Can you remove the RETRY_MAYFAIL and see if you can try again? It may
>>> be that we just remove that, and if si_mem_available() is wrong, it
>>> will kill the process :-/ My original code would only add MAYFAIL if it
>>> was a kernel thread (which is why I created the mflags variable).
>>
>> Tried this. Dropping RETRY_MAYFAIL and the si_mem_available check
>> destabilized the system and brought it down (along with OOM killing
>> the victim).
>>
>> System hung for several seconds and then both the memory hog and bash
>> got killed.
>
> I think it's still OK to keep the OOM patch as a safeguard even though
> it's hard to test, and the si_mem_available check on its own seems sufficient.
> What do you think?
>
> thanks,
>
>
> - Joel
I also tested the patch on my system; it works fine with the previous script.

PS: The script I mentioned is the CTS test case POC 16_12 on Android 8.1.


Re: [PATCH 2/8] PCI: dwc: designware: Add support for endpoint mode

2018-04-06 Thread Kishon Vijay Abraham I
Hi,

On Wednesday 04 April 2018 03:50 PM, Gustavo Pimentel wrote:
> On 02/04/2018 06:34, Kishon Vijay Abraham I wrote:
>> Hi,
>>
>> On Wednesday 28 March 2018 05:08 PM, Gustavo Pimentel wrote:
>>> The PCIe controller dual mode is capable of operating in host mode as well
>>> as endpoint mode by configuration, therefore this patch aims to add
>>> endpoint mode support to the designware driver.
>>>
>>> Signed-off-by: Gustavo Pimentel 
>>> ---
>>>  drivers/pci/dwc/Kconfig   |  45 ++--
>>>  drivers/pci/dwc/pcie-designware-plat.c| 157 
>>> --
>>>  drivers/pci/endpoint/functions/pci-epf-test.c |   5 +
>>>  3 files changed, 187 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
>>> index 2f3f5c5..3fd7daf 100644
>>> --- a/drivers/pci/dwc/Kconfig
>>> +++ b/drivers/pci/dwc/Kconfig
>>> @@ -7,8 +7,7 @@ config PCIE_DW
>>>  
>>>  config PCIE_DW_HOST
>>>  bool
>>> -   depends on PCI
>>> -   depends on PCI_MSI_IRQ_DOMAIN
>>> +   depends on PCI && PCI_MSI_IRQ_DOMAIN
>>>  select PCIE_DW
>>>  
>>>  config PCIE_DW_EP
>>> @@ -52,16 +51,42 @@ config PCI_DRA7XX_EP
>>>  
>>>  config PCIE_DW_PLAT
>>> bool "Platform bus based DesignWare PCIe Controller"
>>> -   depends on PCI
>>> -   depends on PCI_MSI_IRQ_DOMAIN
>>> -   select PCIE_DW_HOST
>>> -   ---help---
>>> -This selects the DesignWare PCIe controller support. Select this if
>>> -you have a PCIe controller on Platform bus.
>>> +   help
>>> + There are two instances of PCIe controller in Designware IP.
>>> + This controller can work either as EP or RC. In order to enable
>>> + host-specific features PCIE_DW_PLAT_HOST must be selected and in
>>> + order to enable device-specific features PCIE_DW_PLAT_EP must be
>>> + selected.
>>>  
>>> -If you have a controller with this interface, say Y or M here.
>>> +config PCIE_DW_PLAT_HOST
>>> +   bool "Platform bus based DesignWare PCIe Controller - Host mode"
>>> +   depends on PCI && PCI_MSI_IRQ_DOMAIN
>>> +   select PCIE_DW_HOST
>>> +   select PCIE_DW_PLAT
>>> +   default y
>>> +   help
>>> + Enables support for the PCIe controller in the Designware IP to
>>> + work in host mode. There are two instances of PCIe controller in
>>> + Designware IP.
>>> + This controller can work either as EP or RC. In order to enable
>>> + host-specific features PCIE_DW_PLAT_HOST must be selected and in
>>> + order to enable device-specific features PCI_DW_PLAT_EP must be
>>> + selected.
>>>  
>>> -If unsure, say N.
>>> +config PCIE_DW_PLAT_EP
>>> +   bool "Platform bus based DesignWare PCIe Controller - Endpoint mode"
>>> +   depends on PCI && PCI_MSI_IRQ_DOMAIN
>>> +   depends on PCI_ENDPOINT
>>> +   select PCIE_DW_EP
>>> +   select PCIE_DW_PLAT
>>> +   help
>>> + Enables support for the PCIe controller in the Designware IP to
>>> + work in endpoint mode. There are two instances of PCIe controller
>>> + in Designware IP.
>>> + This controller can work either as EP or RC. In order to enable
>>> + host-specific features PCIE_DW_PLAT_HOST must be selected and in
>>> + order to enable device-specific features PCI_DW_PLAT_EP must be
>>> + selected.
>>>  
>>>  config PCI_EXYNOS
>>> bool "Samsung Exynos PCIe controller"
>>> diff --git a/drivers/pci/dwc/pcie-designware-plat.c b/drivers/pci/dwc/pcie-designware-plat.c
>>> index 5416aa8..921ab07 100644
>>> --- a/drivers/pci/dwc/pcie-designware-plat.c
>>> +++ b/drivers/pci/dwc/pcie-designware-plat.c
>>> @@ -12,19 +12,29 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  #include 
>>>  #include 
>>>  #include 
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  
>>>  #include "pcie-designware.h"
>>>  
>>>  struct dw_plat_pcie {
>>> -   struct dw_pcie  *pci;
>>> +   struct dw_pcie  *pci;
>>> +   struct regmap   *regmap;
>>> +   enum dw_pcie_device_mode        mode;
>>>  };
>>>  
>>> +struct dw_plat_pcie_of_data {
>>> +   enum dw_pcie_device_mode        mode;
>>> +};
>>> +
>>> +static const struct of_device_id dw_plat_pcie_of_match[];
>>> +
>>>  static int dw_plat_pcie_host_init(struct pcie_port *pp)
>>>  {
>>> struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
>>> @@ -42,9 +52,61 @@ static const struct dw_pcie_host_ops dw_plat_pcie_host_ops = {
>>> .host_init = dw_plat_pcie_host_init,
>>>  };
>>>  
>>> -static int dw_plat_add_pcie_port(struct pcie_port *pp,
>>> +static int dw_plat_pcie_establish_link(struct dw_pcie *pci)
>>> +{
>>> +   dw_pcie_ep_linkup(&pci->ep);
>>
>> .start_link ops is used incorrectly here. .start_link is used when all the
>> endpoint side configuration is done and "is ready" to establish a link with the
>> host. But dw_pcie_ep_linkup is used to inform the function devices that the
>> link "has been" established.
> 
> If I move the dw_pcie_ep_linkup call from the dw_plat_pcie_establish_link 

Re: [RFC] virtio: Use DMA MAP API for devices without an IOMMU

2018-04-06 Thread Christoph Hellwig
On Fri, Apr 06, 2018 at 08:23:10AM +0530, Anshuman Khandual wrote:
> On 04/06/2018 02:48 AM, Benjamin Herrenschmidt wrote:
> > On Thu, 2018-04-05 at 21:34 +0300, Michael S. Tsirkin wrote:
> >>> In this specific case, because that would make qemu expect an iommu,
> >>> and there isn't one.
> >>
> >>
> >> I think that you can set iommu_platform in qemu without an iommu.
> > 
> > No I mean the platform has one but it's not desirable for it to be used
> > due to the performance hit.
> 
> Also the only requirement is to bounce the I/O buffers through SWIOTLB
> implemented as DMA API which the virtio core understands. There is no
> need for an IOMMU to be involved for the device representation in this
> case IMHO.

This whole virtio translation issue is a mess.  I think we need to
switch it to the dma API, and then quirk the legacy case to always
use the direct mapping inside the dma API.


Re: [PATCH v9 04/10] jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC

2018-04-06 Thread Rafael J. Wysocki
On Friday, April 6, 2018 3:09:55 AM CEST Frederic Weisbecker wrote:
> On Wed, Apr 04, 2018 at 10:38:34AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > 
> > Since the subsequent changes will need a TICK_USEC definition
> > analogous to TICK_NSEC, rename the existing TICK_USEC as
> > USER_TICK_USEC, update its users and redefine TICK_USEC
> > accordingly.
> > 
> > Suggested-by: Peter Zijlstra 
> > Signed-off-by: Rafael J. Wysocki 
> > ---
> > 
> > v8 -> v9: No changes.
> > 
> > ---
> >  drivers/net/ethernet/sfc/mcdi.c |2 +-
> >  include/linux/jiffies.h |7 +--
> >  kernel/time/ntp.c   |2 +-
> >  3 files changed, 7 insertions(+), 4 deletions(-)
> > 
> > Index: linux-pm/include/linux/jiffies.h
> > ===
> > --- linux-pm.orig/include/linux/jiffies.h
> > +++ linux-pm/include/linux/jiffies.h
> > @@ -62,8 +62,11 @@ extern int register_refined_jiffies(long
> >  /* TICK_NSEC is the time between ticks in nsec assuming SHIFTED_HZ */
> >  #define TICK_NSEC ((NSEC_PER_SEC+HZ/2)/HZ)
> >  
> > -/* TICK_USEC is the time between ticks in usec assuming fake USER_HZ */
> > -#define TICK_USEC ((1000000UL + USER_HZ/2) / USER_HZ)
> > +/* TICK_USEC is the time between ticks in usec assuming SHIFTED_HZ */
> > +#define TICK_USEC ((USEC_PER_SEC + HZ/2) / HZ)
> 
> Nit: SHIFTED_HZ doesn't seem to exist anymore.

Well, fair enough, but that would need to be changed along with the TICK_NSEC
comment IMO.

> Reviewed-by: Frederic Weisbecker 

Thanks!
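
For reference, a small userspace sketch of the round-to-nearest division behind both macros. The helper names tick_usec() and tick_nsec() are made up for illustration; only the (X_PER_SEC + HZ/2) / HZ form comes from the quoted diff. Adding half the divisor before dividing rounds to the nearest integer instead of truncating, which only makes a difference for tick rates that do not divide one second evenly.

```c
#include <assert.h>

#define USEC_PER_SEC 1000000UL
#define NSEC_PER_SEC 1000000000UL

/* Tick length in microseconds, rounded to nearest, for a tick rate in Hz. */
static unsigned long tick_usec(unsigned long hz)
{
	assert(hz > 0);
	return (USEC_PER_SEC + hz / 2) / hz;
}

/* Same rounding scheme, in nanoseconds. */
static unsigned long tick_nsec(unsigned long hz)
{
	assert(hz > 0);
	return (NSEC_PER_SEC + hz / 2) / hz;
}
```

With hz = 1024, for example, tick_usec() yields 977 where plain truncating division would give 976; with hz = 100 (the usual USER_HZ) both give 10000.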



Re: [PATCH 4.4 15/97] genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs

2018-04-06 Thread Greg Kroah-Hartman
On Tue, Apr 03, 2018 at 03:17:16PM +0100, Ben Hutchings wrote:
> On Fri, 2018-03-23 at 10:54 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Hans de Goede 
> > 
> > 
> > [ Upstream commit 382bd4de61827dbaaf5fb4fb7b1f4be4a86505e7 ]
> > 
> > When requesting a shared irq with IRQF_TRIGGER_NONE then the irqaction
> > flags get filled with the trigger type from the irq_data:
> > 
> > if (!(new->flags & IRQF_TRIGGER_MASK))
> > new->flags |= irqd_get_trigger_type(&desc->irq_data);
> 
> The code above was added to __setup_irq() in 4.8, so I don't think this
> fix is needed in 3.18 or 4.4; and I suspect it might cause a regression
> there.

Already reverted, thanks!

greg k-h
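
A userspace sketch of the flag fill-in quoted above, for readers without the source at hand. fill_trigger() is a hypothetical helper, and the constants are assumed to mirror the IRQF_* bits in include/linux/interrupt.h; the real code operates on struct irqaction and irq_data inside __setup_irq().

```c
#include <assert.h>

/* Assumed to mirror the IRQF_* bits in include/linux/interrupt.h. */
#define IRQF_TRIGGER_NONE    0x00u
#define IRQF_TRIGGER_RISING  0x01u
#define IRQF_TRIGGER_FALLING 0x02u
#define IRQF_TRIGGER_HIGH    0x04u
#define IRQF_TRIGGER_LOW     0x08u
#define IRQF_TRIGGER_MASK    0x0fu
#define IRQF_SHARED          0x80u

/*
 * If the requester did not ask for a specific trigger, inherit the one
 * already configured for the line; otherwise leave the flags untouched.
 */
static unsigned int fill_trigger(unsigned int new_flags,
				 unsigned int configured_trigger)
{
	assert((configured_trigger & ~IRQF_TRIGGER_MASK) == 0);
	if (!(new_flags & IRQF_TRIGGER_MASK))
		new_flags |= configured_trigger;
	return new_flags;
}
```

A request with IRQF_SHARED and no trigger bits on a line configured as level-low comes back as IRQF_SHARED | IRQF_TRIGGER_LOW, while a request that already names a trigger is returned unchanged.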


Re: [4.9, 137/145] spi: bcm-qspi: shut up warning about cfi header inclusion

2018-04-06 Thread Arnd Bergmann
On Fri, Apr 6, 2018 at 9:09 AM, gre...@linuxfoundation.org wrote:
> On Tue, Apr 03, 2018 at 10:46:07AM -0700, Florian Fainelli wrote:
>> On 02/23/2018 10:27 AM, gre...@linuxfoundation.org wrote:
>> > 4.9-stable review patch.  If anyone has any objections, please let me know.
>> >
>> > --
>> >
>> > From: Arnd Bergmann 
>> >
>> > When CONFIG_MTD_CFI is disabled, we get a warning for this spi driver:
>> >
>> > include/linux/mtd/cfi.h:76:2: #warning No CONFIG_MTD_CFI_Ix selected. No NOR chip support can work. [-Werror=cpp]
>> >
>> > The problem here is a layering violation that was fixed in mainline kernels with
>> > a larger rework in commit 054e532f8f90 ("spi: bcm-qspi: Remove hardcoded settings
>> > and spi-nor.h dependency"). We can't really backport that to stable kernels, so
>> > this just adds a Kconfig dependency to make it either build cleanly or force it
>> > to be disabled.
>>
>> Sorry for noticing so late, but this appears to be bogus, there is no
>> MTD_NORFLASH symbol being defined in 4.9, in fact I can't find this
>> Kconfig symbol in any kernel version, so this effectively results in the
>> driver no longer being selectable, so this sure does silence the warning.
>>
>> Arnd, should we just send reverts of this patch for the affected kernel
>> or should we be defining MTD_NORFLASH somehow? Am I missing something here?
>
> I'm going to revert this patch for now, thanks.

Yes, please do. Sorry for missing Florian's bug report. I looked at it again
and found that it was never intended for backports to 4.9, as the regression
addressed by the patch was originally merged into 4.14-rc1.

   Arnd


Re: regression, imx6 and sgtl5000 sound problems

2018-04-06 Thread Nicolin Chen
On Fri, Apr 06, 2018 at 07:46:37AM +0300, Mika Penttilä wrote:
> 
> With a recent merge to pre-4.17-rc, audio stopped working (or it's audible but
> way too slow).
> imx6q + sgtl5000 codec.

Could you please be more specific about your test cases?

Which board? Which side is the DAI master? Which sample rate?

> Maybe one of the soc/fsl changes is causing this.

There are quite a few clean-up patches of SSI driver being merged.
Would you please try to revert/bisect the changes of fsl_ssi driver
so as to figure out which one breaks your test cases?

If there is a regression because of one of the changes, I will need
to fix it.

Thanks
Nicolin


Re: [PATCH v9 05/10] cpuidle: Return nohz hint from cpuidle_select()

2018-04-06 Thread Rafael J. Wysocki
On Friday, April 6, 2018 4:44:14 AM CEST Frederic Weisbecker wrote:
> On Wed, Apr 04, 2018 at 10:39:50AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > Index: linux-pm/kernel/time/tick-sched.c
> > ===
> > --- linux-pm.orig/kernel/time/tick-sched.c
> > +++ linux-pm/kernel/time/tick-sched.c
> > @@ -991,6 +991,20 @@ void tick_nohz_irq_exit(void)
> >  }
> >  
> >  /**
> > + * tick_nohz_idle_got_tick - Check whether or not the tick handler has run
> > + */
> > +bool tick_nohz_idle_got_tick(void)
> > +{
> > +   struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
> > +
> > +   if (ts->inidle > 1) {
> > +   ts->inidle = 1;
> > +   return true;
> > +   }
> > +   return false;
> > +}
> > +
> > +/**
> >   * tick_nohz_get_sleep_length - return the length of the current sleep
> >   *
> >   * Called from power state control code with interrupts disabled
> > @@ -1101,6 +1115,9 @@ static void tick_nohz_handler(struct clo
> > struct pt_regs *regs = get_irq_regs();
> > ktime_t now = ktime_get();
> >  
> > +   if (ts->inidle)
> > +   ts->inidle = 2;
> > +
> 
> You can move that to tick_sched_do_timer() to avoid code duplication.

Right.

> Also these constants are very opaque. And even with proper symbols it wouldn't look
> right to extend ts->inidle that way.

Well, this was Peter's idea. :-)

> Perhaps you should add a field such as ts->got_idle_tick under the boolean fields
> after the below patch:

OK, but at this point I'd prefer to make such changes on top of the existing
set, because that's got quite some testing coverage already and honestly this
is cosmetics in my view (albeit important).

> --
> From c7b2ca5a4c512517ddfeb9f922d5999f82542ced Mon Sep 17 00:00:00 2001
> From: Frederic Weisbecker 
> Date: Fri, 6 Apr 2018 04:32:37 +0200
> Subject: [PATCH] nohz: Gather tick_sched booleans under a common flag field
> 
> This optimizes the space and leaves plenty of room for further flags.
> 
> Signed-off-by: Frederic Weisbecker 
> ---
>  kernel/time/tick-sched.h | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
> index 954b43d..38f24dc 100644
> --- a/kernel/time/tick-sched.h
> +++ b/kernel/time/tick-sched.h
> @@ -45,14 +45,17 @@ struct tick_sched {
>   struct hrtimer  sched_timer;
>   unsigned long   check_clocks;
>   enum tick_nohz_mode nohz_mode;
> +
> + unsigned int    inidle          : 1;
> + unsigned int    tick_stopped    : 1;
> + unsigned int    idle_active     : 1;
> + unsigned int    do_timer_last   : 1;
> +
>   ktime_t last_tick;
>   ktime_t next_tick;
> - int inidle;
> - int tick_stopped;
>   unsigned long   idle_jiffies;
>   unsigned long   idle_calls;
>   unsigned long   idle_sleeps;
> - int idle_active;
>   ktime_t idle_entrytime;
>   ktime_t idle_waketime;
>   ktime_t idle_exittime;
> @@ -62,7 +65,6 @@ struct tick_sched {
>   unsigned long   last_jiffies;
>   u64 next_timer;
>   ktime_t idle_expires;
> - int do_timer_last;
>   atomic_t        tick_dep_mask;
>  };
>  
> 
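
A userspace sketch of what the quoted patch is after, combined with the got_idle_tick suggestion from the review. The struct and helper names here (flags_as_ints, flags_as_bits, toy_tick_handler(), toy_idle_got_tick()) are hypothetical; only the field names come from the thread. It assumes a typical ABI where four 1-bit fields share a single unsigned int.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: every boolean occupies a full int. */
struct flags_as_ints {
	int inidle;
	int tick_stopped;
	int idle_active;
	int do_timer_last;
};

/* After: the booleans share one word, with room for more flags. */
struct flags_as_bits {
	unsigned int inidle        : 1;
	unsigned int tick_stopped  : 1;
	unsigned int idle_active   : 1;
	unsigned int do_timer_last : 1;
	unsigned int got_idle_tick : 1;  /* field suggested in the review */
};

/* Tick handler side: remember that a tick arrived while idle. */
static void toy_tick_handler(struct flags_as_bits *ts)
{
	assert(ts);
	if (ts->inidle)
		ts->got_idle_tick = 1;
}

/* Idle loop side: test and clear the flag, leaving inidle untouched. */
static bool toy_idle_got_tick(struct flags_as_bits *ts)
{
	assert(ts);
	if (ts->got_idle_tick) {
		ts->got_idle_tick = 0;
		return true;
	}
	return false;
}
```

Note that a 1-bit inidle can no longer hold the value 2 used in the v9 patch, which is one reason a separate flag reads more naturally once the booleans are packed.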




Re: [PATCH 4.4 51/97] mfd: palmas: Reset the POWERHOLD mux during power off

2018-04-06 Thread Greg Kroah-Hartman
On Tue, Apr 03, 2018 at 09:49:14PM +0100, Ben Hutchings wrote:
> On Fri, 2018-03-23 at 10:54 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Keerthy 
> > 
> > 
> > [ Upstream commit 85fdaf8eb9bbec1f0f8a52fd5d85659d60738816 ]
> > 
> > POWERHOLD signal has higher priority  over the DEV_ON bit.
> > So power off will not happen if the POWERHOLD is held high.
> > Hence reset the MUX to GPIO_7 mode to release the POWERHOLD
> > and the DEV_ON bit to take effect to power off the PMIC.
> > 
> > PMIC Power off happens in dire situations like thermal shutdown
> > so irrespective of the POWERHOLD setting go ahead and turn off
> > the powerhold.  Currently poweroff is broken on boards that have
> > powerhold enabled. This fixes poweroff on those boards.
> [...]
> 
> This is not very useful by itself; I think you should pick these too:
> 
> [3.18]
> 0ea66f76ba17 Documentation: pinctrl: palmas: Add ti,palmas-powerhold-override property definition
> 7c62de5f3fc9 ARM: dts: dra7: Add power hold and power controller properties to palmas
> 
> [4.4]
> 0ea66f76ba17 Documentation: pinctrl: palmas: Add ti,palmas-powerhold-override property definition
> 1f166499ce00 ARM: dts: am57xx-beagle-x15-common: Add overide powerhold property
> - apply the changes in am57xx-beagle-x15.dts
> 7c62de5f3fc9 ARM: dts: dra7: Add power hold and power controller properties to palmas
> 
> [4.9]
> 0ea66f76ba17 Documentation: pinctrl: palmas: Add ti,palmas-powerhold-override property definition
> 1f166499ce00 ARM: dts: am57xx-beagle-x15-common: Add overide powerhold property
> 8804755bfb1f ARM: dts: am57xx-idk-common: Add overide powerhold property
> 7c62de5f3fc9 ARM: dts: dra7: Add power hold and power controller properties to palmas
> 
> None of the above are needed for 4.14 and 4.15, but they do have one
> more board that needed this property, so please pick this:
> 
> aac4619d028e ARM: dts: DRA76-EVM: Set powerhold property for tps65917

Many thanks for this, all now queued up.

greg k-h


Re: [PATCH] pstore: fix crypto dependencies without compression

2018-04-06 Thread Tobias Regnery
On 06.04.18, Arnd Bergmann wrote:
> On Fri, Apr 6, 2018 at 8:38 AM, Tobias Regnery  
> wrote:
> > Commit 58eb5b670747 ("pstore: fix crypto dependencies") fixed up the crypto
> > dependencies but missed the case when no compression is selected.
> >
> > With CONFIG_PSTORE=y, CONFIG_PSTORE_COMPRESS=n  and CONFIG_CRYPTO=m we see
> > the following link error:
> >
> > fs/pstore/platform.o: In function `pstore_register':
> > (.text+0x1b1): undefined reference to `crypto_has_alg'
> > (.text+0x205): undefined reference to `crypto_alloc_base'
> > fs/pstore/platform.o: In function `pstore_unregister':
> > (.text+0x3b0): undefined reference to `crypto_destroy_tfm'
> >
> > Fix this by selecting CONFIG_CRYPTO unconditionally.
> >
> > Fixes: 58eb5b670747 ("pstore: fix crypto dependencies")
> > Signed-off-by: Tobias Regnery 
> 
> Thanks, I wonder how I missed this one. Thanks for fixing it up.
> It's a bit unfortunate that it now disallows the otherwise valid
> CONFIG_PSTORE=y, CONFIG_PSTORE_COMPRESS=n
> and CONFIG_CRYPTO=n configuration, though.
> 
> Could we do this by making the calls compile-time configured
> in the pstore code instead? Please try the untested version
> below.
> 
> Arnd
> 
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index 1143ef351c58..dc720573fd53 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -258,7 +258,7 @@ static int pstore_decompress(void *in, void *out,
> 
>  static void allocate_buf_for_compression(void)
>  {
> -   if (!zbackend)
> +   if (!IS_ENABLED(CONFIG_PSTORE_COMPRESS) || !zbackend)
> return;
> 
> if (!crypto_has_comp(zbackend->name, 0, 0)) {
> @@ -287,7 +287,7 @@ static void allocate_buf_for_compression(void)
> 
>  static void free_buf_for_compression(void)
>  {
> -   if (!IS_ERR_OR_NULL(tfm))
> +   if (IS_ENABLED(CONFIG_PSTORE_COMPRESS) && !IS_ERR_OR_NULL(tfm))
> crypto_free_comp(tfm);
> kfree(big_oops_buf);
> big_oops_buf = NULL;

Hi Arnd,

this seems to be the better fix, the link error goes away with this change.
Thanks for the suggestion, I will send an updated patch.

--
Tobias
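
The compile-time check discussed above can be sketched in userspace C. This is only an illustration under stated assumptions: SK_IS_ENABLED is a simplified stand-in for the kernel's IS_ENABLED() (built-in case only), and every demo_* name and CONFIG_DEMO_* symbol is hypothetical. Presumably the link error goes away in the kernel because the condition becomes a constant, which lets the compiler discard the crypto calls; here the backend is a stub, so the sketch only shows that it is never reached.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's IS_ENABLED() (built-in case only). */
#define SK_ARG_PLACEHOLDER_1 0,
#define sk_take_second_arg(ignored, val, ...) val
#define sk_is_defined(x) sk_is_defined_1(x)
#define sk_is_defined_1(val) sk_is_defined_2(SK_ARG_PLACEHOLDER_##val)
#define sk_is_defined_2(arg1_or_junk) sk_take_second_arg(arg1_or_junk 1, 0, 0)
#define SK_IS_ENABLED(option) sk_is_defined(option)

/* Hypothetical configuration: the "pstore" option is on, "compress" is off. */
#define CONFIG_DEMO_PSTORE 1
/* CONFIG_DEMO_COMPRESS is deliberately left undefined. */

static int backend_calls;                       /* counts calls into the stub */
static const char *demo_zbackend = "deflate";   /* hypothetical backend name */

/* Stub standing in for a crypto API call that may not be linked in. */
static int demo_crypto_has_comp(const char *name)
{
	assert(name != NULL);
	backend_calls++;
	return 1;
}

/* Mirrors the shape of the patched early return. */
void demo_allocate_buf_for_compression(void)
{
	if (!SK_IS_ENABLED(CONFIG_DEMO_COMPRESS) || !demo_zbackend)
		return;

	demo_crypto_has_comp(demo_zbackend);
}

int demo_pstore_enabled(void)   { return SK_IS_ENABLED(CONFIG_DEMO_PSTORE); }
int demo_compress_enabled(void) { return SK_IS_ENABLED(CONFIG_DEMO_COMPRESS); }
int demo_backend_calls(void)    { return backend_calls; }
```

Unlike an #ifdef block, the guarded code is still parsed and type-checked in every configuration, which is the usual argument for this style.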


Re: [PATCH 4.4 52/97] mtip32xx: use runtime tag to initialize command header

2018-04-06 Thread Greg Kroah-Hartman
On Tue, Apr 03, 2018 at 10:01:05PM +0100, Ben Hutchings wrote:
> On Fri, 2018-03-23 at 10:54 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Ming Lei 
> > 
> > 
> > [ Upstream commit a4e84aae8139aca9fbfbced1f45c51ca81b57488 ]
> > 
> > mtip32xx supposes that 'request_idx' passed to .init_request()
> > is tag of the request, and use that as request's tag to initialize
> > command header.
> > 
> > After MQ IO scheduler is in, request tag assigned isn't same with
> > the request index anymore, so cause strange hardware failure on
> > mtip32xx, even whole system panic is triggered.
> [...]
> 
> MQ IO schedulers were introduced in 4.11, so this shouldn't be needed
> in older branches.  It also causes a performance regression (fixed
> upstream).  Please revert this for 4.4 and 4.9.

Now reverted, thanks.

greg k-h


Re: [PATCH 4.4 63/97] md/raid10: skip spare disk as first disk

2018-04-06 Thread Greg Kroah-Hartman
On Tue, Apr 03, 2018 at 10:32:09PM +0100, Ben Hutchings wrote:
> On Fri, 2018-03-23 at 10:54 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Shaohua Li 
> > 
> > 
> > [ Upstream commit b506335e5d2b4ec687dde392a3bdbf7601778f1d ]
> > 
> > Commit 6f287ca(md/raid10: reset the 'first' at the end of loop) ignores
> > a case in reshape, the first rdev could be a spare disk, which shouldn't
> > be accounted as the first disk since it doesn't include the offset info.
> > 
> > Fix: 6f287ca(md/raid10: reset the 'first' at the end of loop)
> 
> But that commit hasn't been applied to 4.4-stable.  It probably should
> be, since it fixes another instance of the problem in the run()
> function.  Take care not to add the wrongly placed assignment
> in raid10_start_reshape().

Thanks, now fixed up.

greg k-h


Re: [PATCH 4/8] Input: elantech - split device info into a separate structure

2018-04-06 Thread kbuild test robot
Hi Benjamin,

I love your patch! Perhaps something to improve:

[auto build test WARNING on next-20180405]
[cannot apply to input/next v4.16 v4.16-rc7 v4.16-rc6 v4.16]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Benjamin-Tissoires/Input-support-for-latest-Lenovo-thinkpads-series-80/20180406-110729
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> drivers/input/mouse/elantech.c:1665:5: sparse: symbol 'elantech_query_info' was not declared. Should it be static?
   drivers/input/mouse/elantech.c: In function 'elantech_init':
   drivers/input/mouse/elantech.c:1839:9: warning: 'error' may be used uninitialized in this function [-Wmaybe-uninitialized]
 return error;
^

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


[RFC PATCH] Input: elantech_query_info() can be static

2018-04-06 Thread kbuild test robot

Fixes: 3fedcdbcc4b9 ("Input: elantech - split device info into a separate 
structure")
Signed-off-by: Fengguang Wu 
---
 elantech.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/mouse/elantech.c b/drivers/input/mouse/elantech.c
index d485664..980dfd7 100644
--- a/drivers/input/mouse/elantech.c
+++ b/drivers/input/mouse/elantech.c
@@ -1662,8 +1662,8 @@ static int elantech_set_properties(struct 
elantech_device_info *info)
return 0;
 }
 
-int elantech_query_info(struct psmouse *psmouse,
-   struct elantech_device_info *info)
+static int elantech_query_info(struct psmouse *psmouse,
+  struct elantech_device_info *info)
 {
unsigned char param[3];
 


Re: [PATCH 4.4 68/97] net: hns: fix ethtool_get_strings overflow in hns driver

2018-04-06 Thread Greg Kroah-Hartman
On Tue, Apr 03, 2018 at 10:39:43PM +0100, Ben Hutchings wrote:
> On Fri, 2018-03-23 at 10:54 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Timmy Li 
> > 
> > 
> > [ Upstream commit 412b65d15a7f8a93794653968308fc100f2aa87c ]
> [...]
> 
> This is not a correct fix; please also apply:
> 
> commit d61d263c8d82db7c4404a29ebc29674b1c0c05c9
> Author: Matthias Brugger 
> Date:   Thu Mar 15 17:54:20 2018 +0100
> 
> net: hns: Fix ethtool private flags

Thanks, now queued up.

greg k-h


Re: [PATCH v3 3/6] spi: sun6i: restrict transfer length in PIO-mode

2018-04-06 Thread Maxime Ripard
On Thu, Apr 05, 2018 at 04:44:16PM +0300, Sergey Suloev wrote:
> On 04/05/2018 04:17 PM, Mark Brown wrote:
> > On Thu, Apr 05, 2018 at 12:59:35PM +0300, Sergey Suloev wrote:
> > > On 04/05/2018 12:19 PM, Maxime Ripard wrote:
> > > > The point of that patch was precisely to allow to send more data than
> > > > the FIFO. You're breaking that behaviour without any justification,
> > > > and this is not ok.
> > > I am sorry, but you can't. That's a hardware limitation.
> > > Are you positive about that?  Normally you can add things to hardware
> > > FIFOs while they're being drained, so long as you can keep data
> > > flowing in at least as fast as it's being consumed.
> 
> Well, normally yes, but this is not the case with the hardware that I own.
> My A20 (BPiM1+) and A31 (BPiM2) boards behave differently. With a transfer
> larger than the FIFO, the TC interrupt never happens.

Because you're not supposed to set up a transfer larger than the FIFO.
You first set up a transfer the size of the FIFO, and then, when it is
(or starts to be) depleted, you fill it up again.

That's the point of the patch you're reverting, and if it doesn't
work, you should make it work and not simply revert it.

Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com
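
The fill-then-refill scheme described above can be sketched as a small userspace simulation. This is only an illustration, not the sun6i driver: the FIFO depth of 64 bytes, the half-FIFO drain step and all demo_* names are assumptions, and the "hardware" is simulated by draining into a plain buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FIFO_DEPTH 64	/* hypothetical FIFO depth in bytes */

struct demo_fifo {
	uint8_t data[DEMO_FIFO_DEPTH];
	size_t head;	/* index of the oldest byte */
	size_t count;	/* bytes currently queued */
};

/* Queue as many bytes as fit; returns how many were accepted. */
static size_t demo_fifo_fill(struct demo_fifo *f, const uint8_t *src, size_t len)
{
	size_t n = 0;

	while (n < len && f->count < DEMO_FIFO_DEPTH) {
		f->data[(f->head + f->count) % DEMO_FIFO_DEPTH] = src[n++];
		f->count++;
	}
	return n;
}

/* Simulated hardware: shift out up to max bytes onto the "wire". */
static size_t demo_fifo_drain(struct demo_fifo *f, uint8_t *dst, size_t max)
{
	size_t n = 0;

	while (n < max && f->count > 0) {
		dst[n++] = f->data[f->head];
		f->head = (f->head + 1) % DEMO_FIFO_DEPTH;
		f->count--;
	}
	return n;
}

/* Send len bytes through the FIFO even when len exceeds its depth. */
size_t demo_pio_transfer(const uint8_t *tx, size_t len, uint8_t *wire)
{
	struct demo_fifo f;
	size_t queued, sent = 0;

	assert(tx != NULL && wire != NULL);
	memset(&f, 0, sizeof(f));

	/* Initial fill: at most one FIFO's worth. */
	queued = demo_fifo_fill(&f, tx, len);

	while (sent < len) {
		/* The hardware consumes part of the FIFO ... */
		sent += demo_fifo_drain(&f, wire + sent, DEMO_FIFO_DEPTH / 2);
		/* ... and the "FIFO getting empty" event triggers a refill. */
		queued += demo_fifo_fill(&f, tx + queued, len - queued);
	}
	return sent;
}
```

In a real driver the refill would run from an interrupt handler rather than a loop; the loop only keeps the sketch deterministic.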


[PATCH v2] pstore: fix crypto dependencies without compression

2018-04-06 Thread Tobias Regnery
Commit 58eb5b670747 ("pstore: fix crypto dependencies") fixed up the crypto
dependencies but missed the case when no compression is selected.

With CONFIG_PSTORE=y, CONFIG_PSTORE_COMPRESS=n  and CONFIG_CRYPTO=m we see
the following link error:

fs/pstore/platform.o: In function `pstore_register':
(.text+0x1b1): undefined reference to `crypto_has_alg'
(.text+0x205): undefined reference to `crypto_alloc_base'
fs/pstore/platform.o: In function `pstore_unregister':
(.text+0x3b0): undefined reference to `crypto_destroy_tfm'

Fix this by checking at compile-time if CONFIG_PSTORE_COMPRESS is enabled.

Fixes: 58eb5b670747 ("pstore: fix crypto dependencies")
Signed-off-by: Tobias Regnery 
---
v2: check the config at compile-time rather than change the 
kconfig-dependency as suggested by Arnd.

---
 fs/pstore/platform.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 1143ef351c58..dc720573fd53 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -258,7 +258,7 @@ static int pstore_decompress(void *in, void *out,
 
 static void allocate_buf_for_compression(void)
 {
-   if (!zbackend)
+   if (!IS_ENABLED(CONFIG_PSTORE_COMPRESS) || !zbackend)
return;
 
if (!crypto_has_comp(zbackend->name, 0, 0)) {
@@ -287,7 +287,7 @@ static void allocate_buf_for_compression(void)
 
 static void free_buf_for_compression(void)
 {
-   if (!IS_ERR_OR_NULL(tfm))
+   if (IS_ENABLED(CONFIG_PSTORE_COMPRESS) && !IS_ERR_OR_NULL(tfm))
crypto_free_comp(tfm);
kfree(big_oops_buf);
big_oops_buf = NULL;
-- 
2.16.3



Re: [RFC PATCH 1/1 v2] vmscan: Support multiple kswapd threads per node

2018-04-06 Thread Michal Hocko
On Thu 05-04-18 23:25:14, Buddy Lumpkin wrote:
> 
> > On Apr 4, 2018, at 11:10 PM, Michal Hocko  wrote:
> > 
> > On Wed 04-04-18 21:49:54, Buddy Lumpkin wrote:
> >> v2:
> >> - Make update_kswapd_threads_node less racy
> >> - Handle locking for case where CONFIG_MEMORY_HOTPLUG=n
> > 
> > Please do not repost with such small changes. It is much more
> > important to sort out the big picture first and only then deal with
> > minor implementation details. The more versions you post the more
> > fragmented and messy the discussion will become.
> > 
> > You will have to be patient because this is a rather big change and it
> > will take _quite_ some time to get sorted.
> > 
> > Thanks!
> > -- 
> > Michal Hocko
> > SUSE Labs
> > 
> 
> 
> Sorry about that, I actually had three people review my code internally,
> then I managed to send out an old version. 100% guilty of submitting
> code when I needed sleep. As for the change, that was in response
> to a request from Andrew to make the update function less racy.
> 
> Should I resend a correct v2 now that the thread exists?

Let's just discuss open questions for now. Specifics of the code are the
least interesting at this stage.

If you want some help with the code review, you can put it somewhere in
the git tree and send a reference for those who are interested.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 4.4 92/97] ip6_vti: adjust vti mtu according to mtu of lower device

2018-04-06 Thread Greg Kroah-Hartman
On Thu, Apr 05, 2018 at 05:36:36PM +0200, Stefano Brivio wrote:
> On Wed, 04 Apr 2018 01:09:16 +0100
> Ben Hutchings  wrote:
> 
> > On Fri, 2018-03-23 at 10:55 +0100, Greg Kroah-Hartman wrote:
> > > 4.4-stable review patch.  If anyone has any objections, please let me 
> > > know.
> > > 
> > > --
> > > 
> > > From: Alexey Kodanev 
> > > 
> > > 
> > > [ Upstream commit 53c81e95df1793933f87748d36070a721f6cb287 ]  
> > [...]
> > 
> > There are a couple of follow-ups to this:
> > 
> > c6741fbed6dc vti6: Properly adjust vti6 MTU from MTU of lower device
> > 7a67e69a339a vti6: Keep set MTU on link creation or change, validate it
> > 
> > The second of those will fail to build on branches older than 4.10
> > though.  It might be better to revert this one instead.
> 
> Thanks Ben for spotting this.
> 
> Actually,
>   53c81e95df17 ("ip6_vti: adjust vti mtu according to mtu of lower device")
> alone improves things already, despite being "fixed" by
>   c6741fbed6dc ("vti6: Properly adjust vti6 MTU from MTU of lower device")
> 
> With just 53c81e95df17 the MTU of a vti6 interface will be somewhat
> linked to the MTU of the lower layer, but will be underestimated.
> 
> With c6741fbed6dc the calculation of MTU from lower layer will be
> accurate instead.
> 
> However, without
>   7a67e69a339a ("vti6: Keep set MTU on link creation or change, validate it")
> but with
>   53c81e95df17 ("ip6_vti: adjust vti mtu according to mtu of lower device")
> assignment of MTU on link change is discarded, so this would actually
> introduce a bug.
> 
> Fixing
>   7a67e69a339a ("vti6: Keep set MTU on link creation or change, validate it")
> for 4.4 up to 4.9 is trivial, we simply need to adjust for the lack of
>   b96f9afee4eb ("ipv4/6: use core net MTU range checking")
> and reflect the change introduced by
>   f8a554b4aa96 ("vti6: Fix dev->max_mtu setting").
> 
> So, Greg, here comes the backport of
>   7a67e69a339a ("vti6: Keep set MTU on link creation or change, validate it")
> based on latest linux-4.4.y branch, in case you want to keep the existing
> change and add the follow-ups on top. Please let me know if I should submit
> it formally.

Ick, that's a mess.  How about I just revert this patch from the stable
trees, and then someone sends me either a list of git commits to apply,
or patches, for the different trees if it's really needed?

thanks,

greg k-h


Re: make xmldocs failed with error after 4.17 merge period

2018-04-06 Thread Heikki Krogerus
On Fri, Apr 06, 2018 at 12:38:42PM +0900, Masanari Iida wrote:
> After merging the following patch during the 4.17 merge window,
> make xmldocs started to fail with an error.
> 
>  [bdecb33af34f79cbfbb656661210f77c8b8b5b5f]
> usb: typec: API for controlling USB Type-C Multiplexers
> 
> Error messages.
> reST markup error:
> /home/iida/Repo/linux-2.6/Documentation/driver-api/usb/typec.rst:215:
> (SEVERE/4) Unexpected section title or transition.
> 
> 
> Documentation/Makefile:93: recipe for target 'xmldocs' failed
> make[1]: *** [xmldocs] Error 1
> Makefile:1527: recipe for target 'xmldocs' failed
> make: *** [xmldocs] Error 2
> 
> $
> 
> An ASCII graphic in typec.rst causes the error.

Thanks for the report. I'm going to propose that we fix this by
marking the ASCII art as a comment:

diff --git a/Documentation/driver-api/usb/typec.rst 
b/Documentation/driver-api/usb/typec.rst
index feb31946490b..972c11bf4141 100644
--- a/Documentation/driver-api/usb/typec.rst
+++ b/Documentation/driver-api/usb/typec.rst
@@ -212,7 +212,7 @@ port drivers can use USB Role Class API with those.

 Illustration of the muxes behind a connector that supports an alternate mode:

- 
+..   
  |   Connector  |
  
 | |

I hope that works.


Br,

-- 
heikki


Re: [PATCH v3] scsi: Introduce sdev_printk_ratelimited to throttle frequent printk

2018-04-06 Thread Petr Mladek
On Tue 2018-04-03 14:04:40, Wen Yang wrote:
> There would be so many same lines printed by frequent printk if one
> disk went wrong, like,
> [  546.185242] sd 0:1:0:0: rejecting I/O to offline device
> [  546.185258] sd 0:1:0:0: rejecting I/O to offline device
> [  546.185280] sd 0:1:0:0: rejecting I/O to offline device
> [  546.185307] sd 0:1:0:0: rejecting I/O to offline device
> [  546.185334] sd 0:1:0:0: rejecting I/O to offline device
> [  546.185364] sd 0:1:0:0: rejecting I/O to offline device
> [  546.185390] sd 0:1:0:0: rejecting I/O to offline device
> [  546.185410] sd 0:1:0:0: rejecting I/O to offline device
> For slow serial console, the frequent printk may be blocked for a
> long time, and if any spin_lock has been acquired before the printk
> like in scsi_request_fn, watchdog could be triggered.
> 
> Related discussion can be found here,
> https://bugzilla.kernel.org/show_bug.cgi?id=199003
> And Petr brought the idea to throttle the frequent printk, it's
> useless to print the same lines frequently after all.
> 
> v2: fix some typos
> v3: limit the print only for the same device
> 
> Suggested-by: Petr Mladek 
> Suggested-by: Sergey Senozhatsky 
> Signed-off-by: Wen Yang 
> Signed-off-by: Jiang Biao 
> Signed-off-by: Tan Hu 
> Reviewed-by: Bart Van Assche 
> CC: BartVanAssche 
> CC: Petr Mladek 
> CC: Sergey Senozhatsky 
> CC: Martin K. Petersen 
> CC: "James E.J. Bottomley" 
> CC: Tejun Heo 
> CC: JasonYan 
> ---
>  drivers/scsi/scsi_lib.c| 6 +++---
>  drivers/scsi/scsi_scan.c   | 3 +++
>  include/scsi/scsi_device.h | 8 
>  3 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index c84f931..f77e801 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1301,7 +1301,7 @@ static int scsi_setup_cmnd(struct scsi_device *sdev, 
> struct request *req)
>* commands.  The device must be brought online
>* before trying any recovery commands.
>*/
> - sdev_printk(KERN_ERR, sdev,
> + sdev_printk_ratelimited(KERN_ERR, sdev,
>   "rejecting I/O to offline device\n");
>   ret = BLKPREP_KILL;
>   break;
> @@ -1310,7 +1310,7 @@ static int scsi_setup_cmnd(struct scsi_device *sdev, 
> struct request *req)
>* If the device is fully deleted, we refuse to
>* process any commands as well.
>*/
> - sdev_printk(KERN_ERR, sdev,
> + sdev_printk_ratelimited(KERN_ERR, sdev,
>   "rejecting I/O to dead device\n");
>   ret = BLKPREP_KILL;
>   break;
> @@ -1802,7 +1802,7 @@ static void scsi_request_fn(struct request_queue *q)
>   break;
>  
>   if (unlikely(!scsi_device_online(sdev))) {
> - sdev_printk(KERN_ERR, sdev,
> + sdev_printk_ratelimited(KERN_ERR, sdev,
>   "rejecting I/O to offline device\n");
>   scsi_kill_request(req, q);
>   continue;
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 0880d97..a6da935 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -288,6 +288,9 @@ static struct scsi_device *scsi_alloc_sdev(struct 
> scsi_target *starget,
>   scsi_change_queue_depth(sdev, sdev->host->cmd_per_lun ?
>   sdev->host->cmd_per_lun : 1);
>  
> + /* Enable message ratelimiting. Default is 10 messages per 5 secs. */
> + ratelimit_state_init(&sdev->sdev_ratelimit_state,
> + DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST);

This makes the ratelimiting per-device, but it adds another problem:
several unrelated messages now share the same ratelimit data. A loop
that keeps printing one message might prevent people from seeing the
others.

One question is whether we really need to ratelimit all three messages.

Another question is whether we are really printing all the messages in
a single loop without releasing the spin lock. If so, I wonder what
event causes the loop to finish. If that event is independent of the
printk, then ratelimiting the messages need not help to avoid the
softlockup. I mean that we might cycle faster without the printk, but
that does not mean the event would unblock the loop any sooner.

Best Regards,
Petr

>   scsi_sysfs_device_initialize(sdev);
>  
>   if (shost->hostt->slave_alloc) {
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index 7ae177c..f1db7f3 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -215,6 +215,8 @@ struct scsi_device {
>   struct device   sdev_gendev,
>   sdev_dev;
>  
> +  
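
The concern about unrelated messages sharing one ratelimit state can be sketched in userspace C. This is a simplified model, not the kernel's ratelimit code: time is passed in explicitly, all demo_* names are hypothetical, and the limit of 10 messages per 5 seconds is taken from the comment in the quoted patch.

```c
#include <assert.h>

/* Simplified stand-in for a ratelimit state (hypothetical). */
struct demo_ratelimit {
	long interval;	/* window length in seconds */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages let through in this window */
};

void demo_ratelimit_init(struct demo_ratelimit *rs, long interval, int burst)
{
	assert(interval > 0 && burst > 0);
	rs->interval = interval;
	rs->burst = burst;
	rs->begin = 0;
	rs->printed = 0;
}

/* Returns 1 if a message may be printed at time now, 0 if suppressed. */
int demo_ratelimit_allow(struct demo_ratelimit *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* A new window starts; forget the old count. */
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return 0;
	rs->printed++;
	return 1;
}

/*
 * Print message A `flood` times at t=0, then try message B once at t=1.
 * With shared != 0 both messages use the same state, otherwise B has
 * its own. Returns 1 if B gets through.
 */
int demo_second_message_visible(int shared, int flood)
{
	struct demo_ratelimit a, b;
	int i;

	demo_ratelimit_init(&a, 5, 10);
	demo_ratelimit_init(&b, 5, 10);

	for (i = 0; i < flood; i++)
		demo_ratelimit_allow(&a, 0);

	return demo_ratelimit_allow(shared ? &a : &b, 1);
}
```

With a shared state, a flood of the first message uses up the whole burst and the second message is dropped; with one state per message it still gets through.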

Re: [PATCH 4.4 29/63] watchdog: hpwdt: fix unused variable warning

2018-04-06 Thread Greg Kroah-Hartman
On Tue, Mar 20, 2018 at 11:19:39PM +, Ben Hutchings wrote:
> On Sun, 2018-03-18 at 11:14 +0100, Greg Kroah-Hartman wrote:
> > On Fri, Mar 16, 2018 at 04:55:37PM -0600, Jerry Hoemann wrote:
> > > 
> > > Greg,
> > > 
> > > Sorry, if I'm missing something, but I see 3 patches for
> > > hpwdt queued up for 4.4:
> > > 
> > >   queue-4.4/watchdog-hpwdt-fix-unused-variable-warning.patch
> > >   queue-4.4/watchdog-hpwdt-smbios-check.patch
> > >   queue-4.4/watchdog-hpwdt-check-source-of-nmi.patch
> > > 
> > > 
> > > Shouldn't there also be a 4.4 patch for
> > > 
> > >   commit 2b3d89b402b085b08498e896c65267a145bed486
> > >   watchdog: hpwdt: Remove legacy NMI sourcing.
> > > 
> > > As there was for 4.15, 4.14, and 4.9?
> > 
> > It does not apply to the 4.4.y kernel branch.  If you feel it should be
> > there, please provide a working backport.
> > 
> > > commit 2b3d89b40 is the Spectre related patch.
> > 
> > If you look closely, not many Spectre-related patches are merged into
> > 4.4.y as no one has taken the time to do the backporting.  I thought
> > someone was working on this, but odds are they just moved to 4.9.y or
> > 4.14.y as everyone really should if they care about these issues with
> > their platforms.
> > 
> > So if you care about Spectre, I strongly recommend using 4.14.y or
> > newer.
> 
> I think you have most of the Spectre stuff aside from microcode
> supported fixes.  These are still missing on the 4.4 branch though:
> 
> 8fa80c503b48 nospec: Move array_index_nospec() parameter checking into 
> separate macro
> 1d91c1d2c80c nospec: Kill array_index_nospec_mask_check()
> 
> I think there may also be some extra uaccess functions that didn't get
> the nospec treatment.

I'm sure there are :(

I've queued up these 2 patches now, thanks.

greg k-h


[PATCH 2/2] clk: spear: fix WDT clock definition on SPEAr600

2018-04-06 Thread Quentin Schulz
There is no SPEAr600 device named "wdt". Instead, the description of the
WDT (watchdog) was recently added to the Device Tree, and the device
name is "fc88.wdt", so we should associate the WDT fixed rate clock
to this device name.

Signed-off-by: Quentin Schulz 
---
 drivers/clk/spear/spear6xx_clock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/spear/spear6xx_clock.c 
b/drivers/clk/spear/spear6xx_clock.c
index f911d9f..47810be 100644
--- a/drivers/clk/spear/spear6xx_clock.c
+++ b/drivers/clk/spear/spear6xx_clock.c
@@ -147,7 +147,7 @@ void __init spear6xx_clk_init(void __iomem *misc_base)
 
clk = clk_register_fixed_factor(NULL, "wdt_clk", "osc_30m_clk", 0, 1,
1);
-   clk_register_clkdev(clk, NULL, "wdt");
+   clk_register_clkdev(clk, NULL, "fc88.wdt");
 
/* clock derived from pll1 clk */
clk = clk_register_fixed_factor(NULL, "cpu_clk", "pll1_clk",
-- 
git-series 0.9.1
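
Why the registered name matters can be sketched with a toy lookup table. This is only an illustration of matching a clock by device name, not the clkdev implementation: the exact-match rule and all demo_* names are assumptions, and the strings are the ones used in the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy lookup entry: which clock belongs to which device name. */
struct demo_clk_lookup {
	const char *dev_id;	/* device name the clock is registered for */
	const char *clk_name;	/* name of the clock */
};

static const struct demo_clk_lookup demo_lookups[] = {
	{ "fc88.wdt", "wdt_clk" },	/* name as registered by the patch */
};

/* Returns the clock name for a device, or NULL if no entry matches. */
const char *demo_clk_get(const char *dev_name)
{
	size_t i;

	assert(dev_name != NULL);
	for (i = 0; i < sizeof(demo_lookups) / sizeof(demo_lookups[0]); i++) {
		if (strcmp(demo_lookups[i].dev_id, dev_name) == 0)
			return demo_lookups[i].clk_name;
	}
	return NULL;
}
```

A device that shows up under the Device Tree derived name would find nothing under the old "wdt" entry, which matches the problem the patch describes.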


[PATCH 1/2] ARM: SPEAr600: add DT description of the watchdog

2018-04-06 Thread Quentin Schulz
The SPEAr600 has a built-in watchdog which already has a DT binding
described in Documentation/devicetree/bindings/watchdog/sp805-wdt.txt.

Let's add the description of the watchdog device in the SPEAr600 Device
Tree.

Signed-off-by: Quentin Schulz 
---
 arch/arm/boot/dts/spear600.dtsi | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/spear600.dtsi b/arch/arm/boot/dts/spear600.dtsi
index 00166eb..d7c3096 100644
--- a/arch/arm/boot/dts/spear600.dtsi
+++ b/arch/arm/boot/dts/spear600.dtsi
@@ -213,6 +213,14 @@
interrupts = <6>;
status = "disabled";
};
+
+   wdt: wdt@fc88 {
+   compatible = "arm,sp805", "arm,primecell";
+   reg = <0xfc88 0x1000>;
+   interrupt-parent = <&vic1>;
+   interrupts = <20>;
+   status = "disabled";
+   };
};
};
 };

base-commit: 5e1dacccbb87780856219e29122f1eccec912ebb
-- 
git-series 0.9.1


[PATCH] qla2xxx: correctly shift host byte

2018-04-06 Thread Johannes Thumshirn
The host byte has to be shifted by 16 not 6.

Signed-off-by: Johannes Thumshirn 
---
 drivers/scsi/qla2xxx/qla_isr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 89f93ebd819d..49d67e1d571f 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -2368,7 +2368,7 @@ qla25xx_process_bidir_status_iocb(scsi_qla_host_t *vha, 
void *pkt,
bsg_job->reply_len = sizeof(struct fc_bsg_reply);
/* Always return DID_OK, bsg will send the vendor specific response
 * in this case only */
-   sp->done(sp, DID_OK << 6);
+   sp->done(sp, DID_OK << 16);
 
 }
 
-- 
2.16.2
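
The shift amount can be checked with a small sketch of the SCSI result word, where the host byte occupies bits 16-23. The demo_* helpers are hypothetical; the values 0x00 for DID_OK and 0x07 for DID_ERROR match the kernel's definitions. Since DID_OK is zero, both shifts yield 0 for this particular call, so the patch mainly corrects the intent; a non-zero code shows the difference.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DID_OK	0x00u	/* same value as the kernel's DID_OK */
#define DEMO_DID_ERROR	0x07u	/* same value as the kernel's DID_ERROR */

/* Build a result word: host byte in bits 16-23, status in bits 0-7. */
uint32_t demo_make_result(uint32_t host, uint32_t status)
{
	assert(host <= 0xff && status <= 0xff);
	return (host << 16) | status;
}

/* Extract the host byte from a result word. */
uint32_t demo_host_byte(uint32_t result)
{
	return (result >> 16) & 0xff;
}
```

With these helpers, demo_host_byte(DEMO_DID_ERROR << 6) is 0 because the code lands in the low bits, while demo_host_byte(DEMO_DID_ERROR << 16) returns 0x07.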



Re: WARNING in xfrm6_tunnel_net_exit

2018-04-06 Thread syzbot

syzbot has found reproducer for the following crash on upstream commit
3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 (Sat Mar 31 01:52:36 2018 +)
kernel.h: Retain constant expression output for max()/min()
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=777bf170a89e7b326405


So far this crash happened 10982 times on linux-next, mmots, net-next,  
upstream.
syzkaller reproducer:  
https://syzkaller.appspot.com/x/repro.syz?id=5399809707999232
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=4550974920196096
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=-1647968177339044852

compiler: gcc (GCC) 8.0.1 20180301 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+777bf170a89e7b326...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed.

IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
IPVS: ftp: loaded support on port[0] = 21
WARNING: CPU: 0 PID: 180 at net/ipv6/xfrm6_tunnel.c:345  
xfrm6_tunnel_net_exit+0x2c0/0x4f0 net/ipv6/xfrm6_tunnel.c:345

Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 180 Comm: kworker/u4:4 Not tainted 4.16.0+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x1b9/0x29f lib/dump_stack.c:53
 panic+0x22f/0x4de kernel/panic.c:183
 __warn.cold.8+0x163/0x1a3 kernel/panic.c:547
 report_bug+0x252/0x2d0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x1bc/0x470 arch/x86/kernel/traps.c:296
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
 invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:991
RIP: 0010:xfrm6_tunnel_net_exit+0x2c0/0x4f0 net/ipv6/xfrm6_tunnel.c:345
RSP: 0018:8801d96373d8 EFLAGS: 00010293
RAX: 8801d961c080 RBX: 8801b0e999a0 RCX: 866b08c6
RDX:  RSI: 866b08d0 RDI: 0007
RBP: 8801d96374f8 R08: 8801d961c080 R09: ed003b6046c2
R10: 0003 R11: 0003 R12: 007c
R13: ed003b2c6e82 R14: 8801d96374d0 R15: 8801b6185f80
 ops_exit_list.isra.7+0xb0/0x160 net/core/net_namespace.c:152
 cleanup_net+0x51d/0xb20 net/core/net_namespace.c:523
 process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
 worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
 kthread+0x345/0x410 kernel/kthread.c:238
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..



Re: make xmldocs failed with error after 4.17 merge period

2018-04-06 Thread Greg KH
On Fri, Apr 06, 2018 at 10:51:09AM +0300, Heikki Krogerus wrote:
> On Fri, Apr 06, 2018 at 12:38:42PM +0900, Masanari Iida wrote:
> > After merging the following patch during the 4.17 merge window,
> > make xmldocs started to fail with an error.
> > 
> >  [bdecb33af34f79cbfbb656661210f77c8b8b5b5f]
> > usb: typec: API for controlling USB Type-C Multiplexers
> > 
> > Error messages.
> > reST markup error:
> > /home/iida/Repo/linux-2.6/Documentation/driver-api/usb/typec.rst:215:
> > (SEVERE/4) Unexpected section title or transition.
> > 
> > 
> > Documentation/Makefile:93: recipe for target 'xmldocs' failed
> > make[1]: *** [xmldocs] Error 1
> > Makefile:1527: recipe for target 'xmldocs' failed
> > make: *** [xmldocs] Error 2
> > 
> > $
> > 
> > An ASCII graphic in typec.rst causes the error.
> 
> Thanks for the report. I'm going to propose that we fix this by
> marking the ASCII art as a comment:
> 
> diff --git a/Documentation/driver-api/usb/typec.rst 
> b/Documentation/driver-api/usb/typec.rst
> index feb31946490b..972c11bf4141 100644
> --- a/Documentation/driver-api/usb/typec.rst
> +++ b/Documentation/driver-api/usb/typec.rst
> @@ -212,7 +212,7 @@ port drivers can use USB Role Class API with those.
> 
>  Illustration of the muxes behind a connector that supports an alternate mode:
> 
> - 
> +..   
>   |   Connector  |
>   
>  | |
> 
> I hope that works.

Try it and see!  :)


Re: [PATCH v2] pstore: fix crypto dependencies without compression

2018-04-06 Thread Arnd Bergmann
On Fri, Apr 6, 2018 at 9:25 AM, Tobias Regnery  wrote:
> Commit 58eb5b670747 ("pstore: fix crypto dependencies") fixed up the crypto
> dependencies but missed the case when no compression is selected.
>
> With CONFIG_PSTORE=y, CONFIG_PSTORE_COMPRESS=n  and CONFIG_CRYPTO=m we see
> the following link error:
>
> fs/pstore/platform.o: In function `pstore_register':
> (.text+0x1b1): undefined reference to `crypto_has_alg'
> (.text+0x205): undefined reference to `crypto_alloc_base'
> fs/pstore/platform.o: In function `pstore_unregister':
> (.text+0x3b0): undefined reference to `crypto_destroy_tfm'
>
> Fix this by checking at compile-time if CONFIG_PSTORE_COMPRESS is enabled.
>
> Fixes: 58eb5b670747 ("pstore: fix crypto dependencies")
> Signed-off-by: Tobias Regnery 
> ---
> v2: check the config at compile-time rather than change the
> kconfig-dependency as suggested by Arnd.

Thanks!

Acked-by: Arnd Bergmann 


Re: [PATCH 2/2] clk: spear: fix WDT clock definition on SPEAr600

2018-04-06 Thread Viresh Kumar
On 06-04-18, 09:50, Quentin Schulz wrote:
> There is no SPEAr600 device named "wdt". Instead, the description of the
> WDT (watchdog) was recently added to the Device Tree, and the device
> name is "fc88.wdt", so we should associate the WDT fixed rate clock
> to this device name.
> 
> Signed-off-by: Quentin Schulz 
> ---
>  drivers/clk/spear/spear6xx_clock.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/clk/spear/spear6xx_clock.c 
> b/drivers/clk/spear/spear6xx_clock.c
> index f911d9f..47810be 100644
> --- a/drivers/clk/spear/spear6xx_clock.c
> +++ b/drivers/clk/spear/spear6xx_clock.c
> @@ -147,7 +147,7 @@ void __init spear6xx_clk_init(void __iomem *misc_base)
>  
>   clk = clk_register_fixed_factor(NULL, "wdt_clk", "osc_30m_clk", 0, 1,
>   1);
> - clk_register_clkdev(clk, NULL, "wdt");
> + clk_register_clkdev(clk, NULL, "fc88.wdt");
>  
>   /* clock derived from pll1 clk */
>   clk = clk_register_fixed_factor(NULL, "cpu_clk", "pll1_clk",

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH 1/2] ARM: SPEAr600: add DT description of the watchdog

2018-04-06 Thread Viresh Kumar
On 06-04-18, 09:50, Quentin Schulz wrote:
> The SPEAr600 has a built-in watchdog which already has a DT binding
> described in Documentation/devicetree/bindings/watchdog/sp805-wdt.txt.
> 
> Let's add the description of the watchdog device in the SPEAr600 Device
> Tree.
> 
> Signed-off-by: Quentin Schulz 
> ---
>  arch/arm/boot/dts/spear600.dtsi | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/spear600.dtsi b/arch/arm/boot/dts/spear600.dtsi
> index 00166eb..d7c3096 100644
> --- a/arch/arm/boot/dts/spear600.dtsi
> +++ b/arch/arm/boot/dts/spear600.dtsi
> @@ -213,6 +213,14 @@
>   interrupts = <6>;
>   status = "disabled";
>   };
> +
> + wdt: wdt@fc88 {
> + compatible = "arm,sp805", "arm,primecell";
> + reg = <0xfc88 0x1000>;
> + interrupt-parent = <&vic1>;
> + interrupts = <20>;
> + status = "disabled";
> + };
>   };
>   };
>  };

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH v9 05/10] cpuidle: Return nohz hint from cpuidle_select()

2018-04-06 Thread Peter Zijlstra
On Fri, Apr 06, 2018 at 04:44:14AM +0200, Frederic Weisbecker wrote:
> You can move that to tick_sched_do_timer() to avoid code duplication.

I expect the reason I didn't was that it didn't have @ts, but that's
easily fixable.

> Also these constants are very opaque. And even with proper symbols it
> wouldn't look right to extend ts->inidle that way.
> 
> Perhaps you should add a field such as ts->got_idle_tick under the boolean
> fields after the below patch:

> @@ -45,14 +45,17 @@ struct tick_sched {
>   struct hrtimer  sched_timer;
>   unsigned long   check_clocks;
>   enum tick_nohz_mode nohz_mode;
> +
> + unsigned int  inidle         : 1;
> + unsigned int  tick_stopped   : 1;
> + unsigned int  idle_active    : 1;
> + unsigned int  do_timer_last  : 1;

That would generate worse code, but yes, the C might be prettier.
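
For readers following the bitfield discussion, here is a minimal user-space
sketch of the idea, not the kernel's code. It assumes a one-bit got_idle_tick
flag that the tick handler sets while the CPU is idle and that a
test-and-clear helper consumes. The names tick_state, sim_tick_handler and
sim_idle_got_tick are invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct tick_sched; only the flag bits are modelled. */
struct tick_state {
	unsigned int inidle        : 1;
	unsigned int tick_stopped  : 1;
	unsigned int idle_active   : 1;
	unsigned int do_timer_last : 1;
	unsigned int got_idle_tick : 1;	/* assumed extra flag, see lead-in */
};

/* Simulated tick: remember that a tick arrived while the CPU was idle. */
void sim_tick_handler(struct tick_state *ts)
{
	if (ts->inidle)
		ts->got_idle_tick = 1;
}

/* Report whether a tick ran since the last call, then clear the record. */
bool sim_idle_got_tick(struct tick_state *ts)
{
	if (ts->got_idle_tick) {
		ts->got_idle_tick = 0;
		return true;
	}
	return false;
}
```

Note that C does not allow taking the address of a bit-field member, so any
accessor that needs a pointer to the field cannot be used on these flags.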


Re: [PATCH] usbip: vhci_hcd: check rhport before using in vhci_hub_control()

2018-04-06 Thread Sergei Shtylyov

Hello!

On 4/6/2018 1:31 AM, Shuah Khan wrote:


Validate !rhport < 0 before using it to access port_status array.


   Why '!'?


Signed-off-by: Shuah Khan 
---
  drivers/usb/usbip/vhci_hcd.c | 13 +
  1 file changed, 13 insertions(+)

diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
index 20e3d4609583..d11f3f8dad40 100644
--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c

[...]

MBR, Sergei


[PATCH] writeback: safer lock nesting

2018-04-06 Thread Greg Thelen
lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if
the page's memcg is undergoing move accounting, which occurs when a
process leaves its memcg for a new one that has
memory.move_charge_at_immigrate set.

unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if the
given inode is switching writeback domains.  Switches occur when enough
writes are issued from a new domain.

This existing pattern is thus suspicious:
  lock_page_memcg(page);
  unlocked_inode_to_wb_begin(inode, &locked);
  ...
  unlocked_inode_to_wb_end(inode, locked);
  unlock_page_memcg(page);

If inode switch and process memcg migration are both in-flight then
unlocked_inode_to_wb_end() will unconditionally enable interrupts while
still holding the lock_page_memcg() irq spinlock.  This suggests the
possibility of deadlock if an interrupt occurs before
unlock_page_memcg().

  truncate
  __cancel_dirty_page
  lock_page_memcg
  unlocked_inode_to_wb_begin
  unlocked_inode_to_wb_end
  <interrupts mistakenly enabled>
                                  <interrupt>
                                  end_page_writeback
                                  test_clear_page_writeback
                                  lock_page_memcg
                                  <deadlock>
  unlock_page_memcg

Due to configuration limitations this deadlock is not currently possible
because we don't mix cgroup writeback (a cgroupv2 feature) and
memory.move_charge_at_immigrate (a cgroupv1 feature).

If the kernel is hacked to always claim inode switching and memcg
moving_account, then this script triggers lockup in less than a minute:
  cd /mnt/cgroup/memory
  mkdir a b
  echo 1 > a/memory.move_charge_at_immigrate
  echo 1 > b/memory.move_charge_at_immigrate
  (
    echo $BASHPID > a/cgroup.procs
    while true; do
      dd if=/dev/zero of=/mnt/big bs=1M count=256
    done
  ) &
  while true; do
    sync
  done &
  sleep 1h &
  SLEEP=$!
  while true; do
    echo $SLEEP > a/cgroup.procs
    echo $SLEEP > b/cgroup.procs
  done

Given the deadlock is not currently possible, it's debatable if there's
any reason to modify the kernel.  I suggest we should, to prevent future
surprises.

Reported-by: Wang Long 
Signed-off-by: Greg Thelen 
---
 fs/fs-writeback.c   |  5 +++--
 include/linux/backing-dev.h | 18 --
 mm/page-writeback.c | 15 +--
 3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index d4d04fee568a..d51bae5a53e2 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -746,10 +746,11 @@ int inode_congested(struct inode *inode, int cong_bits)
if (inode && inode_to_wb_is_valid(inode)) {
struct bdi_writeback *wb;
bool locked, congested;
+   unsigned long flags;
 
-   wb = unlocked_inode_to_wb_begin(inode, &locked);
+   wb = unlocked_inode_to_wb_begin(inode, &locked, &flags);
congested = wb_congested(wb, cong_bits);
-   unlocked_inode_to_wb_end(inode, locked);
+   unlocked_inode_to_wb_end(inode, locked, flags);
return congested;
}
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 3e4ce54d84ab..6c74b64d6f56 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -347,6 +347,7 @@ static inline struct bdi_writeback *inode_to_wb(const 
struct inode *inode)
  * unlocked_inode_to_wb_begin - begin unlocked inode wb access transaction
  * @inode: target inode
  * @lockedp: temp bool output param, to be passed to the end function
+ * @flags: saved irq flags, to be passed to the end function
  *
  * The caller wants to access the wb associated with @inode but isn't
  * holding inode->i_lock, mapping->tree_lock or wb->list_lock.  This
@@ -359,7 +360,8 @@ static inline struct bdi_writeback *inode_to_wb(const 
struct inode *inode)
  * disabled on return.
  */
 static inline struct bdi_writeback *
-unlocked_inode_to_wb_begin(struct inode *inode, bool *lockedp)
+unlocked_inode_to_wb_begin(struct inode *inode, bool *lockedp,
+  unsigned long *flags)
 {
rcu_read_lock();
 
@@ -370,7 +372,7 @@ unlocked_inode_to_wb_begin(struct inode *inode, bool 
*lockedp)
*lockedp = smp_load_acquire(&inode->i_state) & I_WB_SWITCH;
 
if (unlikely(*lockedp))
-   spin_lock_irq(&inode->i_mapping->tree_lock);
+   spin_lock_irqsave(&inode->i_mapping->tree_lock, *flags);
 
/*
 * Protected by either !I_WB_SWITCH + rcu_read_lock() or tree_lock.
@@ -383,11 +385,13 @@ unlocked_inode_to_wb_begin(struct inode *inode, bool 
*lockedp)
  * unlocked_inode_to_wb_end - end inode wb access transaction
  * @inode: target inode
  * @locked: *@lockedp from unlocked_inode_to_wb_begin()
+ * @flags: *@flags from unlocked_inode_to_wb_begin()
  */
-static inline void unlocked_inode_to_wb_end(struct inode *inode, bool locked)
+static inline voi

Re: AMD graphics performance regression in 4.15 and later

2018-04-06 Thread Christian König

Hi Jean,

yeah, that is a known problem. Using huge pages improves the performance 
because of better TLB usage, but at the cost of higher allocation overhead.

What we found is that Firefox is doing something rather strange by 
allocating large textures and then just throwing them away again immediately.

We mitigated the problem by avoiding the slow coherent DMA code path on 
almost all platforms on newer kernels, but essentially somebody needs to 
figure out why Firefox and/or the user space stack is doing this 
constant allocation/freeing of memory.

There is also a bug tracker entry on bugs.kernel.org about this, but I can't 
find it any more offhand.


Regards,
Christian.

On 06.04.2018 at 02:30, Jean-Marc Valin wrote:

Hi,

I noticed a serious graphics performance regression between 4.14 and
4.15. It is most noticeable with Firefox (tried FF57 through FF60) and
causes scrolling to be really choppy/sluggish. I've confirmed that the
problem is also there on 4.16, while 4.13 works fine.

After a bisection, I've narrowed the regression down to this commit:

commit 648bc3574716400acc06f99915815f80d9563783
Author: Christian König 
Date:   Thu Jul 6 09:59:43 2017 +0200

 drm/ttm: add transparent huge page support for DMA allocations v2


Some details about my system:
Distro: Fedora 27 (up-to-date)
Video: MSI Radeon RX 560 AERO
CPU: Dual-socket Xeon E5-2640 v4 (20 cores total)
RAM: 128 GB ECC


As a comparison, when running Firefox with 4.15 on a Lenovo W540 laptop
(with Intel graphics only) the responsiveness is much better than what
I'm getting on the Xeon machine above with the Radeon card, so this
really seems to be an AMD-only issue.

Any way to fix the issue?

Thanks,

Jean-Marc
___
dri-devel mailing list
dri-de...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel




Re: [PATCH] writeback: safer lock nesting

2018-04-06 Thread Michal Hocko
On Fri 06-04-18 01:03:24, Greg Thelen wrote:
[...]
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index d4d04fee568a..d51bae5a53e2 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -746,10 +746,11 @@ int inode_congested(struct inode *inode, int cong_bits)
>   if (inode && inode_to_wb_is_valid(inode)) {
>   struct bdi_writeback *wb;
>   bool locked, congested;
> + unsigned long flags;
>  
> - wb = unlocked_inode_to_wb_begin(inode, &locked);
> + wb = unlocked_inode_to_wb_begin(inode, &locked, &flags);

Wouldn't it be better to have a cookie (struct) rather than 2 parameters
and let unlocked_inode_to_wb_end DTRT?

>   congested = wb_congested(wb, cong_bits);
> - unlocked_inode_to_wb_end(inode, locked);
> + unlocked_inode_to_wb_end(inode, locked, flags);
>   return congested;
>   }
-- 
Michal Hocko
SUSE Labs
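
The cookie idea can be sketched in user space as follows. This is only a
model under stated assumptions: interrupt state is a plain boolean, the
sim_* helpers stand in for local_irq_save()/local_irq_restore(), and the
names wb_lock_cookie, sim_wb_begin and sim_wb_end are hypothetical. It shows
that an end function which restores saved state, instead of unconditionally
enabling interrupts, leaves an outer irq-disabled section intact.

```c
#include <stdbool.h>

/* Simulated CPU interrupt-enable flag; a stand-in for real hardware state. */
bool sim_irqs_on = true;

/* Models local_irq_save(): record the current state, then disable. */
unsigned long sim_irq_save(void)
{
	unsigned long flags = sim_irqs_on ? 1UL : 0UL;

	sim_irqs_on = false;
	return flags;
}

/* Models local_irq_restore(): put back whatever state was saved. */
void sim_irq_restore(unsigned long flags)
{
	sim_irqs_on = (flags != 0);
}

/* Hypothetical cookie bundling everything the end function needs. */
struct wb_lock_cookie {
	bool locked;
	unsigned long flags;
};

/* Begin: take the (simulated) irq-safe lock only when a switch is pending. */
void sim_wb_begin(struct wb_lock_cookie *cookie, bool switching)
{
	cookie->locked = switching;
	cookie->flags = 0;
	if (cookie->locked)
		cookie->flags = sim_irq_save();
}

/* End: restore the saved state rather than enabling unconditionally. */
void sim_wb_end(const struct wb_lock_cookie *cookie)
{
	if (cookie->locked)
		sim_irq_restore(cookie->flags);
}
```

With a single cookie the caller passes one object to both functions, so the
end function cannot be handed a mismatched locked/flags pair.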


Re: WARNING in kill_block_super

2018-04-06 Thread Michal Hocko
On Wed 04-04-18 19:53:07, Tetsuo Handa wrote:
> Al and Michal, are you OK with this patch?

Maybe I've misunderstood, but hasn't Al explained [1] that the
appropriate fix is in the fs code?

[1] http://lkml.kernel.org/r/20180402143415.gc30...@zeniv.linux.org.uk
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 05/20] afs: Implement @sys substitution handling

2018-04-06 Thread David Howells
Al Viro  wrote:

> > +static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry 
> > *dentry,
> > +  struct key *key)
> > +{
> 
> > +   ret = lookup_one_len(buf, parent, len);
> 
> Er...  Parent is locked only shared here and lookup_one_len() seriously
> depends upon exclusive lock.  As it is, race with lookup of the full name
> will mess the things up very badly.

How should it be done?  Do I have to use d_alloc_parallel(), analogous to
lookup_slow() without taking the rwsem again?

David

PS: Can you stick a banner comment on d_alloc_parallel() describing it?


Re: [PATCH v9 05/10] cpuidle: Return nohz hint from cpuidle_select()

2018-04-06 Thread Rafael J. Wysocki
On Friday, April 6, 2018 4:44:14 AM CEST Frederic Weisbecker wrote:
> On Wed, Apr 04, 2018 at 10:39:50AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > Index: linux-pm/kernel/time/tick-sched.c
> > ===
> > --- linux-pm.orig/kernel/time/tick-sched.c
> > +++ linux-pm/kernel/time/tick-sched.c
> > @@ -991,6 +991,20 @@ void tick_nohz_irq_exit(void)
> >  }
> >  
> >  /**
> > + * tick_nohz_idle_got_tick - Check whether or not the tick handler has run
> > + */
> > +bool tick_nohz_idle_got_tick(void)
> > +{
> > +   struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
> > +
> > +   if (ts->inidle > 1) {
> > +   ts->inidle = 1;
> > +   return true;
> > +   }
> > +   return false;
> > +}
> > +
> > +/**
> >   * tick_nohz_get_sleep_length - return the length of the current sleep
> >   *
> >   * Called from power state control code with interrupts disabled
> > @@ -1101,6 +1115,9 @@ static void tick_nohz_handler(struct clo
> > struct pt_regs *regs = get_irq_regs();
> > ktime_t now = ktime_get();
> >  
> > +   if (ts->inidle)
> > +   ts->inidle = 2;
> > +
> 
> You can move that to tick_sched_do_timer() to avoid code duplication.
> 
> Also these constants are very opaque. And even with proper symbols it 
> wouldn't look
> right to extend ts->inidle that way.
> 
> Perhaps you should add a field such as ts->got_idle_tick under the boolean 
> fields
> after the below patch:
> 
> --
> From c7b2ca5a4c512517ddfeb9f922d5999f82542ced Mon Sep 17 00:00:00 2001
> From: Frederic Weisbecker 
> Date: Fri, 6 Apr 2018 04:32:37 +0200
> Subject: [PATCH] nohz: Gather tick_sched booleans under a common flag field
> 
> This optimizes the space and leaves plenty of room for further flags.
> 
> Signed-off-by: Frederic Weisbecker 
> ---
>  kernel/time/tick-sched.h | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
> index 954b43d..38f24dc 100644
> --- a/kernel/time/tick-sched.h
> +++ b/kernel/time/tick-sched.h
> @@ -45,14 +45,17 @@ struct tick_sched {
>   struct hrtimer  sched_timer;
>   unsigned long   check_clocks;
>   enum tick_nohz_mode nohz_mode;
> +
> + unsigned int  inidle         : 1;
> + unsigned int  tick_stopped   : 1;

This particular change breaks build, because tick_stopped is
accessed via __this_cpu_read() in tick_nohz_tick_stopped().

> + unsigned int  idle_active    : 1;
> + unsigned int  do_timer_last  : 1;
> +
>   ktime_t last_tick;
>   ktime_t next_tick;
> - int inidle;
> - int tick_stopped;
>   unsigned long   idle_jiffies;
>   unsigned long   idle_calls;
>   unsigned long   idle_sleeps;
> - int idle_active;
>   ktime_t idle_entrytime;
>   ktime_t idle_waketime;
>   ktime_t idle_exittime;
> @@ -62,7 +65,6 @@ struct tick_sched {
>   unsigned long   last_jiffies;
>   u64 next_timer;
>   ktime_t idle_expires;
> - int do_timer_last;
>   atomic_t  tick_dep_mask;
>  };
>  
> 

So what about this?  And moving the duplicated piece of got_idle_tick
manipulation on top of it?

---
From: Frederic Weisbecker 
Subject: [PATCH] nohz: Gather tick_sched booleans under a common flag field

Optimize the space and leave plenty of room for further flags.

Signed-off-by: Frederic Weisbecker 
[ rjw: Do not use __this_cpu_read() to access tick_stopped and add
   got_idle_tick to avoid overloading inidle ]
Signed-off-by: Rafael J. Wysocki 
---
 kernel/time/tick-sched.c |   12 +++-
 kernel/time/tick-sched.h |   12 
 2 files changed, 15 insertions(+), 9 deletions(-)

Index: linux-pm/kernel/time/tick-sched.h
===
--- linux-pm.orig/kernel/time/tick-sched.h
+++ linux-pm/kernel/time/tick-sched.h
@@ -41,19 +41,24 @@ enum tick_nohz_mode {
  * @timer_expires: Anticipated timer expiration time (in case sched tick 
is stopped)
  * @timer_expires_base:Base time clock monotonic for @timer_expires
  * @do_timer_lst:  CPU was the last one doing do_timer before going idle
+ * @got_idle_tick: Tick timer function has run with @inidle set
  */
 struct tick_sched {
struct hrtimer  sched_timer;
unsigned long   check_clocks;
enum tick_nohz_mode nohz_mode;
+
+   unsigned int  inidle  : 1;
+   unsigned 

Re: AMD graphics performance regression in 4.15 and later

2018-04-06 Thread Christian König

Hi Jean,

found the bug reports.

Here is the original bug report from the kernel: 
https://bugzilla.kernel.org/show_bug.cgi?id=198511


And here is an fdo bug report where we tried to investigate the root 
cause, but didn't have time for that yet: 
https://bugs.freedesktop.org/show_bug.cgi?id=105038


Regards,
Christian.



Re: [PATCH 05/20] afs: Implement @sys substitution handling

2018-04-06 Thread David Howells
Al Viro  wrote:

> lookup_one_len() seriously depends upon exclusive lock

In the code it says:

WARN_ON_ONCE(!inode_is_locked(base->d_inode));

which checks i_rwsem, but in the banner comment it says:

* The caller must hold base->i_mutex.

is one of these wrong?

David


Re: regression, imx6 and sgtl5000 sound problems

2018-04-06 Thread Mika Penttilä
On 04/06/2018 10:23 AM, Nicolin Chen wrote:
> On Fri, Apr 06, 2018 at 07:46:37AM +0300, Mika Penttilä wrote:
>>
>> With the recent merge to pre 4.17-rc, audio stopped working (or it's
>> audible but way too slow).
>> imx6q + sgtl5000 codec.
> 
> Could you please be more specific about your test cases?
> 
> Which board? Which side is the DAI master? Which sample rate?
> 
>> Maybe some of the soc/fsl changes are causing this.
> 
> There are quite a few clean-up patches of SSI driver being merged.
> Would you please try to revert/bisect the changes of fsl_ssi driver
> so as to figure out which one breaks your test cases?
> 
> If there is a regression because of one of the changes, I will need
> to fix it.
> 
> Thanks
> Nicolin
> 

Hi,

We have a custom board (very close to the Karo TX evkit). The test is simply
"aplay file.wav"; at least the 48 kHz sample rate was tested and does not work.
Nothing special there hardware-wise.

I'll try to bisect it and report back to you.

--Mika




Re: make xmldocs failed with error after 4.17 merge period

2018-04-06 Thread Heikki Krogerus
On Fri, Apr 06, 2018 at 09:57:34AM +0200, Greg KH wrote:
> On Fri, Apr 06, 2018 at 10:51:09AM +0300, Heikki Krogerus wrote:
> > On Fri, Apr 06, 2018 at 12:38:42PM +0900, Masanari Iida wrote:
> > > After merging the following patch during the 4.17 merge period,
> > > make xmldocs started to fail with an error.
> > > 
> > >  [bdecb33af34f79cbfbb656661210f77c8b8b5b5f]
> > > usb: typec: API for controlling USB Type-C Multiplexers
> > > 
> > > Error messages.
> > > reST markup error:
> > > /home/iida/Repo/linux-2.6/Documentation/driver-api/usb/typec.rst:215:
> > > (SEVERE/4) Unexpected section title or transition.
> > > 
> > > 
> > > Documentation/Makefile:93: recipe for target 'xmldocs' failed
> > > make[1]: *** [xmldocs] Error 1
> > > Makefile:1527: recipe for target 'xmldocs' failed
> > > make: *** [xmldocs] Error 2
> > > 
> > > $
> > > 
> > > An ASCII graphic in typec.rst causes the error.
> > 
> > Thanks for the report. I'm going to propose that we fix this by
> > marking the ascii art as comment:
> > 
> > diff --git a/Documentation/driver-api/usb/typec.rst 
> > b/Documentation/driver-api/usb/typec.rst
> > index feb31946490b..972c11bf4141 100644
> > --- a/Documentation/driver-api/usb/typec.rst
> > +++ b/Documentation/driver-api/usb/typec.rst
> > @@ -212,7 +212,7 @@ port drivers can use USB Role Class API with those.
> > 
> >  Illustration of the muxes behind a connector that supports an alternate 
> > mode:
> > 
> > - 
> > +..   
> >   |   Connector  |
> >   
> >  | |
> > 
> > I hope that works.
> 
> Try it and see!  :)

It will fix this issue. I was just wondering if use of ascii art is
acceptable in general with the .rst files? But then again, why
wouldn't it be.

Sorry for the noise. I'll just send that patch.


Thanks,

-- 
heikki


Re: [PATCH 4.4 071/101] spi: davinci: use dma_mapping_error()

2018-04-06 Thread Greg Kroah-Hartman
On Wed, Jul 05, 2017 at 03:24:37PM +0100, Ben Hutchings wrote:
> On Mon, 2017-07-03 at 15:35 +0200, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Kevin Hilman 
> > 
> > 
> > [ Upstream commit c5a2a394835f473ae23931eda5066d3771d7b2f8 ]
> > 
> > The correct error checking for dma_map_single() is to use
> > dma_mapping_error().
> > 
> > Signed-off-by: Kevin Hilman 
> > Signed-off-by: Mark Brown 
> > Signed-off-by: Sasha Levin 
> > Signed-off-by: Greg Kroah-Hartman 
> > ---
> >  drivers/spi/spi-davinci.c |4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > --- a/drivers/spi/spi-davinci.c
> > +++ b/drivers/spi/spi-davinci.c
> > @@ -651,7 +651,7 @@ static int davinci_spi_bufs(struct spi_d
> > buf = t->rx_buf;
> > t->rx_dma = dma_map_single(&spi->dev, buf,
> > t->len, DMA_FROM_DEVICE);
> > -   if (!t->rx_dma) {
> > +   if (dma_mapping_error(&spi->dev, !t->rx_dma)) {
> [...]
> 
> The '!' needs to be deleted.  This appears to have been fixed upstream
> by:
> 
> commit 8aedbf580d21121d2a032e4c8ea12d8d2d85e275
> Author: Fabien Parent 
> Date:   Thu Feb 23 19:01:56 2017 +0100
> 
> spi: davinci: Use SPI framework to handle DMA mapping
> 
> which is not suitable for stable.

Sorry for the delay, now fixed up.

greg k-h
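
To see why the stray '!' matters, here is a small user-space sketch. It is
not the kernel API: sim_dma_mapping_error() is a stand-in, and the all-ones
error sentinel is an assumption made for this example only (the real value
depends on the architecture and kernel version). The point is that negating
the handle collapses it to 0 or 1 before the check, so the check no longer
sees the handle at all.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sim_dma_addr_t;

/* Assumed error sentinel for this sketch: all bits set. */
#define SIM_DMA_MAPPING_ERROR (~(sim_dma_addr_t)0)

/* Stand-in for dma_mapping_error(): true when the handle is the sentinel. */
bool sim_dma_mapping_error(sim_dma_addr_t addr)
{
	return addr == SIM_DMA_MAPPING_ERROR;
}

/* Form from the backport: tests the negated handle, which is only 0 or 1. */
bool sim_buggy_check(sim_dma_addr_t addr)
{
	return sim_dma_mapping_error(!addr);
}

/* Intended form: test the handle itself. */
bool sim_fixed_check(sim_dma_addr_t addr)
{
	return sim_dma_mapping_error(addr);
}
```

Under this assumption the buggy form never reports a failed mapping, while
the fixed form does.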


[PATCH] Documentation: typec.rst: Mark ascii art as a comment

2018-04-06 Thread Heikki Krogerus
To prevent processing of ascii art as reStructuredText
elements, mark it as a comment.

Reported-by: Masanari Iida 
Fixes: bdecb33af34f ("usb: typec: API for controlling USB Type-C Multiplexers")
Signed-off-by: Heikki Krogerus 
---
 Documentation/driver-api/usb/typec.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/driver-api/usb/typec.rst 
b/Documentation/driver-api/usb/typec.rst
index feb31946490b..972c11bf4141 100644
--- a/Documentation/driver-api/usb/typec.rst
+++ b/Documentation/driver-api/usb/typec.rst
@@ -212,7 +212,7 @@ port drivers can use USB Role Class API with those.
 
 Illustration of the muxes behind a connector that supports an alternate mode:
 
- 
+..   
  |   Connector  |
  
 | |
-- 
2.16.3



[PATCH] vhost-net: set packet weight of tx polling to 2 * vq size

2018-04-06 Thread 张海斌
handle_tx will delay rx for tens or even hundreds of milliseconds when tx is
busy polling small UDP packets (e.g. 1-byte UDP payload), because
VHOST_NET_WEIGHT only accounts for the number of bytes sent, not the number
of packets.

Ping-Latencies shown below were tested between two Virtual Machines using
netperf (UDP_STREAM, len=1), and then another machine pinged the client:

Packet-Weight  Ping-Latencies (milliseconds)
                   min      avg      max
Origin           3.319   18.489   57.303
64               1.643    2.021    2.552
128              1.825    2.600    3.224
256              1.997    2.710    4.295
512              1.860    3.171    4.631
1024             2.002    4.173    9.056
2048             2.257    5.650    9.688
4096             2.093    8.508   15.943

Ring size is a hint from device about a burst size it can tolerate. Based on
benchmarks, set the weight to 2 * vq size.

To evaluate this change, further tests were done using netperf (RR, TX) between
two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was
tweaked through qemu. The results below show no obvious changes.

vq size=256 TCP_RR                vq size=512 TCP_RR
size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
   1/   1/  -7%/-2%                  1/   1/   0%/-2%
   1/   4/  +1%/ 0%                  1/   4/  +1%/ 0%
   1/   8/  +1%/-2%                  1/   8/   0%/+1%
  64/   1/  -6%/ 0%                 64/   1/  +7%/+3%
  64/   4/   0%/+2%                 64/   4/  -1%/+1%
  64/   8/   0%/ 0%                 64/   8/  -1%/-2%
 256/   1/  -3%/-4%                256/   1/  -4%/-2%
 256/   4/  +3%/+4%                256/   4/  +1%/+2%
 256/   8/  +2%/ 0%                256/   8/  +1%/-1%

vq size=256 UDP_RR                vq size=512 UDP_RR
size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
   1/   1/  -5%/+1%                  1/   1/  -3%/-2%
   1/   4/  +4%/+1%                  1/   4/  -2%/+2%
   1/   8/  -1%/-1%                  1/   8/  -1%/ 0%
  64/   1/  -2%/-3%                 64/   1/  +1%/+1%
  64/   4/  -5%/-1%                 64/   4/  +2%/ 0%
  64/   8/   0%/-1%                 64/   8/  -2%/+1%
 256/   1/  +7%/+1%                256/   1/  -7%/ 0%
 256/   4/  +1%/+1%                256/   4/  -3%/-4%
 256/   8/  +2%/+2%                256/   8/  +1%/+1%

vq size=256 TCP_STREAM            vq size=512 TCP_STREAM
size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
  64/   1/   0%/-3%                 64/   1/   0%/ 0%
  64/   4/  +3%/-1%                 64/   4/  -2%/+4%
  64/   8/  +9%/-4%                 64/   8/  -1%/+2%
 256/   1/  +1%/-4%                256/   1/  +1%/+1%
 256/   4/  -1%/-1%                256/   4/  -3%/ 0%
 256/   8/  +7%/+5%                256/   8/  -3%/ 0%
 512/   1/  +1%/ 0%                512/   1/  -1%/-1%
 512/   4/  +1%/-1%                512/   4/   0%/ 0%
 512/   8/  +7%/-5%                512/   8/  +6%/-1%
1024/   1/   0%/-1%               1024/   1/   0%/+1%
1024/   4/  +3%/ 0%               1024/   4/  +1%/ 0%
1024/   8/  +8%/+5%               1024/   8/  -1%/ 0%
2048/   1/  +2%/+2%               2048/   1/  -1%/ 0%
2048/   4/  +1%/ 0%               2048/   4/   0%/-1%
2048/   8/  -2%/ 0%               2048/   8/   5%/-1%
4096/   1/  -2%/ 0%               4096/   1/  -2%/ 0%
4096/   4/  +2%/ 0%               4096/   4/   0%/ 0%
4096/   8/  +9%/-2%               4096/   8/  -5%/-1%

Signed-off-by: Haibin Zhang 
Signed-off-by: Yunfang Tai 
Signed-off-by: Lidong Chen 
---
 drivers/vhost/net.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 8139bc70ad7d..3563a305cc0a 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -44,6 +44,10 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
  * Using this limit prevents one virtqueue from starving others. */
 #define VHOST_NET_WEIGHT 0x80000
 
+/* Max number of packets transferred before requeueing the job.
+ * Using this limit prevents one virtqueue from starving rx. */
+#define VHOST_NET_PKT_WEIGHT(vq) ((vq)->num * 2)
+
 /* MAX number of TX used buffers for outstanding zerocopy */
 #define VHOST_MAX_PEND 128
 #define VHOST_GOODCOPY_LEN 256
@@ -473,6 +477,7 @@ static void handle_tx(struct vhost_net *net)
struct socket *sock;
struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
bool zcopy, zcopy_used;
+   int sent_pkts = 0;
 
mutex_lock(&vq->mutex);
sock = vq->private_data;
@@ -5
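
The effect described in the changelog above can be illustrated with a small
user-space model. It is a sketch, not the vhost code: the hunk that changes
handle_tx is cut off in this archive, so the loop shape, the helper name
sim_tx_burst and the byte budget value are assumptions. The model counts how
many packets one polling run sends before it yields, with and without a
packet budget of 2 * ring size.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for this sketch only. */
#define SIM_BYTE_WEIGHT        0x80000u           /* byte budget per run */
#define SIM_PKT_WEIGHT(vq_num) ((vq_num) * 2u)    /* packet budget: 2 * ring size */

/*
 * Return the number of packets of pkt_len bytes sent in one run before a
 * budget is exhausted. The byte budget always applies; the packet budget
 * applies only when use_pkt_weight is non-zero.
 */
unsigned int sim_tx_burst(unsigned int vq_num, size_t pkt_len, int use_pkt_weight)
{
	size_t total_len = 0;
	unsigned int sent_pkts = 0;

	assert(pkt_len > 0);	/* a zero-length packet would never hit the byte budget */

	for (;;) {
		total_len += pkt_len;
		sent_pkts++;
		if (total_len >= SIM_BYTE_WEIGHT)
			break;
		if (use_pkt_weight && sent_pkts >= SIM_PKT_WEIGHT(vq_num))
			break;
	}
	return sent_pkts;
}
```

In this model, 1-byte packets on a 256-entry ring run for 524288 packets
under the byte budget alone but stop after 512 with the packet budget, while
4096-byte packets still stop at the byte budget after 128 packets.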

Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64

2018-04-06 Thread Ingo Molnar

* Dominik Brodowski  wrote:

> On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > Ok, this series looks mostly good to me, but AFAICS this breaks the UML 
> > build:
> > 
> >  make[2]: *** No rule to make target 'archheaders'.  Stop.
> >  arch/um/Makefile:119: recipe for target 'archheaders' failed
> >  make[1]: *** [archheaders] Error 2
> >  make[1]: *** Waiting for unfinished jobs
> 
> Ah, that's caused by patch 8/8 which I did and do not like all that much
> anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
> __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> referenced there. Fixup patch below; could be folded with patch 8/8. Or
> patch 8/8 could simply be dropped from the series altogether...

I still like the 'truth in advertising' aspect. For example if I see this in
the syscall table:

 10  common  mprotect  __sys_x86_mprotect

I can immediately find the _real_ syscall entry point:

81180a10 <__sys_x86_mprotect>:
81180a10:   48 8b 57 60             mov    0x60(%rdi),%rdx
81180a14:   48 8b 77 68             mov    0x68(%rdi),%rsi
81180a18:   b9 ff ff ff ff          mov    $0xffffffff,%ecx
81180a1d:   48 8b 7f 70             mov    0x70(%rdi),%rdi
81180a21:   e8 fa fc ff ff          callq  81180720
81180a26:   48 98                   cltq
81180a28:   c3                      retq
81180a29:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)

If, on the other hand, I see this entry:

 10  common  mprotect  sys_mprotect

Then, as a first step, no symbol anywhere matches with this:

 triton:~/tip> grep sys_mprotect System.map 
 triton:~/tip> 

"sys_mprotect" does not exist in any easily discoverable sense. You have to 
*know* to replace the sys_ prefix with __sys_x86_ to find it.

Now arguably we could use a __sys_ prefix instead of the grep-barrier __sys_x86 
prefix - but that too would be somewhat confusing I think.

I mean, the fact that we are passing in a ptregs pointer is a complexity of the 
x86 kernel that *exists*, why hide it and make it harder to discover what's 
happening, for something as important as system calls?

In terms of UML breakage, UML arguably is tightly coupled to its host 
architecture:

> Subject: [PATCH] syscalls/x86: fix UML syscall table

Even with your patch applied I still see build failures:

  $ make ARCH=um defconfig
  $ make ARCH=um linux
  ...
  arch/um/os-Linux/signal.c: In function ‘hard_handler’:
  arch/um/os-Linux/signal.c:163:22: error: dereferencing pointer to incomplete type ‘struct ucontext’
    mcontext_t *mc = &uc->uc_mcontext;
                        ^~
  scripts/Makefile.build:324: recipe for target 'arch/um/os-Linux/signal.o' failed
  make[1]: *** [arch/um/os-Linux/signal.o] Error 1

Thanks,

Ingo
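
The disassembly of __sys_x86_mprotect quoted above loads its arguments from
offsets 0x60, 0x68 and 0x70 of the pointer in %rdi. A minimal user-space
sketch of such a stub follows. It is an illustration under assumptions: the
field order of sim_pt_regs is chosen so that dx, si and di land on those
offsets, and sim_do_mprotect is a hypothetical stand-in for whatever the
real stub calls. The fourth argument of -1 corresponds to the
b9 ff ff ff ff bytes loaded into %ecx in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register frame layout; only the field offsets matter here. */
struct sim_pt_regs {
	uint64_t r15, r14, r13, r12, bp, bx;
	uint64_t r11, r10, r9, r8;
	uint64_t ax, cx, dx, si, di;
	uint64_t orig_ax, ip, cs, flags, sp, ss;
};

/* The offsets used by the three loads in the listing. */
static_assert(offsetof(struct sim_pt_regs, dx) == 0x60, "dx offset");
static_assert(offsetof(struct sim_pt_regs, si) == 0x68, "si offset");
static_assert(offsetof(struct sim_pt_regs, di) == 0x70, "di offset");

/* Records the unpacked arguments so a caller can inspect them. */
struct sim_call {
	uint64_t start, len, prot;
	int pkey;
};
struct sim_call sim_last_call;

/* Hypothetical stand-in for the function that takes unpacked arguments. */
int sim_do_mprotect(uint64_t start, uint64_t len, uint64_t prot, int pkey)
{
	sim_last_call = (struct sim_call){ start, len, prot, pkey };
	return 0;
}

/* Stub in the style of the listing: unpack di, si, dx from the frame. */
long sim_sys_x86_mprotect(const struct sim_pt_regs *regs)
{
	return sim_do_mprotect(regs->di, regs->si, regs->dx, -1);
}
```

The stub is the only symbol that sees the register frame; everything behind
it keeps an ordinary C argument list.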


Re: [PATCH 4.4 10/43] net: cavium: liquidio: Avoid dma_unmap_single on uninitialized ndata

2018-04-06 Thread Greg Kroah-Hartman
On Wed, May 10, 2017 at 04:30:34PM +0100, Ben Hutchings wrote:
> On Mon, 2017-05-01 at 14:27 -0700, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Florian Fainelli 
> > 
> > commit 8e6ce7ebeb34f0992f56de078c3744fb383657fa upstream.
> > 
> > The label lio_xmit_failed is used 3 times through liquidio_xmit() but it
> > always makes a call to dma_unmap_single() using potentially
> > uninitialized variables from "ndata" variable. Out of the 3 gotos, 2 run
> > after ndata has been initialized, and had a prior dma_map_single() call.
> > 
> > Fix this by adding a new error label: lio_xmit_dma_failed which does
> > this dma_unmap_single() and then processed with the lio_xmit_failed
> > fallthrough.
> > 
> > Fixes: f21fb3ed364bb ("Add support of Cavium Liquidio ethernet adapters")
> > Reported-by: coverity (CID 1309740)
> > Signed-off-by: Florian Fainelli 
> > Signed-off-by: David S. Miller 
> > Cc: Julia Lawall 
> > Signed-off-by: Greg Kroah-Hartman 
> 
> This is not a complete fix:
> 
> > ---
> >  drivers/net/ethernet/cavium/liquidio/lio_main.c |9 +
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > --- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
> > +++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
> > @@ -2823,7 +2823,7 @@ static int liquidio_xmit(struct sk_buff
> > if (!g) {
> > netif_info(lio, tx_err, lio->netdev,
> >"Transmit scatter gather: glist null!\n");
> > -   goto lio_xmit_failed;
> > +   goto lio_xmit_dma_failed;
> > }
> >  
> > cmdsetup.s.gather = 1;
> [...]
> 
> This goto should not have been changed, as no DMA mapping has been
> attempted at this point in the function.
> 
> This seems to have been fixed upstream by commit 6a885b60dad2 "liquidio:
> Introduce new octeon2/3 header".  I leave it to you to work out how it
> should be fixed in 4.4-stable.

I've fixed this up "by hand" now, thanks for the review.

greg k-h
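
The label ordering Ben describes can be sketched in user space as below.
This is a model, not the liquidio driver: the counters stand in for
dma_map_single()/dma_unmap_single(), and sim_xmit with its two parameters is
invented for illustration. A failure detected before any mapping exists
jumps past the unmap step; a failure after the mapping goes through it and
then falls into the common exit.

```c
#include <stdbool.h>

/* Counters standing in for real DMA map/unmap calls. */
int sim_maps;
int sim_unmaps;

/*
 * fail_before_map: an error is detected before any mapping exists.
 * fail_after_map:  an error is detected once the mapping was created.
 * Returns 0 on success (the mapping stays live) or -1 on failure.
 */
int sim_xmit(bool fail_before_map, bool fail_after_map)
{
	if (fail_before_map)
		goto xmit_failed;	/* nothing mapped yet: skip the unmap */

	sim_maps++;			/* models dma_map_single() */

	if (fail_after_map)
		goto xmit_dma_failed;	/* mapping exists: must be undone */

	return 0;

xmit_dma_failed:
	sim_unmaps++;			/* models dma_unmap_single() */
xmit_failed:
	return -1;
}
```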


Re: [PATCH] ata: ahci-platform: add reset control support

2018-04-06 Thread Hans de Goede

Hi,

On 06-04-18 06:48, Kunihiko Hayashi wrote:

Hi Hans,

On Thu, 5 Apr 2018 16:08:24 +0200
Hans de Goede  wrote:


Hi,

On 05-04-18 16:00, Hans de Goede wrote:

Hi,

On 05-04-18 15:54, Thierry Reding wrote:
On Thu, Apr 05, 2018 at 03:27:03PM +0200, Hans de Goede wrote:

Hi,

On 05-04-18 15:17, Patrice CHOTARD wrote:

Hi Thierry

On 04/05/2018 11:54 AM, Thierry Reding wrote:

On Fri, Mar 23, 2018 at 10:30:53AM +0900, Kunihiko Hayashi wrote:

Add support to get and control a list of resets for the device
as optional and shared. These resets must be kept de-asserted until
the device is enabled.

This is specified as shared because some SoCs like UniPhier series
have common reset controls with all ahci controller instances.

Signed-off-by: Kunihiko Hayashi 
---
 .../devicetree/bindings/ata/ahci-platform.txt |  1 +
 drivers/ata/ahci.h                            |  1 +
 drivers/ata/libahci_platform.c                | 24 +++---
 3 files changed, 23 insertions(+), 3 deletions(-)


This causes a regression on Tegra because we explicitly request the
resets after the call to ahci_platform_get_resources().


I confirm, we got exactly the same behavior on STi platform.



From a quick look, ahci_mtk and ahci_st are in the same boat, adding the
corresponding maintainers to Cc.

Patrice, Matthias: does SATA still work for you after this patch? This
has been in linux-next since next-20180327.


SATA is still working after this patch, but a kernel warning is
triggered due to the fact that resets are both requested by
libahci_platform and by ahci_st driver.


So in your case you might be able to remove the reset handling
from the ahci_st driver and rely on the new libahci_platform
handling instead? If that works that seems like a win to me.

As said elsewhere in this thread I think it makes sense to keep (or re-add
after a revert) the libahci_platform reset code, but make it conditional
on a flag passed to ahci_platform_get_resources(). This way we get
the shared code for most cases and platforms which need special handling
can opt-out.


Agreed, although I prefer such helpers to be opt-in, rather than
opt-out. In my experience that tends to make the helpers more resilient to
this kind of regression. It also simplifies things because instead of
drivers saying "I want all the helpers except this one and that one",
they can simply say "I want these helpers and that one". In the former
case whenever you add some new (opt-out) feature, you have to update all
drivers and add the exception. In the latter you only need to extend the
drivers that want to make use of the new helper.


Erm, the idea never was to make this opt-out but rather opt in, so
we add a flags parameter to ahci_platform_get_resources() and all
current users pass in 0 for that to keep the current behavior.

And only the generic drivers/ata/ahci_platform.c driver will pass
in the new AHCI_PLATFORM_GET_RESETS flag, which makes
ahci_platform_get_resources() (and the other functions) also deal
with resets.


With that in mind, rather than adding a flag to the
ahci_platform_get_resources() function, it might be more flexible to
split the helpers into finer-grained functions. That way drivers can
pick whatever functionality they want from the helpers.

Good point, so let's:

1) Revert the patch for now
2) Have a new version of the patch which adds an ahci_platform_get_resets()
   helper
3) Modify the generic drivers/ata/ahci_platform.c driver to call the new
   ahci_platform_get_resets() between its ahci_platform_get_resources()
   and ahci_platform_enable_resources() calls.
   I think that ahci_platform_enable_resources() should still automatically
   do the right thing wrt resets if ahci_platform_get_resets() was called
   (otherwise the resets array will be empty and should be skipped)

This should make the generic driver usable for the UniPhier SoCs and
maybe some other drivers like the ahci_st driver can also switch to the
new ahci_platform_get_resets() functionality to reduce their code a bit.


So thinking slightly longer about this, with the opt-in variant
(which is what I intended all along) I do think that a flags parameter
is better, because the whole idea behind lib_ahci_platform is to avoid
having to do err = get_resource_a(), if (err) bail, err = get_resource_b()
if (err) bail, etc. in all the ahci (platform) drivers. And having fine
grained helpers re-introduces that.


In case of adding a flag instead of get_resource_a(),
for example, we add the flag for use of resets,

-struct ahci_host_priv *ahci_platform_get_resources(struct platform_device *pdev)
+struct ahci_host_priv *ahci_platform_get_resources(struct platform_device *pdev,
+                                                   bool use_reset)

and for now all the drivers using this function need to add the argument as false
to the caller.

-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, fa

Re: 答复: Re: [PATCH v2] scsi: Introduce sdev_printk_ratelimited to throttlefrequent printk

2018-04-06 Thread Petr Mladek
On Tue 2018-04-03 14:19:43, wen.yan...@zte.com.cn wrote:
> On the other hand,queue_lock is big, looping doing something under spinlock 
> 
> may locked many things and taking a long time, may cause some problems.
> 
> So This code needs to be optimized later:
> 
> scsi_request_fn()
> {
>   for (;;) {
>   int rtn;
>   /*
>* get next queueable request.  We do this early to make sure
>* that the request is fully prepared even if we cannot
>* accept it.
>*/
> 
>   req = blk_peek_request(q);
> 
>   if (!req)
>   break;
> 
>   if (unlikely(!scsi_device_online(sdev))) {
>   sdev_printk(KERN_ERR, sdev,
>   "rejected I/O to offline device\n");
>   scsi_kill_request(req, q);
>   continue;
> 
>   ^ still under spinlock
>   }

I wonder if the following might be the best solution after all:

	if (unlikely(!scsi_device_online(sdev))) {
		scsi_kill_request(req, q);

		/*
		 * printk() might take a while on slow consoles.
		 * Prevent softlockups by releasing the lock.
		 */
		spin_unlock_irq(q->queue_lock);
		sdev_printk(KERN_ERR, sdev,
			    "rejecting I/O to offline device\n");
		spin_lock_irq(q->queue_lock);
		continue;
	}

I see that the lock is released also in several other situations.
Therefore it looks safe. Also handling too many requests without
releasing the lock seems to be a bad idea in general. I think
that this solution was already suggested earlier.

Please, note that I moved scsi_kill_request() up. It looks natural
to remove it from the queue before we release the queue lock.

Best Regards,
Petr

BTW: Your mail had strange formatting. Please, try to avoid using
html.


Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64

2018-04-06 Thread Dominik Brodowski
On Fri, Apr 06, 2018 at 10:23:22AM +0200, Ingo Molnar wrote:
> 
> * Dominik Brodowski  wrote:
> 
> > On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > > Ok, this series looks mostly good to me, but AFAICS this breaks the UML 
> > > build:
> > > 
> > >  make[2]: *** No rule to make target 'archheaders'.  Stop.
> > >  arch/um/Makefile:119: recipe for target 'archheaders' failed
> > >  make[1]: *** [archheaders] Error 2
> > >  make[1]: *** Waiting for unfinished jobs
> > 
> > Ah, that's caused by patch 8/8 which I did and do not like all that much
> > anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
> > __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> > referenced there. Fixup patch below; could be folded with patch 8/8. Or
> > patch 8/8 could simply be dropped from the series altogether...
> 
> I still like the 'truth in advertising' aspect. For example if I see this in
> the syscall table:
> 
>  10  common  mprotect  __sys_x86_mprotect
> 
> I can immediately find the _real_ syscall entry point:
> 
> 81180a10 <__sys_x86_mprotect>:
> 81180a10:   48 8b 57 60             mov    0x60(%rdi),%rdx
> 81180a14:   48 8b 77 68             mov    0x68(%rdi),%rsi
> 81180a18:   b9 ff ff ff ff          mov    $0xffffffff,%ecx
> 81180a1d:   48 8b 7f 70             mov    0x70(%rdi),%rdi
> 81180a21:   e8 fa fc ff ff          callq  81180720
> 81180a26:   48 98                   cltq
> 81180a28:   c3                      retq
> 81180a29:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> 
> If, on the other hand, I see this entry:
> 
>  10  common  mprotect  sys_mprotect
> 
> Then, as a first step, no symbol anywhere matches with this:
> 
>  triton:~/tip> grep sys_mprotect System.map
>  triton:~/tip>
> 
> "sys_mprotect" does not exist in any easily discoverable sense. You have to
> *know* to replace the sys_ prefix with __sys_x86_ to find it.
> 
> Now arguably we could use a __sys_ prefix instead of the grep-barrier
> __sys_x86 prefix - but that too would be somewhat confusing I think.
> 
> I mean, the fact that we are passing in a ptregs pointer is a complexity of
> the x86 kernel that *exists*, why hide it and make it harder to discover
> what's happening, for something as important as system calls?
> 
> In terms of UML breakage, UML arguably is tightly coupled to its host
> architecture:
> 
> > Subject: [PATCH] syscalls/x86: fix UML syscall table
> 
> Even with your patch applied I still see build failures:
> 
>   $ make ARCH=um defconfig
>   $ make ARCH=um linux
>   ...
>   arch/um/os-Linux/signal.c: In function ‘hard_handler’:
>   arch/um/os-Linux/signal.c:163:22: error: dereferencing pointer to incomplete type ‘struct ucontext’
>     mcontext_t *mc = &uc->uc_mcontext;
>     ^~
>   scripts/Makefile.build:324: recipe for target 'arch/um/os-Linux/signal.o' failed
>   make[1]: *** [arch/um/os-Linux/signal.o] Error 1

That build failure is already present in mainline as of 38c23685b273
(when building on Arch / gcc-7.3.1; building on Debian oldstable / gcc-4.9
works fine). And -- just checked -- this build failure also exists for
plain v4.16.

Thanks,
Dominik


Re: make xmldocs failed with error after 4.17 merge period

2018-04-06 Thread Greg KH
On Fri, Apr 06, 2018 at 11:15:55AM +0300, Heikki Krogerus wrote:
> On Fri, Apr 06, 2018 at 09:57:34AM +0200, Greg KH wrote:
> > On Fri, Apr 06, 2018 at 10:51:09AM +0300, Heikki Krogerus wrote:
> > > On Fri, Apr 06, 2018 at 12:38:42PM +0900, Masanari Iida wrote:
> > > > After merge following patch during 4.17 merger period,
> > > > make xmldocs start to fail with error.
> > > > 
> > > >  [bdecb33af34f79cbfbb656661210f77c8b8b5b5f]
> > > > usb: typec: API for controlling USB Type-C Multiplexers
> > > > 
> > > > Error messages.
> > > > reST markup error:
> > > > /home/iida/Repo/linux-2.6/Documentation/driver-api/usb/typec.rst:215:
> > > > (SEVERE/4) Unexpected section title or transition.
> > > > 
> > > > 
> > > > Documentation/Makefile:93: recipe for target 'xmldocs' failed
> > > > make[1]: *** [xmldocs] Error 1
> > > > Makefile:1527: recipe for target 'xmldocs' failed
> > > > make: *** [xmldocs] Error 2
> > > > 
> > > > $
> > > > 
> > > > An ascii graphic in typec.rst cause the error.
> > > 
> > > Thanks for the report. I'm going to propose that we fix this by
> > > marking the ascii art as comment:
> > > 
> > > diff --git a/Documentation/driver-api/usb/typec.rst 
> > > b/Documentation/driver-api/usb/typec.rst
> > > index feb31946490b..972c11bf4141 100644
> > > --- a/Documentation/driver-api/usb/typec.rst
> > > +++ b/Documentation/driver-api/usb/typec.rst
> > > @@ -212,7 +212,7 @@ port drivers can use USB Role Class API with those.
> > > 
> > >  Illustration of the muxes behind a connector that supports an alternate 
> > > mode:
> > > 
> > > - 
> > > +..   
> > >   |   Connector  |
> > >   
> > >  | |
> > > 
> > > I hope that works.
> > 
> > Try it and see!  :)
> 
> It will fix this issue. I was just wondering if use of ascii art is
> acceptable in general with the .rst files? But then again, why
> wouldn't it be.

There are ways to do this, look at how the v4l2 and I think the drm
subsystems handle ascii art such that "real" drawings end up being
produced.

thanks,

greg k-h


Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64

2018-04-06 Thread Dominik Brodowski
On Fri, Apr 06, 2018 at 10:23:22AM +0200, Ingo Molnar wrote:
> 
> * Dominik Brodowski  wrote:
> 
> > On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > > Ok, this series looks mostly good to me, but AFAICS this breaks the UML 
> > > build:
> > > 
> > >  make[2]: *** No rule to make target 'archheaders'.  Stop.
> > >  arch/um/Makefile:119: recipe for target 'archheaders' failed
> > >  make[1]: *** [archheaders] Error 2
> > >  make[1]: *** Waiting for unfinished jobs
> > 
> > Ah, that's caused by patch 8/8 which I did and do not like all that much
> > anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
> > __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> > referenced there. Fixup patch below; could be folded with patch 8/8. Or
> > patch 8/8 could simply be dropped from the series altogether...
> 
> I still like the 'truth in advertising' aspect. For example if I see this in
> the syscall table:
> 
>  10  common  mprotect  __sys_x86_mprotect
> 
> I can immediately find the _real_ syscall entry point:
> 
> 81180a10 <__sys_x86_mprotect>:
> 81180a10:   48 8b 57 60             mov    0x60(%rdi),%rdx
> 81180a14:   48 8b 77 68             mov    0x68(%rdi),%rsi
> 81180a18:   b9 ff ff ff ff          mov    $0xffffffff,%ecx
> 81180a1d:   48 8b 7f 70             mov    0x70(%rdi),%rdi
> 81180a21:   e8 fa fc ff ff          callq  81180720
> 81180a26:   48 98                   cltq
> 81180a28:   c3                      retq
> 81180a29:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> 
> If, on the other hand, I see this entry:
> 
>  10  common  mprotect  sys_mprotect
> 
> Then, as a first step, no symbol anywhere matches with this:
> 
>  triton:~/tip> grep sys_mprotect System.map
>  triton:~/tip>
> 
> "sys_mprotect" does not exist in any easily discoverable sense. You have to
> *know* to replace the sys_ prefix with __sys_x86_ to find it.
> 
> Now arguably we could use a __sys_ prefix instead of the grep-barrier
> __sys_x86 prefix - but that too would be somewhat confusing I think.

Well, if looking at the ARCH="um" kernel, you won't find the
__sys_x86_mprotect there in its System.map -- so we either have to
disentangle um and plain x86, or live with some cause for confusion.

__sys_mprotect as prefix won't work by the way, as the double-underscore
__sys_ variant is already used in net/* for internal syscall helpers.

Thanks,
Dominik


Re: [PATCH] tpm: moves the delay_msec increment after sleep in tpm_transmit()

2018-04-06 Thread Nayna Jain



On 04/05/2018 03:42 PM, Jarkko Sakkinen wrote:

On Mon, Apr 02, 2018 at 09:50:06PM +0530, Nayna Jain wrote:

Commit e2fb992d82c6 ("tpm: add retry logic") introduced a new loop to
handle the TPM2_RC_RETRY error. The loop retries the command after
sleeping for the specified time, which is incremented exponentially in
every iteration. This patch fixes the initial sleep to be the default
sleep time.

I think I understand the code change but do not understand the
long description.


It says that the first sleep is delay_msec * 2 instead of delay_msec.




Fixes: commit e2fb992d82c6 ("tpm: add retry logic")
Signed-off-by: Nayna Jain 
Reviewed-by: Mimi Zohar 
---
  drivers/char/tpm/tpm-interface.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index c43a9e28995e..6201aab374e6 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -587,7 +587,7 @@ ssize_t tpm_transmit(struct tpm_chip *chip, struct 
tpm_space *space,
 */
if (rc == TPM2_RC_TESTING && cc == TPM2_CC_SELF_TEST)
break;
-   delay_msec *= 2;
+

Extra whitespace


I left just for clarity, but if not needed then I can remove it.

Thanks & Regards,
   - Nayna




if (delay_msec > TPM2_DURATION_LONG) {
if (rc == TPM2_RC_RETRY)
dev_err(&chip->dev, "in retry loop\n");
@@ -597,6 +597,7 @@ ssize_t tpm_transmit(struct tpm_chip *chip, struct 
tpm_space *space,
break;
}
tpm_msleep(delay_msec);
+   delay_msec *= 2;
memcpy(buf, save, save_size);
}
return ret;
--
2.13.6


/Jarkko
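
For readers following the ordering question above, here is a minimal userspace sketch of the two loop orderings. It is not the driver code: the function name collect_delays, the starting delay and the cap are hypothetical stand-ins, and the array write merely stands in for tpm_msleep().

```c
#include <stddef.h>

/*
 * Userspace sketch only: records the delays a retry loop would sleep for.
 * double_before_sleep != 0 models the old ordering (delay doubled before
 * the sleep); 0 models the patched ordering (delay doubled after it).
 * start_ms and max_ms are hypothetical stand-ins for the driver's values.
 */
static size_t collect_delays(unsigned int start_ms, unsigned int max_ms,
			     int double_before_sleep,
			     unsigned int *out, size_t out_len)
{
	unsigned int delay_ms = start_ms;
	size_t n = 0;

	while (n < out_len) {
		if (double_before_sleep)
			delay_ms *= 2;
		if (delay_ms > max_ms)
			break;		/* give up, as the retry loop does */
		out[n++] = delay_ms;	/* stands in for tpm_msleep(delay_ms) */
		if (!double_before_sleep)
			delay_ms *= 2;
	}
	return n;
}
```

With a start of 20 ms and a cap of 100 ms (both made up for illustration), the old ordering records 40 and 80, while the patched ordering records 20, 40 and 80, so the first sleep uses the default delay.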





Re: [RFC] virtio: Use DMA MAP API for devices without an IOMMU

2018-04-06 Thread Benjamin Herrenschmidt
On Fri, 2018-04-06 at 00:16 -0700, Christoph Hellwig wrote:
> On Fri, Apr 06, 2018 at 08:23:10AM +0530, Anshuman Khandual wrote:
> > On 04/06/2018 02:48 AM, Benjamin Herrenschmidt wrote:
> > > On Thu, 2018-04-05 at 21:34 +0300, Michael S. Tsirkin wrote:
> > > > > In this specific case, because that would make qemu expect an iommu,
> > > > > and there isn't one.
> > > > 
> > > > 
> > > > I think that you can set iommu_platform in qemu without an iommu.
> > > 
> > > No I mean the platform has one but it's not desirable for it to be used
> > > due to the performance hit.
> > 
> > Also the only requirement is to bounce the I/O buffers through SWIOTLB
> > implemented as DMA API which the virtio core understands. There is no
> > need for an IOMMU to be involved for the device representation in this
> > case IMHO.
> 
> This whole virtio translation issue is a mess.  I think we need to
> switch it to the dma API, and then quirk the legacy case to always
> use the direct mapping inside the dma API.

Fine with using a dma API always on the Linux side, but we do want to
special case virtio still at the arch and qemu side to have a "direct
mapping" mode. Not sure how (special flags on PCI devices) to avoid
actually going through an emulated IOMMU on the qemu side, because that
slows things down, esp. with vhost.

IE, we can't I think just treat it the same as a physical device.

Cheers,
Ben.



Re: [PATCH] genirq: only scan the present CPUs

2018-04-06 Thread Dou Liyang

Hi Thomas, Peter,

At 04/03/2018 07:23 PM, Peter Zijlstra wrote:

On Tue, Apr 03, 2018 at 12:25:56PM +0200, Thomas Gleixner wrote:

On Mon, 2 Apr 2018, Li RongQing wrote:


lots of application will read /proc/stat, like ps and vmstat, but we
find the reading time are spreading on Purley platform which has lots
of possible CPUs and interrupt.

To reduce the reading time, only scan the present CPUs, not all possible
CPUs, which speeds the reading of /proc/stat 20 times on Purley platform
which has 56 present CPUs, and 224 possible CPUs


Why is BIOS/ACPI telling the kernel that there are 224 possible CPUs unless
it supports physical CPU hotplug.


BIOS is crap, news at 11. I've got boxes like that too. Use
possible_cpu=$nr if you're bothered by it -- it's what I do.



Yes, I think so. It is a manual way to reset the number.

For this situation, I am investigating how to restrict the number of
possible CPUs automatically. But, due to the limitation of the ACPI
subsystem, I can do it _before_ setup_percpu_area where the number will
be used.

But I can provide an indicator later to tell the system whether
physical CPU hotplug is supported or not. Can we use this indicator
like this in this situation:

   if true
      Using for_each_possible_cpu(cpu)
   else
      Using for_each_present_cpu(cpu)



Thanks,

dou









Re: 答复: Re: [PATCH v2] scsi: Introduce sdev_printk_ratelimited to throttlefrequent printk

2018-04-06 Thread Petr Mladek
On Fri 2018-04-06 10:30:16, Petr Mladek wrote:
> On Tue 2018-04-03 14:19:43, wen.yan...@zte.com.cn wrote:
> > On the other hand,queue_lock is big, looping doing something under spinlock 
> > 
> > may locked many things and taking a long time, may cause some problems.
> > 
> > So This code needs to be optimized later:
> > 
> > scsi_request_fn()
> > {
> > for (;;) {
> > int rtn;
> > /*
> >  * get next queueable request.  We do this early to make sure
> >  * that the request is fully prepared even if we cannot
> >  * accept it.
> >  */
> > 
> > req = blk_peek_request(q);
> > 
> > if (!req)
> > break;
> > 
> > if (unlikely(!scsi_device_online(sdev))) {
> > sdev_printk(KERN_ERR, sdev,
> > "rejected I/O to offline device\n");
> > scsi_kill_request(req, q);
> > continue;
> > 
> > ^ still under spinlock
> > }
> 
> I wonder if the following might be the best solution after all:
> 
> 	if (unlikely(!scsi_device_online(sdev))) {
> 		scsi_kill_request(req, q);
> 
> 		/*
> 		 * printk() might take a while on slow consoles.
> 		 * Prevent softlockups by releasing the lock.
> 		 */
> 		spin_unlock_irq(q->queue_lock);
> 		sdev_printk(KERN_ERR, sdev,
> 			    "rejecting I/O to offline device\n");
> 		spin_lock_irq(q->queue_lock);
> 		continue;
> 	}
> 
> I see that the lock is released also in several other situations.
> Therefore it looks safe. Also handling too many requests without
> releasing the lock seems to be a bad idea in general. I think
> that this solution was already suggested earlier.

Just to be sure: is it safe to kill the first few requests and proceed
with the others?

I wonder if the device could actually get online without releasing
the queue lock. If not, the existing code normally kills all requests.

I wonder if a local flag might actually help to reduce the number
of messages but keep the existing behavior. I mean something like

static void scsi_request_fn(struct request_queue *q)
{
	struct scsi_device *sdev = q->queuedata;
	                    ^
	                    The device is the same for each request
	                    in this queue.

	struct request *req;
+	bool offline_reported = false;

	/*
	 * To start with, we keep looping until the queue is empty, or until
	 * the host is no longer able to accept any more requests.
	 */
	shost = sdev->host;
	for (;;) {
		int rtn;
		req = blk_peek_request(q);
		if (!req)
			break;

		if (unlikely(!scsi_device_online(sdev))) {
+			if (!offline_reported) {
				sdev_printk(KERN_ERR, sdev,
					    "rejecting I/O to offline device\n");
+				offline_reported = true;
+			}
			scsi_kill_request(req, q);
			continue;
		}


Please, note that I am not familiar with the scsi code. I am involved
because this is printk related. Unfortunately, we could not make
printk() faster. The main principle is to get messages on the console
ASAP. Nobody knows when the system might die and any message might
be important.

Best Regards,
Petr
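
The local-flag idea above can be modelled in userspace to see its effect on the message count. This is only a sketch under simplifying assumptions: the names drain_queue and struct drain_stats are hypothetical, a plain counter stands in for the request queue, and fprintf() stands in for sdev_printk().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Counters returned by the sketch so the effect can be observed. */
struct drain_stats {
	size_t killed;		/* requests dropped */
	size_t messages;	/* log lines emitted */
};

/*
 * Userspace model only, not kernel code. "pending" stands in for the
 * request queue and "online" for scsi_device_online(sdev); both are
 * hypothetical simplifications.
 */
static struct drain_stats drain_queue(size_t pending, bool online)
{
	struct drain_stats st = { 0, 0 };
	bool offline_reported = false;	/* local flag, reset on every call */

	while (pending > 0) {
		if (online)
			break;	/* the real function would dispatch here */

		if (!offline_reported) {
			fprintf(stderr, "rejecting I/O to offline device\n");
			st.messages++;
			offline_reported = true;
		}
		pending--;	/* stands in for scsi_kill_request() */
		st.killed++;
	}
	return st;
}
```

In this model every pending request is still killed, as in the existing behavior, but each invocation emits at most one message, regardless of how many requests are queued.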


Re: [PATCH] genirq: only scan the present CPUs

2018-04-06 Thread Peter Zijlstra
On Fri, Apr 06, 2018 at 04:42:14PM +0800, Dou Liyang wrote:
> Hi Thomas, Peter,
> 
> At 04/03/2018 07:23 PM, Peter Zijlstra wrote:
> > On Tue, Apr 03, 2018 at 12:25:56PM +0200, Thomas Gleixner wrote:
> > > On Mon, 2 Apr 2018, Li RongQing wrote:
> > > 
> > > > lots of application will read /proc/stat, like ps and vmstat, but we
> > > > find the reading time are spreading on Purley platform which has lots
> > > > of possible CPUs and interrupt.
> > > > 
> > > > To reduce the reading time, only scan the present CPUs, not all possible
> > > > CPUs, which speeds the reading of /proc/stat 20 times on Purley platform
> > > > which has 56 present CPUs, and 224 possible CPUs
> > > 
> > > Why is BIOS/ACPI telling the kernel that there are 224 possible CPUs 
> > > unless
> > > it supports physical CPU hotplug.
> > 
> > BIOS is crap, news at 11. I've got boxes like that too. Use
> > possible_cpu=$nr if you're bothered by it -- it's what I do.
> > 
> 
> Yes, I think so. it is a manual way to reset the number.
> 
> For this situation, I am investigating to restrict the number of
> possible CPUs automatically, But, due to the limitation of ACPI
> subsystem, I can do it _before_ setup_percpu_area where the number will
> be used.
> 
> But I can provide an indicator later to tell the system whether
> physical CPU hotplug is supported or not. Can we use this indicator
> like this in this situation:

If anything you should fix up the enumeration; not random users after
the fact.

So if you see it enumerates a gazillion empty spots but the system does
not in fact support physical hotplug, we should discard those.


mmotm git tree since-4.16 branch created (was: mmotm 2018-04-05-16-59 uploaded)

2018-04-06 Thread Michal Hocko
I have just created since-4.16 branch in mm git tree
(http://git.kernel.org/?p=linux/kernel/git/mhocko/mm.git;a=summary). It
is based on v2018-04-05-16-59 tag in Linus tree and mmotm-2018-04-05-16-59.

As usual mmotm trees are tagged with signed tag
(finger print BB43 1E25 7FB8 660F F2F1 D22D 48E2 09A2 B310 E347)

The shortlog says:
AKASHI Takahiro (1):
  kernel/kexec_file.c: add walk_system_ram_res_rev()

Aaron Lu (3):
  mm/free_pcppages_bulk: update pcp->count inside
  mm/free_pcppages_bulk: do not hold lock when picking pages to free
  mm/free_pcppages_bulk: prefetch buddy while not holding lock

Alexey Dobriyan (26):
  mm/slab_common.c: mark kmalloc machinery as __ro_after_init
  slab: fixup calculate_alignment() argument type
  slab: make kmalloc_index() return "unsigned int"
  slab: make kmalloc_size() return "unsigned int"
  slab: make create_kmalloc_cache() work with 32-bit sizes
  slab: make create_boot_cache() work with 32-bit sizes
  slab: make kmem_cache_create() work with 32-bit sizes
  slab: make size_index[] array u8
  slab: make size_index_elem() unsigned int
  slub: make ->remote_node_defrag_ratio unsigned int
  slub: make ->max_attr_size unsigned int
  slub: make ->red_left_pad unsigned int
  slub: make ->reserved unsigned int
  slub: make ->align unsigned int
  slub: make ->inuse unsigned int
  slub: make ->cpu_partial unsigned int
  slub: make ->offset unsigned int
  slub: make ->object_size unsigned int
  slub: make ->size unsigned int
  slab: make kmem_cache_flags accept 32-bit object size
  kasan: make kasan_cache_create() work with 32-bit slab cache sizes
  slab: make usercopy region 32-bit
  slub: make slab_index() return unsigned int
  slub: make struct kmem_cache_order_objects::x unsigned int
  slub: make size_from_object() return unsigned int
  slab: use 32-bit arithmetic in freelist_randomize()

Andi Kleen (1):
  drivers/media/platform/sti/delta/delta-ipc.c: fix read buffer overflow

Andrew Morton (5):
  z3fold-fix-memory-leak-fix
  list_lru-prefetch-neighboring-list-entries-before-acquiring-lock-fix
  mm-oom-cgroup-aware-oom-killer-fix
  mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix-2-fix
  fs-fsnotify-account-fsnotify-metadata-to-kmemcg-fix

Andrey Konovalov (4):
  kasan, slub: fix handling of kasan_slab_free hook
  kasan-slub-fix-handling-of-kasan_slab_free-hook-v2
  kasan: fix invalid-free test crashing the kernel
  kasan: prevent compiler from optimizing away memset in tests

Andrey Ryabinin (5):
  mm/vmscan: update stale comments
  mm/vmscan: remove redundant current_may_throttle() check
  mm/vmscan: don't change pgdat state on base of a single LRU list state
  mm/vmscan: don't mess with pgdat->flags in memcg reclaim
  mm/kasan: don't vfree() nonexistent vm_area

Andy Shevchenko (1):
  mm: reuse DEFINE_SHOW_ATTRIBUTE() macro

Anshuman Khandual (1):
  mm/migrate: rename migration reason MR_CMA to MR_CONTIG_RANGE

Arnd Bergmann (1):
  mm/hmm: fix header file if/else/endif maze, again

Baoquan He (4):
  mm/sparse.c: add a static variable nr_present_sections
  mm/sparsemem.c: defer the ms->section_mem_map clearing
  mm/sparse.c: add a new parameter 'data_unit_size' for alloc_usemap_and_memmap
  kernel/kexec_file.c: load kernel at top of system RAM if required

Changbin Du (1):
  scripts/faddr2line: show the code context

Chintan Pandya (1):
  mm/slub.c: use jitter-free reference while printing age

Claudio Imbrenda (2):
  mm/ksm: fix interaction with THP
  mm/ksm.c: fix inconsistent accounting of zero pages

Colin Ian King (3):
  mm/ksm.c: make stable_node_dup() static
  mm/swap_state.c: make bool enable_vma_readahead and swap_vma_readahead() static
  mm/swapfile.c: make pointer swap_avail_heads static

Dan Williams (3):
  mm, powerpc: use vma_kernel_pagesize() in vma_mmu_pagesize()
  mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct
  device-dax: implement ->pagesize() for smaps to report MMUPageSize

David Rientjes (6):
  mm, page_alloc: extend kernelcore and movablecore for percent
  mm, page_alloc: move mirrored_kernelcore to __meminitdata
  mm, compaction: drain pcps for zone when kcompactd fails
  mm, page_alloc: wakeup kcompactd even if kswapd cannot free more memory
  mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes
  mm: memcg: remote memcg charging for kmem allocations fix

David Woodhouse (1):
  mm: always print RLIMIT_DATA warning

Dou Liyang (3):
  mm/kmemleak.c: make kmemleak_boot_config() __init
  mm/page_owner.c: make early_page_owner_param() __init
  mm/page_poison.c: make early_page_poison_param() __init

Guenter Roeck (1):
  include/linux/mm.h: provide consistent declaration for num_poisoned_pages

Howard McLauchlan (1):
  mm: make should_fai

Re: [PATCH] genirq: only scan the present CPUs

2018-04-06 Thread Peter Zijlstra
On Fri, Apr 06, 2018 at 11:02:28AM +0200, Peter Zijlstra wrote:
> On Fri, Apr 06, 2018 at 04:42:14PM +0800, Dou Liyang wrote:
> > Hi Thomas, Peter,
> > 
> > At 04/03/2018 07:23 PM, Peter Zijlstra wrote:
> > > On Tue, Apr 03, 2018 at 12:25:56PM +0200, Thomas Gleixner wrote:
> > > > On Mon, 2 Apr 2018, Li RongQing wrote:
> > > > 
> > > > > lots of application will read /proc/stat, like ps and vmstat, but we
> > > > > find the reading time are spreading on Purley platform which has lots
> > > > > of possible CPUs and interrupt.
> > > > > 
> > > > > To reduce the reading time, only scan the present CPUs, not all 
> > > > > possible
> > > > > CPUs, which speeds the reading of /proc/stat 20 times on Purley 
> > > > > platform
> > > > > which has 56 present CPUs, and 224 possible CPUs
> > > > 
> > > > Why is BIOS/ACPI telling the kernel that there are 224 possible CPUs 
> > > > unless
> > > > it supports physical CPU hotplug.
> > > 
> > > BIOS is crap, news at 11. I've got boxes like that too. Use
> > > possible_cpu=$nr if you're bothered by it -- it's what I do.
> > > 
> > 
> > Yes, I think so. it is a manual way to reset the number.
> > 
> > For this situation, I am investigating to restrict the number of
> > possible CPUs automatically, But, due to the limitation of ACPI
> > subsystem, I can do it _before_ setup_percpu_area where the number will
> > be used.

Ah, did you mean to say "I can _NOT_ do it"? Still I don't see the
point of frobbing random users if the whole thing is buggered.


Re: [PATCH 05/20] afs: Implement @sys substitution handling

2018-04-06 Thread David Howells
How about the attached changes?

Note that I've put the checks for ".", ".." and names containing '/' in the
functions where the strings for @sys and @cell are set.

David
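
As a standalone illustration of the name checks mentioned above, here is a small userspace sketch. The helper name name_is_plain_component is hypothetical and does not appear in the patch; it only mirrors the two tests (a leading '.' and an embedded '/') in isolation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace sketch only. Rejects names that start with '.', which covers
 * "." and ".." (and any other dot-leading name), and names that contain
 * a '/' anywhere in the first len bytes.
 */
static bool name_is_plain_component(const char *name, size_t len)
{
	if (len > 0 && name[0] == '.')
		return false;
	if (memchr(name, '/', len) != NULL)
		return false;
	return true;
}
```

Note that rejecting on the first character is stricter than rejecting only "." and "..": a name such as ".hidden" is refused as well.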
---
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index a14ea4280590..5e3a0ed2043f 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -760,6 +760,55 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry,
return inode;
 }
 
+/*
+ * Do a parallel recursive lookup.
+ *
+ * Ideally, we'd call lookup_one_len(), but we can't because we'd need to be
+ * holding i_mutex but we only hold i_rwsem for read.
+ */
+struct dentry *afs_lookup_rec(struct dentry *dir, const char *name, int len)
+{
+   DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
+   struct dentry *dentry, *old;
+   struct inode *inode = dir->d_inode;
+   struct qstr this;
+   int ret;
+
+   this.name = name;
+   this.len = len;
+   this.hash = full_name_hash(dir, name, len);
+
+   if (unlikely(IS_DEADDIR(inode)))
+   return ERR_PTR(-ESTALE);
+
+again:
+   dentry = d_alloc_parallel(dir, &this, &wq);
+   if (IS_ERR(dentry))
+   return ERR_CAST(dentry);
+
+   if (unlikely(!d_in_lookup(dentry))) {
+   ret = dentry->d_op->d_revalidate(dentry, 0);
+   if (unlikely(ret <= 0)) {
+   if (!ret) {
+   d_invalidate(dentry);
+   dput(dentry);
+   goto again;
+   }
+   dput(dentry);
+   dentry = ERR_PTR(ret);
+   }
+   } else {
+   old = inode->i_op->lookup(inode, dentry, 0);
+   d_lookup_done(dentry); /* Clean up wq */
+   if (unlikely(old)) {
+   dput(dentry);
+   dentry = old;
+   }
+   }
+
+   return dentry;
+}
+
 /*
  * Look up an entry in a directory with @sys substitution.
  */
@@ -810,7 +859,7 @@ static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry,
strcpy(p, name);
read_unlock(&net->sysnames_lock);
 
-   ret = lookup_one_len(buf, parent, len);
+   ret = afs_lookup_rec(parent, buf, len);
if (IS_ERR(ret) || d_is_positive(ret))
goto out_b;
dput(ret);
diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c
index b70380e5d0e3..42d03a80310d 100644
--- a/fs/afs/dynroot.c
+++ b/fs/afs/dynroot.c
@@ -125,7 +125,7 @@ static struct dentry *afs_lookup_atcell(struct dentry *dentry)
if (!cell)
goto out_n;
 
-   ret = lookup_one_len(name, parent, len);
+   ret = afs_lookup_rec(parent, name, len);
if (IS_ERR(ret) || d_is_positive(ret))
goto out_n;
 
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 70dd41e06363..1dbcdafb25a0 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -682,6 +682,7 @@ extern const struct inode_operations afs_dir_inode_operations;
 extern const struct address_space_operations afs_dir_aops;
 extern const struct dentry_operations afs_fs_dentry_operations;
 
+extern struct dentry *afs_lookup_rec(struct dentry *, const char *, int);
 extern void afs_d_release(struct dentry *);
 
 /*
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 870b0bad03d0..47a0ac21ee44 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -393,6 +393,12 @@ static ssize_t afs_proc_rootcell_write(struct file *file,
if (IS_ERR(kbuf))
return PTR_ERR(kbuf);
 
+   ret = -EINVAL;
+   if (kbuf[0] == '.')
+   goto out;
+   if (memchr(kbuf, '/', size))
+   goto out;
+
/* trim to first NL */
s = memchr(kbuf, '\n', size);
if (s)
@@ -405,6 +411,7 @@ static ssize_t afs_proc_rootcell_write(struct file *file,
if (ret >= 0)
ret = size; /* consume everything, always */
 
+out:
kfree(kbuf);
_leave(" = %d", ret);
return ret;
@@ -775,27 +782,28 @@ static ssize_t afs_proc_sysname_write(struct file *file,
len = strlen(s);
if (len == 0)
continue;
-   if (len >= AFSNAMEMAX) {
-   sysnames->error = -ENAMETOOLONG;
-   ret = -ENAMETOOLONG;
-   goto out;
-   }
+   ret = -ENAMETOOLONG;
+   if (len >= AFSNAMEMAX)
+   goto error;
+
if (len >= 4 &&
s[len - 4] == '@' &&
s[len - 3] == 's' &&
s[len - 2] == 'y' &&
-   s[len - 1] == 's') {
+   s[len - 1] == 's')
/* Protect against recursion */
-   sysnames->error = -EINVAL;
-   ret = -EINVAL;
-   goto out;
-   }

Re: [RfC PATCH] Add udmabuf misc device

2018-04-06 Thread Gerd Hoffmann
  Hi,

> >   * The general interface should be able to express sharing from any
> > guest:guest, not just guest:host.  Arbitrary G:G sharing might be
> > something some hypervisors simply aren't able to support, but the
> > userspace API itself shouldn't make assumptions or restrict that.  I
> > think ideally the sharing API would include some kind of
> > query_targets interface that would return a list of VM's that your
> > current OS is allowed to share with; that list would be depend on the
> > policy established by the system integrator, but obviously wouldn't
> > include targets that the hypervisor itself wouldn't be capable of
> > handling.

> Can you give a use-case for this? I mean that the system integrator
> is the one who defines which guests/hosts talk to each other,
> but querying means that it is possible that VMs have some sort
> of discovery mechanism, so they can decide on their own whom
> to connect to.

Note that vsock (created by vmware, these days also has a virtio
transport for kvm) started with support for both guest <=> host and
guest <=> guest support.  But later on guest <=> guest was dropped.
As far as I know the reasons were (a) lack of use cases and (b) security.

So, I likewise would like to know more details on the use cases you have
in mind here.  Unless we have a compelling use case here I'd suggest
dropping the guest <=> guest requirement as it makes the whole thing a lot
more complex.

> >   * The sharing API could be used to share multiple kinds of content in a
> > single system.  The sharing sink driver running in the content
> > producer's VM should accept some additional metadata that will be
> > passed over to the target VM as well.

Not sure this should be part of hyper-dmabuf.  A dma-buf is nothing but
a block of data, period.  Therefore protocols with dma-buf support
(wayland for example) typically already send over metadata describing
the content, so duplicating that in hyper-dmabuf looks pointless.

> 1. We are targeting ARM and one of the major requirements for the buffer
> sharing is the ability to allocate physically contiguous buffers, which gets
> even more complicated for systems not backed with an IOMMU. So, for some
> use-cases it is enough to make the buffers contiguous in terms of IPA and
> sometimes those need to be contiguous in terms of PA.

Which pretty much implies the host must do the allocation.

> 2. For Xen we would love to see UAPI to create a dma-buf from grant
> references provided, so we can use this generic solution to implement
> zero-copying without breaking the existing Xen protocols. This can
> probably be extended to other hypervizors as well.

I'm not sure we can create something which works on both kvm and xen.
The memory management model is quite different ...


On xen the hypervisor manages all memory.  Guests can allow other guests
to access specific pages (using grant tables).  In theory any guest <=>
guest communication is possible.  In practice it is mostly guest <=> dom0
because guests access their virtual hardware that way.  dom0 is the
privileged guest which owns any hardware not managed by xen itself.

Xen guests can ask the hypervisor to update the mapping of guest
physical pages.  They can balloon down (unmap and free pages).  They can
balloon up (ask the hypervisor to map fresh pages).  They can map pages
exported by other guests using grant tables.  xen-zcopy makes heavy use
of this.  It balloons down to make room in the guest physical address
space, then maps the exported pages there, and finally composes a
dma-buf.


On kvm qemu manages all guest memory.  qemu also has all guest memory
mapped, so a grant-table like mechanism isn't needed to implement
virtual devices.  qemu can decide how it backs memory for the guest.
qemu propagates the guest memory map to the kvm driver in the linux
kernel.  kvm guests have some control over the guest memory map, for
example they can map pci bars wherever they want in their guest physical
address space by programming the base registers accordingly, but unlike
xen guests they can't ask the host to remap individual pages.

Due to qemu having all guest memory mapped virtual devices are typically
designed to have the guest allocate resources, then notify the host
where they are located.  This is where the udmabuf idea comes from:
Guest tells the host (qemu) where the gem object is, and qemu then can
create a dmabuf backed by those pages to pass it on to other processes
such as the wayland display server.  Possibly even without the guest
explicitly asking for it, i.e. export the framebuffer placed by the
guest in the (virtual) vga pci memory bar as dma-buf.  And I can imagine
that this is useful outside virtualization too.


I fail to see any common ground for xen-zcopy and udmabuf ...

Besides that there is the problem that the udmabuf idea has its own share
of issues, for example the fork() issue pointed out by Christian
König[1].  So I still need to f

Re: KASAN: alloca-out-of-bounds Read in unwind_next_frame

2018-04-06 Thread Dmitry Vyukov
On Fri, Apr 6, 2018 at 2:02 AM, syzbot
 wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> 06dd3dfeea60e2a6457a6aedf97afc8e6d2ba497 (Thu Apr 5 03:07:20 2018 +)
> Merge tag 'char-misc-4.17-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=9d1d9866b0b8ee6e0a8c
>
> syzkaller reproducer:
> https://syzkaller.appspot.com/x/repro.syz?id=6478299081474048
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=5324218015154176
> Kernel config: https://syzkaller.appspot.com/x/.config?id=216543573824217049
> compiler: gcc (GCC) 8.0.1 20180301 (experimental)
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+9d1d9866b0b8ee6e0...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.


This looks like a false positive related to interactions between
alloca, interrupts, KASAN and -mno-red-zone. I've filed a gcc bug for
this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85230


> ==
> BUG: KASAN: alloca-out-of-bounds in __read_once_size
> include/linux/compiler.h:188 [inline]
> BUG: KASAN: alloca-out-of-bounds in unwind_next_frame.part.7+0x7ce/0x9c0
> arch/x86/kernel/unwind_frame.c:326
> Read of size 8 at addr 8801b05e67f8 by task syz-executor2/11326
>
> CPU: 0 PID: 11326 Comm: syz-executor2 Not tainted 4.16.0+ #1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x1b9/0x29f lib/dump_stack.c:53
>  print_address_description+0x6c/0x20b mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report.cold.7+0xac/0x2f5 mm/kasan/report.c:412
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>  __read_once_size include/linux/compiler.h:188 [inline]
>  unwind_next_frame.part.7+0x7ce/0x9c0 arch/x86/kernel/unwind_frame.c:326
>  unwind_next_frame+0x3e/0x50 arch/x86/kernel/unwind_frame.c:287
>  __save_stack_trace+0x6e/0xd0 arch/x86/kernel/stacktrace.c:44
>  save_stack_trace+0x1a/0x20 arch/x86/kernel/stacktrace.c:60
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:520
>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:527
>  __cache_free mm/slab.c:3486 [inline]
>  kmem_cache_free+0x86/0x2d0 mm/slab.c:3744
>  __d_free+0x20/0x30 fs/dcache.c:257
>  __rcu_reclaim kernel/rcu/rcu.h:178 [inline]
>  rcu_do_batch kernel/rcu/tree.c:2675 [inline]
>  invoke_rcu_callbacks kernel/rcu/tree.c:2930 [inline]
>  __rcu_process_callbacks kernel/rcu/tree.c:2897 [inline]
>  rcu_process_callbacks+0x941/0x15f0 kernel/rcu/tree.c:2914
>  __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
>  invoke_softirq kernel/softirq.c:365 [inline]
>  irq_exit+0x1d1/0x200 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:525 [inline]
>  smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
>  
> RIP: 0010:kasan_unpoison_shadow+0x1/0x50 mm/kasan/kasan.c:68
> RSP: 0018:8801b05e67e8 EFLAGS: 0202 ORIG_RAX: ff13
> RAX: 8801c1b24100 RBX:  RCX: 859d4219
> RDX:  RSI: 00a8 RDI: 8801b05e6760
> RBP: 8801b05e67f8 R08: 8801c1b24100 R09: ed003b6046c2
> R10: ed003b6046c2 R11: 8801db023613 R12: 0015
> R13: 0001 R14: 0016 R15: dc00
>  constrain_params_by_rules+0xbaa/0x1410 sound/core/pcm_param_trace.h:28
>  snd_pcm_hw_refine+0x8e9/0x1180 sound/core/pcm_native.c:502
>  snd_pcm_hw_param_mask sound/core/oss/pcm_oss.c:205 [inline]
>  snd_pcm_oss_change_params+0x8ce/0x3d10 sound/core/oss/pcm_oss.c:870
>  snd_pcm_oss_make_ready+0xe3/0x140 sound/core/oss/pcm_oss.c:1127
>  snd_pcm_oss_sync.isra.27+0x24b/0x850 sound/core/oss/pcm_oss.c:1651
>  snd_pcm_oss_release+0x214/0x290 sound/core/oss/pcm_oss.c:2446
>  __fput+0x34d/0x890 fs/file_table.c:209
>  fput+0x15/0x20 fs/file_table.c:243
>  task_work_run+0x1e4/0x290 kernel/task_work.c:113
>  exit_task_work include/linux/task_work.h:22 [inline]
>  do_exit+0x1aee/0x2730 kernel/exit.c:865
>  do_group_exit+0x16f/0x430 kernel/exit.c:968
>  get_signal+0x886/0x1960 kernel/signal.c:2469
>  do_signal+0x90/0x2020 arch/x86/kernel/signal.c:810
>  exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
>  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>  syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
>  do_syscall_64+0x792/0x9d0 arch/x86/entry/common.c:292
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x4552d9
> RSP: 002b:7fd689eb5c68 EFLAGS: 0246 ORIG_RAX: 0010
> RAX: 000

Re: [PATCH v7 2/5] arm: arm64: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()

2018-04-06 Thread Russell King - ARM Linux
On Thu, Apr 05, 2018 at 05:50:54AM -0700, Matthew Wilcox wrote:
> On Thu, Apr 05, 2018 at 08:44:12PM +0800, Jia He wrote:
> > 
> > 
> > On 4/5/2018 7:34 PM, Matthew Wilcox Wrote:
> > > On Thu, Apr 05, 2018 at 01:04:35AM -0700, Jia He wrote:
> > > > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> > > > where possible") optimized the loop in memmap_init_zone(). But there is
> > > > still some room for improvement. E.g. if pfn and pfn+1 are in the same
> > > > memblock region, we can simply pfn++ instead of doing the binary search
> > > > in memblock_next_valid_pfn.
> > > Sure, but I bet if we are >end_pfn, we're almost certainly going to the
> > > start_pfn of the next block, so why not test that as well?
> > > 
> > > > +   /* fast path, return pfn+1 if next pfn is in the same region */
> > > > +   if (early_region_idx != -1) {
> > > > +   start_pfn = PFN_DOWN(regions[early_region_idx].base);
> > > > +   end_pfn = PFN_DOWN(regions[early_region_idx].base +
> > > > +   regions[early_region_idx].size);
> > > > +
> > > > +   if (pfn >= start_pfn && pfn < end_pfn)
> > > > +   return pfn;
> > >   early_region_idx++;
> > >   start_pfn = PFN_DOWN(regions[early_region_idx].base);
> > >   if (pfn >= end_pfn && pfn <= start_pfn)
> > >   return start_pfn;
> > Thanks, thus the binary search in next step can be discarded?
> 
> I don't know all the circumstances in which this is called.  Maybe a linear
> search with memo is more appropriate than a binary search.

That's been brought up before, and the reasoning appears to be
something along the lines of...

Academic and published wisdom is that on cached architectures, binary
searches are bad because they don't operate efficiently due to the
overhead from having to load cache lines.  Consequently, there seems
to be a knee-jerk reaction that "all binary searches are bad, we must
eliminate them."

What is not being grasped here, though, is that the number of entries
in this array typically tends to be small, so the entire array takes up
one or two cache lines, maybe a maximum of four lines depending on your
cache line length and number of entries.

This means that the binary search expense is reduced, and is lower
than a linear search for the majority of cases.
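
To make the comparison concrete, here is a rough userspace sketch of
the two lookups being discussed: a plain binary search over a small
sorted region array, and the same search with a one-entry memo in front
of it.  The struct and function names are made up for illustration and
are not the memblock API; regions are assumed to be sorted and
non-overlapping.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memblock region, expressed in pfns. */
struct pfn_region {
	unsigned long start_pfn;	/* first valid pfn */
	unsigned long end_pfn;		/* one past the last valid pfn */
};

/*
 * Return the index of the region containing pfn, or -1 if none does.
 * regions[] must be sorted by start_pfn and must not overlap.
 */
static int region_search(const struct pfn_region *regions, size_t nr,
			 unsigned long pfn)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pfn < regions[mid].start_pfn)
			hi = mid;
		else if (pfn >= regions[mid].end_pfn)
			lo = mid + 1;
		else
			return (int)mid;
	}
	return -1;
}

/* Same lookup, but try the region found by the previous call first. */
static bool pfn_is_valid(const struct pfn_region *regions, size_t nr,
			 unsigned long pfn, int *memo)
{
	if (*memo >= 0 && (size_t)*memo < nr &&
	    pfn >= regions[*memo].start_pfn && pfn < regions[*memo].end_pfn)
		return true;

	*memo = region_search(regions, nr, pfn);
	return *memo >= 0;
}
```

With only a handful of regions the whole array sits in a few cache
lines, so the loop above touches very little memory either way; whether
the memo helps is exactly the kind of thing that would need measuring.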

What is key here as far as performance is concerned is whether the
general usage of pfn_valid() by the kernel is optimal.  We should
not optimise only for the boot case, which means evaluating the
effect of these changes with _real_ workloads, not just "does my
machine boot a few milliseconds faster".

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: make xmldocs failed with error after 4.17 merge period

2018-04-06 Thread Heikki Krogerus
On Fri, Apr 06, 2018 at 10:30:10AM +0200, Greg KH wrote:
> On Fri, Apr 06, 2018 at 11:15:55AM +0300, Heikki Krogerus wrote:
> > On Fri, Apr 06, 2018 at 09:57:34AM +0200, Greg KH wrote:
> > > On Fri, Apr 06, 2018 at 10:51:09AM +0300, Heikki Krogerus wrote:
> > > > On Fri, Apr 06, 2018 at 12:38:42PM +0900, Masanari Iida wrote:
> > > > > After merge following patch during 4.17 merger period,
> > > > > make xmldocs start to fail with error.
> > > > > 
> > > > >  [bdecb33af34f79cbfbb656661210f77c8b8b5b5f]
> > > > > usb: typec: API for controlling USB Type-C Multiplexers
> > > > > 
> > > > > Error messages.
> > > > > reST markup error:
> > > > > /home/iida/Repo/linux-2.6/Documentation/driver-api/usb/typec.rst:215:
> > > > > (SEVERE/4) Unexpected section title or transition.
> > > > > 
> > > > > 
> > > > > Documentation/Makefile:93: recipe for target 'xmldocs' failed
> > > > > make[1]: *** [xmldocs] Error 1
> > > > > Makefile:1527: recipe for target 'xmldocs' failed
> > > > > make: *** [xmldocs] Error 2
> > > > > 
> > > > > $
> > > > > 
> > > > > An ascii graphic in typec.rst cause the error.
> > > > 
> > > > Thanks for the report. I'm going to propose that we fix this by
> > > > marking the ascii art as comment:
> > > > 
> > > > diff --git a/Documentation/driver-api/usb/typec.rst 
> > > > b/Documentation/driver-api/usb/typec.rst
> > > > index feb31946490b..972c11bf4141 100644
> > > > --- a/Documentation/driver-api/usb/typec.rst
> > > > +++ b/Documentation/driver-api/usb/typec.rst
> > > > @@ -212,7 +212,7 @@ port drivers can use USB Role Class API with those.
> > > > 
> > > >  Illustration of the muxes behind a connector that supports an 
> > > > alternate mode:
> > > > 
> > > > - 
> > > > +..   
> > > >   |   Connector  |
> > > >   
> > > >  | |
> > > > 
> > > > I hope that works.
> > > 
> > > Try it and see!  :)
> > 
> > It will fix this issue. I was just wondering if use of ascii art is
> > acceptable in general with the .rst files? But then again, why
> > wouldn't it be.
> 
> There are ways to do this, look at how the v4l2 and I think the drm
> subsystems handle ascii art such that "real" drawings end up being
> produced.

Thanks. I did not actually find anything else except use of tables and
code-blocks in v4l documentation. Is that what you were referring to?

It was proposed that I use something called a "Literal Block" for ascii art.
http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#literal-blocks

Unless you object, that's what I will use.


Thanks,

-- 
heikki


Re: [PATCH] Documentation: typec.rst: Mark ascii art as a comment

2018-04-06 Thread Heikki Krogerus
On Fri, Apr 06, 2018 at 11:22:29AM +0300, Heikki Krogerus wrote:
> To prevent processing of ascii art as reStructuredText
> elements, marking it as a comment.

I will change this, and use literal-block instead.

> Reported-by: Masanari Iida 
> Fixes: bdecb33af34f ("usb: typec: API for controlling USB Type-C 
> Multiplexers")
> Signed-off-by: Heikki Krogerus 
> ---
>  Documentation/driver-api/usb/typec.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/driver-api/usb/typec.rst 
> b/Documentation/driver-api/usb/typec.rst
> index feb31946490b..972c11bf4141 100644
> --- a/Documentation/driver-api/usb/typec.rst
> +++ b/Documentation/driver-api/usb/typec.rst
> @@ -212,7 +212,7 @@ port drivers can use USB Role Class API with those.
>  
>  Illustration of the muxes behind a connector that supports an alternate mode:
>  
> - 
> +..   
>   |   Connector  |
>   
>  | |

Thanks,

-- 
heikki


Re: [PATCH V3 4/4] genirq/affinity: irq vector spread among online CPUs as far as possible

2018-04-06 Thread Ming Lei
Hi Thomas,

On Wed, Apr 04, 2018 at 09:38:26PM +0200, Thomas Gleixner wrote:
> On Wed, 4 Apr 2018, Ming Lei wrote:
> > On Wed, Apr 04, 2018 at 10:25:16AM +0200, Thomas Gleixner wrote:
> > > In the example above:
> > > 
> > > > > > irq 39, cpu list 0,4
> > > > > > irq 40, cpu list 1,6
> > > > > > irq 41, cpu list 2,5
> > > > > > irq 42, cpu list 3,7
> > > 
> > > and assumed that at driver init time only CPU 0-3 are online then the
> > > hotplug of CPU 4-7 will not result in any interrupt delivered to CPU 4-7.
> > 
> > Indeed, and I just tested this case, and found that no interrupts are
> > delivered to CPU 4-7.
> > 
> > In theory, the affinity has been assigned to these irq vectors, and
> > programmed to interrupt controller, I understand it should work.
> > 
> > Could you explain it a bit why interrupts aren't delivered to CPU 4-7?
> 
> As I explained before:
> 
> "If the device is already in use when the offline CPUs get hot plugged, then
>  the interrupts still stay on cpu 0-3 because the effective affinity of
>  interrupts on X86 (and other architectures) is always a single CPU."
> 
> IOW. If you set the affinity mask so it contains more than one CPU then the
> kernel selects a single CPU as target. The selected CPU must be online and
> if there is more than one online CPU in the mask then the kernel picks the
> one which has the least number of interrupts targeted at it. This selected
> CPU target is programmed into the corresponding interrupt chip
> (IOAPIC/MSI/MSIX) and it stays that way until the selected target CPU
> goes offline or the affinity mask changes.
> 
> The reasons why we use single target delivery on X86 are:
> 
>1) Not all X86 systems support multi target delivery
> 
>2) If a system supports multi target delivery then the interrupt is
>   preferrably delivered to the CPU with the lowest APIC ID (which
>   usually corresponds to the lowest CPU number) due to hardware magic
>   and only a very small percentage of interrupts are delivered to the
>   other CPUs in the multi target set. So the benefit is rather dubious
>   and extensive performance testing did not show any significant
>   difference.
> 
>3) The management of multi targets on the software side is painful as
>   the same low level vector number has to be allocated on all possible
>   target CPUs. That's making a lot of things including hotplug more
>   complex for very little - if at all - benefit.
> 
> So at some point we ripped out the multi target support on X86 and moved
> everything to single target delivery mode.
> 
> Other architectures never supported multi target delivery either due to
> hardware restrictions or for similar reasons why X86 dropped it. There
> might be a few architectures which support it, but I have no overview at
> the moment.
> 
> The information is in procfs
> 
> # cat /proc/irq/9/smp_affinity_list 
> 0-3
> # cat /proc/irq/9/effective_affinity_list 
> 1
> 
> # cat /proc/irq/10/smp_affinity_list 
> 0-3
> # cat /proc/irq/10/effective_affinity_list 
> 2
> 
> smp_affinity[_list] is the affinity which is set either by the kernel or by
> writing to /proc/irq/$N/smp_affinity[_list]
> 
> effective_affinity[_list] is the affinity which is effective, i.e. the
> single target CPU to which the interrupt is affine at this point.
> 
> As you can see in the above examples the target CPU is selected from the
> given possible target set and the internal spreading of the low level x86
> vector allocation code picks a CPU which has the lowest number of
> interrupts targeted at it.
> 
> Let's assume for the example below
> 
> # cat /proc/irq/10/smp_affinity_list 
> 0-3
> # cat /proc/irq/10/effective_affinity_list 
> 2
> 
> that CPU 3 was offline when the device was initialized. So there was no way
> to select it and when CPU 3 comes online there is no reason to change the
> affinity of that interrupt, at least not from the kernel POV. Actually we
> don't even have a mechanism to do so automagically.
> 
> If I offline CPU 2 after onlining CPU 3 then the kernel has to move the
> interrupt away from CPU 2, so it selects CPU 3 as it's the one with the
> lowest number of interrupts targeted at it.
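
The selection rule described above can be sketched in plain C as
follows.  This is only a userspace model under my own assumptions (an
8-CPU bitmask and a per-CPU interrupt count array); none of the names
are kernel functions, and the real vector allocation code is more
involved.

```c
#include <limits.h>

#define NR_CPUS_SKETCH 8

/*
 * Pick the single effective target: among the CPUs set in both
 * affinity_mask and online_mask, choose the one with the fewest
 * interrupts already targeted at it.  Returns -1 if no CPU qualifies.
 */
static int pick_effective_cpu(unsigned int affinity_mask,
			      unsigned int online_mask,
			      const unsigned int irq_count[NR_CPUS_SKETCH])
{
	unsigned int candidates = affinity_mask & online_mask;
	unsigned int best_count = UINT_MAX;
	int best = -1;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (!(candidates & (1u << cpu)))
			continue;
		if (best < 0 || irq_count[cpu] < best_count) {
			best_count = irq_count[cpu];
			best = cpu;
		}
	}
	return best;
}
```

With affinity 0-3, only CPUs 0-2 online and CPU 2 the least loaded of
those, this returns 2.  The point of the explanation is that the choice
is only redone when the current target goes offline or the mask
changes, so onlining CPU 3 alone does not trigger a new pick.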
> 
> Now this is a bit different if you use affinity managed interrupts like
> NVME and other devices do.
> 
> Many of these devices create one queue per possible CPU, so the spreading
> is simple; One interrupt per possible cpu. Pretty boring.
> 
> When the device has less queues than possible CPUs, then stuff gets more
> interesting. The queues and therefore the interrupts must be targeted at
> multiple CPUs. There is some logic which spreads them over the numa nodes
> and takes siblings into account when Hyperthreading is enabled.
> 
> In both cases the managed interrupts are handled over CPU soft
> hotplug/unplug:
> 
>   1) If a CPU is soft unplugged and an interrupt is targeted at the CPU
>  then the interrupt is either moved to a still online CPU in t

Re: [PATCH v4 3/9] vsprintf: Do not check address of well-known strings

2018-04-06 Thread Petr Mladek
On Thu 2018-04-05 15:30:51, Rasmus Villemoes wrote:
> On 2018-04-04 10:58, Petr Mladek wrote:
> > We are going to check the address using probe_kernel_address(). It will
> > be more expensive and it does not make sense for well known address.
> > 
> > This patch splits the string() function. The variant without the check
> > is then used on locations that handle string constants or strings defined
> > as local variables.
> > 
> > This patch does not change the existing behavior.
> 
> Please leave string() alone, except for moving the < PAGE_SIZE check to
> a new helper checked_string (feel free to find a better name), and use
> checked_string for handling %s and possibly the few other cases where
> we're passing a user-supplied pointer. That avoids cluttering the entire
> file with double-underscore calls, and e.g. in the %pO case, it's easier
> to understand why one uses two different *string() helpers if the name
> of one somehow conveys how it is different from the other.

I understand your reasoning. I thought about exactly this as well.
My problem is that string() will then be unsafe. It might be dangerous
when porting patches.

This is why I wanted a different name for the variant without the
check. But I was not able to come up with anything short and clear
at the same time.

Is _string() really that bad? I think that it is a rather common
practice to use _func() for functions that are less safe than func()
variants. People should use _func() variants with care and this is
what we want here.
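
To illustrate the naming I have in mind, here is a small userspace
sketch.  It is not the vsprintf.c code: the signatures, the use of
snprintf() and the 4096-byte page size are assumptions made only so the
example stands alone.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL

/*
 * Unchecked variant: the caller guarantees that s is a valid string.
 * (Leading-underscore names are reserved in ordinary userspace C; the
 * name is kept here only to mirror the convention under discussion.)
 */
static int _string(char *buf, size_t size, const char *s)
{
	return snprintf(buf, size, "%s", s);
}

/* Checked variant: replace pointers into the first page, then defer. */
static int string(char *buf, size_t size, const char *s)
{
	if ((uintptr_t)s < SKETCH_PAGE_SIZE)
		s = "(null)";
	return _string(buf, size, s);
}
```

A patch ported from an older tree that calls string() keeps the check
without anyone having to think about it; only callers that knowingly
pass constants would switch to _string().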

In addition, it is an internal API. IMHO, only a few people make changes
there. They will get used to it quickly. That is not true for people
who might need to port patches.

Best Regards,
Petr


Re: [PATCH] Documentation: typec.rst: Mark ascii art as a comment

2018-04-06 Thread Jani Nikula
On Fri, 06 Apr 2018, Heikki Krogerus  wrote:
> To prevent processing of ascii art as reStructuredText
> elements, marking it as a comment.

Please don't. This hides the ascii art from the generated documentation.

The right fix is to use a reStructuredText literal block like this:

diff --git a/Documentation/driver-api/usb/typec.rst 
b/Documentation/driver-api/usb/typec.rst
index feb31946490b..48ff58095f11 100644
--- a/Documentation/driver-api/usb/typec.rst
+++ b/Documentation/driver-api/usb/typec.rst
@@ -210,7 +210,7 @@ If the connector is dual-role capable, there may also be a 
switch for the data
 role. USB Type-C Connector Class does not supply separate API for them. The
 port drivers can use USB Role Class API with those.
 
-Illustration of the muxes behind a connector that supports an alternate mode:
+Illustration of the muxes behind a connector that supports an alternate mode::
 
  
  |   Connector  |


BR,
Jani.


>
> Reported-by: Masanari Iida 
> Fixes: bdecb33af34f ("usb: typec: API for controlling USB Type-C 
> Multiplexers")
> Signed-off-by: Heikki Krogerus 
> ---
>  Documentation/driver-api/usb/typec.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/driver-api/usb/typec.rst 
> b/Documentation/driver-api/usb/typec.rst
> index feb31946490b..972c11bf4141 100644
> --- a/Documentation/driver-api/usb/typec.rst
> +++ b/Documentation/driver-api/usb/typec.rst
> @@ -212,7 +212,7 @@ port drivers can use USB Role Class API with those.
>  
>  Illustration of the muxes behind a connector that supports an alternate mode:
>  
> - 
> +..   
>   |   Connector  |
>   
>  | |

-- 
Jani Nikula, Intel Open Source Technology Center


Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64

2018-04-06 Thread Ingo Molnar

* Dominik Brodowski  wrote:

> On Fri, Apr 06, 2018 at 10:23:22AM +0200, Ingo Molnar wrote:
> > 
> > * Dominik Brodowski  wrote:
> > 
> > > On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > > > Ok, this series looks mostly good to me, but AFAICS this breaks the UML 
> > > > build:
> > > > 
> > > >  make[2]: *** No rule to make target 'archheaders'.  Stop.
> > > >  arch/um/Makefile:119: recipe for target 'archheaders' failed
> > > >  make[1]: *** [archheaders] Error 2
> > > >  make[1]: *** Waiting for unfinished jobs
> > > 
> > > Ah, that's caused by patch 8/8 which I did and do not like all that much
> > > anyway: UML re-uses syscall_64.tbl which now has x86-specific entries like
> > > __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> > > referenced there. Fixup patch below; could be folded with patch 8/8. Or
> > > patch 8/8 could simply be dropped from the series altogether...
> > 
> > I still like the 'truth in advertising' aspect. For example if I see this 
> > in the 
> > syscall table:
> > 
> >  10  common  mprotect__sys_x86_mprotect
> > 
> > I can immediately find the _real_ syscall entry point:
> > 
> > 81180a10 <__sys_x86_mprotect>:
> > 81180a10:   48 8b 57 60 mov0x60(%rdi),%rdx
> > 81180a14:   48 8b 77 68 mov0x68(%rdi),%rsi
> > 81180a18:   b9 ff ff ff ff  mov$0x,%ecx
> > 81180a1d:   48 8b 7f 70 mov0x70(%rdi),%rdi
> > 81180a21:   e8 fa fc ff ff  callq  81180720 
> > 
> > 81180a26:   48 98   cltq   
> > 81180a28:   c3  retq   
> > 81180a29:   0f 1f 80 00 00 00 00nopl   0x0(%rax)
> > 
> > If, on the other hand, I see this entry:
> > 
> >  10 common  mprotectsys_mprotect
> > 
> > Then, as a first step, no symbol anywhere matches with this:
> > 
> >  triton:~/tip> grep sys_mprotect System.map 
> >  triton:~/tip> 
> > 
> > "sys_mprotect" does not exist in any easily discoverable sense. You have to 
> > *know* 
> > to replace the sys_ prefix with __sys_x86_ to find it.
> > 
> > Now arguably we could use a __sys_ prefix instead of the grep-barrier 
> > __sys_x86 
> > prefix - but that too would be somewhat confusing I think.
> 
> Well, if looking at the ARCH="um" kernel, you won't find the 
> __sys_x86_mprotect 
> there in its System.map -- so we either have to disentangle um and plain x86, 
> or 
> live with some cause for confusion.

I'm primarily concerned about everything making sense on x86 - UML is an
entirely separate architecture with heavy tradeoffs and kludges.

> __sys_mprotect as prefix won't work by the way, as the double-underscore 
> __sys_ 
> variant is already used in net/* for internal syscall helpers.

Ok - then triple underscore - but overall I think it's more confusing.

Btw., what was the problem with calling the x86 ptregs wrapper sys_mprotect?

The only reason I suggested the __sys_x86_ prefix was because you originally 
suggested that there's symbol name overlap, but I don't think that's the case 
within the same kernel build, as the regular non-ptregs prototype:

  asmlinkage long sys_mprotect(unsigned long start, size_t len, unsigned long prot);

... will only exist on !CONFIG_ARCH_HAS_SYSCALL_WRAPPER kernels.

So maybe that's the simplest and least confusing solution.
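
For reference, the shape of such a ptregs wrapper is roughly the
following.  This is a standalone sketch with made-up names and a
cut-down register struct, not the code generated by the series; it only
mirrors the three loads from the register frame visible in the
disassembly above.

```c
/* Hypothetical cut-down register frame; the real struct pt_regs has more fields. */
struct pt_regs_sketch {
	unsigned long dx;	/* third syscall argument */
	unsigned long si;	/* second syscall argument */
	unsigned long di;	/* first syscall argument */
};

/* Arguments seen by the last call, recorded so the call can be observed. */
static unsigned long last_args[3];

/* Stand-in for the real implementation behind the syscall. */
static long do_mprotect_sketch(unsigned long start, unsigned long len,
			       unsigned long prot)
{
	last_args[0] = start;
	last_args[1] = len;
	last_args[2] = prot;
	return 0;
}

/* The stub named in the syscall table: unpack registers, then call. */
static long sys_mprotect_sketch(const struct pt_regs_sketch *regs)
{
	return do_mprotect_sketch(regs->di, regs->si, regs->dx);
}
```

Whatever name the stub ends up with, it is the only symbol of that name
in a given build, since the classic three-argument prototype is not
emitted alongside it.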

Thanks,

Ingo


Re: [PATCH v2] Add udmabuf misc device

2018-04-06 Thread Gerd Hoffmann
  Hi,

> The pages backing a DMA-buf are not allowed to move (at least not without a
> patch set I'm currently working on), but for certain MM operations to work
> correctly you must be able to modify the page tables entries and move the
> pages backing them around.
> 
> For example try to use fork() with some copy on write pages with this
> approach. You will find that you have only two options to correctly handle
> this.

The fork() issue should go away with shared memory pages (no cow).
I guess this is the reason why vgem is internally backed by shmem.

Hmm.  So I could try to limit the udmabuf driver to shmem too (i.e.
have the ioctl take a shmem filehandle and offset instead of a virtual
address).

But maybe it is better then to just extend vgem, i.e. add support to
create gem objects from existing shmem.
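
To make the shmem variant a bit more concrete, the ioctl argument could
look something like the sketch below.  Everything here is hypothetical:
the struct layout, the names and the page-alignment rule are my
assumptions for discussion, not an existing or proposed uapi.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical ioctl argument: a shmem file handle plus a byte range. */
struct udmabuf_create_sketch {
	int      shmem_fd;	/* file handle of the shmem object */
	uint64_t offset;	/* start of the range inside the file, in bytes */
	uint64_t size;		/* length of the range, in bytes */
};

/* Accept only non-empty, page-aligned ranges that do not wrap around. */
static bool udmabuf_request_ok(const struct udmabuf_create_sketch *req)
{
	if (req->shmem_fd < 0 || req->size == 0)
		return false;
	if (req->offset % SKETCH_PAGE_SIZE || req->size % SKETCH_PAGE_SIZE)
		return false;
	return req->offset + req->size > req->offset;
}
```

Taking a file handle and offset rather than a virtual address means the
driver never has to look at the caller's page tables, which is where
the fork() trouble comes from.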

Comments?

cheers,
  Gerd



Re: [PATCH 0/8] use struct pt_regs based syscall calling for x86-64

2018-04-06 Thread Dominik Brodowski
On Fri, Apr 06, 2018 at 11:20:46AM +0200, Ingo Molnar wrote:
> 
> * Dominik Brodowski  wrote:
> 
> > On Fri, Apr 06, 2018 at 10:23:22AM +0200, Ingo Molnar wrote:
> > > 
> > > * Dominik Brodowski  wrote:
> > > 
> > > > On Thu, Apr 05, 2018 at 05:19:33PM +0200, Ingo Molnar wrote:
> > > > > Ok, this series looks mostly good to me, but AFAICS this breaks the 
> > > > > UML build:
> > > > > 
> > > > >  make[2]: *** No rule to make target 'archheaders'.  Stop.
> > > > >  arch/um/Makefile:119: recipe for target 'archheaders' failed
> > > > >  make[1]: *** [archheaders] Error 2
> > > > >  make[1]: *** Waiting for unfinished jobs
> > > > 
> > > > Ah, that's caused by patch 8/8 which I did and do not like all that much
> > > > anyway: UML re-uses syscall_64.tbl which now has x86-specific entries 
> > > > like
> > > > __sys_x86_pread64, but expects the generic syscall stub sys_pread64
> > > > referenced there. Fixup patch below; could be folded with patch 8/8. Or
> > > > patch 8/8 could simply be dropped from the series altogether...
> > > 
> > > I still like the 'truth in advertising' aspect. For example if I see this 
> > > in the 
> > > syscall table:
> > > 
> > >  10  common  mprotect__sys_x86_mprotect
> > > 
> > > I can immediately find the _real_ syscall entry point:
> > > 
> > > 81180a10 <__sys_x86_mprotect>:
> > > 81180a10:   48 8b 57 60 mov0x60(%rdi),%rdx
> > > 81180a14:   48 8b 77 68 mov0x68(%rdi),%rsi
> > > 81180a18:   b9 ff ff ff ff  mov$0x,%ecx
> > > 81180a1d:   48 8b 7f 70 mov0x70(%rdi),%rdi
> > > 81180a21:   e8 fa fc ff ff  callq  81180720 
> > > 
> > > 81180a26:   48 98   cltq   
> > > 81180a28:   c3  retq   
> > > 81180a29:   0f 1f 80 00 00 00 00nopl   0x0(%rax)
> > > 
> > > If, on the other hand, I see this entry:
> > > 
> > >  10 common  mprotectsys_mprotect
> > > 
> > > Then, as a first step, no symbol anywhere matches with this:
> > > 
> > >  triton:~/tip> grep sys_mprotect System.map 
> > >  triton:~/tip> 
> > > 
> > > "sys_mprotect" does not exist in any easily discoverable sense. You have 
> > > to *know* 
> > > to replace the sys_ prefix with __sys_x86_ to find it.
> > > 
> > > Now arguably we could use a __sys_ prefix instead of the grep-barrier 
> > > __sys_x86 
> > > prefix - but that too would be somewhat confusing I think.
> > 
> > Well, if looking at the ARCH="um" kernel, you won't find the 
> > __sys_x86_mprotect 
> > there in its System.map -- so we either have to disentangle um and plain 
> > x86, or 
> > live with some cause for confusion.
> 
> I'm primarily concerned about everything making sense on x86 - UML is an 
> entirely 
> separate architecture with heavy tradeoffs and kludges.

Agreed.

> > __sys_mprotect as prefix won't work by the way, as the double-underscore 
> > __sys_ 
> > variant is already used in net/* for internal syscall helpers.
> 
> Ok - then triple underscore - but overall I think it's more confusing.
> 
> Btw., what was the problem with calling the x86 ptregs wrapper sys_mprotect?
> 
> The only reason I suggested the __sys_x86_ prefix was because you originally 
> suggested that there's symbol name overlap, but I don't think that's the case 
> within the same kernel build, as the regular non-ptregs prototype:

Indeed, there's no symbol name overlap within the same kernel build, but
technically different stubs named the same. If that's fine, just drop patch
8/8 (including the UML fixup) and things should be fine, with the stub and
the entry in the syscall table both named sys_mprotect.

For IA32_EMULATION, we have __sys_ia32_mprotect as stub for the same
syscall, including this name as entry in syscall_32.tbl.

More problematic is the naming for the compat stubs for IA32_EMULATION and
X32, where we have

__compat_sys_ia32_waitid
__compat_sys_x32_waitid

for example. We *could* rename one of those to compat_sys_waitid() and leave
the other as-is, but actually I prefer it the way it is now.

Thanks,
Dominik
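
For illustration, here is a minimal userspace sketch of the kind of ptregs
wrapper discussed in this thread. The disassembly quoted above loads three
arguments from fixed offsets in the register block (0x70, 0x68 and 0x60 off
%rdi) before calling the real implementation; the sketch does the same
through named fields. struct pt_regs_sketch, do_mprotect_sketch() and
sketch_sys_x86_mprotect() are made-up names for this example, not kernel
symbols, and the real stub is, as far as I understand, generated by the
syscall wrapper macros rather than written by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the x86-64 register block: only the
 * three registers that carry the first three syscall arguments. */
struct pt_regs_sketch {
	unsigned long dx;	/* third argument */
	unsigned long si;	/* second argument */
	unsigned long di;	/* first argument */
};

/* Stand-in for the real implementation, which takes unpacked arguments.
 * It only encodes its inputs so that a caller can check their order. */
static long do_mprotect_sketch(unsigned long start, size_t len,
			       unsigned long prot)
{
	return (long)(start * 100 + len * 10 + prot);
}

/* The wrapper: the syscall table entry would point here, and the stub
 * unpacks the arguments from the saved registers before calling the
 * real code. */
long sketch_sys_x86_mprotect(const struct pt_regs_sketch *regs)
{
	assert(regs != NULL);
	return do_mprotect_sketch(regs->di, regs->si, regs->dx);
}
```

Whatever prefix is chosen, the point of the naming debate is only which
symbol name the table entry and the stub share; the unpacking itself is
the same.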


Re: [RfC PATCH] Add udmabuf misc device

2018-04-06 Thread Oleksandr Andrushchenko

On 04/06/2018 12:07 PM, Gerd Hoffmann wrote:

   Hi,


   * The general interface should be able to express sharing from any
 guest:guest, not just guest:host.  Arbitrary G:G sharing might be
 something some hypervisors simply aren't able to support, but the
 userspace API itself shouldn't make assumptions or restrict that.  I
 think ideally the sharing API would include some kind of
 query_targets interface that would return a list of VM's that your
 current OS is allowed to share with; that list would depend on the
 policy established by the system integrator, but obviously wouldn't
 include targets that the hypervisor itself wouldn't be capable of
 handling.

Can you give a use-case for this? I mean that the system integrator
is the one who defines which guests/hosts talk to each other,
but querying means that it is possible that VMs have some sort
of discovery mechanism, so they can decide on their own whom
to connect to.

Note that vsock (created by vmware, these days also has a virtio
transport for kvm) started with support for both guest <=> host and
guest <=> guest support.  But later on guest <=> guest was dropped.
As far as I know the reasons were (a) lack of use cases and (b) security.

So, I likewise would like to know more details on the use cases you have in mind
here.  Unless we have a compelling use case here I'd suggest to drop the
guest <=> guest requirement as it makes the whole thing a lot more
complex.

This is exactly the use-case we have: in our setup Dom0 doesn't
own any HW at all and all the HW is passed into a dedicated
driver domain (DomD) which is still a guest domain.
Then, buffers are shared between two guests, for example,
DomD and DomA (Android guest)

   * The sharing API could be used to share multiple kinds of content in a
 single system.  The sharing sink driver running in the content
 producer's VM should accept some additional metadata that will be
 passed over to the target VM as well.

Not sure this should be part of hyper-dmabuf.  A dma-buf is nothing but
a block of data, period.  Therefore protocols with dma-buf support
(wayland for example) typically already send over metadata describing
the content, so duplicating that in hyper-dmabuf looks pointless.


1. We are targeting ARM and one of the major requirements for the buffer
sharing is the ability to allocate physically contiguous buffers, which gets
even more complicated for systems not backed with an IOMMU. So, for some
use-cases it is enough to make the buffers contiguous in terms of IPA and
sometimes those need to be contiguous in terms of PA.

Which pretty much implies the host must do the allocation.


2. For Xen we would love to see UAPI to create a dma-buf from grant
references provided, so we can use this generic solution to implement
zero-copying without breaking the existing Xen protocols. This can
probably be extended to other hypervisors as well.

I'm not sure we can create something which works on both kvm and xen.
The memory management model is quite different ...


On xen the hypervisor manages all memory.  Guests can allow other guests
to access specific pages (using grant tables).  In theory any guest <=>
guest communication is possible.  In practice it is mostly guest <=> dom0
because guests access their virtual hardware that way.  dom0 is the
privileged guest which owns any hardware not managed by xen itself.

Please see above for our setup with DomD and Dom0 being
a generic ARMv8 domain, no HW

Xen guests can ask the hypervisor to update the mapping of guest
physical pages.  They can balloon down (unmap and free pages).  They can
balloon up (ask the hypervisor to map fresh pages).  They can map pages
exported by other guests using grant tables.  xen-zcopy makes heavy use
of this.  It balloons down, to make room in the guest physical address
space, then goes on to map the exported pages there, and finally composes a
dma-buf.

This is what it does


On kvm qemu manages all guest memory.  qemu also has all guest memory
mapped, so a grant-table like mechanism isn't needed to implement
virtual devices.  qemu can decide how it backs memory for the guest.
qemu propagates the guest memory map to the kvm driver in the linux
kernel.  kvm guests have some control over the guest memory map, for
example they can map pci bars wherever they want in their guest physical
address space by programming the base registers accordingly, but unlike
xen guests they can't ask the host to remap individual pages.

Due to qemu having all guest memory mapped virtual devices are typically
designed to have the guest allocate resources, then notify the host
where they are located.  This is where the udmabuf idea comes from:
Guest tells the host (qemu) where the gem object is, and qemu then can
create a dmabuf backed by those pages to pass it on to other processes
such as the wayland display server.  Possibly even without the guest
explicitly asking for it, i.e. export the framebuffer placed by the

Re: [PATCH] ata: ahci-platform: add reset control support

2018-04-06 Thread Kunihiko Hayashi
Hi Hans,

On Fri, 6 Apr 2018 10:29:37 +0200
Hans de Goede  wrote:

> Hi,
> 
> On 06-04-18 06:48, Kunihiko Hayashi wrote:
> > Hi Hans,
> > > On Thu, 5 Apr 2018 16:08:24 +0200
> > Hans de Goede  wrote:
> > >> Hi,
> >>
> >> On 05-04-18 16:00, Hans de Goede wrote:
> >>> Hi,
>  On 05-04-18 15:54, Thierry Reding wrote:
>  On Thu, Apr 05, 2018 at 03:27:03PM +0200, Hans de Goede wrote:
> > Hi,
> >
> > On 05-04-18 15:17, Patrice CHOTARD wrote:
> >> Hi Thierry
> >>
> >> On 04/05/2018 11:54 AM, Thierry Reding wrote:
> >>> On Fri, Mar 23, 2018 at 10:30:53AM +0900, Kunihiko Hayashi wrote:
>  Add support to get and control a list of resets for the device
>  as optional and shared. These resets must be kept de-asserted until
>  the device is enabled.
> 
>  This is specified as shared because some SoCs like UniPhier series
>  have common reset controls with all ahci controller instances.
> 
>  Signed-off-by: Kunihiko Hayashi 
>  ---
>  .../devicetree/bindings/ata/ahci-platform.txt |  1 +
>  drivers/ata/ahci.h |  1 +
>  drivers/ata/libahci_platform.c | 24 +++---
>  3 files changed, 23 insertions(+), 3 deletions(-)
> >>>
> >>> This causes a regression on Tegra because we explicitly request the
> >>> resets after the call to ahci_platform_get_resources().
> >>
> >> I confirm, we got exactly the same behavior on STi platform.
> >>
> >>>
> >>> From a quick look, ahci_mtk and ahci_st are in the same boat, 
> >>> adding the
> >>> corresponding maintainers to Cc.
> >>>
> >>> Patrice, Matthias: does SATA still work for you after this patch? This
> >>> has been in linux-next since next-20180327.
> >>
> >> SATA is still working after this patch, but a kernel warning is
> >> triggered due to the fact that resets are both requested by
> >> libahci_platform and by ahci_st driver.
> >
> > So in your case you might be able to remove the reset handling
> > from the ahci_st driver and rely on the new libahci_platform
> > handling instead? If that works that seems like a win to me.
> >
> > As said elsewhere in this thread I think it makes sense to keep (or 
> > re-add
> > after a revert) the libahci_platform reset code, but make it conditional
> > on a flag passed to ahci_platform_get_resources(). This way we get
> > the shared code for most cases and platforms which need special handling
> > can opt-out.
> 
>  Agreed, although I prefer such helpers to be opt-in, rather than
>  opt-out. In my experience that tends to make the helpers more resilient to
>  this kind of regression. It also simplifies things because instead of
>  drivers saying "I want all the helpers except this one and that one",
>  they can simply say "I want these helpers and that one". In the former
>  case whenever you add some new (opt-out) feature, you have to update all
>  drivers and add the exception. In the latter you only need to extend the
>  drivers that want to make use of the new helper.
> >>
> >> Erm, the idea never was to make this opt-out but rather opt in, so
> >> we add a flags parameter to ahci_platform_get_resources() and all
> >> current users pass in 0 for that to keep the current behavior.
> >>
> >> And only the generic drivers/ata/ahci_platform.c driver will pass
> >> in the new AHCI_PLATFORM_GET_RESETS flag, which makes
> >> ahci_platform_get_resources() (and the other functions) also deal
> >> with resets.
> >>
>  With that in mind, rather than adding a flag to the
>  ahci_platform_get_resources() function, it might be more flexible to
>  split the helpers into finer-grained functions. That way drivers can
>  pick whatever functionality they want from the helpers.
>  Good point, so lets:
>  1) Revert the patch for now
> >>> 2) Have a new version of the patch which adds a 
> >>> ahci_platform_get_resets() helper
> >>> 3) Modify the generic drivers/ata/ahci_platform.c driver to call the new
> >>>    ahci_platform_get_resets() between its ahci_platform_get_resources()
> >>>    and ahci_platform_enable_resources() calls.
> >>>    I think that ahci_platform_enable_resources() should still automatically
> >>>    do the right thing wrt resets if ahci_platform_get_resets() was called
> >>>    (otherwise the resets array will be empty and should be skipped)
>  This should make the generic driver usable for the UniPhier SoCs and
> >>> maybe some other drivers like the ahci_st driver can also switch to the
> >>> new ahci_platform_get_resets() functionality to reduce their code a bit.
> >>
> >> So thinking slightly longer about this, with the opt-in variant
> >> (which is what I intended all along) I do think that a flags par

Re: [PATCH v4 8/9] vsprintf: Prevent crash when dereferencing invalid pointers

2018-04-06 Thread Rasmus Villemoes
On 2018-04-04 10:58, Petr Mladek wrote:

> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 3551b7957d9e..1a080a75a825 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -599,12 +599,46 @@ char *__string(char *buf, char *end, const char *s, 
> struct printf_spec spec)
>   return widen_string(buf, len, end, spec);
>  }
>  
> + /*
> +  * This is not a fool-proof test. 99% of the time that this will fault is
> +  * due to a bad pointer, not one that crosses into bad memory. Just test
> +  * the address to make sure it doesn't fault due to a poorly added printk
> +  * during debugging.
> +  */
> +static const char *check_pointer_access(const void *ptr)
> +{
> + char byte;
> +
> + if (!ptr)
> + return "(null)";
> +
> + if (probe_kernel_address(ptr, byte))
> + return "(efault)";
> +
> + return NULL;
> +}
> +
> +
> +static bool valid_pointer_access(char **buf, char *end, const void *ptr,
> +  struct printf_spec spec)
> +{
> + const char *err_msg;
> +
> + err_msg = check_pointer_access(ptr);
> + if (err_msg) {
> + *buf = __string(*buf, end, err_msg, spec);
> + return false;
> + }
> +
> + return true;
> +}
> +
>  static noinline_for_stack
>  char *string(char *buf, char *end, const char *s,
>  struct printf_spec spec)
>  {
> - if ((unsigned long)s < PAGE_SIZE)
> - s = "(null)";
> + if (!valid_pointer_access(&buf, end, s, spec))
> + return buf;
>  
>   return __string(buf, end, s, spec);
>  }

Obviously, if you do add a WARN to the check_pointer_access (and please
do), that somehow needs to be suppressed for the "%s", NULL and "%s",
ZEROPTR cases, which are grandfathered in and, I think, relied upon in
some places. It should be as simple as keeping the < PAGE_SIZE check
and doing "else if (!valid...())".

Rasmus
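
To make the suggested ordering concrete, here is a minimal userspace sketch:
the < PAGE_SIZE check stays first and silently yields "(null)", and only the
"else if" branch, where the probe fails, is a candidate for a warning.
Everything named *_sketch, SKETCH_PAGE_SIZE and SKETCH_BAD_ADDR is invented
for this example; in particular probe_fails_sketch() merely stands in for
probe_kernel_address() and treats one made-up address as unreadable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL
/* Made-up address that the fake probe below reports as unreadable. */
#define SKETCH_BAD_ADDR  ((uintptr_t)0xdead0000UL)

/* Stand-in for probe_kernel_address(): nonzero means the read would
 * fault. It only compares the address and never dereferences it. */
static int probe_fails_sketch(const void *ptr)
{
	return (uintptr_t)ptr == SKETCH_BAD_ADDR;
}

/* Returns the text that "%s" would print for s. *warned is set to 1 only
 * when the probe fails, which is where a WARN could be placed. */
const char *string_arg_sketch(const char *s, int *warned)
{
	assert(warned != NULL);

	if ((uintptr_t)s < SKETCH_PAGE_SIZE) {
		/* Grandfathered NULL/ZEROPTR case: no warning. */
		return "(null)";
	} else if (probe_fails_sketch(s)) {
		*warned = 1;
		return "(efault)";
	}

	return s;
}
```

With this ordering a NULL argument never reaches the probe, so adding a
warning to the probe path cannot fire for the cases existing callers may
rely on.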


Re: [PATCH v5 12/44] clk: davinci: Add platform information for TI DA850 PSC

2018-04-06 Thread Sekhar Nori
Hi Bart,

On Thursday 05 April 2018 09:21 PM, Bartosz Golaszewski wrote:
> 2018-04-05 16:36 GMT+02:00 Sekhar Nori :
>> On Thursday 05 April 2018 07:14 PM, Bartosz Golaszewski wrote:
>>> 2018-04-05 15:09 GMT+02:00 Sekhar Nori :
 Hi Bartosz,

 On Friday 09 February 2018 10:18 PM, Michael Turquette wrote:
> On Fri, Feb 9, 2018 at 8:22 AM, Bartosz Golaszewski  wrote:
>> 2018-01-08 3:17 GMT+01:00 David Lechner :

>> Hi David,
>>
>> I've been working on moving the genpd code from its own driver to the
>> psc one. I couldn't get the system to boot though and problems
>> happened very early in the boot sequence. I struggled to figure out
>> what's happening, but eventually I noticed that psc uses
>> CLK_OF_DECLARE() to initialize clocks. The functions registered this
>> way are called very early in the boot sequence, way before
>> late_initcall() in which the genpd framework is initialized. This of

 late_initcall() is too late for genpd to be initialized. As you may have
 seen with the latest set of patches, we have problems with timer
 initialization. After converting to platform devices, PSC and PLL clocks
 get initialized post time_init(). We are working that around using
 fixed-clocks, which hopefully will work (I still need to test many of
 the affected platforms).

 Can you please reply with the exact issue you faced with genpd framework
 initialization so we do have that on record.

>>>
>>> The exact issue manifested itself in a NULL-pointer dereference panic
>>> when I tried moving the genpd code I had initially implemented as a
>>> separate platform driver to what I believe was v6 or v7 of David's
>>> series (before the psc driver became a platform driver, when it was
>>> still using CLK_OF_DECLARE()). When I had tested a simple conversion
>>> of that version to a platform_driver, genpd worked fine.
>>>
>>> I don't have the stack traces from these panics, but I recall some
>>> debugfs functions being involved and the genpd late_initcalls are
>>> related to debugfs. Looking at it now I don't see how exactly it could
>>> fail though.
>>
>> Do you have the code where you faced the problem stashed somewhere? I am
>> not (yet) advocating going back to CLK_OF_DECLARE(). But there is a
>> definite issue with timer being ready when not using CLK_OF_DECLARE().
>> So, I want to make sure the reason why we are going down the
>> platform device path is amply clear.
>>
>> Thanks,
>> Sekhar
> 
> Yes, you can still find it on my github[1].
> 
> Bart
> 
> [1] github.com:brgl/linux.git  topic/davinci-genpd-final-v2

The panic issue in your branch is not related to genpd. It's because you 
are accessing the platform bus before it is initialized. The attached[1] 
patch on top of your branch made it boot again.

With your branch booting, I can see genpd related debugfs entries 
getting created. I don't see devices being attached to the domains you 
have though. I did not debug that. I suspect some matching issue.

Can you please check that and confirm there is no issue with genpd and 
using CLK_OF_DECLARE() to initialize clocks?

Unless you report an issue back, or Mike and Stephen have ideas about 
how to handle the dependency between PSC/PLL derived timer clock 
initialization and timer_probe(), I think we need to move back to 
using CLK_OF_DECLARE(). 

Thanks,
Sekhar

---8<---
diff --git a/drivers/clk/davinci/psc.c b/drivers/clk/davinci/psc.c
index 9d9f94eee544..7e3d114efdac 100644
--- a/drivers/clk/davinci/psc.c
+++ b/drivers/clk/davinci/psc.c
@@ -281,12 +281,18 @@ int __davinci_psc_register_clocks(const struct 
davinci_psc_clk_info *info,
}
}
 
-   DO_ONCE(pm_clk_add_notifier,
-   &platform_bus_type, &platform_bus_notifier);
-
return 0;
 }
 
+static int __init davinci_pm_runtime_init(void)
+{
+   pm_clk_add_notifier(&platform_bus_type, &platform_bus_notifier);
+
+   return 0;
+}
+core_initcall(davinci_pm_runtime_init);
+
 int davinci_psc_register_clocks(const struct davinci_psc_clk_info *info,
void __iomem *base)
 {


Re: [PATCH V3 4/4] genirq/affinity: irq vector spread among online CPUs as far as possible

2018-04-06 Thread Thomas Gleixner
On Fri, 6 Apr 2018, Ming Lei wrote:
> 
> I will post V4 soon by using cpu_present_mask in the 1st stage irq spread.
> And it should work fine for Kashyap's case in normal cases.

No need to resend. I've changed it already and will push it out after
lunch.

Thanks,

tglx


[bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390

2018-04-06 Thread Sebastian Ott
Hi,

Today's kernel oopsed on s390. Bisect points to:
3c8ba0d61d04 ("kernel.h: Retain constant expression output for max()/min()")

[1.898277] dasd-eckd 0.0.3304: DASD with 4 KB/block, 21636720 KB total 
size, 48 KB/track, compatible disk layout
[1.898308] [ cut here ]
[1.898310] kernel BUG at block/bio.c:1798!
[1.898320] illegal operation: 0001 ilc:1 [#1] PREEMPT SMP 
[1.898322] Modules linked in:
[1.898325] CPU: 2 PID: 68 Comm: kworker/2:1 Not tainted 
4.16.0-09576-g38c23685b273 #130
[1.898326] Hardware name: IBM 3906 M04 704 (LPAR)
[1.898332] Workqueue: events do_kick_device
[1.898334] Krnl PSW : 08df7044 dbc29dcd 
(bio_split+0xb2/0xb8)
[1.898338]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 
RI:0 EA:3
[1.898340] Krnl GPRS: 03d10008 03d1 facfa100 

[1.898341]0140 fae0b100 fa9c7738 
facfa100
[1.898342]facfa188   

[1.898343]f7bb 03d1 00521530 
fa9c75e8
[1.898351] Krnl Code: 0050e41a: f0c4ebafsrp 
4(13,%r0),2991(%r14),0
  0050e420: f0a407f4srp 
4(11,%r0),2036,0
 #0050e426: a7f40001brc 
15,50e428
 >0050e42a: a7f40001brc 
15,50e42c
  0050e42e: 0707bcr 0,%r7
  0050e430: c004brcl0,50e430
  0050e436: eb6ff0480024stmg
%r6,%r15,72(%r15)
  0050e43c: a7f13f80tmll
%r15,16256
[1.898365] Call Trace:
[1.898367] ([<01088020>] 0x1088020)
[1.898371]  [<00528de6>] blk_mq_make_request+0x76/0x768 
[1.898375]  [<0051923a>] generic_make_request+0xea/0x2c0 
[1.898376]  [<005194b4>] submit_bio+0xa4/0x1a0 
[1.898378]  [<003ab934>] submit_bh_wbc+0x1c4/0x218 
[1.898380]  [<003ac5d2>] block_read_full_page+0x352/0x3f8 
[1.898383]  [<002ad1ae>] do_read_cache_page+0x19e/0x3c0 
[1.898385]  [<002ad400>] read_cache_page+0x30/0x40 
[1.898387]  [<00533900>] read_dev_sector+0x58/0xe8 
[1.898389]  [<00538252>] read_lba.isra.0+0x12a/0x1b8 
[1.898391] dasd-eckd 0.0.3307: DASD with 4 KB/block, 21636720 KB total 
size, 48 KB/track, compatible disk layout
[1.898394]  [<00538748>] efi_partition+0x198/0x660 
[1.898396]  [<00535c8c>] check_partition+0x15c/0x2b8 
[1.898398]  [<00534092>] rescan_partitions+0xea/0x3f8 
[1.898400]  [<0052ee10>] blkdev_reread_part+0x40/0x60 
[1.898403]  [<00674a10>] dasd_scan_partitions+0x90/0x148 
[1.898405]  [<0066eaa2>] dasd_change_state+0x912/0xba0 
[1.898407]  [<0066ed78>] do_kick_device+0x48/0x98 
[1.898410]  [<001644b2>] process_one_work+0x1d2/0x458 
[1.898410] [ cut here ]
[1.898411] kernel BUG at block/bio.c:1798!
[1.898420]  [<00164794>] worker_thread+0x5c/0x478 
[1.898422]  [<0016b700>] kthread+0x148/0x160 
[1.898425]  [<0083e54a>] kernel_thread_starter+0x6/0xc 
[1.898427]  [<0083e544>] kernel_thread_starter+0x0/0xc 
[1.898428] Last Breaking-Event-Address:
[1.898429]  [<0050e426>] bio_split+0xae/0xb8
[1.898431]  
[1.898432] illegal operation: 0001 ilc:1 [#2] 
[1.898432] Kernel panic - not syncing: Fatal exception: panic_on_oops



Bisect log and config attached. I'll look at min/max users in the affected
areas later today.

Sebastian

#
# Automatically generated file; DO NOT EDIT.
# Linux/s390 4.16.0 Kernel Configuration
#
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_CPU_BIG_ENDIAN=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_GENERIC_LOCKBREAK=y
CONFIG_PGSTE=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_KEXEC=y
CONFIG_AUDIT_ARCH=y
CONFIG_NO_IOPORT_MAP=y
# CONFIG_PCI_QUIRKS is not set
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_S390=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_PGTABLE_LEVELS=5
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not se

Re: [PATCH V4 2/2] mmc: sdhci-msm: support voltage pad switching

2018-04-06 Thread Vijay Viswanath



On 3/29/2018 4:23 AM, Doug Anderson wrote:

Hi,

On Wed, Mar 28, 2018 at 6:08 AM, Vijay Viswanath
 wrote:

From: Krishna Konda 

The PADs for SD card are dual-voltage that support 3v/1.8v. Those PADs
have a control signal  (io_pad_pwr_switch/mode18 ) that indicates
whether the PAD works in 3v or 1.8v.

SDHC core on msm platforms should have IO_PAD_PWR_SWITCH bit set/unset
based on actual voltage used for IO lines. So when power irq is
triggered for io high or io low, the driver should check the voltages
supported and set the pad accordingly.

Signed-off-by: Krishna Konda 
Signed-off-by: Venkat Gopalakrishnan 
Signed-off-by: Vijay Viswanath 
---
  drivers/mmc/host/sdhci-msm.c | 64 ++--
  1 file changed, 62 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index 2fcd9010..bbf9626 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -78,12 +78,15 @@
  #define CORE_HC_MCLK_SEL_DFLT  (2 << 8)
  #define CORE_HC_MCLK_SEL_HS400 (3 << 8)
  #define CORE_HC_MCLK_SEL_MASK  (3 << 8)
+#define CORE_IO_PAD_PWR_SWITCH_EN  (1 << 15)
+#define CORE_IO_PAD_PWR_SWITCH  (1 << 16)
  #define CORE_HC_SELECT_IN_EN   BIT(18)
  #define CORE_HC_SELECT_IN_HS400(6 << 19)
  #define CORE_HC_SELECT_IN_MASK (7 << 19)

  #define CORE_3_0V_SUPPORT  (1 << 25)
  #define CORE_1_8V_SUPPORT  (1 << 26)
+#define CORE_VOLT_SUPPORT  (CORE_3_0V_SUPPORT | CORE_1_8V_SUPPORT)

  #define CORE_CSR_CDC_CTLR_CFG0 0x130
  #define CORE_SW_TRIG_FULL_CALIBBIT(16)
@@ -1109,7 +1112,7 @@ static void sdhci_msm_handle_pwr_irq(struct sdhci_host 
*host, int irq)
 u32 irq_status, irq_ack = 0;
 int retry = 10;
 u32 pwr_state = 0, io_level = 0;
-
+   u32 config;

 irq_status = readl_relaxed(msm_host->core_mem + CORE_PWRCTL_STATUS);
 irq_status &= INT_MASK;
@@ -1166,6 +1169,45 @@ static void sdhci_msm_handle_pwr_irq(struct sdhci_host 
*host, int irq)
  */
 writel_relaxed(irq_ack, msm_host->core_mem + CORE_PWRCTL_CTL);

+   /*
+* If we don't have info regarding the voltage levels supported by
+* regulators, don't change the IO PAD PWR SWITCH.
+*/
+   if (msm_host->caps_0 & CORE_VOLT_SUPPORT) {
+   /* Ensure order between core_mem and hc_mem */
+   mb();


Like in v2, I don't understand why you need a mb() before the read
from CORE_VENDOR_SPEC.  No reads or writes to the core_mem will affect
the value you're reading here, so you need no barrier.

If you need a barrier before the _write_ to CORE_VENDOR_SPEC then add
it below.  Then in the case where the config doesn't change you have
no barriers.



+   /*
+* We should unset IO PAD PWR switch only if the register write
+* can set IO lines high and the regulator also switches to 3 V.
+* Else, we should keep the IO PAD PWR switch set.
+* This is applicable to certain targets where eMMC vccq supply
+* is only 1.8V. In such targets, even during REQ_IO_HIGH, the
+* IO PAD PWR switch must be kept set to reflect actual
+* regulator voltage. This way, during initialization of
+* controllers with only 1.8V, we will set the IO PAD bit
+* without waiting for a REQ_IO_LOW.
+*/


For the above comment, what about just:

new_config = config
if (msm_host->caps_0 == CORE_1_8V_SUPPORT) {
   new_config |= CORE_IO_PAD_PWR_SWITCH;
} else if (msm_host->caps_0 == CORE_3_0V_SUPPORT) {
   new_config &= ~CORE_IO_PAD_PWR_SWITCH;
} else if (msm_host->caps_0 & CORE_VOLT_SUPPORT) {
   if (io_level & REQ_IO_HIGH)
 new_config &= ~CORE_IO_PAD_PWR_SWITCH;
   else if (io_level & REQ_IO_LOW)
 new_config |= CORE_IO_PAD_PWR_SWITCH;
}


This looks like a big mess of if/else. Does the above implementation have 
better performance compared to having two if/else with bit operations 
inside? The latter looks much cleaner and faster.


If the regulator only supports 3V and we get an io_low from BUS_OFF 
(REQ_IO_LOW should never come if we don't support 1.8V), it is ok to set 
the io pad.



if (config != new_config) {
  ...
}

AKA: first check if it only supports one voltage and pick that one.
Else if it supports both you can use the request.  This might be more
important if you get rid of the initial setting in
sdhci_msm_set_regulator_caps() as I'm suggesting.
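
The selection order described above (single supported voltage first, the
request only when both are supported) can be sketched as a pure function.
This is a userspace illustration, not driver code: the CORE_* bit positions
are copied from the patch, but the REQ_IO_LOW/REQ_IO_HIGH values are
placeholders chosen for this example, pad_switch_config_sketch() is a
made-up name, and the capability bits are masked with CORE_VOLT_SUPPORT
before comparing, which is an assumption on top of the suggestion quoted
above.

```c
#include <assert.h>
#include <stdint.h>

#define CORE_IO_PAD_PWR_SWITCH	(1u << 16)
#define CORE_3_0V_SUPPORT	(1u << 25)
#define CORE_1_8V_SUPPORT	(1u << 26)
#define CORE_VOLT_SUPPORT	(CORE_3_0V_SUPPORT | CORE_1_8V_SUPPORT)

/* Placeholder request bits for this sketch only. */
#define REQ_IO_LOW		(1u << 2)
#define REQ_IO_HIGH		(1u << 3)

/* Returns the new register value; the caller writes it back only if it
 * differs from the value it read. */
uint32_t pad_switch_config_sketch(uint32_t config, uint32_t caps_0,
				  uint32_t io_level)
{
	uint32_t volt = caps_0 & CORE_VOLT_SUPPORT;
	uint32_t new_config = config;

	if (volt == CORE_1_8V_SUPPORT) {
		/* Only 1.8V available: the pad must stay in 1.8V mode. */
		new_config |= CORE_IO_PAD_PWR_SWITCH;
	} else if (volt == CORE_3_0V_SUPPORT) {
		/* Only 3V available: the pad must stay in 3V mode. */
		new_config &= ~CORE_IO_PAD_PWR_SWITCH;
	} else if (volt == CORE_VOLT_SUPPORT) {
		/* Both available: follow the request. */
		if (io_level & REQ_IO_HIGH)
			new_config &= ~CORE_IO_PAD_PWR_SWITCH;
		else if (io_level & REQ_IO_LOW)
			new_config |= CORE_IO_PAD_PWR_SWITCH;
	}

	/* Without voltage information the register is left untouched. */
	assert(volt != 0 || new_config == config);
	return new_config;
}
```

Keeping the decision in one side-effect-free function makes the
"config != new_config" check trivial and leaves a single place for the
register write and any barrier it needs.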



+   config = readl_relaxed(host->ioaddr + CORE_VENDOR_SPEC);
+
+   if (((io_level & REQ_IO_HIGH) && (msm_host->caps_0 &
+   CORE_3_0V_SUPPORT)) &&
+   (config & CORE_IO_PAD_PWR_SWITCH)) {
+   config &= ~CORE_IO_PAD_PWR_SWITCH;
+   writel_relaxed(config,
+   host->ioaddr + CORE_VENDOR_SPEC);
+  

Re: [RfC PATCH] Add udmabuf misc device

2018-04-06 Thread Daniel Stone
Hi Gerd,

On 14 March 2018 at 08:03, Gerd Hoffmann  wrote:
>> Either mlock account (because it's mlocked defacto), and get_user_pages
>> won't do that for you.
>>
>> Or you write the full-blown userptr implementation, including mmu_notifier
>> support (see i915 or amdgpu), but that also requires Christian Königs
>> latest ->invalidate_mapping RFC for dma-buf (since atm exporting userptr
>> buffers is a no-go).
>
> I guess I'll look at mlock accounting for starters then.  Easier for
> now, and leaves the door open to switch to userptr later as this should
> be transparent to userspace.

Out of interest, do you have usecases for full userptr support? Maybe
another way would be to allow creation of dmabufs from memfds.

Cheers,
Daniel


A question of sleeping with interrupts are disabled in start_kernel()

2018-04-06 Thread Jia-Ju Bai

Hello,

I have a question about the following call path:
init/main.c: start_kernel() ->
kernel/events/core.c: perf_event_init() ->
kernel/events/core.c: perf_pmu_register() ->
kernel/events/core.c: pmu_dev_alloc()

In this call path, start_kernel() calls local_irq_disable() to disable 
interrupts;
perf_pmu_register() calls mutex_lock() and idr_alloc(GFP_KERNEL), which 
can sleep;
pmu_dev_alloc() calls kzalloc(GFP_KERNEL), which can also sleep.

In my opinion, this code may sleep while interrupts are disabled.
Why is this okay?


Best wishes,
Jia-Ju Bai


drivers/gpu/drm/bridge/sil-sii8620.c:2405: undefined reference to `extcon_unregister_notifier'

2018-04-06 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   38c23685b273cfb4ccf31a199feccce3bdcb5d83
commit: 688838442147d9dd94c2ef7c2c31a35cf150c5fa drm/bridge/sii8620: use 
micro-USB cable detection logic to detect MHL
date:   4 weeks ago
config: i386-randconfig-x0-04061534 (attached as .config)
compiler: gcc-5 (Debian 5.5.0-3) 5.4.1 20171010
reproduce:
git checkout 688838442147d9dd94c2ef7c2c31a35cf150c5fa
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/bridge/sil-sii8620.o: In function `sii8620_remove':
>> drivers/gpu/drm/bridge/sil-sii8620.c:2405: undefined reference to 
>> `extcon_unregister_notifier'
   drivers/gpu/drm/bridge/sil-sii8620.o: In function `sii8620_extcon_init':
>> drivers/gpu/drm/bridge/sil-sii8620.c:2229: undefined reference to 
>> `extcon_find_edev_by_node'
>> drivers/gpu/drm/bridge/sil-sii8620.c:2241: undefined reference to 
>> `extcon_register_notifier'
   drivers/gpu/drm/bridge/sil-sii8620.o: In function `sii8620_extcon_work':
>> drivers/gpu/drm/bridge/sil-sii8620.c:2189: undefined reference to 
>> `extcon_get_state'

vim +2405 drivers/gpu/drm/bridge/sil-sii8620.c

  2212  
  2213  static int sii8620_extcon_init(struct sii8620 *ctx)
  2214  {
  2215  struct extcon_dev *edev;
  2216  struct device_node *musb, *muic;
  2217  int ret;
  2218  
  2219  /* get micro-USB connector node */
  2220  musb = of_graph_get_remote_node(ctx->dev->of_node, 1, -1);
  2221  /* next get micro-USB Interface Controller node */
  2222  muic = of_get_next_parent(musb);
  2223  
  2224  if (!muic) {
  2225  dev_info(ctx->dev, "no extcon found, switching to 
'always on' mode\n");
  2226  return 0;
  2227  }
  2228  
> 2229  edev = extcon_find_edev_by_node(muic);
  2230  of_node_put(muic);
  2231  if (IS_ERR(edev)) {
  2232  if (PTR_ERR(edev) == -EPROBE_DEFER)
  2233  return -EPROBE_DEFER;
  2234  dev_err(ctx->dev, "Invalid or missing extcon\n");
  2235  return PTR_ERR(edev);
  2236  }
  2237  
  2238  ctx->extcon = edev;
  2239  ctx->extcon_nb.notifier_call = sii8620_extcon_notifier;
  2240  INIT_WORK(&ctx->extcon_wq, sii8620_extcon_work);
> 2241  ret = extcon_register_notifier(edev, EXTCON_DISP_MHL, 
> &ctx->extcon_nb);
  2242  if (ret) {
  2243  dev_err(ctx->dev, "failed to register notifier for 
MHL\n");
  2244  return ret;
  2245  }
  2246  
  2247  return 0;
  2248  }
  2249  
  2250  static inline struct sii8620 *bridge_to_sii8620(struct drm_bridge 
*bridge)
  2251  {
  2252  return container_of(bridge, struct sii8620, bridge);
  2253  }
  2254  
  2255  static int sii8620_attach(struct drm_bridge *bridge)
  2256  {
  2257  struct sii8620 *ctx = bridge_to_sii8620(bridge);
  2258  
  2259  sii8620_init_rcp_input_dev(ctx);
  2260  
  2261  return sii8620_clear_error(ctx);
  2262  }
  2263  
  2264  static void sii8620_detach(struct drm_bridge *bridge)
  2265  {
  2266  struct sii8620 *ctx = bridge_to_sii8620(bridge);
  2267  
  2268  rc_unregister_device(ctx->rc_dev);
  2269  }
  2270  
  2271  static enum drm_mode_status sii8620_mode_valid(struct drm_bridge 
*bridge,
  2272   const struct drm_display_mode 
*mode)
  2273  {
  2274  struct sii8620 *ctx = bridge_to_sii8620(bridge);
  2275  bool can_pack = ctx->devcap[MHL_DCAP_VID_LINK_MODE] &
  2276  MHL_DCAP_VID_LINK_PPIXEL;
  2277  unsigned int max_pclk = sii8620_is_mhl3(ctx) ? MHL3_MAX_LCLK :
  2278 MHL1_MAX_LCLK;
  2279  max_pclk /= can_pack ? 2 : 3;
  2280  
  2281  return (mode->clock > max_pclk) ? MODE_CLOCK_HIGH : MODE_OK;
  2282  }
  2283  
  2284  static bool sii8620_mode_fixup(struct drm_bridge *bridge,
  2285 const struct drm_display_mode *mode,
  2286 struct drm_display_mode *adjusted_mode)
  2287  {
  2288  struct sii8620 *ctx = bridge_to_sii8620(bridge);
  2289  int max_lclk;
  2290  bool ret = true;
  2291  
  2292  mutex_lock(&ctx->lock);
  2293  
  2294  max_lclk = sii8620_is_mhl3(ctx) ? MHL3_MAX_LCLK : MHL1_MAX_LCLK;
  2295  if (max_lclk > 3 * adjusted_mode->clock) {
  2296  ctx->use_packed_pixel = 0;
  2297  goto end;
  2298  }
  2299  if ((ctx->devcap[MHL_DCAP_VID_LINK_MODE] & MHL_DCAP_VID_LINK_PPIXEL) &&
  2300  max_lclk > 2 * adjusted_mode->clock) {
  2301  ctx->use_packed_pixel = 1;
  2302  goto end;
  2303 

Re: arch/arm/kernel/setup.c fails to compile for NOMMU

2018-04-06 Thread Michal Hocko
On Fri 25-08-17 08:45:40, Michal Hocko wrote:
> On Thu 24-08-17 17:17:41, Russell King - ARM Linux wrote:
> > On Fri, Aug 18, 2017 at 01:24:02PM +0200, Michal Hocko wrote:
> > > Hi Russell,
> > > I have a battery of configs for compile testing and for some time I've
> > > been seeing the following compilation error with nommu config (attached)
> > > 
> > > arch/arm/kernel/setup.c: In function 'reserve_crashkernel':
> > > arch/arm/kernel/setup.c:1005:25: error: 'SECTION_SIZE' undeclared (first
> > > use in this function)
> > >  crash_size, SECTION_SIZE);
> > > 
> > > I didn't get to look what is going on here, maybe my config is just too
> > > artificial but the primary reason is that SECTION_SIZE is not defined in
> > > pgtable-nommu.h. To be honest I am not familiar with nommu very much and
> > > it smells like the whole reserve_crashkernel doesn't really make any
> > > sense on those configs. Could you have a look what is the best fix
> > > please?
> > 
> > Hi,
> > 
> > I suspect that mach-netx has never been tested in nommu configurations
> > (ditto for many of the older platforms, which pre-date merging nommu
> > support.)
> > 
> > Maybe the best solution is to make these old platforms depend on MMU.
> > 
> > However, I'm wondering whether kexec makes sense for !MMU - that's
> > probably something that hasn't been tested and doesn't actually work.
> > So maybe another approach would be to make kexec depend on MMU for
> > ARM - but I'm afraid I don't really know.
> 
> Yeah, I've disabled KEXEC in my testing config. All I do care about is
> to test nommu specific code paths in MM code.
> 
> > I only have very limited nommu experience.
> 
> me too
> 
> So what would you say about the following?

It's been some time and it seems this has fallen through the cracks. Is this
worth pursuing, or should I just forget about it and drop it on the floor?
> ---
> From 2707f3bf00181bbc9dcf6a1f287eb7369141e955 Mon Sep 17 00:00:00 2001
> From: Michal Hocko 
> Date: Fri, 25 Aug 2017 08:40:09 +0200
> Subject: [PATCH] arm: make kexec depend on MMU
> 
> arm nommu config with KEXEC enabled doesn't compile
> arch/arm/kernel/setup.c: In function 'reserve_crashkernel':
> arch/arm/kernel/setup.c:1005:25: error: 'SECTION_SIZE' undeclared (first
> use in this function)
>  crash_size, SECTION_SIZE);
> 
> since 61603016e212 ("ARM: kexec: fix crashkernel= handling"), which went
> unnoticed for over a year. I only noticed because my nommu testing
> config somehow gained CONFIG_KEXEC unintentionally. This suggests that
> nobody is actually using KEXEC on nommu ARM configs. It is even
> questionable whether kexec works with nommu.
> 
> Make KEXEC depend on MMU to make this clear. If somebody wants to enable
> it on nommu, there will probably be more things to take care of.
> 
> Signed-off-by: Michal Hocko 
> ---
>  arch/arm/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 3f4aa9179337..c8603195d7fc 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -2003,6 +2003,7 @@ config KEXEC
>   bool "Kexec system call (EXPERIMENTAL)"
>   depends on (!SMP || PM_SLEEP_SMP)
>   depends on !CPU_V7M
> + depends on MMU
>   select KEXEC_CORE
>   help
> kexec is a system call that implements the ability to shutdown your
> -- 
> 2.13.2
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs


Re: [PATCH] gpio: dwapb: Add support for 32 interrupts

2018-04-06 Thread Geert Uytterhoeven
Hi Phil,

On Thu, Apr 5, 2018 at 11:42 AM, Phil Edworthy wrote:
> On 30 March 2018 22:26 Andy Shevchenko wrote:
>> On Wed, Mar 28, 2018 at 5:22 PM, Phil Edworthy wrote:
>> > The DesignWare GPIO IP can be configured for either 1 or 32
>> > interrupts,
>>
>> 1 to 32, or just a choice between two?
> Just a choice of 1 or 32.
> Note that by 'configured' I am talking about the hardware being configured in
> RTL prior to manufacturing a device. Once made, you cannot change it.
> This configuration affects the number of output interrupt signals from the
> GPIO Controller block that are connected to an interrupt controller.

Differentiating between different versions of an IP block using DT properties
is usually a bad idea, for several reasons:
  - What if you discover another difference later?
  - You cannot add differentiating properties retroactively, because of
    backwards compatibility with old DTBs.

Hence I think you should introduce a new compatible value instead.
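
To illustrate the idea, here is a minimal userspace sketch (not driver code)
of keying variant data off the compatible value. It assumes
"snps,dw-apb-gpio" is the existing binding; the second compatible string, the
struct, and the function name are hypothetical and only there for
illustration.

```c
#include <stddef.h>
#include <string.h>

/* Per-variant data selected by compatible string (hypothetical layout). */
struct gpio_variant {
	const char *compatible;
	unsigned int nr_irqs;	/* number of interrupt outputs fixed in RTL */
};

static const struct gpio_variant gpio_variants[] = {
	{ "snps,dw-apb-gpio", 1 },		/* assumed existing binding */
	{ "example,dw-apb-gpio-32irq", 32 },	/* hypothetical new compatible */
};

/* Return the variant entry for a compatible string, or NULL if unknown. */
static const struct gpio_variant *gpio_match_variant(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(gpio_variants) / sizeof(gpio_variants[0]); i++) {
		if (strcmp(gpio_variants[i].compatible, compatible) == 0)
			return &gpio_variants[i];
	}
	return NULL;
}
```

If another difference between the variants turns up later, it becomes one
more field in the table entry, and existing DTBs keep working unchanged.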

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH resend] mm/page_alloc: fix comment is __get_free_pages

2018-04-06 Thread Michal Hocko
On Wed 29-11-17 17:04:46, Michal Hocko wrote:
[...]
> From 000bb422fe07adbfa8cd8ed953b18f48647a45d6 Mon Sep 17 00:00:00 2001
> From: Michal Hocko 
> Date: Wed, 29 Nov 2017 17:02:33 +0100
> Subject: [PATCH] mm: drop VM_BUG_ON from __get_free_pages
> 
> There is no real reason to blow up just because the caller doesn't know
> that __get_free_pages cannot return highmem pages. Simply fix that up
> silently. Even if we have some confused users such a fixup will not be
> harmful.
> 
> Signed-off-by: Michal Hocko 

Andrew, have we reached any conclusion for this? Should I repost or drop
it on the floor?
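
For anyone skimming, a minimal userspace sketch of the behaviour change
described in the commit message above. The flag values and function names are
made up for illustration and are not the kernel's real definitions; the point
is only that the old code treated the highmem bit as fatal while the new code
masks it off and carries on.

```c
typedef unsigned int gfp_t;

/* Illustrative bit values only; NOT the kernel's real GFP definitions. */
#define DEMO_GFP_HIGHMEM 0x02u
#define DEMO_GFP_KERNEL  0xc0u

/* Old behaviour: a highmem request is flagged as a bug (would trigger VM_BUG_ON). */
static int demo_old_check_fires(gfp_t gfp_mask)
{
	return (gfp_mask & DEMO_GFP_HIGHMEM) != 0;
}

/* Proposed behaviour: silently drop the highmem bit before allocating. */
static gfp_t demo_sanitize_gfp(gfp_t gfp_mask)
{
	return gfp_mask & ~DEMO_GFP_HIGHMEM;
}
```

A caller that passes the highmem bit by mistake then gets an ordinary lowmem
allocation instead of a crash, and callers that never set the bit see no
change.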

> ---
>  mm/page_alloc.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0d518e9b2ee8..3dd960ea8c13 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4284,9 +4284,7 @@ unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order)
>* __get_free_pages() returns a virtual address, which cannot represent
>* a highmem page
>*/
> - VM_BUG_ON((gfp_mask & __GFP_HIGHMEM) != 0);
> -
> - page = alloc_pages(gfp_mask, order);
> + page = alloc_pages(gfp_mask & ~__GFP_HIGHMEM, order);
>   if (!page)
>   return 0;
>   return (unsigned long) page_address(page);
> -- 
> 2.15.0
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/5] arm64: entry: isb in el1_irq

2018-04-06 Thread James Morse
Hi Yury,

On 05/04/18 18:17, Yury Norov wrote:
> Kernel text patching framework relies on IPI to ensure that other
> SMP cores observe the change. Target core calls isb() in IPI handler

(Odd, if it's just to synchronize the CPU, taking the IPI should be enough).


> path, but not at the beginning of el1_irq entry. There's a chance
> that modified instruction will appear prior isb(), and so will not be
> observed.
> 
> This patch inserts isb early at el1_irq entry to avoid that chance.


> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index ec2ee720e33e..9c06b4b80060 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -593,6 +593,7 @@ ENDPROC(el1_sync)
>  
>   .align  6
>  el1_irq:
> + isb // pairs with aarch64_insn_patch_text
>   kernel_entry 1
>   enable_da_f
>  #ifdef CONFIG_TRACE_IRQFLAGS
> 

An ISB at the beginning of the vectors? This is odd: taking an IRQ to get in
here would be a context-synchronization event too, so the ISB is superfluous.

The ARM-ARM has a list of 'Context-Synchronization events' (Glossary on page
6480 of DDI0487B.b), paraphrasing:
* ISB
* Taking an exception
* ERET
* (...loads of debug stuff...)


Thanks,

James

