Re: [PATCH V3 6/6] crypto/nx: Add P9 NX support for 842 compression engine

2017-09-03 Thread Haren Myneni
On 09/02/2017 09:17 AM, Dan Streetman wrote:
> On Sat, Sep 2, 2017 at 4:40 AM, Haren Myneni  wrote:
>> On 08/29/2017 06:58 AM, Dan Streetman wrote:
>>> On Sat, Jul 22, 2017 at 1:01 AM, Haren Myneni  
>>> wrote:

 This patch adds P9 NX support for 842 compression engine. Virtual
 Accelerator Switchboard (VAS) is used to access 842 engine on P9.

 For each NX engine per chip, set up a receive window using
 vas_rx_win_open(), which configures the RxFIFO with FIFO address, lpid,
 pid and tid values. This unique (lpid, pid, tid) combination will
 be used to identify the target engine.

 For crypto open request, open send window on the NX engine for
 the corresponding chip / cpu where the open request is executed.
 This send window will be closed upon crypto close request.

 NX provides high and normal priority FIFOs. For compression /
 decompression requests, we use only high priority FIFOs in the kernel.

 Each NX request will be communicated to VAS using copy/paste
 instructions with vas_copy_crb() / vas_paste_crb() functions.

 Signed-off-by: Haren Myneni 
 ---
  drivers/crypto/nx/Kconfig  |   1 +
  drivers/crypto/nx/nx-842-powernv.c | 375 
 -
  drivers/crypto/nx/nx-842.c |   2 +-
  3 files changed, 371 insertions(+), 7 deletions(-)

 diff --git a/drivers/crypto/nx/Kconfig b/drivers/crypto/nx/Kconfig
 index ad7552a6998c..cd5dda9c48f4 100644
 --- a/drivers/crypto/nx/Kconfig
 +++ b/drivers/crypto/nx/Kconfig
 @@ -38,6 +38,7 @@ config CRYPTO_DEV_NX_COMPRESS_PSERIES
  config CRYPTO_DEV_NX_COMPRESS_POWERNV
 tristate "Compression acceleration support on PowerNV platform"
 depends on PPC_POWERNV
 +   depends on PPC_VAS
 default y
 help
   Support for PowerPC Nest (NX) compression acceleration. This
 diff --git a/drivers/crypto/nx/nx-842-powernv.c 
 b/drivers/crypto/nx/nx-842-powernv.c
 index c0dd4c7e17d3..13089a0b9dfa 100644
 --- a/drivers/crypto/nx/nx-842-powernv.c
 +++ b/drivers/crypto/nx/nx-842-powernv.c
 @@ -23,6 +23,7 @@
  #include 
  #include 
  #include 
 +#include 

  MODULE_LICENSE("GPL");
  MODULE_AUTHOR("Dan Streetman ");
 @@ -32,6 +33,9 @@ MODULE_ALIAS_CRYPTO("842-nx");

  #define WORKMEM_ALIGN  (CRB_ALIGN)
  #define CSB_WAIT_MAX   (5000) /* ms */
 +#define VAS_RETRIES    (10)
 +/* # of requests allowed per RxFIFO at a time. 0 for unlimited */
 +#define MAX_CREDITS_PER_RXFIFO (1024)

  struct nx842_workmem {
 /* Below fields must be properly aligned */
 @@ -42,16 +46,27 @@ struct nx842_workmem {

 ktime_t start;

 +   struct vas_window *txwin;   /* Used with VAS function */
 char padding[WORKMEM_ALIGN]; /* unused, to allow alignment */
  } __packed __aligned(WORKMEM_ALIGN);

  struct nx842_coproc {
 unsigned int chip_id;
 unsigned int ct;
 -   unsigned int ci;
 +   unsigned int ci;/* Coprocessor instance, used with icswx */
 +   struct {
 +   struct vas_window *rxwin;
 +   int id;
 +   } vas;
 struct list_head list;
  };

 +/*
 + * Send the request to NX engine on the chip for the corresponding CPU
 + * where the process is executing. Use with VAS function.
 + */
 +static DEFINE_PER_CPU(struct nx842_coproc *, coproc_inst);
 +
  /* no cpu hotplug on powernv, so this list never changes after init */
  static LIST_HEAD(nx842_coprocs);
  static unsigned int nx842_ct;  /* used in icswx function */
 @@ -513,6 +528,105 @@ static int nx842_exec_icswx(const unsigned char *in, 
 unsigned int inlen,
  }

  /**
 + * nx842_exec_vas - compress/decompress data using the 842 algorithm
 + *
 + * (De)compression provided by the NX842 coprocessor on IBM PowerNV 
 systems.
 + * This compresses or decompresses the provided input buffer into the 
 provided
 + * output buffer.
 + *
 + * Upon return from this function @outlen contains the length of the
 + * output data.  If there is an error then @outlen will be 0 and an
 + * error will be specified by the return code from this function.
 + *
 + * The @workmem buffer should only be used by one function call at a time.
 + *
 + * @in: input buffer pointer
 + * @inlen: input buffer size
 + * @out: output buffer pointer
 + * @outlenp: output buffer size pointer
 + * @workmem: working memory buffer pointer, size determined by
 + *   nx842_powernv_driver.workmem_size
 + * @fc: function code, see CCW Function Codes in nx-842.h
 + *
 + * Returns:
 + *   0 Success, output of length @outlenp stored in the buffer
>>>

Linux 4.13: Reported regressions as of Sunday, 2017-09-03

2017-09-03 Thread Thorsten Leemhuis
Hi! Find below my fifth regression report for Linux 4.13. It lists 4
regressions I'm currently aware of. There are no new ones; 2 got fixed
since the last report.

You can also find the report at http://bit.ly/lnxregrep413 where I try
to update it every now and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid And please tell me if
there is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
Status: Asked on the list, but looks like issue gets ignored by everyone
Note: I'm a bit unsure if adding this issue to this list was a good
idea. Side note: Was reported against linux-next in May already
Reported: 2017-07-10
http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Cause: https://git.kernel.org/torvalds/c/e585513b76f7

[lkp-robot] [Btrfs] 28785f70ef: xfstests.generic.273.fail
Status: Jeff: "We're not ignoring it. […] collection of bugs that
approximate a correct result, and we're addressing them individually.[…]"
Reported: 2017-07-26 Last known developer activity: 2017-08-10
https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
https://lkml.kernel.org/r/bcd49705-e63a-4439-1620-57cd16f5b...@suse.com
Cause: https://git.kernel.org/torvalds/c/28785f70ef88
Linux-Regression-ID: lr#a7d273

Build regression: cc1: error: '-march=r3000' requires '-mfp32'
Note: is there any way to query 0-day to see if this is still happening?
Reported: 2017-08-13
https://lkml.org/lkml/2017/8/13/38
Cause: https://git.kernel.org/torvalds/c/89a55278dee4

usb:xhci: regression when ATI chipsets detected
Status: Fix in usb-next/usb-testing
Reported: 2017-08-23 Last known developer activity: 2017-08-28
https://lkml.kernel.org/r/1503485760-15146-1-git-send-email-sandeep.si...@amd.com
Cause: https://git.kernel.org/torvalds/c/e788787ef4f9


== Fixed since last report ==

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
Status: was a tracking bug that got closed by the developer that created it
Reported: 2017-07-24
https://bugzilla.kernel.org/show_bug.cgi?id=196459
Cause: https://git.kernel.org/torvalds/c/33e4f80ee69b
Linux-Regression-ID: lr#bd29ab

CIFS mount error -112 due to "SMB3 by default for security reasons"
Status: Situation not perfect, but improved a lot by
https://git.kernel.org/torvalds/c/7e682f766f28
Note: https://lkml.org/lkml/2017/8/31/843
Reported: 2017-08-06
https://bugzilla.kernel.org/show_bug.cgi?id=196599
Cause: https://git.kernel.org/torvalds/c/eef914a9eb5e /
https://git.kernel.org/torvalds/c/908b852df1d5
Linux-Regression-ID: lr#60efe5


Re: [PATCH] devicetree: Remove remaining references/tests for "chosen@0"

2017-09-03 Thread Robert P. J. Day
On Sun, 3 Sep 2017, Benjamin Herrenschmidt wrote:

> On Sat, 2017-09-02 at 04:43 -0400, Robert P. J. Day wrote:
> > Since, according to a recent devicetree ML posting by Rob Herring,
> > the node "/chosen@0" is most likely for real Open Firmware and
> > does not apply to DTSpec, remove all remaining tests and
> > references for that node, of which there are very few left:
>
> Technically that would break Open Firmware systems where the node is
> really called chosen@0
>
> Now I'm not sure such a thing actually exists, however.
>
> My collection of DTs doesn't seem to have one, except in the ancient
> html variants that were extracted by the penguinppc folks for the
> original PowerMac 8600 but I wonder if that's a bug in the
> extraction script since they also have @0 on /packages etc...

  obviously, this isn't a priority issue, i was just working off a
comment by rob herring that "chosen@0" is not defined by the current
DTSpec 0.1, so it seemed appropriate to toss it. if there's a reason
to hang onto it, that's fine with me.

  however, given the diff stat of the change to remove every single
reference to that node name in the current kernel source:

 arch/microblaze/kernel/prom.c | 3 +--
 arch/mips/generic/yamon-dt.c  | 4 
 arch/powerpc/boot/oflib.c | 7 ++-
 drivers/of/base.c | 2 --
 drivers/of/fdt.c  | 5 +
 5 files changed, 4 insertions(+), 17 deletions(-)

it seems inconsistent that three architectures would be testing for
that node, but none of the rest. consistency suggests that every
architecture should take it into account, or none should.
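
To make concrete what "testing for that node" means, here is a minimal sketch in C. The helper name is_chosen_node() and its accept_legacy flag are hypothetical and are not taken from any of the five files in the diffstat; the sketch only illustrates a name check that accepts either "chosen" or the Open Firmware style "chosen@0".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the kernel tree: returns true when
 * a node name is "chosen" or, if accept_legacy is set, the Open Firmware
 * style "chosen@0". Removing the references amounts to always passing
 * accept_legacy = false.
 */
static bool is_chosen_node(const char *name, bool accept_legacy)
{
	assert(name != NULL);

	if (strcmp(name, "chosen") == 0)
		return true;

	return accept_legacy && strcmp(name, "chosen@0") == 0;
}
```

With accept_legacy set to false, a tree whose node really is named "chosen@0" would no longer match, which is the breakage Ben describes for real Open Firmware systems.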

  anyway, not a big deal, i'm fine with any decision.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH] devicetree: Remove remaining references/tests for "chosen@0"

2017-09-03 Thread Benjamin Herrenschmidt
On Sun, 2017-09-03 at 06:43 -0400, Robert P. J. Day wrote:
>   however, given the diff stat of the change to remove every single
> reference to that node name in the current kernel source:
> 
>  arch/microblaze/kernel/prom.c | 3 +--
>  arch/mips/generic/yamon-dt.c  | 4 
>  arch/powerpc/boot/oflib.c | 7 ++-
>  drivers/of/base.c | 2 --
>  drivers/of/fdt.c  | 5 +
>  5 files changed, 4 insertions(+), 17 deletions(-)
> 
> it seems inconsistent that three architectures would be testing for
> that node, but none of the rest. consistency suggests that every
> architecture should take it into account, or none should.
> 
>   anyway, not a big deal, i'm fine with any decision.

powerpc is the only one of the 3 who has an actual open firmware
implementation afaik.

In any case, I think you can probably remove from microblaze and
possibly mips but I'm a bit worried about the generic case and powerpc
boot.

Cheers,
Ben.



Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback

2017-09-03 Thread Jeff Cook
On Wed, Aug 30, 2017, at 10:57 AM, Adam Borowski wrote:
> On Tue, Aug 29, 2017 at 08:56:15PM -0400, Jerome Glisse wrote:
> > I will wait for people to test and for result of my own test before
> > reposting if need be, otherwise i will post as separate patch.
> >
> > > But from a _very_ quick read-through this looks fine. But it obviously
> > > needs testing.
> > > 
> > > People - *especially* the people who saw issues under KVM - can you
> > > > try out Jérôme's patch-series? I added some people to the cc, the full
> > > series is on lkml. Jérôme - do you have a git branch for people to
> > > test that they could easily pull and try out?
> > 
> > https://cgit.freedesktop.org/~glisse/linux mmu-notifier branch
> > git://people.freedesktop.org/~glisse/linux
> 
> Tested your branch as of 10f07641, on a long list of guest VMs.
> No earth-shattering kaboom.

I've been using the mmu_notifier branch @ a3d944233bcf8c for the last 36
hours or so, also without incident.

Unlike most other reporters, I experienced a similar splat on 4.12:

Aug 03 15:02:47 kvm_master kernel: [ cut here ]
Aug 03 15:02:47 kvm_master kernel: WARNING: CPU: 13 PID: 1653 at
arch/x86/kvm/mmu.c:682 mmu_spte_clear_track_bits+0xfb/0x100 [kvm]
Aug 03 15:02:47 kvm_master kernel: Modules linked in: vhost_net vhost
tap xt_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4
xt_tcpudp tun ebtable_filter ebtables ip6table_filter ip6_tables
iptable_filter msr nls_iso8859_1 nls_cp437 intel_rapl ipt_
MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack sb_edac
x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel input_leds pcbc aesni_intel led_class
aes_x86_6
4 mxm_wmi crypto_simd glue_helper uvcvideo cryptd videobuf2_vmalloc
videobuf2_memops igb videobuf2_v4l2 videobuf2_core snd_usb_audio
videodev media joydev ptp evdev mousedev intel_cstate pps_core mac_hid
intel_rapl_perf snd_hda_intel snd_virtuoso snd_usbmidi_lib snd_hda_codec
snd_oxygen_lib snd_hda_core
Aug 03 15:02:47 kvm_master kernel:  snd_mpu401_uart snd_rawmidi
snd_hwdep snd_seq_device snd_pcm snd_timer snd soundcore i2c_algo_bit
pcspkr i2c_i801 lpc_ich ioatdma shpchp dca wmi acpi_power_meter tpm_tis
tpm_tis_core tpm button bridge stp llc sch_fq_codel virtio_pci
virtio_blk virtio_balloon virtio_net virtio_ring virtio kvm_intel kvm sg
ip_tables x_tables hid_logitech_hidpp hid_logitech_dj hid_generic
hid_microsoft usbhid hid sr_mod cdrom sd_mod xhci_pci ahci libahci
xhci_hcd libata usbcore scsi_mod usb_common zfs(PO) zunicode(PO)
zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops drm vfio_pci irqbypass
vfio_virqfd vfio_iommu_type1 vfio vfat fat ext4 crc16 jbd2 fscrypto
mbcache dm_thin_pool dm_cache dm_persistent_data dm_bio_prison dm_bufio
dm_raid raid456 libcrc32c 
Aug 03 15:02:47 kvm_master kernel:  crc32c_generic crc32c_intel
async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq
dm_mod dax raid1 md_mod  
Aug 03 15:02:47 kvm_master kernel: CPU: 13 PID: 1653 Comm: kworker/13:2
Tainted: PB D W  O4.12.3-1-ARCH #1 
Aug 03 15:02:47 kvm_master kernel: Hardware name: Supermicro
SYS-7038A-I/X10DAI, BIOS 2.0a 11/09/2016  
Aug 03 15:02:47 kvm_master kernel: Workqueue: events mmput_async_fn  
Aug 03 15:02:47 kvm_master kernel: task: 9fa89751b900 task.stack:
c179880d8000 
Aug 03 15:02:47 kvm_master kernel: RIP:
0010:mmu_spte_clear_track_bits+0xfb/0x100 [kvm]  
Aug 03 15:02:47 kvm_master kernel: RSP: 0018:c179880dbc20 EFLAGS:
00010246 
Aug 03 15:02:47 kvm_master kernel: RAX:  RBX:
0009c07cce77 RCX: dead00ff   
Aug 03 15:02:47 kvm_master kernel: RDX:  RSI:
9fa82d6d6f08 RDI: f6e76701f300   
Aug 03 15:02:47 kvm_master kernel: RBP: c179880dbc38 R08:
0010 R09: 000d   
Aug 03 15:02:47 kvm_master kernel: R10: 9fa0a56b0008 R11:
9fa0a56b R12: 009c07cc   
Aug 03 15:02:47 kvm_master kernel: R13: 9fa88b99 R14:
9f9e19dbb1b8 R15:    
Aug 03 15:02:47 kvm_master kernel: FS:  ()
GS:9fac5f34() knlGS:
Aug 03 15:02:47 kvm_master kernel: CS:  0010 DS:  ES:  CR0:
80050033   
Aug 03 15:02:47 kvm_master kernel: CR2: d1b542d71000 CR3:
000570a09000 CR4: 003426e0   
Aug 03 15:02:47 kvm_master kernel: DR0:  DR1

[PATCH] sound: soc: fsl: Do not set DAI sysclk when it is equal to system freq

2017-09-03 Thread Lukasz Majewski
The problem is visible in the following setup (on the imx6q):
"simple-audio-card" -> ssi2 -> I2S + I2C -> codec

The function call log (simple-card probe -> CONFIG_SND_SIMPLE_CARD):

asoc_simple_card_init_dai() @ sound/soc/generic/simple-card-utils.c
snd_soc_dai_set_sysclk()
fsl_ssi_set_dai_sysclk() @ sound/soc/fsl/fsl_ssi.c

The last call is changing the bit clock (BCLK) frequency to SSI's IP
block clock (ipg = 66 MHz) [1].
This is wrong, since IMX SSI block requires the I2S BCLK to be less
than 1/5 of [1].

As a result the driver initialization passes without any errors, but the
speaker-test test case breaks.

This commit makes fsl_ssi_set_dai_sysclk() return early when the
frequency passed is equal to [1], so the bit clock frequency is left
unchanged in that case.

Signed-off-by: Lukasz Majewski 
---
 sound/soc/fsl/fsl_ssi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index 173cb84..1186fa9 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -809,6 +809,8 @@ static int fsl_ssi_set_dai_sysclk(struct snd_soc_dai 
*cpu_dai,
int clk_id, unsigned int freq, int dir)
 {
struct fsl_ssi_private *ssi_private = snd_soc_dai_get_drvdata(cpu_dai);
+   if (clk_get_rate(ssi_private->clk) == freq)
+   return 0;
 
ssi_private->bitclk_freq = freq;
 
-- 
2.1.4



Re: [PATCH] sound: soc: fsl: Do not set DAI sysclk when it is equal to system freq

2017-09-03 Thread Fabio Estevam
[Sorry for the top-posting]


The driver currently has:


	/*
	 * Hardware limitation: The bclk rate must be
	 * never greater than 1/5 IPG clock rate
	 */
	if (freq * 5 > clk_get_rate(ssi_private->clk)) {
		dev_err(cpu_dai->dev, "bitclk > ipgclk/5\n");
		return -EINVAL;
	}


Isn't this properly taking care of the clock restriction?





[PATCH 1/10] KVM: PPC: Book3S HV: Use ARRAY_SIZE macro

2017-09-03 Thread Thomas Meyer
Use ARRAY_SIZE macro, rather than explicitly coding some variant of it
yourself.
Found with: find -type f -name "*.c" -o -name "*.h" | xargs perl -p -i -e
's/\bsizeof\s*\(\s*(\w+)\s*\)\s*\/\s*sizeof\s*\(\s*\1\s*\[\s*0\s*\]\s*\)/ARRAY_SIZE(\1)/g'
and manual check/verification.

Signed-off-by: Thomas Meyer 
---

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 359c79cdf0cc..ae80181c4e1f 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -19,6 +19,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1766,7 +1767,7 @@ static struct debugfs_timings_element {
{"cede",offsetof(struct kvm_vcpu, arch.cede_time)},
 };
 
-#define N_TIMINGS  (sizeof(timings) / sizeof(timings[0]))
+#define N_TIMINGS  (ARRAY_SIZE(timings))
 
 struct debugfs_timings_state {
struct kvm_vcpu *vcpu;
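
The replacement in the hunk above can be sketched in standalone C. This is an illustration under stated assumptions: the ARRAY_SIZE() below is a simplified stand-in (the kernel's version additionally rejects pointer arguments at compile time), and struct timing_entry with its entries and offsets is a hypothetical table that only borrows the "cede" name from the patch context.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-in for the kernel's ARRAY_SIZE(); the real macro also
 * rejects pointer arguments at compile time, which this sketch does not.
 */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical table, loosely modeled on the timings[] array in the patch. */
struct timing_entry {
	const char *name;
	size_t offset;
};

static const struct timing_entry timings[] = {
	{ "first", 0 },		/* placeholder entry */
	{ "second", 8 },	/* placeholder entry */
	{ "cede", 16 },
};

#define N_TIMINGS (ARRAY_SIZE(timings))

static size_t count_timings(void)
{
	/* Both spellings must agree; the macro only hides the division. */
	assert(N_TIMINGS == sizeof(timings) / sizeof(timings[0]));
	return N_TIMINGS;
}
```

The value of N_TIMINGS is unchanged by the patch; the macro only replaces the open-coded division with a named idiom.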


[PATCH] ASoC: fsl_spdif: make const arrays rate static

2017-09-03 Thread Colin King
From: Colin Ian King 

Don't populate the const arrays rate on the stack, instead make them
static. Makes the object code smaller by 220 bytes:

Before:
   text    data     bss     dec     hex filename
  24385    9776     128   34289    85f1 sound/soc/fsl/fsl_spdif.o

After:
   text    data     bss     dec     hex filename
  24005    9936     128   34069    8515 sound/soc/fsl/fsl_spdif.o

Signed-off-by: Colin Ian King 
---
 sound/soc/fsl/fsl_spdif.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c
index 7e6cc4da0088..4f7469c1864c 100644
--- a/sound/soc/fsl/fsl_spdif.c
+++ b/sound/soc/fsl/fsl_spdif.c
@@ -1110,7 +1110,7 @@ static u32 fsl_spdif_txclk_caldiv(struct fsl_spdif_priv 
*spdif_priv,
struct clk *clk, u64 savesub,
enum spdif_txrate index, bool round)
 {
-   const u32 rate[] = { 32000, 44100, 48000, 96000, 192000 };
+   static const u32 rate[] = { 32000, 44100, 48000, 96000, 192000 };
bool is_sysclk = clk_is_match(clk, spdif_priv->sysclk);
u64 rate_ideal, rate_actual, sub;
u32 sysclk_dfmin, sysclk_dfmax;
@@ -1169,7 +1169,7 @@ static u32 fsl_spdif_txclk_caldiv(struct fsl_spdif_priv 
*spdif_priv,
 static int fsl_spdif_probe_txclk(struct fsl_spdif_priv *spdif_priv,
enum spdif_txrate index)
 {
-   const u32 rate[] = { 32000, 44100, 48000, 96000, 192000 };
+   static const u32 rate[] = { 32000, 44100, 48000, 96000, 192000 };
struct platform_device *pdev = spdif_priv->pdev;
struct device *dev = &pdev->dev;
u64 savesub = 10, ret;
-- 
2.14.1
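
The difference between the two declarations can be sketched outside the kernel as follows. The function names rate_on_stack() and rate_static() are hypothetical; the array contents come from the patch. Whether a compiler actually emits per-call initialization for the non-static version depends on the compiler and optimization level, so the size figures in the commit message should not be expected to reproduce with this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Automatic array: the initializer may be copied onto the stack per call. */
static uint32_t rate_on_stack(size_t index)
{
	const uint32_t rate[] = { 32000, 44100, 48000, 96000, 192000 };

	assert(index < sizeof(rate) / sizeof(rate[0]));
	return rate[index];
}

/* Static array: one read-only copy with no per-call initialization. */
static uint32_t rate_static(size_t index)
{
	static const uint32_t rate[] = { 32000, 44100, 48000, 96000, 192000 };

	assert(index < sizeof(rate) / sizeof(rate[0]));
	return rate[index];
}
```

Both functions return the same values; the change only affects where the table lives and how it is set up.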



Re: [PATCH V3 6/6] crypto/nx: Add P9 NX support for 842 compression engine

2017-09-03 Thread Dan Streetman
On Sun, Sep 3, 2017 at 4:32 AM, Haren Myneni  wrote:
> On 09/02/2017 09:17 AM, Dan Streetman wrote:
>> On Sat, Sep 2, 2017 at 4:40 AM, Haren Myneni  
>> wrote:
>>> On 08/29/2017 06:58 AM, Dan Streetman wrote:
 On Sat, Jul 22, 2017 at 1:01 AM, Haren Myneni  
 wrote:
>
> This patch adds P9 NX support for 842 compression engine. Virtual
> Accelerator Switchboard (VAS) is used to access 842 engine on P9.
>
> For each NX engine per chip, set up a receive window using
> vas_rx_win_open(), which configures the RxFIFO with FIFO address, lpid,
> pid and tid values. This unique (lpid, pid, tid) combination will
> be used to identify the target engine.
>
> For crypto open request, open send window on the NX engine for
> the corresponding chip / cpu where the open request is executed.
> This send window will be closed upon crypto close request.
>
> NX provides high and normal priority FIFOs. For compression /
> decompression requests, we use only high priority FIFOs in the kernel.
>
> Each NX request will be communicated to VAS using copy/paste
> instructions with vas_copy_crb() / vas_paste_crb() functions.
>
> Signed-off-by: Haren Myneni 
> ---
>  drivers/crypto/nx/Kconfig  |   1 +
>  drivers/crypto/nx/nx-842-powernv.c | 375 
> -
>  drivers/crypto/nx/nx-842.c |   2 +-
>  3 files changed, 371 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/crypto/nx/Kconfig b/drivers/crypto/nx/Kconfig
> index ad7552a6998c..cd5dda9c48f4 100644
> --- a/drivers/crypto/nx/Kconfig
> +++ b/drivers/crypto/nx/Kconfig
> @@ -38,6 +38,7 @@ config CRYPTO_DEV_NX_COMPRESS_PSERIES
>  config CRYPTO_DEV_NX_COMPRESS_POWERNV
> tristate "Compression acceleration support on PowerNV platform"
> depends on PPC_POWERNV
> +   depends on PPC_VAS
> default y
> help
>   Support for PowerPC Nest (NX) compression acceleration. This
> diff --git a/drivers/crypto/nx/nx-842-powernv.c 
> b/drivers/crypto/nx/nx-842-powernv.c
> index c0dd4c7e17d3..13089a0b9dfa 100644
> --- a/drivers/crypto/nx/nx-842-powernv.c
> +++ b/drivers/crypto/nx/nx-842-powernv.c
> @@ -23,6 +23,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("Dan Streetman ");
> @@ -32,6 +33,9 @@ MODULE_ALIAS_CRYPTO("842-nx");
>
>  #define WORKMEM_ALIGN  (CRB_ALIGN)
>  #define CSB_WAIT_MAX   (5000) /* ms */
> +#define VAS_RETRIES    (10)
> +/* # of requests allowed per RxFIFO at a time. 0 for unlimited */
> +#define MAX_CREDITS_PER_RXFIFO (1024)
>
>  struct nx842_workmem {
> /* Below fields must be properly aligned */
> @@ -42,16 +46,27 @@ struct nx842_workmem {
>
> ktime_t start;
>
> +   struct vas_window *txwin;   /* Used with VAS function */
> char padding[WORKMEM_ALIGN]; /* unused, to allow alignment */
>  } __packed __aligned(WORKMEM_ALIGN);
>
>  struct nx842_coproc {
> unsigned int chip_id;
> unsigned int ct;
> -   unsigned int ci;
> +   unsigned int ci;/* Coprocessor instance, used with icswx 
> */
> +   struct {
> +   struct vas_window *rxwin;
> +   int id;
> +   } vas;
> struct list_head list;
>  };
>
> +/*
> + * Send the request to NX engine on the chip for the corresponding CPU
> + * where the process is executing. Use with VAS function.
> + */
> +static DEFINE_PER_CPU(struct nx842_coproc *, coproc_inst);
> +
>  /* no cpu hotplug on powernv, so this list never changes after init */
>  static LIST_HEAD(nx842_coprocs);
>  static unsigned int nx842_ct;  /* used in icswx function */
> @@ -513,6 +528,105 @@ static int nx842_exec_icswx(const unsigned char 
> *in, unsigned int inlen,
>  }
>
>  /**
> + * nx842_exec_vas - compress/decompress data using the 842 algorithm
> + *
> + * (De)compression provided by the NX842 coprocessor on IBM PowerNV 
> systems.
> + * This compresses or decompresses the provided input buffer into the 
> provided
> + * output buffer.
> + *
> + * Upon return from this function @outlen contains the length of the
> + * output data.  If there is an error then @outlen will be 0 and an
> + * error will be specified by the return code from this function.
> + *
> + * The @workmem buffer should only be used by one function call at a 
> time.
> + *
> + * @in: input buffer pointer
> + * @inlen: input buffer size
> + * @out: output buffer pointer
> + * @outlenp: output buffer size pointer
> + * @workmem: working memory buffer pointer, size determined by
> + *   nx842_powern

Re: [PATCH] sound: soc: fsl: Do not set DAI sysclk when it is equal to system freq

2017-09-03 Thread Łukasz Majewski

Hi Fabio,



> [Sorry for the top-posting]
>
> The driver currently has:
>
> 	/*
> 	 * Hardware limitation: The bclk rate must be
> 	 * never greater than 1/5 IPG clock rate
> 	 */
> 	if (freq * 5 > clk_get_rate(ssi_private->clk)) {
> 		dev_err(cpu_dai->dev, "bitclk > ipgclk/5\n");
> 		return -EINVAL;
> 	}



Unfortunately not.

This is the part of fsl_ssi_set_bclk() function which is called after 
fsl_ssi_set_dai_sysclk() (which sets ssi_private->bitclk_freq = freq;).


Before the aforementioned check we do have:

	if (ssi_private->bitclk_freq)
		freq = ssi_private->bitclk_freq;
	else
		freq = params_channels(hw_params) * 32 *
		       params_rate(hw_params);


Which assigns freq = bitclk_freq (66 MHz)

And then we break on this particular check:

66MHz * 5 > 66 MHz.



The culprit IMHO is the  ssi_private->bitclk_freq = freq; in the 
fsl_ssi_set_dai_sysclk(), since we _should_ set SSI's IP block clock 
(ssi_private->clk), not the bit clock (BCLK).



This patch just returns early when it detects a change that does not
need to be made.




> Isn't this properly taking care of the clock restriction?







--
Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,  Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de


Re: [PATCH] sound: soc: fsl: Do not set DAI sysclk when it is equal to system freq

2017-09-03 Thread Fabio Estevam
On Sun, Sep 3, 2017 at 11:40 AM, Łukasz Majewski  wrote:

> This is the part of fsl_ssi_set_bclk() function which is called after
> fsl_ssi_set_dai_sysclk() (which sets ssi_private->bitclk_freq = freq;).
>
> Before the aforementioned check we do have:
>
> if (ssi_private->bitclk_freq)
> freq = ssi_private->bitclk_freq;
> else
> freq = params_channels(hw_params) * 32 *
> params_rate(hw_params);
>
>
> Which assigns freq = bitclk_freq (66 MHz)
>
> And then we break on this particular check:
>
> 66MHz * 5 > 66 MHz.
>
>
>
> The culprit IMHO is the  ssi_private->bitclk_freq = freq; in the
> fsl_ssi_set_dai_sysclk(), since we _should_ set SSI's IP block clock
> (ssi_private->clk), not the bit clock (BCLK).
>
>
> This patch just quits early if it detects change, which don't need to be
> done.

Thanks for the clarification.

Reviewed-by: Fabio Estevam 


Re: Linux 4.13: Reported regressions as of Sunday, 2017-09-03

2017-09-03 Thread Linus Torvalds
On Sun, Sep 3, 2017 at 2:36 AM, Thorsten Leemhuis
 wrote:
>
> [x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
> Status: Asked on the list, but looks like issue gets ignored by everyone
> Note: I'm a bit unsure if adding this issue to this list was a good
> idea. Side note: Was reported against linux-next in May already
> Reported: 2017-07-10
> http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
> Cause: https://git.kernel.org/torvalds/c/e585513b76f7

Sadly, while I love the concept of performance tracking, the
"will-it-scale" reports haven't really been reliable enough to really
be useful. There is a _ton_ of noise in the numbers, and the
test-cases don't seem to be stable enough to really track sanely.

I wish it was otherwise, because we also got a report of "57.3%
improvement of will-it-scale.per_process_ops" this release.

So I find the kernel test robot performance tracking very interesting
in theory, but as things stand now I think it's just that:
"interesting". Not quite ready for action.

 Linus


[PATCH v3 1/2] powerpc/mm: Export flush_all_mm()

2017-09-03 Thread Frederic Barrat
With the optimizations introduced by commit a46cc7a90fd8
("powerpc/mm/radix: Improve TLB/PWC flushes"), flush_tlb_mm() no
longer flushes the page walk cache with radix. This patch introduces
flush_all_mm(), which flushes everything, tlb and pwc, for a given mm.

Signed-off-by: Frederic Barrat 
---
Changelog:
v3: add comment to explain limitations on hash
v2: this patch is new

 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h  | 20 
 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h |  3 +++
 arch/powerpc/include/asm/book3s/64/tlbflush.h   | 15 +++
 arch/powerpc/mm/tlb-radix.c |  6 --
 4 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
index 2f6373144e2c..2ac45cf85042 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
@@ -65,6 +65,26 @@ static inline void hash__flush_tlb_mm(struct mm_struct *mm)
 {
 }
 
+static inline void hash__local_flush_all_mm(struct mm_struct *mm)
+{
+   /*
+* There's no Page Walk Cache for hash, so what is needed is
+* the same as flush_tlb_mm(), which doesn't really make sense
+* with hash. So the only thing we could do is flush the
+* entire LPID! Punt for now, as it's not being used.
+*/
+}
+
+static inline void hash__flush_all_mm(struct mm_struct *mm)
+{
+   /*
+* There's no Page Walk Cache for hash, so what is needed is
+* the same as flush_tlb_mm(), which doesn't really make sense
+* with hash. So the only thing we could do is flush the
+* entire LPID! Punt for now, as it's not being used.
+*/
+}
+
 static inline void hash__local_flush_tlb_page(struct vm_area_struct *vma,
  unsigned long vmaddr)
 {
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 9b433a624bf3..af06c6fe8a9f 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -21,17 +21,20 @@ extern void radix__flush_tlb_range(struct vm_area_struct 
*vma, unsigned long sta
 extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long 
end);
 
 extern void radix__local_flush_tlb_mm(struct mm_struct *mm);
+extern void radix__local_flush_all_mm(struct mm_struct *mm);
 extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned 
long vmaddr);
 extern void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned 
long vmaddr,
  int psize);
 extern void radix__tlb_flush(struct mmu_gather *tlb);
 #ifdef CONFIG_SMP
 extern void radix__flush_tlb_mm(struct mm_struct *mm);
+extern void radix__flush_all_mm(struct mm_struct *mm);
 extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr);
 extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long 
vmaddr,
int psize);
 #else
 #define radix__flush_tlb_mm(mm)        radix__local_flush_tlb_mm(mm)
+#define radix__flush_all_mm(mm)        radix__local_flush_all_mm(mm)
 #define radix__flush_tlb_page(vma,addr)
radix__local_flush_tlb_page(vma,addr)
 #define radix__flush_tlb_page_psize(mm,addr,p) 
radix__local_flush_tlb_page_psize(mm,addr,p)
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 72b925f97bab..70760d018bcd 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -57,6 +57,13 @@ static inline void local_flush_tlb_page(struct 
vm_area_struct *vma,
return hash__local_flush_tlb_page(vma, vmaddr);
 }
 
+static inline void local_flush_all_mm(struct mm_struct *mm)
+{
+   if (radix_enabled())
+   return radix__local_flush_all_mm(mm);
+   return hash__local_flush_all_mm(mm);
+}
+
 static inline void tlb_flush(struct mmu_gather *tlb)
 {
if (radix_enabled())
@@ -79,9 +86,17 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
return radix__flush_tlb_page(vma, vmaddr);
return hash__flush_tlb_page(vma, vmaddr);
 }
+
+static inline void flush_all_mm(struct mm_struct *mm)
+{
+   if (radix_enabled())
+   return radix__flush_all_mm(mm);
+   return hash__flush_all_mm(mm);
+}
 #else
 #define flush_tlb_mm(mm)   local_flush_tlb_mm(mm)
 #define flush_tlb_page(vma, addr)  local_flush_tlb_page(vma, addr)
+#define flush_all_mm(mm)   local_flush_all_mm(mm)
 #endif /* CONFIG_SMP */
 /*
  * flush the page walk cache for the address
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index b3e849c4886e..5a1f46eff3a2 100644
--- a/arch/powerpc/m

[PATCH v3 2/2] cxl: Enable global TLBIs for cxl contexts

2017-09-03 Thread Frederic Barrat
The PSL and nMMU need to see all TLB invalidations for the memory
contexts used on the adapter. For the hash memory model, it is done by
making all TLBIs global as soon as the cxl driver is in use. For
radix, we need something similar, but we can refine and only convert
to global the invalidations for contexts actually used by the device.

The new mm_context_add_copro() API increments the 'active_cpus' count
for the contexts attached to the cxl adapter. As soon as there's more
than 1 active cpu, the TLBIs for the context become global. Active cpu
count must be decremented when detaching to restore locality if
possible and to avoid overflowing the counter.

The hash memory model support is somewhat limited, as we can't
decrement the active cpus count when mm_context_remove_copro() is
called, because we can't flush the TLB for a mm on hash. So TLBIs
remain global on hash.
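
The counting scheme described above can be modelled in user space roughly as
follows. This is only a sketch under stated assumptions: struct mm_model, the
model_*() helpers and the global_flushes counter are hypothetical stand-ins
for illustration, and the flush is reduced to a counter increment rather than
a real flush_all_mm() call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-in for the per-mm context; not the kernel structure. */
struct mm_model {
	atomic_int active_cpus;	/* CPUs plus attached coprocessors */
	int global_flushes;	/* counts full flushes, for illustration */
	bool radix;		/* true for radix, false for hash */
};

static void model_init(struct mm_model *mm, bool radix)
{
	atomic_init(&mm->active_cpus, 1);	/* one CPU is running the mm */
	mm->global_flushes = 0;
	mm->radix = radix;
}

/* Invalidations must be broadcast once more than one user is active. */
static bool tlbi_is_global(struct mm_model *mm)
{
	return atomic_load(&mm->active_cpus) > 1;
}

static void model_add_copro(struct mm_model *mm)
{
	atomic_fetch_add(&mm->active_cpus, 1);
}

static void model_remove_copro(struct mm_model *mm)
{
	/* On hash there is no suitable flush, so the count is left alone. */
	if (!mm->radix)
		return;
	assert(atomic_load(&mm->active_cpus) > 0);
	mm->global_flushes++;	/* stands in for flush_all_mm() */
	atomic_fetch_sub(&mm->active_cpus, 1);
}
```

In this model a radix context goes back to local invalidations after the
coprocessor is detached, while a hash context stays global, which mirrors the
limitation described above.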

Signed-off-by: Frederic Barrat 
Fixes: f24be42aab37 ("cxl: Add psl9 specific code")
---
Changelog:
v3: don't decrement active cpus count with hash, as we don't know how to flush
v2: Replace flush_tlb_mm() by the new flush_all_mm() to flush the TLBs
and PWCs (thanks to Ben)

 arch/powerpc/include/asm/mmu_context.h | 46 ++
 arch/powerpc/mm/mmu_context.c  |  9 ---
 drivers/misc/cxl/api.c | 22 +---
 drivers/misc/cxl/context.c |  3 +++
 drivers/misc/cxl/file.c| 19 --
 5 files changed, 85 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index 309592589e30..a0d7145d6cd2 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -77,6 +77,52 @@ extern void switch_cop(struct mm_struct *next);
 extern int use_cop(unsigned long acop, struct mm_struct *mm);
 extern void drop_cop(unsigned long acop, struct mm_struct *mm);
 
+#ifdef CONFIG_PPC_BOOK3S_64
+static inline void inc_mm_active_cpus(struct mm_struct *mm)
+{
+   atomic_inc(&mm->context.active_cpus);
+}
+
+static inline void dec_mm_active_cpus(struct mm_struct *mm)
+{
+   atomic_dec(&mm->context.active_cpus);
+}
+
+static inline void mm_context_add_copro(struct mm_struct *mm)
+{
+   /*
+* On hash, should only be called once over the lifetime of
+* the context, as we can't decrement the active cpus count
+* and flush properly for the time being.
+*/
+   inc_mm_active_cpus(mm);
+}
+
+static inline void mm_context_remove_copro(struct mm_struct *mm)
+{
+   /*
+* Need to broadcast a global flush of the full mm before
+* decrementing active_cpus count, as the next TLBI may be
+* local and the nMMU and/or PSL need to be cleaned up.
+* Should be rare enough so that it's acceptable.
+*
+* Skip on hash, as we don't know how to do the proper flush
+* for the time being. Invalidations will remain global if
+* used on hash.
+*/
+   if (radix_enabled()) {
+   flush_all_mm(mm);
+   dec_mm_active_cpus(mm);
+   }
+}
+#else
+static inline void inc_mm_active_cpus(struct mm_struct *mm) { }
+static inline void dec_mm_active_cpus(struct mm_struct *mm) { }
+static inline void mm_context_add_copro(struct mm_struct *mm) { }
+static inline void mm_context_remove_copro(struct mm_struct *mm) { }
+#endif
+
+
 extern void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
   struct task_struct *tsk);
 
diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c
index 0f613bc63c50..d60a62bf4fc7 100644
--- a/arch/powerpc/mm/mmu_context.c
+++ b/arch/powerpc/mm/mmu_context.c
@@ -34,15 +34,6 @@ static inline void switch_mm_pgdir(struct task_struct *tsk,
   struct mm_struct *mm) { }
 #endif
 
-#ifdef CONFIG_PPC_BOOK3S_64
-static inline void inc_mm_active_cpus(struct mm_struct *mm)
-{
-   atomic_inc(&mm->context.active_cpus);
-}
-#else
-static inline void inc_mm_active_cpus(struct mm_struct *mm) { }
-#endif
-
 void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
struct task_struct *tsk)
 {
diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index a0c44d16bf30..1137a2cc1d3e 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cxl.h"
 
@@ -331,9 +332,12 @@ int cxl_start_context(struct cxl_context *ctx, u64 wed,
/* ensure this mm_struct can't be freed */
cxl_context_mm_count_get(ctx);
 
-   /* decrement the use count */
-   if (ctx->mm)
+   if (ctx->mm) {
+   /* decrement the use count from above */
mmput(ctx->mm);
+   /* make TLBIs for this context global */
+   mm_context_add_

[powerpc:test 314/320] WARNING: vmlinux.o(.text+0xa5500): Section mismatch in reference from the function .xive_spapr_init() to the function .init.text:.xive_core_init()

2017-09-03 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git test
head:   5f121292f0a0873fa2cd3a0292fb4860a8953f38
commit: eac1e731b59ee3b5f5e641a7765c7ed41ed26226 [314/320] powerpc/xive: guest 
exploitation of the XIVE interrupt controller
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout eac1e731b59ee3b5f5e641a7765c7ed41ed26226
# save the attached .config to linux build tree
make.cross ARCH=powerpc 

All warnings (new ones prefixed by >>):

>> WARNING: vmlinux.o(.text+0xa5500): Section mismatch in reference from the function .xive_spapr_init() to the function .init.text:.xive_core_init()
   The function .xive_spapr_init() references
   the function __init .xive_core_init().
   This is often because .xive_spapr_init lacks a __init
   annotation or the annotation of .xive_core_init is wrong.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation




Re: [PATCH] powerpc/powernv: Turn on SCSI_AACRAID in powernv_defconfig

2017-09-03 Thread Stewart Smith
Michael Ellerman  writes:
>> 2. On a bare metal machine, if you set ipr.fast_reboot=1 on the skiboot
>>kernel, then we should also avoid resetting the ipr adapter, so ipr
>>init on the kernel being kexec booted from skiboot should be extremely 
>> fast. 
>
> OK, I didn't know that was an option, so that might help.
>
>> ...
>> If you've got cases where ipr init is taking a long time, I'd be
>> interested to know what scenarios are the most annoying to see if there
>> is any opportunity to improve.
>
> Yeah booting bare metal is where I see it (not using ipr.fast_reboot).

Hrm... We should probably enable that by default for petitboot then.

It'd at least cut some time off booting straight through to OS.

-- 
Stewart Smith
OPAL Architect, IBM.



[PATCH] powerpc: Fix kernel crash in emulation of vector loads and stores

2017-09-03 Thread Paul Mackerras
Commit 350779a29f11 ("powerpc: Handle most loads and stores in
instruction emulation code", 2017-08-30) changed the register usage
in get_vr and put_vr with the aim of leaving the register number in
r3 untouched on return.  Unfortunately, r6 was not a good choice, as
the callers as of 350779a29f11 store a MSR value in r6.  Then, in
commit c22435a5f3d8 ("powerpc: Emulate FP/vector/VSX loads/stores
correctly when regs not live", 2017-08-30), the saving and restoring
of the MSR got moved into get_vr and put_vr.  Either way, the effect
is that we put a value in MSR that only has the 0x3f8 bits non-zero,
meaning that we are switching to 32-bit mode.  That leads to a crash
like this:

Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0x0007bea0
Oops: Kernel access of bad area, sig: 11 [#12]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: vmx_crypto binfmt_misc ip_tables x_tables autofs4 
crc32c_vpmsum
CPU: 6 PID: 32659 Comm: trashy_testcase Tainted: G  D 
4.13.0-rc2-00313-gf3026f57e6ed-dirty #23
task: c00f1bb9e780 task.stack: c00f1ba98000
NIP:  0007bea0 LR: c007b054 CTR: c007be70
REGS: c00f1ba9b960 TRAP: 0400   Tainted: G  D  
(4.13.0-rc2-00313-gf3026f57e6ed-dirty)
MSR:  1000400010a1   CR: 48000228  XER: 
CFAR: c007be74 SOFTE: 1
GPR00: c007b054 c00f1ba9bbe0 c0e6e000 001d
GPR04: c00f1ba9bc00 c007be70 00e8 92009033
GPR08: 0200 1282f033 0b0a0900 1009
GPR12:  cfd42100 0706050303020100 a5a5a5a5a5a5a5a5
GPR16: 2e2e2e2e2e2de70c 2e2e2e2e2e2e2e2d 00ff00ff 060604020202
GPR20: 005b  03020100 
GPR24: c00f1ab90020 c00f1ba9bc00 0001 0001
GPR28: c00f1ba9bc90 c00f1ba9bea0 0b0a0908 0001
NIP [0007bea0] 0x7bea0
LR [c007b054] emulate_loadstore+0x1044/0x1280
Call Trace:
[c00f1ba9bbe0] [c0076b80] analyse_instr+0x60/0x34f0 (unreliable)
[c00f1ba9bc70] [c007b7ec] emulate_step+0x23c/0x544
[c00f1ba9bce0] [c0053424] arch_uprobe_skip_sstep+0x24/0x40
[c00f1ba9bd00] [c024b2f8] uprobe_notify_resume+0x598/0xba0
[c00f1ba9be00] [c001c284] do_notify_resume+0xd4/0xf0
[c00f1ba9be30] [c000bd44] ret_from_except_lite+0x70/0x74
Instruction dump:
       
       
---[ end trace a7ae7a7f3e0256b5 ]---

To fix this, we just revert to using r3 as before, since the callers
don't rely on r3 being left unmodified.

Fortunately, this can't be triggered by a misaligned load or store,
because vector loads and stores truncate misaligned addresses rather
than taking an alignment interrupt.  It can be triggered using
uprobes.
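
As a rough illustration of why the clobbered r6 is fatal, the sketch below
redoes the arithmetic in plain C. It assumes the rlwinm in get_vr/put_vr
turns the register number into a small byte offset (mask 0xf8, as in the
diff below) and that 64-bit mode is selected by the top MSR bit (MSR_SF).
The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_SF (UINT64_C(1) << 63)	/* 64-bit mode bit of the MSR */

/* Equivalent of "rlwinm rD,rS,3,0xf8" for a register number 0..31. */
static uint64_t vr_table_offset(unsigned int vr)
{
	assert(vr < 32);
	return ((uint64_t)vr << 3) & 0xf8;
}

/* True if an MSR image would keep the CPU in 64-bit mode. */
static bool msr_is_64bit(uint64_t msr)
{
	return (msr & MSR_SF) != 0;
}
```

Any offset produced by vr_table_offset() is at most 0xf8, so writing it back
as an MSR image necessarily clears MSR_SF, which matches the switch to 32-bit
mode described above.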

Fixes: 350779a29f11 ("powerpc: Handle most loads and stores in instruction 
emulation code")
Reported-by: Anton Blanchard 
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/lib/ldstfp.S | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/lib/ldstfp.S b/arch/powerpc/lib/ldstfp.S
index 7b5cf5e..ae15eba 100644
--- a/arch/powerpc/lib/ldstfp.S
+++ b/arch/powerpc/lib/ldstfp.S
@@ -77,7 +77,7 @@ _GLOBAL(get_vr)
oris    r7, r6, MSR_VEC@h
MTMSRD(r7)
isync
-   rlwinm  r6,r3,3,0xf8
+   rlwinm  r3,r3,3,0xf8
bcl 20,31,1f
 reg = 0
.rept   32
@@ -86,7 +86,7 @@ reg = 0
 reg = reg + 1
.endr
 1: mflr    r5
-   add r5,r6,r5
+   add r5,r3,r5
mtctr   r5
mtlr    r0
bctr
@@ -101,7 +101,7 @@ _GLOBAL(put_vr)
oris    r7, r6, MSR_VEC@h
MTMSRD(r7)
isync
-   rlwinm  r6,r3,3,0xf8
+   rlwinm  r3,r3,3,0xf8
bcl 20,31,1f
 reg = 0
.rept   32
@@ -110,7 +110,7 @@ reg = 0
 reg = reg + 1
.endr
 1: mflr    r5
-   add r5,r6,r5
+   add r5,r3,r5
mtctr   r5
mtlr    r0
bctr
-- 
2.7.4



Re: [PATCH] powerpc: Fix kernel crash in emulation of vector loads and stores

2017-09-03 Thread Anton Blanchard
Hi Paul,

> Commit 350779a29f11 ("powerpc: Handle most loads and stores in
> instruction emulation code", 2017-08-30) changed the register usage
> in get_vr and put_vr with the aim of leaving the register number in
> r3 untouched on return.  Unfortunately, r6 was not a good choice, as
> the callers as of 350779a29f11 store a MSR value in r6.  Then, in
> commit c22435a5f3d8 ("powerpc: Emulate FP/vector/VSX loads/stores
> correctly when regs not live", 2017-08-30), the saving and restoring
> of the MSR got moved into get_vr and put_vr.  Either way, the effect
> is that we put a value in MSR that only has the 0x3f8 bits non-zero,
> meaning that we are switching to 32-bit mode.  That leads to a crash
> like this:

Thanks! This fixed the issues I was seeing:

Tested-by: Anton Blanchard 

Anton


[PATCH] powerpc: Implement cross-endian emulation of larx and stcx instructions

2017-09-03 Thread Paul Mackerras
This adds byte-swapping of the values loaded or stored by the
load with reservation (larx) and store conditional (stcx)
instructions when the execution environment being emulated has
the opposite endianness to the kernel.  This should have been done
in commit d955189ae427 ("powerpc: Handle opposite-endian processes
in emulation code", 2017-08-30) but was missed then.

Since op->reg is used quite frequently in emulate_loadstore(),
this puts op->reg into rd at the beginning of the function and
replaces subsequent uses of op->reg with rd.

This does not affect alignment interrupt handling, since these
instructions cannot be emulated when the address is not aligned,
because we have no way to do atomic unaligned accesses, in
general.
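
For readers unfamiliar with the 16-byte case, the following user-space sketch
shows the kind of fix-up involved: each 64-bit half is byte-reversed and the
two halves trade places. The byterev_8() here is a portable re-implementation
written for this example, and swap_quad() is a hypothetical helper; neither is
the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 64-bit value (portable, no builtins). */
static uint64_t byterev_8(uint64_t x)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (x & 0xff);
		x >>= 8;
	}
	return r;
}

/* Cross-endian fix-up for a 16-byte value held in a register pair. */
static void swap_quad(uint64_t pair[2])
{
	uint64_t tmp;

	assert(pair);
	tmp = byterev_8(pair[0]);
	pair[0] = byterev_8(pair[1]);
	pair[1] = tmp;
}
```

Applying swap_quad() twice returns the original pair, which is a quick sanity
check that the load-side and store-side conversions are symmetric.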

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/lib/sstep.c | 47 ++-
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index fb9f58b..0590417 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2719,6 +2719,7 @@ int emulate_loadstore(struct pt_regs *regs, struct 
instruction_op *op)
type = op->type & INSTR_TYPE_MASK;
cross_endian = (regs->msr & MSR_LE) != (MSR_KERNEL & MSR_LE);
ea = truncate_if_32bit(regs->msr, op->ea);
+   rd = op->reg;
 
switch (type) {
case LARX:
@@ -2745,7 +2746,12 @@ int emulate_loadstore(struct pt_regs *regs, struct 
instruction_op *op)
__get_user_asmx(val, ea, err, "ldarx");
break;
case 16:
-   err = do_lqarx(ea, &regs->gpr[op->reg]);
+   err = do_lqarx(ea, &regs->gpr[rd]);
+   if (unlikely(cross_endian)) {
+   val = byterev_8(regs->gpr[rd]);
+   regs->gpr[rd] = byterev_8(regs->gpr[rd + 1]);
+   regs->gpr[rd + 1] = val;
+   }
break;
 #endif
default:
@@ -2755,8 +2761,11 @@ int emulate_loadstore(struct pt_regs *regs, struct 
instruction_op *op)
regs->dar = ea;
break;
}
-   if (size < 16)
-   regs->gpr[op->reg] = val;
+   if (size < 16) {
+   if (unlikely(cross_endian))
+   val = byterev_8(val);
+   regs->gpr[rd] = val;
+   }
break;
 
case STCX:
@@ -2764,6 +2773,8 @@ int emulate_loadstore(struct pt_regs *regs, struct 
instruction_op *op)
return -EACCES; /* can't handle misaligned */
if (!address_ok(regs, ea, size))
return -EFAULT;
+   if (unlikely(cross_endian))
+   do_byterev(&op->val, size);
err = 0;
switch (size) {
 #ifdef __powerpc64__
@@ -2782,8 +2793,12 @@ int emulate_loadstore(struct pt_regs *regs, struct 
instruction_op *op)
__put_user_asmx(op->val, ea, err, "stdcx.", cr);
break;
case 16:
-   err = do_stqcx(ea, regs->gpr[op->reg],
-  regs->gpr[op->reg + 1], &cr);
+   if (unlikely(cross_endian))
+   err = do_stqcx(ea, byterev_8(regs->gpr[rd + 1]),
+  op->val, &cr);
+   else
+   err = do_stqcx(ea, op->val, regs->gpr[rd + 1],
+  &cr);
break;
 #endif
default:
@@ -2800,16 +2815,16 @@ int emulate_loadstore(struct pt_regs *regs, struct 
instruction_op *op)
case LOAD:
 #ifdef __powerpc64__
if (size == 16) {
-   err = emulate_lq(regs, ea, op->reg, cross_endian);
+   err = emulate_lq(regs, ea, rd, cross_endian);
break;
}
 #endif
-   err = read_mem(&regs->gpr[op->reg], ea, size, regs);
+   err = read_mem(&regs->gpr[rd], ea, size, regs);
if (!err) {
if (op->type & SIGNEXT)
-   do_signext(&regs->gpr[op->reg], size);
+   do_signext(&regs->gpr[rd], size);
if ((op->type & BYTEREV) == (cross_endian ? 0 : 
BYTEREV))
-   do_byterev(&regs->gpr[op->reg], size);
+   do_byterev(&regs->gpr[rd], size);
}
break;
 
@@ -2830,7 +2845,7 @@ int emulate_loadstore(struct pt_regs *regs, struct 
instruction_op *op)
case LOAD_VMX:
if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
return 0;
-   err = do_ve

[RFC/PATCH] powerpc/eeh: Create PHB PEs after EEH is initialized

2017-09-03 Thread Benjamin Herrenschmidt
Otherwise we end up not yet having computed the right
diag data size on powernv where EEH initialization
is delayed, thus causing memory corruption later on
when calling OPAL.

Signed-off-by: Benjamin Herrenschmidt 
---

Russell, what do you think ? The end result is that the PEs
for the PHBs are created much later. I suppose that might cause
changes of behaviour if, for example, we hit EEH while probing
before we call eeh_init() again. Hopefully we have all the
appropriate NULL checks to deal with it though...

Another option would be to break up eeh_init() between calling
the backend init, which we would unconditionally do early (ie
removing the special powernv test) and a new eeh_create_pe()'s
which would explicitly be called by the backend at the "right"
time...

Without either fix, we are currently corrupting kernel memory
when hitting EEH on a PHB PE (fences for example).

If this ends up being the right approach, then we need a CC
stable as well.

 arch/powerpc/kernel/eeh.c |  4 
 arch/powerpc/kernel/eeh_dev.c | 18 --
 2 files changed, 4 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 63992b2d8e15..f27eecd5ec7f 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1018,6 +1018,10 @@ int eeh_init(void)
} else if ((ret = eeh_ops->init()))
return ret;
 
+   /* Initialize PHB PEs */
+   list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
+   eeh_dev_phb_init_dynamic(hose);
+
/* Initialize EEH event */
ret = eeh_event_init();
if (ret)
diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c
index d6b2ca70d14d..0820b73288c0 100644
--- a/arch/powerpc/kernel/eeh_dev.c
+++ b/arch/powerpc/kernel/eeh_dev.c
@@ -83,21 +83,3 @@ void eeh_dev_phb_init_dynamic(struct pci_controller *phb)
/* EEH PE for PHB */
eeh_phb_pe_create(phb);
 }
-
-/**
- * eeh_dev_phb_init - Create EEH devices for devices included in existing PHBs
- *
- * Scan all the existing PHBs and create EEH devices for their OF
- * nodes and their children OF nodes
- */
-static int __init eeh_dev_phb_init(void)
-{
-   struct pci_controller *phb, *tmp;
-
-   list_for_each_entry_safe(phb, tmp, &hose_list, list_node)
-   eeh_dev_phb_init_dynamic(phb);
-
-   return 0;
-}
-
-core_initcall(eeh_dev_phb_init);



[PATCH] powerpc/xive: Fix section __init warning

2017-09-03 Thread Cédric Le Goater
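xive_spapr_init() is called from a __init routine and calls __init
routines, so annotate it with __init as well. This fixes the section
mismatch warning reported by the kbuild test robot.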

Signed-off-by: Cédric Le Goater 
---
 arch/powerpc/sysdev/xive/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index 43e9eeb0d39f..f24a70bc6855 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -593,7 +593,7 @@ static bool xive_get_max_prio(u8 *max_prio)
return true;
 }
 
-bool xive_spapr_init(void)
+bool __init xive_spapr_init(void)
 {
struct device_node *np;
struct resource r;
-- 
2.13.5