Vaibhav Jain writes:
> Make sure to set the valid-bit in software-state field of the
> populated PE. This was earlier missing for dedicated mode AFUs, hence
> was causing a PSL freeze when the AFU was activated.
>
> Signed-off-by: Vaibhav Jain
> ---
> drivers/misc/cxl/native.c | 4
> 1 fil
"Bryant G. Ly" writes:
> For a PCI device its pci_dn can be retrieved from
> pdev->dev.archdata.firmware_data, PCI_DN(devnode), or parent's list.
> Thus, we should just use the generic function pci_get_pdn_by_devfn
> to get the pci_dn.
>
> Signed-off-by: Bryant G. Ly
Minor issue, it's preferab
Hi Bryant,
Thanks for the patch, a few comments/questions.
How have you tested this?
"Bryant G. Ly" writes:
> For a PCI device its pci_dn can be retrieved from
> pdev->dev.archdata.firmware_data, PCI_DN(devnode), or parent's list.
> Thus, we should just use the generic function pci_get_pdn_by_
Haren Myneni [ha...@linux.vnet.ibm.com] wrote:
>
> This patch adds P9 NX support for 842 compression engine. Virtual
> Accelerator Switchboard (VAS) is used to access 842 engine on P9.
>
> For each NX engine per chip, setup receive window using
> vas_rx_win_open() which configures RxFIFo with FIF
Define the vas_win_close() interface which should be used to close a
send or receive window.
While the hardware configurations required to open send and receive windows
differ, the configuration to close a window is the same for both. So we use
a single interface to close the window.
Signed-off-
Define interfaces (wrappers) to the 'copy' and 'paste' instructions
(which are new in PowerISA 3.0). These are intended to be used
by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to the
NX hardware engines.
Signed-off-by: Sukadev Bhattiprolu
---
Changelog[v8]:
- [Michael E
Define an interface to open a VAS send window. This interface is
intended to be used by the Nest Accelerator (NX) driver(s) to open
a send window and use it to submit compression/encryption requests
to a VAS receive window.
The receive window, identified by the [vasid, cop] parameters, must
already b
Define the vas_rx_win_open() interface. This interface is intended to be
used by the Nest Accelerator (NX) driver(s) to setup receive windows for
one or more NX engines (which implement compression/encryption algorithms
in the hardware).
Follow-on patches will provide an interface to close the win
Define helpers to allocate/free VAS window objects. These will
be used in follow-on patches when opening/closing windows.
Changelog[v8]:
- [Michael Ellerman] Make some functions static; retry if
ida_get_new() fails with EAGAIN; fix a couple of leaks in ids
Signed-off-by: Sukadev
Define helpers to initialize window context registers of the VAS
hardware. These will be used in follow-on patches when opening/closing
VAS windows.
Signed-off-by: Sukadev Bhattiprolu
---
Changelog[v8]:
- Update comments (ISA references and some cleanup)
- Use 0 or 1 when setting
Define some helper functions to access the MMIO regions. We use these
in follow-on patches to read/write VAS hardware registers. They are
also used to later issue 'paste' instructions to submit requests to
the NX hardware engines.
Signed-off-by: Sukadev Bhattiprolu
---
Changelog [v8]:
Min
Move the GET_FIELD and SET_FIELD macros to vas.h as VAS and other
users of VAS, including NX-842, can use those macros.
There is a lot of related code between the VAS/NX kernel drivers
and skiboot. For consistency, switch the order of parameters in
SET_FIELD to match the order in skiboot.
Signed-o
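The idea of mask-based field accessors can be sketched as below. This is only an illustration, not the macros from vas.h or skiboot: the names get_field(), set_field() and field_shift() are hypothetical, the mask-first argument order is an assumption (the description above does not spell out the order), and __builtin_ctzll() is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not the kernel macros). A field is described by a
 * contiguous bit mask; the shift is derived from the lowest set bit.
 */
static inline unsigned field_shift(uint64_t mask)
{
	assert(mask != 0);	/* ctz of zero is undefined */
	return (unsigned)__builtin_ctzll(mask);
}

/* Extract the field selected by mask from reg. */
static inline uint64_t get_field(uint64_t mask, uint64_t reg)
{
	return (reg & mask) >> field_shift(mask);
}

/* Return reg with the field selected by mask replaced by val. */
static inline uint64_t set_field(uint64_t mask, uint64_t reg, uint64_t val)
{
	return (reg & ~mask) | ((val << field_shift(mask)) & mask);
}
```

With these helpers, get_field(0x0F00, 0x1234) yields 0x2 and set_field(0x0F00, 0x1234, 0xA) yields 0x1A34; bits of val that do not fit in the field are dropped by the final mask.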
Implement vas_init() and vas_exit() functions for a new VAS module.
This VAS module is essentially a library for other device drivers
and kernel users of the NX coprocessors like NX-842 and NX-GZIP.
In the future this will be extended to add support for user space
to access the NX coprocessors.
VA
Define macros for the VAS hardware registers and bit-fields as well
as a couple of data structures needed by the VAS driver.
Signed-off-by: Sukadev Bhattiprolu
---
Changelog[v8]
- Use u64/u32 instead of the uintXX versions.
Changelog[v7]
- Move the threshold control macros from uap
Power9 introduces a hardware subsystem referred to as the Virtual
Accelerator Switchboard (VAS). VAS allows kernel subsystems and user
space processes to directly access the Nest Accelerator (NX) engines
which implement compression and encryption algorithms in the hardware.
NX has been in Power pr
On Mon, Aug 28, 2017 at 11:05:03AM -0500, Bryant G. Ly wrote:
> For a PCI device its pci_dn can be retrieved from
> pdev->dev.archdata.firmware_data, PCI_DN(devnode), or parent's list.
> Thus, we should just use the generic function pci_get_pdn_by_devfn
> to get the pci_dn.
>
> Signed-off-by: Bry
On Mon, Aug 28, 2017 at 02:42:29PM +1000, Paul Mackerras wrote:
> Al Viro pointed out that while one thread of a process is executing
> in kvm_vm_ioctl_create_spapr_tce(), another thread could guess the
> file descriptor returned by anon_inode_getfd() and close() it before
> the first thread has ad
The following changes since commit d1d0d5ffb3006eaf9b5f41c89fe801e032cbbfe4:
powerpc/64: Optimise set/clear of CTRL[RUN] (runlatch) (2017-08-23 23:48:38
+1000)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git next
for you to fetch ch
On 21/08/17 12:47, Alexey Kardashevskiy wrote:
> Folks,
>
> Ok, people did talk, exchanged ideas, lovely :) What happens now? Do I
> repost this or go back to PCI bus flags or something else? Thanks.
Anyone, any help? How do we proceed with this? Thanks.
>
>
>
> On 14/08/17 19:45, Alexey K
* A new variant of memblock_virt_alloc_* allocations:
memblock_virt_alloc_try_nid_raw()
- Does not zero the allocated memory
- Does not panic if request cannot be satisfied
* optimize early system hash allocations
Clients can call alloc_large_system_hash() with flag: HASH_ZERO to specify
Add struct page zeroing as a part of initialization of other fields in
__init_single_page().
Single-thread performance, collected on an Intel(R) Xeon(R) CPU E7-8895
v3 @ 2.60GHz with 1T of memory (268400646 pages in 8 nodes):
BASE FIX
sparse_init 11.244671
vmemmap_alloc_block() will no longer zero the block, so zero memory
at its call sites for everything except struct pages. Struct page memory
is zero'd by struct page initialization.
Replace allocators in sparse-vmemmap to use the non-zeroing version. So,
we will get the performance improvement by
This patch fixes two issues in deferred_init_memmap
In deferred_init_memmap() where all deferred struct pages are initialized
we have a check like this:
	if (page->flags) {
		VM_BUG_ON(page_zone(page) != zone);
		goto free_range;
	}
This way we are checking if the current deferre
Some memory is reserved but unavailable: not present in memblock.memory
(because not backed by physical pages), but present in memblock.reserved.
Such memory has backing struct pages, but they are not initialized by going
through __init_single_page().
In some cases these struct pages are accessed
Changelog:
v7 - v6
- Addressed comments from Michal Hocko
- memblock_discard() patch was removed from this series and integrated
separately
- Fixed bug reported by kbuild test robot new patch:
mm: zero reserved and unavailable struct pages
- Removed patch
x86/mm: reserve only exiting low page
To optimize the performance of struct page initialization,
vmemmap_populate() will no longer zero memory.
We must explicitly zero the memory that is allocated by vmemmap_populate()
for kasan, as this memory does not go through struct page initialization
path.
Signed-off-by: Pavel Tatashin
Review
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().
With deferred struct page feature enabled, however, we set fields in
register_page_bo
Remove duplicated code by using the common functions
vmemmap_pud_populate and vmemmap_pgd_populate.
Signed-off-by: Pavel Tatashin
Reviewed-by: Steven Sistare
Reviewed-by: Daniel Jordan
Reviewed-by: Bob Picco
---
arch/sparc/mm/init_64.c | 23 ++-
1 file changed, 6 insertions(+
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().
With deferred struct page feature enabled there is a case where we set some
fields pr
Add an optimized mm_zero_struct_page(), so struct pages are zeroed without
calling memset(). We do eight to ten regular stores based on the size of
struct page. The compiler optimizes out the conditions of the switch() statement.
SPARC-M6 with 15T of memory, single thread performance:
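The store-based zeroing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: struct fake_page is a hypothetical 80-byte stand-in for struct page, zero_fake_page() is a made-up name, and the memset() fallback for unexpected sizes is an addition for safety.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct page: ten 8-byte words (80 bytes). */
struct fake_page {
	uint64_t words[10];
};

/*
 * Zero the structure with plain 8-byte stores. sizeof() is a compile-time
 * constant, so the compiler can resolve the switch and keep only the stores
 * that apply to the actual size.
 */
static inline void zero_fake_page(struct fake_page *page)
{
	uint64_t *p = page->words;

	switch (sizeof(struct fake_page)) {
	case 80:
		p[9] = 0;
		/* fall through */
	case 72:
		p[8] = 0;
		/* fall through */
	case 64:
		p[7] = 0;
		p[6] = 0;
		p[5] = 0;
		p[4] = 0;
		p[3] = 0;
		p[2] = 0;
		p[1] = 0;
		p[0] = 0;
		break;
	default:
		/* Assumed fallback for sizes this sketch does not cover. */
		memset(page, 0, sizeof(*page));
	}
}
```

The three cases correspond to the "eight to ten" stores mentioned above; any other size takes the generic path.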
To optimize the performance of struct page initialization,
vmemmap_populate() will no longer zero memory.
We must explicitly zero the memory that is allocated by vmemmap_populate()
for kasan, as this memory does not go through struct page initialization
path.
Signed-off-by: Pavel Tatashin
Review
On Tue, 29 Aug 2017 10:20:48 +1000
Paul Mackerras wrote:
> On Fri, Aug 25, 2017 at 02:30:36PM +1000, Nicholas Piggin wrote:
> > When stop is executed with EC=ESL=0, it appears to execute like a
> > normal instruction (resuming from NIP when woken by interrupt).
> > So all the save/restore handlin
powerpc/vphn: Reorganize source code in order to better distinguish the
VPHN code from the NUMA code, by moving relevant functions to
appropriate files.
Signed-off-by: Michael Bringmann
---
arch/powerpc/include/asm/topology.h|6
arch/powerpc/mm/numa.c | 550 +---
powerpc/nodes: On systems like PowerPC which allow 'hot-add' of CPU
or memory resources, it may occur that the new resources are to be
inserted into nodes that were not used for these resources at bootup.
In the kernel, any node that is used must be defined and initialized
at boot.
This patch ext
powerpc/numa: Correct the currently broken capability to set the
topology for shared CPUs in LPARs. At boot time for shared CPU
LPARs, the topology for each shared CPU is set to node zero; however,
this is now updated correctly using the Virtual Processor Home Node
(VPHN) capabilities information
On Power systems with shared configurations of CPUs and memory, there
are some issues with association of additional CPUs and memory to nodes
when hot-adding resources. These patches address some of those problems.
powerpc/numa: Correct the currently broken capability to set the
topology for sha
On Fri, Aug 25, 2017 at 02:30:36PM +1000, Nicholas Piggin wrote:
> When stop is executed with EC=ESL=0, it appears to execute like a
> normal instruction (resuming from NIP when woken by interrupt).
> So all the save/restore handling can be avoided completely. In
> particular NV GPRs do not have to
On Fri, Aug 25, 2017 at 02:30:34PM +1000, Nicholas Piggin wrote:
> Reviewed-by: Gautham R. Shenoy
> Signed-off-by: Nicholas Piggin
> ---
> arch/powerpc/include/asm/cpuidle.h | 16
> arch/powerpc/kernel/idle_book3s.S | 26 --
> 2 files changed, 20 inserti
On Fri, Aug 25, 2017 at 02:30:35PM +1000, Nicholas Piggin wrote:
> The hardware can execute stop in any context, and KVM does not
> require real mode because siblings do not share MMU state. This
> saves a switch to real-mode when going idle.
>
> Acked-by: Gautham R. Shenoy
> Signed-off-by: Nicho
On Fri, Aug 25, 2017 at 02:30:33PM +1000, Nicholas Piggin wrote:
> POWER9 CPUs have independent MMU contexts per thread, so KVM does not
> need to quiesce secondary threads, so the hwthread_req/hwthread_state
> protocol does not have to be used. So patch it away on POWER9, and patch
> away the bran
From: Madalin Bucur
Date: Sun, 27 Aug 2017 16:13:36 +0300
> This patch set introduces Receive Side Scaling for the DPAA Ethernet
> driver. Documentation is updated with details related to the new
> feature and limitations that apply.
> Added also a small fix.
>
> v2: removed a C++ style comment
Hi Haren,
Some comments inline ...
Haren Myneni writes:
> diff --git a/drivers/crypto/nx/nx-842-powernv.c
> b/drivers/crypto/nx/nx-842-powernv.c
> index c0dd4c7e17d3..13089a0b9dfa 100644
> --- a/drivers/crypto/nx/nx-842-powernv.c
> +++ b/drivers/crypto/nx/nx-842-powernv.c
> @@ -32,6 +33,9 @@ M
Hi Markus,
Thanks for the patch.
On 08/27/2017 10:10 PM, SF Markus Elfring wrote:
> From: Markus Elfring
> Date: Sun, 27 Aug 2017 22:00:22 +0200
>
> Omit an extra message for a memory allocation failure in this function.
>
> This issue was detected by using the Coccinelle software.
>
> Signed
For a PCI device its pci_dn can be retrieved from
pdev->dev.archdata.firmware_data, PCI_DN(devnode), or parent's list.
Thus, we should just use the generic function pci_get_pdn_by_devfn
to get the pci_dn.
Signed-off-by: Bryant G. Ly
---
arch/powerpc/kernel/rtas_pci.c | 30 ++
> That makes me extremely nervous... there could be all sort of
> assumptions esp. in arch code about the fact that we never populate the
> tree without the mm sem.
>
> We'd have to audit archs closely. Things like the page walk cache
> flushing on power etc...
Yes the whole thing is quite risky.
On Mon, 2017-08-28 at 11:37 +0200, Peter Zijlstra wrote:
> > Doing all this job and just give up because we cannot allocate page tables
> > looks very wasteful to me.
> >
> > Have you considered to look how we can hand over from speculative to
> > non-speculative path without starting from scratch
On Mon, 2017-08-28 at 19:37 +0200, Frederic Barrat wrote:
> Good point, I had missed the change. It looks like I now need to call
> radix__flush_all_mm(), which I would have to export outside of
> tlb-radix.c first.
>
> Any problem with having a flush_all_mm() to complement a flush_tlb_mm()?
>
On 08/28/2017 11:30 AM, Bhumika Goyal wrote:
> Make this const as it is not modified anywhere.
>
> Signed-off-by: Bhumika Goyal
Reviewed-by: Tyrel Datwyler
Make this const as it is not modified anywhere.
Signed-off-by: Bhumika Goyal
---
drivers/tty/hvc/hvcs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/hvc/hvcs.c b/drivers/tty/hvc/hvcs.c
index 79cc5be..40adf86 100644
--- a/drivers/tty/hvc/hvcs.c
+++ b/drivers/tty
On 28/08/2017 at 14:03, Benjamin Herrenschmidt wrote:
On Mon, 2017-08-28 at 10:47 +0200, Frederic Barrat wrote:
Signed-off-by: Frederic Barrat
diff --git a/arch/powerpc/include/asm/mmu_context.h
b/arch/powerpc/include/asm/mmu_context.h
index 309592589e30..6447c0df7ec4 100644
--- a/
On 08/24/2017 05:07 PM, Michael Bringmann wrote:
>
> powerpc/numa: Correct the currently broken capability to set the
> topology for shared CPUs in LPARs. At boot time for shared CPU
> lpars, the topology for each shared CPU is set to node zero, however,
> this is now updated correctly using the
> > But there is a dependency, no? If I apply the driver patch,
> > non-converted device trees will not find their eeproms anymore. So, I
>
> I don't think that's correct. If you apply this patch before the DTS
> changes, the driver will still match using the I2C device ID table
> like it has bee
All,
On Wed, Jun 28, 2017 at 11:30 AM, Matthew Weber
wrote:
> Scott,
>
> On Sun, Apr 30, 2017 at 2:01 AM, Scott Wood wrote:
>> On Thu, Apr 27, 2017 at 12:59:40PM -0500, Matt Weber wrote:
>>> This patch updates the machine check handler of Linux kernel to
>>> handle the e6500 architecture case. I
On 08/28/2017 02:56 AM, Michael Ellerman wrote:
> Some Power9 boxes will have this adapter installed, so add it to the
> defconfig so we can boot on those machines without an initrd.
Michael, not sure if this affects Petitboot (I know it has its own
default config files), but in the past we had so
Hi! Find below my fourth regression report for Linux 4.13. It lists 6
regressions I'm currently aware of. 1 of them is new, 5 got fixed since
the last report (that was two weeks ago; didn't find time for compiling
one last week; sorry). You can also find the report at
http://bit.ly/lnxregrep413 whe
On Mon, 2017-08-28 at 10:47 +0200, Frederic Barrat wrote:
>
>
> Signed-off-by: Frederic Barrat
> diff --git a/arch/powerpc/include/asm/mmu_context.h
> b/arch/powerpc/include/asm/mmu_context.h
> index 309592589e30..6447c0df7ec4 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/ar
Nicholas Piggin writes:
> POWER9 CPUs have independent MMU contexts per thread, so KVM does not
> need to quiesce secondary threads, so the hwthread_req/hwthread_state
> protocol does not have to be used. So patch it away on POWER9, and patch
> away the branch from the Linux idle wakeup to kvm_st
Hi Boris,
On 8/28/17 5:51 AM, Borislav Petkov wrote:
[..]
> +static int __init early_set_memory_enc_dec(resource_size_t paddr,
>> + unsigned long size, bool enc)
>> +{
>> +unsigned long vaddr, vaddr_end, vaddr_next;
>> +unsigned long psize, pmask;
>
Sukadev Bhattiprolu writes:
> Michael Ellerman [m...@ellerman.id.au] wrote:
>> Hi Suka,
>>
>> A few more things ...
>>
>> Sukadev Bhattiprolu writes:
>>
>> > diff --git a/arch/powerpc/platforms/powernv/copy-paste.h
>> > b/arch/powerpc/platforms/powernv/copy-paste.h
>> > new file mode 100644
Sukadev Bhattiprolu writes:
> Michael Ellerman [m...@ellerman.id.au] wrote:
>> Sukadev Bhattiprolu writes:
>> > diff --git a/arch/powerpc/platforms/powernv/vas-window.c
>> > b/arch/powerpc/platforms/powernv/vas-window.c
>> > index 2dd4b63..24288dd 100644
>> > --- a/arch/powerpc/platforms/powern
Al Viro writes:
> On Mon, Aug 28, 2017 at 02:38:37PM +1000, Paul Mackerras wrote:
>> On Sun, Aug 27, 2017 at 10:02:20PM +0100, Al Viro wrote:
>> > On Wed, Aug 23, 2017 at 04:06:24PM +1000, Paul Mackerras wrote:
>> >
>> > > It seems to me that it would be better to do the anon_inode_getfd()
>> >
Dan Carpenter writes:
> On Sun, Aug 27, 2017 at 02:56:31PM +1000, Benjamin Herrenschmidt wrote:
>> On Fri, 2017-08-25 at 13:33 +0300, Dan Carpenter wrote:
>> > My static checker complains that 0x1800 >> 13 is zero. Looking at
>> > the context, it seems like a copy and paste bug from the line
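The arithmetic behind the checker's complaint can be verified with a small sketch. The helper name shift_loses_all_bits() is hypothetical and only illustrates the check; it says nothing about which constant the driver actually intended.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 when a non-zero constant becomes zero after the right shift,
 * i.e. every set bit lies below the shift amount.
 */
static inline int shift_loses_all_bits(uint32_t value, unsigned shift)
{
	assert(shift < 32);	/* shifting by the full width is undefined */
	return value != 0 && (value >> shift) == 0;
}
```

0x1800 has only bits 11 and 12 set, so shifting it right by 13 discards both bits and the result is zero; shifting it by 11 instead gives 0x3.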
On 28.08.2017 13:28, Abdul Haleem wrote:
> Hi,
>
> Of late we are seeing hung task call traces when running the trinity fuzzer
> test. The kernel hangs and requires a machine reboot.
>
> Machine Type : Power 8
> Kernel : 4.13.0-rc6-next-20170825
> config: Tul-VM-config
>
>
> call traces:
>
On Mon, Jul 24, 2017 at 02:07:55PM -0500, Brijesh Singh wrote:
> Some KVM-specific custom MSRs shares the guest physical address with
s/shares/share/
> hypervisor.
"the hypervisor."
> When SEV is active, the shared physical address must be mapped
> with encryption attribute cleared so that both
On Sun 2017-08-27 22:10:08, SF Markus Elfring wrote:
> From: Markus Elfring
> Date: Sun, 27 Aug 2017 22:00:22 +0200
>
> Omit an extra message for a memory allocation failure in this function.
>
> This issue was detected by using the Coccinelle software.
>
> Signed-off-by: Markus Elfring
Acked
On 28/08/2017 at 06:15, Vaibhav Jain wrote:
Make sure to set the valid-bit in software-state field of the
populated PE. This was earlier missing for dedicated mode AFUs, hence
was causing a PSL freeze when the AFU was activated.
Signed-off-by: Vaibhav Jain
---
Acked-by: Frederic Barrat
On Sun, Aug 27, 2017 at 03:18:23AM +0300, Kirill A. Shutemov wrote:
> On Fri, Aug 18, 2017 at 12:05:13AM +0200, Laurent Dufour wrote:
> > + /*
> > +* Can't call vm_ops service has we don't know what they would do
> > +* with the VMA.
> > +* This include huge page from hugetlbfs.
> > +
On 28/08/17 18:47, Frederic Barrat wrote:
cxl keeps a driver use count, which is used with the hash memory model
on p8 to know when to upgrade local TLBIs to global and to trigger
callbacks to manage the MMU for PSL8.
If a process opens a context and closes without attaching or fails the
attachm
On 28/08/17 18:35, Aneesh Kumar K.V wrote:
We need to add a memory barrier so that the page table walk doesn't happen
before the cpumask is set and made visible to the other cpus. We need
to use a sync here instead of lwsync because lwsync is not sufficient for
store/load ordering.
We also need to
On 28/08/2017 at 06:15, Vaibhav Jain wrote:
Make sure to set the valid-bit in software-state field of the
populated PE. This was earlier missing for dedicated mode AFUs, hence
was causing a PSL freeze when the AFU was activated.
Acked-by: Christophe Lombard
Signed-off-by: Vaibhav Jain
--
The PSL and nMMU need to see all TLB invalidations for the memory
contexts used on the adapter. For the hash memory model, it is done by
making all TLBIs global as soon as the cxl driver is in use. For
radix, we need something similar, but we can refine and only convert
to global the invalidations
cxl keeps a driver use count, which is used with the hash memory model
on p8 to know when to upgrade local TLBIs to global and to trigger
callbacks to manage the MMU for PSL8.
If a process opens a context and closes without attaching or fails the
attachment, the driver use count is never decrement
On Mon, Aug 28, 2017 at 10:57:05AM +0300, Dan Carpenter wrote:
> I sent this email during kernel summit and neither of us could send a
> patch at the time and we both problem forgot. I definitely forgot.
>
s/problem/probably/... I suck at email. :(
regards,
dan carpenter
We need to add a memory barrier so that the page table walk doesn't happen
before the cpumask is set and made visible to the other cpus. We need
to use a sync here instead of lwsync because lwsync is not sufficient for
store/load ordering.
We also need to add an if (mm) check so that we do the right
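The store/load ordering requirement can be illustrated in portable C11, as a rough userspace analogy rather than the kernel change itself. The variables cpu_in_mask and pte_value and the function publish_then_walk() are hypothetical stand-ins. On POWER, compilers typically implement a sequentially consistent fence with sync, while release/acquire fences map to lwsync, which does not order an earlier store against a later load.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int cpu_in_mask;	/* hypothetical stand-in for the cpumask bit */
static atomic_int pte_value;	/* hypothetical stand-in for a page table entry */

/*
 * Publish the mask bit, then read the page table entry. The full fence
 * keeps the later load from being satisfied before the earlier store is
 * visible to other CPUs; a release or acquire fence would not give that
 * store-to-load guarantee.
 */
static int publish_then_walk(void)
{
	atomic_store_explicit(&cpu_in_mask, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&pte_value, memory_order_relaxed);
}
```

A thread that changes the entry and then checks the mask needs the matching full fence between its own store and load, otherwise both sides can miss each other's update.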
On Mon, 2017-08-28 at 13:23 +0530, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt writes:
>
> > On Mon, 2017-08-28 at 11:55 +0530, Aneesh Kumar K.V wrote:
> > > We need to add a memory barrier so that the page table walk doesn't happen
> > > before the cpumask is set and made visible to the other
On Sun, Aug 27, 2017 at 02:56:31PM +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2017-08-25 at 13:33 +0300, Dan Carpenter wrote:
> > My static checker complains that 0x1800 >> 13 is zero. Looking at
> > the context, it seems like a copy and paste bug from the line below and
> > probably 0x3 <
Benjamin Herrenschmidt writes:
> On Mon, 2017-08-28 at 11:55 +0530, Aneesh Kumar K.V wrote:
>> We need to add a memory barrier so that the page table walk doesn't happen
>> before the cpumask is set and made visible to the other cpus. We need
>> to use a sync here instead of lwsync because lwsync i
The current vDSO64 implementation does not support coarse clocks
(CLOCK_MONOTONIC_COARSE, CLOCK_REALTIME_COARSE), so it falls back
to a system call, which increases the response time. A vDSO implementation
reduces the cycle time. Below is a benchmark of the difference in execution time
with and
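From user space, a coarse clock is read through the ordinary clock_gettime() call, as in the minimal Linux-only sketch below. The helper name coarse_monotonic_ns() is hypothetical; whether the call is served by the vDSO or by a real system call depends on the kernel and architecture, which is what the patch above is about.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <time.h>

/*
 * Read CLOCK_MONOTONIC_COARSE and return it in nanoseconds. The coarse
 * clocks trade resolution for a cheaper read.
 */
static long long coarse_monotonic_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC_COARSE, &ts);

	assert(rc == 0);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

Calling this helper in a tight loop and timing it is one simple way to compare the two paths on a given kernel.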
78 matches