From: Alexander Sverdlin
Teach ftrace_make_call() and ftrace_make_nop() about PLTs.
Teach PLT code about FTRACE and all its callbacks.
Otherwise the following might happen:
[ cut here ]
WARNING: CPU: 14 PID: 2265 at .../arch/arm/kernel/insn.c:14
__arm_gen_branch+0x83/0x8
From: Alexander Sverdlin
Will be used in the following patch. No functional change.
Signed-off-by: Alexander Sverdlin
---
arch/arm/include/asm/insn.h | 8
arch/arm/kernel/ftrace.c | 2 +-
arch/arm/kernel/insn.c | 19 ++-
3 files changed, 15 insertions(+), 14
From: Alexander Sverdlin
No functional change, later it will be re-used in several files.
Signed-off-by: Alexander Sverdlin
---
arch/arm/include/asm/module.h | 9 +
arch/arm/kernel/module-plts.c | 9 -
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm/incl
From: Alexander Sverdlin
FTRACE's function tracer currently doesn't always work on ARM with
MODULE_PLT option enabled. If the module is loaded too far, FTRACE's
code modifier cannot cope with introduced veneers and turns the
function tracer off globally.
ARM64 already has a solution for the prob
From: Alexander Sverdlin
There are several implementations of PL061 which lack GPIOINTR signal in
hardware and only have individual GPIOMIS[7:0] interrupts. Use the
hierarchical interrupt support of the gpiolib in these cases (if at least 8
IRQs are configured for the PL061).
One in-tree example
From: Alexander Sverdlin
There are several implementations of PL061 which lack GPIOINTR signal in
hardware and only have individual GPIOMIS[7:0] interrupts. Use the
hierarchical interrupt support of the gpiolib in these cases (if at least 8
IRQs are configured for the PL061).
One in-tree example
From: Alexander Sverdlin
While get_dma_channel() is protected against concurrent calls, there is a
race against kref_put() in mport_cdev_release():
CPU0                            CPU1
get_dma_channel()
kref_init(&priv->md->dma_ref);
...
mport_cdev_release_dma()
kref_put(&md->dma
From: Alexander Sverdlin
Get rid of central chrdev MTD lock, which prevents simultaneous operations
on completely independent physical MTD chips. Replace it with newly
introduced per-master mutex.
Signed-off-by: Alexander Sverdlin
---
drivers/mtd/mtdchar.c | 14 --
drivers/mtd/mt
From: Alexander Sverdlin
It looks unnecessary in the function, remove it from the function
having in mind to remove it completely.
Signed-off-by: Alexander Sverdlin
---
drivers/mtd/mtdchar.c | 10 ++
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/mtd/mtdchar.c b/
From: Alexander Sverdlin
This will save one SYNCW on Octeon and improve tight
uncontended spinlock loop performance by 17%.
Signed-off-by: Alexander Sverdlin
---
arch/mips/include/asm/atomic.h | 3 +++
arch/mips/include/asm/cmpxchg.h | 2 ++
2 files changed, 5 insertions(+)
diff --git a/arch
From: Alexander Sverdlin
On Octeon mmiowb() is SYNCW, which is already contained in
smp_store_release(). Removing the superfluous barrier improves performance
by around 10% on uncontended tight spinlock loops.
Signed-off-by: Alexander Sverdlin
---
arch/mips/include/asm/spinlock.h | 2 ++
1 file changed
From: Alexander Sverdlin
The switch to qspinlock brought a massive regression in spinlocks on
Octeon. Even after applying this series (and a patch in the
ARCH-independent code [1]) tight contended (6 cores, 1 thread per core)
spinlock loop is still 50% slower than the previous ticket-based implementati
From: Alexander Sverdlin
It makes no sense to fold smp_mb__before_llsc()/smp_llsc_mb() again and
again, leave only one barrier pair in the outer function.
This removes one SYNCW from __xchg_small() and brings around 10%
performance improvement in a tight spinlock loop with 6 threads on a 6 core
From: Alexander Sverdlin
On Octeon smp_mb() translates to SYNC while wmb+rmb translates to SYNCW
only. This improves performance by around 10% on tight uncontended spinlock
loops.
Refer to commit 500c2e1fdbcc ("MIPS: Optimize spinlocks.") and the link
below.
On 6-core Octeon machine:
sysbench --test
From: Alexander Sverdlin
This has the effect of removing one redundant SYNCW from
queued_spin_lock_slowpath() on Octeon.
Signed-off-by: Alexander Sverdlin
---
arch/mips/include/asm/atomic.h | 2 ++
arch/mips/include/asm/cmpxchg.h | 4
2 files changed, 6 insertions(+)
diff --git a/arch/m
From: Alexander Sverdlin
Flushing the write buffer improves performance by around 10% on tight
uncontended spinlock loops on Octeon. Refer to commit 500c2e1fdbcc
("MIPS: Optimize spinlocks.").
Signed-off-by: Alexander Sverdlin
---
arch/mips/include/asm/spinlock.h | 3 +++
1 file changed, 3 inser
From: Alexander Sverdlin
Drop smp_wmb in arch_mcs_spin_lock_contended() after adding it into
ARCH-independent code.
Signed-off-by: Alexander Sverdlin
---
arch/arm/include/asm/mcs_spinlock.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/arm/include/asm/mcs_spinlock.h
b/arch/arm/inc
From: Alexander Sverdlin
Ensure writes are pushed out of the core write buffer to prevent waiting code
on other cores from spinning longer than necessary.
6 threads running tight spinlock loop competing for the same lock
on 6 cores on MIPS/Octeon do 100 iterations...
before the patch in:4
From: Alexander Sverdlin
Teach ftrace_make_call() and ftrace_make_nop() about PLTs.
Teach PLT code about FTRACE and all its callbacks.
Otherwise the following might happen:
[ cut here ]
WARNING: CPU: 14 PID: 2265 at .../arch/arm/kernel/insn.c:14
__arm_gen_branch+0x83/0x8
From: Alexander Sverdlin
FTRACE's function tracer currently doesn't always work on ARM with
MODULE_PLT option enabled. If the module is loaded too far, FTRACE's
code modifier cannot cope with introduced veneers and turns the
function tracer off globally.
ARM64 already has a solution for the prob
From: Alexander Sverdlin
No functional change, later it will be re-used in several files.
Signed-off-by: Alexander Sverdlin
---
arch/arm/include/asm/module.h | 9 +
arch/arm/kernel/module-plts.c | 9 -
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm/incl
From: Alexander Sverdlin
Because check_kernel_sections_mem() does exactly this for all platforms.
Signed-off-by: Alexander Sverdlin
---
arch/mips/cavium-octeon/setup.c | 9 -
1 file changed, 9 deletions(-)
diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c
From: Alexander Sverdlin
Linux doesn't own the memory immediately after the kernel image. On Octeon
the bootloader places a shared structure right after the kernel _end,
refer to "struct cvmx_bootinfo *octeon_bootinfo" in cavium-octeon/setup.c.
If check_kernel_sections_mem() rounds the PFNs up
From: Alexander Sverdlin
Allocate the IRQ descriptors where necessary before configuring them via
irq_set_chip_and_handler(). Fixes the following soft lockup:
watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [modprobe:72]
Modules linked in:
irq event stamp: 33288
hardirqs last enabled at (3328
From: Alexander Sverdlin
Give uartlite a chance to be probed once the IRQ controller finally becomes
available and return potential -EPROBE_DEFER as-is. The condition "<="
has been changed to "<" to follow the recommendation in the header of
platform_get_irq().
Signed-off-by: Alexander Sverdlin
---
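The effect of the changed condition can be sketched in user-space C. This is only an illustration, not the driver code: demo_get_irq() is a hypothetical stub standing in for platform_get_irq(), the fixed -ENXIO in the "old" variant is an assumption about what a typical pre-patch check looks like, and EPROBE_DEFER is defined locally because user-space headers do not provide it.

```c
#include <errno.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, defined here for the sketch */
#endif

/* Hypothetical stub for platform_get_irq(): just returns the canned value. */
int demo_get_irq(int canned)
{
	return canned;
}

/*
 * Assumed old-style check: "<= 0" combined with a fixed error code.
 * A deferred probe request (-EPROBE_DEFER) is lost here.
 */
int demo_probe_old(int canned)
{
	int irq = demo_get_irq(canned);

	if (irq <= 0)
		return -ENXIO;
	return 0;
}

/*
 * New-style check: "< 0" and the error is returned as-is, so the
 * caller can see -EPROBE_DEFER and retry the probe later.
 */
int demo_probe_new(int canned)
{
	int irq = demo_get_irq(canned);

	if (irq < 0)
		return irq;
	return 0;
}
```

With the stub returning -EPROBE_DEFER, the old variant reports -ENXIO while the new variant passes -EPROBE_DEFER through unchanged.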
From: Alexander Sverdlin
Linux doesn't own the memory immediately after the kernel image. On Octeon
the bootloader places a shared structure right after the kernel _end,
refer to "struct cvmx_bootinfo *octeon_bootinfo" in cavium-octeon/setup.c.
If check_kernel_sections_mem() rounds the PFNs up
From: Alexander Sverdlin
spi_nor_parse_sfdp() modifies the passed structure so that it points to
itself (params.erase_map.regions to params.erase_map.uniform_region). This
makes it impossible to copy the local struct anywhere else.
Therefore only use memcpy() in backup-restore scenario. The bug
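Why a self-pointing structure cannot simply be copied can be shown with a small user-space sketch. All demo_* names below are hypothetical stand-ins, not the real spi-nor structures, and demo_parse_with_rollback() is only one assumed reading of the "backup-restore scenario": parse in place and memcpy() the snapshot back on failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the real structures. */
struct demo_region {
	unsigned int size;
};

struct demo_erase_map {
	struct demo_region *regions;	/* may point at uniform_region below */
	struct demo_region uniform_region;
};

struct demo_params {
	struct demo_erase_map erase_map;
};

/* Mimics a parser that makes the structure point to itself. */
void demo_parse(struct demo_params *p)
{
	p->erase_map.uniform_region.size = 4096;
	p->erase_map.regions = &p->erase_map.uniform_region;
}

/* True if the internal pointer refers to this very object. */
bool demo_is_self_contained(const struct demo_params *p)
{
	return p->erase_map.regions == &p->erase_map.uniform_region;
}

/* Plain memcpy(): the copy's pointer still targets the source object. */
void demo_copy_naive(struct demo_params *dst, const struct demo_params *src)
{
	assert(dst && src);
	memcpy(dst, src, sizeof(*dst));
}

/* Assumed backup-restore pattern: parse in place, roll back on failure. */
int demo_parse_with_rollback(struct demo_params *p, bool fail)
{
	struct demo_params backup;

	memcpy(&backup, p, sizeof(backup));	/* snapshot before parsing */
	demo_parse(p);				/* pointers stay valid for *p */
	if (fail) {
		memcpy(p, &backup, sizeof(*p));	/* restore the old state */
		return -1;
	}
	return 0;
}
```

After demo_copy_naive() the copy's regions pointer still refers to the source's uniform_region, which dangles once a local source goes out of scope. Parsing in place avoids copying a parsed structure at all.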
From: Alexander Sverdlin
The removal of mips_swiotlb_ops exposed a problem in octeon_mgmt Ethernet
driver. mips_swiotlb_ops had an mb() after most of the operations and the
removal of the ops broke the receive functionality of the driver.
My code inspection has shown no other places except
o
From: Alexander Sverdlin
Ignore loopback-originating packets soon enough and don't try to process L2
header where it doesn't exist. The very similar br_handle_frame() in bridge
code performs exactly the same check.
This is an example of such ICMPv6 packet:
skb len=96 headroom=40 headlen=96 tailr
29 matches