On 12/31/23 16:55, David Laight wrote:
per_cpu_ptr() indexes __per_cpu_offset[] with the cpu number.
This requires the cpu number be 64bit.
However the value in osq_lock() comes from a 32bit xchg() and there
isn't a way of telling gcc the high bits are zero (they are) so
there will always be an
On 12/31/23 16:54, David Laight wrote:
When osq_lock() returns false or osq_unlock() returns, static
analysis shows that node->next should always be NULL.
This means that it isn't necessary to explicitly set it to NULL
prior to atomic_xchg(&lock->tail, curr) on entry to osq_lock().
Just in case
On 12/31/23 16:54, David Laight wrote:
node->prev is only used to update 'prev' in the unlikely case
of concurrent unqueues.
This can be replaced by a check for node->prev_cpu changing
and then calling decode_cpu() to get the changed 'prev' pointer.
node->cpu (or more particularly) prev->cpu is
On 12/31/23 16:52, David Laight wrote:
The vcpu_is_preempted() test stops osq_lock() spinning if a virtual
cpu is no longer running.
Although patched out for bare-metal the code still needs the cpu number.
Reading this from 'prev->cpu' is pretty much guaranteed to cause a cache miss
when osq_unl
On 12/31/23 16:51, David Laight wrote:
Since node->locked cannot be set before the assignment to prev->next
it is safe to clear it in the slow path.
Signed-off-by: David Laight
---
kernel/locking/osq_lock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/locking/
per_cpu_ptr() indexes __per_cpu_offset[] with the cpu number.
This requires the cpu number be 64bit.
However the value in osq_lock() comes from a 32bit xchg() and there
isn't a way of telling gcc the high bits are zero (they are) so
there will always be an instruction to clear the high bits.
The c
When osq_lock() returns false or osq_unlock() returns, static
analysis shows that node->next should always be NULL.
This means that it isn't necessary to explicitly set it to NULL
prior to atomic_xchg(&lock->tail, curr) on entry to osq_lock().
Just in case there a non-obvious race condition that ca
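
The entry step mentioned above can be modelled in user space. This is only a sketch, assuming C11 atomics in place of the kernel's atomic_xchg(); struct osq_model and osq_model_enter() are hypothetical names, not code from the patch. It shows the exchange on the tail returning the previously queued (encoded) cpu, with 0 meaning the queue was empty.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define OSQ_UNLOCKED_VAL 0	/* tail value when no cpu is queued */

/* Hypothetical stand-in for the lock: only the tail word is modelled. */
struct osq_model {
	atomic_int tail;
};

/* Encoded cpu numbers are cpu + 1 so that 0 can mean "nobody queued". */
static int encode_cpu(int cpu_nr)
{
	assert(cpu_nr >= 0);
	return cpu_nr + 1;
}

/*
 * Swap this cpu in as the new tail.
 * Returns true when the queue was empty (the uncontended case);
 * otherwise *prev_cpu holds the encoded cpu that was the old tail.
 */
static bool osq_model_enter(struct osq_model *lock, int cpu_nr, int *prev_cpu)
{
	int curr = encode_cpu(cpu_nr);

	*prev_cpu = atomic_exchange(&lock->tail, curr);
	return *prev_cpu == OSQ_UNLOCKED_VAL;
}
```

The model stops at the exchange; linking into the predecessor's node and the spin loop are left out.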
node->prev is only used to update 'prev' in the unlikely case
of concurrent unqueues.
This can be replaced by a check for node->prev_cpu changing
and then calling decode_cpu() to get the changed 'prev' pointer.
node->cpu (or more particularly) prev->cpu is only used for the
osq_wait_next() call in
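
A rough user-space model of the idea described above: keep only the encoded cpu number of the predecessor and re-derive the pointer when that number changes. The names node_model, nodes[] and refresh_prev() are assumptions for illustration, and a plain array stands in for the per-cpu storage; the kernel code would use per_cpu_ptr() and READ_ONCE().

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4		/* hypothetical cpu count for the model */

struct node_model {
	int prev_cpu;		/* encoded cpu number of the predecessor */
};

/* Plain array standing in for the per-cpu nodes. */
static struct node_model nodes[NR_CPUS_MODEL];

/* Encoded values are cpu + 1; 0 is reserved for "no cpu". */
static int encode_cpu(int cpu_nr)
{
	assert(cpu_nr >= 0 && cpu_nr < NR_CPUS_MODEL);
	return cpu_nr + 1;
}

static struct node_model *decode_cpu(int encoded_cpu_val)
{
	assert(encoded_cpu_val >= 1 && encoded_cpu_val <= NR_CPUS_MODEL);
	return &nodes[encoded_cpu_val - 1];
}

/*
 * If node->prev_cpu differs from the value seen last time, a concurrent
 * unqueue changed our predecessor: record the new value and decode it.
 * Otherwise the cached 'prev' pointer is still valid.
 */
static struct node_model *refresh_prev(const struct node_model *node,
				       struct node_model *prev,
				       int *seen_prev_cpu)
{
	int now = node->prev_cpu;

	if (now != *seen_prev_cpu) {
		*seen_prev_cpu = now;
		prev = decode_cpu(now);
	}
	return prev;
}
```

In this model the pointer is only recomputed on the rare path where the predecessor changed, so the common path touches just the integer field.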
The vcpu_is_preempted() test stops osq_lock() spinning if a virtual
cpu is no longer running.
Although patched out for bare-metal the code still needs the cpu number.
Reading this from 'prev->cpu' is pretty much guaranteed to cause a cache miss
when osq_unlock() is waking up the next cpu.
Instead s
Since node->locked cannot be set before the assignment to prev->next
it is safe to clear it in the slow path.
Signed-off-by: David Laight
---
kernel/locking/osq_lock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index
This is an updated series of optimisations to osq_lock.c
Patches #1 and #3 from v1 have been applied by Linus.
Some of the generated code issues I was getting were caused by
CONFIG_DEBUG_PREEMPT being set. No idea why, it isn't any more.
Patch #1 is the node->locked part of the old #2.
Patch #2 r
Follow the updated bindings and use a QCS404-specific compatible for the
HFPLL.
Signed-off-by: Luca Weiss
---
arch/arm64/boot/dts/qcom/qcs404.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi b/arch/arm64/boot/dts/qcom/qcs404.dtsi
inde
It doesn't appear that the configuration for the HFPLL is generic, so
add a qcs404-specific compatible and rename the existing struct to
qcs404.
Signed-off-by: Luca Weiss
---
drivers/clk/qcom/hfpll.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/qcom/hf
Convert the .txt documentation to .yaml.
Take the liberty to change the compatibles for ipq8064, apq8064, msm8974
and msm8960 to follow the updated naming schema. These compatibles are
not used upstream yet.
Also add a compatible for QCS404 since that SoC upstream already uses
qcom,hfpll compatib
files changed, 87 insertions(+), 66 deletions(-)
---
base-commit: 39676dfe52331dba909c617f213fdb21015c8d10
change-id: 20231231-hfpll-yaml-9266f012365c
Best regards,
--
Luca Weiss
From: Linus Torvalds
> Sent: 30 December 2023 20:59
>
> On Sat, 30 Dec 2023 at 12:41, Linus Torvalds wrote:
> >
> > UNTESTED patch to just do the "this_cpu_write()" parts attached.
> > Again, note how we do end up doing that this_cpu_ptr conversion later
> > anyway, but at least it's off the cr
From: Linus Torvalds
> Sent: 30 December 2023 20:41
>
> On Fri, 29 Dec 2023 at 12:57, David Laight wrote:
> >
> > this_cpu_ptr() is rather more expensive than raw_cpu_read() since
> > the latter can use an 'offset from register' (%gs for x86-64).
> >
> > Add a 'self' field to 'struct optimistic_s
From: Waiman Long
> Sent: 31 December 2023 03:04
> The presence of debug_smp_processor_id in your compiled code is likely
> due to the setting of CONFIG_DEBUG_PREEMPT in your kernel config.
>
> #ifdef CONFIG_DEBUG_PREEMPT
> extern unsigned int debug_smp_processor_id(void);
> # define smp_p