On Fri, Sep 23, 2022 at 05:15:43PM -0500, Segher Boessenkool wrote:
> On Sat, Sep 24, 2022 at 02:26:52AM +1000, Nicholas Piggin wrote:
> > I still don't see what clauses guarantees asm("%0" ::"r"(foo)) to give
> > 13. It doesn't say access via inline as
On Sat, Sep 24, 2022 at 02:00:55PM +1000, Nicholas Piggin wrote:
> On Sat Sep 24, 2022 at 8:15 AM AEST, Segher Boessenkool wrote:
> > Never it is guaranteed that all accesses through this variable will use
> > the register directly: this fundamentally cannot work on all archs, and
> > also not at -
On Sat Sep 24, 2022 at 8:15 AM AEST, Segher Boessenkool wrote:
> On Sat, Sep 24, 2022 at 02:26:52AM +1000, Nicholas Piggin wrote:
> > I still don't see what clauses guarantees asm("%0" ::"r"(foo)) to give
> > 13. It doesn't say access via inline assem
On Sat, Sep 24, 2022 at 02:26:52AM +1000, Nicholas Piggin wrote:
> I still don't see what clauses guarantees asm("%0" ::"r"(foo)) to give
> 13. It doesn't say access via inline assembly is special,
But it is. It is for all register variables, local and gl
stb%X0 %1,0(%0)" : : "r" (foo), "r" (val) : "memory");
> it would work fine. It would also work fine if you wrote 13 in the
> template directly. These things follow the rules, so are guaranteed.
>
> The most important pieces of doc here may be
>*
ings follow the rules, so are guaranteed.
The most important pieces of doc here may be
* Accesses to the variable may be optimized as usual and the register
remains available for allocation and use in any computations,
provided that observable values of the variable are not affected.
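Concretely, the declaration those documentation bullets apply to looks like the following (a PowerPC-only sketch: it compiles only for a powerpc target, and the struct is left incomplete purely for illustration):

```c
struct paca_struct;

/* GCC reserves r13 for this variable throughout the translation unit.
 * Asm operands referencing it will use r13 directly, but plain C
 * accesses may still be optimized as the quoted documentation says. */
register struct paca_struct *local_paca asm("r13");
```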
On Tue Sep 20, 2022 at 4:41 PM AEST, Christophe Leroy wrote:
> local_paca is declared as global register asm("r13"), it is therefore
> guaranteed to always be r13.
>
> It is therefore not required to opencode r13 in the assembly, use
> a reference to local_paca->irq_soft_mask instead.
>
> This
local_paca is declared as global register asm("r13"), it is therefore
guaranteed to always be r13.
It is therefore not required to opencode r13 in the assembly, use
a reference to local_paca->irq_soft_mask instead.
This also allows removing the 'memory' clobber in irq_soft_mask_set()
as GCC n
On Wed, 18 May 2022 10:48:55 +0200, Christophe Leroy wrote:
> Use WRITE_ONCE() instead of opencoding the saving of the current
> stack pointer.
>
>
Applied to powerpc/next.
[1/1] powerpc/irq: remove inline assembly in hard_irq_disable macro
https://git.kernel.o
Use WRITE_ONCE() instead of opencoding the saving of the current
stack pointer.
Signed-off-by: Christophe Leroy
---
By the way, is WRITE_ONCE() needed at all ? Could we instead do
local_paca->saved_r1 = current_stack_pointer;
---
arch/powerpc/include/asm/hw_irq.h | 4 +---
1 file changed, 1 insert
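On the question of whether WRITE_ONCE() is needed at all: a minimal portable sketch of what it boils down to (illustrative names, not the kernel's actual macro, which also handles access sizes and tearing):

```c
#include <stdint.h>

/* Sketch: a store through a volatile-qualified lvalue, so the compiler
 * emits exactly one store and cannot elide, duplicate, or tear it.
 * The kernel's real WRITE_ONCE() is more elaborate than this. */
#define write_once(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static unsigned long saved_r1;

/* Stand-in for local_paca->saved_r1 = current_stack_pointer; the names
 * here are hypothetical. */
unsigned long save_stack_pointer(unsigned long sp)
{
    write_once(saved_r1, sp);
    return saved_r1;
}
```

A plain assignment would usually generate the same code; the volatile cast is what rules the compiler tricks out by contract rather than by luck.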
In several places, inline assembly uses the "%Un" modifier
to enable the use of instruction with update form addressing,
but the associated "<>" constraint is missing.
As mentioned in previous patch, this fails with gcc 4.9, so
"<>" can't be used
Hi!
On Tue, Oct 20, 2020 at 07:40:09AM +, Christophe Leroy wrote:
> In several places, inline assembly uses the "%Un" modifier
> to enable the use of instruction with update form addressing,
> but the associated "<>" constraint is missing.
>
> As men
On Tue, Oct 20, 2020 at 09:44:33AM +0200, Christophe Leroy wrote:
> On 19/10/2020 at 22:24, Segher Boessenkool wrote:
> >>but the associated "<>" constraint is missing.
> >
> >But that is just fine. Pointless, sure, but not a bug.
>
> Most of those are from prehistoric code. So at some point in
On 19/10/2020 at 22:24, Segher Boessenkool wrote:
On Mon, Oct 19, 2020 at 12:12:48PM +, Christophe Leroy wrote:
In several places, inline assembly uses the "%Un" modifier
to enable the use of instruction with pre-update addressing,
Calling this "pre-update"
In several places, inline assembly uses the "%Un" modifier
to enable the use of instruction with update form addressing,
but the associated "<>" constraint is missing.
As mentioned in previous patch, this fails with gcc 4.9, so
"<>" can't be used
On Mon, Oct 19, 2020 at 12:12:48PM +, Christophe Leroy wrote:
> In several places, inline assembly uses the "%Un" modifier
> to enable the use of instruction with pre-update addressing,
Calling this "pre-update" is misleading: the register is not updated
before th
On 19/10/2020 at 17:35, kernel test robot wrote:
Hi Christophe,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master next-20201016]
[cannot apply to kvm-ppc/kvm-ppc-next mpe/next v5.9]
[If your patch is applied to the w
Hi Christophe,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master next-20201016]
[cannot apply to kvm-ppc/kvm-ppc-next mpe/next v5.9]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submittin
In several places, inline assembly uses the "%Un" modifier
to enable the use of instruction with pre-update addressing,
but the associated "<>" constraint is missing.
As mentioned in previous patch, this fails with gcc 4.9, so
"<>" can't be used
e could enhance TCP checksum calculations by
> > > splitting
> > > inline assembly blocks to give GCC the opportunity to mix it with other
> > > stuff, but I'm getting difficulties with the carry.
> > >
> > > As far as I can read in the docu
> You mean the mpc8xx, but I'm also using the mpc832x which has an e300c2
> core and is capable of executing 2 insns in parallel if not in the same
> Unit.
That should let you do a memory read and an add.
(I can't remember if the ppc has 'add from memory' but that is
likely to use both units anywa
On 16/01/2020 at 17:21, Segher Boessenkool wrote:
Christophe uses a very primitive 32-bit cpu, not even superscalar. A
loop doing adde is pretty much optimal, probably wants some unrolling
though.
You mean the mpc8xx, but I'm also using the mpc832x which has an e300c2
core and is capable
From: Segher Boessenkool
> Sent: 16 January 2020 16:22
...
> > However a loop of 'add with carry' instructions may not be the
> > fastest code by any means.
> > Because the carry flag is needed for every 'adc' you can't do more
> > than one adc per clock.
> > This limits you to 8 bytes/clock on a 6
Hi!
On Thu, Jan 16, 2020 at 03:54:58PM +, David Laight wrote:
> if you are trying to 'loop carry' the 'carry flag' with 'add with carry'
> instructions you'll almost certainly need to write the loop in asm.
> Since the loop itself is simple, this probably doesn't matter.
Agreed.
> However a
From: Christophe Leroy
> Sent: 16 January 2020 06:12
>
> I'm trying to see if we could enhance TCP checksum calculations by
> splitting inline assembly blocks to give GCC the opportunity to mix it
> with other stuff, but I'm getting difficulties with the carry.
if you
Hi!
On Thu, Jan 16, 2020 at 07:11:36AM +0100, Christophe Leroy wrote:
> I'm trying to see if we could enhance TCP checksum calculations by
> splitting inline assembly blocks to give GCC the opportunity to mix it
> with other stuff, but I'm getting difficulties with the carr
On Thu, Jan 16, 2020 at 09:06:08AM +0100, Gabriel Paubert wrote:
> On Thu, Jan 16, 2020 at 07:11:36AM +0100, Christophe Leroy wrote:
> > Hi Segher,
> >
> > I'm trying to see if we could enhance TCP checksum calculations by splitting
> > inline assembly blocks to gi
On Thu, Jan 16, 2020 at 07:11:36AM +0100, Christophe Leroy wrote:
> Hi Segher,
>
> I'm trying to see if we could enhance TCP checksum calculations by splitting
> inline assembly blocks to give GCC the opportunity to mix it with other
> stuff, but I'm getting difficultie
Hi Segher,
I'm trying to see if we could enhance TCP checksum calculations by
splitting inline assembly blocks to give GCC the opportunity to mix it
with other stuff, but I'm getting difficulties with the carry.
As far as I can read in the documentation, the z constraint represents
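While the thread is about doing this with adde chains in asm, the carry handling itself can be sketched in portable C (a hedged sketch, not the kernel's csum_partial): accumulate into a wider type so carries are never lost, then fold them back in at the end, which is what the add-with-carry loop achieves in a single pass.

```c
#include <stdint.h>
#include <stddef.h>

/* Ones'-complement (Internet) checksum over 16-bit words. The 64-bit
 * accumulator plays the role the carry flag plays in an adde loop. */
uint16_t csum16(const uint16_t *buf, size_t n)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += buf[i];

    /* Fold accumulated carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```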
On Fri, 2016-29-04 at 22:29:27 UTC, Unknown sender due to SPF wrote:
> In create_zero_mask() we have:
>
> addi    %1,%2,-1
> andc    %1,%1,%2
> popcntd %0,%1
>
> using the "r" constraint for %2. r0 is a valid register in the "r" set,
> but addi X,r0,X turns it into an li:
>
>
In create_zero_mask() we have:
addi    %1,%2,-1
andc    %1,%1,%2
popcntd %0,%1
using the "r" constraint for %2. r0 is a valid register in the "r" set,
but addi X,r0,X turns it into an li:
li      r7,-1
andc    r7,r7,r0
popcntd r4,r7
Fix this by us
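The fix being described is to switch the operand from the "r" constraint to "b", GCC's base-register class on PowerPC (any GPR except r0), since r0 in the RA field of addi is read as the literal 0. A sketch of the repaired asm (PowerPC-only, compiles only for a powerpc target; identifiers approximate the kernel's create_zero_mask() but are illustrative):

```c
static inline unsigned long create_zero_mask(unsigned long bits)
{
    unsigned long lead_zero, tmp;

    /* "b" excludes r0, so addi cannot degrade into li. */
    asm("addi    %1,%2,-1\n\t"
        "andc    %1,%1,%2\n\t"
        "popcntd %0,%1"
        : "=r" (lead_zero), "=&r" (tmp)
        : "b" (bits));
    return lead_zero;
}
```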
Hi Segher,
> > Our inline assembly only clobbers the first condition register
> > field, but we mark all of them as being clobbered.
>
> No, we don't. "cc" has been an alias for cr0 for over twenty two and
> a half years now; it has never changed meaning. Th
On Sat, Nov 01, 2014 at 11:42:51AM +1100, Anton Blanchard wrote:
> Our inline assembly only clobbers the first condition register field,
> but we mark all of them as being clobbered.
No, we don't. "cc" has been an alias for cr0 for over twenty two and a
half years now;
Our inline assembly only clobbers the first condition register field,
but we mark all of them as being clobbered.
This will cause LLVM to save and restore the non volatile condition
register fields around the inline assembly, which is completely
unnecessary. A simple example:
void foo(void
On Tue, Jul 07, 2009 at 02:40:02PM +0200, Andreas Schwab wrote:
> Gabriel Paubert writes:
>
> > On Fri, Jul 03, 2009 at 10:57:12PM +0200, Andreas Schwab wrote:
> >> The 'Z' constraint is required for a memory operand for insns that don't
> >> have an update form (which would be selected by the %U
Gabriel Paubert writes:
> On Fri, Jul 03, 2009 at 10:57:12PM +0200, Andreas Schwab wrote:
>> The 'Z' constraint is required for a memory operand for insns that don't
>> have an update form (which would be selected by the %U modifier).
>
> Hmmm, I believed that it was for instructions that only
On Fri, Jul 03, 2009 at 10:57:12PM +0200, Andreas Schwab wrote:
> Brad Boyer writes:
>
> > On Fri, Jul 03, 2009 at 12:14:41PM +0530, kernel mailz wrote:
> >> b. using m or Z with a memory address. I tried replacing m/Z but no change
> >> Is there some guideline ?
> >> gcc documentation says Z is
kernel mailz writes:
> My query was more on %U1%X1, I guess it is specifying U and/or X for %1,
> right?
> What does U/X stand for (is it similar to u for unsigned and x for a hex
> address)?
> Are there any more literals like U/X/...?
The 'U' and 'X' modifiers expand to 'u' and 'x' resp, dependin
Brad Boyer writes:
> On Fri, Jul 03, 2009 at 12:14:41PM +0530, kernel mailz wrote:
>> b. using m or Z with a memory address. I tried replacing m/Z but no change
>> Is there some guideline ?
>> gcc documentation says Z is obsolete. Is m/Z replaceable ?
>
> No idea. I don't remember ever seeing 'Z
On Fri, Jul 03, 2009 at 12:14:41PM +0530, kernel mailz wrote:
>> Thanks for responding to my previous mail. A few more queries
>>
>> a. What is the use of adding format specifiers in inline assembly
>> like
>> asm volatile("ld%U1%X1 %0,%1":"=r"(re
On Fri, Jul 03, 2009 at 12:14:41PM +0530, kernel mailz wrote:
> Thanks for responding to my previous mail. A few more queries
>
> a. What is the use of adding format specifiers in inline assembly
> like
> asm volatile("ld%U1%X1 %0,%1":"=r"(ret) : "m"(*ptr) : "memory");
Hi,
Thanks for responding to my previous mail. A few more queries
a. What is the use of adding format specifiers in inline assembly
like
asm volatile("ld%U1%X1 %0,%1":"=r"(ret) : "m"(*ptr) : "memory");
b. using m or Z with a memory address. I tried r
function depending on the surrounding code.
> I am trying all the kernel code inline assembly to find an example
> that works differently with memory.
Well, we recently had an example in the atomic64 code for 32-bit that
Paulus wrote, where we discovered we were missing the memory clobber in
loca
On Mon, 2009-06-29 at 16:57 +0100, David Howells wrote:
> kernel mailz wrote:
>
> > asm("sync");
>
> Isn't gcc free to discard this as it has no dependencies, no indicated side
> effects, and isn't required to be kept? I think this should probably be:
>
> asm volatile("sync");
It should
kernel mailz writes:
> Consider atomic_add and atomic_add_return in kernel code.
> I am not able to figure out why "memory" is added in latter
The "memory" indicates that gcc should not reorder accesses to memory
from one side of the asm to the other. The reason for putting it on
the atomic ops
kernel mailz wrote:
> Consider atomic_add and atomic_add_return in kernel code.
>
> On Tue, Jun 30, 2009 at 2:59 AM, Ian Lance Taylor wrote:
>> kernel mailz writes:
>>
>>> I tried a small example
>>>
>>> int *p = 0x1000;
>>> int a = *p;
>>> asm("sync":::"memory");
>>> a = *p;
>>>
>>> and
>>>
>>>
Hi Scott,
I agree with you, kind of understand that it is required.
But buddy, unless you see some construct actually work, or see a visible
difference from adding the construct, the concept is just a piece of
theory.
I am trying all the kernel code inline assembly to find an example
that works differently
Consider atomic_add and atomic_add_return in kernel code.
On Tue, Jun 30, 2009 at 2:59 AM, Ian Lance Taylor wrote:
> kernel mailz writes:
>
>> I tried a small example
>>
>> int *p = 0x1000;
>> int a = *p;
>> asm("sync":::"memory");
>> a = *p;
>>
>> and
>>
>> volatile int *p = 0x1000;
>> int a = *
kernel mailz writes:
> I tried a small example
>
> int *p = 0x1000;
> int a = *p;
> asm("sync":::"memory");
> a = *p;
>
> and
>
> volatile int *p = 0x1000;
> int a = *p;
> asm("sync");
> a = *p
>
> Got the same assembly.
> Which is right.
>
> So does it mean, if proper use of volatile is done, th
David Howells writes:
> kernel mailz wrote:
>
>> asm("sync");
>
> Isn't gcc free to discard this as it has no dependencies, no indicated side
> effects, and isn't required to be kept? I think this should probably be:
>
> asm volatile("sync");
An asm with no outputs is considered to be vo
On Mon, Jun 29, 2009 at 09:19:57PM +0530, kernel mailz wrote:
> I tried a small example
>
> int *p = 0x1000;
> int a = *p;
> asm("sync":::"memory");
> a = *p;
>
> and
>
> volatile int *p = 0x1000;
> int a = *p;
> asm("sync");
> a = *p
>
> Got the same assembly.
> Which is right.
>
> So does it
kernel mailz wrote:
> asm("sync");
Isn't gcc free to discard this as it has no dependencies, no indicated side
effects, and isn't required to be kept? I think this should probably be:
asm volatile("sync");
David
___
Linuxppc-dev mailing list
"memory" ?
But then why below example of __xchg uses both ?
I am confused!
Anyone has a clue?
-TZ
-- Forwarded message --
From: kernel mailz
Date: Sun, Jun 28, 2009 at 10:27 AM
Subject: Re: Inline Assembly queries
To: Ian Lance Taylor
Cc: gcc-h...@gcc.gnu.org, linux
1414: 7d 60 48 28 lwarx r11,0,r9
1418: 7c 00 49 2d stwcx. r0,0,r9
141c: 40 a2 ff f8 bne- 1414
1420: 38 60 00 00 li r3,0
1424: 4e 80 00 20 blr
No diff ?
am I choosing the right example ?
-TZ
On Sun, Jun 28, 2009
Hello All the gurus,
I've been fiddling my luck with gcc 4.3.2 inline assembly on powerpc
There are a few queries
1. asm volatile or simply asm produce the same assembly code.
Tried with a few examples but didn't find any difference by adding
volatile with asm
2. Use of "memory"
Scott Wood wrote:
Chris Friesen wrote:
Scott Wood wrote:
Is the compiler assigning r0 to addr? That will be treated as a
literal zero instead. Try changing "r" (addr) to "b" (addr), or use
stwx.
Bingo! Is there a constraint to tell the compiler to not use r0 for addr?
Yes, "b".
Doh. S
Chris Friesen wrote:
Scott Wood wrote:
Is the compiler assigning r0 to addr? That will be treated as a
literal zero instead. Try changing "r" (addr) to "b" (addr), or use
stwx.
Bingo! Is there a constraint to tell the compiler to not use r0 for addr?
Yes, "b".
-Scott
Scott Wood wrote:
Chris Friesen wrote:
I've got a function that is used to overwrite opcodes in order to create
self-modifying code. It worked just fine with previous compilers, but
with gcc 4.3 it seems like it sometimes (but not always) causes problems
when inlined. If I force it to never
Chris Friesen wrote:
I've got a function that is used to overwrite opcodes in order to create
self-modifying code. It worked just fine with previous compilers, but
with gcc 4.3 it seems like it sometimes (but not always) causes problems
when inlined. If I force it to never be inlined, it work
Hi,
I've got a function that is used to overwrite opcodes in order to create
self-modifying code. It worked just fine with previous compilers, but
with gcc 4.3 it seems like it sometimes (but not always) causes problems
when inlined. If I force it to never be inlined, it works fine.
First,
unsigned int get_PLL_range(unsigned int range, unsigned int config)
{
range = range * 8 + 23;
return ((config << range) | (config >> (32 - range))) & 3;
}
The special pattern ((a << n) | (a >> (32 - n))) is recognized by gcc
as a rotate operation.
It's only valid for 1 <= n <= 31 though
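The 1 <= n <= 31 caveat (a C shift by 32 - n is undefined when n is 0) can be sidestepped with the masked form of the idiom, which GCC and Clang also recognize as a single rotate instruction; a sketch applied to the thread's function (safe here anyway, since range*8+23 is only ever 23 or 31):

```c
#include <stdint.h>

/* Rotate-left written with the (-n & 31) mask so it is well defined
 * for n == 0 too; compilers still turn it into one rotate insn. */
static uint32_t rotl32(uint32_t x, unsigned int n)
{
    return (x << (n & 31)) | (x >> (-n & 31));
}

uint32_t get_PLL_range(unsigned int range, uint32_t config)
{
    /* range 0 -> rotate by 23, range 1 -> rotate by 31 */
    return rotl32(config, range * 8 + 23) & 3;
}
```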
Kevin Diggs <[EMAIL PROTECTED]> writes:
> Jeremy Kerr wrote:
>> Hi Kevin,
>>
>>
>>> /*
>>>  * Turn r3 (range) into a rotate count for the selected range.
>>>  * 0 -> 23, 1 -> 31
>>>  */
>>> __asm__ __volatile__ ( "slwi %0,%0,3\n"
>>> "
Jeremy Kerr wrote:
Hi Kevin,
/*
 * Turn r3 (range) into a rotate count for the selected range.
 * 0 -> 23, 1 -> 31
 */
__asm__ __volatile__ ( "slwi %0,%0,3\n"
"addi %0,%0,23\n"
"rlwnm %0,%1,%0,30,31
Hi Kevin,
> /*
>  * Turn r3 (range) into a rotate count for the selected range.
>  * 0 -> 23, 1 -> 31
>  */
> __asm__ __volatile__ ( "slwi %0,%0,3\n"
> "addi %0,%0,23\n"
> "rlwnm %0,%1,%0,30,31\n"
On Tue, 2008-08-05 at 17:20 -0700, Kevin Diggs wrote:
> Hi,
>
> that's bad, right? Because the "addi 0, 0, 23" will not work as expected
> because of the "special property" of r0. FYI: The first three lines
> after the "#APP" are from a similar function get_PLL_ratio().
>
> Is there a way
Hi,
If I have:
inline unsigned int get_PLL_range(unsigned int range, unsigned int
config)
{
unsigned int ret;
/*
* Turn r3 (range) into a rotate count for the selected range.
* 0 -> 23, 1 -> 31
*/
__asm__ __volatile__ ( "slwi %0,%0,3\n"
On Thu, Jun 05, 2008 at 11:44:51AM +0100, David Howells wrote:
> Scott Wood <[EMAIL PROTECTED]> wrote:
>
> > int tmp;
> >
> > asm volatile("addi %1, %2, -1;"
> > "andc %1, %2, %1;"
> > "cntlzw %1, %1;"
> > "subfic %0, %1, 31" : "=r" (j), "=&r" (tmp) : "r" (i
Scott Wood <[EMAIL PROTECTED]> wrote:
> int tmp;
>
> asm volatile("addi %1, %2, -1;"
> "andc %1, %2, %1;"
> "cntlzw %1, %1;"
> "subfic %0, %1, 31" : "=r" (j), "=&r" (tmp) : "r" (i));
Registers are usually assumed to be 'long' in size, so I'd recommend using
Kevin Diggs wrote:
Hi,
When doing inline assembly, is there a way to get the compiler to
assign "extra" (one not specified for inputs and outputs) registers? In
the following:
__asm__ __volatile__ (
"addi 5,%1,-1\n"
Hi,
When doing inline assembly, is there a way to get the compiler to
assign "extra" (one not specified for inputs and outputs) registers? In
the following:
__asm__ __volatile__ (
"addi 5,%1,-1\n"
70 matches