Branch instructions that depend on target distance

2020-02-24 Thread Petr Tesarik
Hi all,

I'm looking into reviving the efforts to port gcc to VideoCore IV [1].
One issue I've run into is the need to find out the branch target distance
at compile time. I looked around, and this is not the first
architecture with such a requirement, but AFAICS it has never been solved
properly.

For example, AVR tracks instruction length. Later, ret_cond_branch()
selects between a branch instruction and an inverted branch followed by
an unconditional jump based on these calculated lengths.
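
As an illustration of that selection, here is a minimal Python sketch. It is not the AVR backend code: the function name, the mnemonic spelling and the signed 7-bit word range are assumptions made for the example.

```python
# Hypothetical reach of the short conditional branch (signed 7-bit offset).
SHORT_MIN, SHORT_MAX = -64, 63

def select_branch(cond, inverted_cond, distance, target):
    """Return the instruction sequence for a conditional branch to `target`."""
    if SHORT_MIN <= distance <= SHORT_MAX:
        # Target is close enough: a single conditional branch.
        return [f"br{cond} {target}"]
    # Out of range: branch over an unconditional jump on the inverted condition.
    return [f"br{inverted_cond} 1f", f"jmp {target}", "1:"]

print(select_branch("eq", "ne", 10, "L1"))
print(select_branch("eq", "ne", 200, "L1"))
```

The choice is only as good as the `distance` estimate, which is summed from the per-instruction length attributes.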

This works great ... until there's some inline asm() statement, for
which gcc cannot keep track of the length attribute, so it is probably
taken as zero. The linker then fails with a cryptic message:

> relocation truncated to fit: R_AVR_7_PCREL against `no symbol'

I can provide a minimal test case and report a bug if you want...

Developers work around the issue by rewriting their code when they are
bitten by this, but it is less than optimal, because you cannot really
get rid of all inline assembly, and it's in general unpredictable where
these inline asm() blocks will be placed by the compiler.

OTOH, the avr backend is pretty outdated, so there may be a better
alternative that I'm just not seeing. Any hints?

Background: There is a VC4 port of the LK embedded kernel
[2]. I have tried to build that kernel with optimization turned on, but
I'm getting:

compiling kernel/thread.c
/tmp/ccJFdnfX.s: Assembler messages:
/tmp/ccJFdnfX.s:1451: Error: operand out of range (64 not between -64 and 63)

That's because there is an inline "di" (disable interrupts) instruction
inside a conditional statement in thread_yield(), which causes this
off-by-one miscalculation.
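
The arithmetic behind such an error can be modelled in a few lines of Python. This is a sketch under assumptions: the 7-bit field width is read off the range in the assembler message, while the instruction counts and unit lengths are hypothetical.

```python
def fits_signed(value, bits):
    """True if `value` is representable as a signed `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1

def distance(lengths):
    """Branch distance as the sum of the lengths of the skipped instructions."""
    return sum(lengths)

# 63 ordinary instructions of length 1, plus one inline asm statement.
estimated = distance([1] * 63 + [0])  # the estimate misses the asm entirely
actual = distance([1] * 63 + [1])     # the assembler sees one more unit

print(estimated, fits_signed(estimated, 7))
print(actual, fits_signed(actual, 7))
```

The compiler picks the short form because its estimate fits, and the assembler rejects it because the real distance is one unit larger.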

Petr T

[1] https://github.com/itszor/gcc-vc4
[2] https://github.com/librerpi/lk




Re: Branch instructions that depend on target distance

2020-02-24 Thread Jozef Lawrynowicz
On Mon, 24 Feb 2020 12:05:28 +0100
Petr Tesarik  wrote:

> Hi all,
> 
> I'm looking into reviving the efforts to port gcc to VideoCore IV [1].
> One issue I've run into is the need to find out target branch distance
> at compile time. I looked around, and it's not the first one
> architecture with such requirement, but AFAICS it has never been solved
> properly.
> 
> For example, AVR tracks instruction length. Later, ret_cond_branch()
> selects between a branch instruction and an inverted branch followed by
> an unconditional jump based on these calculated lengths.
> 
> This works great ... until there's some inline asm() statement, for
> which gcc cannot keep track of the length attribute, so it is probably
> taken as zero. Linker then fails with a cryptic message:
> 
> > relocation truncated to fit: R_AVR_7_PCREL against `no symbol'  

The MSP430 backend just always generates maximum range branch instructions,
except for some special cases. We then rely on the linker to relax branch
instructions to shorter range "jump" instructions when the destination is
within range.

So the compiler output will always work, but not be the smallest possible code
size.
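
A rough Python model of that scheme, for illustration only: it is not the MSP430 linker, and the encoding sizes and the reach of the short form are hypothetical. Every branch starts in its long form and is shrunk when the target turns out to be in range.

```python
LONG, SHORT = 4, 2               # hypothetical encoding sizes in bytes
SHORT_MIN, SHORT_MAX = -16, 15   # hypothetical reach of the short form

def relax(items):
    """Shrink long branches whose target is within reach of the short form.

    items is a list of ("insn", size) or ("branch", target_index) tuples.
    Returns the size of every item after relaxation.
    """
    sizes = [LONG if kind == "branch" else arg for kind, arg in items]
    changed = True
    while changed:
        changed = False
        # Start address of every item (plus the end address) for current sizes.
        addr = [0]
        for size in sizes:
            addr.append(addr[-1] + size)
        for i, (kind, arg) in enumerate(items):
            if kind == "branch" and sizes[i] == LONG:
                dist = addr[arg] - addr[i + 1]   # from the end of the branch
                if SHORT_MIN <= dist <= SHORT_MAX:
                    sizes[i] = SHORT             # sizes only ever shrink
                    changed = True
    return sizes

# A near forward branch, 20 two-byte instructions, and a far backward branch.
items = [("branch", 2), ("insn", 2)] + [("insn", 2)] * 20 + [("branch", 0)]
sizes = relax(items)
print(sizes[0], sizes[-1])
```

Because shrinking can only bring targets closer, a branch that fits the short form once keeps fitting, and the loop terminates.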

For that "relocation truncated to fit" error message, you want to check that
the linker has the ability to relax whatever branch instruction it is failing
on to a longer range branch.

Jozef



Re: Branch instructions that depend on target distance

2020-02-24 Thread Andreas Schwab
On Feb 24 2020, Petr Tesarik wrote:

> This works great ... until there's some inline asm() statement, for
> which gcc cannot keep track of the length attribute, so it is probably
> taken as zero.

GCC computes it by counting the number of asm insns.  You can use
ADJUST_INSN_LENGTH to adjust this as needed.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Branch instructions that depend on target distance

2020-02-24 Thread Petr Tesarik
On Mon, 24 Feb 2020 11:14:44 +
Jozef Lawrynowicz  wrote:

> On Mon, 24 Feb 2020 12:05:28 +0100
> Petr Tesarik  wrote:
> 
> > Hi all,
> > 
> > I'm looking into reviving the efforts to port gcc to VideoCore IV [1].
> > One issue I've run into is the need to find out target branch distance
> > at compile time. I looked around, and it's not the first one
> > architecture with such requirement, but AFAICS it has never been solved
> > properly.
> > 
> > For example, AVR tracks instruction length. Later, ret_cond_branch()
> > selects between a branch instruction and an inverted branch followed by
> > an unconditional jump based on these calculated lengths.
> > 
> > This works great ... until there's some inline asm() statement, for
> > which gcc cannot keep track of the length attribute, so it is probably
> > taken as zero. Linker then fails with a cryptic message:
> >   
> > > relocation truncated to fit: R_AVR_7_PCREL against `no symbol'
> 
> The MSP430 backend just always generates maximum range branch instructions,
> except for some special cases. We then rely on the linker to relax branch
> instructions to shorter range "jump" instructions when the destination is
> within range.
> 
> So the compiler output will always work, but not be the smallest possible code
> size.
> 
> For that relocation truncated to fit error message you want to check that the
> linker has the ability to relax whatever branch instruction it is failing on
> to a longer range branch.

But that would change the instruction length, so not really an option
AFAICS (unless I also switch to LTO).

Anyway, the situation is much worse on the VideoCore IV. The
alternatives here are:

1.
   addcmpbCC rx, 0, imm, target
   ; usually written as bCC rx, imm, target

2.
   cmp rx, imm
   bCC .+2
   j   target

The tricky part is that the addcmpbCC instruction does NOT modify
condition codes, while the cmp instruction does. Nothing you could
solve in the linker...
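
The difference between the two alternatives can be written down as a small Python sketch. The function and its parameters are hypothetical, and the long form is written here with an explicitly inverted condition, which is one reading of the sequence above. The point is the second return value: only the compiler knows whether the condition codes are live across the branch.

```python
def vc4_cond_branch(cc, inv_cc, reg, imm, target, in_range):
    """Return (instructions, clobbers_condition_codes) for a conditional branch."""
    if in_range:
        # Compare-and-branch in one instruction; condition codes are untouched.
        return [f"b{cc} {reg}, {imm}, {target}"], False
    # Explicit compare, then branch over the jump on the inverted condition.
    return [f"cmp {reg}, {imm}", f"b{inv_cc} 1f", f"j {target}", "1:"], True

short_form, short_clobbers = vc4_cond_branch("eq", "ne", "r0", 5, "L1", True)
long_form, long_clobbers = vc4_cond_branch("eq", "ne", "r0", 5, "L1", False)
print(short_form, short_clobbers)
print(long_form, long_clobbers)
```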

OK, it seems I'll have to go with the worst-case variant.

Petr T






Re: Branch instructions that depend on target distance

2020-02-24 Thread Petr Tesarik
On Mon, 24 Feb 2020 12:29:40 +0100
Andreas Schwab  wrote:

> On Feb 24 2020, Petr Tesarik wrote:
> 
> > This works great ... until there's some inline asm() statement, for
> > which gcc cannot keep track of the length attribute, so it is probably
> > taken as zero.  
> 
> GCC computes it by counting the number of asm insns.  You can use
> ADJUST_INSN_LENGTH to adjust this as needed.

Hmm, that's interesting, but does it work for inline asm() statements?
The argument is essentially a free-form string (with some
substitution), and the compiler cannot know how many bytes they occupy.

Is there a way to set ADJUST_INSN_LENGTH in the asm() statement? Of
course, that would have to be inserted manually as some sort of
argument to asm() in the C code being compiled...

And yes, that would work for me if it is implemented.

Petr T




Re: Branch instructions that depend on target distance

2020-02-24 Thread Andrew Stubbs

On 24/02/2020 11:05, Petr Tesarik wrote:

> Hi all,
>
> I'm looking into reviving the efforts to port gcc to VideoCore IV [1].
> One issue I've run into is the need to find out target branch distance
> at compile time. I looked around, and it's not the first one
> architecture with such requirement, but AFAICS it has never been solved
> properly.
>
> For example, AVR tracks instruction length. Later, ret_cond_branch()
> selects between a branch instruction and an inverted branch followed by
> an unconditional jump based on these calculated lengths.
>
> This works great ... until there's some inline asm() statement, for
> which gcc cannot keep track of the length attribute, so it is probably
> taken as zero. Linker then fails with a cryptic message:
>
> > relocation truncated to fit: R_AVR_7_PCREL against `no symbol'


You can probably fix this by implementing the ADJUST_INSN_LENGTH macro 
and recognising the inline assembler. See the internals manual.


We encountered similar issues with the recent GCN port, and the correct 
solution was to add the length attribute everywhere. The attributes are 
often conservative estimates (rather than having extra alternatives for 
every possible encoding), so the asm problem is mitigated somewhat, at 
the cost of a few "far" branches where they're not strictly necessary.


There were also additional problems because "far" branches clobber the 
condition register, and "near" branches do not, but that's another story.


Andrew


Re: Branch instructions that depend on target distance

2020-02-24 Thread Andreas Schwab
On Feb 24 2020, Petr Tesarik wrote:

> On Mon, 24 Feb 2020 12:29:40 +0100
> Andreas Schwab  wrote:
>
>> On Feb 24 2020, Petr Tesarik wrote:
>> 
>> > This works great ... until there's some inline asm() statement, for
>> > which gcc cannot keep track of the length attribute, so it is probably
>> > taken as zero.  
>> 
>> GCC computes it by counting the number of asm insns.  You can use
>> ADJUST_INSN_LENGTH to adjust this as needed.
>
> Hmm, that's interesting, but does it work for inline asm() statements?

Yes, for a suitable definition of work.

> The argument is essentially a free-form string (with some
> substitution), and the compiler cannot know how many bytes they occupy.

That's why ADJUST_INSN_LENGTH can adjust it.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Branch instructions that depend on target distance

2020-02-24 Thread Alexander Monakov
On Mon, 24 Feb 2020, Andreas Schwab wrote:

> On Feb 24 2020, Petr Tesarik wrote:
> 
> > On Mon, 24 Feb 2020 12:29:40 +0100
> > Andreas Schwab  wrote:
> >
> >> On Feb 24 2020, Petr Tesarik wrote:
> >> 
> >> > This works great ... until there's some inline asm() statement, for
> >> > which gcc cannot keep track of the length attribute, so it is probably
> >> > taken as zero.  
> >> 
> >> GCC computes it by counting the number of asm insns.  You can use
> >> ADJUST_INSN_LENGTH to adjust this as needed.
> >
> > Hmm, that's interesting, but does it work for inline asm() statements?
> 
> Yes, for a suitable definition of work.
> 
> > The argument is essentially a free-form string (with some
> > substitution), and the compiler cannot know how many bytes they occupy.
> 
> That's why ADJUST_INSN_LENGTH can adjust it.

I think Petr might be unaware of the fact that GCC counts the **number of
instructions in an inline asm statement** by counting separators in the
asm string. This may overcount when a separator appears in a string literal
for example, but triggering under-counting is trickier.

Petr, please see https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html
for some more discussion.
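
A simplified Python model of that counting, based only on the description above: it is not GCC's implementation, and the `;` default separator is an assumption (the real separator is target-specific).

```python
def asm_statement_count(template, separator=";"):
    """Estimate the number of statements in an inline asm template string."""
    if not template:
        return 0
    # One statement, plus one more for every newline or separator character.
    return 1 + sum(1 for ch in template if ch == "\n" or ch == separator)

print(asm_statement_count("di"))                # a single statement
print(asm_statement_count("nop\n\tnop; nop"))   # newline and ';' both separate
print(asm_statement_count('.ascii "a;b"'))      # ';' inside a literal is counted too
```

The last call shows the over-counting case: the template holds one directive, but the separator inside the string literal is counted as well.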

Alexander


Hey, spotted an out-of-date reference on gcc.gnu.org

2020-02-24 Thread Richard @ Quality Nonsense PR Team
Dear GCC Team,

Just checking in one last time to make sure we don't get overlooked in your 
inbox.

Is there anything stopping you from recommending our long read about Digital 
Equipment Corporation to your users instead of the broken DEC link, out of 
interest?

I'm always interested in actionable feedback ;)

Richard
Digital.com

On Wed, Feb 19, 2020 at 11:34 AM, Richard @ Quality Nonsense PR Team 
 wrote:

Dear GCC Team,

I'm following up to make sure you didn't miss my email about your broken link 
to DEC.com. What did you think about my idea of citing our long read about 
Digital Equipment Corporation instead? Can I answer any questions, perhaps?

Richard
Digital.com

On Thu, Feb 13, 2020 at 12:25 PM, Richard @ Quality Nonsense PR Team 
 wrote:

Dear GCC Team,

I’m writing because you cite DEC.com in this post on GCC - but the website has 
been offline for 5+ years!

At Digital.com, we've published a "long read" on the rise and fall of Digital 
Equipment Corporation, which I thought would make an "easy fix" to send your 
readers somewhere more useful than a 404 page.

It covers everything from VAX, Altavista, DEC's sale to Compaq & ultimate 
demise. You'll find the article here:-

https://digital.com/about/dec/

Would you consider citing our work to replace the broken DEC.com link, please? 

I think this article would keep your content current and your readers happy - 
love to hear what you think. Either way, thank you for your time and 
consideration.

Best wishes,

Richard
Digital.com 
Don't want emails from us anymore? Reply to this email with the word 
"UNSUBSCRIBE" in the subject line.
Digital.com, BM Box 3667 , London, Greater London, WC1N 3XX , United Kingdom

If you don't want further emails from me, please reply with 
"UNSUBSCRIBE" in the subject line. Thank you.

Digital.com, BM Box 3667, BM Box 3667 Greater London, WC1N 3XX , United Kingdom 
   


Re: Branch instructions that depend on target distance

2020-02-24 Thread Julian Brown
On Mon, 24 Feb 2020 15:03:21 +0300 (MSK)
Alexander Monakov  wrote:

> On Mon, 24 Feb 2020, Andreas Schwab wrote:
> 
> > On Feb 24 2020, Petr Tesarik wrote:
> >   
> > > On Mon, 24 Feb 2020 12:29:40 +0100
> > > Andreas Schwab  wrote:
> > >  
> > >> On Feb 24 2020, Petr Tesarik wrote:
> > >>   
> > >> > This works great ... until there's some inline asm()
> > >> > statement, for which gcc cannot keep track of the length
> > >> > attribute, so it is probably taken as zero.
> > >> 
> > >> GCC computes it by counting the number of asm insns.  You can use
> > >> ADJUST_INSN_LENGTH to adjust this as needed.  
> > >
> > > Hmm, that's interesting, but does it work for inline asm()
> > > statements?  
> > 
> > Yes, for a suitable definition of work.
> >   
> > > The argument is essentially a free-form string (with some
> > > substitution), and the compiler cannot know how many bytes they
> > > occupy.  
> > 
> > That's why ADJUST_INSN_LENGTH can adjust it.  
> 
> I think Petr might be unaware of the fact that GCC counts the
> **number of instructions in an inline asm statement** by counting
> separators in the asm string. This may overcount when a separator
> appears in a string literal for example, but triggering
> under-counting is trickier.
> 
> Petr, please see
> https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html for some more
> discussion.

VC4 instructions vary between 16 & 80 bits in length -- I guess you
need to arrange things so that the maximum is used for inline asms (per
instruction, counting by separators). That's not 100% ideal since most
instructions will be much shorter, but at least it should give working
code.
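
In numbers, as a small Python sketch: the 16-bit and 80-bit figures come from the paragraph above, the function name is hypothetical, and each counted statement is assumed to be exactly one instruction.

```python
MAX_INSN_BITS = 80   # longest VC4 instruction, per the figure quoted above
MIN_INSN_BITS = 16   # shortest VC4 instruction, same source

def worst_case_asm_bytes(statement_count, max_insn_bits=MAX_INSN_BITS):
    """Size bound for an inline asm, assuming one instruction per statement."""
    return statement_count * (max_insn_bits // 8)

estimate = worst_case_asm_bytes(1)      # a single-statement asm
slack = estimate - MIN_INSN_BITS // 8   # overestimate if it encodes in 16 bits
print(estimate, slack)
```

The slack is what pushes a few nearby branches into the long form unnecessarily; the estimate itself never comes out too small under the stated assumption.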

Julian


Re: Branch instructions that depend on target distance

2020-02-24 Thread Petr Tesarik
On Mon, 24 Feb 2020 15:03:21 +0300 (MSK)
Alexander Monakov  wrote:

> On Mon, 24 Feb 2020, Andreas Schwab wrote:
> 
> > On Feb 24 2020, Petr Tesarik wrote:
> >   
> > > On Mon, 24 Feb 2020 12:29:40 +0100
> > > Andreas Schwab  wrote:
> > >  
> > >> On Feb 24 2020, Petr Tesarik wrote:
> > >>   
> > >> > This works great ... until there's some inline asm() statement, for
> > >> > which gcc cannot keep track of the length attribute, so it is probably
> > >> > taken as zero.
> > >> 
> > >> GCC computes it by counting the number of asm insns.  You can use
> > >> ADJUST_INSN_LENGTH to adjust this as needed.  
> > >
> > > Hmm, that's interesting, but does it work for inline asm() statements?  
> > 
> > Yes, for a suitable definition of work.
> >   
> > > The argument is essentially a free-form string (with some
> > > substitution), and the compiler cannot know how many bytes they occupy.  
> > 
> > That's why ADJUST_INSN_LENGTH can adjust it.  
> 
> I think Petr might be unaware of the fact that GCC counts the **number of
> instructions in an inline asm statement** by counting separators in the
> asm string. This may overcount when a separator appears in a string literal
> for example, but triggering under-counting is trickier.
> 
> Petr, please see https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html
> for some more discussion.

Indeed. Thanks for the pointer. First, it explains why my AVR test case
was invalid (I used ".rept 64 ; nop ; .endr" to save me some work).
Second, it made me aware of "the longest instruction supported by that
processor".

IIUC, this should be the default value for (define_attr "length")
in a machine description file unless a better value can be calculated
from a known instruction. And it also explains why it is still
necessary to provide some value even if you define a "length" attribute
for all known instructions.
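
The `.rept` case can be checked with a short Python sketch. The counting function is a simplified model of separator counting, not GCC code, and the `;` separator is an assumption.

```python
def counted_statements(template, separator=";"):
    """Statements seen by a separator-counting estimate (simplified model)."""
    if not template:
        return 0
    return 1 + template.count(separator) + template.count("\n")

template = ".rept 64 ; nop ; .endr"
counted = counted_statements(template)
emitted = 64   # instructions the assembler produces after expanding .rept

print(counted, emitted)
```

Any per-statement length, even the longest instruction, is multiplied by the counted value, so an assembler-level repetition like this one defeats the estimate.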

Cheers,
Petr T




Re: Hey, spotted an out-of-date reference on gcc.gnu.org

2020-02-24 Thread Jonathan Wakely
You keep talking about a broken link without telling us where it is.

But I don't think we're interested anyway, that's why nobody has replied.





Re: Branch instructions that depend on target distance

2020-02-24 Thread Jeff Law
On Mon, 2020-02-24 at 12:36 +0100, Petr Tesarik wrote:
> On Mon, 24 Feb 2020 11:14:44 +
> Jozef Lawrynowicz  wrote:
> 
> > On Mon, 24 Feb 2020 12:05:28 +0100
> > Petr Tesarik  wrote:
> > 
> > > Hi all,
> > > 
> > > I'm looking into reviving the efforts to port gcc to VideoCore IV [1].
> > > One issue I've run into is the need to find out target branch distance
> > > at compile time. I looked around, and it's not the first one
> > > architecture with such requirement, but AFAICS it has never been solved
> > > properly.
> > > 
> > > For example, AVR tracks instruction length. Later, ret_cond_branch()
> > > selects between a branch instruction and an inverted branch followed by
> > > an unconditional jump based on these calculated lengths.
> > > 
> > > This works great ... until there's some inline asm() statement, for
> > > which gcc cannot keep track of the length attribute, so it is probably
> > > taken as zero. Linker then fails with a cryptic message:
> > >   
> > > > relocation truncated to fit: R_AVR_7_PCREL against `no symbol'
> > 
> > The MSP430 backend just always generates maximum range branch instructions,
> > except for some special cases. We then rely on the linker to relax branch
> > instructions to shorter range "jump" instructions when the destination is
> > within range.
> > 
> > So the compiler output will always work, but not be the smallest possible
> > code size.
> > 
> > For that relocation truncated to fit error message you want to check that
> > the linker has the ability to relax whatever branch instruction it is
> > failing on to a longer range branch.
> 
> But that would change the instruction length, so not really an option
> AFAICS (unless I also switch to LTO).
> 
> Anyway, the situation is much worse on the VideoCore IV. The
> alternatives here are:
> 
> 1.
>    addcmpbCC rx, 0, imm, target
>    ; usually written as bCC rx, imm, target
> 
> 2.
>    cmp rx, imm
>    bCC .+2
>    j   target
Yea, this isn't that uncommon.  You can describe both of these to the
branch shortening pass.

> 
> The tricky part is that the addcmpbCC instruction does NOT modify
> condition codes, while the cmp instruction does. Nothing you could
> solve in the linker...
> 
> OK, it seems I'll have to go with the worst-case variant.
You can support both.  You output the short case when the target is
close enough and the longer variant otherwise.
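
One way to picture "support both" is a fixed-point iteration that starts every branch in its short form and lengthens those whose target is out of reach. The Python sketch below is a model of that idea, not GCC's branch shortening code; the lengths and the reach of the short variant are hypothetical.

```python
SHORT_LEN, LONG_LEN = 1, 3       # hypothetical lengths of the two variants
SHORT_MIN, SHORT_MAX = -64, 63   # hypothetical reach of the short variant

def shorten_branches(items):
    """Pick the short or long variant for every branch.

    items is a list of ("insn", length) or ("branch", target_index) tuples.
    Returns the final length of every item.
    """
    lengths = [SHORT_LEN if kind == "branch" else arg for kind, arg in items]
    changed = True
    while changed:
        changed = False
        # Start address of every item (plus the end address).
        addr = [0]
        for n in lengths:
            addr.append(addr[-1] + n)
        for i, (kind, arg) in enumerate(items):
            if kind != "branch" or lengths[i] == LONG_LEN:
                continue
            dist = addr[arg] - addr[i + 1]   # measured from the end of the branch
            if not SHORT_MIN <= dist <= SHORT_MAX:
                lengths[i] = LONG_LEN        # lengths only ever grow
                changed = True
    return lengths

# Branch 0 barely fits until branch 1, which lies in between, has to grow.
items = [("branch", 64), ("branch", 164)] + [("insn", 1)] * 162
lengths = shorten_branches(items)
print(lengths[0], lengths[1])
```

In the example the first branch is in range on the first pass, but lengthening the second branch moves its target two units further away, so a second pass lengthens it too. The iteration stops once a full pass changes nothing.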

Jeff