Re: putc vs. fputc

2009-07-24 Thread Martin Guy
On 7/24/09, Uros Bizjak  wrote:
>  The source of gcc uses both, fputc and putc. I would like to do some
>  janitorial work and change fputc to putc.

putc and fputc have different semantics:
fputc is guaranteed to be a function while putc may be a macro.

M

"He who has nothing to do, combs dogs" - Sicilian saying


Re: Anyone else run ACATS on ARM?

2009-08-12 Thread Martin Guy
On 8/12/09, Joel Sherrill  wrote:
>  So any ACATS results from any other ARM target would be
>  appreciated.

I looked into gnat-arm for the new Debian port and the conclusion was
that it has never been bootstrapped onto ARM. The closest I have seen
is Adacore's GNATPro x86->xscale cross-compiler hosted on Windows and
targetting Nucleus OS (gak!)

The community feeling was that it would "just go" given a prodigal
burst of cross-compiling, but I never achieved sufficiently high
blood pressure to try it...

 M


Re: Anyone else run ACATS on ARM?

2009-08-13 Thread Martin Guy
On 8/12/09, Matthias Klose  wrote:
> On 12.08.2009 23:07, Martin Guy wrote:
> > I looked into gnat-arm for the new Debian port and the conclusion was
> > that it has never been bootstrapped onto ARM. The closest I have seen
> > is Adacore's GNATPro x86->xscale cross-compiler hosted on Windows and
> > targetting Nucleus OS (gak!)
>
>  is there any arm-linux-gnueabi gnat binary that could be used to bootstrap
> an initial gnat-4.4 package for debian?

No, unless someone has done this since 2007. It involved
cross-compiling to generate a native gnat compiler, but that was not a
priority for ARM Ltd when I was working on this.

M


How to make ARM->MaverickCrunch register transfers schedulable?

2009-08-15 Thread Martin Guy
Hi!
  I'd appreciate some input on how to get the pipeline scheduler to
know about the bizarre MaverickCrunch timing characteristics.

  Brief: Crunch is an asynchronous ARM coprocessor which has internal
operations from/to its own register set, transfers between its own
registers and the ARM integer registers, and transfers directly
to/from memory.
  Softfp is the current favourite ABI, where double arguments are
passed in ARM register pairs, same as softfloat, and a typical double
float function transfers its arguments from ARM registers to the FPU,
does some munging between the FPU registers, then transfers the result
back to ARM regs for the return(). It has to do this 32 bits at a
time:

double adddf(double a, double b) {return (a+b);}

adddf:
cfmv64lr  mvdx0, r0
cfmv64hr  mvdx0, r1
cfmv64lr  mvdx1, r2
cfmv64hr  mvdx1, r3
cfaddd    mvdx1, mvdx1, mvdx0
cfmvr64l  r0, mvdx1
cfmvr64h  r1, mvdx1
bx  lr

Although you can do one transfer per cycle between the two units, two
consecutive transfers to the same Crunch register incur a delay of
four cycles, so in the sequence above each transfer to a Crunch
register takes 4 cycles. A better sequence would be:

cfmv64lr  mvdx0, r0
cfmv64lr  mvdx1, r2
cfmv64hr  mvdx0, r1
cfmv64hr  mvdx1, r3

My questions are two:

- can I model the fact that two consecutive writes to the same
register have a latency of four cycles (whereas writes to different
registers can be one per cycle)?

- am I right in thinking I should define two new register modes, MAVHI
and MAVLO, for the two kinds of writes to the Maverick registers, then
turn the movdf (and movdi) definitions for moves to/from ARM registers
into define_splits using the two new modes?

Thanks, sorry it's a bit obscure!

   M

"An expert is someone who knows more and more about less and less:


Re: GCC 4.4.x speed regression - help?

2009-08-16 Thread Martin Guy
Yes, GCC is bigger and slower and for several architectures generates
bigger, slower code with every release, though saying so won't make
you very popular on this list! :)

One theory is that there are now so many different optimization passes
(and, worse, clever case-specific hacks hidden in the backends) that
the interaction between the lot of them is now chaotic. Selecting
optimization flags by hand is no longer humanly possible.

There is a project to untangle the mess: Grigori Fursin's MILEPOST GCC
at http://ctuning.org/wiki/index.php/CTools:MilepostGCC - an AI-based
attempt to automatically select combinations of GCC optimization flags
according to their measured effectiveness and a profile of your source
code's characteristics. The idea is fairly repulsive but effective -
it reports major speed gains, of the order of twice as fast compared to
the standard "fastest" -O options, and there is a Google Summer of Code
2009 project based on this work.

It seems to me that much over-hacked software lives a life cycle much
like the human one: infancy, adolescence, adulthood, middle-age (spot
the spread!) and ultimately old age and senility, exhibiting
characteristics at each stage akin to the mental faculties of a
person.

If you're serious about speed, you could try MILEPOST GCC, or try the
current up-and-coming "adolescent" open source compiler, LLVM at
llvm.org

M


Re: How to implement patterns with more than 30 alternatives

2009-12-22 Thread Martin Guy
On 12/22/09, Daniel Jacobowitz  wrote:
>  in a patch I'm working on for ARM cmpdi patterns, I ended up needing
>  "cmpdi_lhs_operand" and "cmpdi_rhs_operand" predicates because Cirrus
>  and VFP targets accept different constants.  Automatically generating
>  that would be a bit excessive though.

I wouldn't bother implementing that if the VFP/Cirrus conflict is the
only thing that needs it.
GCC has never been able to generate working code for Cirrus
MaverickCrunch, for over a dozen separate reasons, from incorrect use
of the way the Maverick sets the condition codes to hardware bugs in
the 64-bit instructions (or in the way GCC uses them).

I eventually cooked up over a dozen patches to make 4.[23] generate
reliable crunch floating point code but if you enable the 64-bit insns
it still fails the openssl testsuite.

 M


Re: How to implement patterns with more than 30 alternatives

2009-12-22 Thread Martin Guy
On 12/22/09, Daniel Jacobowitz  wrote:
> Interesting, I knew you had a lot of Cirrus patches but I didn't
>  realize the state of the checked-in code was so bad.
>
>  Is what's there useful or actively harmful?

Neither useful nor harmful, except in that it adds noise and size to
the arm backend.
It's useful if you want to get a working compiler by applying my patches...
The basic insn descriptions are OK, but the algorithms that use the
insns are defective; I suppose it's passively harmful until it's fixed.

I did the copyright assignment thing but I haven't mainlined the code,
partly because it currently has an embarrassing -mcirrus-di flag to
enable the imperfect 64-bit int support, partly out of laziness (it
still lacks a dejagnu testsuite for all the insns it can generate and
for the more interesting resolved bugs). Maybe one day...

M


Re: powerpc-eabi-gcc no implicit FPU usage

2010-01-16 Thread Martin Guy
On 1/16/10, David Edelsohn  wrote:
>  > Is there a way to get GCC to only use the FPU when we explicitly want to 
> use it (i.e. when we use doubles/floats)?  Is -msoft-float my only option 
> here?  Is there any sort of #pragma that could do the same thing as 
> -msoft-float (I didn't see one)?
>
>  To absolutely prevent use of FPRs, one must use -msoft-float.  The
>  hard-float and soft-float ABIs are incompatible and one cannot mix
>  object files.

There is a third option, -mfloat-abi=softfp, which stipulates that FP
instructions can be used within functions but parameter and return
values are passed using the same conventions as soft float. soft- and
softfp-compiled files can be linked together, allowing you to mix code
that uses FP instructions with code that doesn't, at source file
granularity.

I dunno if that affects the use of FP registers to load/store 64-bit
integer values as you originally described, but it may be the closest
you can get without modifying GCC to insert new #pragmas.

M


Re: Are pointers to be supposed to be sign or zero extended to wider integers?

2010-02-12 Thread Martin Guy
On 2/12/10, Richard Guenther  wrote:
> On Fri, Feb 12, 2010 at 10:41 AM, Jakub Jelinek  wrote:
>  > It seems pointers are sign extended to wider integers, is that intentional?

>  Your program prints zero-extends for ICC.
>
>  Probably the behavior is undefined and we get a warning anyway:

All C requires is that casting a pointer to an integer and back again
should be a no-op, so either behaviour should work, although there's a
more detailed explanation here, with references to the language
standards:
http://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html

To address your specific point, it says:
"extends according to the signedness of the integer type if the
pointer representation is larger than the integer type"

M


Re: Change x86 default arch for 4.5?

2010-02-21 Thread Martin Guy
I disagree with the "default cpu should suit 95% of what is currently
on sale" argument.

The default affects naive users most strongly, so it should "just
work" on as many processors as is reasonable, not be as fast as
possible on the majority of the processors currently on sale.
Naive users might have anything.

As an example of the results of that kind of thinking, we've had years
of pain and wasted time in the ARM world due to the default
architecture being armv5t instead of armv4t. The results are that user
after user, making their first steps compiling a cross toolchain,
turns up on the mailing lists having got "illegal instruction" after
days of work, and that almost all the distributions are forced to
carry an "unbreak armv4t" patch to GCC.
Lord, someone was even compelled to try and get Android working on
their Openmoko, while it was binary-only, by emulating the few trivial
instructions in the kernel. Ubuntu, similarly, excludes the lower end
by leaving the default unchanged.

When users get interested in maximal speed, the first thing they do is
go for -mcpu=xyz, which doesn't require them to recompile the compiler
and is educational, while software distributions who build their own
compilers make their own choices about the minimum processors they
want to support.

Of course, most GCC developers and 95% of the people they know
probably do all have big new PCs using processors that are currently
in production, so to their eyes it might seem that all the world has
SSE2. However most people in the world cherish anything that works at
all; why make life harder for them?

We don't see the bottom end in the various usage stats because most
people using them don't have the internet (or a landline phone, come
to that).
The time-honoured policy of having the default settings work on as
wide a range of hardware as possible is a socially inclusive one.
Some manifestos chisel the "low bar" into their constitutions (Debian
for example); it would be nice for GCC to do so too.

Cheers

M

"You can't buy a computer these days with less than a gigabyte."
   -- A.S.Tanenbaum, trying to defend Minix's fixed-size kernel arrays
at FOSDEM 2010


Re: Change x86 default arch for 4.5?

2010-02-21 Thread Martin Guy
On 2/21/10, Dave Korn  wrote:
>  It makes perfect sense that configuring for i686-*-* should
>  get you an i686 compiler and configuring for i586-*-* should get you an i586
>  compiler and so on, rather than that you get an i386 compiler no matter what
>  you asked for.

Agreed

   M


Re: Change x86 default arch for 4.5?

2010-02-21 Thread Martin Guy
On 2/21/10, Steven Bosscher  wrote:

> It is interesting how this conflicts with your signature:
>  > "You can't buy a computer these days with less than a gigabyte."
>  >   -- A.S.Tanenbaum, trying to defend Minix's fixed-size kernel arrays
>  > at FOSDEM 2010
> I take it you disagree with this? Because most people do not expect to
>  need 1GB for a Minix installation. ;-)
It's a straw man, another example of bogus reasoning. Of course you
can buy new computers with 64MB, and they are particularly suitable
for simple kernels.
The embedded Linux/BSD crowd at the presentation didn't seem very
impressed either.

> You want to cater for a minority with old hardware. I
>  actually expect you'll find that those users are less naive than the
>  average gcc user.
I want to cater for everyone, especially youngsters, learners and the
poor struggling with whatever they can get their hands on.
It's not even a rich country/poor country thing: I live in a run down
industrial area of England where the local kids are gagging for
anything that works.

> Can you name these distributions? I can only name Debian
>  (http://lists.debian.org/debian-arm/2006/06/msg00015.html)
A quick search for "unbreak-armv4t.patch" shows, at a glance on the
first ten hits, fedora, openembedded, slind, openmoko, mamona,
android-porting. I'll leave you to peruse page two on :)

> Ubuntu also requires i686 or later.
Ubuntu also needs 384MB to work these days, so it is a reasonable
application-specific choice for that distro. GCC should not be
tailored to high-end desktop, laptop and server machines.

>  But anyway, bringing ARM into this discussion is neither here nor there.
It is a specific example of a pointlessly high default CPU (for
"arm-*"), where such a decision was made in GCC, and of the annoyances
it causes, which is what this thread had drifted into.

> Your naive users (and mine) don't even know about -mcpu and -march.
Exactly, so they go "cc hello.c; a.out" and get "Illegal instruction"
unless they have a relatively new first-world PC.

>  >, which doesn't require them to recompile the compiler
> Neither does compiling for i386/i486 or armv4 if you have a
>  cross-compiler for another default -- you can use -mcpu to "downgrade"
>  too.
Of course. However it does bite cross-compilers because people end up
distributing the C library compiled for a high-end CPU, so no program
will run even when you do drop the -mcpu level. Raising it instead
still works for everyone.

>  (**/me mumbles something incoherent about Pareto, etc...***)
Moore's Law suggests that we should optimise most intensely for the
physically slower processors, where sloth or speed translates into
more real time, but I forgot that point in the last post :)

Actually, this is irrelevant to the thread, since one always has to
specify a CPU model in the tuple when configuring for i?86, and the
thread was about an i686-* configuration tuple still producing a
compiler that outputs i386 code by default, which does seem silly.

Happy Sunday.

   M


Re: Change x86 default arch for 4.5?

2010-02-21 Thread Martin Guy
On 2/21/10, Dave Korn  wrote:
>   I too am having a hard time envisaging exactly who would be in this class of
>  users who are simultaneously so naive that they don't know about -march or
>  -mcpu or think to read the manual, and yet so advanced that they are trying 
> to
>  write programs for and rebuild modern compilers on ancient equipment

Old equipment is retro in rich places, but we built the first
public-access email lab from stuff found in skips in inner-city Sicily
and were very glad that Slackware, Debian and so on ran on 386s and
486s. At that time "everyone who was anyone" had Pentium MMXs at
least.

The point about defaults is that the GCC default tends to filter down
into the default for distributions; if GCC had been following the "90%
of people have Pentiums" rule, and the distros followed the default,
our low-budget lab would have been two terminals instead of about a
dozen (or had to run SCO Xenix or something). I'm an ex-UK-university
computing lecturer myself, but I've been both rich and poor, many
times each, so I know how the other half lives.

Incidentally, one of the hackers who used Linux in that Sicilian squat
at the age of 13 has just been accepted to do a computing degree at
Cambridge University, UK.

Not that this *still* has anything to do with the thread, but it's Sunday, so...

Bless

   M


Re: Change x86 default arch for 4.5?

2010-02-21 Thread Martin Guy
On 2/21/10, Dave Korn  wrote:
> On 21/02/2010 20:03, Martin Guy wrote:
>  > The point about defaults is that the GCC default tends to filter down
>  > into the default for distributions;
>   I'd find it surprising if that was really the way it happens; don't
>  distributions make deliberate and conscious decisions about binary standards
>  and things like that?

Changing the default without losing that compatibility would assume
that every distro (and there are hundreds of them) either already
specifies a specific arch or that its GCC maintainer notices the
change in GCC and adds explicit configuration options to revert the
change. The big ones with dedicated maintainers for GCC probably
already do that; others just configure and make the standard distro
and take what comes.

On 2/21/10, H.J. Lu  wrote:
> There is nothing which stops them from using -march=i386. It just may not
>  be the default.
There is: the arch that the libraries in their distro were compiled to run on.

On 2/21/10, Steven Bosscher  wrote:
> On Sun, Feb 21, 2010 at 9:22 PM, Erik Trulsson  wrote:
>  > One of the great advantages of much free/open software is the way it
>  > will work just fine even on older hardware.
>  And, let's face it, most users of gcc don't use it because it is free
>  software but because it performs just fine for them. And when it does
>  not, they just as easily switch to another compiler.
Hardly. At present there is a GCC monoculture, both in what is the
standard compiler shipped with most systems and in what compiler
packages will build with, either because the build system uses
GCC-specific flags or because the code uses GCC extensions.

On 2/21/10, Steven Bosscher  wrote:
> Which brings us back to the discussion of satisfying the needs of a
>  tiny minority while hurting the vast majority of users.
There's a difference in quality between the two. The "hurt" is that
powerful modern PCs might take 20% longer to encode a DVD, while the
"need" is that the bulk of software will run at all on their poor
hardware.

It's usual in modern societies to give priority to enabling the
underprivileged to function at all over giving the well-off the
maximum of comfort and speed, but how you value the two aspects
probably depends on your personal experience of the two realities.

On 2/21/10, Dave Korn  wrote:
> On 21/02/2010 21:53, Steven Bosscher wrote:
>  > Yes, of course -- but what is the advantage of using the latest GCC
>  > for such an older processor?
>   Tree-SSA?  LTO?  Fixed bugs?  New languages?  Etc?  I can see plenty of good
>  reasons for it.
Apart from those factors (and one hopes that in general all code
generation improves from release to release), users may not really
have a choice, being most likely to try (or be given) the most recent
stable version of whatever distro, and distros tend to try to ship the
most recent stable gcc in each new release.

Let me add another example from my own experience: In 2001 I was stuck
for months in a crumbling house in the countryside with nothing but an
8MB 25MHz 386 because that's all I had available at the time (green
screen, yay!) and I completed what would have been my postgraduate
degree project, begun in 1985: an unlimited precision floating point
math library in a pure functional language. The fact that I could do
that at all may be due to GCC's "work on the minimum" policy of the
time, both in the distro and on whatever machine David Turner used to
compile the binary-only release of the Miranda interpreter.

If I recall correctly, the default is currently to target and tune for
the 486, and the instructions the 386 lacks are trapped and emulated
in the kernel.

On 2/21/10, Steven Bosscher  wrote:
> Well, as Martin already pointed out (contradicting his own point):
>  Apparently a lot of distributions *do* change the defaults.

That's OK, I don't have The Truth in my pocket. Nor do I have any
quantifiable measure of the number of different systems in use in the
whole world, just a value judgement based on a different set of
experiences of the outcome of restrictive and generous policies in
minimum CPU targeting, which I'm sharing.

My direct experience is that low-end PCs are widely used in societies
where things are hard, and that upstream software developers are
always given the latest, fastest computers to make them more
productive and are unaware of the struggling masses :)

Cheers

  M


Re: Why not contribute? (to GCC)

2010-04-24 Thread Martin Guy
OK, now that stage3 is over I'm thinking of updating the
MaverickCrunch FPU fixes (currently for 4.3) and merging them but
would appreciate some guidance.

There are 26 patches in all and I can't expect anyone to understand
them because they require a good understanding of the FPU and its
hardware bugs (and there are a lot of them!) :) What's the best path?
Create a branch, work there and then merge when done?

I have done the copyright assignment stuff but don't have an account
on gcc.gnu.org. They all affect files under config/arm/ apart from one
testsuite fix and the docs.

The missing part is a huge testsuite for it. I confess I find that
daunting; it is potentially huge in that it replaces a non-working
code generator with a working one, and for the non-working one there
were *no* fpu-specific tests.
Do I really need to write an entire validation suite?

M


merging the maverick FPU patches

2010-04-25 Thread Martin Guy
Now that stage3 is over I'm thinking of updating the
MaverickCrunch FPU fixes (currently for 4.3) and merging them but
would appreciate some guidance.

There are 26 patches in all and I can't expect anyone to understand
them because they require a good understanding of the FPU and its
hardware bugs (and there are a lot of them!) :) What's the best path?
Create a branch, work there and then merge when done?

I have done the copyright assignment stuff but don't have an account
on gcc.gnu.org. They all affect files under config/arm/ apart from one
testsuite fix and the docs.

   M


Re: What to do with hardware exception (unaligned access) ? ARM920T processor

2008-10-01 Thread Martin Guy
On 10/1/08, Vladimir Sterjantov <[EMAIL PROTECTED]> wrote:
>  Processor ARM920T, chip Atmel at91rm9200.
>
>  char c[30];
>  unsigned short *pN = &c[1];
>
>  *pN = 0x1234;

Accesses to shorts on ARM need to be aligned to an even address, and
longs to a 4-byte address. Otherwise the value the access returns
(e.g., for a 4-byte word pointer p) is *(p & ~3) >>> (p & 3), where
>>> is byte rotation, not bit shift. Or it causes a memory fault, if
that's how your system is configured.

If you don't want to make the code portable and you are running a
recent Linux, a fast fix is to
  echo 2 > /proc/cpu/alignment
which should make the kernel trap misaligned accesses and fix them up
for you, with a loss in performance of course. The real answer is to
fix the code...

M


ARM machine description: how are pool_ranges calculated

2008-11-15 Thread Martin Guy
Hi!
   I'd appreciate help with my learner's questions about GCC machine
descriptions, about the ARM code generator.
   I'm trying to fix code generation for the Cirrus MaverickCrunch FPU
by trying to understand several sets of patches, figure out which are
bogus which are buggy and which need reimplementing, and to distill
the essence of them into one "proper" set, but each day I'm ending up
confused by too many things about MDs that I am not certain of, so
some help would be appreciated.
  On with the first question...

ARM machine description and pool ranges:
   How should the value in the pool_range and neg_pool_range
attributes be calculated?
   What precisely do the values represent?

Here's how far I got:
   In the machine description, the pool_range and neg_pool_range
attributes tell how far ahead or behind the current instruction a
constant can be loaded relative to the current instruction.
The most common values are:
  - a sign bit and a 12-bit byte offset for ARM load insns (+/- 0 to
4095 bytes, max usable of 4092 for 4-byte-aligned words)
  - a sign bit and an 8-bit word offset for Maverick and VFP load
insns (+/- 0 to 1020 bytes)
  - other ranges for thumb instructions and iwmmxt, depending on insn
and addressing mode

When the offsets stored in the instructions are used, they refer to
offsets from the address of the instruction (IA) plus 8 bytes. Are the
pool_ranges also calculated from IA+8, from the address of the
instruction itself or even from the address of the following
instruction (IA+4)?

In the md, the most common pairs of values are (-4084, +4096) and
(-1008, +1020), but there are several other values in use for no
obvious reason: +4092, -1004, -1012, +1024.
The +4096 (>4092) suggests that they are not the values as encoded in
the instruction, but are offset by at least 4.
The full useful ranges offset by 8 would give (-4084, +4100) and
(-1016, +1028).

I can't find a mathematically explicit comment about it, and can't
make sense of the values.

In practice, by compiling autogenerated test programs and objdumping them -d:
32-bit integer constants use from [pc, #-4092] to [pc, #4084]
64-bit constants in pairs of ARM registers use from [pc, #-3072] to
[pc, #3068] (??)
Alternating 32- and 64-bit constants use from [pc, #-3412] to [pc, #3404] (???)
64-bit doubles in Maverick registers use from [pc, #-1020] to [pc,
#1008] (these are the exact values specified in the attributes fields
of cirrus.md for the cfldrd insn, without any IA+8 adjustment!)

Two non-issues:
 - 64-bit alignment requirement for 64-bit quantities in EABI is not
applied to the constant pools - 64-bit data is 32-bit aligned there,
so no allowance of a possible extra 4 bytes for alignment is
necessary.
 - the -mcirrus-fix-invalid-insns flag, which peppers the output with
NOPs, causes no problems since the constant pool calculations are done
after the NOP-insertion.

Hoping I haven't just failed to spot some large and obvious comment...

 M


Re: GCC 4.4.0 Status Report (2008-11-27)

2008-12-09 Thread Martin Guy
On 12/9/08, Joel Sherrill <[EMAIL PROTECTED]> wrote:
>  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37440
>
>  Can Ada build on any Arm platform?

The only existing GNAT Ada compiler I could find for ARM (while
thinking about doing it for the new Debian eabi port) is Adacore's
Windows->Nucleus OS crosscompiler for Xscale CPUs, though they don't
say what version of GCC they use. Then my funding ran out.
  My impression at the time, and that of the Debian Ada maintainer,
was that it is a case of no one ever having made the effort to
cross-bootstrap a native linux compiler but that it "should just go".

If anyone cares enough I'm open to offers to try it...

M


Re: help for arm avr bfin cris frv h8300 m68k mcore mmix pdp11 rs6000 sh vax

2009-03-16 Thread Martin Guy
On 3/14/09, Paolo Bonzini  wrote:
> Hans-Peter Nilsson wrote:
>  > The answer to the question is "no", but I'd guess the more
>  > useful answer is "yes", for different definitions of "truncate".
>
> Ok, after my patches you will be able to teach GCC about this definition
>  of truncate.

I expect it's a bit too extreme an example, but I've just found (to my
horror) that the MaverickCrunch FPU truncates all its shift counts to
6-bit signed (-32 (right) to +31 (left)), including on 64-bit
integers, which is not very helpful to compile for.
...unless it happens to be easy to handle "shift count is truncated
to less than the word size" in your new framework.

M


Re: help for arm avr bfin cris frv h8300 m68k mcore mmix pdp11 rs6000 sh vax

2009-03-16 Thread Martin Guy
On 3/16/09, Paolo Bonzini  wrote:
>AND R1, R0, #31
>MOV R2, R2, SHIFT R1
>ANDS R1, R0, #32
>MOVNE R2, R2, SHIFT #31
>MOVNE R2, R2, SHIFT #1
>
>  or
>
>ANDS R1, R0, #32
>MOVNE R2, R2, SHIFT #-32
>SUB R1, R1, R0  ; R1 = (x >= 32 ? 32 - x : -x)
>MOV R2, R2, SHIFT R1

Thanks for the tips. Yes, I was contemplating cooking up something
like that, hobbled by the fact that if you use maverick instructions
conditionally you either have to put seven nops either side of them or
risk death by astonishment.

M


Re: Machine Description Template?

2009-06-09 Thread Martin Guy
On 6/5/09, Graham Reitz  wrote:
> I have been working through sections 16 & 17 of the gccint.info
> document and also read through Hans' 'Porting GCC for Dunces'.

There is also "Incremental Machine Descriptions for GCC"
http://www.cse.iitb.ac.in/~uday/soft-copies/incrementalMD.pdf
which describes creation of a new, clean machine description from scratch

M


Re: What is the best way to resolve ARM alignment issues for large modules?

2010-05-08 Thread Martin Guy
On 5/7/10, Shaun Pinney  wrote:
>  Essentially, we have code which works fine on x86/PowerPC but fails on ARM 
> due
>  to differences in how misaligned accesses are handled.  The failures occur in
>  multiple large modules developed outside of our team and we need to find a
>  solution.  The best question to sum this up is, how can we use the compiler 
> to
>  arrive at a complete solution to quickly identify all code locations which
>  generate misaligned accesses and/or prevent the compiler from generating
>  misaligned accesses?

Dunno about the compiler, but if you use the Linux kernel you can fiddle with
/proc/cpu/alignment.

By default it's set to 0, which silently gives garbage results when
unaligned accesses are made.

echo 3 > /proc/cpu/alignment

will fix those misalignments using a kernel trap to emulate "correct"
behaviour (i.e. loading from bytes (char *)a to (char *)a + 3 in the
case of an int). Alternatively,

echo 5 > /proc/cpu/alignment

will make an unaligned access cause a Bus Error, which usually kills
the process and you can identify the offending code by running it
under gdb.

Eliminating the unaligned accesses is tedious work, but the result
will run slightly faster than relying on fixups, as well as making it
portable to any word-aligned system.

   M


Deprecating ARM FPA support (was: ARM Neon Tests Failing on non-Neon Target)

2010-05-22 Thread Martin Guy
On 5/11/10, Mark Mitchell  wrote:
> Richard Earnshaw wrote:
>
>  > Speaking of which, we should probably formally deprecate the old arm-elf
>  > derived targets in 4.6 so that we can remove them in 4.7.
>
>  > Similarly, we should deprecate support for the FPA on ARM.
>
>  I agree.

No one seems to have succeeded in getting arm-elf to work for some
years, so removing them seems to be no loss.

However, although no one currently sells FPA hardware, it is widely
supported as the only floating point model emulated by the Linux
kernel, and people have to use it when compiling stuff to run on OABI
systems, which include boards currently on the market based on ARMv4
(no t) such as the Balloon Board 2.0 as well as boards with more
recent CPUs where the manufacturer only supplies a Linux port for a
kernel version that predates EABI support such as the Armadillo range.

Dropping FPA support from GCC effectively makes the OABI unusable, and
often we are forced to use that by the environment supplied to us. Are
there significant advantages to removing FPA support, other than
reducing the size of the ARM backend?

 M


Re: Deprecating ARM FPA support

2010-05-24 Thread Martin Guy
On 5/24/10, Mark Mitchell  wrote:
>  > Certainly removing support for FPA (and any targets that require it) as
>  > a first step would be an option; but we should also focus on where we
>  > want to get to.
>
>  I agree with that.  But, it would also be interesting to know just how
>  broken that code is.  If, in fact, FPA and/or ARM ELF mostly work at
>  present, then there's less call for actually removing (as opposed to
>  deprecating) things.

FPA code generation is 100% good AFAIK, and has been used intensively
for years (as the FPU model for all gnu/linux ports before EABI).

Maverick is the one that has never worked since it was submitted; I
have patches that make it 100% good (well, ok, no known failure cases)
but don't know how to get them into mainline.

M


Re: merging the maverick FPU patches

2010-06-01 Thread Martin Guy
On 4/25/10, Ian Lance Taylor  wrote:
> Martin Guy  writes:
>
>  > now that stage3 is over I'm thinking of updating the
>  > MaverickCrunch FPU fixes (currently for 4.3) and merging them but
>  > would appreciate some guidance.
>  >
>  > There are 26 patches in all and I can't expect anyone to understand
>  > them because they require a good understanding of the FPU and its
>  > hardware bugs (and there are a lot of them!) :) What's the best path?
>  > Create a branch, work there and then merge when done?
>  >
>  > I have done the copyright assignment stuff but don't have an account
>  > on gcc.gnu.org. They all affect files under config/arm/ apart from one
>  > testsuite fix and the docs.
>
>
> For a backend specific patch like this, I would recommend contacting
>  the ARM backend maintainers directly to ask how they would prefer to
>  proceed.  They can be found in the top level MAINTAINERS file:
>
>  arm portNick Cliftonni...@redhat.com
>  arm portRichard Earnshawrichard.earns...@arm.com
>  arm portPaul Brook  p...@codesourcery.com

Hi
  I've had no reply from anyone - maybe everyone is hoping someone
else will. :)
Of the three companies, Red Hat would be the most suitable, since the
original unfinished port was done by them, and I guess ARM has no
interest in making GCC work with non-ARM FPUs.
  The code they add/remove/change is pretty self-contained and doesn't
impact the other code generation options. It just fixes the current
implementation.
  Nick, are you willing to do the necessary? Since it just fixes
existing code that never worked, all it requires from a maintainer is
to check that it doesn't break code generation for other targets,
which is easy to check automatically by testing a sample of targets
and it's not hard to check by eye that the changes are only active
when TARGET_MAVERICK.

Cheers

   M


Re: Patch pinging

2010-06-07 Thread Martin Guy
On 6/7/10, NightStrike  wrote:
> On Wed, Jun 2, 2010 at 3:17 PM, Diego Novillo  wrote:
>  > On Wed, Jun 2, 2010 at 14:09, NightStrike  wrote:
>  >
>  >> threads that haven't been addressed.  I offered to Ian to do the same
>  >> thing for the whole mailing list if we can make it a policy that
>  >> people who commit changes do what Kai is doing so that it's clear that
>  >> the thread is done with.  I don't mind throwing a few pings down, and
>  >> I already have the whole ML tagged with a gmail label.
>  >
>  > Seems like a good idea to me.  I do not usually read the list every
>  > day (or every week some times), so if a patch is in my area and I had
>  > not been directly CC'd, it can take me up to 2 weeks to get to it.
>  >
>  > Most of the areas I'm on had good coverage (particularly since I share
>  > much with richi who is a very prolific patch reviewer), so it's not
>  > too much of a problem.
>
>  Ok.  Is one person responding enough for me to start doing that?  I
>  don't know how this sort of approval / acceptance process works for
>  GCC.

Excellent idea, and thanks for volunteering...

M


Re: Patch pinging

2010-06-08 Thread Martin Guy
On 6/8/10, NightStrike  wrote:
> Are you volunteering to write that small script?

Dunno, are you volunteering to write that small script?

You're the only one here actually volunteering an ongoing commitment
of their time to improve GCC's development in this way, it seems (and
mostly just getting vilified for it, for using a bizarre camelcase
name!)

What I expected to happen was that you would start doing what you
envision should happen by hand, and would then get so bored of doing
it that out of laziness you'd automate it somehow. :)

Still, we'll see...

M


Re: Patch pinging

2010-06-09 Thread Martin Guy
>  > Still, we'll see...
>
>  Apparently not :(

Why not? At most, you just need to make sure nothing ever sends mail
to people who think that kind of thing is bozoid...

M


Re: Deprecating ARM FPA support (was: ARM Neon Tests Failing on non-Neon Target)

2010-06-27 Thread Martin Guy
On 6/27/10, Gerald Pfeifer  wrote:
> On Mon, 24 May 2010, Richard Kenner wrote:
>  > I think that's a critical distinction.  I can't see removing a port just
>  > because it's not used much (or at all) because it might be valuable for
>  > historical reason or to show examples for how to do things.
>
>  I'd say a port with
>  zero known users should actually be removed.

FPA is very widely used. From day 0 until 2006 it was the only FP
model emulated by the Linux kernel, and so is required by all
operating systems created up to that date.
  Actively-maintained software distributions and recent ports of Linux
tend to use a different ABI ("EABI") whose default FP model is
user-space softfloat and does not require FPA code generation
(thankfully!). However, there are many existing software distributions
in current use that only support emulated hard FPA instructions. For
ARM boards without mainline Linux support whose manufacturers' kernel
ports predate 2.6.16 it is mandatory, as it also is for users who
just want to compile code for a given existing system that happens not
to be running a recent kernel and userspace.

 M


Re: Error in GCC documentation page

2010-07-08 Thread Martin Guy
> > > But in the C++ standard "integral expression" is more common.

"integral" is an adjective and "integer" is a noun.

"integer expression", though grammatically wrong (or, at best, an
elision of two nouns), is perfectly clear and unambiguous, whereas
"integral expression", though grammatically correct, strikes some
people as "built-in expression" and trips others up as an unfamiliar
and rare word whose meaning is uncertain - for what gain?

Personally, I like "integral expression", but then I'm a
native-English speaker and UK academic with an extended vocabulary.

For world-class documentation, it depends whether it's more important
to be clear and unambiguous to all readers or an object lesson in
type-correct advanced English.
I'd say our friend has pointed out a tiny place where it could be made
a little more effective in the first of these purposes.

   M


Re: sparc elf

2006-06-12 Thread Martin Guy

2006/6/12, Niklaus <[EMAIL PROTECTED]>:

int main()
{
return 3;
}

i compiled it using sparc-elf-gcc -c test.c.
./sparc-elf-ld --entry=main test.o -o a.out
when i executed a.out on sparc machine it segfaulted and dumped core.


I guess 'cos you set entrypoint=main instead of __start or whatever it
is called.
You need to load in crtbegin.o (it may be called crt0.o), which
contains the correct entry point and calls main() and cleans up after
it, not just dive straight into main().

I would guess your segfault was caused by whatever non-existent code
your main() "returned" to :)  You could probably just call _exit(3)
instead.

Or use "sparc-elf-gcc test.c", even better...

M


Re: Contributing to cross-compiling

2008-01-31 Thread Martin Guy
2008/1/31, Manuel López-Ibáñez <[EMAIL PROTECTED]>:
> Nonetheless, if someone decided to go through the hassle of collecting
> tutorials and hints for various cross-compiling configurations in the
> wiki, I think many users will appreciate it. It is still considered by
> many to be a "dark art"[*].

The crosstool project http://kegel.com/crosstool is a humungous shell
script with configuration files that has collected a lot of the
"community wisdom" over the years about the necessary runes to build
cross-compilers for different scenarios and with different
target-cpu/gcc/glibc/OS combinations.

There is also a menu-driven spin-off project, crosstool-ng, which is
less mature but embodies the same set of knowledge.

M


Re: Benchmarks: 7z, bzip2 & gzip.

2008-03-04 Thread Martin Guy
2008/2/29, J.C. Pizarro <[EMAIL PROTECTED]>:
> Here are the results of benchmarks of 3 compressors: 7z, bzip2 and gzip, and
>  GCCs 3.4.6, 4.1.3-20080225, 4.2.4-20080227, 4.3.0-20080228 & 4.4.0-20080222.

Thanks, that's very interesting. I had noticed 4.2 producing 10%
larger and 10% slower code for a sample code fragment for ARM but
couldn't follow it up.

Is there a clause in regressions for "takes longer to compile and
produces worse code"?

M


wot to do with the Maverick Crunch patches?

2008-03-30 Thread Martin Guy
Ok, so we all have dozens of these EP93xx ARM SoCs on cheap boards,
with unusable floating point hardware.

What do we have to do to get the best-working GCC support for Maverick
Crunch FPU?

Suggest: make an open-source project with the objective "to get the
best-working GCC support for the Maverick Crunch FPU". Anyone wanna
run one, create repositories, set up a mailing list etc. a la
producingoss.com, or is the current infrastructure sufficient for a
coordinated effort?
Host the sets of patches under savannah.gnu.org and endeavour to unite them?
Do we have a wiki for it, other than debian's ArmEabiPort and the wikipedia?

As I understand it, mainline GCC with patches in various versions can give:

futaris-4.1.2/-4.2.0: Can usually use floating point in hardware for C
and C++, maybe problems with exception unwinding in C++. In generated
ASM code, all conditional execution of instructions is disabled except
for jump/branch. Loss of code speed/size: negligible.
Passes most FP tests but does not produce a fully working glibc (I
gather from the Maverick OpenEmbedded people)

cirrus-latest: Conditional instructions are enabled but you can still
get inaccurate or junk results at runtime due to timing bugs in FP
hardware triggered by certain types of instructions being a certain
distance apart at runtime. Does not pass all floating point math
verification tests either, but does worse than futaris.
Cirrus also have a hand-coded Maverick libm that you can link with
old-ABI binaries - can we incorporate this asm code in mainline?

Thoughts on a postcard please... any further progress in OE land?

   M


Re: [linux-cirrus] Re: wot to do with the Maverick Crunch patches?

2008-03-30 Thread Martin Guy
On 3/30/08, Brian Austin <[EMAIL PROTECTED]> wrote:
>  I am now doing Linux ALSA/SoC work for our low power audio codecs.
Good luck, look forward to using them... :)

>  I have been given the freedom with this new
>  position to allow access to this machine for outside people to
>  contribute whatever works they would like.
>
>  I can add a wiki, or whatever ya'll want if you wish to use our hardware
>  and pipeline for WWW things.  I also have GIT, BugZilla, and some other
>  stuff.

What URL is to be its "home page"?...

   M


Re: [linux-cirrus] Re: wot to do with the Maverick Crunch patches?

2008-03-31 Thread Martin Guy
> The company I work for is about to release a board to PCB fab
>  with a Cirrus part on it.  If this is the case we may want to hold
>  back on the release and switch ARM parts.

If it's the EP93xx, you'd be well-advised to do so; I gather there is
one similar competitor that doesn't waste silicon on a broken FPU, a
display engine that can only do up to 800x600x16 or 1024x768x8 without
getting jumpy (2.6.2X fbdev), and a raster graphic operations unit
that appears to be slower than doing the corresponding bitops in ARM
software.

Don't get me wrong, the thing still bristles with peripherals and
delivers lots of poke for next to no energy, and we are working on
making the most of what we have.

Has anyone tried the NetBSD evbarm port on an EP93xx and added the
framebuffer driver patch I've seen lurking around? Could its frame buffer do
stable higher-res full-colour graphics? The Linux one does them but
the frame jitters about, as if the VDU is being locked out of the RAM
for too long.

>  I guess we'll go after our supplier as well to see what availability on the
>  existing parts will be like

Well, leave some for us :) No, it's still a solid chip that runs for
hundreds of days without a blip and barely gets warm, so I wouldn't
redesign unless you wanted those specific features or are early enough
in the design cycle.

   M


Best version of gnat-4.X port to start a port to arm eabi?

2008-05-01 Thread Martin Guy
Hi!
  I'm about to lower the gangplanks to get a native gnat on ARM EABI
through an unholy succession of cross-compilers, with the object of
getting gnat-4.1, 4.2 and 4.3 into the new Debian port for ARM EABI.

 The only arm-targeted gnat I could find is AdaCore's Windows
cross-compiler for xscale (gag retch) but at least that suggests that
it's possible, and the Debian ADA person made optimistic noises when I
asked, but I thought I'd better consult the oracle first :)
  I've seen the recommendation about using the same version of gnat as
the version you're cross-compiling, and I gather that each version
will natively compile later versions ok, but maybe not the other way
round, so I'm assuming that I need to use an existing x86-native
gnat/gcc to make x86->arm-cross of the same version, then use that
canadianly to make arm-native, then use that to build the debian-gnat
package or the same and later versions.

  At the moment I am assuming to start with 4.1 to get all 3, but I
know that gcj only works on ARM EABI from 4.3, and C++ still has
problems with exceptions (try-catch) on EABI, maybe less so in later
versions (?) So, before I set out on the journey, does anyone know of
gnat-reasons or ARM EABI-reasons that would make it wiser to start
with a later version than 4.1?
  I confess I know little about Ada except that it has a formal syntax
longer than the bible...

Thanks

M


Re: Best version of gnat-4.X port to start a port to arm eabi?

2008-05-02 Thread Martin Guy
Many thanks for the input.

On 5/2/08, Joel Sherrill <[EMAIL PROTECTED]> wrote:
>  Do you mean the gcc target is arm-eabi?
As well as the host - I need to end up with a native Ada compiler
running on arm-linux-gnueabi.

On 5/1/08, Laurent GUERBY <[EMAIL PROTECTED]> wrote:
> > http://www.rtems.com/wiki/index.php/RTEMSAda
Fab!

> I haven't quite gotten skyeye to the point
>  I trust running testsuites on it completely automated
Aah, skyeye! I've been building and testing on qemu-arm-system since
2006 and it's been rock-solid.

> > The main issue for Ada with respect to other GCC languages
> > is the lack of support of "multilibs".
Fortunately not needed on Debian arm.

> > I don't think you need canadian cross, in the old times there were
> > targets in the Ada Makefile to help moving from a cross to a native
> > compiler.
Interesting, I'll have a look. I had been thinking to build a regular
x-compiler and to use it to cross-compile the ada compiler, but
thinking on't, it should be possible to generate it in one canadian
(or "cross-native") build. Does that sound a reasonable expectation?

>  I don't think that is necessary.   arm-eabi should be very close
>  to working (with newlib as the C library).
Good. However the environment is a given: Debian hence glibc.

>  If you want to compile on a bi-quad Xeon at 3GHz with 16GB of RAM (and
>  many other machines) running debian you can apply for an account on the
>  GCC Compile farm:
Thanks. Incidentally, there's a publicly-accessible 600MHz 512MB ARM
box here running arm-linux-gnueabi Debian, on which anyone wanting to
do ARM testing/dev is welcome to an account.

> > > longer than the bible...
Sorry, that was a quote from "The Song of Hakawatha", worth googling
if you don't know it and fancy a geeky chuckle.

M


Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?

2008-05-10 Thread Martin Guy
On 5/9/08, Paolo Bonzini <[EMAIL PROTECTED]> wrote:
>  The idea is to use integer arithmetic to compute the right exponent, and
> the lookup table to estimate the mantissa.  I used something like this for
> square root:
>
>  1) shift the entire FP number by 1 to the right (logical right shift)
>  2) sum 0x2000 so that the exponent is still offset by 64
>  3) extract the 8 bits from 14 to 22 and look them up in a 256-entry, 32-bit
> table
>  4) sum the value (as a 32-bit integer!) with the content of the table
>  5) perform 2 Newton-Raphson iterations as necessary

It normally turns out to be faster to use the magic integer sqrt
algorithm, even when you have multiplication and division in hardware

unsigned long
isqrt(unsigned long x)
{
    unsigned long op, res, one;

    op = x;
    res = 0;

    /* "one" starts at the highest power of four <= the argument. */
    one = 1UL << 30;  /* second-to-top bit set */
    while (one > op)
        one >>= 2;

    while (one != 0) {
        if (op >= res + one) {
            op = op - (res + one);
            res = res + 2 * one;
        }
        res >>= 1;
        one >>= 2;
    }
    return res;
}

The current soft-fp routine in libm seems to use a variant of this,
but of course it may be faster if implemented using the Maverick's
64-bit add/sub/cmp.

M


Re: GCC 4.2.2 arm-linux-gnueabi: c++ exceptions handling?

2008-09-26 Thread Martin Guy
On 9/26/08, Sergei Poselenov <[EMAIL PROTECTED]> wrote:
> Hello all,
>
>  I've built the above cross-compiler and ran the GCC testsuite.
>  Noted a lot of c++ tests failed with the same output:
>  ...
>  terminate called after throwing an instance of 'int'
>  terminate called recursively

Are you configuring cross glibc with --disable-libunwind-exceptions?
This has been necessary for all ARM EABI cross-compilers I've built so far.

>  Could someone having the 4.2 release series compiler configured for
>  ARM EABI target try this simple test:

I just tried it with the native Debian ARM EABI compiler: gcc-4.2.4,
binutils-2.18.0.20080103, glibc-2.7 and it silently exits(0).
FWIW, their g++-4.2 is also configured with explicit
--disable-sjlj-exceptions, although that seems to be the default.

M