HIRLAM with -ftree-loop-distribution.

2007-12-16 Thread Toon Moene

Sebastian,






Here are, in addition, the numbers for compiling and running HIRLAM with
-ftree-loop-distribution (after applying your patch, obviously).


There is something weird going on with the count of the "loops not 
vectorized" - every successfully vectorized loop gets an additional 
message:


note: not vectorized: vectorization may not beprofitable.

which rather defeats the purpose of the "not vectorized" messages.

In short, almost 1900 more loops are vectorized, but that is of course 
due to the fact that loop distribution *makes* more loops.
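
To illustrate why loop distribution inflates the loop counts, here is a minimal C sketch. It is not taken from HIRLAM; the function and array names are hypothetical, and the arrays are assumed not to overlap. One loop with two independent statements becomes two loops, so the vectorizer reports on two candidates instead of one, and the second candidate no longer shares a body with the recurrence.

```c
#include <stddef.h>

/* Original form: one loop, two statements.  The first statement carries a
   dependence from iteration i-1 to iteration i; the second does not.  */
void
fused (double *a, double *c, const double *b, size_t n)
{
  for (size_t i = 1; i < n; i++)
    {
      a[i] = a[i - 1] + b[i];   /* recurrence on a[] */
      c[i] = 2.0 * b[i];        /* independent of a[] */
    }
}

/* Distributed form: the same work split into two loops.  The split is
   valid here because neither statement reads what the other writes
   (assuming a, b and c do not alias).  */
void
distributed (double *a, double *c, const double *b, size_t n)
{
  for (size_t i = 1; i < n; i++)
    a[i] = a[i - 1] + b[i];     /* still a recurrence */

  for (size_t i = 1; i < n; i++)
    c[i] = 2.0 * b[i];          /* now a simple candidate for vectorization */
}
```

Both functions compute the same result. Whether a given GCC revision actually distributes and then vectorizes such a loop depends on its dependence analysis and cost model.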


In run time it has little (but positive) effect.

Kind regards,

--
Toon Moene - e-mail: [EMAIL PROTECTED] - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.indiv.nluug.nl/~toon/
GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003
Baseline, no source changes:

Mon Dec 10 17:45:19 UTC 2007 (revision 130746)

Compilation flags:

CCFLAGS := -g -O3 $(MACHINECPP) -ffast-math -fno-associative-math -march=native 
-mtune=native -ftree-vectorizer-verbose=2
FCFLAGS := -g -O3 -fbacktrace -ffpe-trap=invalid,zero,overflow -ffast-math 
-fno-associative-math -march=native -mtune=native -ftree-vectorizer-verbose=2

Loops vectorized:
5675
Loops not vectorized:
13705

Timings:
20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK    12.7488 SECONDS
20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK  2445.9609 SECONDS
20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK   259.3362 SECONDS
20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK    12.4408 SECONDS
20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK   305.9351 SECONDS
20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK   262.1124 SECONDS
20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK    12.7448 SECONDS
20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK  2323.3733 SECONDS
20061201_12r/HL_Cycle_2006120112r.html: FORECAST TOOK   412.7058 SECONDS
20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK   264.5685 SECONDS
20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK    12.6648 SECONDS
20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK   306.7352 SECONDS
20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK   261.5164 SECONDS
20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK    12.7688 SECONDS
20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK  2325.3774 SECONDS
20061202_00r/HL_Cycle_2006120200r.html: FORECAST TOOK   413.8739 SECONDS

Baseline, no source changes, with -ftree-loop-linear:

Mon Dec 10 17:45:19 UTC 2007 (revision 130746)

Compilation flags:

CCFLAGS := -g -O3 $(MACHINECPP) -ftree-loop-linear -ffast-math 
-fno-associative-math -march=native -mtune=native -ftree-vectorizer-verbose=2
FCFLAGS := -g -O3 -ftree-loop-linear -fbacktrace 
-ffpe-trap=invalid,zero,overflow -ffast-math -fno-associative-math 
-march=native -mtune=native -ftree-vectorizer-verbose=2

This compilation got one ICE:

rttov_aitosu.f90: In function 'rttov_aitosu':
rttov_aitosu.f90:4: error: definition in block 262 does not dominate use in 
block 134
for SSA_NAME: pretmp.240_59 in statement:
prephitmp.220_58 = PHI 
PHI argument
pretmp.240_59
for PHI node
prephitmp.220_58 = PHI 
rttov_aitosu.f90:4: internal compiler error: verify_ssa failed
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

Worked around by compiling this file without -ftree-loop-linear

Loops vectorized:
5671
Loops not vectorized:
13655

Timings:
20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK    12.5648 SECONDS
20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK  2444.1208 SECONDS
20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK   259.3402 SECONDS
20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK    12.4728 SECONDS
20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK   307.8672 SECONDS
20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK   260.0323 SECONDS
20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK    12.8608 SECONDS
20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK  2310.2485 SECONDS
20061201_12r/HL_Cycle_2006120112r.html: FORECAST TOOK   411.3977 SECONDS
20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK   261.1283 SECONDS
20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK    12.7248 SECONDS
20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK   308.1313 SECONDS
20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK   262.7564 SECONDS
20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK    12.6528 SECONDS
20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK  2336.5620 SECONDS
20061202_00r/HL_Cycle_2006120200r.html: FORECAST TOOK   410.6577 SECONDS
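
A quick way to compare the baseline and -ftree-loop-linear timing lists is to sum them. The following C sketch does that with the numbers copied from the two lists; the helper names are made up for illustration. On these figures the -ftree-loop-linear total comes out roughly 0.1% below the baseline total, which is small enough to be run-to-run noise, so it should not be read as a measured speedup.

```c
#include <stddef.h>

/* "FORECAST TOOK" values in seconds, in the order listed above.  */
static const double baseline[16] = {
  12.7488, 2445.9609, 259.3362, 12.4408, 305.9351, 262.1124, 12.7448,
  2323.3733, 412.7058, 264.5685, 12.6648, 306.7352, 261.5164, 12.7688,
  2325.3774, 413.8739
};

static const double loop_linear[16] = {
  12.5648, 2444.1208, 259.3402, 12.4728, 307.8672, 260.0323, 12.8608,
  2310.2485, 411.3977, 261.1283, 12.7248, 308.1313, 262.7564, 12.6528,
  2336.5620, 410.6577
};

/* Sum N timings.  */
double
total_seconds (const double *t, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += t[i];
  return sum;
}

/* Relative change from BEFORE to AFTER; negative means AFTER is faster.  */
double
relative_change (double before, double after)
{
  return (after - before) / before;
}
```

Calling relative_change (total_seconds (baseline, 16), total_seconds (loop_linear, 16)) gives the overall change as a fraction of the baseline total.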

Baseline, with -ftree-loop-distribution changes:

Mon Dec 10 17:45:19 UTC 2007 (revision 130746M)

Compilation flags:

CCFLAGS := -g -O3 $(MACHINECPP) -ftree-loop-distribution -ffast-math 
-fno-associative-math -march=native -mtune=native -ftree-vectorizer-verbose=2
FCFLAGS := -g -O3 -ftree-loop-distribution -fbacktrace 
-ffpe-t

Re: HIRLAM with -ftree-loop-distribution.

2007-12-16 Thread Uros Bizjak

Hello!

> There is something weird going on with the count of the "loops not
> vectorized" - every successfully vectorized loop gets an additional
> message:
>
> note: not vectorized: vectorization may not beprofitable.

This is due to switching on the vector cost model by default for x86.

BTW: The attached patch fixes the message by adding a space between "be"
and "profitable.". The patch was committed to SVN after bootstrapping on x86_64.


2007-12-16  Uros Bizjak  <[EMAIL PROTECTED]>

   * tree-vect-transform.c (conservative_cost_threshold): Add missing
   space to "not vectorized" message.

Uros.

Index: tree-vect-transform.c
===================================================================
--- tree-vect-transform.c   (revision 130987)
+++ tree-vect-transform.c   (working copy)
@@ -6552,7 +6552,7 @@
       th = (unsigned) min_profitable_iters;
 
   if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
-    fprintf (vect_dump, "not vectorized: vectorization may not be"
+    fprintf (vect_dump, "not vectorized: vectorization may not be "
              "profitable.");
 
   if (th && vect_print_dump_info (REPORT_DETAILS))




Re: Designs for better debug info in GCC

2007-12-16 Thread Alexandre Oliva
On Dec 16, 2007, "Daniel Berlin" <[EMAIL PROTECTED]> wrote:

> There is no portion of the DWARF3 spec which requires you output
> information that is correct or useful. The same way the C standard
> does not require you to write correct programs, only valid ones, the
> DWARF3 spec does not require you to output correct information, only
> information that is encoded properly.

But if a C compiler translated programs to garbage, that would be
wrong.  By the same reasoning, if a Dwarf producer created garbage,
that would be wrong.

It's true that most of Dwarf 3 attributes are optional.  But when it
says "if you output this attribute, its operand must be such and
such", if you output the attribute with operands that don't match the
specification, that's a bug.

> It is certainly a goal of DWARF3 to allow producers to provide correct
> info

Exactly.  And where's the permission to provide incorrect info, rather
than merely leaving it out?

>> I've heard this "intrusiveness" argument be pointed out so many times,
>> by so many people that claim to not have been able to keep up with the
>> thread, and who claim to have not looked at the patches at all, that
>> I'm more and more convinced it's just fear of the unknown than any
>> actual rational evaluation of the impact of the changes.

> Well, no.
> You yourself have shown it to be intrusiveness in the extreme, in the
> very next paragraphs!

> "
> At some point you have to face reality and see that such information
> isn't kept around by magic, it takes some effort, and this effort is
> needed at every location where there are changes that might affect
> debug information.  And that's pretty much everywhere. "

> So, everywhere needs to change. That's pretty intrusive, no?

No.  Looks like selective attention, because you're reasoning out the
part in which I discussed using the strength of the optimizers against
the problem, by letting them do what they are already used to on the
debug information too.

If we add a new RTL code or a new TREE code, is that intrusive because
now every optimization pass will deal with the new node types in very
much the same way they've dealt with other similar node types forever?
Of course not.

And if we have to add a few exceptions here and there to deal with the
specifics of this new node type, does that become too intrusive then?
I don't think so.

Then what's the fuss about the new node types?  Do you want to count
the number of places in which INSN_P remains there, lexically
unchanged, and compare with the number of places in which I've added a
!DEBUG_INSN_P after it?

> Having to stop and think at every point in an optimization about the
> debug info,

Well, sorry, writing compilers is hard.  You have to think about
several things at the same time.  Shall we just go shopping instead?

I'm trying to make it as simple as possible.  The fact that nearly
100% of the code is unchanged seems to indicate to me that it's not
such a bad approach, but if you want something that just magically
works, you're up for much disappointment.

> (having to stop and think about debug info at every single point of
> every single optimization).

Information doesn't come out of thin air, and thin air doesn't
maintain information accurate just because we wish it does.  We have
to work to create and update the information throughout compilation,
at every transformation, and my reasoning is precisely that optimizers
already do this all the time, so why not use them for what we need?

> You don't need to be this intrusive to stop outputting the
> incorrect info we do.

What do you have to back your statement up?

Let me help you: sure we don't.  We can just refrain from outputting
any debug information whatsoever.  Then, it will be compliant with the
standard.  But it won't be useful.

>> I've never seen this documented as such, and we've never worked toward
>> these stated goals.

> Who is we?
> I certainly have worked exactly towards these goals.
> As have almost all the authors of the current debugging info
> framework.

Oh, wow, I guess I just wasn't welcome into the club, because I didn't
get the guidelines book.  How unfortunate, now I have to give up my
plan of doing better and abide by the unpublished and undocumented
goals of some small cabal.  Or do I?

> If you look in the mailing list archives, you will even discover Diego
> is not the first one to have exactly the viewpoint about what should and
> should not be debuggable, and that the community has consistently
> worked towards exactly the viewpoint Diego describes.

I've seen several different viewpoints from "the community".

> Anyway, I give up on reading this thread.  It has turned into a mess.
> You really need to step back

Oh, do I?  Why is that?

> and see that you have not achieved any sort of consensus of what
> levels of optimization should be how debuggable,

Why would I expect to get any consensus on that?  I haven't even
tried, and I won't.  This is not what the i

Re: Rant about ChangeLog entries and commit messages

2007-12-16 Thread Alexandre Oliva
On Dec 16, 2007, NightStrike <[EMAIL PROTECTED]> wrote:

> On 12/15/07, Alexandre Oliva <[EMAIL PROTECTED]> wrote:
>> ... a good example of compliance with the GPL:
>> 
>> 5. Conveying Modified Source Versions.
>> 
>> a) The work must carry prominent notices stating that you modified
>> it, and giving a relevant date.

> Maybe Changelogs should be reserved for important changes.  For
> instance, something like "Fixed a typo" is a complete waste.  I doubt
> anyone looks at a Changelog to see if someone fixed a typo recently or
> at any point in the past.

I've done that, while backporting patches.  Oftentimes there are small
fixes on top of larger patches, and you want to credit those who made
the small fixes, and you want to be sure you caught them next time you
look at the patch.  ChangeLogs for these are useful for this purpose.

> Perhaps there could be some criteria so that not every single iota
> gets a log entry.

How would leaving changes out comply with 5a above?

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: HIRLAM with -ftree-loop-distribution.

2007-12-16 Thread Toon Moene

Uros Bizjak wrote:


>> note: not vectorized: vectorization may not beprofitable.
>
> This is due to switching on vector cost model by default for x86.


Ah, but my hidden critique of the message was: 
-ftree-vectorizer-verbose=2 should *only* tell us:


1. Which loops are vectorized.
2. Which are not - and why (in a single sentence).

For more detailed logging, one should use -ftree-vectorizer-verbose=n 
with n>2, IMNSHO.


Kind regards,

--
Toon Moene - e-mail: [EMAIL PROTECTED] - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.indiv.nluug.nl/~toon/
GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003


libgfortran, libgomp not compiled with BOOT_CFLAGS.

2007-12-16 Thread Toon Moene

L.S.,

Recently, I've begun to bootstrap with make BOOT_CFLAGS="flags", 
basically to get the run time libraries (libgfortran, libgomp) compiled 
with -mcpu=native -mtune=native (the speed of the compiler doesn't 
interest me that much).


However, I see that almost everything is compiled with -mcpu=native 
-mtune=native, *except* the run time libraries ...


Is that because they're target libraries - if so, how would one get them 
compiled "optimally" if not building a cross-compiler ?


Thanks in advance for any insight.

--
Toon Moene - e-mail: [EMAIL PROTECTED] - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.indiv.nluug.nl/~toon/
GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003


Re: HIRLAM with -ftree-loop-distribution.

2007-12-16 Thread Dorit Nuzman
> Uros Bizjak wrote:
>
> >> note: not vectorized: vectorization may not beprofitable.
> >
> > This is due to switching on vector cost model by default for x86.
>
> Ah, but my hidden critique of the message was:
> -ftree-vectorizer-verbose=2 should *only* tell us:
>
> 1. Which loops are vectorized.
> 2. Which are not - and why (in a single sentence).
>
> For more detailed logging, one should use -ftree-vectorizer-verbose=n
> with n>2, IMNSHO.
>

yes, you are right. this printing should be either removed (as it's anyhow
already being printed also under REPORT_DETAILS), or we may want to add a
new verbosity level (lower than REPORT_DETAILS) for cost-model info
("REPORT_COST").
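
As a rough illustration of the second option, here is a toy C model of level-gated dump printing. It is only a sketch under stated assumptions: the enum, its ordering, and the function names are invented for this example and are not the definitions in the GCC sources. It assumes a message tagged with a level is printed only when the requested verbosity is at least that level, and that the new cost level sits just above the "unvectorized loops" level.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy verbosity levels, ordered from least to most verbose.  The names and
   their order are assumptions made for this sketch, not GCC's definitions.  */
enum toy_verbosity
{
  TOY_REPORT_NONE,
  TOY_REPORT_VECTORIZED_LOOPS,
  TOY_REPORT_UNVECTORIZED_LOOPS,
  TOY_REPORT_COST,      /* hypothetical stand-in for the proposed level */
  TOY_REPORT_DETAILS
};

/* A message tagged LEVEL is printed only if the user requested at least
   that much verbosity.  */
static bool
toy_print_p (enum toy_verbosity requested, enum toy_verbosity level)
{
  return level <= requested;
}

/* Report on one successfully vectorized loop; return the number of
   messages written to OUT.  */
int
toy_report_vectorized_loop (FILE *out, enum toy_verbosity requested)
{
  int printed = 0;

  if (toy_print_p (requested, TOY_REPORT_VECTORIZED_LOOPS))
    {
      fputs ("note: loop vectorized\n", out);
      printed++;
    }
  if (toy_print_p (requested, TOY_REPORT_COST))
    {
      fputs ("note: cost model: vectorization may not be profitable\n", out);
      printed++;
    }
  return printed;
}
```

With this gating, a request at the "unvectorized loops" level yields one message per vectorized loop, and the cost-model remark only appears once the requested level reaches the cost level.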

dorit

> Kind regards,
>
> --
> Toon Moene - e-mail: [EMAIL PROTECTED] - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
> At home: http://moene.indiv.nluug.nl/~toon/
> GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003



Re: HIRLAM and -ftree-loop-linear

2007-12-16 Thread Toon Moene

Sebastian Pop wrote:

> I wrote:
>
>> rttov_aitosu.f90: In function 'rttov_aitosu':
>> rttov_aitosu.f90:4: error: definition in block 262 does not dominate use in
>> block 134
>>
>> Worked around by compiling this file without -ftree-loop-linear
>
> Could you verify that the attached patch fixes also this problem?


Unfortunately, it doesn't; I get exactly the same error message as before.

Kind regards,

--
Toon Moene - e-mail: [EMAIL PROTECTED] - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.indiv.nluug.nl/~toon/
GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003


RE: Help with another constraint

2007-12-16 Thread Hans-Peter Nilsson
On Wed, 12 Dec 2007, Dave Korn wrote:

> On 12 December 2007 12:14, Revital1 Eres wrote:
>
> > It seems that the pair m and I is missing (which indicate the memory =
> > constant instruction).
>
>   So doesn't the question then become "Why isn't reload reloading the constant
> into a register"?

Yes.  And the answer AFAIK is: because it doesn't see a way to
move a constant into a register; it understands "r", not "p" and
"q".

So bviyer, add an "r" alternative.  See also the "*" and "#"
qualifiers.  No need for bogus 0 -to- memory alternatives.

brgds, H-P


Re: HIRLAM and -ftree-loop-linear

2007-12-16 Thread Dorit Nuzman
> Sebastian,
>
> Here are (attached) results for testing HIRLAM with and without
> -ftree-loop-linear.
>
> As you can see, the results are neutral:  4 loops fewer vectorized, but
> about 50 fewer recognized.
>

any chance you kept the dumps and can report which loops were not
vectorized/recognized with -ftree-loop-linear (so we could see if these
represent missed vectorization opportunities?)

thanks,
dorit

> Now I'd like to redo that test with -ftree-loop-distribution.  Can you
> send me a patch against the trunk (otherwise it won't be a fair
> comparison).
>
> Kind regards,
>
> --
> Toon Moene - e-mail: [EMAIL PROTECTED] - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
> At home: http://moene.indiv.nluug.nl/~toon/
> GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003
> Baseline, no source changes:
>
> Mon Dec 10 17:45:19 UTC 2007 (revision 130746)
>
> Compilation flags:
>
> CCFLAGS := -g -O3 $(MACHINECPP) -ffast-math -fno-associative-math
> -march=native -mtune=native -ftree-vectorizer-verbose=2
> FCFLAGS := -g -O3 -fbacktrace -ffpe-trap=invalid,zero,overflow
> -ffast-math -fno-associative-math -march=native -mtune=native
> -ftree-vectorizer-verbose=2
>
> Loops vectorized:
> 5675
> Loops not vectorized:
> 13705
>
> Timings:
> 20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK    12.7488 SECONDS
> 20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK  2445.9609 SECONDS
> 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK   259.3362 SECONDS
> 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK    12.4408 SECONDS
> 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK   305.9351 SECONDS
> 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK   262.1124 SECONDS
> 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK    12.7448 SECONDS
> 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK  2323.3733 SECONDS
> 20061201_12r/HL_Cycle_2006120112r.html: FORECAST TOOK   412.7058 SECONDS
> 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK   264.5685 SECONDS
> 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK    12.6648 SECONDS
> 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK   306.7352 SECONDS
> 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK   261.5164 SECONDS
> 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK    12.7688 SECONDS
> 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK  2325.3774 SECONDS
> 20061202_00r/HL_Cycle_2006120200r.html: FORECAST TOOK   413.8739 SECONDS
>
> Baseline, no source changes, with -ftree-loop-linear:
>
> Mon Dec 10 17:45:19 UTC 2007 (revision 130746)
>
> Compilation flags:
>
> CCFLAGS := -g -O3 $(MACHINECPP) -ftree-loop-linear -ffast-math
> -fno-associative-math -march=native -mtune=native -ftree-vectorizer-verbose=2
> FCFLAGS := -g -O3 -ftree-loop-linear -fbacktrace
> -ffpe-trap=invalid,zero,overflow -ffast-math -fno-associative-math
> -march=native -mtune=native -ftree-vectorizer-verbose=2
>
> This compilation got one ICE:
>
> rttov_aitosu.f90: In function 'rttov_aitosu':
> rttov_aitosu.f90:4: error: definition in block 262 does not dominate
> use in block 134
> for SSA_NAME: pretmp.240_59 in statement:
> prephitmp.220_58 = PHI 
> PHI argument
> pretmp.240_59
> for PHI node
> prephitmp.220_58 = PHI 
> rttov_aitosu.f90:4: internal compiler error: verify_ssa failed
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See  for instructions.
>
> Worked around by compiling this file without -ftree-loop-linear
>
> Loops vectorized:
> 5671
> Loops not vectorized:
> 13655
>
> Timings:
> 20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK    12.5648 SECONDS
> 20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK  2444.1208 SECONDS
> 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK   259.3402 SECONDS
> 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK    12.4728 SECONDS
> 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK   307.8672 SECONDS
> 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK   260.0323 SECONDS
> 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK    12.8608 SECONDS
> 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK  2310.2485 SECONDS
> 20061201_12r/HL_Cycle_2006120112r.html: FORECAST TOOK   411.3977 SECONDS
> 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK   261.1283 SECONDS
> 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK    12.7248 SECONDS
> 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK   308.1313 SECONDS
> 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK   262.7564 SECONDS
> 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK    12.6528 SECONDS
> 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK  2336.5620 SECONDS
> 20061202_00r/HL_Cycle_2006120200r.html: FORECAST TOOK   410.6577 SECONDS



Re: HIRLAM with -ftree-loop-distribution.

2007-12-16 Thread Dorit Nuzman
Here's a tentative patch to do that:
- remove the confusing printing "not vectorized: vectorization may not be
profitable" from REPORT_UNVECTORIZED_LOOPS
- instead print "vectorization may not be profitable" under a new verbosity
level REPORT_COST
- change (hopefully all) other cost-model printings to be printed under
REPORT_COST

I'll test it later this week. I assume this kind of thing is OK as stage 3
material (it is a regression fix, because this confusion in the dump reports
was introduced with the cost model patches during 4.3).

dorit

--- tree-vect-transform.c   2007-12-16 14:09:20.0 +0200
+++ tree-vect-transform.cost_verbose.c  2007-12-16 16:07:09.0 +0200
@@ -134,7 +134,7 @@
   /* Cost model disabled.  */
   if (!flag_vect_cost_model)
 {
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "cost model disabled.");
   return 0;
 }
@@ -153,7 +153,7 @@
   /*  FIXME: Make cost depend on complexity of individual check.  */
   vec_outside_cost +=
 VEC_length (tree, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo));
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "cost model: Adding cost of checks for loop "
  "versioning to treat misalignment.\n");
 }
@@ -163,7 +163,7 @@
   /*  FIXME: Make cost depend on complexity of individual check.  */
   vec_outside_cost +=
 VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo));
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "cost model: Adding cost of checks for loop "
  "versioning aliasing.\n");
 }
@@ -224,14 +224,14 @@
   if (byte_misalign < 0)
 {
   peel_iters_prologue = vf/2;
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "cost model: "
  "prologue peel iters set to vf/2.");

   /* If peeling for alignment is unknown, loop bound of main loop becomes
  unknown.  */
   peel_iters_epilogue = vf/2;
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "cost model: "
  "epilogue peel iters set to vf/2 because "
  "peeling for alignment is unknown .");
@@ -261,7 +261,7 @@
   if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
 {
   peel_iters_epilogue = vf/2;
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "cost model: "
  "epilogue peel iters set to vf/2 because "
  "loop iterations are unknown .");
@@ -391,7 +391,7 @@
   /* vector version will never be profitable.  */
   else
 {
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "cost model: vector iteration cost = %d "
  "is divisible by scalar iteration cost = %d by a factor "
  "greater than or equal to the vectorization factor = %d .",
@@ -399,7 +399,7 @@
   return -1;
 }

-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 {
   fprintf (vect_dump, "Cost model analysis: \n");
   fprintf (vect_dump, "  Vector inside of loop cost: %d\n",
@@ -425,7 +425,7 @@
then skip the vectorized loop.  */
   min_profitable_iters--;

-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "  Profitability threshold = %d\n",
 min_profitable_iters);

@@ -465,7 +465,7 @@
   vectype = get_vectype_for_scalar_type (TREE_TYPE (reduction_op));
   if (!vectype)
 {
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 {
   fprintf (vect_dump, "unsupported data-type ");
   print_generic_expr (vect_dump, TREE_TYPE (reduction_op), TDF_SLIM);
@@ -520,7 +520,7 @@

   STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = outer_cost;

-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "vect_model_reduction_cost: inside_cost = %d, "
  "outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
  STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
@@ -541,7 +541,7 @@
   /* prologue cost for vec_init and vec_step.  */
   STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = 2 * TARG_SCALAR_TO_VEC_COST;

-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
 fprintf (vect_dump, "vect_model_induction_cost: inside_cost = %d, "
  "outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
  STMT_VINFO_OUTSIDE_OF

Re: porting gcc to tic54x

2007-12-16 Thread Hans-Peter Nilsson
On Wed, 12 Dec 2007, a2220333 wrote:

> hi,
> I have been porting tic54x to gcc. I use gcc-4.2.2 version. I write some 
> simplest c54x.h and c54x.c and a empty md, and I

I think the answer is right there   
  ^^

> compile it to generate the tic54x-gcc compiler.
>
> But when I execute the compiler I generate I got a segmentation fault error. 
> Is there anything must be define in c54x.c or
> c54x.h that could make the simplest compiler with no correct output and no 
> errors? Because I want to add functions from this
> basic port.

If that wasn't the bug, I suggest you start up gdb and step
through cc1, but I'd be surprised if you get anywhere without
the prerequisite move, add, and control flow insns in the .md.

brgds, H-P


Re: HIRLAM and -ftree-loop-linear

2007-12-16 Thread Toon Moene

Dorit Nuzman wrote:


> any chance you kept the dumps and can report which loops were not
> vectorized/recognized with -ftree-loop-linear (so we could see if these
> represent missed vectorization opportunities?)

I haven't, but it wouldn't be too much effort to do this.

I'll try stage 1 tonight - i.e., to establish a base (with the latest 
trunk check-out, not using -ftree-loop-linear), and then subsequently 
using that flag.


Kind regards,

--
Toon Moene - e-mail: [EMAIL PROTECTED] - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.indiv.nluug.nl/~toon/
GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003


RE: Help with another constraint

2007-12-16 Thread Hans-Peter Nilsson
On Sun, 16 Dec 2007, Hans-Peter Nilsson wrote:
> On Wed, 12 Dec 2007, Dave Korn wrote:
>
> > On 12 December 2007 12:14, Revital1 Eres wrote:
> >
> > > It seems that the pair m and I is missing (which indicate the memory =
> > > constant instruction).
> >
> >   So doesn't the question then become "Why isn't reload reloading the 
> > constant
> > into a register"?
>
> Yes.  And the answer AFAIK is: because it doesn't see a way to
> move a constant into a register; it understands "r", not "p" and
> "q".

I think I have to correct myself; register allocation and reload
*should* understand p and q as register constraints, given e.g.
a correct REG_CLASS_FROM_LETTER definition and correct regclass
macros.  The latter were not disclosed and are usually a source
of hard-to-find errors.

Besides, if you can't directly move between p and q (as your
constraints indicate) then as Rask says, you also need to tell
GCC through the secondary-reload mechanisms.

I can't help thinking the best suggestion is for bviyer to
let gdb answer the question by stepping through cc1 instead of
relying on indirect debugging.  That's what people do. ;)

brgds, H-P


Re: HIRLAM with -ftree-loop-distribution.

2007-12-16 Thread Sebastian Pop
On Dec 16, 2007 4:24 AM, Toon Moene <[EMAIL PROTECTED]> wrote:
> Here are, in addition, the numbers for compiling and
> running HIRLAM with -ftree-loop-distribution (after applying your patch,
> obviously).
>
> In short, almost 1900 more loops are vectorized, but that is of course
> due to the fact that loop distribution *makes* more loops.
>
> In run time it has little (but positive) effect.
>

Wow! Thanks for the numbers.  I guess from your message that there
were no ICEs or other problems with the loop distribution patch.

Mark, is the loop distribution patch okay for trunk?

Thanks,
Sebastian
-- 
AMD - GNU Tools


Problem with SSA inlining and default defs

2007-12-16 Thread Eric Botcazou
Hi,

How are SSA inlining and default defs for uninitialized variables supposed to 
interact?  Suppose you have the following situation

     BB0 ...
      |   \
 (ab) |    BB1 s_2 = f(s_1(D))
      |   /
     BB2 s_3 = PHI

in a function that gets inlined into a loop.  The liveness of s_1(D) in BB0 
will propagate to BB2 along the backwards edge and you get overlapping live 
ranges for s_1(D) and s_3.  If s_1(D) is SSA_NAME_OCCURS_IN_ABNORMAL_PHI, the 
compilation will abort during SSA coalescing because they must be coalesced.

This is on the mainline, Ada testcase attached, run 'gnatchop' on it and 
compile at -O -gnatp.

Thanks in advance.

-- 
Eric Botcazou
package Q is

   procedure Read(S : out Integer);
   procedure Restore(S : in out Integer);

end Q;

package P is

   type Int_Ptr is access all Integer;
   procedure Exec(P : Int_Ptr);

end P;

with Q; use Q;

package body P is

   procedure Lock is
      S : Integer;
   begin
      Read(S);
      Restore(S);
   exception
      when others => Restore(S);
   end;

   procedure Exec(P : Int_Ptr) is
   begin
      while P /= NULL loop
         Lock;
      end loop;
   end;

end P;


Re: Problem with SSA inlining and default defs

2007-12-16 Thread Jakub Jelinek
On Sun, Dec 16, 2007 at 06:54:29PM +0100, Eric Botcazou wrote:
> How are SSA inlining and default defs for uninitialized variables supposed to 
> interact?  Suppose you have the following situation
> 
>      BB0 ...
>       |   \
>  (ab) |    BB1 s_2 = f(s_1(D))
>       |   /
>      BB2 s_3 = PHI
> 
> in a function that gets inlined into a loop.  The liveness of s_1(D) in BB0 
> will propagate to BB2 along the backwards edge and you get overlapping live 
> ranges for s_1(D) and s_3.  If s_1(D) is SSA_NAME_OCCURS_IN_ABNORMAL_PHI, the 
> compilation will abort during SSA coalescing because they must be coalesced.

This sounds like PR31081.

Jakub


Re: Problem with SSA inlining and default defs

2007-12-16 Thread Eric Botcazou
> This sounds like PR31081.

Indeed, the C++ testcase is the exact translation of my Ada testcase. :-)

The problem seems to arise relatively often in Ada, I think the PR should be 
made "critical".

-- 
Eric Botcazou


Re: libgfortran, libgomp not compiled with BOOT_CFLAGS.

2007-12-16 Thread Serge Belyshev
Toon Moene <[EMAIL PROTECTED]> writes:

> L.S.,
>
> Recently, I've begun to bootstrap with make BOOT_CFLAGS="flags",
> basically to get the run time libraries (libgfortran, libgomp)
> compiled with -mcpu=native -mtune=native (the speed of the compiler
> doesn't interest me that much).
>
> However, I see that almost everything is compiled with -mcpu=native
> -mtune=native, *except* the run time libraries ...
>
> Is that because they're target libraries - if so, how would one get
> them compiled "optimally" if not building a cross-compiler ?

Yeah, try adding the appropriate FCFLAGS to the environment.

CXXFLAGS and GCJFLAGS also affect some of the resulting binaries or libraries
when bootstrapping, and possibly CFLAGS (I don't remember for sure about that one).


Re: Rant about ChangeLog entries and commit messages

2007-12-16 Thread NightStrike
On 12/16/07, Alexandre Oliva <[EMAIL PROTECTED]> wrote:
> On Dec 16, 2007, NightStrike <[EMAIL PROTECTED]> wrote:
>
> > On 12/15/07, Alexandre Oliva <[EMAIL PROTECTED]> wrote:
> >> ... a good example of compliance with the GPL:
> >>
> >> 5. Conveying Modified Source Versions.
> >>
> >> a) The work must carry prominent notices stating that you modified
> >> it, and giving a relevant date.
>
> > Maybe Changelogs should be reserved for important changes.  For
> > instance, something like "Fixed a typo" is a complete waste.  I doubt
> > anyone looks at a Changelog to see if someone fixed a typo recently or
> > at any point in the past.
>
> I've done that, while backporting patches.  Oftentimes there are small
> fixes on top of larger patches, and you want to credit those who made
> the small fixes, and you want to be sure you caught them next time you
> look at the patch.  ChangeLogs for these are useful for this purpose.
>
> > Perhaps there could be some criteria so that not every single iota
> > gets a log entry.
>
> How would leaving changes out comply with 5a above?

It wouldn't without some "creative interpretations".


Re: HIRLAM with -ftree-loop-distribution.

2007-12-16 Thread Toon Moene

Sebastian Pop wrote:

> Wow! Thanks for the numbers.  I guess from your message that there
> were no ICEs or other problems with the loop distribution patch.

Exactly.

--
Toon Moene - e-mail: [EMAIL PROTECTED] - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.indiv.nluug.nl/~toon/
GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003


Re: Problem with SSA inlining and default defs

2007-12-16 Thread Jakub Jelinek
On Sun, Dec 16, 2007 at 07:11:49PM +0100, Eric Botcazou wrote:
> > This sounds like PR31081.
> 
> Indeed, the C++ testcase is the exact translation of my Ada testcase. :-)
> 
> The problem seems to arise relatively often in Ada, I think the PR should be 
> made "critical".

Yeah, to me this looks like the most worrisome P1 4.3 regression.

Jakub


Re: Problem with posix threads

2007-12-16 Thread Lucas Prado Melo
 Please forgive me if this is off-topic:
 I've written a simple test program with posix threads and a 'glibc'
attempt was detected.

 The code:


 -main.c---

 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include "stack.c"

 /*
  * THREAD EXPERIMENT
  *
  * There are various threads:
  * I, II and main
  * thread I repeatedly pushes a value to 'stck' (NTIMES times)
  * thread II then repeatedly pops and shows the value (NTIMES-1 times)
  * main thread then pops the last value and quits
  */
 #define NTIMES 1000
 #define die(msg) do { perror(msg); exit(1); } while(0)

 void * doPush(void * data);
 void * doPop(void * data);

 struct pack{
 Stack stck;
 pthread_mutex_t mutex;
 };

 int main(int argc, char *argv[]){
 void *trashbin;
 pthread_t peer[2];
 struct pack pck;
 //initialize pck mutex
 if( pthread_mutex_init(&( pck.mutex), NULL) != 0 )
 die("pthread_mutex_init");

 //and make it multi-threaded
 if( pthread_create(&peer[0], NULL, doPush, (void*)&pck) != 0 )
 die("pthread_create");
 if( pthread_create(&peer[1], NULL, doPop, (void*)&pck) != 0 )
 die("pthread_create");
 //wait all threads do their stuff
 pthread_join(peer[0],&trashbin);
 pthread_join(peer[1],&trashbin);

 pthread_mutex_lock( &(pck.mutex) );
 printf("Last one: %c\n", pop( &(pck.stck) ));
 pthread_mutex_unlock( &(pck.mutex) );

 //destroy pck mutex
 if( pthread_mutex_destroy( &( pck.mutex) ) != 0 )
 die("pthread_mutex_destroy");
 return 0;
 }
 void * doPush(void * data){
 struct pack * pck = (struct pack *)data;
 int x;
 for(x=0;x<NTIMES;x++){
 pthread_mutex_lock( &(pck->mutex) );
 push( &(pck->stck), (void*)chr );
 pthread_mutex_unlock( &(pck->mutex) );
 }
 pthread_exit( NULL );
 }
 void * doPop(void * data){
 struct pack * pck = (struct pack *)data;
 int x;
 for(x=0;x<(NTIMES-1);x++){
 pthread_mutex_lock( &(pck->mutex) );
 printf("%c ", pop( &(pck->stck) ) );
 pthread_mutex_unlock( &(pck->mutex) );
 }
 printf("\n");
 pthread_exit( NULL );
 }


 --

 -stack.c---
 #ifndef STACK_C
 #define STACK_C 1
 #include 


 struct stack {
 void * el;
 struct stack *next;
 };

 typedef struct stack * Stack;

 void stackInit(Stack * stck){
 *stck = NULL;
 return;
 }
 //pushes an element to the stack
 void push(Stack * stck, void * el){
 struct stack * ne;
 ne = malloc(sizeof(struct stack));
 ne->el = el;
 ne->next = *stck;
 *stck = ne;
 return;
 }

 //pops an element from stack
 //return NULL if there's no element in the stack
 void * pop(Stack * stck){
 struct stack * de;
 void * el;
 de = *stck;
 *stck = (*stck)->next;
 if(de != NULL ){
 el = de->el;
 free(de);
 }
 else
 el = NULL;
 return el;
 }

 #endif
 --
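
 A hedged observation on the code above, not a diagnosis confirmed
 anywhere in this thread: main() never calls stackInit() on pck.stck,
 and pop() reads (*stck)->next before it tests for an empty stack.
 Either could plausibly lead to a crash or a glibc heap-corruption
 report whenever doPop runs ahead of doPush. A minimal sketch of a pop
 that tests first, reusing the poster's types (the name safe_pop, the
 int return value of push and the assert are introduced here purely
 for illustration):

```c
#include <assert.h>
#include <stdlib.h>

struct stack {
    void *el;
    struct stack *next;
};

typedef struct stack *Stack;

/* Must be called once before the stack is shared between threads. */
void stackInit(Stack *stck)
{
    assert(stck != NULL);
    *stck = NULL;
}

/* Pushes an element; returns 0 on success, -1 if malloc fails. */
int push(Stack *stck, void *el)
{
    struct stack *ne = malloc(sizeof *ne);
    if (ne == NULL)
        return -1;
    ne->el = el;
    ne->next = *stck;
    *stck = ne;
    return 0;
}

/* Pops an element; returns NULL if the stack is empty.
   The emptiness test happens before any dereference. */
void *safe_pop(Stack *stck)
{
    struct stack *de = *stck;
    void *el;

    if (de == NULL)
        return NULL;
    el = de->el;
    *stck = de->next;
    free(de);
    return el;
}
```

 In main(), a call such as stackInit(&pck.stck) would go before the
 pthread_create calls. doPop would also have to cope with a NULL
 result, since nothing in the program makes it wait for doPush.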


unable to find a register to spill in class ‘MD_REGS’

2007-12-16 Thread Rodrigo González Alberquilla
Dear GCC Developers/Users,

I am working on a port of a target backend to PISA architecture (a MIPS-IV like 
ISA used by the SimpleScalar simulator). When compiling libgcc2 for __muldi3:

#ifdef L_muldi3
DWtype
__muldi3 (DWtype u, DWtype v)
{
  const DWunion uu = {.ll = u};
  const DWunion vv = {.ll = v};
  DWunion w = {.ll = __umulsidi3 (uu.s.low, vv.s.low)};

  w.s.high += ((UWtype) uu.s.low * (UWtype) vv.s.high
   + (UWtype) uu.s.high * (UWtype) vv.s.low);

  return w.ll;
}
#endif

I get the following error:
../.././gcc/libgcc2.c: In function ‘__muldi3’:
../.././gcc/libgcc2.c:542: error: unable to find a register to spill in class 
‘MD_REGS’
../.././gcc/libgcc2.c:542: error: this is the insn:
(insn 37 36 38 2 (set (reg:DI 116)
(mult:DI (zero_extend:DI (reg:SI 3 v1 [orig:117 __ul ] [117]))
(zero_extend:DI (reg:SI 2 v0 [orig:118 __vl ] [118])))) 14
{umulsidi3_32bit_internal} (nil)
(expr_list:REG_DEAD (reg:SI 3 v1 [orig:117 __ul ] [117])
(expr_list:REG_DEAD (reg:SI 2 v0 [orig:118 __vl ] [118])
(nil))))

I have compiled it with -da and the dumps are:

greg:
Spilling for insn 37.
Using reg 4 for reload 0
reload failure for reload 1

Reloads for insn # 37
Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, 
secondary_reload_p
Reload 1: reload_out (DI) = (reg:DI 116)
MD_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
reload_out_reg: (reg:DI 116)
secondary_out_reload = 0

lreg:
[...]
  Register 116 costs: LEA_REGS:1000 GR_REGS:1000 MEM:8000
[...]
Register 116 used 2 times across 2 insns in block 2; set 1 time; 8 bytes; 
NO_REGS or none.
[...]
(insn 37 36 38 2 (set (reg:DI 116)
(mult:DI (zero_extend:DI (reg:SI 117 [ __ul ]))
(zero_extend:DI (reg:SI 118 [ __vl ])))) 14
{umulsidi3_32bit_internal} (nil)
(expr_list:REG_DEAD (reg:SI 117 [ __ul ])
(expr_list:REG_DEAD (reg:SI 118 [ __vl ])
(nil))))
[...]

I have read a thread from someone with a similar problem, but it has not
helped me. If anybody could tell me where the problem might be, I would be
very grateful.

Regards,
Rodrigo González


Re: Designs for better debug info in GCC

2007-12-16 Thread Mark Mitchell
Alexandre Oliva wrote:

>> Yes, please.  I would very much like to see an abstract design
>> document on what you are trying to accomplish.
> 
> Other than the ones I've already posted, here's one:
> 
> http://dwarfstd.org/Dwarf3Std.php
> 
> Seriously.  There is a standard for this stuff. 

That's the specification for the encoding format.  I agree with you that
emitting incorrect debugging information, in the sense of declaring that
the location of a variable is in one place, even though its value is not
available in that place, is bad.  In -O0 code, I consider it a serious bug.

In -O2 code, I think it's still a bug, but with our current
infrastructure, we may have little choice: we either deny all knowledge
of the variable's location, or give one that's sometimes incorrect.
Which alternative is better depends on what you're trying to do with the
information; for interactive debugging, mostly-right is probably better
than nothing, whereas for some programmatic activities, the opposite may
be true.

If your goal is to avoid the information ever being wrong -- without
worrying about whether it is complete -- there is of course a trivial
solution: do not emit the information.  That is not a serious
suggestion, but it does provide a path to a serious suggestion, which I
gave earlier: conservatively emit location information you provide based
on what you can prove at the time you generate debugging information.
For example, if the value of "x" is in a register, and you cross a call
which might clobber that register value, then emit debugging information
that says that at that point the value is unavailable.  You could
probably do this kind of thing with relatively few changes to the GCC
internal representation; you would run a pass before debug-information
generation that attempted to prove dataflow properties about variables
and told you where values could reliably be found.
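
The conservative scheme described above can be illustrated with a toy
model. This is only a sketch under invented assumptions: the insn kinds
and the function name mark_available are hypothetical and are not GCC
internals. It walks a straight-line instruction sequence and records,
for one variable held in a register, the points at which a debugger
could trust that register.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction kinds for a straight-line toy program. */
enum insn_kind {
    INSN_SET_VAR,   /* the variable's value is written to its register */
    INSN_CALL,      /* a call that may clobber the register */
    INSN_OTHER      /* anything that leaves the register alone */
};

/* For each insn i, available[i] becomes 1 if the variable's value is
   known to be in its register just after insn i executes, else 0.
   The value is unknown on entry and becomes unknown after any call. */
void mark_available(const enum insn_kind *insns, size_t n, int *available)
{
    int known = 0;
    size_t i;

    assert(insns != NULL && available != NULL);
    for (i = 0; i < n; i++) {
        switch (insns[i]) {
        case INSN_SET_VAR:
            known = 1;
            break;
        case INSN_CALL:
            known = 0;   /* conservative: assume the call clobbers it */
            break;
        case INSN_OTHER:
            break;
        }
        available[i] = known;
    }
}
```

A debug-information writer could then emit a location for the variable
only over the ranges marked 1 and report it as unavailable elsewhere,
which gives up completeness but never claims a location that is wrong.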

Your earlier messages, however, suggest that you are trying to do
something harder: emit information that is essentially both complete (in
the sense of providing as much information as possible about the
locations and values of variables) and correct (in the sense of never
giving incorrect information).  If you want to do that, you're going to
have to answer the harder questions, like "what line number corresponds
to this address?" and "what should the debugging information say that
the value of a variable is when it has been optimized away?"

If that's still your goal, then pointing at the DWARF3 specification
doesn't help.  Diego and I are asking you to confront these fundamental
questions about what information you want to provide and what the
correctness criteria are.

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: Problem with posix threads

2007-12-16 Thread Lucas Prado Melo
Why does it happen?


Re: Problem with posix threads

2007-12-16 Thread Daniel Jacobowitz
On Sun, Dec 16, 2007 at 07:20:37PM -0300, Lucas Prado Melo wrote:
> Why does it happen?

This list is for the development of GCC.  Try gcc-help or some other
programming forum, please.

-- 
Daniel Jacobowitz
CodeSourcery


Re: Designs for better debug info in GCC

2007-12-16 Thread Daniel Berlin
> It is obvious that you misunderstood what I want, and how intrusive
> the approach is.
>

Yes Alexandre, everyone who disagrees with you must not understand!
That's really the problem here.
None of us understand but you.


Re: Designs for better debug info in GCC

2007-12-16 Thread Joe Buck
On Sun, Dec 16, 2007 at 08:12:07PM -0500, Daniel Berlin wrote:
> > It is obvious that you misunderstood what I want, and how intrusive
> > the approach is.
> >
> 
> Yes Alexandre, everyone who disagrees with you must not understand!
> That's really the problem here.
> None of us understand but you.

I have some sympathy for going in Alexandre's direction, in that it
would be nice to have a mode that provided optimization as well as
accurate debugging.  However, since preserving accurate debug information
has a cost, I think it would be better to turn -O1, not -O2, into the
mode that Alexandre wants, where debug information is preserved.  Trying
to rework all optimizations to keep perfect debug information is going
to take forever and make the compiler worse.


Re: Designs for better debug info in GCC

2007-12-16 Thread Geert Bosch


On Dec 16, 2007, at 20:27, Joe Buck wrote:

> I have some sympathy for going in Alexandre's direction, in that it
> would be nice to have a mode that provided optimization as well as
> accurate debugging.  However, since preserving accurate debug information
> has a cost, I think it would be better to turn -O1, not -O2, into the
> mode that Alexandre wants, where debug information is preserved.  Trying
> to rework all optimizations to keep perfect debug information is going
> to take forever and make the compiler worse.


Right, at the moment -O1 is far too much like -O2.
There is room for an optimization mode that is mostly local,
scales well for large programs and allows for high-quality debug
information. Fortunately, these goals all seem to match.

We could conceptually have inspection points between each source
statement and declaration, which would roughly correspond to a
use of all memory and all source variables, whether in memory or
in registers.
These inspection points would be considered potentially trapping.

This approach would still allow some scheduling. For example, loads
and arithmetic operations that are known not to trap could still
be done early. On the other hand, when breaking at any statement,
all variables can be printed.

Also, since no user-visible state can be modified by speculatively
executed instructions such as loads, such instructions should not
be tagged with their original source location information.
This would prevent the very annoying and unhelpful jumping around
the program during debugging.

The method I describe here, which roughly corresponds to the semantics
of Ada's "pragma Inspection_Point", seems relatively easy to implement
using an empty "asm" or similar.
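
The empty-asm idea can be sketched in C as follows. This assumes GCC's
extended asm syntax; the macro name INSPECTION_POINT and the function
sum_to are made up for this illustration, and whether the approach
yields good debug information in practice is exactly what this thread
is debating.

```c
#include <assert.h>

/* Hypothetical macro: an empty asm that takes `var` as an input
   operand and clobbers memory.  The compiler must treat it as a use
   of `var` and of memory at this point, so the last store to `var`
   before it cannot be removed as dead. */
#define INSPECTION_POINT(var) \
    __asm__ __volatile__ ("" : : "g" (var) : "memory")

/* Sums 1..n, with an inspection point after each statement of
   interest. */
int sum_to(int n)
{
    int total = 0;
    int i;

    assert(n >= 0);
    for (i = 1; i <= n; i++) {
        total += i;
        INSPECTION_POINT(total);
        INSPECTION_POINT(i);
    }
    return total;
}
```

The asm emits no instructions, so a call such as sum_to(4) still simply
returns 10; the only intended effect is on what the optimizers may
assume about total and i at each marked point.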

  -Geert

PS. For convenience, I'm including a snippet of the Ada 2005 standard,
the full version of which is freely available on the web.


H.3.2 Pragma Inspection_Point

1     An occurrence of a pragma Inspection_Point identifies a set of
      objects each of whose values is to be available at the point(s)
      during program execution corresponding to the position of the
      pragma in the compilation unit. The purpose of such a pragma is to
      facilitate code validation.

                              Syntax

2     The form of a pragma Inspection_Point is as follows:

3       pragma Inspection_Point[(object_name {, object_name})];

                          Legality Rules

4     A pragma Inspection_Point is allowed wherever a declarative_item or
      statement is allowed. Each object_name shall statically denote the
      declaration of an object.

                         Static Semantics

5/2   An inspection point is a point in the object code corresponding to
      the occurrence of a pragma Inspection_Point in the compilation
      unit. An object is inspectable at an inspection point if the
      corresponding pragma Inspection_Point either has an argument
      denoting that object, or has no arguments and the declaration of
      the object is visible at the inspection point.

                         Dynamic Semantics

6     Execution of a pragma Inspection_Point has no effect.

                    Implementation Requirements

7     Reaching an inspection point is an external interaction with
      respect to the values of the inspectable objects at that point
      (see 1.1.3).

                    Documentation Requirements

8     For each inspection point, the implementation shall identify a
      mapping between each inspectable object and the machine resources
      (such as memory locations or registers) from which the object's
      value can be obtained.

      NOTES

9/2   7  The implementation is not allowed to perform "dead store
      elimination" on the last assignment to a variable prior to a point
      where the variable is inspectable. Thus an inspection point has
      the effect of an implicit read of each of its inspectable objects.

10    8  Inspection points are useful in maintaining a correspondence
      between the state of the program in source code terms, and the
      machine state during the program's execution. Assertions about the
      values of program objects can be tested in machine terms at
      inspection points. Object code between inspection points can be
      processed by automated tools to verify programs mechanically.

11    9  The identification of the mapping from source program objects
      to machine resources is allowed to be in the form of an annotated
      object listing, in human-readable or tool-processable form.