Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-27 Thread Christoph Mallon

Marius Strobl wrote:

marius  2008-06-25 21:04:59 UTC

  FreeBSD src repository

  Modified files:
sys/sparc64/include  in_cksum.h 
  Log:

  SVN rev 180011 on 2008-06-25 21:04:59Z by marius
  
  Use "__asm __volatile" rather than "__asm" for instruction sequences

  that modify condition codes (the carry bit, in this case). Without
  "__volatile", the compiler might add the inline assembler instructions
  between unrelated code which also uses condition codes, modifying the
  latter.
  This prevents the TCP pseudo header checksum calculation done in
  tcp_output() from having effects on other conditions when compiled
  with GCC 4.2.1 at "-O2" and "options INET6" left out. [1]
  
  Reported & tested by:   Boris Kochergin [1]

  MFC after:  3 days


This approach seems wrong to me and I think it works only by chance. The 
condition codes ("cc") should be added to the clobbered list of the asm 
statement instead of making the statement volatile: __asm("..." : $OUT : 
$IN : "cc");

This very case is also mentioned in the GCC documentation:
"If your assembler instruction can alter the condition code register,
add `cc' to the list of clobbered registers.  GCC on some machines
represents the condition codes as a specific hardware register; `cc'
serves to name this register.  On other machines, the condition code is
handled differently, and specifying `cc' has no effect.  But it is
valid no matter what the machine." (Section 5.35 Assembler Instructions 
with C Expression Operands)


I wrote a letter directly to Marius about this issue two days ago, but I 
got no response so far. Because this change has a MFC after 3 days, I'm 
writing to this list.


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-27 Thread Christoph Mallon

Marius Strobl wrote:

marius  2008-06-27 22:17:14 UTC

  FreeBSD src repository

  Modified files:
sys/sparc64/include  in_cksum.h 
  Log:

  SVN rev 180073 on 2008-06-27 22:17:14Z by marius
  
  Improve r180011 by explicitly adding the condition codes to the

  clobber list.


You should remove the volatile specifier. For example volatile prevents 
common subexpression elimination and other types of optimisations.


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-27 Thread Christoph Mallon

Marius Strobl wrote:

I wasn't aware that the clobber list allows to explicitly specify
the condition codes, thanks for the hint. Though it unfortunately
took me longer than two days to verify it's effect on the generated
code; sparc64 could still have been one of the archs where "cc" has
no effect. Besides I don't think using "__volatile" for this is
that wrong, given that the sparc64 code generated by using "cc"
and "__volatile" is nearly identical and given that at least i386
relies on "__volatile" telling GCC that the inline assembler uses
the condition codes since quite some time. So the condition codes
are probably part of what GCC treats as "important side-effects".


If this is true and GCC only handles the eflags on x86 correctly, when 
__volatile is used, but not if "cc" is marked as clobbered, then this is 
clearly a bug.



Regarding the MFC, they don't happen automatically and the change
was not wrong in general so there was no need to hurry :)


I still think, using __volatile only works by accident.  volatile for an 
assembler block mostly means "this asm statement has an effect, even 
though the register specification looks otherwise, so do not optimise 
this away (i.e. no CSE, do not remove if result is unused etc.).



On a related note: Is inline assembler really necessary here? For 
example couldn't in_addword() be written as

static __inline u_short
in_addword(u_short const sum, u_short const b)
{
u_int const t = sum + b;
return t + (t >> 16);
} ?
This should at least produce equally good code and because the compiler 
has more knowledge about it than an assembler block, it potentially 
leads to better code. I have no SPARC compiler at hand, though.


In fact the in/out specification for this asm block looks rather bad:
"=&r" (__ret), "=&r" (__tmp) : "r" (sum), "r" (b) : "cc");
The "&"-modifiers (do not use the same registers as for any input 
operand value) force the compiler to use 4 (!) register in total for 
this asm block. It could be done with 2 registers if a proper in/out 
specification was used. At the very least the in/out specification can 
be improved, but I suspect using plain C is the better choice.


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-28 Thread Christoph Mallon

Bruce Evans wrote:

On Fri, 27 Jun 2008, Christoph Mallon wrote:


Marius Strobl wrote:

marius  2008-06-25 21:04:59 UTC

  FreeBSD src repository

  Modified files:
sys/sparc64/include  in_cksum.h   Log:
  SVN rev 180011 on 2008-06-25 21:04:59Z by marius
Use "__asm __volatile" rather than "__asm" for instruction sequences
  that modify condition codes (the carry bit, in this case). Without
...


This approach seems wrong to me and I think it works only by chance. 
The condition codes ("cc") should be added to the clobbered list of 
the asm statement instead of making the statement volatile: 
__asm("..." : $OUT : $IN : "cc");



This very case is also mentioned in the GCC documentation:
"If your assembler instruction can alter the condition code register,
add `cc' to the list of clobbered registers.  GCC on some machines
represents the condition codes as a specific hardware register; `cc'
serves to name this register.  On other machines, the condition code is
handled differently, and specifying `cc' has no effect.  But it is
valid no matter what the machine." (Section 5.35 Assembler 
Instructions with C Expression Operands)


Does sparc64 do anything good with this?  Later it says that this is
irrelevant for the type type of bug in in_cksum.* (expecting cc to
be preserved across separate asms).  From gcc.info:


I think the bug was the following:
subcc %foo, %bar, %g0 /* SPARC compare */
#APP
/* inline assembler of in_addword() here, which modifies the condition 
codes */

#NO_APP
bpl $somewhere /* condition branch depending on condition code */

The bpl is supposed to jump depending on the condition of the subcc, but 
if the compiler schedules the inline assembler block between the subcc 
and the bpl, it jumps depending on garbage.
It was not expected to preserve the condition codes across separate 
inline assembler blocks, but it was not specified that the single inline 
assembler block did not modify the condition codes.



%  Similarly, you can't expect a sequence of volatile `asm' instructions
% to remain perfectly consecutive.  If you want consecutive output, use a
^
% single `asm'.  Also, GCC will perform some optimizations across a
  ^
% volatile `asm' instruction; GCC does not "forget everything" when it
% encounters a volatile `asm' instruction the way some other compilers do.
% %  It is a natural idea to look for a way to give access to the condition
% code left by the assembler instruction.  However, when we attempted to
% implement this, we found no way to make it work reliably.  The problem
  
% is that output operands might need reloading, which would result in
% additional following "store" instructions.  On most machines, these
% instructions would alter the condition code before there was time to
% test it.  This problem doesn't arise for ordinary "test" and "compare"
% instructions because they don't have any output operands.
% %  For reasons similar to those described above, it is not possible to
 ^
% give an assembler instruction access to the condition code left by
  ^^
% previous instructions.
  ^^


I think the excerpt does not apply here, because the problem is the 
other way round (inline assembler interrupting C, not C interrupting 
multiple inline assembler blocks), see above.



On i386, specifying cc has no effect (the compiler must always assume that
cc is clobbered), and specifying cc in asms is a style bug in FreeBSD.


I have to disagree. Where does the GCC documentation state, that "cc" 
has no effect on x86? It is the other way round: The compiler assumes, 
the condition codes are *not* modified, if it is not explicitely stated. 
Exactly this caused the bug (though here on SPARC), which was tried to 
be solved by volatile. I still am convinced that specifying "cc" in the 
clobber list and not using volatile is the correct solution. If it is a 
style bug to specify "cc", the style should be changed, otherwise you 
cannot use inline assembler correctly.



It took about 15 years (1992-2007) for the bugs in in_cksum.* to be
finally fixed on i386, by following the rule given in gcc.info.  IIRC,
this rule wasn't present in 1992, and wasn't really needed then because
the optimizer wasn't agressive enough to move things, especially volatile
asms, but it has been there for about 10 years.


As far as I can see the rule in the gcc documentation was not followed 
in the x86 version: The assembler block is volatile and "cc" is not 
specified as clobbered. This, imo, is a bug.


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-28 Thread Christoph Mallon

Bruce Evans wrote:

On Sat, 28 Jun 2008, Christoph Mallon wrote:

I still think, using __volatile only works by accident.  volatile for 
an assembler block mostly means "this asm statement has an effect, 
even though the register specification looks otherwise, so do not 
optimise this away (i.e. no CSE, do not remove if result is unused etc.).


Right.  Though I've never seen unnecessary's __volatiles significantly
affecting i386 code.  This is because the code in the asms can't be
removed completely, and can't be moved much either.  With out of order
execution, the type of moves that are permitted (not across dependencies)
are precisely the type of moves that the CPU's scheduler can do or undo
no matter how the compiler orders the code.


I disagree. For example look at the use of in_addword() in 
dev/sk/if_sk.cv in line 2819:

  csum1 = htons(csum & 0x);
  csum2 = htons((csum >> 16) & 0x);
  ipcsum = in_addword(csum1, ~csum2 & 0x);
  /* checksum fixup for IP options */
  len = hlen - sizeof(struct ip);
  if (len > 0) {
return;
  }

The calculation will be executed even if the following if (len > 0) 
leaves the function and the value of ipcsum is unused.
If in_addword() is not marked volatile it can be moved after the if and 
not be executed in all cases. csum1 and csum2 can be moved after the if, 
too.


On a related note: Is inline assembler really necessary here? For 
example couldn't in_addword() be written as

static __inline u_short
in_addword(u_short const sum, u_short const b)
{
   u_int const t = sum + b;
   return t + (t >> 16);
} ?
This should at least produce equally good code and because the 
compiler has more knowledge about it than an assembler block, it 
potentially leads to better code. I have no SPARC compiler at hand, 
though.


Last time I tried on i386, I couldn't get gcc to generate operations
involving carries for things like this, or the bswap instruction from
C code to reorder a word.  gcc4.2 -O3  on i386 now generates for the above:

movzwlb, %eax# starting from b and sum in memory
movzwlsum, %edx
addl%eax, %edx# 32-bit add
movl%edx, %eax
shrl$16, %eax# it does the shift laboriously
addl%edx, %eax
movzwl%ax, %eax# don't really need 32-bit result
# but need something to discard the high bits


If the upper 16 bits are not "looked at" then the final movzwl can be 
optimised away. Many instructions, like add, shl and mul, can live with 
"garbage" in the upper 16 bits. Only if a "bad" instruction, like shr or 
div, is encountered, the upper 16 bits have to be cleared.
The current x86 implementation of in_addword() using inline assembler 
causes the compiler to add a movzwl, too, before the return.



In non-inline asm, I would write this as:

movwsum,%ax
addwb,%ax
adcw$0,%ax
movzwl%ax,%eax


You do not want to use 16bit instructions on modern x86 processors. 
These instructions are slow. Intel states that decoding a 16bit 
operation takes 6 cycles instead of the usual 1. (Intel® 64 and IA-32 
Architectures Optimization Reference Manual, section 2.1.2.2 Instruction 
PreDecode)



Pipelining can make bloated code run better than it looks, but probably
not for the generated code above, since shifts are slow on some i386's
and there is an extra dependency for the extra shift operation.


Shifts were slow on early generations of the Pentium 4. Intel corrected 
this "glitch" in later generations.



In fact the in/out specification for this asm block looks rather bad:
"=&r" (__ret), "=&r" (__tmp) : "r" (sum), "r" (b) : "cc");
The "&"-modifiers (do not use the same registers as for any input 
operand value) force the compiler to use 4 (!) register in total for 
this asm block. It could be done with 2 registers if a proper in/out 
specification was used. At the very least the in/out specification can 
be improved, but I suspect using plain C is the better choice.


Hmm, the i386 version is much simpler.  It just forces use of 2 registers
when 1 is enough (change its constraint for (b) from "r" to "rm" to permit
adcw from either register or memory, so that it can generate the above code
if b happens to be in memory; on most i386's, this optimizes for space but
makes no difference for time).


Sometimes a few less bytes in an inner loop can have dramatic effects. 
Saving a register is always (ok, most of the time) a good idea on x86. 
On the other hand, the function is inlined so proably its operands do 
not originate from memory locations.
But I still think the better solution is to use simple C instead of 
inline assembler.


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-28 Thread Christoph Mallon

Marius Strobl wrote:
On a related note: Is inline assembler really necessary here? For 
example couldn't in_addword() be written as

static __inline u_short
in_addword(u_short const sum, u_short const b)
{
u_int const t = sum + b;
return t + (t >> 16);
} ?
This should at least produce equally good code and because the compiler 
has more knowledge about it than an assembler block, it potentially 
leads to better code. I have no SPARC compiler at hand, though.


With GCC 4.2.1 at -O2 the code generated for the above C version
takes on more instruction than the inline assembler so if one 


On SPARC?  What code does it produce? I have not SPARC compiler at hand.
Even if it is one more instruction, I think the reduced register 
pressure makes more than up for it.



wants to go for micro-optimizing one should certainly prefer the
inline assembler version.


As a compiler construction I can tell you, that regarding optimisation 
there is no such thing as "certainty".
The worst part about inline assembler is, that the compiler knows 
nothing about the instructions in there and has to copy them verbatim. 
For example it can not do any clever things with the two shifts at the 
beginning of the inline assembler block of in_addword().



In fact the in/out specification for this asm block looks rather bad:
"=&r" (__ret), "=&r" (__tmp) : "r" (sum), "r" (b) : "cc");
The "&"-modifiers (do not use the same registers as for any input 
operand value) force the compiler to use 4 (!) register in total for 
this asm block. It could be done with 2 registers if a proper in/out 
specification was used. At the very least the in/out specification can 
be improved, but I suspect using plain C is the better choice.




The "&"-modifiers are necessary as the inline assembler in
question consumes output operands before all input operands
are consumed. Omitting them caused GCC to generate broken
code in the past.


This should work fine and only use two registers (though the compiler 
can choose to use three, if it deems it beneficial):


static __inline u_short
in_addword(u_short const sum, u_short const b)
{
  u_long const sum16 = sum << 16;
  u_long const b16   = b   << 16;
  u_long   ret;

  __asm(
"addcc %1, %2, %0\n\t"
"srl   %0, 16, %0\n\t"
"addc  %0,  0, %0\n"
: "=r" (ret) : "r" (sum16), "r" (b16) : "cc");

  return (ret);
}

But I still prefer the C version.


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-28 Thread Christoph Mallon

Christoph Mallon wrote:
As a compiler construction I can tell you, that regarding optimisation 
there is no such thing as "certainty".


Uh...right... s/construction/constructor/
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-28 Thread Christoph Mallon

Marius Strobl wrote:

On Sat, Jun 28, 2008 at 12:56:08AM +0200, Christoph Mallon wrote:

Marius Strobl wrote:

marius  2008-06-27 22:17:14 UTC

 FreeBSD src repository

 Modified files:
   sys/sparc64/include  in_cksum.h 
 Log:

 SVN rev 180073 on 2008-06-27 22:17:14Z by marius
 
 Improve r180011 by explicitly adding the condition codes to the

 clobber list.
You should remove the volatile specifier. For example volatile prevents 
common subexpression elimination and other types of optimisations.




I had to adjust the constraint strings in this source file
twice now in order to keep GCC from generating broken code,
thus I prefer to be conservative by using a slightly bigger
hammer and leave the "__volatile" in in order to keep these
kind of problems from coming back to haunt us over and over
again. Especially when it comes to something as vaguely
("important side-effects", "access memory in an unpredictable
fashion", etc) documented as the GCC assembler constraints
and thus hard to get right without studying the GCC source
and maybe requiring "__volatile" in the future anyway.


volatile is for stuff, which cannot be expressed as data dependencies in 
the constraints, like writing to a machine status register of the CPU, 
accessing memory mapped hardware registers, which triggers something, 
stuff like that. The code in question only does some adds and shifts on 
registers, which is harmless and covered by the data dependecies of the 
input/output/clobber constraints. volatile is simply the wrong hammer.


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-28 Thread Christoph Mallon

Marius Strobl wrote:

On Sat, Jun 28, 2008 at 02:08:10PM +0200, Christoph Mallon wrote:

Marius Strobl wrote:
On a related note: Is inline assembler really necessary here? For 
example couldn't in_addword() be written as

static __inline u_short
in_addword(u_short const sum, u_short const b)
{
   u_int const t = sum + b;
   return t + (t >> 16);
} ?
This should at least produce equally good code and because the compiler 
has more knowledge about it than an assembler block, it potentially 
leads to better code. I have no SPARC compiler at hand, though.

With GCC 4.2.1 at -O2 the code generated for the above C version
takes on more instruction than the inline assembler so if one 

On SPARC?  What code does it produce? I have not SPARC compiler at hand.
Even if it is one more instruction, I think the reduced register 
pressure makes more than up for it.


Correct, it only uses two registers:

 :
   0:   92 02 00 09 add  %o0, %o1, %o1
   4:   91 32 60 10 srl  %o1, 0x10, %o0
   8:   90 02 00 09 add  %o0, %o1, %o0
   c:   91 2a 20 10 sll  %o0, 0x10, %o0
  10:   91 32 20 10 srl  %o0, 0x10, %o0
  14:   81 c3 e0 08 retl 
  18:   91 3a 20 00 sra  %o0, 0, %o0
  1c:   01 00 00 00 nop 


One more instruction? That's five instructions for the actual 
calculation afaict, just like the inline assembler version. The sra in 
the delay slot should be present in the inline assembler version, too.


This should work fine and only use two registers (though the compiler 
can choose to use three, if it deems it beneficial):


static __inline u_short
in_addword(u_short const sum, u_short const b)
{
  u_long const sum16 = sum << 16;
  u_long const b16   = b   << 16;
  u_long   ret;

  __asm(
"addcc %1, %2, %0\n\t"
"srl   %0, 16, %0\n\t"
"addc  %0,  0, %0\n"
: "=r" (ret) : "r" (sum16), "r" (b16) : "cc");

  return (ret);
}


This is ten instructions with two registers. Where is the
break even regarding instructions vs. registers for sparc64? :)


I still have no SPARC compiler. Ten instructions? All I did was write 
the two shifts in C and adjust the register constraints. It should 
produce identical code.



But I still prefer the C version.



And I prefer to not re-write otherwise working code for
micro-optimizations, there are enough unfixed real bugs


Obviously the inline assembler magic did not work and is/was a real bug.


to deal with. Similarly we should not waste time discussing
how to possibly optimize MD versions even more but rather
spend the time improving the MI version so it's good enough
that using MD versions isn't worth the effort.


The C alternative is MI and in length on par with the inline assembler 
version, isn't it?



Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-28 Thread Christoph Mallon

Bruce Evans wrote:

So by now you are saying that using "__volatile" in this case
is the wrong solution and that using "cc" is a style bug. How
am I supposed to tell the compiler that the inline assembler
alters the condition codes then, which it apparently needs to
know as it at shown to otherwise generate broken code even
when using a single __asm() for the istructions.


Using volatile is wrong in this and most cases.

Use "cc" when it isn't a style bug :-).  I guess this is on some CPUs
including sparc64 now, and in all new asms on i386.


What is the rule for "cc" being a style bug? In style(9) nothing is 
mentioned. The only sensible rule I can think of is "if the assembler 
code destroys the condition codes, add "cc" to the clobber list, so GCC 
is informed about this and can get the data depedencies right".


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-28 Thread Christoph Mallon

Bruce Evans wrote:

I think the bug was the following:
subcc %foo, %bar, %g0 /* SPARC compare */
#APP
/* inline assembler of in_addword() here, which modifies the condition 
codes */

#NO_APP
bpl $somewhere /* condition branch depending on condition code */


Is the non-#APP part all from C code?


Yes, this is an example what happend here: The compiler moved an 
assembler block between code generated from C. The C code uses the 
condition codes, but the assembler code destroys the contents. This 
happens, when "cc" is not in the clobber list, because the C compiler 
(correctly) assumes that the assembler code does not modify the 
condition codes.


Certainly if gcc wants to put asm block in the middle of generated code, 
then

it needs to know all the clobbers.


Exactly. It's always a good idea not to lie to your compiler. (:


% %  For reasons similar to those described above, it is not possible to
 ^
% give an assembler instruction access to the condition code left by
  ^^
% previous instructions.
  ^^


I think the excerpt does not apply here, because the problem is the 
other way round (inline assembler interrupting C, not C interrupting 
multiple inline assembler blocks), see above.


Yes, it doesn't apply to any asm code wrapped in inline functions (except
to the internals of the functions).  No one would expect an inline function
to return results in the condition codes.


This is not about returing values in condition codes or inline 
functions. It is about an assembler block, which simply destroys the 
contents of the condition codes and the compiler not expecting this 
(because nobody told it).


On i386, specifying cc has no effect (the compiler must always assume 
that

cc is clobbered), and specifying cc in asms is a style bug in FreeBSD.


I have to disagree. Where does the GCC documentation state, that "cc" 
has no effect on x86? It is the other way round: The compiler assumes, 
the condition codes are *not* modified, if it is not explicitely 
stated. Exactly this caused the bug (though here on SPARC), which was 
tried to be solved by volatile. I still am convinced that specifying 
"cc" in the clobber list and not using volatile is the correct 
solution. If it is a style bug to specify "cc", the style should be 
changed, otherwise you cannot use inline assembler correctly.


Well, in gcc-1.40, gcc.info says nothing about "cc", apparently because
"cc" didn't exist in gcc-1.40.  gcc-1.40 had to assume that cc was
always clobbered.  I think there are enough old i386 asm statements


GCC 1.4? Would you please bury this corpse again? It smells. (;


around (e.g., almost all in FreeBSD and many in Linux) for gcc to preserve
compatiblity by not changing its assumption (for insignificant benefits


I've seen quite a bit assembler code break when going from 3.x to 4.x, 
because GCC moves assembler blocks more while scheduling and takes the 
given in/out constraints much more seriously.



since gcc doesn't really understand cc in asms).  (Linux-2.6.10/include/


It's not about "understanding cc in asms", it's just about telling GCC 
that the "cc" does not survive the assembler block undamaged.



asm-i386/*.h has 177 __asm__ statements (133 with volatile) with only
11 "cc"'s.  Examples of ones without "cc"'s are everything in string.h,
where things like scasb for strcmp() normally clobber "cc".)

Newer versions of gcc.info have to say to use it in clobber lists, to
support CPUs with a shorter history than i386 and to prepare for changing
the assumptions on i386.


Three fourth of the assembler statements needing volatile seems way to 
high to me. I did not find the strcmp() implementation, which uses 
scasb. I only found strcmp.S, which is not inline assembler and does not 
use scas either, which would have suprised me (nor cmpsb, which would be 
more logical).


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sparc64/include in_cksum.h

2008-06-29 Thread Christoph Mallon

Bruce Evans wrote:

Right.  Though I've never seen unnecessary's __volatiles significantly
affecting i386 code.  This is because the code in the asms can't be
removed completely, and can't be moved much either.  With out of order
execution, the type of moves that are permitted (not across 
dependencies)
  
^^

are precisely the type of moves that the CPU's scheduler can do or undo
no matter how the compiler orders the code.


I disagree. For example look at the use of in_addword() in 
dev/sk/if_sk.cv in line 2819:

 csum1 = htons(csum & 0x);
 csum2 = htons((csum >> 16) & 0x);
 ipcsum = in_addword(csum1, ~csum2 & 0x);
 /* checksum fixup for IP options */
 len = hlen - sizeof(struct ip);
 if (len > 0) {
   return;
 }

The calculation will be executed even if the following if (len > 0) 
leaves the function and the value of ipcsum is unused.
If in_addword() is not marked volatile it can be moved after the if 
and not be executed in all cases. csum1 and csum2 can be moved after 
the if, too.


No, volatile has no effect on whether the above calculation will be
executed, since the early return has no dependencies on the caclulation.


The volatile induces a dependency.


Old versions of gcc used to handle volatile like that, but this changed
in gcc-3 or earlier.  gcc.info now says:

% The `volatile' keyword indicates that the instruction has important
% side-effects.  GCC will not delete a volatile `asm' if it is reachable.
  ^^^


This is not about whether the code is reachable or not (it is 
reachable), it is about whether the result is used (i.e. whether the 
code is dead).



% (The instruction can still be deleted if GCC can prove that
% control-flow will never reach the location of the instruction.)  Note
% that even a volatile `asm' instruction can be moved relative to other
   
% code, including across jump instructions.  For example, on many targets
  


jump != conditional jump.
If it moves the *volatile* asm statement across the if, it would only 
appear on *some* execution paths, which is wrong. It is perfectly fine 
to move it, when the statement is not volatile, though.



Even if gcc didn't move the caclulation, then CPUs with out of order
execution might schedule it so that it is effectively never executed
(most likely by executing it in otherwise-unused pipelines while the
main pipeline returns).  This is valid for the same reasons that gcc
can move the volatile asms -- the return doesn't depend on the result
of the caclulation.


This is for the CPU to decide. If the assembler block really contains 
"important" stuff like memory barriers, writes to machine control 
registers etc., the CPU will not "effectively never execute" the code. 
The compiler does not know this, all it sees is the word "volatile".



The above C code is fairly bad, but generates not so bad code on i386:

% % movl%esi, %eax
% #APP
% xchgb %ah, %al# byte operations can be slow; this one not
# too bad, but I wonder if rorw $8 is better
# (rorl $16 is already used for corresponding
# 32-bit operations) where there is no xchg
# alternative
% #NO_APP


And this again is an example why not to use inline assembler, but let 
the compiler decide this:


unsigned short swap16(unsigned short x)
{
  return x >> 8 | x << 8;
}

is compiled to

swap16:
movl4(%esp), %eax
rolw$8, %ax
ret

The compiler is even able to do optimisations, which it absolutely 
cannot do, if inline assembler is used, example:


unsigned short id(unsigned short x)
{
  return swap16(swap16(x));
}

results in

id:
movl4(%esp), %eax
ret

Maybe the MD htons() macros should be replaced by MI code.


% shrl$16, %esi
% movl%esi, %edx
% #APP
% xchgb %dh, %dl# as above
% #NO_APP
% notl%edx# poor asm code -- the top 16 bits are unused
# except here to stall for merging them with
# the previous byte operation


The compiler simply does not know, that the inline assembler only 
operates on parts of the register. Another reason not to use inline 
assembler:


u_int g(u_short x)
{
  return ~swap16(x);
}

g:
movl4(%esp), %eax
rolw$8, %ax
movzwl  %ax, %eax # avoid stall
notl%eax
ret


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/include _ctype.h

2007-10-29 Thread Christoph Mallon

Andrey A. Chernov wrote:

ache2007-10-27 22:32:28 UTC

  FreeBSD src repository

  Modified files:
include  _ctype.h 
  Log:

  Micro-optimization of prev. commit, change
  (_c < 0 || _c >= 128) to (_c & ~0x7F)
  
  Revision  ChangesPath

  1.33  +1 -1  src/include/_ctype.h


Actually this is rather a micro-pessimisation. Every compiler worth its 
money transforms the range check into single unsigned comparison. The 
latter test on the other hand on x86 gets probably transformed into a 
test instruction. This instruction has no form with sign extended 8bit 
immediate, but only with 32bit immediate. This results in a 
significantly longer opcode (three bytes more) than a single 
(unsigned)_c > 127, which a sane compiler produces. I suspect some RISC 
machines need one more instruction for the "micro-optimised" code, too.
In theory GCC could transform the _c & ~0x7F back into a (unsigned)_c > 
127, but it does not do this (the only compiler I found, which does this 
transformation, is LLVM).

Further IMO it is hard to decipher what _c & ~0x7F is supposed to do.

Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/include _ctype.h

2007-10-31 Thread Christoph Mallon

Andrey Chernov wrote:

On Mon, Oct 29, 2007 at 09:48:16PM +0100, Christoph Mallon wrote:

Andrey A. Chernov wrote:

ache2007-10-27 22:32:28 UTC
  FreeBSD src repository
  Modified files:
include  _ctype.h   Log:
  Micro-optimization of prev. commit, change
  (_c < 0 || _c >= 128) to (_c & ~0x7F)
Revision  ChangesPath
  1.33  +1 -1  src/include/_ctype.h
Actually this is rather a micro-pessimisation. Every compiler worth its 
money transforms the range check into single unsigned comparison. The 
latter test on the other hand on x86 gets probably transformed into a test 
instruction. This instruction has no form with sign extended 8bit 
immediate, but only with 32bit immediate. This results in a significantly 
longer opcode (three bytes more) than a single (unsigned)_c > 127, which a 
sane compiler produces. I suspect some RISC machines need one more 
instruction for the "micro-optimised" code, too.
In theory GCC could transform the _c & ~0x7F back into a (unsigned)_c > 
127, but it does not do this (the only compiler I found, which does this 
transformation, is LLVM).

Further IMO it is hard to decipher what _c & ~0x7F is supposed to do.


1. My variant is compiler optimization level independent. F.e. without 
optimization completely there is no range check transform you talk about 
at all and very long asm code is generated. I also mean the case where gcc 
optimization bug was avoided, removing optimization (like compiling large 
part of Xorg server recently), using non-gcc compilers etc. cases.


Compiling without any optimisations makes the code slow for a zillion 
other reasons (no load/store optimisations, constant folding, common 
subexpression elimination, if-conversion, partial redundant expression 
elimination, strength reduction, reassociation, code placement, and many 
more), so a not transformed range check is really not of any concern.



2. _c & ~0x7F comes right from is{w}ascii() so there is no such enormously
big problems to decifer. I just want to keep all ctype in style.


Repeating cryptic code does not make it better, IMO.

3. I see no "longer opcode (three bytes more)" you talk about in my tests 
(andl vs cmpl was there, no testl).


See the reply to the mail with your code example.

Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/include _ctype.h

2007-10-31 Thread Christoph Mallon

Andrey Chernov wrote:

On Thu, Nov 01, 2007 at 03:23:56AM +1100, Bruce Evans wrote:

In fact, one of the cleanups/optimizations in rev.1.5 and 1.6 by ache
and me was to get rid of the mask.  There was already a check for _c
< 0, so the mask cost even more.  The top limit was 256 instead of
128, so the point about 8bit immediates didn't apply, but I don't know
of any machines where the mask is faster (didn't look hard :-).  OTOH,
_c is often a char or a u_char (it is declared as mumble_rune_t, but
the functions are inline so the compiler can see the original type.
If _c is u_char and u_char is uint8_t, then (_c < 0 || c >= 256) is
always false, so the compiler should generate no code for it.  The top
limit of 256 was preferred so that this optimization is possible.  A
top limit of 128 doesn't work so well.


Please see the tests posted in this thread.


Please see my remarks on the tests.

Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/include _ctype.h

2007-10-31 Thread Christoph Mallon

Andrey Chernov wrote:

On Wed, Oct 31, 2007 at 10:33:49PM +, Alexey Dokuchaev wrote:
For ones who doubts there two tests compiled with -O2. As you may see the 
result is almost identical (andl vs cmpl):

Q.E.D.  How about to restore original, more reader-friendly version then?


1. Reader-friendly version generates long code when absolutely no 
optimization used in compiler (for some reason f.e. to avoid optimization 
bugs).


Code with "absolutely no optimization" is slow for many other reasons.

2. It also breaks common style ctype using for is{w}ascii(). If revert 
this, is{w}ascii() should be rewritted too.


I think, the latter is a good suggestion.

Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/include _ctype.h

2007-10-31 Thread Christoph Mallon

Andrey Chernov wrote:

On Thu, Nov 01, 2007 at 02:44:25AM +0100, Christoph Mallon wrote:
Also the example is still unrealistic: You usually don't multiply chars by 
two. Lets try something more realistic: an ASCII filter


int filter_ascii0(int c)
{
return c < 0 || c >= 128 ? '?' : c;
}

int filter_ascii1(int c)
{
return c & ~0x7F ? '?' : c;
}


We don't need that reaslistic examples, we need only what __isctype() 
does, and it just returns 0 or 1, not 'c'.


Sorry, I don't understand what you want to tell me. I showed, that your 
example is invalid (because of undefined behaviour) and unrealistic, 
therefore I provided a better example on how this condition is used. 
But, of course, let's look at __isctype() in both variants:


#include <_ctype.h>

int my__isctype0(__ct_rune_t _c, unsigned long _f)
{
return (_c & ~0x7F) ? 0 :
!!(_DefaultRuneLocale.__runetype[_c] & _f);
}


int my__isctype1(__ct_rune_t _c, unsigned long _f)
{
return (_c < 0 || _c >= 128) ? 0 :
!!(_DefaultRuneLocale.__runetype[_c] & _f);
}


 :
   0:   8b 4c 24 04 mov0x4(%esp),%ecx
   4:   31 d2   xor%edx,%edx
   6:   f7 c1 80 ff ff ff   test   $0xff80,%ecx
   c:   75 13   jne21 
   e:   8b 44 24 08 mov0x8(%esp),%eax
  12:   85 04 8d 34 00 00 00test   %eax,0x34(,%ecx,4)
  19:   b8 01 00 00 00  mov$0x1,%eax
  1e:   0f 45 d0cmovne %eax,%edx
  21:   89 d0   mov%edx,%eax
  23:   c3  ret
  24:   8d b6 00 00 00 00   lea0x0(%esi),%esi
  2a:   8d bf 00 00 00 00   lea0x0(%edi),%edi

0030 :
  30:   8b 4c 24 04 mov0x4(%esp),%ecx
  34:   31 d2   xor%edx,%edx
  36:   83 f9 7fcmp$0x7f,%ecx
  39:   77 13   ja 4e 
  3b:   8b 44 24 08 mov0x8(%esp),%eax
  3f:   85 04 8d 34 00 00 00test   %eax,0x34(,%ecx,4)
  46:   b8 01 00 00 00  mov$0x1,%eax
  4b:   0f 45 d0cmovne %eax,%edx
  4e:   89 d0   mov%edx,%eax
  50:   c3  ret


Here, again, the value of _c does not die at the condition, so a test 
instruction is used, which results in the expected difference of three 
bytes.


Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/include _ctype.h

2007-11-01 Thread Christoph Mallon

Andrey Chernov wrote:

On Tue, Oct 30, 2007 at 10:03:31AM -1000, Juli Mallett wrote:

* "Andrey A. Chernov" <[EMAIL PROTECTED]> [ 2007-10-27 ]
[ cvs commit: src/include _ctype.h ]

ache2007-10-27 22:32:28 UTC

  FreeBSD src repository

  Modified files:
include  _ctype.h 
  Log:

  Micro-optimization of prev. commit, change
  (_c < 0 || _c >= 128) to (_c & ~0x7F)

Isn't that a non-optimization in code and a minor pessimization of readability?
Maybe I'm getting rusty, but those seem to result in nearly identical code on
i386 with a relatively modern GCC.  Did you look at the compiler output for this
optimization?  Is there a specific expensive instruction you're trying to avoid?
For such thoroughyl bit-aligned range checks, you shouldn't even get a branch
for the former case.  Is there a platform other than i386 I should look at where
the previous expression is more clearly pessimized?  Or a different compiler
than GCC?


For ones who doubts there two tests compiled with -O2. As you may see the 
result is almost identical (andl vs cmpl):

 a.c 
main () {

int c;

return (c & ~0x7f) ? 0 : c * 2;
}
 a.s 
.file   "a.c"
.text
.p2align 4,,15
.globl main
.type   main, @function
main:
leal4(%esp), %ecx
andl$-16, %esp
pushl   -4(%ecx)
movl%eax, %edx
andl$-128, %edx
addl%eax, %eax
cmpl$1, %edx
sbbl%edx, %edx
pushl   %ebp
andl%edx, %eax
movl%esp, %ebp
pushl   %ecx
popl%ecx
popl%ebp
leal-4(%ecx), %esp
ret
.size   main, .-main
.ident  "GCC: (GNU) 4.2.1 20070719  [FreeBSD]"
 a1.c 
main () {

int c;

return (c < 0 || c >= 128) ? 0 : c * 2;


}
 a1.s 
.file   "a1.c"
.text
.p2align 4,,15
.globl main
.type   main, @function
main:
leal4(%esp), %ecx
andl$-16, %esp
pushl   -4(%ecx)
addl%eax, %eax
cmpl$128, %eax
sbbl%edx, %edx
andl%edx, %eax
pushl   %ebp
movl%esp, %ebp
pushl   %ecx
popl%ecx
popl%ebp
leal-4(%ecx), %esp
ret
.size   main, .-main
.ident  "GCC: (GNU) 4.2.1 20070719  [FreeBSD]"


Your example is invalid. The value of c is undefined in this function 
and you see random garbage as result (for example in the code snippet 
you see the c * 2 (addl %eax, %eax) and after that is the cmpl, which 
uses %eax, too). In fact it would be perfectly legal for the compiler to 
always return 0, call abort(), or let demons fly out of your nose.


Also the example is still unrealistic: You usually don't multiply chars 
by two. Lets try something more realistic: an ASCII filter


int filter_ascii0(int c)
{
return c < 0 || c >= 128 ? '?' : c;
}

int filter_ascii1(int c)
{
return c & ~0x7F ? '?' : c;
}

Especially mind that c is not dead after the condition. Even if your 
example did not used an undefined value, the value of c is dead after 
the test, which is unlikely for typical string handling code.


And now the compiled code (GCC 3.4.6 with -O2 -march=athlon-xp 
-fomit-frame-pointer - I used these switches to get more compact code. 
It has no influence on the condition test.):


 :
   0:   8b 54 24 04 mov0x4(%esp),%edx
   4:   b8 3f 00 00 00  mov$0x3f,%eax
   9:   83 fa 7fcmp$0x7f,%edx
   c:   0f 46 c2cmovbe %edx,%eax
   f:   c3  ret

0010 :
  10:   8b 54 24 04 mov0x4(%esp),%edx
  14:   b8 3f 00 00 00  mov$0x3f,%eax
  19:   f7 c2 80 ff ff ff   test   $0xff80,%edx
  1f:   0f 44 c2cmove  %edx,%eax
  22:   c3  ret

You see there is a test instruction used in filter_ascii1, because the 
value in %edx does not die at the test, but is used again in the cmove.


Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/include assert.h

2007-12-01 Thread Christoph Mallon

Poul-Henning Kamp wrote:

phk 2007-12-01 18:56:50 UTC

  FreeBSD src repository

  Modified files:
include  assert.h 
  Log:

  Add missing #ifndef _ASSERT_H_ protection against multiple inclusions


This change violates ANSI C:
"The assert macro is redefined according to the current state of NDEBUG 
each time that  is included." (ISO C99 §7.2)


Please also see the comment in assert.h:
/*
 * Unlike other ANSI header files,  may usefully be included
 * multiple times, with and without NDEBUG defined.
 */

Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/tools/regression/lib/libc/stdio test-printfloat.c test-scanfloat.c

2007-12-03 Thread Christoph Mallon

Dag-Erling Smørgrav wrote:

Andrey Chernov <[EMAIL PROTECTED]> writes:

David Schultz <[EMAIL PROTECTED]> writes:

test-printfloat is failing on my machine due to the fact that the
default locale in FreeBSD seems to have a thousands separator now.
I thought the default locale wasn't supposed to. Can someone
comment?

Default locale don't have thousands separator as you can see:
[...]
Something wrong in your default locale setup, use LC_ALL=C for sure.


If a regression test is dependent on a specific locale, it should set
the locale at startup rather than rely on the system default.


Don't C programs per definition always start in the "C" locale, not 
matter what LC_ALL etc. are set to?


Ah, I just checked the source code: There is a setlocale(LC_NUMERIC, "") 
in there.


Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sbin/init init.c

2008-09-28 Thread Christoph Mallon

Kostik Belousov wrote:

On Sun, Sep 28, 2008 at 12:55:42PM -0700, Xin LI wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Kostik Belousov wrote:

On Sat, Sep 27, 2008 at 12:09:10AM +, Xin LI wrote:

delphij 2008-09-27 00:09:10 UTC

  FreeBSD src repository

  Modified files:
sbin/initinit.c 
  Log:

  SVN rev 183391 on 2008-09-27 00:09:10Z by delphij
  
  Static-ify procedures in init(8).
  
  Revision  ChangesPath

  1.66  +79 -79src/sbin/init/init.c

What is a reason for the change ?

This would reduce the size of generated binary...


I am quite curious. Could you, please, show the numbers.
Is the reduction in size is due to function inlining, or it is purely
symbol table size issue ?


If a function or variable can be made static, it should be. There is 
basically not excuse not to do so. In fact, it is considered a design 
flaw of C, that symbols default to external linkage instead of internal 
by default. The compiler can do more optimisations on things, it knows 
it is the only user of. For example it can change the calling convention 
of functions to pass parameters in registers instead of on the stack, if 
it knows all callers. If a function is not static, it must assume there 
are callers, which it has not control of. Changing the calling 
convention is just one example of possible optimisations. Further it 
reduces the chances of name clashes with functions local to other 
translation units.


Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/netinet tcp_input.c tcp_usrreq.c

2007-07-30 Thread Christoph Mallon

Dag-Erling Smørgrav wrote:

Dag-Erling Smørgrav <[EMAIL PROTECTED]> writes:

The attached patch unbreaks the build.


With additional hunk to fix usr.bin/netstat/ipx.c, which relied on the
brokenness of .


Hi,

may I suggest to declare the array as
static const char * const tcpstates[].
Mind the second const, so not only the strings are const (const char*), 
but the array itself, too.


Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/amd64/amd64 busdma_machdep.c mem.c mp_watchdog.c pmap.c trap.c src/sys/amd64/conf GENERIC src/sys/amd64/isa isa_dma.c src/sys/arm/arm busdma_machdep.c mem.c pmap.c src/sys/arm/

2007-03-31 Thread Christoph Mallon

Kris Kennaway wrote:

On Sun, Apr 01, 2007 at 12:10:03AM -0700, [EMAIL PROTECTED] wrote:
eh? when I went to see what damage had been caused by this megachange I 
found no trace of it at all. Is this an April Fools mail?


Hmm, I think I might have screwed something up :(  CVS meisters, please
help!


Maybe his CVS mirror just hasn't caught up yet. (-:
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/amd64/amd64 busdma_machdep.c mem.c mp_watchdog.c pmap.c trap.c src/sys/amd64/conf GENERIC src/sys/amd64/isa isa_dma.c src/sys/arm/arm busdma_machdep.c mem.c pmap.c src/sys/arm/

2007-03-31 Thread Christoph Mallon

[EMAIL PROTECTED] wrote:



Arg- I meant '[EMAIL PROTECTED]:/home/ncvs'.

I'm still seeing this as a bogus checkin

The header for the mail from 'Kris' is:

Received: from elvis.mu.org (elvis.mu.org [192.203.228.196])
by mx1.freebsd.org (Postfix) with ESMTP id D770A13C43E;
Sun,  1 Apr 2007 07:13:53 + (UTC)
(envelope-from [EMAIL PROTECTED])
Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196])
by elvis.mu.org (Postfix) with ESMTP id 4E8751A4DAC;
Sun,  1 Apr 2007 00:13:53 -0700 (PDT)
Received: by obsecurity.dyndns.org (Postfix, from userid 1000)
id 9EFC6517E2; Sun,  1 Apr 2007 03:13:52 -0400 (EDT)


This is more like Paul Saab's IP address.




freefall.freebsd.org?

On Sun, 1 Apr 2007, Christoph Mallon wrote:


Kris Kennaway wrote:

On Sun, Apr 01, 2007 at 12:10:03AM -0700, [EMAIL PROTECTED] wrote:
eh? when I went to see what damage had been caused by this 
megachange I found no trace of it at all. Is this an April Fools mail?


Hmm, I think I might have screwed something up :(  CVS meisters, please
help!


Maybe his CVS mirror just hasn't caught up yet. (-:







Hm, it could be a temporal paradox. Do you happen to live somewhere 
where it isn't 04:21:44 (the point in time when it was commited) yet? If 
yes then just wait till this time. After all information cannot travel 
faster than the speed of light, otherwise it would end up in the future.
Alternatively this could be a joke by Kris, but I think a temporal 
paradox if far more likely. (-:

___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/kern sched_ule.c

2007-03-17 Thread Christoph Mallon

Max Laier schrieb:

On Saturday 17 March 2007 20:09, Jeff Roberson wrote:

Any language lawyers care to comment on this?


I find this strange.  According to the spec "(Decrementing is equivalent 
to subtracting 1.)", but "pri = --pri % RQ_NQS;" will behave like you 
expect, while "pri = (pri - 1) % RQ_NQS;" clearly didn't.


"pri = --pri % RQ_NQS;" modifies "pri" twice between two sequence 
points. According to the C standard this causes undefined behaviour!

___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/netinet/libalias alias_util.c

2007-12-04 Thread Christoph Mallon

Eygene Ryabinkin wrote:

Tue, Dec 04, 2007 at 05:25:35PM +, Alexey Dokuchaev wrote:

*ptr++ would choke since pointer arith on (void *) is undefined AFAIK.

I've been under impression that ++ on void * whould simply increase it
by one.


This behaviour is documented for GCC:
http://www.mcs.vuw.ac.nz/cgi-bin/info2www?(gcc)Pointer+Arith

Just for the record (gcc 4.2.1):
-
$ gcc -o test -Wall -ansi -pedantic test.c
test.c: In function 'main':
test.c:9: warning: wrong type argument to increment

$ ./test
'2'

$ g++ -o test -Wall -ansi test.c
test.c: In function 'int main()':
test.c:9: error: ISO C++ forbids incrementing a pointer of type 'void*'

$ cat test.c
#include 

int
main(void)
{
char c[] = "123456789abcdef";
void *p = c;

p++;
printf("'%c'\n", *((char *)p));
return 0;
}
-

It seems to me that ++ adds one to the void pointer because it is
demanded by C99 (ISO/IEC 9899:TC2, 6.2.5, requirement 26, page 36)
that 'char *' and 'void *' have the same representation and
alignment requirements.  So, it seems to me that (p++) has implicit
conversion from 'void *' to 'char *' for 'void *p', at least it
can be interpreted in this way.

But some people say that void* arithmetics is GCC'ism.  It worth to
note that the warning about void* arithmetics lived in GCC at least
since 1992: see
http://gcc.gnu.org/viewcvs/trunk/gcc/c-typeck.c?revision=364&view=markup
function 'pointer_int_sum'.

And the problem of 'void *' arithmetics had been touched in the
-current a while ago:
http://lists.freebsd.org/pipermail/freebsd-current/2003-July/006439.html

I am failing to find a place in the C standard where void arithmetics
is prohibited, but I can be blind.  Anyone?


Arithmethic on void pointers is forbidden. The relevant parts of the 
(C99) standard are:


?6.2.5 clause 19: "The void type comprises an empty set of values; it is 
an incomplete type that cannot be completed."


?6.2.5 clause 1: "[...] Types are partitioned into object types [...], 
function types [...], and incomplete types [...].


?6.5.6 clause 2: "For addition [...] one operand shall be a pointer to 
an object type and the other shall have integer type. [...]"
(subtraction has an analogous statement, increment and decrement are 
just addition/substraction by one)


So the conclusion is:
- void* is a pointer to an incomplete type.
- Incomplete types are not object types.
- Addition is only allowed on pointers to an object type.
Therefore arithmetic on void pointers is not allowed. Arithmetic on void 
pointers is indeed a GCCism.


Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src/sys/sys param.h

2007-12-06 Thread Christoph Mallon

Kip Macy wrote:

On Dec 6, 2007 2:25 AM, Bruce Evans <[EMAIL PROTECTED]> wrote:

On Thu, 6 Dec 2007, Kip Macy wrote:


kmacy   2007-12-06 04:00:59 UTC

 FreeBSD src repository

 Modified files:
   sys/sys  param.h
 Log:
 Respect the fact that the value a may be constant so cast to const uint8_t *

 Revision  ChangesPath
 1.318 +2 -2  src/sys/sys/param.h

The correct fix is to back out 1.317.  If not, at least spell `unsigned
char' correctly and fix the other new style bug (a line longer than
80 characters from adding `const').  Using uint8_t is only a style bug
since POSIX probably requires unsigned char to be the same as uint8_t.
If unsigned char is larger than uint8_t, then revs.1.317-318 give
undefined behaviour (aliasing bugs) and clearly broken behaviour (wrong
divisor NBBY).  These bugs are easy to avoid by using the correct
spelling.



I'm inclined to do whatever you say so long as my code works without a
substantial rewrite. However, can you please point me at where it says
uint8_t is not style(9) compliant?


I think, he means that uint8_t is not necessarily the same type as 
unsigned char. In fact according to the C99 standard the type uint8_t 
does not necessarily exist (section 7.18.1.1).


Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: cvs commit: src UPDATING src/include fts.h src/lib/libc/gen Makefile.inc Symbol.map fts-compat.c fts-compat.h fts.3 fts.c src/sys/sys param.h

2008-01-28 Thread Christoph Mallon

Erik Trulsson wrote:

On Mon, Jan 28, 2008 at 11:55:29AM +0100, Dag-Erling Smørgrav wrote:

Yar Tikhiy <[EMAIL PROTECTED]> writes:

Excuse me, but did you notice that fts(3) is not a part of sys?  It's
generic userland code, albeit it's contaminated by system-dependent
parts for performance or whatever.

Irrelevant.


But let intN_t be mostly confined in the kernel and system-dependent
userland code.  E.g., system-dependent include files can use them
to define more portable types such as ino_t, nlink_t, or whatever.

C99 doesn't define those either.


Userland code should be portable and useful to other systems in the
chosen domain of compatibility, e.g., C99 or POSIX, unless there
are substantial reasons for it not to.  That's how different projects
can benefit from each other's work.

Both C99 and POSIX *require* int64_t and uint64_t on all platforms that
have 64-bit integer types.

FreeBSD has never run on any platform that doesn't.  I don't think
NetBSD or OpenBSD has either, nor Solaris, nor Linux to my knowledge.


Those are all good reasons for why using 'int64_t' would be OK.
None of it is a reason for why using 'long long' would not be OK when you
want at least 64 bits, but do not require exactly 64 bits.


How about int_least64_t? It's a required type of at least 64bits.
I'd like my bikeshed green with yellow dots, please.

Regards
Christoph
___
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"