------- Comment #5 from sergstesh at yahoo dot com  2006-11-18 15:17 -------
IIRC, misaligned data should cause a performance penalty, not a segmentation fault.

Look at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29818 , specifically at the
case where there is no segfault:

"
when the code runs fine (i.e. compiled by gcc-4.1.1), the screen output is:

"
checkpoint 1
&inp_array_1[0]=80498e0
checkpoint 2
bn=1
&inp_array_1[1]=80498e4
checkpoint 3
checkpoint 4
...
"

- as you can see, '&inp_array_1[1]=80498e4'; this is with respect to lines 31..35 in:

     29   while(half_nos - bn >= NUMBER_OF_ELEMENTS_IN_VECTOR)
     30     {
     31     fprintf(stderr, "bn=%u\n", bn);
     32     fprintf(stderr, "&inp_array_1[%u]=%0lx\n", bn, (unsigned long)&inp_array_1[bn]);
     33
     34     fprintf(stderr, "checkpoint 3\n");
     35     vtmp1.v = *(vFloat *)&inp_array_1[bn];
     36     fprintf(stderr, "checkpoint 4\n");
     37
     38     bn += NUMBER_OF_ELEMENTS_IN_VECTOR;
     39     } // while(half_nos - bn >= NUMBER_OF_ELEMENTS_IN_VECTOR)

In this case the address is 80498e4, which ends in hex digit 4 and is therefore
4 bytes past a 16-byte boundary, i.e. not a multiple of 16. Still, the code
does not segfault, even though the misaligned operation:

     35     vtmp1.v = *(vFloat *)&inp_array_1[bn];

is executed.
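
To make the expectation concrete, here is a minimal sketch of mine (not from
the original report) contrasting GCC's SSE load intrinsics. Dereferencing
*(vFloat *)p is normally compiled like _mm_load_ps, i.e. to movaps, which the
hardware alignment-checks; _mm_loadu_ps compiles to movups, which tolerates
any alignment at a speed cost:

/* alignment_demo.c -- compile with "gcc -msse alignment_demo.c".
   On a typical x86 Linux build without optimization, the _mm_load_ps
   line dies with SIGSEGV, while the _mm_loadu_ps line merely runs
   slower. */
#include <stdio.h>
#include <xmmintrin.h>

int main(void)
{
    float buf[8] __attribute__((aligned(16))) =
        { 0.f, 1.f, 2.f, 3.f, 4.f, 5.f, 6.f, 7.f };
    float out[4];

    /* &buf[1] is 4 bytes past a 16-byte boundary: misaligned. */
    __m128 v = _mm_loadu_ps(&buf[1]);   /* movups: legal, just slower */
    _mm_storeu_ps(out, v);
    printf("movups loaded %g %g %g %g\n", out[0], out[1], out[2], out[3]);

    v = _mm_load_ps(&buf[1]);           /* movaps: raises #GP -> SIGSEGV */
    _mm_storeu_ps(out, v);
    printf("movaps survived (unexpected on misaligned input)\n");
    return 0;
}

The statement at line 35 of the quoted code uses the movaps-style form, which
is why it can fault once bn=1 makes the address misaligned.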

This is what I found in the documentation
(http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options):

"
-mpreferred-stack-boundary=num
    Attempt to keep the stack boundary aligned to a 2 raised to num byte boundary. If -mpreferred-stack-boundary is not specified, the default is 4 (16 bytes or 128 bits), except when optimizing for code size (-Os), in which case the default is the minimum correct alignment (4 bytes for x86, and 8 bytes for x86-64).

    On Pentium and PentiumPro, double and long double values should be aligned to an 8 byte boundary (see -malign-double) or suffer significant run time performance penalties. On Pentium III, the Streaming SIMD Extension (SSE) data type __m128 suffers similar penalties if it is not 16 byte aligned.
...
"

- from the above I expected misaligned SSE data to "suffer significant run time
performance penalties", not a segfault.

Could you please:

1) point me to the documentation which says that misaligned SSE data will
cause a segfault;

2) if such a document does not exist, update the documentation, preferably
also pointing to Intel documentation stating that misaligned SSE data causes
a segmentation fault;

3) explain how/why the code in

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29818

does not segfault even though it has the same misalignment as here (one way
to investigate is sketched below)?
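
Regarding 3), one way to investigate (hedged: I have not run this against the
29818 test case itself) is that whether a misaligned address faults depends
entirely on which instruction GCC emitted for the statement, so compiling the
statement in isolation and reading the assembly tells you which case applies.
The vFloat typedef below is my guess at the usual vector_size definition; the
original test case's may differ.

/* check_codegen.c -- build with "gcc -S -msse -O0 check_codegen.c"
   and inspect check_codegen.s: movaps traps on misaligned addresses
   (SIGSEGV); movups or scalar float moves do not.  If gcc-4.1.1
   happens to emit the latter for the 29818 test case, that would
   explain the non-faulting run. */
typedef float vFloat __attribute__((vector_size(16)));

vFloat load_vec(const float *inp_array_1, unsigned bn)
{
    /* Same shape as line 35 of the quoted code. */
    return *(const vFloat *)&inp_array_1[bn];
}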

Thanks in advance.


-- 

sergstesh at yahoo dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29884
