I found and fixed another problem in the latest memcpy/memset changes
- with this fix, all the failing tests mentioned in #51134 started
passing.  Bootstraps are also OK.
However, I still see failures in the 32-bit make check, so it would
probably be better to revert the changes until those failures are fixed.

On 21 November 2011 20:36, Michael Zolotukhin
<michael.v.zolotuk...@gmail.com> wrote:
> Hi,
>
> Continuing the investigation of the bootstrap failures, I found the next
> problem (besides the problem with unknown alignment described above):
> there is confusion between size_needed and epilogue_size_needed when we
> generate an epilogue loop that also uses SSE moves but is not unrolled -
> that is probably the reason for the failures we saw.
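>
> As a rough illustration (a toy model with made-up names, not the actual
> i386.c expander code), the point is that the main loop and the epilogue
> loop advance by different chunk sizes, and driving the epilogue with the
> wrong one leaves the tail uncopied:
>
>   #include <string.h>
>
>   /* Toy model: the main loop copies size_needed bytes per iteration and
>      the epilogue loop copies epilogue_size_needed bytes per iteration.
>      If the epilogue loop were driven by size_needed instead (the
>      suspected mix-up), the remaining bytes would never be copied.  */
>   static void
>   copy_with_epilogue (char *dst, const char *src, size_t count,
>                       size_t size_needed, size_t epilogue_size_needed)
>   {
>     size_t i = 0;
>     for (; i + size_needed <= count; i += size_needed)
>       memcpy (dst + i, src + i, size_needed);          /* main loop  */
>     for (; i + epilogue_size_needed <= count; i += epilogue_size_needed)
>       memcpy (dst + i, src + i, epilogue_size_needed); /* epilogue loop  */
>     for (; i < count; i++)                             /* byte tail  */
>       dst[i] = src[i];
>   }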
>
> Please check the attached patch - full testing is not finished yet, but
> bootstraps seem to be OK, and so does the arrayarg.f90 test (with
> sse_loop enabled).
>
> On 19 November 2011 05:38, Jan Hubicka <hubi...@ucw.cz> wrote:
>>> Given that x86 memset/memcpy is still broken, I think we should revert
>>> it for now.
>>
>> Well, looking into the code, the SSE alignment issue needs work - the
>> alignment test merely checks whether some alignment is known, not whether
>> 16-byte alignment is known, which is the cause of the failures in the
>> 32-bit bootstrap.  I originally convinced myself that this was safe since
>> we shoot for unaligned loads/stores anyway.
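>>
>> Concretely (an illustrative helper with a made-up name, not the actual
>> test in i386.c), the difference is between accepting any known alignment
>> and requiring the full 16 bytes that an aligned SSE move needs:
>>
>>   /* Made-up helper for illustration only.  */
>>   static int
>>   sse_aligned_move_ok (unsigned int align_in_bytes)
>>   {
>>     /* Too weak: "some alignment is known" also accepts 4 or 8.  */
>>     /* return align_in_bytes > 0;  */
>>     /* What the SSE path needs before emitting aligned 16-byte moves.  */
>>     return align_in_bytes >= 16;
>>   }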
>>
>>
>> I've committed the following patch, which disables the SSE codegen and
>> unbreaks the Atom bootstrap.  This seems more sensible to me given that
>> the patch also accumulated some good improvements on the non-SSE path,
>> and we can return to the SSE alignment issues incrementally.  There is
>> still a failure in the Fortran testcase that I am convinced is a
>> previously latent issue.
>>
>> I will be offline tomorrow.  If there are further serious problems, just
>> feel free to revert the changes and we can look into them for the next
>> stage1.
>>
>> Honza
>>
>>        * i386.c (atom_cost): Disable SSE loop until alignment issues are fixed.
>> Index: i386.c
>> ===================================================================
>> --- i386.c      (revision 181479)
>> +++ i386.c      (working copy)
>> @@ -1783,18 +1783,18 @@ struct processor_costs atom_cost = {
>>   /* stringop_algs for memcpy.
>>      SSE loops works best on Atom, but fall back into non-SSE unrolled loop variant
>>      if that fails.  */
>> -  {{{libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
>> -    {libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}},
>> -   {{libcall, {{2048, sse_loop}, {2048, unrolled_loop}, {-1, libcall}}}, /* Unknown alignment.  */
>> -    {libcall, {{2048, sse_loop}, {2048, unrolled_loop},
>> +  {{{libcall, {{4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
>> +    {libcall, {{4096, unrolled_loop}, {-1, libcall}}}},
>> +   {{libcall, {{2048, unrolled_loop}, {-1, libcall}}}, /* Unknown alignment.  */
>> +    {libcall, {{2048, unrolled_loop},
>>               {-1, libcall}}}}},
>>
>>   /* stringop_algs for memset.  */
>> -  {{{libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
>> -    {libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}},
>> -   {{libcall, {{1024, sse_loop}, {1024, unrolled_loop},         /* Unknown alignment.  */
>> +  {{{libcall, {{4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
>> +    {libcall, {{4096, unrolled_loop}, {-1, libcall}}}},
>> +   {{libcall, {{1024, unrolled_loop},   /* Unknown alignment.  */
>> +    {libcall, {{4096, unrolled_loop}, {-1, libcall}}}},
>> +   {{libcall, {{1024, unrolled_loop},   /* Unknown alignment.  */
>>               {-1, libcall}}},
>> -    {libcall, {{2048, sse_loop}, {2048, unrolled_loop},
>> +    {libcall, {{2048, unrolled_loop},
>>               {-1, libcall}}}}},
>>   1,                                   /* scalar_stmt_cost.  */
>>   1,                                   /* scalar load_cost.  */
>
>
>
> --
> ---
> Best regards,
> Michael V. Zolotukhin,
> Software Engineer
> Intel Corporation.



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.
