http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55701



             Bug #: 55701

           Summary: Inline some instances of memset for ARM

    Classification: Unclassified

           Product: gcc

           Version: 4.8.0

            Status: UNCONFIRMED

          Severity: enhancement

          Priority: P3

         Component: target

        AssignedTo: unassig...@gcc.gnu.org

        ReportedBy: josh.m.con...@gmail.com





memset() is almost never inlined on ARM, even at -O3.  If the destination is
known to be 4-byte aligned or greater, it will be inlined for 1, 2, or 4 byte
lengths.  If the destination alignment is unknown, it will be inlined only for
a single byte.
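
A minimal set of test functions for reproducing the cases above might look
like the following.  The buffer and function names are hypothetical, and the
comments restate the behaviour described in this report rather than anything
checked against a particular build.  Compiling with an ARM-targeted gcc at
-O3 -S and looking for calls to memset in the output should show which cases
are inlined.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical test buffer; a declared uint32_t object is at least
   4-byte aligned, so the compiler can know the destination alignment. */
uint32_t aligned_buf[4];

/* Aligned destination, 4 bytes: reported as inlined. */
void set_aligned_4(int c)
{
    memset(aligned_buf, c, 4);
}

/* Aligned destination, 16 bytes: reported as a call to memset. */
void set_aligned_16(int c)
{
    memset(aligned_buf, c, 16);
}

/* Unknown alignment, 1 byte: reported as inlined. */
void set_unknown_1(char *p, int c)
{
    memset(p, c, 1);
}

/* Unknown alignment, 16 bytes: reported as a call to memset. */
void set_unknown_16(char *p, int c)
{
    memset(p, c, 16);
}
```

All four functions behave identically whether or not the call is inlined;
only the generated code differs.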



I don't see this problem with similar builtins: memcpy, memmove, and memclear
(memset with a fill value of zero) all inline small cases.



It probably makes sense for memset to be inlined up to at least 16 bytes or so
in all cases.



When aligned, memcpy and memmove use a ldmia/stmia (load multiple/store
multiple) sequence to create fairly compact inline code.  We could consider
doing the same sort of optimization with memset, using stmia only.
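
As a rough illustration of what such an expansion would compute for an
aligned 16-byte memset, here is a sketch in C.  It is not the compiler's
actual output: the function name fill16_words is hypothetical, and whether
gcc turns the four word stores into a single stmia depends on the compiler
version and options.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of an inline 16-byte memset for a word-aligned destination.
   The fill byte is replicated into all four bytes of a 32-bit word,
   which is then stored four times.  On ARM, four consecutive word
   stores from registers are what a store-multiple could cover. */
void fill16_words(uint32_t *dst, int c)
{
    /* Assumption of this sketch: the caller passes a word-aligned pointer. */
    assert(((uintptr_t)dst & 3u) == 0);

    /* memset uses only the low 8 bits of c; 0xAB becomes 0xABABABAB. */
    uint32_t word = (uint32_t)(uint8_t)c * UINT32_C(0x01010101);

    dst[0] = word;
    dst[1] = word;
    dst[2] = word;
    dst[3] = word;
}
```

Unlike the memcpy case, no load is needed: the only setup cost is
replicating the byte across a word before the stores.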
