https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79581
            Bug ID: 79581
           Summary: VFP4 slower than VFP3 in C-ray
           Product: gcc
           Version: 7.0.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tulipawn at gmail dot com
  Target Milestone: ---

Created attachment 40762
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40762&action=edit
preprocessed source

$ gcc -marm -Ofast -mcpu=cortex-a5 -mfpu=vfpv3 c-ray-mt.i -lm -lpthread
$ ./a.out -t 32 -s 160x120 -r 8 -i sphfract -o output.ppm ; done
Rendering took: 2 seconds (2393 milliseconds)

$ gcc -marm -Ofast -mcpu=cortex-a5 -mfpu=vfpv4 c-ray-mt.i -lm -lpthread
$ ./a.out -t 32 -s 160x120 -r 8 -i sphfract -o output.ppm ; done
Rendering took: 2 seconds (2494 milliseconds)

This defect dates back to gcc 4.9 (or earlier), but at least gcc 7 provides a
big speedup in vfpv4 code (roughly 2500 ms now vs. 2700 ms previously).