https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83661
--- Comment #4 from Christophe Monat <christophe.monat at st dot com> ---
Hi Pratamesh,

You're absolutely right. It may be more efficient when some hardware sincos is available (Intel FSINCOS?), but I would also check the actual performance carefully.

Indeed, it looks to me that you either have to use two different polynomials, or shift one argument and use either sin or cos; in both cases a polynomial is evaluated twice.

We studied that in a slightly different context with Claude-Pierre Jeannerod from ENS Lyon and our PhD student Jingyan Lu-Jourdan a while ago: "Simultaneous floating-point sine and cosine for VLIW integer processors", available here:
https://hal.archives-ouvertes.fr/hal-00672327
We were able to gain significant performance by exploiting the low-level parallelism of the processor.

Agreed, this is not a full IEEE implementation, but the important ideas are there.
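
To make the "two different polynomials" option concrete, here is a minimal sketch in C. It is not the algorithm from the paper: the function name sketch_sincosf is hypothetical, the coefficients are plain Taylor coefficients rather than tuned minimax ones, and it assumes the argument has already been reduced to roughly [-pi/4, pi/4]. There is no range reduction and no special-value handling, so it is nowhere near IEEE quality.

```c
/* Hypothetical sketch: simultaneous sine and cosine on an already
 * reduced argument (roughly |x| <= pi/4), using two polynomials.
 * Taylor coefficients are used purely for illustration. */
static void sketch_sincosf(float x, float *s, float *c)
{
    /* x*x is computed once and shared by both evaluations. */
    const float z = x * x;

    /* sin(x) ~ x * (1 - z/6 + z^2/120 - z^3/5040), in Horner form. */
    const float ps = 1.0f + z * (-1.0f / 6.0f
                   + z * (1.0f / 120.0f
                   + z * (-1.0f / 5040.0f)));

    /* cos(x) ~ 1 - z/2 + z^2/24 - z^3/720 + z^4/40320, in Horner form. */
    const float pc = 1.0f + z * (-0.5f
                   + z * (1.0f / 24.0f
                   + z * (-1.0f / 720.0f
                   + z * (1.0f / 40320.0f))));

    *s = x * ps;
    *c = pc;
}
```

The two Horner chains only share z and have no data dependency on each other, so a compiler or a processor with several issue slots is free to interleave them. That is the kind of low-level parallelism a combined sincos can exploit; whether it actually beats two separate calls still has to be measured on the target.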