Date: Wed, 7 Mar 2018 00:33:04 -0800
From: Eitan Adler <li...@eitanadler.com>
Message-ID: <caf6rxgkptzu_+m2howvqb5wxdafdrdl2yqj0vcy2lovswt6...@mail.gmail.com>
| I'd like to commit the patch below. Does anyone have concerns with it?

The change looks fine technically, but it would be good to see some
benchmark results before committing it - particularly for the more
common case where x != 1.0 (but including where x == 2.0 or 0.5) -
the change switches from using arithmetic and a single branch to
multiple branches (or would with simplistic compilation) and branches
are slow wrt prefetch and parallel execution.

Do make sure the benchmark tests atan2() though, not the loop which
surrounds it, by making a loop with a lot of atan2() calls in it,
one after another (even calling over and over again with the same arg).

If the new version slows things down (which it might have once, but
now the compilers are smarter, and there might be no difference) then
perhaps we can find some other way of writing the expression that
avoids overflows, and is still fast.

kre