https://llvm.org/bugs/show_bug.cgi?id=26417
vyacheslav.n.kloch...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #5 from vyacheslav.n.kloch...@gmail.com ---
The patch https://reviews.llvm.org/rL276521 is only a partial solution.

I have a change-set that fully supports optimizations for ALL FMA opcodes.
The solution is quite compact. It is currently going through internal code
review, after which I am going to submit it for code review in the Open Source
community.

1) a) There are about 1584 FMA opcodes in total now. The patch rL276521 enabled
      only _some_ commute transformations, for about 1/4 of them. Memory
      folding does not fully work for FMAs. Also, k-masked and k-zero-masked
      FMAs are not optimized (nor are FMAs with explicit rounding, broadcasts,
      etc.).
   b) Additional changes are required in some other places, such as
      isNonFoldablePartialRegisterLoad(), etc.
   c) Having switch statements with more than 1.5k 'case <FMA***: >' labels
      does not seem good, especially because such statements are needed in
      several places.

2) The commute transformation is still not fully enabled, because rework in
   X86InstrAVX512.td is needed: VMOVAPS instructions are declared as
   non-'isFoldable', the 'isCommuted' switch is not set for many/most of the
   AVX512 opcodes, etc.

3) The commute transformation will soon not be the only user/modifier of FMA
   code, and the FMA opcodes should be reusable there without the need to list
   all FMA opcodes again.
   My solution lists the FMA opcodes once, classifies them in one place, and
   makes that information available wherever it is needed.

Please let me finish this work; it should not take long. The work should
definitely be done in August.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
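
For illustration only, the "list the opcodes once, classify them in one place" idea from points 1c and 3 could look roughly like the sketch below. This is not the change-set described in the comment and not LLVM's actual code: the Opcode enum, the FMAGroup table, and the helpers lookupFMA(), isFMA(), withForm() and memFoldedOpcode() are all hypothetical names, and only two opcode groups are shown. Each table row groups the 132/213/231 forms of one FMA operation in register and memory variants, so several queries share one list instead of each needing its own large switch.

```cpp
#include <cassert>

// Hypothetical opcode numbering, loosely modeled on LLVM-style names.
// In LLVM the real opcode enum is generated by TableGen.
enum Opcode : unsigned {
  NOT_FMA = 0,
  VFMADD132PSr, VFMADD213PSr, VFMADD231PSr,
  VFMADD132PSm, VFMADD213PSm, VFMADD231PSm,
  VFMSUB132PSr, VFMSUB213PSr, VFMSUB231PSr,
  VFMSUB132PSm, VFMSUB213PSm, VFMSUB231PSm,
};

// The three operand orderings of an FMA instruction.
enum Form { F132 = 0, F213 = 1, F231 = 2 };

// One row per FMA operation: every form, in register and memory variants.
struct FMAGroup {
  Opcode Reg[3]; // indexed by Form
  Opcode Mem[3]; // indexed by Form
};

// The single place where FMA opcodes are listed (assumed, abbreviated table).
static const FMAGroup Groups[] = {
    {{VFMADD132PSr, VFMADD213PSr, VFMADD231PSr},
     {VFMADD132PSm, VFMADD213PSm, VFMADD231PSm}},
    {{VFMSUB132PSr, VFMSUB213PSr, VFMSUB231PSr},
     {VFMSUB132PSm, VFMSUB213PSm, VFMSUB231PSm}},
};

// Classification result for one opcode.
struct FMAInfo {
  const FMAGroup *Group;
  Form F;
  bool IsMem;
};

// Returns true and fills Info if Opc appears in the table.
bool lookupFMA(unsigned Opc, FMAInfo &Info) {
  for (const FMAGroup &G : Groups) {
    for (int F = 0; F < 3; ++F) {
      if (G.Reg[F] == Opc) {
        Info = {&G, static_cast<Form>(F), false};
        return true;
      }
      if (G.Mem[F] == Opc) {
        Info = {&G, static_cast<Form>(F), true};
        return true;
      }
    }
  }
  return false;
}

// Query 1: is this an FMA opcode at all?
bool isFMA(unsigned Opc) {
  FMAInfo Info;
  return lookupFMA(Opc, Info);
}

// Query 2: same operation and reg/mem kind, different operand ordering.
// A commute transformation would pick NewForm based on the swapped operands.
unsigned withForm(unsigned Opc, Form NewForm) {
  FMAInfo Info;
  bool Found = lookupFMA(Opc, Info);
  assert(Found && "not an FMA opcode");
  (void)Found;
  return Info.IsMem ? Info.Group->Mem[NewForm] : Info.Group->Reg[NewForm];
}

// Query 3: memory-operand variant of the same operation and form.
unsigned memFoldedOpcode(unsigned Opc) {
  FMAInfo Info;
  bool Found = lookupFMA(Opc, Info);
  assert(Found && "not an FMA opcode");
  (void)Found;
  return Info.Group->Mem[Info.F];
}
```

With such a table, a call like withForm(VFMADD213PSr, F231) yields VFMADD231PSr, and memFoldedOpcode(VFMSUB231PSr) yields VFMSUB231PSm, without either helper enumerating opcodes itself. The linear scan is chosen for brevity; a real implementation would more likely use a sorted table or a hash map, and would carry further attributes (masking, rounding, broadcast) per row.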
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs