On Fri, 22 Nov 2024 10:36:10 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:
>> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by >> [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or >> pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred >> through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and >> their corresponding entry points are defined in the newly added Float16Math >> class. >> - These intrinsics receive unwrapped short arguments encoding IEEE >> 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated >> idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please >> refer to [FAQs >> ](https://github.com/openjdk/jdk/pull/21490#issuecomment-2482867818)for more >> details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are >> always loaded into general purpose register, but FP16 ISA instructions >> generally operate over floating point registers, therefore compiler injectes >> reinterpretation IR before and after Float16 operation nodes to move short >> value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. >> HF2S + S2HF = HF >> 6. Auto-vectorization of newly supported scalar operations. >> 7. X86 and AARCH64 backend implementation for all supported intrinsics. >> 9. Functional and Performance validation tests. >> >> **Missing Pieces:-** >> **- AARCH64 Backend.** >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional > commit since the last revision: > > Testpoints for new value transforms + code cleanups Wow, thanks for tackling this! Ok, lots of style comments. But again: I would have loved to see this split up into these parts: - Scalar - Scalar optimizations (value, ideal, identity) - Vector This will again take many many week to get reviewed because it is a 3k+ change with lots of details. Do you have any tests for the scalar constant folding optimizations? I did not find them. src/hotspot/cpu/x86/x86.ad line 10910: > 10908: %} > 10909: > 10910: instruct convF2HFAndS2HF(regF dst, regF src) I'm starting to see that you use sometimes `H` and sometimes `HF`. That needs to be consistent - unless they are 2 different things? src/hotspot/cpu/x86/x86.ad line 10930: > 10928: %} > 10929: > 10930: instruct scalar_sqrt_fp16_reg(regF dst, regF src) Hmm, and them you also use `fp16`... so now we have `H`, `HF` and `fp16`... src/hotspot/share/opto/addnode.cpp line 713: > 711: > //------------------------------add_of_identity-------------------------------- > 712: // Check for addition of the identity > 713: const Type *AddHFNode::add_of_identity(const Type *t1, const Type *t2) > const { I would generally drop out these comments, unless they actually have something useful to say that the name does not say. You could make a comment why you are returning `nullptr`, i.e. doing nothing. And for style: the `*` belongs with the type ;) Suggestion: const Type* AddHFNode::add_of_identity(const Type* t1, const Type* t2) const { src/hotspot/share/opto/addnode.cpp line 721: > 719: // This also type-checks the inputs for sanity. Guaranteed never to > 720: // be passed a TOP or BOTTOM type, these are filtered out by pre-check. > 721: const Type *AddHFNode::add_ring(const Type *t0, const Type *t1) const { Suggestion: // Supplied function returns the sum of the inputs. // This also type-checks the inputs for sanity. Guaranteed never to // be passed a TOP or BOTTOM type, these are filtered out by pre-check. const Type* AddHFNode::add_ring(const Type* t0, const Type* t1) const { Here the comments are great :) src/hotspot/share/opto/addnode.cpp line 1625: > 1623: > 1624: // handle min of 0.0, -0.0 case. > 1625: return (jint_cast(f0) < jint_cast(f1)) ? r0 : r1; Can you please add some comments for this here? Why is there an int-case on floats? Why not just do the ternary comparison on line 1621: `return f0 < f1 ? r0 : r1;`? src/hotspot/share/opto/addnode.hpp line 179: > 177: virtual Node* Identity(PhaseGVN* phase) { return this; } > 178: virtual uint ideal_reg() const { return Op_RegF; } > 179: }; Please put the `*` with the type everywhere. src/hotspot/share/opto/connode.cpp line 49: > 47: switch( t->basic_type() ) { > 48: case T_INT: return new ConINode( t->is_int() ); > 49: case T_SHORT: return new ConHNode( t->is_half_float_constant() ); That will be quite confusing.... don't you think? src/hotspot/share/opto/connode.hpp line 122: > 120: class ConHNode : public ConNode { > 121: public: > 122: ConHNode( const TypeH *t ) : ConNode(t) {} Suggestion: ConHNode(const TypeH* t) : ConNode(t) {} src/hotspot/share/opto/connode.hpp line 129: > 127: return new ConHNode( TypeH::make(con) ); > 128: } > 129: Suggestion: src/hotspot/share/opto/convertnode.cpp line 256: > 254: > //------------------------------Ideal------------------------------------------ > 255: Node* ConvF2HFNode::Ideal(PhaseGVN* phase, bool can_reshape) { > 256: // Optimize pattern - ConvHF2F (FP32BinOp) ConvF2HF ==> > ReinterpretS2HF (FP16BinOp) ReinterpretHF2S. This is a little dense and I don't understand your notation. So we are pattern matching: `ConvF2HF( FP32BinOp(ConvHF2F(x), ConvHF2F(y)) )` <- I think that would be more readable. You could also create local variables for `x` and `y`, just so it is more readable. And then instead we generate: `ReinterpretHF2S(FP16BinOp(ReinterpretS2HF(x), ReinterpretS2HF(y)))` Ok, so you are saying why lift to FP32, if we cast down to FP16 anyway... would be nice to have such a comment at the top to motivate the optimization! What confuses me a little here: why do we even have to cast from and to `short` here? Maybe a quick comment about that would also help. src/hotspot/share/opto/convertnode.cpp line 948: > 946: } > 947: > 948: bool Float16NodeFactory::is_binary_oper(int opc) { Suggestion: bool Float16NodeFactory::is_float32_binary_oper(int opc) { Just so it is explicit, since you have the parallel `get_float16_binary_oper` below. src/hotspot/share/opto/convertnode.hpp line 234: > 232: class ReinterpretHF2SNode : public Node { > 233: public: > 234: ReinterpretHF2SNode( Node *in1 ) : Node(0,in1) {} Suggestion: ReinterpretHF2SNode(Node* in1) : Node(0, in1) {} src/hotspot/share/opto/divnode.cpp line 759: > 757: const Type* t2 = phase->type(in(2)); > 758: if(t1 == Type::TOP) return Type::TOP; > 759: if(t2 == Type::TOP) return Type::TOP; Suggestion: if(t1 == Type::TOP) { return Type::TOP; } if(t2 == Type::TOP) { return Type::TOP; } Please use the brackets consistently. src/hotspot/share/opto/divnode.cpp line 765: > 763: if((t1 == bot) || (t2 == bot) || > 764: (t1 == Type::BOTTOM) || (t2 == Type::BOTTOM)) > 765: return bot; Suggestion: if((t1 == bot) || (t2 == bot) || (t1 == Type::BOTTOM) || (t2 == Type::BOTTOM)) { return bot; } Again: please always use brackets. src/hotspot/share/opto/divnode.cpp line 776: > 774: > 775: if(t2 == TypeH::ONE) > 776: return t1; brackets src/hotspot/share/opto/divnode.cpp line 782: > 780: t2->base() == Type::HalfFloatCon && > 781: t2->getf() != 0.0) // could be negative zero > 782: return TypeH::make(t1->getf()/t2->getf()); Suggestion: // If divisor is a constant and not zero, divide the numbers if(t1->base() == Type::HalfFloatCon && t2->base() == Type::HalfFloatCon && t2->getf() != 0.0) { // could be negative zero return TypeH::make(t1->getf() / t2->getf()); } src/hotspot/share/opto/divnode.cpp line 789: > 787: > 788: if(t1 == TypeH::ZERO && !g_isnan(t2->getf()) && t2->getf() != 0.0) > 789: return TypeH::ZERO; brackets for if Ok, why not also do it for negative zero then? src/hotspot/share/opto/divnode.cpp line 797: > 795: > //------------------------------isA_Copy--------------------------------------- > 796: // Dividing by self is 1. > 797: // If the divisor is 1, we are an identity on the dividend. Suggestion: // If the divisor is 1, we are an identity on the dividend. `Dividing by self is 1.` That does not seem to apply here. Maybe you meant `dividing by 1 is self`? src/hotspot/share/opto/divnode.cpp line 804: > 802: > 803: > //------------------------------Idealize--------------------------------------- > 804: Node *DivHFNode::Ideal(PhaseGVN* phase, bool can_reshape) { Suggestion: Node* DivHFNode::Ideal(PhaseGVN* phase, bool can_reshape) { src/hotspot/share/opto/divnode.cpp line 805: > 803: > //------------------------------Idealize--------------------------------------- > 804: Node *DivHFNode::Ideal(PhaseGVN* phase, bool can_reshape) { > 805: if (in(0) && remove_dead_region(phase, can_reshape)) return this; Suggestion: if (in(0) != nullptr && remove_dead_region(phase, can_reshape)) { return this; } brackets for if and no implicit null checks please! src/hotspot/share/opto/divnode.cpp line 814: > 812: > 813: const TypeH* tf = t2->isa_half_float_constant(); > 814: if(!tf) return nullptr; no implicit booleans! src/hotspot/share/opto/divnode.cpp line 836: > 834: > 835: // return multiplication by the reciprocal > 836: return (new MulHFNode(in(1), phase->makecon(TypeH::make(reciprocal)))); Do we have good tests for this optimization? src/hotspot/share/opto/mulnode.cpp line 559: > 557: > 558: // Compute the product type of two half float ranges into this node. > 559: const Type *MulHFNode::mul_ring(const Type *t0, const Type *t1) const { Suggestion: const Type* MulHFNode::mul_ring(const Type* t0, const Type* t1) const { src/hotspot/share/opto/mulnode.cpp line 561: > 559: const Type *MulHFNode::mul_ring(const Type *t0, const Type *t1) const { > 560: if( t0 == Type::HALF_FLOAT || t1 == Type::HALF_FLOAT ) return > Type::HALF_FLOAT; > 561: return TypeH::make( t0->getf() * t1->getf() ); I hope that `TypeH::make` handles the overflow cases well... does it? And do we have tests for this? src/hotspot/share/opto/mulnode.cpp line 1945: > 1943: return TypeH::make(fma(f1, f2, f3)); > 1944: #endif > 1945: } I need: - brackets for ifs - all `*` on the left with the type - An explanation what the `ifdef __STDC_IEC_559__` does. src/hotspot/share/opto/mulnode.hpp line 155: > 153: virtual const Type *mul_ring( const Type *, const Type * ) const; > 154: const Type *mul_id() const { return TypeH::ONE; } > 155: const Type *add_id() const { return TypeH::ZERO; } Suggestion: const Type* mul_id() const { return TypeH::ONE; } const Type* add_id() const { return TypeH::ZERO; } src/hotspot/share/opto/mulnode.hpp line 160: > 158: int max_opcode() const { return Op_MaxHF; } > 159: int min_opcode() const { return Op_MinHF; } > 160: const Type *bottom_type() const { return Type::HALF_FLOAT; } Suggestion: const Type* bottom_type() const { return Type::HALF_FLOAT; } src/hotspot/share/opto/subnode.cpp line 1975: > 1973: if( f < 0.0f ) return Type::HALF_FLOAT; > 1974: return TypeH::make( (float)sqrt( (double)f ) ); > 1975: } if brackets and asterisks with types please src/hotspot/share/opto/subnode.hpp line 143: > 141: const Type *bottom_type() const { return Type::HALF_FLOAT; } > 142: virtual uint ideal_reg() const { return Op_RegF; } > 143: }; Suggestion: //------------------------------SubHFNode-------------------------------------- // Subtract 2 half floats class SubHFNode : public SubFPNode { public: SubHFNode(Node* in1, Node* in2) : SubFPNode(in1, in2) {} virtual int Opcode() const; virtual const Type* sub(const Type *, const Type *) const; const Type* add_id() const { return TypeH::ZERO; } const Type* bottom_type() const { return Type::HALF_FLOAT; } virtual uint ideal_reg() const { return Op_RegF; } }; src/hotspot/share/opto/subnode.hpp line 552: > 550: } > 551: virtual int Opcode() const; > 552: const Type *bottom_type() const { return Type::HALF_FLOAT; } Suggestion: const Type* bottom_type() const { return Type::HALF_FLOAT; } src/hotspot/share/opto/type.cpp line 1487: > 1485: typerr(t); > 1486: > 1487: case HalfFloatCon: // Float-constant vs Float-constant? Suggestion: case HalfFloatCon: // Float-constant vs Float-constant? ------------- Changes requested by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/21490#pullrequestreview-2457382009 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855943470 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855944584 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855948500 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855950333 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855954166 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855955074 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855958333 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855958773 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855959025 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855977560 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855981273 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855982405 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855984366 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855985484 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855988545 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855989752 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855992127 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855994876 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855995436 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855996454 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856000589 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856002336 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856007382 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856006524 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856009749 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856010212 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856010391 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856013278 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856013945 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856014893 PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856016525