Re: RFR: 8342103: C2 compiler support for Float16 type and associated operations [v2]

Emanuel Peter Mon, 25 Nov 2024 00:05:51 -0800

On Fri, 22 Nov 2024 10:36:10 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:


>> Hi All,
>> 
>> This patch adds C2 compiler support for various Float16 operations added by 
>> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>> 
>> Following is the summary of changes included with this patch:-
>> 
>> 1. Detection of various Float16 operations through inline expansion or 
>> pattern folding idealizations.
>> 2. Float16 operations like add, sub, mul, div, max, and min are inferred 
>> through pattern folding idealization.
>> 3. Float16 SQRT and FMA operation are inferred through inline expansion and 
>> their corresponding entry points are defined in the newly added Float16Math 
>> class.
>>       -    These intrinsics receive unwrapped short arguments encoding IEEE 
>> 754 binary16 values.
>> 5. New specialized IR nodes for Float16 operations, associated 
>> idealizations, and constant folding routines.
>> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please 
>> refer to [FAQs 
>> ](https://github.com/openjdk/jdk/pull/21490#issuecomment-2482867818)for more 
>> details.
>> 7. Since Float16 uses short as its storage type, hence raw FP16 values are 
>> always loaded into general purpose register, but FP16 ISA instructions 
>> generally operate over floating point registers, therefore compiler injectes 
>> reinterpretation IR before and after Float16 operation nodes to move short 
>> value to floating point register and vice versa.
>> 8. New idealization routines to optimize redundant reinterpretation chains. 
>> HF2S + S2HF = HF
>> 6. Auto-vectorization of newly supported scalar operations.
>> 7. X86 and AARCH64 backend implementation for all supported intrinsics.
>> 9. Functional and Performance validation tests.
>> 
>> **Missing Pieces:-**
>> **-  AARCH64 Backend.**
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Testpoints for new value transforms + code cleanups

Wow, thanks for tackling this!

Ok, lots of style comments.

But again:
I would have loved to see this split up into these parts:
- Scalar
- Scalar optimizations (value, ideal, identity)
- Vector

This will again take many many week to get reviewed because it is a 3k+ change 
with lots of details.

Do you have any tests for the scalar constant folding optimizations? I did not 
find them.

src/hotspot/cpu/x86/x86.ad line 10910:

> 10908: %}
> 10909: 
> 10910: instruct convF2HFAndS2HF(regF dst, regF src)

I'm starting to see that you use sometimes `H` and sometimes `HF`. That needs 
to be consistent - unless they are 2 different things?

src/hotspot/cpu/x86/x86.ad line 10930:

> 10928: %}
> 10929: 
> 10930: instruct scalar_sqrt_fp16_reg(regF dst, regF src)

Hmm, and them you also use `fp16`... so now we have `H`, `HF` and `fp16`...

src/hotspot/share/opto/addnode.cpp line 713:

> 711: 
> //------------------------------add_of_identity--------------------------------
> 712: // Check for addition of the identity
> 713: const Type *AddHFNode::add_of_identity(const Type *t1, const Type *t2) 
> const {

I would generally drop out these comments, unless they actually have something 
useful to say that the name does not say. You could make a comment why you are 
returning `nullptr`, i.e. doing nothing.

And for style: the `*` belongs with the type ;)
Suggestion:

const Type* AddHFNode::add_of_identity(const Type* t1, const Type* t2) const {

src/hotspot/share/opto/addnode.cpp line 721:

> 719: // This also type-checks the inputs for sanity.  Guaranteed never to
> 720: // be passed a TOP or BOTTOM type, these are filtered out by pre-check.
> 721: const Type *AddHFNode::add_ring(const Type *t0, const Type *t1) const {

Suggestion:

// Supplied function returns the sum of the inputs.
// This also type-checks the inputs for sanity.  Guaranteed never to
// be passed a TOP or BOTTOM type, these are filtered out by pre-check.
const Type* AddHFNode::add_ring(const Type* t0, const Type* t1) const {

Here the comments are great :)

src/hotspot/share/opto/addnode.cpp line 1625:

> 1623: 
> 1624:   // handle min of 0.0, -0.0 case.
> 1625:   return (jint_cast(f0) < jint_cast(f1)) ? r0 : r1;

Can you please add some comments for this here? Why is there an int-case on 
floats? Why not just do the ternary comparison on line 1621: `return f0 < f1 ? 
r0 : r1;`?

src/hotspot/share/opto/addnode.hpp line 179:

> 177:   virtual Node* Identity(PhaseGVN* phase) { return this; }
> 178:   virtual uint ideal_reg() const { return Op_RegF; }
> 179: };

Please put the `*` with the type everywhere.

src/hotspot/share/opto/connode.cpp line 49:

> 47:   switch( t->basic_type() ) {
> 48:   case T_INT:         return new ConINode( t->is_int() );
> 49:   case T_SHORT:       return new ConHNode( t->is_half_float_constant() );

That will be quite confusing.... don't you think?

src/hotspot/share/opto/connode.hpp line 122:

> 120: class ConHNode : public ConNode {
> 121: public:
> 122:   ConHNode( const TypeH *t ) : ConNode(t) {}

Suggestion:

  ConHNode(const TypeH* t) : ConNode(t) {}

src/hotspot/share/opto/connode.hpp line 129:

> 127:     return new ConHNode( TypeH::make(con) );
> 128:   }
> 129: 

Suggestion:

src/hotspot/share/opto/convertnode.cpp line 256:

> 254: 
> //------------------------------Ideal------------------------------------------
> 255: Node* ConvF2HFNode::Ideal(PhaseGVN* phase, bool can_reshape) {
> 256:   // Optimize pattern - ConvHF2F (FP32BinOp) ConvF2HF ==> 
> ReinterpretS2HF (FP16BinOp) ReinterpretHF2S.

This is a little dense and I don't understand your notation.

So we are pattern matching:
`ConvF2HF( FP32BinOp(ConvHF2F(x), ConvHF2F(y)) )` <- I think that would be more 
readable.
You could also create local variables for `x` and `y`, just so it is more 
readable.
And then instead we generate:
`ReinterpretHF2S(FP16BinOp(ReinterpretS2HF(x), ReinterpretS2HF(y)))`

Ok, so you are saying why lift to FP32, if we cast down to FP16 anyway... would 
be nice to have such a comment at the top to motivate the optimization!

What confuses me a little here: why do we even have to cast from and to `short` 
here? Maybe a quick comment about that would also help.

src/hotspot/share/opto/convertnode.cpp line 948:

> 946: }
> 947: 
> 948: bool Float16NodeFactory::is_binary_oper(int opc) {

Suggestion:

bool Float16NodeFactory::is_float32_binary_oper(int opc) {

Just so it is explicit, since you have the parallel `get_float16_binary_oper` 
below.

src/hotspot/share/opto/convertnode.hpp line 234:

> 232: class ReinterpretHF2SNode : public Node {
> 233:   public:
> 234:   ReinterpretHF2SNode( Node *in1 ) : Node(0,in1) {}

Suggestion:

  ReinterpretHF2SNode(Node* in1) : Node(0, in1) {}

src/hotspot/share/opto/divnode.cpp line 759:

> 757:   const Type* t2 = phase->type(in(2));
> 758:   if(t1 == Type::TOP) return Type::TOP;
> 759:   if(t2 == Type::TOP) return Type::TOP;

Suggestion:

  if(t1 == Type::TOP) { return Type::TOP; }
  if(t2 == Type::TOP) { return Type::TOP; }

Please use the brackets consistently.

src/hotspot/share/opto/divnode.cpp line 765:

> 763:   if((t1 == bot) || (t2 == bot) ||
> 764:      (t1 == Type::BOTTOM) || (t2 == Type::BOTTOM))
> 765:     return bot;

Suggestion:

  if((t1 == bot) || (t2 == bot) ||
     (t1 == Type::BOTTOM) || (t2 == Type::BOTTOM)) {
    return bot;
  }

Again: please always use brackets.

src/hotspot/share/opto/divnode.cpp line 776:

> 774: 
> 775:   if(t2 == TypeH::ONE)
> 776:     return t1;

brackets

src/hotspot/share/opto/divnode.cpp line 782:

> 780:      t2->base() == Type::HalfFloatCon &&
> 781:      t2->getf() != 0.0) // could be negative zero
> 782:     return TypeH::make(t1->getf()/t2->getf());

Suggestion:

  // If divisor is a constant and not zero, divide the numbers
  if(t1->base() == Type::HalfFloatCon &&
     t2->base() == Type::HalfFloatCon &&
     t2->getf() != 0.0) { // could be negative zero
    return TypeH::make(t1->getf() / t2->getf());
  }

src/hotspot/share/opto/divnode.cpp line 789:

> 787: 
> 788:   if(t1 == TypeH::ZERO && !g_isnan(t2->getf()) && t2->getf() != 0.0)
> 789:     return TypeH::ZERO;

brackets for if

Ok, why not also do it for negative zero then?

src/hotspot/share/opto/divnode.cpp line 797:

> 795: 
> //------------------------------isA_Copy---------------------------------------
> 796: // Dividing by self is 1.
> 797: // If the divisor is 1, we are an identity on the dividend.

Suggestion:

// If the divisor is 1, we are an identity on the dividend.

`Dividing by self is 1.` That does not seem to apply here. Maybe you meant 
`dividing by 1 is self`?

src/hotspot/share/opto/divnode.cpp line 804:

> 802: 
> 803: 
> //------------------------------Idealize---------------------------------------
> 804: Node *DivHFNode::Ideal(PhaseGVN* phase, bool can_reshape) {

Suggestion:

Node* DivHFNode::Ideal(PhaseGVN* phase, bool can_reshape) {

src/hotspot/share/opto/divnode.cpp line 805:

> 803: 
> //------------------------------Idealize---------------------------------------
> 804: Node *DivHFNode::Ideal(PhaseGVN* phase, bool can_reshape) {
> 805:   if (in(0) && remove_dead_region(phase, can_reshape))  return this;

Suggestion:

  if (in(0) != nullptr && remove_dead_region(phase, can_reshape)) { return 
this; }

brackets for if and no implicit null checks please!

src/hotspot/share/opto/divnode.cpp line 814:

> 812: 
> 813:   const TypeH* tf = t2->isa_half_float_constant();
> 814:   if(!tf) return nullptr;

no implicit booleans!

src/hotspot/share/opto/divnode.cpp line 836:

> 834: 
> 835:   // return multiplication by the reciprocal
> 836:   return (new MulHFNode(in(1), phase->makecon(TypeH::make(reciprocal))));

Do we have good tests for this optimization?

src/hotspot/share/opto/mulnode.cpp line 559:

> 557: 
> 558: // Compute the product type of two half float ranges into this node.
> 559: const Type *MulHFNode::mul_ring(const Type *t0, const Type *t1) const {

Suggestion:

const Type* MulHFNode::mul_ring(const Type* t0, const Type* t1) const {

src/hotspot/share/opto/mulnode.cpp line 561:

> 559: const Type *MulHFNode::mul_ring(const Type *t0, const Type *t1) const {
> 560:   if( t0 == Type::HALF_FLOAT || t1 == Type::HALF_FLOAT ) return 
> Type::HALF_FLOAT;
> 561:   return TypeH::make( t0->getf() * t1->getf() );

I hope that `TypeH::make` handles the overflow cases well... does it?
And do we have tests for this?

src/hotspot/share/opto/mulnode.cpp line 1945:

> 1943:   return TypeH::make(fma(f1, f2, f3));
> 1944: #endif
> 1945: }

I need:
- brackets for ifs
- all `*` on the left with the type
- An explanation what the `ifdef __STDC_IEC_559__` does.

src/hotspot/share/opto/mulnode.hpp line 155:

> 153:   virtual const Type *mul_ring( const Type *, const Type * ) const;
> 154:   const Type *mul_id() const { return TypeH::ONE; }
> 155:   const Type *add_id() const { return TypeH::ZERO; }

Suggestion:

  const Type* mul_id() const { return TypeH::ONE; }
  const Type* add_id() const { return TypeH::ZERO; }

src/hotspot/share/opto/mulnode.hpp line 160:

> 158:   int max_opcode() const { return Op_MaxHF; }
> 159:   int min_opcode() const { return Op_MinHF; }
> 160:   const Type *bottom_type() const { return Type::HALF_FLOAT; }

Suggestion:

  const Type* bottom_type() const { return Type::HALF_FLOAT; }

src/hotspot/share/opto/subnode.cpp line 1975:

> 1973:   if( f < 0.0f ) return Type::HALF_FLOAT;
> 1974:   return TypeH::make( (float)sqrt( (double)f ) );
> 1975: }

if brackets and asterisks with types please

src/hotspot/share/opto/subnode.hpp line 143:

> 141:   const Type   *bottom_type() const { return Type::HALF_FLOAT; }
> 142:   virtual uint  ideal_reg() const { return Op_RegF; }
> 143: };

Suggestion:

//------------------------------SubHFNode--------------------------------------
// Subtract 2 half floats
class SubHFNode : public SubFPNode {
public:
  SubHFNode(Node* in1, Node* in2) : SubFPNode(in1, in2) {}
  virtual int Opcode() const;
  virtual const Type* sub(const Type *, const Type *) const;
  const Type* add_id() const { return TypeH::ZERO; }
  const Type* bottom_type() const { return Type::HALF_FLOAT; }
  virtual uint  ideal_reg() const { return Op_RegF; }
};

src/hotspot/share/opto/subnode.hpp line 552:

> 550:   }
> 551:   virtual int Opcode() const;
> 552:   const Type *bottom_type() const { return Type::HALF_FLOAT; }

Suggestion:

  const Type* bottom_type() const { return Type::HALF_FLOAT; }

src/hotspot/share/opto/type.cpp line 1487:

> 1485:     typerr(t);
> 1486: 
> 1487:   case HalfFloatCon:                // Float-constant vs Float-constant?

Suggestion:

  case HalfFloatCon:             // Float-constant vs Float-constant?

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/21490#pullrequestreview-2457382009
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855943470
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855944584
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855948500
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855950333
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855954166
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855955074
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855958333
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855958773
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855959025
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855977560
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855981273
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855982405
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855984366
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855985484
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855988545
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855989752
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855992127
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855994876
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855995436
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1855996454
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856000589
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856002336
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856007382
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856006524
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856009749
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856010212
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856010391
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856013278
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856013945
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856014893
PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1856016525

Re: RFR: 8342103: C2 compiler support for Float16 type and associated operations [v2]

Reply via email to