vzakhari added a comment.

LGTM
Please address @awarzynski's comment about the test.



================
Comment at: flang/include/flang/Tools/CLOptions.inc:235
+///   passes pipeline
+inline void createHLFIRToFIRPassPipeline(mlir::PassManager &pm,
+    llvm::OptimizationLevel optLevel = defaultOptLevel) {
----------------
tblah wrote:
> vzakhari wrote:
> > Would you mind also calling this in `bbc`  driver?
> Adding this to bbc will have to wait until after `-emit-fir` and 
> `-emit-hlfir` are different flags. Otherwise hlfir ops will be lowered to 
> fir, breaking some tests (and presumably people's workflows).
Okay! Thank you for considering it!


================
Comment at: flang/include/flang/Tools/CLOptions.inc:238
+  pm.addPass(mlir::createCanonicalizerPass());
+  pm.addPass(hlfir::createLowerHLFIRIntrinsicsPass());
+  pm.addPass(hlfir::createBufferizeHLFIRPass());
----------------
tblah wrote:
> vzakhari wrote:
> > I would imagine we may not want to optimize MATMUL(TRANSPOSE) into 
> > MATMUL_TRANSPOSE at O0.  What is the best way to control this?  We may 
> > either disable canonicalization or let `LowerHLFIRIntrinsicsPass` lower 
> > MATMUL_TRANSPOSE differently based on the optimization level.  Or is it 
> > always okay to implement it as a combined operation?
> So far as I know, there should be no loss to precision from implementing it 
> as a combined operation. Memory usage should be reduced as we need one fewer 
> temporary.
> 
> If static linking is used, including MATMUL_TRANSPOSE in the runtime library 
> will increase code size (so long as both matmul and transpose are also called 
> elsewhere). I haven't measured this, but I wouldn't expect this to be a large 
> change relative to the size of a real world application.
> 
> If dynamic linking is used, whether or not this pass runs, MATMUL_TRANSPOSE 
> will make the runtime library a little larger (there are a lot of template 
> instantiations, but MATMUL_TRANSPOSE is only one of many similar functions so 
> the effect as a proportion of the whole shouldn't be much).
> 
> But I'll set the canonicalization pass to only run when we are optimizing for 
> speed. Later canonicalisation passes (after createLowerHLFIRIntrinsicsPass) 
> won't find any hlfir.matmul operations to canonicalise and so won't create a 
> hlfir.matmul_transpose operation.
Thank you!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146278/new/

https://reviews.llvm.org/D146278

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to