vzakhari added a comment. LGTM Please address @awarzynski's comment about the test.
================ Comment at: flang/include/flang/Tools/CLOptions.inc:235 +/// passes pipeline +inline void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, + llvm::OptimizationLevel optLevel = defaultOptLevel) { ---------------- tblah wrote: > vzakhari wrote: > > Would you mind also calling this in `bbc` driver? > Adding this to bbc will have to wait until after `-emit-fir` and > `-emit-hlfir` are different flags. Otherwise hlfir ops will be lowered to > fir, breaking some tests (and presumably people's workflows). Okay! Thank you for considering it! ================ Comment at: flang/include/flang/Tools/CLOptions.inc:238 + pm.addPass(mlir::createCanonicalizerPass()); + pm.addPass(hlfir::createLowerHLFIRIntrinsicsPass()); + pm.addPass(hlfir::createBufferizeHLFIRPass()); ---------------- tblah wrote: > vzakhari wrote: > > I would imagine we may not want to optimize MATMUL(TRANSPOSE) into > > MATMUL_TRANSPOSE at O0. What is the best way to control this? We may > > either disable canonicalization or let `LowerHLFIRIntrinsicsPass` lower > > MATMUL_TRANSPOSE differently based on the optimization level. Or is it > > always okay to implement it as a combined operation? > So far as I know, there should be no loss to precision from implementing it > as a combined operation. Memory usage should be reduced as we need one fewer > temporary. > > If static linking is used, including MATMUL_TRANSPOSE in the runtime library > will increase code size (so long as both matmul and transpose are also called > elsewhere). I haven't measured this, but I wouldn't expect this to be a large > change relative to the size of a real world application. > > If dynamic linking is used, whether or not this pass runs, MATMUL_TRANSPOSE > will make the runtime library a little larger (there are a lot of template > instantiations, but MATMUL_TRANSPOSE is only one of many similar functions so > the effect as a proportion of the whole shouldn't be much). > > But I'll set the canonicalization pass to only run when we are optimizing for > speed. Later canonicalisation passes (after createLowerHLFIRIntrinsicsPass) > won't find any hlfir.matmul operations to canonicalise and so won't create a > hlfir.matmul_transpose operation. Thank you! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D146278/new/ https://reviews.llvm.org/D146278 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits