I have a MatShell object that I want to convert to a MATDENSECUDA matrix.
Normally, I use MatComputeOperator for this. However, I would now also like
to use a CUDA Graph so that all the calls to MatMult are captured. I can
wrap a loop like
    for (int i = 0; i < N; i++)
        MatMult(A, x, y);
in a CUDA Graph, and it runs fine. If I try to wrap MatComputeOperator in a
graph, I get runtime errors like
    cuda error 906 (cudaErrorStreamCaptureImplicit) : operation would make the
    legacy stream depend on a capturing blocking stream
I tried modifying the MatConvert_Shell routine so that only the main for loop
is captured in the graph, but that still gives the same errors. Is there a way to
use CUDA Graphs here (either through a modified MatConvert_Shell or
otherwise)?
Thanks,
Sreeram