Hi Gabriele,

I don't think it's due to the processing being inline. It is more efficient to backproject several projections at once while a piece of the volume is in cache memory. You can check this by changing RTK_CUDA_PROJECTIONS_SLAB_SIZE at CMake configuration time. For an inline reconstruction, the best option is probably to store a few projections before processing them.

Simon
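
For illustration, here is a minimal sketch of storing a few projections before processing them. It is plain C++ and does not use the RTK API: Projection, SlabBuffer and CountBackprojectionCalls are hypothetical names, and the callback stands in for whatever backprojection step the inline pipeline runs. The slab size is assumed to be chosen by the caller, for example to match the value RTK was configured with.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one acquired projection (not an RTK type).
struct Projection
{
  int index;
};

// Accumulates projections and hands them to a callback in groups of slabSize.
class SlabBuffer
{
public:
  using SlabCallback = std::function<void(const std::vector<Projection> &)>;

  SlabBuffer(std::size_t slabSize, SlabCallback backproject)
    : m_SlabSize(slabSize)
    , m_Backproject(std::move(backproject))
  {
    assert(slabSize > 0);
    m_Pending.reserve(slabSize);
  }

  // Call once per projection as it is acquired.
  void Push(const Projection & p)
  {
    m_Pending.push_back(p);
    if (m_Pending.size() == m_SlabSize)
      Flush();
  }

  // Call at the end of the acquisition to process a final, partial slab.
  void Flush()
  {
    if (m_Pending.empty())
      return;
    m_Backproject(m_Pending);
    m_Pending.clear();
  }

private:
  std::size_t             m_SlabSize;
  SlabCallback            m_Backproject;
  std::vector<Projection> m_Pending;
};

// Counts how many times the backprojection callback runs for one acquisition.
std::size_t CountBackprojectionCalls(std::size_t numProjections, std::size_t slabSize)
{
  std::size_t calls = 0;
  SlabBuffer  buffer(slabSize, [&calls](const std::vector<Projection> &) { ++calls; });
  for (std::size_t i = 0; i < numProjections; ++i)
    buffer.Push(Projection{ static_cast<int>(i) });
  buffer.Flush();
  return calls;
}
```

With 40 projections and a slab size of 16, the callback runs three times (16, 16 and 8 projections) instead of 40 times, so the volume is traversed far less often.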
On Mon, Dec 11, 2023 at 12:49 PM Gabriele Belotti <gabriele.belotti.berg...@gmail.com> wrote:

> Dear all,
>
> I'm testing an inline implementation using RTK vs the offline rtkfdk.
> Has anyone else experienced an unexpected reduction in recon/computation
> time using a batch reconstruction approach with N projections being
> processed together rather than processing every projection at collection?
>
> As in, for a slab size of 16 I would expect the processing time to be at
> most 16x faster than backprojecting a single projection and passing to the
> next.
>
> Do you think this is only due to memory transfer bottleneck in cpu to gpu,
> or could there be other reasons? Some tuning done in the filters maybe?
>
> Best,
> Gabriele
> _______________________________________________
> Rtk-users mailing list
> rtk-us...@openrtk.org
> https://www.creatis.insa-lyon.fr/mailman/listinfo/rtk-users