On Fri, Jun 15, 2018 at 9:34 PM Peter Zijlstra <pet...@infradead.org> wrote:
> Didn't we recently do a bunch of crypto patches to help with this?
>
> I think they had the pattern:
>
>	kernel_fpu_begin();
>	for (units-of-work) {
>		do_unit_of_work();
>		if (need_resched()) {
>			kernel_fpu_end();
>			cond_resched();
>			kernel_fpu_begin();
>		}
>	}
>	kernel_fpu_end();

Right, so that's the thing -- this is an optimization easily available to individual crypto primitives. But I'm interested in applying this kind of optimization to an entire queue of, say, tiny packets, where each packet is processed individually. Or to a cryptographic construction where several different primitives are used, such that it'd be meaningful not to take the performance hit of kernel_fpu_end()/kernel_fpu_begin() in between each and every one of them.
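
To make the queue case concrete, here is a rough userspace sketch of the difference, not actual kernel code. The kernel_fpu_begin(), kernel_fpu_end(), need_resched() and cond_resched() functions below are stubs standing in for the real kernel API so that the example compiles on its own. struct packet, encrypt_packet_simd() and the two process_queue_*() helpers are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel API (assumption: stubs only). */
static bool fpu_in_use;              /* true while inside a begin/end section */
static unsigned int fpu_begin_calls; /* counts how often FPU state is claimed */
static bool resched_pending;         /* test knob: pretend the scheduler wants the CPU */

static void kernel_fpu_begin(void)
{
	assert(!fpu_in_use);         /* sections must not nest in this sketch */
	fpu_in_use = true;
	fpu_begin_calls++;
}

static void kernel_fpu_end(void)
{
	assert(fpu_in_use);
	fpu_in_use = false;
}

static bool need_resched(void) { return resched_pending; }
static void cond_resched(void) { resched_pending = false; }

/* Hypothetical tiny packet. */
struct packet {
	unsigned char data[64];
	size_t len;
};

/* Hypothetical SIMD primitive: must be called with the FPU section held. */
static void encrypt_packet_simd(struct packet *p)
{
	assert(fpu_in_use);
	for (size_t i = 0; i < p->len; i++)
		p->data[i] ^= 0x5a;  /* placeholder for real crypto work */
}

/* One begin/end pair per packet: the cost I would like to avoid. */
static void process_queue_per_packet(struct packet *q, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		kernel_fpu_begin();
		encrypt_packet_simd(&q[i]);
		kernel_fpu_end();
	}
}

/* One begin/end pair for the whole queue, yielding only when asked to. */
static void process_queue_batched(struct packet *q, size_t n)
{
	if (n == 0)
		return;

	kernel_fpu_begin();
	for (size_t i = 0; i < n; i++) {
		encrypt_packet_simd(&q[i]);
		/* Drop the FPU section only if a reschedule is pending. */
		if (i + 1 < n && need_resched()) {
			kernel_fpu_end();
			cond_resched();
			kernel_fpu_begin();
		}
	}
	kernel_fpu_end();
}
```

With no reschedule pending, the per-packet variant claims the FPU once per packet, while the batched variant claims it once for the whole queue. The same shape would apply to a construction that chains several primitives inside one section.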