On ti, 2017-05-30 at 13:55 +0100, Chris Wilson wrote:
> When looking at simple ioctls coupled with conveniently small user
> parameters, the overhead of the syscall and drm_ioctl() present large
> low hanging fruit. Profiling trivial microbenchmarks around
> i915_gem_busy_ioctl, the low hanging fruit comprises of the call to
> copy_user(). Those calls are only inlined by the macro where the
> constant is known at compile-time, but the ioctl argument size depends
> on the ioctl number. To help the compiler, explicitly add switches for
> the small sizes that expand to simple moves to/from user. Doing the
> multiple inlines does add significant code bloat, so it is very
> debatable as to its value. Back to the trivial, but frequently used,
> example of i915_gem_busy_ioctl() on a Broadwell avoiding the call gives
> us a 15-25% improvement:
> 
>                        before           after
>       single          100.173ns        84.496ns
>       parallel (x4)   204.275ns       152.957ns
> 
> On a baby Broxton nuc:
> 
>                       before           after
>       single          245.355ns       199.477ns
>       parallel (x2)   280.892ns       232.726ns
> 
> Looking at the cost distribution by moving an equivalent switch into
> arch/x86/lib/usercopy, the overhead to the copy user is split almost
> equally between the function call and the actual copy itself. It seems
> copy_user_enhanced_fast_string simply is not that good at small (single
> register) copies. Something as simple as
> 
> @@ -28,6 +28,9 @@ copy_user_generic(void *to, const void *from, unsigned len)
>  {
>         unsigned ret;
> 
> +       if (len <= 16)
> +               return copy_user_generic_unrolled(to, from, len);
> 
> is enough to speed up i915_gem_busy_ioctl() by 10% :|
> 
> Note that this overhead may entirely be x86 specific.
> 
> Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahti...@linux.intel.com>

I think this should be integrated into __copy_{to,from}_user directly,
but in the meanwhile the code is;

Reviewed-by: Joonas Lahtinen <joonas.lahti...@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Reply via email to