The optimization pragma in h274.c results in warnings on compilers which don't support it. Objections to it were raised during the review process but went unnoticed, and the speed benefit is highly compiler- and version-specific, and also not very critical.

We generally hand-write assembly to optimize loops like that rather than rely on compiler magic, and for a 40% speedup in the best case, it's simply not worth it. Plus, tree vectorization is still problematic with GCC and is disabled by default for a good reason, so enabling it locally is sketchy.

Patch attached.

From 6480a3c4079f9993139ae167019d95f9e9b22ea8 Mon Sep 17 00:00:00 2001
From: Lynne <d...@lynne.ee>
Date: Wed, 25 Aug 2021 21:20:18 +0200
Subject: [PATCH] h274: remove optimization pragma

This results in warnings on compilers which don't support it,
objections were raised during the review process about it but went
unnoticed, and the speed benefit is highly compiler and version
specific, and also not very critical.

We generally hand-write assembly to optimize loops like that, rather
than use compiler magic, and for 40% best case scenario, it's simply
not worth it.

Plus, tree vectorization is still problematic with GCC and disabled
by default for a good reason, so enabling it locally is sketchy.
---
 libavcodec/h274.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/libavcodec/h274.c b/libavcodec/h274.c
index 0efc00ca1d..5e2cf150ea 100644
--- a/libavcodec/h274.c
+++ b/libavcodec/h274.c
@@ -30,10 +30,6 @@
 
 #include "h274.h"
 
-// The code in this file has a lot of loops that vectorize very well, this is
-// about a 40% speedup for no obvious downside.
-#pragma GCC optimize("tree-vectorize")
-
 static const int8_t Gaussian_LUT[2048+256];
 static const uint32_t Seed_LUT[256];
 static const int8_t R64T[64][64];
-- 
2.33.0.252.g9b09ab0cd71