On Wednesday, July 13, 2022 at 06:16:15 PM GMT+2, James Almer 
<jamr...@gmail.com> wrote: 

On 7/13/2022 12:54 PM, Marco Vianini wrote:
> Sorry, my mail client was using HTML format.
> I hope the mail will be sent correctly now.
> 
> 
> You can get a large performance improvement in the special (but very 
> likely) case of "(dst_linesize == bytewidth && src_linesize == bytewidth)".
> 
> In this case the rows are contiguous in memory, so we can "coalesce rows": 
> use a single memcpy for the whole plane instead of one smaller memcpy per 
> row (that is, looping height times).
> 
> Code:
> "
> static void image_copy_plane(uint8_t       *dst, ptrdiff_t dst_linesize,
>                               const uint8_t *src, ptrdiff_t src_linesize,
>                               ptrdiff_t bytewidth, int height)
> {
>      if (!dst || !src)
>          return;
>      av_assert0(abs(src_linesize) >= bytewidth);
>      av_assert0(abs(dst_linesize) >= bytewidth);
>      
>      /// MY PATCH START
>      /// Coalesce rows.
>      if (dst_linesize == bytewidth && src_linesize == bytewidth) {
>          bytewidth *= height;
>          height = 1;
>          src_linesize = dst_linesize = 0;
>      }
>      /// MY PATCH STOP
> 
>      for (;height > 0; height--) {
>          memcpy(dst, src, bytewidth);
>          dst += dst_linesize;
>          src += src_linesize;
>      }
> }
> "
> 
> 
> I ran the following tests on 64-bit Windows 10.
> I compiled the code in Release mode.
> I copied frames from my PC camera 1000 times (resolution 1920x1080):
> 
> With Coalesce:
> copy_cnt=100  size=1920x1080 tot_time_copy(us)=36574 (average=365.74)
> copy_cnt=200  size=1920x1080 tot_time_copy(us)=78207 (average=391.035)
> copy_cnt=300  size=1920x1080 tot_time_copy(us)=122170(average=407.233)
> copy_cnt=400  size=1920x1080 tot_time_copy(us)=163678(average=409.195)
> copy_cnt=500  size=1920x1080 tot_time_copy(us)=201872(average=403.744)
> copy_cnt=600  size=1920x1080 tot_time_copy(us)=246174(average=410.29)
> copy_cnt=700  size=1920x1080 tot_time_copy(us)=287043(average=410.061)
> copy_cnt=800  size=1920x1080 tot_time_copy(us)=326462(average=408.077)
> copy_cnt=900  size=1920x1080 tot_time_copy(us)=356882(average=396.536)
> copy_cnt=1000 size=1920x1080 tot_time_copy(us)=394566(average=394.566)
> 
> Without Coalesce:
> copy_cnt=100  size=1920x1080 tot_time_copy(us)=44303 (average=443.03)
> copy_cnt=200  size=1920x1080 tot_time_copy(us)=100501(average=502.505)
> copy_cnt=300  size=1920x1080 tot_time_copy(us)=150097(average=500.323)
> copy_cnt=400  size=1920x1080 tot_time_copy(us)=201010(average=502.525)
> copy_cnt=500  size=1920x1080 tot_time_copy(us)=256818(average=513.636)
> copy_cnt=600  size=1920x1080 tot_time_copy(us)=303273(average=505.455)
> copy_cnt=700  size=1920x1080 tot_time_copy(us)=359152(average=513.074)
> copy_cnt=800  size=1920x1080 tot_time_copy(us)=414413(average=518.016)
> copy_cnt=900  size=1920x1080 tot_time_copy(us)=465315(average=517.017)
> copy_cnt=1000 size=1920x1080 tot_time_copy(us)=520381(average=520.381)
> 
> 
> I think the results are very good.
> What do you think?

It looks like a good speed up, but we need a patch created with git 
format-patch that can be applied to the source tree to properly review 
this. Can you send that?
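
For anyone who wants to reproduce numbers like the ones quoted above outside the source tree, here is a minimal timing sketch. It is not the harness used for the quoted measurements, which was not posted. The names copy_plane and average_copy_us are hypothetical, the plane is assumed to be one byte per pixel with no padding (linesize == bytewidth), clock() is used as a coarse portable timer, and the height > 0 guard is an addition of this sketch, not part of the proposed change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Plane copy with the coalescing fast path switchable at run time.
 * Hypothetical stand-in for image_copy_plane(), not the FFmpeg function. */
static void copy_plane(uint8_t *dst, ptrdiff_t dst_linesize,
                       const uint8_t *src, ptrdiff_t src_linesize,
                       ptrdiff_t bytewidth, int height, int coalesce)
{
    /* The height > 0 guard is an assumption added here, so that a
     * non-positive height never turns into a bogus memcpy size. */
    if (coalesce && height > 0 &&
        dst_linesize == bytewidth && src_linesize == bytewidth) {
        bytewidth *= height;
        height = 1;
        src_linesize = dst_linesize = 0;
    }
    for (; height > 0; height--) {
        memcpy(dst, src, (size_t)bytewidth);
        dst += dst_linesize;
        src += src_linesize;
    }
}

/* Returns the average time per copy in microseconds,
 * or -1.0 on allocation failure or a mismatching copy. */
static double average_copy_us(int width, int height, int copy_cnt, int coalesce)
{
    assert(width > 0 && height > 0 && copy_cnt > 0);

    size_t size = (size_t)width * (size_t)height;
    uint8_t *src = malloc(size);
    uint8_t *dst = malloc(size);
    double avg = -1.0;

    if (src && dst) {
        memset(src, 0x5a, size);
        clock_t start = clock();
        for (int i = 0; i < copy_cnt; i++) {
            src[0] = (uint8_t)i; /* change the input on every iteration */
            copy_plane(dst, width, src, width, width, height, coalesce);
        }
        clock_t stop = clock();
        /* Only report a time if the last copy really reached dst. */
        if (memcmp(src, dst, size) == 0)
            avg = (double)(stop - start) * 1e6 / CLOCKS_PER_SEC / copy_cnt;
    }
    free(src);
    free(dst);
    return avg;
}
```

Calling average_copy_us(1920, 1080, 1000, 1) and average_copy_us(1920, 1080, 1000, 0) gives one pair of averages to compare. Results will depend heavily on the compiler, the C library's memcpy, and cache state, so several runs are needed before drawing conclusions.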

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


I generated the .eml file with "git format-patch" (see attachment).
Is that OK?
Thanks
--- Begin Message ---
Signed-off-by: Marco Vianini <marco_vian...@yahoo.it>
---
 libavutil/imgutils.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
index 9ab5757cf6..9ccb398a3b 100644
--- a/libavutil/imgutils.c
+++ b/libavutil/imgutils.c
@@ -349,6 +349,14 @@ static void image_copy_plane(uint8_t       *dst, ptrdiff_t dst_linesize,
         return;
     av_assert0(FFABS(src_linesize) >= bytewidth);
     av_assert0(FFABS(dst_linesize) >= bytewidth);
+
+    if (dst_linesize == bytewidth && src_linesize == bytewidth) {
+        /** Coalesce rows in this specific case, for performance improvement */
+        bytewidth *= height;
+        height = 1;
+        src_linesize = dst_linesize = 0;
+    }
+
     for (;height > 0; height--) {
         memcpy(dst, src, bytewidth);
         dst += dst_linesize;
-- 
2.30.0.windows.2


--- End Message ---
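
As a sanity check on the idea in the patch above, the following standalone sketch compares a row-by-row copy against a coalesced copy on the same buffers. It is an illustration only, written outside the FFmpeg tree: copy_plane_rows, copy_plane_coalesced and copies_match are hypothetical names, and the height > 0 guard is an assumption added here (without it, a negative height combined with matching linesizes would make bytewidth negative and reach memcpy as a huge size, whereas the plain loop simply does nothing).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Row-by-row reference copy, mirroring the loop that exists before the patch. */
static void copy_plane_rows(uint8_t *dst, ptrdiff_t dst_linesize,
                            const uint8_t *src, ptrdiff_t src_linesize,
                            ptrdiff_t bytewidth, int height)
{
    for (; height > 0; height--) {
        memcpy(dst, src, (size_t)bytewidth);
        dst += dst_linesize;
        src += src_linesize;
    }
}

/* Same copy with the coalescing fast path in front of the loop.
 * The height > 0 guard is an addition of this sketch. */
static void copy_plane_coalesced(uint8_t *dst, ptrdiff_t dst_linesize,
                                 const uint8_t *src, ptrdiff_t src_linesize,
                                 ptrdiff_t bytewidth, int height)
{
    if (height > 0 &&
        dst_linesize == bytewidth && src_linesize == bytewidth) {
        bytewidth *= height; /* one memcpy covers the whole plane */
        height = 1;
        src_linesize = dst_linesize = 0;
    }
    copy_plane_rows(dst, dst_linesize, src, src_linesize, bytewidth, height);
}

/* Returns 1 if both copies produce identical buffers, 0 if they differ,
 * -1 on allocation failure. Source and destination share one linesize. */
static int copies_match(ptrdiff_t bytewidth, ptrdiff_t linesize, int height)
{
    assert(bytewidth > 0 && linesize >= bytewidth && height > 0);

    size_t size = (size_t)linesize * (size_t)height;
    uint8_t *src = malloc(size);
    uint8_t *ref = calloc(size, 1);
    uint8_t *out = calloc(size, 1);
    int ok = -1;

    if (src && ref && out) {
        for (size_t i = 0; i < size; i++)
            src[i] = (uint8_t)(i * 31u + 7u); /* arbitrary test pattern */
        copy_plane_rows(ref, linesize, src, linesize, bytewidth, height);
        copy_plane_coalesced(out, linesize, src, linesize, bytewidth, height);
        ok = memcmp(ref, out, size) == 0;
    }
    free(src);
    free(ref);
    free(out);
    return ok;
}
```

copies_match(64, 64, 16) exercises the fast path (no padding between rows), while copies_match(60, 64, 16) has padded rows and falls through to the ordinary loop; both are expected to report identical output.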