Hi,

Thanks, this looks mostly ok now.

There were a few minor issues left that I can fix up before pushing. There were a number of cases with register restoring like this:

        ldr             x30, [sp]
        ldp             x4, x6, [sp, #16]
        ldp             x0, x1, [sp, #32]
        add             sp, sp, #48

Here we should fold the sp update ino the load as well, like this:

        ldp             x4, x6, [sp, #16]
        ldp             x0, x1, [sp, #32]
        ldr             x30, [sp], #48

In a few cases, this wasn't possible, due to the location of the register that is being restored, like this:

        ldr             x30, [sp, #56]
        add             sp, sp, #64

For the most idiomatic aarch64 assembly, I think it would be good to restructure it to keep x30 at the bottom of this area, to allow using the same pattern for restores here too.

I pushed it with these changes. A later patch to restructure the register saves to avoid the separate "add sp" would be appreciated though. I quite certainly doesn't matter from a real-world performance perspective, but it would make the code more idiomatic.

// Martin




On Sat, 23 Sep 2023, Logan.Lyu wrote:

Hi, Martin,

Thanks for your review.

Thanks for the patches. Functionally, they seem to work, and the issues i saw in the code are relatively minor. Unfortunately, some of the issues are issues that we've been through in many earlier patches, so I would hope that you would pay attention to them in the future before posting more patches.
Okay, I have noticed the previous issues and made some modifications according to the issues, And I have completed the modifications based on your comments.

If there are any missing issues that have not been corrected, please let me know.



在 2023/9/17 5:46, Martin Storsjö 写道:
On Thu, 14 Sep 2023, Logan.Lyu wrote:

Hi Martin,

You can try the attached patchset. If that doesn't work, My code branch address is https://github.com/myais2023/FFmpeg/tree/hevc-aarch64

Thanks for the patches. Functionally, they seem to work, and the issues i saw in the code are relatively minor. Unfortunately, some of the issues are issues that we've been through in many earlier patches, so I would hope that you would pay attention to them in the future before posting more patches.


In patch 1, you've got a bunch of sxtw instructions for src/dst stride parameters that have the type ptrdiff_t - that shouldn't be necessary?

In patch 2, you're moving the macros calc_epelh, calc_epelh2, load_epel_filterh - can you split out the move into a separate commit? (This isn't strictly necessary but would make things even clearer.)

In patch 2, you're storing below the stack, then decrementing it afterwards - e.g. like this:

+        stp             x0, x30, [sp, #-16]
+        stp             x1, x2, [sp, #-32]
+        stp             x3, x4, [sp, #-48]
+        stp             x5, x6, [sp, #-64]!

Please change that so that you're first predecrementing the whole area, then storing the other elements above that stack pointer, e.g. like this:

stp x0, x30, [sp, #-64]!
stp x1, x2, [sp, #16]
stp x3, x4, [sp, #32]

etc.

The same issue also appears in variouos places within functions like this:

+        stp             x0, x1, [sp, #-16]
+        stp             x4, x6, [sp, #-32]
+        stp             xzr, x30, [sp, #-48]!

Please fix all of these cases - you can search through your patches for anything related to storing on the stack. Also, storing xzr here seems superfluous - if you've got an odd number of registers to store, just make one instruction str instead of stp (but keep the stack aligned).

Then in patch 4, you've got yet another pattern for doing these stores, where you have superfluous consecutive stack decrements like this:

+        stp             x6, x30, [sp, #-16]!
+        mov             x7, #16
+        stp             x0, x1, [sp, #-16]!
+        stp             x2, x3, [sp, #-16]!
+        stp             x4, x5, [sp, #-16]!

Please just do one stack decrement covering all the stack space you need.

I believe these issues have been raised in earlier reviews as well.

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to