From: IndecisiveTurtle
Prevents the compiler from mistaking it for a string.
Also makes passing it to the GPU in a buffer easier.
---
libavcodec/vc2enc_common.c | 2 +-
libavcodec/vc2enc_common.h | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/libavcodec/vc2enc_common.c b
From: IndecisiveTurtle
---
libavcodec/vulkan/common.comp | 51 ---
1 file changed, 41 insertions(+), 10 deletions(-)
diff --git a/libavcodec/vulkan/common.comp b/libavcodec/vulkan/common.comp
index 10af9c0623..59a4a4b1a8 100644
--- a/libavcodec/vulkan
IndecisiveTurtle
wrote:
>
> From: IndecisiveTurtle
>
> Performance wise, encoding a 3440x1440 1-minute video is performed in about
> 2.4 minutes with the cpu encoder running on my Ryzen 5 4600H, while it takes
> about 1.3 minutes on my NVIDIA GTX 1650
>
> Haar shader has a subgrou
From: IndecisiveTurtle
Performance-wise, encoding a 3440x1440 1-minute video takes about 2.4
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
1.3 minutes on my NVIDIA GTX 1650.
The Haar shader has a subgroup-optimized variant that applies when configured
From: IndecisiveTurtle
---
libavcodec/Makefile| 2 +-
libavcodec/vc2enc.c| 673 ++---
libavcodec/vc2enc_common.c | 575 +++
libavcodec/vc2enc_common.h | 169 ++
4 files changed, 770 insertions(+), 649
> Same benchmarks as v4. Did the switch to put_bits63() not cost performance?
In my tests it did not. I tested with the old uint32_t, then immediately
afterwards with uint64_t, and the times were the same.
On Sat, May 24, 2025 at 2:09 PM, Andreas Rheinhardt
wrote:
>
> IndecisiveTurtle:
From: IndecisiveTurtle
Prevents the compiler from mistaking it for a string.
Also makes passing it to the GPU in a buffer easier.
---
libavcodec/vc2enc_common.c | 2 +-
libavcodec/vc2enc_common.h | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/libavcodec/vc2enc_common.c b
From: IndecisiveTurtle
---
libavcodec/Makefile| 2 +-
libavcodec/vc2enc.c| 669 ++---
libavcodec/vc2enc_common.c | 571 +++
libavcodec/vc2enc_common.h | 168 ++
4 files changed, 765 insertions(+), 645
From: IndecisiveTurtle
Performance-wise, encoding a 3440x1440 1-minute video takes about 2.4
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
1.3 minutes on my NVIDIA GTX 1650.
The Haar shader has a subgroup-optimized variant that applies when configured
From: IndecisiveTurtle
---
libavcodec/vulkan/common.comp | 51 ---
1 file changed, 41 insertions(+), 10 deletions(-)
diff --git a/libavcodec/vulkan/common.comp b/libavcodec/vulkan/common.comp
index 10af9c0623..59a4a4b1a8 100644
--- a/libavcodec/vulkan
t is not implemented in the Vulkan encoder. This is also why I
couldn't unify this array as you mentioned before.
On Mon, May 19, 2025 at 8:09 PM, Andreas Rheinhardt
wrote:
>
> IndecisiveTurtle:
> > From: IndecisiveTurtle
> >
> > Performance wise, encoding a 344
ting code to
> skip_put_bytes(). But this file is (if I am not mistaken) supposed to be
> generic, not vc2 specific, so this feels very wrong.
Would it be enough to move it to vc2_encode.comp or should I also
rename the function?
On Mon, May 19, 2025 at 7:46 PM, Andreas Rheinhardt
wrote:
>
I tried to address all the review comments from the last patchset;
please review in case I missed anything. Thanks.
On Sat, May 17, 2025 at 11:49 PM, IndecisiveTurtle
wrote:
>
> From: IndecisiveTurtle
>
> Performance wise, encoding a 3440x1440 1-minute video is performed in
From: IndecisiveTurtle
Performance-wise, encoding a 3440x1440 1-minute video takes about 2.4
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
1.3 minutes on my NVIDIA GTX 1650.
The Haar shader has a subgroup-optimized variant that applies when configured
From: IndecisiveTurtle
---
libavcodec/vulkan/common.comp | 54 ---
1 file changed, 44 insertions(+), 10 deletions(-)
diff --git a/libavcodec/vulkan/common.comp b/libavcodec/vulkan/common.comp
index 10af9c0623..db216a2ac6 100644
--- a/libavcodec/vulkan
From: IndecisiveTurtle
Prevents the compiler from mistaking it for a string.
Also makes passing it to the GPU in a buffer easier.
---
libavcodec/vc2enc_common.c | 2 +-
libavcodec/vc2enc_common.h | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/libavcodec/vc2enc_common.c b
From: IndecisiveTurtle
---
libavcodec/Makefile| 2 +-
libavcodec/vc2enc.c| 679 ++---
libavcodec/vc2enc_common.c | 571 +++
libavcodec/vc2enc_common.h | 178 ++
4 files changed, 772 insertions(+), 658
From: IndecisiveTurtle
---
libavcodec/Makefile| 2 +-
libavcodec/vc2enc.c| 513 +
libavcodec/vc2enc_common.c | 376 +++
libavcodec/vc2enc_common.h | 196 ++
4 files changed, 581 insertions(+), 506
From: IndecisiveTurtle
Performance-wise, encoding a 1080p 1-minute video takes about 2.5
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
30 seconds on my NVIDIA GTX 1650.
The Haar shader has a subgroup-optimized variant that applies when configured
wavelet
From: IndecisiveTurtle
Performance-wise, encoding a 1080p 1-minute video takes about 2.5
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
30 seconds on my NVIDIA GTX 1650.
The Haar shader has a subgroup-optimized variant that applies when configured
wavelet
From: IndecisiveTurtle
Prevents the compiler from mistaking it for a string.
Also makes passing it to the GPU in a buffer easier.
---
libavcodec/vc2enc_common.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/vc2enc_common.h b/libavcodec/vc2enc_common.h
index eaaf5ac99c
From: IndecisiveTurtle
---
libavcodec/vulkan/common.comp | 54 ---
1 file changed, 44 insertions(+), 10 deletions(-)
diff --git a/libavcodec/vulkan/common.comp b/libavcodec/vulkan/common.comp
index 10af9c0623..db216a2ac6 100644
--- a/libavcodec/vulkan
From: IndecisiveTurtle
---
libavcodec/vc2enc.c| 2 +-
libavcodec/vc2enc_common.c | 18 +-
libavcodec/vc2enc_common.h | 2 +-
3 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/libavcodec/vc2enc.c b/libavcodec/vc2enc.c
index 2e849eb09e..c0f542e116 100644
---
libavcodec/vulkan/common.comp | 58 +++
1 file changed, 46 insertions(+), 12 deletions(-)
diff --git a/libavcodec/vulkan/common.comp b/libavcodec/vulkan/common.comp
index e4e983b3e2..3dc1527529 100644
--- a/libavcodec/vulkan/common.comp
+++ b/libavcodec/vulkan/
Performance-wise, encoding a 1080p 1-minute video takes about 2.5
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
30 seconds on my NVIDIA GTX 1650.
The Haar shader has a subgroup-optimized variant that applies when the configured
wavelet depth allows it
lavapipe
---
libavcodec/vulkan/vc2_dwt_haar.comp | 66 +++
libavcodec/vulkan/vc2_dwt_haar_subgroup.comp | 89 +
libavcodec/vulkan/vc2_dwt_hor_legall.comp| 66 +++
libavcodec/vulkan/vc2_dwt_upload.comp| 29 +++
libavcodec/vulkan/vc2_dwt_ver_legall.comp| 62 +
From: IndecisiveTurtle
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 499f826635..a96c700745 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -768,7 +768,7 @@ OBJS-$(CONFIG_VC1_CUVID_DECODER) += cuviddec.o
OBJS-$(CONFIG_VC1_MMAL_DECODER)+= mmaldec.o
> a) Don't top-post.
Sorry, my mistake. I hope this message is formatted properly now.
> b) If you put a static array in a header, it will be included in every
> file that actually uses it and therefore end up being duplicated in the
> binary.
Ah, I understand now, thanks. I will fix that as well.
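
The duplication described in (b) can be illustrated with a small sketch. The names below (example_table, example_table_sum) are hypothetical and not taken from the patch. In a real tree the declaration would live in the header and the definition in exactly one .c file; both appear in one listing here only so that the sketch compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Header side (hypothetical name): a declaration only, so including
 * the header in many files does not emit any data. */
extern const uint8_t example_table[4];

/* Source side: the single definition, placed in exactly one .c file.
 * Had the header instead contained
 *     static const uint8_t example_table[4] = { ... };
 * every file that includes it and uses the table would carry its own
 * private copy in the final binary. */
const uint8_t example_table[4] = { 1, 2, 4, 8 };

/* Any file that includes the header can read the shared table. */
unsigned example_table_sum(void)
{
    unsigned sum = 0;
    for (size_t i = 0; i < sizeof(example_table) / sizeof(example_table[0]); i++)
        sum += example_table[i];
    return sum;
}
```

With this split, the table exists once in the binary no matter how many files include the header.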
dreas.rheinha...@outlook.com> wrote:
> IndecisiveTurtle:
> > From: IndecisiveTurtle
> >
> > ---
> > libavcodec/Makefile| 2 +-
> > libavcodec/vc2enc.c| 515 +
> > libavcodec/vc2enc_common.c | 321 ++
---
libavcodec/vulkan/common.comp | 58 +++
1 file changed, 46 insertions(+), 12 deletions(-)
diff --git a/libavcodec/vulkan/common.comp b/libavcodec/vulkan/common.comp
index e4e983b3e2..3dc1527529 100644
--- a/libavcodec/vulkan/common.comp
+++ b/libavcodec/vulkan/
Performance-wise, encoding a 1080p 1-minute video takes about 2.5
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
30 seconds on my NVIDIA GTX 1650.
The Haar shader has a subgroup-optimized variant that applies when the configured
wavelet depth allows it
lavapipe
---
libavcodec/vulkan/vc2_dwt_haar.comp | 70 +++
libavcodec/vulkan/vc2_dwt_haar_subgroup.comp | 89 +
libavcodec/vulkan/vc2_dwt_hor_legall.comp| 66 +++
libavcodec/vulkan/vc2_dwt_upload.comp| 29 +++
libavcodec/vulkan/vc2_dwt_ver_legall.comp| 62 +
From: IndecisiveTurtle
---
libavcodec/Makefile| 2 +-
libavcodec/vc2enc.c| 515 +
libavcodec/vc2enc_common.c | 321 +++
libavcodec/vc2enc_common.h | 323 +++
4 files changed, 653 insertions(+), 508
Useful when creating a descriptor array of separate images
---
libavutil/vulkan.c | 12 ++--
libavutil/vulkan.h | 8
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/libavutil/vulkan.c b/libavutil/vulkan.c
index 31610e2d94..91415957fd 100644
--- a/libavutil/vulkan.
From: IndecisiveTurtle <47210458+raphaelthegr...@users.noreply.github.com>
Useful when creating a descriptor array of separate images
---
libavutil/vulkan.c | 12 ++--
libavutil/vulkan.h | 8
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/libavutil/vulk
Implements a Vulkan-based Dirac encoder. Supports Haar and Legall wavelets and
should work with all wavelet depths.
Performance-wise, encoding a 1080p 1-minute video takes about 2.5
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
30 seconds on my NVIDIA
According to the GL_EXT_buffer_reference spec, the alignment
"must be a power of two and be greater than or equal to the largest
scalar/component type in the block."
This means that by using u32vec2 we can drop the alignment requirement from 8 bytes
to 4 bytes
and save a pack64 call in reverse8 (though I
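
The alignment argument can be shown with a rough C analogue. This is not the GLSL from the patch: U32Pair is a hypothetical stand-in for u32vec2, and pack_pair assumes that the first component holds the low 32 bits. The point is only that a block made of 32-bit components needs no more than 32-bit alignment, while a 64-bit scalar may need more.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C stand-in for GLSL's u32vec2: two 32-bit components. */
typedef struct U32Pair {
    uint32_t x; /* assumed to hold the low 32 bits */
    uint32_t y; /* assumed to hold the high 32 bits */
} U32Pair;

/* The struct's alignment follows its largest scalar member (uint32_t). */
size_t pair_alignment(void) { return _Alignof(U32Pair); }

/* A 64-bit scalar typically requires 8-byte alignment instead. */
size_t u64_alignment(void) { return _Alignof(uint64_t); }

/* Analogue of a pack64-style step: only needed when the value must be
 * handled as a single 64-bit integer rather than as two halves. */
uint64_t pack_pair(U32Pair v)
{
    return ((uint64_t)v.y << 32) | v.x;
}
```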
---
libavcodec/vulkan/common.comp | 5 +
1 file changed, 5 insertions(+)
diff --git a/libavcodec/vulkan/common.comp b/libavcodec/vulkan/common.comp
index e0874d304f..799aa86f64 100644
--- a/libavcodec/vulkan/common.comp
+++ b/libavcodec/vulkan/common.comp
@@ -172,3 +172,8 @@ uint64_t put_bits
If the caller wrote a number of bits divisible by eight, it would write an extra
byte.
Also increment by to_write instead of BUF_BYTES, which pads the bitstream excessively.
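
One way such an off-by-one can arise is sketched below in C. This is not the shader code from the patch: bytes_buggy and bytes_fixed are hypothetical helpers that only show how a byte count derived from a bit count goes wrong when the bit count is a multiple of eight.

```c
#include <stdint.h>

/* Hypothetical buggy variant: always adds one byte after the division,
 * so 16 pending bits are counted as 3 bytes instead of 2. */
uint32_t bytes_buggy(uint32_t bits)
{
    return bits / 8 + 1;
}

/* Ceiling division: a partial byte is rounded up, while a bit count
 * that is already a multiple of eight gets no extra byte. */
uint32_t bytes_fixed(uint32_t bits)
{
    return (bits + 7) / 8;
}
```

The two helpers agree for every bit count that is not a multiple of eight, which is why a bug of this kind can go unnoticed.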
---
libavcodec/vulkan/common.comp | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavcodec/vulkan/common.com
Implements a Vulkan-based Dirac encoder. Supports Haar and Legall wavelets and
should work with all wavelet depths.
Performance-wise, encoding a 1080p 1-minute video takes about 2.5
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
30 seconds on my NVIDIA
Implements a Vulkan-based Dirac encoder. Supports Haar and Legall wavelets and
should work with all wavelet depths.
Performance-wise, encoding a 1080p 1-minute video takes about 2.5
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
30 seconds on my NVIDIA
Small cleanup: only blocks[0] ever seems to be used
---
libavcodec/proresenc_kostya.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/libavcodec/proresenc_kostya.c b/libavcodec/proresenc_kostya.c
index 226f95f8c6..1a09488cd7 100644
--- a/libavcodec/proresenc_kostya
Implements a Vulkan-based Dirac encoder. Supports Haar and Legall wavelets and
should work with all wavelet depths.
Performance-wise, encoding a 1080p 1-minute video takes about 2.5
minutes with the CPU encoder running on my Ryzen 5 4600H, while it takes about
30 seconds on my NVIDIA
---
libavcodec/Makefile | 7 +++
1 file changed, 7 insertions(+)
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 4eed81ed03..734ab14596 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -1371,3 +1371,10 @@ $(SUBDIR)pcm.o: $(SUBDIR)pcm_tables.h
$(SUBDIR)qdm2.o: $(SUBD
Needed to prevent crashes with the vc2 Vulkan encoder patch
---
libavutil/vulkan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavutil/vulkan.c b/libavutil/vulkan.c
index 44552e97b8..cd617496dc 100644
--- a/libavutil/vulkan.c
+++ b/libavutil/vulkan.c
@@ -2022,7 +2022,7 @@
45 matches