From: Rémi Denis-Courmont
RISC-V defines the CLZ instruction as part of the Zbb subset of the
bit manipulation extension (B). We can detect it from the __riscv_zbb
predefined constant. It will be non-zero if supported, zero if enabled
in the compiler flags but not supported by the compiler, and und
From: Rémi Denis-Courmont
RISC-V defines the CLZ instruction as part of the Zbb subset of the
bit manipulation extension (B). We can detect it from the __riscv_zbb
predefined constant. It will be non-zero if supported, zero if enabled
in the compiler flags but not supported by the compiler, and und
From: Rémi Denis-Courmont
If the target supports the Basic bit-manipulation (Zbb) extension, then
REV8 is available to reverse byte order. Note that this instruction
only exists at the "XLEN" register size (available as __riscv_xlen).
---
libavutil/bswap.h | 2 ++
libavutil/riscv/bswap.h
From: Rémi Denis-Courmont
There is no particular reason to force the compiler to use the same
register as output and input operand. Doing so forces an extra MOV
instruction if the input value needs to be reused after the swap.
In most cases, this makes no difference, as the compiler will select
From: Rémi Denis-Courmont
There is no particular reason to force the compiler to use the same
register as output and input operand. Doing so forces an extra MOV
instruction if the input value needs to be reused after the swap.
In most cases, this makes no difference, as the compiler will select
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.h | 33 +
1 file changed, 33 insertions(+)
create mode 100644 libavutil/riscv/asm.h
diff --git a/libavutil/riscv/asm.h b/libavutil/riscv/asm.h
new file mode 100644
index 00..31001b8bdb
--- /dev/null
+++ b
From: Rémi Denis-Courmont
RVV defines a total of 12 different extensions: V, Zvl32b, Zvl64b,
Zvl128b, Zvl256b, Zvl512b, Zvl1024b, Zve32x, Zve32f, Zve64x, Zve64f and
Zve64d.
At this stage, we don't care about the vector length extensions Zvl*,
as most or all optimisations will be running in a loo
From: Rémi Denis-Courmont
This is based on existing code from the VLC git tree, though the size
and scalar arguments are swapped.
---
libavutil/float_dsp.c | 2 ++
libavutil/float_dsp.h | 1 +
libavutil/riscv/Makefile | 4 ++-
libavutil/riscv/float_dsp_init.c | 4
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 9 -
libavutil/riscv/float_dsp_rvv.S | 34
2 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 279412c0
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 6 +
libavutil/riscv/float_dsp_rvv.S | 42
2 files changed, 48 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 4135284c76..a1bb112ec7 1006
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 19 +++
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index a1bb112ec7..8539fe9ac5 100644
--- a/libavu
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 18 ++
2 files changed, 20 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 8539fe9ac5..2165394585 100644
--- a/libavuti
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 22 ++
2 files changed, 25 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 2165394585..1183460181 100644
--- a/lib
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 35
2 files changed, 38 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 1183460181..887706d899 100644
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 23 +++
2 files changed, 25 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 887706d899..7c2fc10e99 100644
--- a/lib
From: Rémi Denis-Courmont
RVV defines a total of 12 different extensions: V, Zvl32b, Zvl64b,
Zvl128b, Zvl256b, Zvl512b, Zvl1024b, Zve32x, Zve32f, Zve64x, Zve64f and
Zve64d.
At this stage, we don't expose the vector length extensions Zvl*, as
the vector length is most commonly determined at run-t
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.S | 33 +
1 file changed, 33 insertions(+)
create mode 100644 libavutil/riscv/asm.S
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 00..31001b8bdb
--- /dev/null
+++ b
From: Rémi Denis-Courmont
This is based on existing code from the VLC git tree with two minor
changes to account for the different function prototypes.
---
libavutil/float_dsp.c | 2 ++
libavutil/float_dsp.h | 1 +
libavutil/riscv/Makefile | 4 ++-
libavutil/risc
From: Rémi Denis-Courmont
This uses the architected RISC-V 64-bit cycle counter from the
RISC-V unprivileged instruction set.
In 64-bit and 128-bit, this is a straightforward CSR read.
In 32-bit mode, the 64-bit value is exposed as two CSRs, which
cannot be read atomically, so a loop is necessar
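
A minimal sketch of the retry loop the message describes for 32-bit mode, not the patch's actual code. The names read_counter64, sim_read_hi and sim_read_lo are hypothetical. The simulated counter only stands in for the two CSRs so that the sketch runs anywhere; on real RV32 hardware the two halves would presumably be read with rdcycleh and rdcycle.

```c
#include <stdint.h>

/* Hypothetical callback type: reads one 32-bit half of a 64-bit counter. */
typedef uint32_t (*read_half_fn)(void);

/* Combine two halves that cannot be read atomically: read high, low, then
 * high again, and retry if the high half changed in between (meaning the
 * low half wrapped around during the sequence). */
static uint64_t read_counter64(read_half_fn read_hi, read_half_fn read_lo)
{
    uint32_t hi, lo, check;

    do {
        hi    = read_hi();
        lo    = read_lo();
        check = read_hi();
    } while (hi != check);

    return ((uint64_t)hi << 32) | lo;
}

/* Simulated counter (assumption, for illustration only): it starts just
 * before a low-half wrap and advances by one tick on every read. */
static uint64_t sim_counter = 0x1FFFFFFFFULL;

static uint32_t sim_read_hi(void)
{
    return (uint32_t)(sim_counter++ >> 32);
}

static uint32_t sim_read_lo(void)
{
    return (uint32_t)(sim_counter++);
}
```

Without the re-read, the first pass over the simulated counter would pair the old high half (1) with the already wrapped low half (0), giving a value roughly 2^32 ticks in the past. The loop discards that pass and returns a consistent pair.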
From: Rémi Denis-Courmont
---
doc/optimization.txt | 5 +
1 file changed, 5 insertions(+)
diff --git a/doc/optimization.txt b/doc/optimization.txt
index 974e2f9af2..3ed29fe38c 100644
--- a/doc/optimization.txt
+++ b/doc/optimization.txt
@@ -267,6 +267,11 @@ CELL/SPU:
http://www-01.ibm.com
From: Rémi Denis-Courmont
This uses the architected RISC-V 64-bit cycle counter from the
RISC-V unprivileged instruction set.
In 64-bit and 128-bit, this is a straightforward CSR read.
In 32-bit mode, the 64-bit value is exposed as two CSRs, which
cannot be read atomically, so a loop is necessar
From: Rémi Denis-Courmont
RISC-V defines the CLZ instruction as part of the ratified Zbb subset
of the (not yet ratified) bit manipulation extension (B). We can detect
it from the __riscv_zbb predefined constant. At least GCC 12 already
supports this correctly.
Note that the macro will be non-zero
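
As a rough illustration of the detection described above (a sketch, not the patch itself): the function name sketch_clz is made up, and the portable fallback is only an assumption about what a non-Zbb build might do.

```c
#include <limits.h>

/* Count leading zeros of a non-zero unsigned int. */
static inline int sketch_clz(unsigned int x)
{
#if defined(__riscv_zbb) && __riscv_zbb
    /* Zbb is enabled and supported: the compiler may lower the builtin
     * to a single count-leading-zeros instruction. The builtin's result
     * is undefined for x == 0, so callers must pass a non-zero value. */
    return __builtin_clz(x);
#else
    /* Portable fallback: scan from the most significant bit down. */
    unsigned int mask = 1u << (sizeof(x) * CHAR_BIT - 1);
    int n = 0;

    while (mask && !(x & mask)) {
        n++;
        mask >>= 1;
    }
    return n;
#endif
}
```

Testing both that the macro is defined and that it is non-zero covers the case where the macro exists but evaluates to zero.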
From: Rémi Denis-Courmont
If the target supports the Basic bit-manipulation (Zbb) extension, then
the REV8 instruction is available to reverse byte order.
Note that this instruction only exists at the "XLEN" register size,
so we need to right shift the result down to the data width.
If Zbb is n
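
The right shift mentioned above can be sketched as follows, assuming XLEN = 64. The helper rev8_xlen64 is a hypothetical plain-C stand-in for the REV8 instruction, and the sketch_bswap* names are illustrative rather than the names used in the patch.

```c
#include <stdint.h>

/* Stand-in for REV8 on a 64-bit register: reverses all eight bytes. */
static inline uint64_t rev8_xlen64(uint64_t x)
{
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x << 32) | (x >> 32);
}

/* A full-width reversal leaves the swapped bytes in the top of the
 * register, so shift right by (XLEN - data width) to bring them down. */
static inline uint32_t sketch_bswap32(uint32_t x)
{
    return (uint32_t)(rev8_xlen64(x) >> (64 - 32));
}

static inline uint16_t sketch_bswap16(uint16_t x)
{
    return (uint16_t)(rev8_xlen64(x) >> (64 - 16));
}
```

On a 32-bit target the same idea would apply with shifts of 0 and 16 bits respectively.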
From: Rémi Denis-Courmont
This provides some micro-optimisations for signed integer clipping, and
support for bit weight with the Zbb extension.
---
libavutil/intmath.h | 5 +-
libavutil/riscv/intmath.h | 99 +++
2 files changed, 102 insertions(+), 2 de
From: Rémi Denis-Courmont
This provides some micro-optimisations for signed integer clipping, and
support for bit weight with the Zbb extension.
---
libavutil/intmath.h | 5 +-
libavutil/riscv/intmath.h | 103 ++
2 files changed, 106 insertions(+), 2 d
From: Rémi Denis-Courmont
RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d: Z
From: Rémi Denis-Courmont
---
tests/checkasm/checkasm.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index e56fd3850e..a5d0503811 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -226,6 +226,11 @@ static c
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 19 +++
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index a1bb112ec7..8539fe9ac5 100644
--- a/libavu
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 6 +
libavutil/riscv/float_dsp_rvv.S | 38
2 files changed, 44 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 4135284c76..a1bb112ec7 1006
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.S | 74 +++
1 file changed, 74 insertions(+)
create mode 100644 libavutil/riscv/asm.S
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 00..7623c161cf
--- /dev/
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 18 ++
2 files changed, 20 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 8539fe9ac5..2165394585 100644
--- a/libavuti
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 22 ++
2 files changed, 25 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 2165394585..1183460181 100644
--- a/lib
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 35
2 files changed, 38 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 1183460181..887706d899 100644
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 21 +
2 files changed, 23 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 887706d899..7c2fc10e99 100644
--- a/libav
From: Rémi Denis-Courmont
This is based on existing code from the VLC git tree with two minor
changes to account for the different function prototypes.
---
libavutil/float_dsp.c | 2 ++
libavutil/float_dsp.h | 1 +
libavutil/riscv/Makefile | 4 ++-
libavutil/risc
From: Rémi Denis-Courmont
---
libavutil/fixed_dsp.c | 4 +++-
libavutil/fixed_dsp.h | 1 +
libavutil/riscv/Makefile | 2 ++
libavutil/riscv/fixed_dsp_init.c | 33 +++
libavutil/riscv/fixed_dsp_rvv.S | 38
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 9 -
libavutil/riscv/float_dsp_rvv.S | 34
2 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 279412c0
From: Rémi Denis-Courmont
---
doc/optimization.txt | 5 +
1 file changed, 5 insertions(+)
diff --git a/doc/optimization.txt b/doc/optimization.txt
index 974e2f9af2..3ed29fe38c 100644
--- a/doc/optimization.txt
+++ b/doc/optimization.txt
@@ -267,6 +267,11 @@ CELL/SPU:
http://www-01.ibm.com
From: Rémi Denis-Courmont
RISC-V defines the CLZ instruction as part of the ratified Zbb subset
of the (not yet ratified) bit manipulation extension (B). We can detect
it from the __riscv_zbb predefined constant. At least GCC 12 already
supports this correctly.
Note that the macro will be non-zero
From: Rémi Denis-Courmont
This uses the architected RISC-V 64-bit cycle counter from the
RISC-V unprivileged instruction set.
In 64-bit and 128-bit, this is a straightforward CSR read.
In 32-bit mode, the 64-bit value is exposed as two CSRs, which
cannot be read atomically, so a loop is necessar
From: Rémi Denis-Courmont
If the target supports the Basic bit-manipulation (Zbb) extension, then
the REV8 instruction is available to reverse byte order.
Note that this instruction only exists at the "XLEN" register size,
so we need to right shift the result down to the data width.
If Zbb is n
From: Rémi Denis-Courmont
This provides some micro-optimisations for signed integer clipping, and
support for bit weight with the Zbb extension.
---
libavutil/intmath.h | 5 +-
libavutil/riscv/intmath.h | 103 ++
2 files changed, 106 insertions(+), 2 d
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 19 +++
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index b63da72acd..9b31ed2ed1 100644
--- a/libavu
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 35
2 files changed, 38 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index e6a5efbf68..99cc8afd31 100644
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 18 ++
2 files changed, 20 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 9b31ed2ed1..4980214821 100644
--- a/libavuti
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 22 ++
2 files changed, 25 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 4980214821..e6a5efbf68 100644
--- a/lib
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 21 +
2 files changed, 23 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 99cc8afd31..9c5e06bae9 100644
--- a/libav
From: Rémi Denis-Courmont
---
libavutil/fixed_dsp.c | 4 +++-
libavutil/fixed_dsp.h | 1 +
libavutil/riscv/Makefile | 4 +++-
libavutil/riscv/fixed_dsp_init.c | 33 +++
libavutil/riscv/fixed_dsp_rvv.S | 38
From: Rémi Denis-Courmont
---
configure | 15 +++
ffbuild/arch.mak | 2 ++
2 files changed, 17 insertions(+)
diff --git a/configure b/configure
index b7dc1d8656..c5f20cc323 100755
--- a/configure
+++ b/configure
@@ -462,6 +462,7 @@ Optimization options (experts only):
--d
From: Rémi Denis-Courmont
---
tests/checkasm/checkasm.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index e56fd3850e..a5d0503811 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -226,6 +226,11 @@ static c
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.S | 74 +++
1 file changed, 74 insertions(+)
create mode 100644 libavutil/riscv/asm.S
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 00..7623c161cf
--- /dev/
From: Rémi Denis-Courmont
This is based on existing code from the VLC git tree with two minor
changes to account for the different function prototypes.
---
libavutil/float_dsp.c | 2 ++
libavutil/float_dsp.h | 1 +
libavutil/riscv/Makefile | 4 ++-
libavutil/risc
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 9 -
libavutil/riscv/float_dsp_rvv.S | 34
2 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 7c553e91
From: Rémi Denis-Courmont
RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d: Z
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 6 +
libavutil/riscv/float_dsp_rvv.S | 38
2 files changed, 44 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 49a4c95a0b..b63da72acd 1006
From: Rémi Denis-Courmont
---
doc/optimization.txt | 5 +
1 file changed, 5 insertions(+)
diff --git a/doc/optimization.txt b/doc/optimization.txt
index 974e2f9af2..3ed29fe38c 100644
--- a/doc/optimization.txt
+++ b/doc/optimization.txt
@@ -267,6 +267,11 @@ CELL/SPU:
http://www-01.ibm.com
From: Rémi Denis-Courmont
This uses the architected RISC-V 64-bit cycle counter from the
RISC-V unprivileged instruction set.
In 64-bit and 128-bit, this is a straightforward CSR read.
In 32-bit mode, the 64-bit value is exposed as two CSRs, which
cannot be read atomically, so a loop is necessar
From: Rémi Denis-Courmont
RISC-V defines the CLZ instruction as part of the ratified Zbb subset
of the (not yet ratified) bit manipulation extension (B). We can detect
it from the __riscv_zbb predefined constant. At least GCC 12 already
supports this correctly.
Note that the macro will be non-zero
From: Rémi Denis-Courmont
If the target supports the Basic bit-manipulation (Zbb) extension, then
the REV8 instruction is available to reverse byte order.
Note that this instruction only exists at the "XLEN" register size,
so we need to right shift the result down to the data width.
If Zbb is n
From: Rémi Denis-Courmont
This provides some micro-optimisations for signed integer clipping, and
support for bit weight with the Zbb extension.
---
libavutil/intmath.h | 5 +-
libavutil/riscv/intmath.h | 103 ++
2 files changed, 106 insertions(+), 2 d
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.S | 74 +++
1 file changed, 74 insertions(+)
create mode 100644 libavutil/riscv/asm.S
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 00..7623c161cf
--- /dev/
From: Rémi Denis-Courmont
---
Makefile | 2 +-
configure | 15 +++
ffbuild/arch.mak | 2 ++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 61f79e27ae..1fb742f390 100644
--- a/Makefile
+++ b/Makefile
@@ -91,7 +91,7 @@ ffbuild/
From: Rémi Denis-Courmont
---
tests/checkasm/checkasm.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index e56fd3850e..a5d0503811 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -226,6 +226,11 @@ static c
From: Rémi Denis-Courmont
RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d: Z
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 9 -
libavutil/riscv/float_dsp_rvv.S | 34
2 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index f1d3d528
From: Rémi Denis-Courmont
This is based on existing code from the VLC git tree with two minor
changes to account for the different function prototypes.
---
libavutil/float_dsp.c | 2 ++
libavutil/float_dsp.h | 1 +
libavutil/riscv/Makefile | 4 ++-
libavutil/risc
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 6 +
libavutil/riscv/float_dsp_rvv.S | 38
2 files changed, 44 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 903da4eeda..1381eadab6 1006
From: Rémi Denis-Courmont
---
libavutil/fixed_dsp.c | 4 +++-
libavutil/fixed_dsp.h | 1 +
libavutil/riscv/Makefile | 4 +++-
libavutil/riscv/fixed_dsp_init.c | 36 ++
libavutil/riscv/fixed_dsp_rvv.S | 38 +
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 21 +
2 files changed, 23 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index cf8c995d7c..055cdc7520 100644
--- a/libav
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 19 +++
2 files changed, 22 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 1381eadab6..9bc1976d04 100644
--- a/libavu
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 35
2 files changed, 38 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index ae089d2fdb..cf8c995d7c 100644
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 2 ++
libavutil/riscv/float_dsp_rvv.S | 18 ++
2 files changed, 20 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 9bc1976d04..c2b72c3b25 100644
--- a/libavuti
From: Rémi Denis-Courmont
---
libavutil/riscv/float_dsp_init.c | 3 +++
libavutil/riscv/float_dsp_rvv.S | 22 ++
2 files changed, 25 insertions(+)
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index c2b72c3b25..ae089d2fdb 100644
--- a/lib
From: Rémi Denis-Courmont
INT_MAX is (typically) a value with 31 significant bits but float can
only represent 24 significant bits, leading to a rounding error.
This substitutes the actual rounded value to avoid a clang warning:
warning: implicit conversion from 'int' to 'float' changes value
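
To make the rounding concrete, here is a small sketch assuming a 32-bit int and an IEEE 754 single-precision float with round-to-nearest (both typical, neither guaranteed by C). The function name is hypothetical.

```c
#include <limits.h>

/* Returns 1 if converting INT_MAX to float rounds up to 2^31, which
 * happens because the float significand is narrower than 31 bits. */
static int int_max_rounds_up(void)
{
    /* The explicit cast avoids the implicit-conversion warning; volatile
     * forces the value to be stored as a genuine float. */
    volatile float f = (float)INT_MAX;

    return f == 2147483648.0f;
}
```

Under those assumptions, 2147483647 is not representable and the nearest float is 2147483648.0f, which is presumably the "actual rounded value" the message refers to.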
From: Rémi Denis-Courmont
INT_MAX is (typically) a value with 31 significant bits but float can
only represent 24 significant bits, leading to a rounding error.
This substitutes the actual rounded value as an unsigned int,
to avoid a clang warning while not overflowing signed int:
warning: imp
From: Rémi Denis-Courmont
Even though they have the same size, and typically the same alignment,
uint32_t and float are under no circumstances compatible types in C.
The casts from float * to uint32_t * are invalid here. Insofar as the
resulting pointers are dereferenced, this is undefined behav
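
One well-defined way to reinterpret a float's bits without casting pointers is to copy the object representation with memcpy. This is a generic sketch with made-up function names, assuming a 32-bit IEEE 754 float; it is not necessarily the approach the patch takes.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

/* Copy the bytes instead of dereferencing a (uint32_t *) cast of a
 * float pointer, which would violate the aliasing rules. */
static inline uint32_t sketch_float_bits(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof(u));
    return u;
}

static inline float sketch_bits_float(uint32_t u)
{
    float f;

    memcpy(&f, &u, sizeof(f));
    return f;
}
```

Compilers generally turn these fixed-size copies into a plain register move, so the well-defined version should cost nothing at run time.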
From: Rémi Denis-Courmont
Even though they have the same size, and typically the same alignment,
uint32_t and float are under no circumstances compatible types in C.
The casts from float * to uint32_t * are invalid here. Insofar as the
resulting pointers are dereferenced, this is undefined behav
From: Rémi Denis-Courmont
Some serious copy-paste / squash / rebase mismanipulation here.
---
libavutil/riscv/intmath.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h
index 78f7ba930a..3263a79dc4 100644
--- a/libavuti
From: Rémi Denis-Courmont
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without the F
extension, and if it does, it probably won't have run-time detection.
So the flag is essentially always set.
But as things stand,
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.S | 74 +++
1 file changed, 74 insertions(+)
create mode 100644 libavutil/riscv/asm.S
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 00..7623c161cf
--- /dev/
From: Rémi Denis-Courmont
RV64G supports MIN & MAX instructions natively only on floating point
registers, not general purpose ones. The latter would require the Zbb
extension. Due to that, it is actually faster to perform the clipping
"properly" in FPU.
Benchmarked on SiFive U74-MC:
audiodsp.ve
From: Rémi Denis-Courmont
---
libavutil/lfg.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavutil/lfg.h b/libavutil/lfg.h
index 2b669205d1..9a1e277acd 100644
--- a/libavutil/lfg.h
+++ b/libavutil/lfg.h
@@ -27,7 +27,7 @@
/**
* Context structure for the Lagged Fibonacc
From: Rémi Denis-Courmont
---
libavutil/riscv/intmath.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h
index 3263a79dc4..45bce9a0e7 100644
--- a/libavutil/riscv/intmath.h
+++ b/libavutil/riscv/intmath.h
@@ -61,8 +61,8
From: Rémi Denis-Courmont
This is not used anywhere and has no implementations other than the
plain C one.
---
libavcodec/fmtconvert.c | 9 -
libavcodec/fmtconvert.h | 10 --
2 files changed, 19 deletions(-)
diff --git a/libavcodec/fmtconvert.c b/libavcodec/fmtconvert.c
index f
From: Rémi Denis-Courmont
This is no longer used since 46089967722f74e794865a044f5f682f26628802.
It also has no implementations other than the plain C one.
---
libavcodec/fmtconvert.c | 9 -
libavcodec/fmtconvert.h | 10 --
2 files changed, 19 deletions(-)
diff --git a/libavcod
From: Rémi Denis-Courmont
The compiler cannot infer that the two float vectors do not alias,
causing unnecessary extra loads and serialisation. This patch caches
the two input values in local variables so that the compiler can optimise
individual loop iterations.
---
libavcodec/vorbisdec.c | 22
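
The caching idea from the message above can be sketched like this. The loop body is loosely modelled on Vorbis inverse coupling and is illustrative only; the function name and the exact arithmetic are assumptions, not the code in the patch.

```c
#include <stddef.h>

/* Because mag and ang might alias as far as the compiler knows, every
 * store could force the other array to be reloaded. Loading both inputs
 * into locals once per iteration removes that dependency. */
static void sketch_inverse_coupling(float *mag, float *ang, ptrdiff_t n)
{
    for (ptrdiff_t i = 0; i < n; i++) {
        float m = mag[i], a = ang[i]; /* single load of each input */

        if (m > 0.f) {
            if (a > 0.f) {
                ang[i] = m - a;
            } else {
                ang[i] = m;
                mag[i] = m + a;
            }
        } else {
            if (a > 0.f) {
                ang[i] = m + a;
            } else {
                ang[i] = m;
                mag[i] = m - a;
            }
        }
    }
}
```

Declaring the pointers restrict would be another way to convey the same guarantee, but it changes the function's contract, whereas local copies do not.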
From: Rémi Denis-Courmont
While this probably never overflows, we are better safe than sorry.
The callback prototype should probably also use ptrdiff_t or size_t but
I digress.
---
libavcodec/vorbisdec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/vorbisdec.
From: Rémi Denis-Courmont
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of
I, F and D extensions, and if it does, it probably won't have run-time
detection. So the flags are essentially always set.
But a
From: Rémi Denis-Courmont
RVV defines a total of 12 different extensions, including:
- 5 different instruction subsets:
- Zve32x: 8-, 16- and 32-bit integers,
- Zve32f: Zve32x plus single precision floats,
- Zve64x: Zve32x plus 64-bit integers,
- Zve64f: Zve32f plus Zve64x,
- Zve64d: Z
From: Rémi Denis-Courmont
---
Makefile | 2 +-
configure | 15 +++
ffbuild/arch.mak | 2 ++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 61f79e27ae..1fb742f390 100644
--- a/Makefile
+++ b/Makefile
@@ -91,7 +91,7 @@ ffbuild/
From: Rémi Denis-Courmont
Benchmarks:
get_pixels_c: 180.0
get_pixels_rvi: 136.7
---
libavcodec/pixblockdsp.c | 2 +
libavcodec/pixblockdsp.h | 2 +
libavcodec/riscv/Makefile | 2 +
libavcodec/riscv/pixblockdsp_init.c | 43 ++
libavcodec/risc
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.S | 74 +++
1 file changed, 74 insertions(+)
create mode 100644 libavutil/riscv/asm.S
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 00..7623c161cf
--- /dev/
From: Rémi Denis-Courmont
RV64G supports MIN & MAX instructions natively only on floating point
registers, not general purpose ones. The latter would require the Zbb
extension. Due to that, it is actually faster to perform the clipping
"properly" in FPU.
Benchmarked on SiFive U74-MC:
audiodsp.ve
From: Rémi Denis-Courmont
The compiler cannot infer that the two float vectors do not alias,
causing unnecessary extra loads and serialisation. This patch caches
the two input values in local variables so that the compiler can optimise
individual loop iterations.
---
libavcodec/vorbisdec.c | 24
From: Rémi Denis-Courmont
While this probably never overflows, we are better safe than sorry.
The callback prototype should probably also use ptrdiff_t or size_t,
but I digress (this would affect the DSP callback prototype).
---
libavcodec/vorbisdec.c | 3 +--
1 file changed, 1 insertion(+), 2
From: Rémi Denis-Courmont
While this probably never overflows, we are better safe than sorry.
The callback prototype should probably also use ptrdiff_t or size_t,
but I digress (this would affect the DSP callback prototype).
---
libavcodec/ppc/vorbisdsp_altivec.c | 4 ++--
libavcodec/vorbisdec
From: Rémi Denis-Courmont
... for a difference between pointers.
---
libavcodec/aarch64/vorbisdsp_init.c | 2 +-
libavcodec/arm/vorbisdsp_init_arm.c | 2 +-
libavcodec/ppc/vorbisdsp_altivec.c | 2 +-
libavcodec/vorbis.h | 2 +-
libavcodec/vorbisdec.c | 2 +-
libavco
From: Rémi Denis-Courmont
The compiler cannot infer that the two float vectors do not alias,
causing unnecessary extra loads and serialisation. This patch caches
the two input values in local variables so that the compiler can optimise
individual loop iterations.
---
libavcodec/vorbisdec.c | 24
From: Rémi Denis-Courmont
This introduces compile-time and run-time CPU detection on RISC-V. In
practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of
I, F and D extensions, and if it does, it probably won't have run-time
detection. So the flags are essentially always set.
But a
From: Rémi Denis-Courmont
---
libavutil/riscv/asm.S | 77 +++
1 file changed, 77 insertions(+)
create mode 100644 libavutil/riscv/asm.S
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 00..dbd97f40a4
--- /dev/