On 2019-03-26 21:22, Mike Stoner via ffmpeg-devel wrote:
> Hello,
> I’ve accounted for all feedback on this so far, I’m wondering if it is ready
> to be pushed upstream?
>
> Here are my results from ‘checkasm’ (lower is better):
>
> v210_unpack_c: 1636
> v210_unpack_ssse3: 611
> v210_unpack_avx:
I am resending my patches because I am not sure if I sent this version in
the past. I split my changes into two patches because they do separate things.
I also changed some tabs to spaces in Mike's AVX2 patch.
James Darnley (2):
avcodec/v210dec: move DSP function setting into dedi
sm_check_vf_hflip(void);
void checkasm_check_vf_threshold(void);
diff --git a/tests/checkasm/v210dec.c b/tests/checkasm/v210dec.c
new file mode 100644
index 00..7dd50a8271
--- /dev/null
+++ b/tests/checkasm/v210dec.c
@@ -0,0 +1,77 @@
+/*
+ * Copyright (c) 2019 James Darnley
+ *
+ * This file is par
From: Michael Stoner
Replaced VSHUFPS with VPBLENDD to relieve port 5 bottleneck
AVX2 is 1.4x faster than AVX
---
Mike, is this still the patch you want applied? I had to make a small
amendment to it because you had some tabs as indentation.
libavcodec/v210dec.c | 10 +-
libavcodec/
Prepare for checkasm test.
---
libavcodec/v210dec.c | 16 ++--
libavcodec/v210dec.h | 1 +
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/libavcodec/v210dec.c b/libavcodec/v210dec.c
index ddc5dbe8be..fd8a6b0d78 100644
--- a/libavcodec/v210dec.c
+++ b/libavcodec/v210de
On 2019-04-10 14:47, James Darnley wrote:
> From: Michael Stoner
Screw you mailing list or git, whichever one of you managed to screw up
the author's address. I will correct that, if I can.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.o
On 2019-04-10 14:47, James Darnley wrote:
> I am resending my patches because I am not sure if I sent this version in
> the past. I split my changes into two patches because they do separate
> things.
>
> I also changed some tabs to spaces in Mike's AVX2 patch.
>
On 2019-05-18 09:39, Michael Niedermayer wrote:
> Fixes: "null pointer dereference"
> Fixes:
> 14551/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_V210_fuzzer-5088609952071680
>
> Found-by: continuous fuzzing process
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-o
On 2019-05-18 12:15, Michael Niedermayer wrote:
> On Sat, May 18, 2019 at 12:02:55PM +0200, James Darnley wrote:
>> I object to the commit message though because it isn't a "null pointer
>> dereference" but if that is the error as reported by the tool then keep
>
On 2019-05-24 11:36, lance.lmw...@gmail.com wrote:
> From: Limin Wang
>
> ...
Why? And these are "comments" not "commands".
On 2019-05-24 12:06, James Darnley wrote:
> On 2019-05-24 11:36, lance.lmw...@gmail.com wrote:
>> From: Limin Wang
>>
>> ...
>
> Why?
I see why: so you don't screw-up the macros you create later.
On 2019-05-28 22:00, Derek Buitenhuis wrote:
> On 28/05/2019 20:58, James Almer wrote:
>> I think x26* and vpx/aom call it crf? It's not in option_tables.h in any
>> case.
>
> They do not. This is a constant quantizer mode, not constant rate factor.
IIRC either qp or cqp
On 2019-06-28 04:26, Linjie Fu wrote:
> Previously, media driver provided planar format(like 420 8 bit), but
> for HEVC Range Extension (422/444 8/10 bit), the decoded image is
> produced in packed format.
>
> Y210/AYUV/Y410 are packed formats which are needed in HEVC Rext decoding
> for both VAAP
On 2019-06-28 03:03, Hendrik Leppkes wrote:
> On Fri, Jun 28, 2019 at 1:26 AM James Darnley wrote:
>>
>> On 2019-06-28 04:26, Linjie Fu wrote:
>>> Previously, media driver provided planar format(like 420 8 bit), but
>>> for HEVC Range Extension (422/44
On 2019-08-02 15:55, Ramana Jajula wrote:
> Hi,
>
> I am trying to encode my ts file m3u8 using my customised ffmpeg of version
> 4.1. I used below command to do encoding.
>
> ffmpeg -re -threads 8 -i /videos/input.ts -vcodec libx264 -s 320x240 -b:v
> 512000 -maxrate 512000 -acodec libfdk_aac -b:
From: Henrik Gramner
There's an edge case that wasn't properly handled.
---
libavutil/x86/x86inc.asm | 5 +
1 file changed, 5 insertions(+)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index 5044ee86f0..bc370a6186 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86
From: Henrik Gramner
---
libavutil/x86/x86inc.asm | 4
1 file changed, 4 insertions(+)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index 10b7711637..04dbb6b785 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -293,6 +293,10 @@ DECLARE_REG_TMP_SIZ
From: Henrik Gramner
---
libavutil/x86/x86inc.asm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index 04dbb6b785..af35fe1e4d 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -685,7 +685,7 @@ DECLARE_
From: Henrik Gramner
Warn when the following are used without the appropriate cpuflag:
* YMM and ZMM registers
* 'pextrw' with a memory operand
* GPR instruction set extensions
---
libavutil/x86/x86inc.asm | 120 +++
1 file changed, 83 insertions(+), 37 del
Here are a few easy-to-import patches from x264. These are all after x264
commit 4a158b00 "x86inc: Correctly set mmreg variables" which FFmpeg already
has (commit eb5f063e7c).
It does not include the following commits:
* 82721eae "x86inc: Add x86-32 PIC support macros"
* 101bd27d "x86inc: Support
From: Henrik Gramner
Most VEX-encoded instructions require an additional byte to encode when src2
is a high register (e.g. x|ymm8..15). If the instruction is commutative we
can swap src1 and src2 when doing so reduces the instruction length, e.g.
vpaddw xmm0, xmm0, xmm8 -> vpaddw xmm0, xmm8,
From: Henrik Gramner
---
libavutil/x86/x86inc.asm | 30 +-
1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index d1b4c982fc..8c8cc97e0c 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc
From: Henrik Gramner
Use register numbers instead of copying the full register names. This makes it
possible to change register widths in the middle of a function and keep the
mmreg permutations intact which can be useful for code that only needs larger
vectors for parts of the function in combin
On 2018-09-03 15:29, James Almer wrote:
> pass 32 - 1 to both av_image_fill_pointers() calls directly?
Please do not add a magic number where nobody will find it. Use one of
the 3 already existing methods for knowing the alignment necessary for
assembly.
If this is unrelated, my apologies.
On 2018-09-05 22:52, Sigríður Regína Sigurþórsdóttir wrote:
> +{"reserve_free_space", "Reserve a given amount of space at the
> beginning of the file for unspecified purpose."
I added the "metadata_header_padding" global option many years ago. Can
you not reuse it for this purpose? Is it not
On 2018-09-06 19:39, Sigríður Regína Sigurþórsdóttir wrote:
> +if (s->metadata_header_padding) {
> +if (s->metadata_header_padding == 1)
> +s->metadata_header_padding++;
> +put_ebml_void(pb, s->metadata_header_padding);
> +}
Unfortunately I was forced to make th
On 2020-04-10 16:53, Anton Khirnov wrote:
> ffmpeg | branch: master | Anton Khirnov | Mon Jan 9
> 18:04:42 2017 +0100| [1f4cf92cfbd3accbae582ac63126ed5570ddfd37] | committer:
> Anton Khirnov
>
> pthread_frame: merge the functionality for normal decoder init and
> init_thread_copy
>
> The cur
On 2020-06-04 01:19, Michael Niedermayer wrote:
> Fixes: array end overread
> Fixes:
> 22395/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BITPACKED_fuzzer-5760940300828672
>
> Found-by: continuous fuzzing process
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-
-frames in chunked mode.
Needs more work.
James Darnley (1):
avcodec/h264: enable draw_horiz_band
Kieran Kunhya (1):
avcodec/h264: fix draw_horiz_band with slice threads
libavcodec/h264_slice.c | 29 +++--
libavcodec/h264dec.c| 2 +-
2 files changed, 24 insert
---
libavcodec/h264dec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/h264dec.c b/libavcodec/h264dec.c
index 8d1bd16a8e..b9f304936c 100644
--- a/libavcodec/h264dec.c
+++ b/libavcodec/h264dec.c
@@ -1056,7 +1056,7 @@ AVCodec ff_h264_decoder = {
.init
From: Kieran Kunhya
---
libavcodec/h264_slice.c | 29 +++--
1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/libavcodec/h264_slice.c b/libavcodec/h264_slice.c
index 5ceee107a0..fe2aa01ceb 100644
--- a/libavcodec/h264_slice.c
+++ b/libavcodec/h264_slice.c
@@
On 2019-10-11 21:45, Paul B Mahol wrote:
> diff --git a/doc/utils.texi b/doc/utils.texi
> index d55dd315c3..4e2e713505 100644
> --- a/doc/utils.texi
> +++ b/doc/utils.texi
> @@ -920,6 +920,9 @@ corresponding input value will be returned.
> @item round(expr)
> Round the value of expression @var{e
On 2019-11-25 13:52, Chandra Nakka wrote:
> Dear FFmpeg developers,
>
> I'm very happy to have found your details on FFmpeg website for requesting
> FFmpeg feature implementation.
>
> Currently I'm using FFmpeg command line tool on my linux servers to process
> media files into instant mp3 audio
On 2019-12-04 15:43, Linjie Fu wrote:
> Previously, media driver provided planar format(like 420 8 bit),
> but for HEVC Range Extension (422/444 8/10 bit), the decoded image
> is produced in packed format because Windows expects it.
>
> Add some packed pixel formats for hardware decode support in
On 28/01/2020, Liu Steven wrote:
>
>
>> On 27 January 2020, at 3:29 PM, Jean-Baptiste Kempf wrote:
>> It will be joinable through some VideoConf tool.
> Can we join by IRC or other things on internet?
> Because these days are Spring Festival (Chinese New Year, Important
> festivals that have lasted for thousand
On 30/12/2019, Lauri Kasanen wrote:
> Hi,
>
> For the Libre RISC-V project, I'm going to research the popular codecs
> and design new instructions to help speed them up. With ffmpeg being
> home to lots of asm folks for many platforms, I also want to ask your
> opinion.
>
> What new instructions w
On 2020-02-22 11:11, Thilo Borgmann wrote:
> Please someone put an IRC log from the meeting there, too. James Darnley?
> Also the audio was streamed, somebody might remember where too exactly.
> Michael?
I can post my log from the day, probably email attachment. Should I
remove any of
On 2020-02-22 13:25, Paul B Mahol wrote:
> On 2/22/20, James Darnley wrote:
>> On 2020-02-22 11:11, Thilo Borgmann wrote:
>>> Please someone put an IRC log from the meeting there, too. James Darnley?
>>> Also the audio was streamed, somebody might remember where too ex
On 2020-02-23 13:22, Michael Niedermayer wrote:
> From: Parker Ernest <@>
>
> commit fc6a5883d6af8cae0e96af84dda0ad74b360a084 breaks build on
> x86_64 CPUs which do not have SSSE3, e.g. AMD Phenom-II
>
> Signed-off-by: Michael Niedermayer
> ---
> libswscale/x86/yuv2rgb.c | 2 ++
> 1 file change
On 2020-02-23 15:12, Jean-Baptiste Kempf wrote:
> Yo,
>
> On Sat, Feb 22, 2020, at 22:18, Josh de Kock wrote:
>> This allows for easy shortlog/log parsing, useful in determining
>> eligible members of the general assembly for the new FFmpeg voting
>> system.
>
> I think this is a good idea.
> But
On 2020-02-23 18:58, Michael Niedermayer wrote:
> On Sun, Feb 23, 2020 at 05:03:36PM +0100, Carl Eugen Hoyos wrote:
>> On Sun, 23 Feb 2020 at 13:30, Michael Niedermayer wrote:
>>>
>>> From: Parker Ernest <@>
>>>
>>> commit fc6a5883d6af8cae0e96af84dda0ad74b360a084 breaks build on
>>> x86_
On 2019-02-15 10:01, Kornel wrote:
> libavcodec/gif.c in ff_gif_encoder.pix_fmts seems to passively declare types
> of pixel formats it accepts.
If you want to experiment you can change that so it accepts rgb (also or
only). Then you can implement and test what you want, then you can ask
about s
On 2019-03-03 15:44, Martin Vignali wrote:
> Hello,
>
> ...
>
> Not directly related to this patch, but it can be interesting for testing
> purpose to write a checkasm test for the v210 func decoding.
> So it's more easy to check the perf for "each" cpu flags, and be sure, the
> various width cas
On 2019-03-01 18:41, Michael Stoner wrote:
> The AVX2 code leverages VPERMD to process 12 pixels/iteration. This is my
> first patch submission so any comments are greatly appreciated.
>
> -Mike
>
> Tested on Skylake (Win32 & Win64)
> 1920x1080 input frame
> =
> C code - 440
Prepare for checkasm test.
---
libavcodec/v210dec.c | 13 +
libavcodec/v210dec.h | 1 +
2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/libavcodec/v210dec.c b/libavcodec/v210dec.c
index ddc5dbe8be..28cf00d320 100644
--- a/libavcodec/v210dec.c
+++ b/libavcodec/v210dec.c
sm_check_vf_hflip(void);
void checkasm_check_vf_threshold(void);
diff --git a/tests/checkasm/v210dec.c b/tests/checkasm/v210dec.c
new file mode 100644
index 00..7320ed5e37
--- /dev/null
+++ b/tests/checkasm/v210dec.c
@@ -0,0 +1,76 @@
+/*
+ * Copyright (c) 2019 James Darnley
+ *
+ * This file is par
On 2019-03-06 10:11, Paul B Mahol wrote:
> On 3/6/19, Carl Eugen Hoyos wrote:
>> 2019-03-04 23:58 GMT+01:00, James Darnley :
>>> Prepare for checkasm test.
>>> ---
>>> libavcodec/v210dec.c | 13 +
>>> libavcodec/v210dec.h | 1 +
>>
sm_check_vf_hflip(void);
void checkasm_check_vf_threshold(void);
diff --git a/tests/checkasm/v210dec.c b/tests/checkasm/v210dec.c
new file mode 100644
index 00..7320ed5e37
--- /dev/null
+++ b/tests/checkasm/v210dec.c
@@ -0,0 +1,76 @@
+/*
+ * Copyright (c) 2019 James Darnley
+ *
+ * This file is par
On 2019-03-06 20:31, James Darnley wrote:
> ...
Wrong patch and wrong reference. Please ignore this.
Prepare for checkasm test.
---
libavcodec/v210dec.c | 16 ++--
libavcodec/v210dec.h | 1 +
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/libavcodec/v210dec.c b/libavcodec/v210dec.c
index ddc5dbe8be..6db662538e 100644
--- a/libavcodec/v210dec.c
+++ b/libavcodec/v210de
sm_check_vf_hflip(void);
void checkasm_check_vf_threshold(void);
diff --git a/tests/checkasm/v210dec.c b/tests/checkasm/v210dec.c
new file mode 100644
index 00..7dd50a8271
--- /dev/null
+++ b/tests/checkasm/v210dec.c
@@ -0,0 +1,77 @@
+/*
+ * Copyright (c) 2019 James Darnley
+ *
+ * This file is par
Prepare for checkasm test.
---
libavcodec/v210dec.c | 16 ++--
libavcodec/v210dec.h | 1 +
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/libavcodec/v210dec.c b/libavcodec/v210dec.c
index ddc5dbe8be..fd8a6b0d78 100644
--- a/libavcodec/v210dec.c
+++ b/libavcodec/v210de
After better testing I have decided to only submit these two functions. The
others did not provide a speedup better than the deviation in testing. Those
patches remain in the list archive should someone wish to try them.
James Darnley (5):
avcodec/h264: change RETs into REP_RETs where
Haswell:
- 1.02x faster (405±0.7 vs. 397±0.8 decicycles) compared with mmxext
Skylake-U:
- 1.06x faster (498±1.8 vs. 470±1.3 decicycles) compared with mmxext
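The "1.02x" and "1.06x" figures above appear to be the ratio of the two timings in each line. A minimal sketch of that arithmetic follows, assuming the first number in each pair is the mmxext baseline and the second is the new function. The helper names are hypothetical and are not part of checkasm.

```c
#include <assert.h>

/* How many times faster `candidate` is than `baseline`.
 * Both timings must be in the same unit (decicycles here). */
double speedup(double baseline, double candidate)
{
    assert(baseline > 0.0 && candidate > 0.0);
    return baseline / candidate;
}

/* A decicycle is a tenth of a cycle. */
double decicycles_to_cycles(double decicycles)
{
    return decicycles / 10.0;
}
```

With the Haswell figures, speedup(405, 397) is a little over 1.02, and with the Skylake-U figures, speedup(498, 470) is a little under 1.06. The ± values are the measurement deviation and are not used in the ratio.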
---
libavcodec/x86/h264_idct.asm | 20
libavcodec/x86/h264dsp_init.c | 2 ++
2 files changed, 22 insertions(+)
di
---
libavcodec/x86/h264_idct.asm | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/libavcodec/x86/h264_idct.asm b/libavcodec/x86/h264_idct.asm
index c36fea5..878ff02 100644
--- a/libavcodec/x86/h264_idct.asm
+++ b/libavcodec/x86/h264_idct.asm
@@ -695,7 +695,7 @@ cglo
---
libavcodec/x86/h264_idct.asm | 21 +
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/libavcodec/x86/h264_idct.asm b/libavcodec/x86/h264_idct.asm
index dde40e9..bc4dce4 100644
--- a/libavcodec/x86/h264_idct.asm
+++ b/libavcodec/x86/h264_idct.asm
@@ -87,10 +87,
Haswell:
- 1.11x faster (522±0.4 vs. 469±1.8 decicycles) compared with mmxext
Skylake-U:
- 1.21x faster (671±5.5 vs. 555±1.4 decicycles) compared with mmxext
---
libavcodec/x86/h264_idct.asm | 33 -
libavcodec/x86/h264dsp_init.c | 3 +++
2 files changed, 35 ins
The labels get stripped leading to (slightly) nicer disassembly from
objdump.
---
libavcodec/x86/h264_idct.asm | 24
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/libavcodec/x86/h264_idct.asm b/libavcodec/x86/h264_idct.asm
index 878ff02..dde40e9 100644
--
On 2017-04-05 05:33, James Almer wrote:
> On 4/4/2017 10:53 PM, James Darnley wrote:
>> ---
>> libavcodec/x86/h264_idct.asm | 12 ++--
>> 1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/libavcodec/x86/h264_idct.asm b/libavcodec/x
On 2017-04-05 13:41, Ronald S. Bultje wrote:
> Hi,
>
> On Tue, Apr 4, 2017 at 9:53 PM, James Darnley wrote:
>
>> The labels get stripped leading to (slightly) nicer disassembly from
>> objdump.
>>
> [..]
>
>> -jz .cycle%1end
>> +jz %%
On 2017-04-05 05:44, James Almer wrote:
> On 4/4/2017 10:53 PM, James Darnley wrote:
>> Haswell:
>> - 1.11x faster (522±0.4 vs. 469±1.8 decicycles) compared with mmxext
>>
>> Skylake-U:
>> - 1.21x faster (671±5.5 vs. 555±1.4 decicycles) compared with mmxext
>
On 2017-04-05 06:05, James Almer wrote:
> On 4/4/2017 10:53 PM, James Darnley wrote:
>> Haswell:
>> - 1.02x faster (405±0.7 vs. 397±0.8 decicycles) compared with mmxext
>>
>> Skylake-U:
>> - 1.06x faster (498±1.8 vs. 470±1.3 decicycles) compared with
On 2017-04-05 22:26, Henrik Gramner wrote:
> On Wed, Apr 5, 2017 at 3:53 AM, James Darnley wrote:
>> call h264_idct_add8_mmx_plane
>> -RET
>> +RET ; TODO: check rep ret after a function call
>
> call followed by RET should be replaced by the TAIL
On 2017-04-06 18:06, James Almer wrote:
> Your numbers are really confusing. Could you post the actual numbers for
> each function instead of doing comparisons?
These figures are the actual numbers!
Using the figures from Haswell above:
> ff_h264_idct_add_8_mmx = 52 cycles
> ff_h264_idct_add_8_s
Changes:
- Added sse2 functions
- Fixed an incorrect xmm register count
I did not make the change suggested by Gramner about TAIL_CALL and I did leave
the TODOs there.
If there are no further objections I will push by Monday at the latest. I want
to get this out the door.
James Darnley (6
---
libavcodec/x86/h264_idct.asm | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/libavcodec/x86/h264_idct.asm b/libavcodec/x86/h264_idct.asm
index c36fea5..878ff02 100644
--- a/libavcodec/x86/h264_idct.asm
+++ b/libavcodec/x86/h264_idct.asm
@@ -695,7 +695,7 @@ cglo
---
libavcodec/x86/h264_idct.asm | 21 +
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/libavcodec/x86/h264_idct.asm b/libavcodec/x86/h264_idct.asm
index dde40e9..bc4dce4 100644
--- a/libavcodec/x86/h264_idct.asm
+++ b/libavcodec/x86/h264_idct.asm
@@ -87,10 +87,
Haswell:
- 1.11x faster (522±0.4 vs. 469±1.8 decicycles) compared with mmxext
Skylake-U:
- 1.21x faster (671±5.5 vs. 555±1.4 decicycles) compared with mmxext
---
libavcodec/x86/h264_idct.asm | 33 -
libavcodec/x86/h264dsp_init.c | 3 +++
2 files changed, 35 ins
Kaby Lake Pentium:
- ff_h264_idct_add_8_sse2:~1.18x faster than mmxext
- ff_h264_idct_dc_add_8_sse2: ~1.07x faster than mmxext
---
libavcodec/x86/h264_idct.asm | 11 +--
libavcodec/x86/h264dsp_init.c | 5 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/libavco
The labels get stripped leading to (slightly) nicer disassembly from
objdump.
---
libavcodec/x86/h264_idct.asm | 24
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/libavcodec/x86/h264_idct.asm b/libavcodec/x86/h264_idct.asm
index 878ff02..dde40e9 100644
--
Haswell:
- 1.02x faster (405±0.7 vs. 397±0.8 decicycles) compared with mmxext
Skylake-U:
- 1.06x faster (498±1.8 vs. 470±1.3 decicycles) compared with mmxext
---
libavcodec/x86/h264_idct.asm | 20
libavcodec/x86/h264dsp_init.c | 2 ++
2 files changed, 22 insertions(+)
di
100644
index 00..4c5f32a1b6
--- /dev/null
+++ b/libavformat/falcom_xa.c
@@ -0,0 +1,98 @@
+/*
+ * Falcom Xanadu demuxer
+ * Copyright (c) 2016 James Darnley
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the
On 2017-04-15 14:29, Ronald S. Bultje wrote:
> Hi,
>
> On Fri, Apr 14, 2017 at 9:46 PM, James Darnley wrote:
>
>> The labels get stripped leading to (slightly) nicer disassembly from
>> objdump.
>> ---
>> libavcodec/x86/h264_idct.asm | 24 +++
On 2017-04-15 15:36, James Darnley wrote:
> add Falcom Xanadu demuxer
I mean Xanadu Next, not the original one.
On 2017-04-15 17:56, James Almer wrote:
> On 4/15/2017 10:36 AM, James Darnley wrote:
>> ---
>> libavformat/Makefile | 1 +
>> libavformat/allformats.c | 1 +
>> libavformat/falcom_xa.c | 98
>>
>
I want to discuss why we use this and argue that we should be using
`strip -x` all the time anyway.
The man page for strip says that -x removes all non-global symbols. -wN
is a combination of -w for wildcard matching and -N to remove a given
symbol.
-wN gets ..@* as an argument. Together they r
---
For initial review and comments.
I plan to drop the '2' from the filename before pushing. I haven't done it yet
because I am still working on the file. I didn't make any changes with speedup
in mind so I haven't done any benchmarking yet.
libavcodec/x86/Makefile | 4 +-
libavcod
On 2017-05-16 13:08, Rostislav Pehlivanov wrote:
> Reduces the amount of debugging information of external asm from
> uselessly verbose to informative enough.
> ---
> configure | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/configure b/configure
> index e4862f6a35..df8
On 2017-05-18 19:13, Ronald S. Bultje wrote:
> - do you think a checkasm test makes sense? That would also make
> performance measuring easier.
The (I)DCT code seems to have its own test program in the fate-idct8x8
test. That is built from libavcodec/tests/dct.c. It even includes its
own benchma
---
libavcodec/x86/idctdsp_init.c | 38 +++---
1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/libavcodec/x86/idctdsp_init.c b/libavcodec/x86/idctdsp_init.c
index 1f308cc079..f1c915aa00 100644
--- a/libavcodec/x86/idctdsp_init.c
+++ b/libavcodec/x86/
le IDCT MMX
+;
+; Copyright (c) 2001, 2002 Michael Niedermayer
+;
+; Conversion from gcc syntax to x264asm syntax with minimal modifications
+; by James Darnley.
+;
+; This file is part of FFmpeg.
+;
+; FFmpeg is free software; you can redistribute it and/or
+; modify it under the terms of the GNU
On 2017-05-29 16:51, James Darnley wrote:
> ---
> Changes:
> - Changed type of d4 constant to dwords because it gets used as dwords.
> - Changed or removed HAVE_MMX_INLINE preprocessor guards.
> - Added note about conversion from inline.
> - New file no lon
On 2017-05-29 23:26, Michael Niedermayer wrote:
> On Mon, May 29, 2017 at 09:40:49PM +0200, James Darnley wrote:
>> On 2017-05-29 16:51, James Darnley wrote:
>>> ---
>>> Changes:
>>> - Changed type of d4 constant to dwords because it gets used
On 2017-05-29 16:51, James Darnley wrote:
> Commit message: reindent
Is this acceptable? Should I be more verbose?
her errors.
James Darnley (6):
initial alignment corrections for xmm registers
change explicit mmx register use to x264asm style
add and fix xmm version of simple_idct
avcodec/x86: cleanup simple_idct10
add x86_64 8-bit simple_idct function
change coeffs
libavcodec/tests
---
libavcodec/x86/simple_idct.asm | 1172
1 file changed, 586 insertions(+), 586 deletions(-)
Picture s/mm([0-7])/m\1/g here for 1229 lines and 64695 bytes.
---
libavcodec/x86/simple_idct.asm | 47 ++
1 file changed, 34 insertions(+), 13 deletions(-)
diff --git a/libavcodec/x86/simple_idct.asm b/libavcodec/x86/simple_idct.asm
index 6fedbb5784..b5d05ca653 100644
--- a/libavcodec/x86/simple_idct.asm
+++ b/libavco
---
libavcodec/tests/x86/dct.c | 3 +++
libavcodec/x86/idctdsp_init.c | 1 +
libavcodec/x86/simple_idct.asm | 45 ++
libavcodec/x86/simple_idct.h | 1 +
4 files changed, 50 insertions(+)
diff --git a/libavcodec/tests/x86/dct.c b/libavcodec/tests/x
---
libavcodec/x86/simple_idct10.asm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavcodec/x86/simple_idct10.asm b/libavcodec/x86/simple_idct10.asm
index b4b47afcee..ae848b7faf 100644
--- a/libavcodec/x86/simple_idct10.asm
+++ b/libavcodec/x86/simple_idct10.asm
@@ -46,
Use named arguments for the functions so we can remove a define. The
stride/linesize argument is now ptrdiff_t type so we no longer need to
sign extend the register.
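The point about sign extension can be illustrated in C. On x86-64 a 32-bit int argument only defines the low half of its 64-bit register, so assembly that uses it in address arithmetic has to sign-extend it first, while a ptrdiff_t argument is already pointer-width. This is a rough sketch with hypothetical function names. It models the register behaviour with casts and is not code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of row y if the 32-bit stride were used without sign extension
 * (modelled here as zero extension): wrong for negative strides. */
int64_t row_offset_zero_extended(int32_t stride, int y)
{
    assert(y >= 0);
    return (int64_t)(uint32_t)stride * y;
}

/* Offset of row y after sign-extending the 32-bit stride to 64 bits. */
int64_t row_offset_sign_extended(int32_t stride, int y)
{
    assert(y >= 0);
    return (int64_t)stride * y;
}

/* With a ptrdiff_t stride no widening step is needed at all. */
ptrdiff_t row_offset_ptrdiff(ptrdiff_t stride, int y)
{
    assert(y >= 0);
    return stride * y;
}
```

For a bottom-up picture with stride -64, the sign-extended and ptrdiff_t versions both give -128 for row 2, while the zero-extended version gives a large positive offset.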
---
libavcodec/x86/proresdsp.asm | 2 +-
libavcodec/x86/simple_idct10.asm | 8 ++--
libavcodec/x86/simple_i
---
libavcodec/tests/x86/dct.c | 2 ++
libavcodec/x86/idctdsp_init.c| 10 ++
libavcodec/x86/simple_idct.h | 3 +++
libavcodec/x86/simple_idct10.asm | 6 ++
4 files changed, 21 insertions(+)
diff --git a/libavcodec/tests/x86/dct.c b/libavcodec/tests/x86/dct.c
index 971
To answer the couple of questions that were asked over the weekend.
Rostislav, about the performance. I can see how to force a particular
IDCT implementation for real world decoding (the -idct option) but the
MPEG2 HD sample I've been working with mostly uses the "idct add"
function which doesn't
Incorporate some of the recent changes committed to x264. This is an initial
set with no controversial changes: no nasm requirement, no avx512.
I do want your comments on where I should put the aesni define in the last
patch. I will make a note on that one too. I will attempt to upstream that
d
From: Henrik Gramner
There's no point in emitting a rep prefix before ret on modern CPUs.
---
libavutil/x86/x86inc.asm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index f2a6a3f1db..44069741cc 100644
--- a/libavutil/x86/x
From: Henrik Gramner
Simplifies writing assembly code that depends on available instructions.
LZCNT implies SSE2
BMI1 implies AVX+LZCNT
AVX2 implies BMI2
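One way to picture "X implies Y" is as a bitmask in which each flag's value also contains the bits of the flags it implies. The sketch below models only the three implications listed above. The constant names and bit positions are hypothetical and do not match the actual definitions in x86inc.asm.

```c
#include <assert.h>

/* Each flag carries its own bit plus the bits of everything it implies. */
enum {
    CPU_SSE2  = 1 << 0,
    CPU_AVX   = 1 << 1,
    CPU_LZCNT = (1 << 2) | CPU_SSE2,            /* LZCNT implies SSE2      */
    CPU_BMI1  = (1 << 3) | CPU_AVX | CPU_LZCNT, /* BMI1 implies AVX+LZCNT  */
    CPU_BMI2  = 1 << 4,
    CPU_AVX2  = (1 << 5) | CPU_BMI2             /* AVX2 implies BMI2       */
};

/* True when every bit of `wanted` is present in `flags`. */
int cpu_has(unsigned flags, unsigned wanted)
{
    return (flags & wanted) == wanted;
}
```

With this layout, code selected for CPU_BMI1 can test cpu_has(flags, CPU_SSE2) and get a true result without SSE2 being named separately, which is the simplification the commit message describes.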
---
This is the patch I was talking about. Where should I put the aesni define?
x264 doesn't have it but I will try to get it upstreamed.
l
From: Henrik Gramner
Due to a peculiarity in the ModR/M addressing encoding, the r12 and r13
registers sometimes require an additional byte when used as a base register.
r14 and r15 don't have that issue, so prefer using them.
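The peculiarity comes from two reserved patterns in the ModR/M byte. A base register whose low three bits are 100 (rsp, r12) always needs a SIB byte, and one whose low three bits are 101 (rbp, r13) cannot be encoded without a displacement, so a zero disp8 is added. The sketch below is an illustrative model with a hypothetical function name, not code from x86inc.asm.

```c
#include <assert.h>

/* Extra encoding bytes for a memory operand [base] or [base+disp],
 * relative to an ordinary base register. base_reg is 0..15
 * (4 = rsp, 5 = rbp, 12 = r12, 13 = r13). */
int extra_addressing_bytes(int base_reg, int has_displacement)
{
    assert(base_reg >= 0 && base_reg <= 15);
    int low3 = base_reg & 7;

    if (low3 == 4)                       /* rsp/r12: SIB byte required */
        return 1;
    if (low3 == 5 && !has_displacement)  /* rbp/r13: needs a zero disp8 */
        return 1;
    return 0;
}
```

Under this model r12 costs one byte in every case, r13 costs one byte only when there is no displacement, and r14 and r15 cost nothing extra.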
---
libavutil/x86/x86inc.asm | 16
1 file change
From: Henrik Gramner
We overload the `call` instruction with a macro, but it would misbehave when
the macro argument wasn't a valid identifier. Fix it by explicitly checking
if the argument is an identifier.
---
libavutil/x86/x86inc.asm | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
From: Anton Mitrofanov
The use of rsp was pretty much hardcoded there and probably didn't work
otherwise with stack_size > 0.
---
libavutil/x86/x86inc.asm | 19 ++-
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
On 2017-06-09 10:08, Henrik Gramner wrote:
> On Fri, Jun 9, 2017 at 1:05 AM, James Darnley wrote:
>> Where should I put the aesni define?
>
> Between sse42 and avx.
Thank you. I will change this and the first patch to bump the date.
I'll give other people about an hour to
---
configure | 17 -
1 file changed, 4 insertions(+), 13 deletions(-)
diff --git a/configure b/configure
index e3941f9dfd..69bbf25bf5 100755
--- a/configure
+++ b/configure
@@ -3258,7 +3258,7 @@ pkg_config_default=pkg-config
ranlib_default="ranlib"
strip_default="strip"
versio