Here is the full patch set, rebased and including all previous changes.

2015-07-10 17:56 GMT-03:00 Pedro Arthur <bygran...@gmail.com>:
> I'll check and fix it and then send a new patch.
>
> 2015-07-09 14:37 GMT-03:00 Michael Niedermayer <michae...@gmx.at>:
>
>> On Wed, Jul 08, 2015 at 10:36:56PM -0300, Pedro Arthur wrote:
>> > Hi,
>> >
>> > Last week I worked on adding the ring buffer logic into SwsSlice and fixing
>> > some seg faults. As I'm having some problems with git send-email through
>> > gmail I've attached the patch.
>>
>> these patches don't seem to apply cleanly
>> You can try to apply them on a fresh checkout to see the problem
>> (also see below)
>>
>> Also I think you used git merge in your git branch, this is not
>> recommended, and might be related to why they don't apply
>>
>> In practice it's probably best if you use git pull --rebase
>> when updating to a new version. It's quite likely that earlier changes
>> in the set will need to be updated and that would not be possible
>> with a merge occurring later
>>
>> to correct the patches so they apply again you can probably
>> checkout origin/master, create a new branch and cherry-pick
>> the commits on top of that, correcting any conflicts that might pop
>> up, and test that things still work after moving the changes on top
>> of the latest code
>>
>>
>> ffmpeg-git/ffmpeg/.git/rebase-apply/patch:142: tab in indent.
>> int maxSize;
>> ffmpeg-git/ffmpeg/.git/rebase-apply/patch:153: tab in indent.
>> maxSize = FFMAX(c->vLumFilterSize, c->vChrFilterSize << c->chrSrcVSubSample);
>> error: libswscale/slice.c: does not exist in index
>> error: patch failed: libswscale/swscale.c:451
>> error: libswscale/swscale.c: patch does not apply
>> error: patch failed: libswscale/swscale_internal.h:932
>> error: libswscale/swscale_internal.h: patch does not apply
>> Patch failed at 0001 swscale: fix seg fault when accessing src slice
>> The copy of the patch that failed is found in:
>> ffmpeg-git/ffmpeg/.git/rebase-apply/patch
>> When you have resolved this problem, run "git am --continue".
>> If you prefer to skip this patch, run "git am --skip" instead.
>> To restore the original branch and stop patching, run "git am --abort".
>>
>> Applying: swscale: fix seg fault when accessing src slice
>> ffmpeg-git/ffmpeg/.git/rebase-apply/patch:142: tab in indent.
>> int maxSize;
>> ffmpeg-git/ffmpeg/.git/rebase-apply/patch:153: tab in indent.
>> maxSize = FFMAX(c->vLumFilterSize, c->vChrFilterSize << c->chrSrcVSubSample);
>> warning: 2 lines add whitespace errors.
>> Using index info to reconstruct a base tree...
>> A libswscale/slice.c
>> M libswscale/swscale.c
>> M libswscale/swscale_internal.h
>> <stdin>:142: tab in indent.
>> int maxSize;
>> <stdin>:153: tab in indent.
>> maxSize = FFMAX(c->vLumFilterSize, c->vChrFilterSize << c->chrSrcVSubSample);
>> <stdin>:165: new blank line at EOF.
>> +
>> warning: 2 lines applied after fixing whitespace errors.
>> Falling back to patching base and 3-way merge...
>> error: refusing to lose untracked file at 'libswscale/slice.c'
>> Auto-merging libswscale/swscale_internal.h
>> CONFLICT (content): Merge conflict in libswscale/swscale_internal.h
>> Auto-merging libswscale/swscale.c
>> CONFLICT (content): Merge conflict in libswscale/swscale.c
>> CONFLICT (modify/delete): libswscale/slice.c deleted in HEAD and modified in swscale: fix seg fault when accessing src slice. Version swscale: fix seg fault when accessing src slice of libswscale/slice.c left in tree.
>> Failed to merge in the changes.
>> Patch failed at 0001 swscale: fix seg fault when accessing src slice
>> The copy of the patch that failed is found in:
>> ffmpeg-git/ffmpeg/.git/rebase-apply/patch
>> When you have resolved this problem, run "git am --continue".
>> If you prefer to skip this patch, run "git am --skip" instead.
>> To restore the original branch and stop patching, run "git am --abort".
>>
>> [...]
>> --
>> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
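
For reference, the workflow suggested in the quoted reply (start from origin/master, create a new branch, cherry-pick the commits on top) could be scripted roughly as follows. This is only a sketch under assumptions: the function name rebuild_branch and the branch names in the usage note are hypothetical, and skipping merge commits with --no-merges is an assumption about how to drop the merges, not something spelled out in the thread.

```shell
#!/bin/sh
# Sketch: rebuild a topic branch on top of an upstream ref without the
# merge commits. The function name and its arguments are hypothetical.
#
# Usage: rebuild_branch <upstream-ref> <old-branch> <new-branch>
rebuild_branch() {
    upstream=$1
    old=$2
    new=$3

    # Create the new branch at the current upstream tip and switch to it.
    git checkout -q -b "$new" "$upstream" || return 1

    # Commits reachable from the old branch but not from upstream,
    # oldest first, leaving out merge commits (assumption: the merges
    # carry no changes of their own that need to be kept).
    commits=$(git rev-list --reverse --no-merges "$upstream..$old")
    if [ -z "$commits" ]; then
        echo "nothing to replay from $old" >&2
        return 0
    fi

    # Replay them in a single cherry-pick run. On a conflict, fix the
    # files, 'git add' them and run 'git cherry-pick --continue' to
    # carry on with the remaining commits.
    # $commits is left unquoted on purpose so each hash is one argument.
    git cherry-pick $commits
}
```

For example, "rebuild_branch origin/master swscale-refactor swscale-refactor-clean" followed by "git format-patch origin/master" would regenerate the series from the cleaned branch; both branch names are made up here.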
From 1501f5a88afb8c1cd128969a2d950bcca594c11f Mon Sep 17 00:00:00 2001 From: Pedro Arthur <bygran...@gmail.com> Date: Sun, 24 May 2015 12:52:46 -0300 Subject: [PATCH 01/11] swscale refactor: added initial filters Signed-off-by: Pedro Arthur <bygran...@gmail.com> --- libswscale/slice.c | 299 ++++++++++++++++++++++++++++++++++++++++++ libswscale/swscale.c | 33 ++++- libswscale/swscale_internal.h | 41 ++++++ libswscale/utils.c | 3 + 4 files changed, 374 insertions(+), 2 deletions(-) create mode 100644 libswscale/slice.c diff --git a/libswscale/slice.c b/libswscale/slice.c new file mode 100644 index 0000000..4f40ae6 --- /dev/null +++ b/libswscale/slice.c @@ -0,0 +1,299 @@ +#include "swscale_internal.h" + +/* +int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lines, int v_sub_sample, int h_sub_sample); +void free_slice(SwsSlice *s); +int init_slice_1(SwsSlice *s, uint8_t *v, uint8_t *v2, int dstW, int sliceY, int sliceH); +int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH); +int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH); +*/ + + +int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int sliceY, int sliceH, int skip); +int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int sliceY, int sliceH); + +/* +int ff_init_desc_fmt_convert(SwsFilterDescriptor *desc, SwsSlice * src, SwsSlice *dst, uint32_t *pal); +int ff_init_desc_hscale(SwsFilterDescriptor *desc, SwsSlice *src, SwsSlice *dst, uint16_t *filter, int * filter_pos, int filter_size, int xInc); +*/ + +int ff_init_filters(SwsContext *c); +int ff_free_filters(SwsContext *c); + + +static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lines, int v_sub_sample, int h_sub_sample) +{ + int i; + int err = 0; + + int size[4] = { lines, + FF_CEIL_RSHIFT(lines, v_sub_sample), + FF_CEIL_RSHIFT(lines, v_sub_sample), + lines }; + + //s->width; + s->h_chr_sub_sample = h_sub_sample; + s->v_chr_sub_sample 
= v_sub_sample; + s->fmt = fmt; + + for (i = 0; i < 4; ++i) + { + s->plane[i].line = av_malloc(sizeof(uint8_t*) * size[i]); + if (!s->plane[i].line) + { + err = AVERROR(ENOMEM); + break; + } + s->plane[i].available_lines = size[i]; + s->plane[i].sliceY = 0; + s->plane[i].sliceH = 0; + } + + if (err) + { + for (--i; i >= 0; --i) + av_free(s->plane[i].line); + return err; + } + return 1; +} + +static void free_slice(SwsSlice *s) +{ + int i; + for (i = 0; i < 4; ++i) + av_free(s->plane[i].line); +} + +int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int sliceY, int sliceH, int skip) +{ + int i = 0; + + int start[4] = {sliceY, + sliceY >> s->v_chr_sub_sample, + sliceY >> s->v_chr_sub_sample, + sliceY}; + + int stride1[4] = {stride[0], + stride[1] << skip, + stride[2] << skip, + stride[3]}; + + s->width = srcW; + + for (i = 0; i < 4; ++i) + { + int j; + int lines = FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample); + lines = s->plane[i].available_lines < lines ? s->plane[i].available_lines : lines; + + s->plane[i].sliceY = sliceY; + s->plane[i].sliceH = lines; + + for (j = 0; j < lines; j+= 1 << skip) + s->plane[i].line[j] = src[i] + (start[i] + j) * stride1[i]; + + } + + return 1; +} + +int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int sliceY, int sliceH) +{ + int i; + s->width = dstW; + for (i = 0; i < 4; ++i) + { + int j; + int lines = FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample); + lines = s->plane[i].available_lines < lines ? s->plane[i].available_lines : lines; + + s->plane[i].sliceY = sliceY; + s->plane[i].sliceH = lines; + + for (j = 0; j < lines; ++j) + { + uint8_t * v = linesPool[i] ? 
linesPool[i][j] : NULL; + s->plane[i].line[j] = v; + } + + } + return 1; +} + +static int init_slice_1(SwsSlice *s, uint8_t *v, uint8_t *v2, int dstW, int sliceY, int sliceH) +{ + int i; + uint8_t *ptr[4] = {v, v, v, v2}; + s->width = dstW; + for (i = 0; i < 4; ++i) + { + int j; + int lines = s->plane[i].available_lines; + + s->plane[i].sliceY = sliceY; + s->plane[i].sliceH = lines; + + for (j = 0; j < lines; ++j) + s->plane[i].line[j] = ptr[i]; + + } + return 1; +} + + +static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH) +{ + int srcW = desc->src->width; + int dstW = desc->dst->width; + int xInc = desc->xInc; + + uint8_t ** src = desc->src->plane[0].line; + uint8_t ** dst = desc->dst->plane[0].line; + + int src_pos = sliceY - desc->src->plane[0].sliceY; + int dst_pos = sliceY - desc->dst->plane[0].sliceY; + + + + if (!c->hyscale_fast) { + c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], desc->filter, + desc->filter_pos, desc->filter_size); + } else { // fast bilinear upscale / crap downscale + c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc); + } + + if (c->lumConvertRange) + c->lumConvertRange((int16_t*)dst[dst_pos], dstW); + + + if (desc->alpha) + { + src = desc->src->plane[3].line; + dst = desc->dst->plane[3].line; + + src_pos = sliceY - desc->src->plane[3].sliceY; + dst_pos = sliceY - desc->dst->plane[3].sliceY; + + + + if (!c->hyscale_fast) { + c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], desc->filter, + desc->filter_pos, desc->filter_size); + } else { // fast bilinear upscale / crap downscale + c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc); + } + } + + + + return 1; +} + +static int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH) +{ + int srcW = desc->src->width; + uint32_t * pal = desc->pal; + + int sp = sliceY - desc->src->plane[0].sliceY; + int dp = sliceY - 
desc->dst->plane[0].sliceY; + + const uint8_t * src[4] = { desc->src->plane[0].line[sp], + desc->src->plane[1].line[sp], + desc->src->plane[2].line[sp], + desc->src->plane[3].line[sp]}; + uint8_t * dst = desc->dst->plane[0].line[0/*dp*/]; + + desc->dst->plane[0].sliceY = sliceY; + desc->dst->plane[0].sliceH = sliceH; + desc->dst->plane[3].sliceY = sliceY; + desc->dst->plane[3].sliceH = sliceH; + + if (c->lumToYV12) { + c->lumToYV12(dst, src[0], src[1], src[2], srcW, pal); + } else if (c->readLumPlanar) { + c->readLumPlanar(dst, src, srcW, c->input_rgb2yuv_table); + } + + + if (desc->alpha) + { + dp = sliceY - desc->dst->plane[3].sliceY; + dst = desc->dst->plane[3].line[dp]; + if (c->alpToYV12) { + c->alpToYV12(dst, src[3], src[1], src[2], srcW, pal); + } else if (c->readAlpPlanar) { + c->readAlpPlanar(dst, src, srcW, NULL); + } + } + + return 1; +} + +static int init_desc_fmt_convert(SwsFilterDescriptor *desc, SwsSlice * src, SwsSlice *dst, uint32_t *pal) +{ + desc->alpha = isALPHA(src->fmt) && isALPHA(dst->fmt); + desc->pal = pal; + desc->src =src; + desc->dst = dst; + desc->process = &lum_convert; + + return 1; +} + + +static int init_desc_hscale(SwsFilterDescriptor *desc, SwsSlice *src, SwsSlice *dst, uint16_t *filter, int * filter_pos, int filter_size, int xInc) +{ + desc->alpha = isALPHA(src->fmt) && isALPHA(dst->fmt); + desc->filter = filter; + desc->filter_pos = filter_pos; + desc->filter_size = filter_size; + + desc->src = src; + desc->dst = dst; + + desc->xInc = xInc; + desc->process = &lum_h_scale; + + return 1; +} + +int ff_init_filters(SwsContext * c) +{ + int i; + int need_convert = c->lumToYV12 || c->readLumPlanar || c->alpToYV12 || c->readAlpPlanar; + + c->numDesc = need_convert ? 
2 : 1; + c->desc = av_malloc(sizeof(SwsFilterDescriptor) * c->numDesc); + c->slice = av_malloc(sizeof(SwsSlice) * (c->numDesc+1)); + + for (i = 0; i < c->numDesc+1; ++i) + alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, 0, 0); + + i = 0; + if (need_convert) + { + init_desc_fmt_convert(&c->desc[i], &c->slice[i], &c->slice[i+1], (uint32_t) usePal(c->srcFormat) ? c->pal_yuv : c->input_rgb2yuv_table); + init_slice_1(&c->slice[i+1], c->formatConvBuffer, (c->formatConvBuffer + FFALIGN(c->srcW*2+78, 16)), c->srcW, 0, c->vLumFilterSize); + c->desc[i].alpha = c->alpPixBuf != 0; + ++i; + } + + + init_desc_hscale(&c->desc[i], &c->slice[i], &c->slice[i+1], c->hLumFilter, c->hLumFilterPos, c->hLumFilterSize, c->lumXInc); + c->desc[i].alpha = c->alpPixBuf != 0; + + return 1; +} + +int ff_free_filters(SwsContext *c) +{ + av_freep(&c->desc); + if (c->slice) + { + int i; + for (i = 0; i < c->numDesc+1; ++i) + free_slice(&c->slice[i]); + } + return 1; +} diff --git a/libswscale/swscale.c b/libswscale/swscale.c index 1945e1d..0fcf59b 100644 --- a/libswscale/swscale.c +++ b/libswscale/swscale.c @@ -315,6 +315,9 @@ static av_always_inline void hcscale(SwsContext *c, int16_t *dst1, if (DEBUG_SWSCALE_BUFFERS) \ av_log(c, AV_LOG_DEBUG, __VA_ARGS__) + +#include "slice.c" + static int swscale(SwsContext *c, const uint8_t *src[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]) @@ -371,6 +374,12 @@ static int swscale(SwsContext *c, const uint8_t *src[], int lastInChrBuf = c->lastInChrBuf; int perform_gamma = c->is_internal_gamma; + int numDesc = c->numDesc; + SwsSlice *src_slice = &c->slice[0]; + SwsSlice *dst_slice = &c->slice[numDesc]; + SwsFilterDescriptor *desc = c->desc; + int16_t **line_pool[4]; + if (!usePal(c->srcFormat)) { pal = c->input_rgb2yuv_table; @@ -439,6 +448,7 @@ static int swscale(SwsContext *c, const uint8_t *src[], } lastDstY = dstY; + for (; dstY < dstH; dstY++) { const int chrDstY = dstY >> c->chrDstVSubSample; uint8_t *dest[4] 
= { @@ -486,6 +496,19 @@ static int swscale(SwsContext *c, const uint8_t *src[], lastLumSrcY, lastChrSrcY); } +#define NEW_FILTER 1 + + +#if NEW_FILTER + line_pool[0] = &lumPixBuf[lumBufIndex + 1]; + line_pool[1] = &chrUPixBuf[chrBufIndex + 1]; + line_pool[2] = &chrVPixBuf[chrBufIndex + 1]; + line_pool[3] = alpPixBuf ? &alpPixBuf[lumBufIndex + 1] : NULL; + + ff_init_slice_from_src(src_slice, (uint8_t**)src, srcStride, c->srcW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf, 0); + ff_init_slice_from_lp(dst_slice, (uint8_t ***)line_pool, dstW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf); + +#endif // Do horizontal scaling while (lastInLumBuf < lastLumSrcY) { const uint8_t *src1[4] = { @@ -494,6 +517,7 @@ static int swscale(SwsContext *c, const uint8_t *src[], src[2] + (lastInLumBuf + 1 - srcSliceY) * srcStride[2], src[3] + (lastInLumBuf + 1 - srcSliceY) * srcStride[3], }; + int i; lumBufIndex++; av_assert0(lumBufIndex < 2 * vLumBufSize); av_assert0(lastInLumBuf + 1 - srcSliceY < srcSliceH); @@ -501,7 +525,10 @@ static int swscale(SwsContext *c, const uint8_t *src[], if (perform_gamma) gamma_convert((uint8_t **)src1, srcW, c->inv_gamma); - +#if NEW_FILTER + for (i = 0; i < numDesc; ++i) + desc[i].process(c, &desc[i], lastInLumBuf + 1, 1); +#else hyscale(c, lumPixBuf[lumBufIndex], dstW, src1, srcW, lumXInc, hLumFilter, hLumFilterPos, hLumFilterSize, formatConvBuffer, pal, 0); @@ -509,6 +536,7 @@ static int swscale(SwsContext *c, const uint8_t *src[], hyscale(c, alpPixBuf[lumBufIndex], dstW, src1, srcW, lumXInc, hLumFilter, hLumFilterPos, hLumFilterSize, formatConvBuffer, pal, 1); +#endif lastInLumBuf++; DEBUG_BUFFERS("\t\tlumBufIndex %d: lastInLumBuf: %d\n", lumBufIndex, lastInLumBuf); @@ -763,6 +791,8 @@ SwsFunc ff_getSwsFunc(SwsContext *c) if (ARCH_X86) ff_sws_init_swscale_x86(c); + ff_init_filters(c); + return swscale; } @@ -1151,4 +1181,3 @@ int attribute_align_arg sws_scale(struct SwsContext *c, av_free(rgb0_tmp); return ret; } - diff --git 
a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index 2299aa5..8a3a1a3 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -269,6 +269,9 @@ typedef void (*yuv2anyX_fn)(struct SwsContext *c, const int16_t *lumFilter, const int16_t **alpSrc, uint8_t **dest, int dstW, int y); +struct SwsSlice; +struct SwsFilterDescriptor; + /* This struct should be aligned on at least a 32-byte boundary. */ typedef struct SwsContext { /** @@ -319,6 +322,10 @@ typedef struct SwsContext { uint16_t *gamma; uint16_t *inv_gamma; + int numDesc; + struct SwsSlice *slice; + struct SwsFilterDescriptor *desc; + uint32_t pal_yuv[256]; uint32_t pal_rgb[256]; @@ -908,4 +915,38 @@ static inline void fillPlane16(uint8_t *plane, int stride, int width, int height } } + +typedef struct SwsPlane +{ + int available_lines; + int sliceY; + int sliceH; + uint8_t **line; +} SwsPlane; + +typedef struct SwsSlice +{ + int width; + int h_chr_sub_sample; + int v_chr_sub_sample; + enum AVPixelFormat fmt; + SwsPlane plane[4]; +} SwsSlice; + +typedef struct SwsFilterDescriptor +{ + SwsSlice * src; + SwsSlice * dst; + + uint16_t * filter; + int * filter_pos; + int filter_size; + + int alpha; + int xInc; + uint32_t * pal; + + int (*process)(SwsContext*, struct SwsFilterDescriptor*, int, int); +} SwsFilterDescriptor; + #endif /* SWSCALE_SWSCALE_INTERNAL_H */ diff --git a/libswscale/utils.c b/libswscale/utils.c index c384aa5..252423d 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -2028,6 +2028,8 @@ void sws_freeFilter(SwsFilter *filter) av_free(filter); } +extern int ff_free_filters(SwsContext *s); + void sws_freeContext(SwsContext *c) { int i; @@ -2102,6 +2104,7 @@ void sws_freeContext(SwsContext *c) av_freep(&c->gamma); av_freep(&c->inv_gamma); + ff_free_filters(c); av_free(c); } -- 1.9.1 From 2bc4e6c4b65dfb6e0faf95bec0d85080e13438f4 Mon Sep 17 00:00:00 2001 From: Pedro Arthur <bygran...@gmail.com> Date: Mon, 15 Jun 2015 12:48:02 -0300 Subject: [PATCH 02/11] 
FIXES av_free -> av_freep av_malloc -> av_malloc_array added slice.c to Makefile correct slice alocation size --- libswscale/Makefile | 1 + libswscale/slice.c | 55 +++++++++++++++++++------------------------ libswscale/swscale.c | 1 - libswscale/swscale_internal.h | 8 ++++++- 4 files changed, 32 insertions(+), 33 deletions(-) diff --git a/libswscale/Makefile b/libswscale/Makefile index a60b057..d876e75 100644 --- a/libswscale/Makefile +++ b/libswscale/Makefile @@ -14,6 +14,7 @@ OBJS = hscale_fast_bilinear.o \ swscale_unscaled.o \ utils.o \ yuv2rgb.o \ + slice.o \ OBJS-$(CONFIG_SHARED) += log2_tab.o diff --git a/libswscale/slice.c b/libswscale/slice.c index 4f40ae6..353d214 100644 --- a/libswscale/slice.c +++ b/libswscale/slice.c @@ -1,25 +1,5 @@ #include "swscale_internal.h" -/* -int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lines, int v_sub_sample, int h_sub_sample); -void free_slice(SwsSlice *s); -int init_slice_1(SwsSlice *s, uint8_t *v, uint8_t *v2, int dstW, int sliceY, int sliceH); -int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH); -int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH); -*/ - - -int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int sliceY, int sliceH, int skip); -int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int sliceY, int sliceH); - -/* -int ff_init_desc_fmt_convert(SwsFilterDescriptor *desc, SwsSlice * src, SwsSlice *dst, uint32_t *pal); -int ff_init_desc_hscale(SwsFilterDescriptor *desc, SwsSlice *src, SwsSlice *dst, uint16_t *filter, int * filter_pos, int filter_size, int xInc); -*/ - -int ff_init_filters(SwsContext *c); -int ff_free_filters(SwsContext *c); - static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lines, int v_sub_sample, int h_sub_sample) { @@ -38,7 +18,7 @@ static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lines, int v_su for (i = 0; i < 4; ++i) { - s->plane[i].line 
= av_malloc(sizeof(uint8_t*) * size[i]); + s->plane[i].line = av_malloc_array(sizeof(uint8_t*), size[i]); if (!s->plane[i].line) { err = AVERROR(ENOMEM); @@ -52,7 +32,7 @@ static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lines, int v_su if (err) { for (--i; i >= 0; --i) - av_free(s->plane[i].line); + av_freep(&s->plane[i].line); return err; } return 1; @@ -62,32 +42,37 @@ static void free_slice(SwsSlice *s) { int i; for (i = 0; i < 4; ++i) - av_free(s->plane[i].line); + av_freep(&s->plane[i].line); } int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int sliceY, int sliceH, int skip) { int i = 0; - int start[4] = {sliceY, + const int start[4] = {sliceY, sliceY >> s->v_chr_sub_sample, sliceY >> s->v_chr_sub_sample, sliceY}; - int stride1[4] = {stride[0], + const int stride1[4] = {stride[0], stride[1] << skip, stride[2] << skip, stride[3]}; + const int height[4] = {sliceH, + FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample), + FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample), + sliceH}; + s->width = srcW; for (i = 0; i < 4; ++i) { int j; - int lines = FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample); + int lines = height[i]; lines = s->plane[i].available_lines < lines ? 
s->plane[i].available_lines : lines; - s->plane[i].sliceY = sliceY; + s->plane[i].sliceY = start[i]; s->plane[i].sliceH = lines; for (j = 0; j < lines; j+= 1 << skip) @@ -101,14 +86,22 @@ int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int src int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int sliceY, int sliceH) { int i; + const int start[4] = {sliceY, + sliceY >> s->v_chr_sub_sample, + sliceY >> s->v_chr_sub_sample, + sliceY}; + const int height[4] = {sliceH, + FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample), + FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample), + sliceH}; s->width = dstW; for (i = 0; i < 4; ++i) { int j; - int lines = FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample); + int lines = height[i]; lines = s->plane[i].available_lines < lines ? s->plane[i].available_lines : lines; - s->plane[i].sliceY = sliceY; + s->plane[i].sliceY = start[i]; s->plane[i].sliceH = lines; for (j = 0; j < lines; ++j) @@ -264,8 +257,8 @@ int ff_init_filters(SwsContext * c) int need_convert = c->lumToYV12 || c->readLumPlanar || c->alpToYV12 || c->readAlpPlanar; c->numDesc = need_convert ? 
2 : 1; - c->desc = av_malloc(sizeof(SwsFilterDescriptor) * c->numDesc); - c->slice = av_malloc(sizeof(SwsSlice) * (c->numDesc+1)); + c->desc = av_malloc_array(sizeof(SwsFilterDescriptor), c->numDesc); + c->slice = av_malloc_array(sizeof(SwsSlice), c->numDesc + 1); for (i = 0; i < c->numDesc+1; ++i) alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, 0, 0); diff --git a/libswscale/swscale.c b/libswscale/swscale.c index 0fcf59b..34ca956 100644 --- a/libswscale/swscale.c +++ b/libswscale/swscale.c @@ -316,7 +316,6 @@ static av_always_inline void hcscale(SwsContext *c, int16_t *dst1, av_log(c, AV_LOG_DEBUG, __VA_ARGS__) -#include "slice.c" static int swscale(SwsContext *c, const uint8_t *src[], int srcStride[], int srcSliceY, diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index 8a3a1a3..40b8633 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -915,6 +915,7 @@ static inline void fillPlane16(uint8_t *plane, int stride, int width, int height } } +#define MAX_SLICE_PLANES 4 typedef struct SwsPlane { @@ -930,7 +931,7 @@ typedef struct SwsSlice int h_chr_sub_sample; int v_chr_sub_sample; enum AVPixelFormat fmt; - SwsPlane plane[4]; + SwsPlane plane[MAX_SLICE_PLANES]; } SwsSlice; typedef struct SwsFilterDescriptor @@ -949,4 +950,9 @@ typedef struct SwsFilterDescriptor int (*process)(SwsContext*, struct SwsFilterDescriptor*, int, int); } SwsFilterDescriptor; +int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int sliceY, int sliceH, int skip); +int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int sliceY, int sliceH); +int ff_init_filters(SwsContext *c); +int ff_free_filters(SwsContext *c); + #endif /* SWSCALE_SWSCALE_INTERNAL_H */ -- 1.9.1 From a7d20dd467ea1faa1c6c312bb1df6117182af420 Mon Sep 17 00:00:00 2001 From: Pedro Arthur <bygran...@gmail.com> Date: Wed, 17 Jun 2015 18:45:16 -0300 Subject: [PATCH 03/11] swscale: move variable filter attributes to 
filter->instance --- libswscale/slice.c | 41 +++++++++++++++++++++++++++++------------ libswscale/swscale_internal.h | 20 ++++++++++++++------ libswscale/utils.c | 2 -- 3 files changed, 43 insertions(+), 20 deletions(-) diff --git a/libswscale/slice.c b/libswscale/slice.c index 353d214..23532d8 100644 --- a/libswscale/slice.c +++ b/libswscale/slice.c @@ -137,9 +137,10 @@ static int init_slice_1(SwsSlice *s, uint8_t *v, uint8_t *v2, int dstW, int slic static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH) { + LumScaleInstance *instance = desc->instance; int srcW = desc->src->width; int dstW = desc->dst->width; - int xInc = desc->xInc; + int xInc = instance->xInc; uint8_t ** src = desc->src->plane[0].line; uint8_t ** dst = desc->dst->plane[0].line; @@ -150,8 +151,8 @@ static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int if (!c->hyscale_fast) { - c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], desc->filter, - desc->filter_pos, desc->filter_size); + c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], instance->filter, + instance->filter_pos, instance->filter_size); } else { // fast bilinear upscale / crap downscale c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc); } @@ -171,8 +172,8 @@ static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int if (!c->hyscale_fast) { - c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], desc->filter, - desc->filter_pos, desc->filter_size); + c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], instance->filter, + instance->filter_pos, instance->filter_size); } else { // fast bilinear upscale / crap downscale c->hyscale_fast(c, (int16_t*)dst[dst_pos], dstW, src[src_pos], srcW, xInc); } @@ -186,7 +187,8 @@ static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int static int lum_convert(SwsContext *c, 
SwsFilterDescriptor *desc, int sliceY, int sliceH) { int srcW = desc->src->width; - uint32_t * pal = desc->pal; + LumConvertInstance * instance = desc->instance; + uint32_t * pal = instance->pal; int sp = sliceY - desc->src->plane[0].sliceY; int dp = sliceY - desc->dst->plane[0].sliceY; @@ -225,8 +227,13 @@ static int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int static int init_desc_fmt_convert(SwsFilterDescriptor *desc, SwsSlice * src, SwsSlice *dst, uint32_t *pal) { + LumConvertInstance * li = av_malloc(sizeof(LumConvertInstance)); + if (!li) + return AVERROR(ENOMEM); + li->pal = pal; + desc->instance = li; + desc->alpha = isALPHA(src->fmt) && isALPHA(dst->fmt); - desc->pal = pal; desc->src =src; desc->dst = dst; desc->process = &lum_convert; @@ -237,15 +244,21 @@ static int init_desc_fmt_convert(SwsFilterDescriptor *desc, SwsSlice * src, SwsS static int init_desc_hscale(SwsFilterDescriptor *desc, SwsSlice *src, SwsSlice *dst, uint16_t *filter, int * filter_pos, int filter_size, int xInc) { - desc->alpha = isALPHA(src->fmt) && isALPHA(dst->fmt); - desc->filter = filter; - desc->filter_pos = filter_pos; - desc->filter_size = filter_size; + LumScaleInstance *li = av_malloc(sizeof(LumScaleInstance)); + if (!li) + return AVERROR(ENOMEM); + + li->filter = filter; + li->filter_pos = filter_pos; + li->filter_size = filter_size; + li->xInc = xInc; + + desc->instance = li; + desc->alpha = isALPHA(src->fmt) && isALPHA(dst->fmt); desc->src = src; desc->dst = dst; - desc->xInc = xInc; desc->process = &lum_h_scale; return 1; @@ -281,6 +294,10 @@ int ff_init_filters(SwsContext * c) int ff_free_filters(SwsContext *c) { + int i; + for (i = 0; i < c->numDesc; ++i) + av_freep(&c->desc->instance); + av_freep(&c->desc); if (c->slice) { diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index 40b8633..2fbbb5e 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -939,17 +939,25 @@ typedef struct 
SwsFilterDescriptor SwsSlice * src; SwsSlice * dst; - uint16_t * filter; - int * filter_pos; - int filter_size; - int alpha; - int xInc; - uint32_t * pal; + void * instance; int (*process)(SwsContext*, struct SwsFilterDescriptor*, int, int); } SwsFilterDescriptor; +typedef struct LumConvertInstance +{ + uint32_t * pal; +} LumConvertInstance; + +typedef struct LumScaleInstance +{ + uint16_t * filter; + int * filter_pos; + int filter_size; + int xInc; +} LumScaleInstance; + int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int sliceY, int sliceH, int skip); int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int sliceY, int sliceH); int ff_init_filters(SwsContext *c); diff --git a/libswscale/utils.c b/libswscale/utils.c index 252423d..a19ea38 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -2028,8 +2028,6 @@ void sws_freeFilter(SwsFilter *filter) av_free(filter); } -extern int ff_free_filters(SwsContext *s); - void sws_freeContext(SwsContext *c) { int i; -- 1.9.1 From 70d7ca477c90a9c295ed2164a43365422fb70b16 Mon Sep 17 00:00:00 2001 From: Pedro Arthur <bygran...@gmail.com> Date: Sun, 21 Jun 2015 12:52:26 -0300 Subject: [PATCH 04/11] swscale: rename instance structs --- libswscale/slice.c | 8 ++++---- libswscale/swscale_internal.h | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/libswscale/slice.c b/libswscale/slice.c index 23532d8..07eb754 100644 --- a/libswscale/slice.c +++ b/libswscale/slice.c @@ -137,7 +137,7 @@ static int init_slice_1(SwsSlice *s, uint8_t *v, uint8_t *v2, int dstW, int slic static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH) { - LumScaleInstance *instance = desc->instance; + ScaleInstance *instance = desc->instance; int srcW = desc->src->width; int dstW = desc->dst->width; int xInc = instance->xInc; @@ -187,7 +187,7 @@ static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int static int 
lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH) { int srcW = desc->src->width; - LumConvertInstance * instance = desc->instance; + ConvertInstance * instance = desc->instance; uint32_t * pal = instance->pal; int sp = sliceY - desc->src->plane[0].sliceY; @@ -227,7 +227,7 @@ static int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int static int init_desc_fmt_convert(SwsFilterDescriptor *desc, SwsSlice * src, SwsSlice *dst, uint32_t *pal) { - LumConvertInstance * li = av_malloc(sizeof(LumConvertInstance)); + ConvertInstance * li = av_malloc(sizeof(ConvertInstance)); if (!li) return AVERROR(ENOMEM); li->pal = pal; @@ -244,7 +244,7 @@ static int init_desc_fmt_convert(SwsFilterDescriptor *desc, SwsSlice * src, SwsS static int init_desc_hscale(SwsFilterDescriptor *desc, SwsSlice *src, SwsSlice *dst, uint16_t *filter, int * filter_pos, int filter_size, int xInc) { - LumScaleInstance *li = av_malloc(sizeof(LumScaleInstance)); + ScaleInstance *li = av_malloc(sizeof(ScaleInstance)); if (!li) return AVERROR(ENOMEM); diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index 2fbbb5e..3256b7b 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -945,18 +945,18 @@ typedef struct SwsFilterDescriptor int (*process)(SwsContext*, struct SwsFilterDescriptor*, int, int); } SwsFilterDescriptor; -typedef struct LumConvertInstance +typedef struct ConvertInstance { uint32_t * pal; -} LumConvertInstance; +} ConvertInstance; -typedef struct LumScaleInstance +typedef struct ScaleInstance { uint16_t * filter; int * filter_pos; int filter_size; int xInc; -} LumScaleInstance; +} ScaleInstance; int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int sliceY, int sliceH, int skip); int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int sliceY, int sliceH); -- 1.9.1 From dd644e7747d2fd48889d48bf31e7599f8dbe13f0 Mon Sep 17 00:00:00 2001 From: Pedro 
Arthur <bygran...@gmail.com>
Date: Sun, 21 Jun 2015 13:26:12 -0300
Subject: [PATCH 05/11] swscale: corrected chroma sliceY & sliceH
---
 libswscale/slice.c            | 63 ++++++++++++++++++++++---------------------
 libswscale/swscale.c          | 4 +--
 libswscale/swscale_internal.h | 4 +--
 3 files changed, 36 insertions(+), 35 deletions(-)
diff --git a/libswscale/slice.c b/libswscale/slice.c
index 07eb754..3fff0fd 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -1,15 +1,15 @@
 #include "swscale_internal.h"
-static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lines, int v_sub_sample, int h_sub_sample)
+static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lumLines, int chrLines, int h_sub_sample, int v_sub_sample)
 {
     int i;
     int err = 0;
-
-    int size[4] = { lines,
-                    FF_CEIL_RSHIFT(lines, v_sub_sample),
-                    FF_CEIL_RSHIFT(lines, v_sub_sample),
-                    lines };
+
+    int size[4] = { lumLines,
+                    chrLines,
+                    chrLines,
+                    lumLines };
     //s->width;
     s->h_chr_sub_sample = h_sub_sample;
@@ -45,24 +45,24 @@ static void free_slice(SwsSlice *s)
     av_freep(&s->plane[i].line);
 }
-int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int sliceY, int sliceH, int skip)
+int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int lumY, int lumH, int chrY, int chrH, int skip)
 {
     int i = 0;
-    const int start[4] = {sliceY,
-                          sliceY >> s->v_chr_sub_sample,
-                          sliceY >> s->v_chr_sub_sample,
-                          sliceY};
-
+    const int start[4] = {lumY,
+                          chrY,
+                          chrY,
+                          lumY};
+
     const int stride1[4] = {stride[0],
-                            stride[1] << skip,
-                            stride[2] << skip,
-                            stride[3]};
-
-    const int height[4] = {sliceH,
-                           FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample),
-                           FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample),
-                           sliceH};
+                            stride[1] << skip,
+                            stride[2] << skip,
+                            stride[3]};
+
+    const int height[4] = {lumH,
+                           chrH,
+                           chrH,
+                           lumH};
     s->width = srcW;
@@ -83,17 +83,18 @@ int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int src
     return 1;
 }
-int
ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int sliceY, int sliceH)
+int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int lumY, int lumH, int chrY, int chrH)
 {
     int i;
-    const int start[4] = {sliceY,
-                          sliceY >> s->v_chr_sub_sample,
-                          sliceY >> s->v_chr_sub_sample,
-                          sliceY};
-    const int height[4] = {sliceH,
-                           FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample),
-                           FF_CEIL_RSHIFT(sliceH, s->v_chr_sub_sample),
-                           sliceH};
+    const int start[4] = {lumY,
+                          chrY,
+                          chrY,
+                          lumY};
+
+    const int height[4] = {lumH,
+                           chrH,
+                           chrH,
+                           lumH};
     s->width = dstW;
     for (i = 0; i < 4; ++i)
     {
@@ -117,7 +118,7 @@ int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int slice
 static int init_slice_1(SwsSlice *s, uint8_t *v, uint8_t *v2, int dstW, int sliceY, int sliceH)
 {
     int i;
-    uint8_t *ptr[4] = {v, v, v, v2};
+    uint8_t *ptr[4] = {v, v, v2, v2};
     s->width = dstW;
     for (i = 0; i < 4; ++i)
     {
@@ -274,7 +275,7 @@ int ff_init_filters(SwsContext * c)
     c->slice = av_malloc_array(sizeof(SwsSlice), c->numDesc + 1);
     for (i = 0; i < c->numDesc+1; ++i)
-        alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, 0, 0);
+        alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrSrcHSubSample, c->chrSrcVSubSample);
     i = 0;
     if (need_convert)
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 34ca956..5e55d98 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -504,8 +504,8 @@ static int swscale(SwsContext *c, const uint8_t *src[],
         line_pool[2] = &chrVPixBuf[chrBufIndex + 1];
         line_pool[3] = alpPixBuf ?
&alpPixBuf[lumBufIndex + 1] : NULL;
-        ff_init_slice_from_src(src_slice, (uint8_t**)src, srcStride, c->srcW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf, 0);
-        ff_init_slice_from_lp(dst_slice, (uint8_t ***)line_pool, dstW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf);
+        ff_init_slice_from_src(src_slice, (uint8_t**)src, srcStride, c->srcW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf, lastInChrBuf + 1, lastChrSrcY - lastInChrBuf, 0);
+        ff_init_slice_from_lp(dst_slice, (uint8_t ***)line_pool, dstW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf, lastInChrBuf + 1, lastChrSrcY - lastInChrBuf);
 #endif
         // Do horizontal scaling
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index 3256b7b..1941503 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -958,8 +958,8 @@ typedef struct ScaleInstance
     int xInc;
 } ScaleInstance;
-int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int sliceY, int sliceH, int skip);
-int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int sliceY, int sliceH);
+int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int lumY, int lumH, int chrY, int chrH, int skip);
+int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int lumY, int lumH, int chrY, int chrH);
 int ff_init_filters(SwsContext *c);
 int ff_free_filters(SwsContext *c);
--
1.9.1

From 45e9bdb67f3723af779f64a1df726e1443bcb740 Mon Sep 17 00:00:00 2001
From: Pedro Arthur <bygran...@gmail.com>
Date: Sun, 21 Jun 2015 14:01:07 -0300
Subject: [PATCH 06/11] swscale: initial horizontal chroma scaling work
---
 libswscale/slice.c            | 122 ++++++++++++++++++++++++++++++++++++++++--
 libswscale/swscale.c          | 8 +--
 libswscale/swscale_internal.h | 1 +
 3 files changed, 123 insertions(+), 8 deletions(-)
diff --git a/libswscale/slice.c b/libswscale/slice.c
index 3fff0fd..95dc9b7 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -265,30 +265,142 @@ static
int init_desc_hscale(SwsFilterDescriptor *desc, SwsSlice *src, SwsSlice *
     return 1;
 }
+static int chr_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
+{
+    ScaleInstance *instance = desc->instance;
+    int srcW = FF_CEIL_RSHIFT(desc->src->width, desc->src->h_chr_sub_sample);
+    int dstW = FF_CEIL_RSHIFT(desc->dst->width, desc->dst->h_chr_sub_sample);
+    int xInc = instance->xInc;
+
+    uint8_t ** src1 = desc->src->plane[1].line;
+    uint8_t ** dst1 = desc->dst->plane[1].line;
+    uint8_t ** src2 = desc->src->plane[2].line;
+    uint8_t ** dst2 = desc->dst->plane[2].line;
+
+    int src_pos1 = sliceY - desc->src->plane[1].sliceY;
+    int dst_pos1 = sliceY - desc->dst->plane[1].sliceY;
+
+    int src_pos2 = sliceY - desc->src->plane[2].sliceY;
+    int dst_pos2 = sliceY - desc->dst->plane[2].sliceY;
+
+
+
+    if (!c->hcscale_fast) {
+        c->hcScale(c, (uint16_t*)dst1[dst_pos1], dstW, src1[src_pos1], instance->filter, instance->filter_pos, instance->filter_size);
+        c->hcScale(c, (uint16_t*)dst2[dst_pos2], dstW, src2[src_pos2], instance->filter, instance->filter_pos, instance->filter_size);
+    } else { // fast bilinear upscale / crap downscale
+        c->hcscale_fast(c, (uint16_t*)dst1[dst_pos1], (uint16_t*)dst2[dst_pos2], dstW, src1[src_pos1], src2[src_pos2], srcW, xInc);
+    }
+
+    if (c->chrConvertRange)
+        c->chrConvertRange((uint16_t*)dst1[dst_pos1], (uint16_t*)dst2[dst_pos2], dstW);
+
+    return 1;
+}
+
+static int chr_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH)
+{
+    int srcW = FF_CEIL_RSHIFT(desc->src->width, desc->src->h_chr_sub_sample);
+    ConvertInstance * instance = desc->instance;
+    uint32_t * pal = instance->pal;
+
+    int sp = sliceY - desc->src->plane[1].sliceY;
+    int dp = sliceY - desc->dst->plane[1].sliceY;
+
+    const uint8_t * src[4] = { desc->src->plane[0].line[sp],
+                               desc->src->plane[1].line[sp],
+                               desc->src->plane[2].line[sp],
+                               desc->src->plane[3].line[sp]};
+    uint8_t * dst1 = desc->dst->plane[1].line[0/*dp*/];
+    uint8_t * dst2 =
desc->dst->plane[2].line[0/*dp*/];
+
+    desc->dst->plane[1].sliceY = sliceY;
+    desc->dst->plane[1].sliceH = sliceH;
+    desc->dst->plane[2].sliceY = sliceY;
+    desc->dst->plane[2].sliceH = sliceH;
+
+    if (c->chrToYV12) {
+        c->chrToYV12(dst1, dst2, src[0], src[1], src[2], srcW, pal);
+    } else if (c->readChrPlanar) {
+        c->readChrPlanar(dst1, dst2, src, srcW, c->input_rgb2yuv_table);
+    }
+
+    return 1;
+}
+
+static int init_desc_cfmt_convert(SwsFilterDescriptor *desc, SwsSlice * src, SwsSlice *dst, uint32_t *pal)
+{
+    ConvertInstance * li = av_malloc(sizeof(ConvertInstance));
+    if (!li)
+        return AVERROR(ENOMEM);
+    li->pal = pal;
+    desc->instance = li;
+
+    desc->src =src;
+    desc->dst = dst;
+    desc->process = &chr_convert;
+
+    return 1;
+}
+
+static int init_desc_chscale(SwsFilterDescriptor *desc, SwsSlice *src, SwsSlice *dst, uint16_t *filter, int * filter_pos, int filter_size, int xInc)
+{
+    ScaleInstance *li = av_malloc(sizeof(ScaleInstance));
+    if (!li)
+        return AVERROR(ENOMEM);
+
+    li->filter = filter;
+    li->filter_pos = filter_pos;
+    li->filter_size = filter_size;
+    li->xInc = xInc;
+
+    desc->instance = li;
+
+    desc->alpha = isALPHA(src->fmt) && isALPHA(dst->fmt);
+    desc->src = src;
+    desc->dst = dst;
+
+    desc->process = &chr_h_scale;
+
+    return 1;
+}
+
 int ff_init_filters(SwsContext * c)
 {
     int i;
+    int numDescPerChannel;
     int need_convert = c->lumToYV12 || c->readLumPlanar || c->alpToYV12 || c->readAlpPlanar;
-    c->numDesc = need_convert ? 2 : 1;
+    numDescPerChannel= need_convert ? 2 : 1;
+    c->numSlice = numDescPerChannel + 1;
+
+    c->numDesc = (c->needs_hcscale ?
2 : 1) * numDescPerChannel;
+
     c->desc = av_malloc_array(sizeof(SwsFilterDescriptor), c->numDesc);
-    c->slice = av_malloc_array(sizeof(SwsSlice), c->numDesc + 1);
+    c->slice = av_malloc_array(sizeof(SwsSlice), c->numSlice);
-    for (i = 0; i < c->numDesc+1; ++i)
+    for (i = 0; i < c->numSlice-1; ++i)
         alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrSrcHSubSample, c->chrSrcVSubSample);
+    alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrDstHSubSample, c->chrDstVSubSample);
     i = 0;
     if (need_convert)
     {
-        init_desc_fmt_convert(&c->desc[i], &c->slice[i], &c->slice[i+1], (uint32_t) usePal(c->srcFormat) ? c->pal_yuv : c->input_rgb2yuv_table);
+        init_desc_fmt_convert(&c->desc[i], &c->slice[i], &c->slice[i+1], usePal(c->srcFormat) ? c->pal_yuv : (uint32_t*)c->input_rgb2yuv_table);
         init_slice_1(&c->slice[i+1], c->formatConvBuffer, (c->formatConvBuffer + FFALIGN(c->srcW*2+78, 16)), c->srcW, 0, c->vLumFilterSize);
         c->desc[i].alpha = c->alpPixBuf != 0;
+
+        if (c->needs_hcscale)
+            init_desc_cfmt_convert(&c->desc[i+numDescPerChannel], &c->slice[i], &c->slice[i+1], usePal(c->srcFormat) ?
c->pal_yuv : (uint32_t*)c->input_rgb2yuv_table);
+
         ++i;
     }
     init_desc_hscale(&c->desc[i], &c->slice[i], &c->slice[i+1], c->hLumFilter, c->hLumFilterPos, c->hLumFilterSize, c->lumXInc);
     c->desc[i].alpha = c->alpPixBuf != 0;
+    if (c->needs_hcscale)
+        init_desc_chscale(&c->desc[i+numDescPerChannel], &c->slice[i], &c->slice[i+1], c->hChrFilter, c->hChrFilterPos, c->hChrFilterSize, c->chrXInc);
     return 1;
 }
@@ -303,7 +415,7 @@ int ff_free_filters(SwsContext *c)
     if (c->slice)
     {
         int i;
-        for (i = 0; i < c->numDesc+1; ++i)
+        for (i = 0; i < c->numSlice; ++i)
             free_slice(&c->slice[i]);
     }
     return 1;
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 5e55d98..eaaefc3 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -374,8 +374,10 @@ static int swscale(SwsContext *c, const uint8_t *src[],
     int perform_gamma = c->is_internal_gamma;
     int numDesc = c->numDesc;
-    SwsSlice *src_slice = &c->slice[0];
-    SwsSlice *dst_slice = &c->slice[numDesc];
+    int lumStart = 0;
+    int lumEnd = c->needs_hcscale ?
numDesc / 2 : numDesc;
+    SwsSlice *src_slice = &c->slice[lumStart];
+    SwsSlice *dst_slice = &c->slice[lumEnd];
     SwsFilterDescriptor *desc = c->desc;
     int16_t **line_pool[4];
@@ -525,7 +527,7 @@ static int swscale(SwsContext *c, const uint8_t *src[],
             if (perform_gamma)
                 gamma_convert((uint8_t **)src1, srcW, c->inv_gamma);
 #if NEW_FILTER
-            for (i = 0; i < numDesc; ++i)
+            for (i = lumStart; i < lumEnd; ++i)
                 desc[i].process(c, &desc[i], lastInLumBuf + 1, 1);
 #else
             hyscale(c, lumPixBuf[lumBufIndex], dstW, src1, srcW, lumXInc,
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index 1941503..f7d4f56 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -323,6 +323,7 @@ typedef struct SwsContext {
     uint16_t *inv_gamma;
     int numDesc;
+    int numSlice;
     struct SwsSlice *slice;
     struct SwsFilterDescriptor *desc;
--
1.9.1

From 27f5ec99cf850fc2ba1aa784833bd1176d79aea7 Mon Sep 17 00:00:00 2001
From: Pedro Arthur <bygran...@gmail.com>
Date: Sun, 21 Jun 2015 21:57:16 -0300
Subject: [PATCH 07/11] swscale: plug chroma horizontal scaling
---
 libswscale/slice.c            | 66 +++++++++++++++++++++++++++++++------------
 libswscale/swscale.c          | 13 +++++++--
 libswscale/swscale_internal.h | 1 +
 3 files changed, 59 insertions(+), 21 deletions(-)
diff --git a/libswscale/slice.c b/libswscale/slice.c
index 95dc9b7..770b974 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -368,13 +368,24 @@ static int init_desc_chscale(SwsFilterDescriptor *desc, SwsSlice *src, SwsSlice
 int ff_init_filters(SwsContext * c)
 {
     int i;
-    int numDescPerChannel;
-    int need_convert = c->lumToYV12 || c->readLumPlanar || c->alpToYV12 || c->readAlpPlanar;
+    int index;
+    int num_ydesc;
+    int num_cdesc;
+    int need_lum_conv = c->lumToYV12 || c->readLumPlanar || c->alpToYV12 || c->readAlpPlanar;
+    int need_chr_conv = c->chrToYV12 || c->readChrPlanar;
+    int srcIdx, dstIdx;
-    numDescPerChannel= need_convert ?
2 : 1;
-    c->numSlice = numDescPerChannel + 1;
+    uint32_t * pal = usePal(c->srcFormat) ? c->pal_yuv : (uint32_t*)c->input_rgb2yuv_table;
-    c->numDesc = (c->needs_hcscale ? 2 : 1) * numDescPerChannel;
+    num_ydesc = need_lum_conv ? 2 : 1;
+    num_cdesc = c->needs_hcscale ? (need_chr_conv ? 2 : 1) : 0;
+
+    c->numSlice = FFMAX(num_ydesc, num_cdesc) + 1;
+    c->numDesc = num_ydesc + num_cdesc;
+    c->descIndex[0] = num_ydesc;
+    c->descIndex[1] = num_ydesc + num_cdesc;
+
+
     c->desc = av_malloc_array(sizeof(SwsFilterDescriptor), c->numDesc);
     c->slice = av_malloc_array(sizeof(SwsSlice), c->numSlice);
@@ -383,24 +394,43 @@ int ff_init_filters(SwsContext * c)
         alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrSrcHSubSample, c->chrSrcVSubSample);
     alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrDstHSubSample, c->chrDstVSubSample);
-    i = 0;
-    if (need_convert)
-    {
-        init_desc_fmt_convert(&c->desc[i], &c->slice[i], &c->slice[i+1], usePal(c->srcFormat) ? c->pal_yuv : (uint32_t*)c->input_rgb2yuv_table);
-        init_slice_1(&c->slice[i+1], c->formatConvBuffer, (c->formatConvBuffer + FFALIGN(c->srcW*2+78, 16)), c->srcW, 0, c->vLumFilterSize);
-        c->desc[i].alpha = c->alpPixBuf != 0;
+    index = 0;
+    srcIdx = 0;
+    dstIdx = 1;
-        if (c->needs_hcscale)
-            init_desc_cfmt_convert(&c->desc[i+numDescPerChannel], &c->slice[i], &c->slice[i+1], usePal(c->srcFormat) ?
c->pal_yuv : (uint32_t*)c->input_rgb2yuv_table);
+    // temp slice for color space conversion
+    if (need_lum_conv || need_chr_conv)
+        init_slice_1(&c->slice[dstIdx], c->formatConvBuffer, (c->formatConvBuffer + FFALIGN(c->srcW*2+78, 16)), c->srcW, 0, c->vLumFilterSize);
-        ++i;
+    if (need_lum_conv)
+    {
+        init_desc_fmt_convert(&c->desc[index], &c->slice[srcIdx], &c->slice[dstIdx], pal);
+        c->desc[index].alpha = c->alpPixBuf != 0;
+        ++index;
+        srcIdx = dstIdx;
     }
-
-    init_desc_hscale(&c->desc[i], &c->slice[i], &c->slice[i+1], c->hLumFilter, c->hLumFilterPos, c->hLumFilterSize, c->lumXInc);
-    c->desc[i].alpha = c->alpPixBuf != 0;
+
+    dstIdx = FFMAX(num_ydesc, num_cdesc);
+    init_desc_hscale(&c->desc[index], &c->slice[index], &c->slice[dstIdx], c->hLumFilter, c->hLumFilterPos, c->hLumFilterSize, c->lumXInc);
+    c->desc[index].alpha = c->alpPixBuf != 0;
+
+
+    ++index;
     if (c->needs_hcscale)
-        init_desc_chscale(&c->desc[i+numDescPerChannel], &c->slice[i], &c->slice[i+1], c->hChrFilter, c->hChrFilterPos, c->hChrFilterSize, c->chrXInc);
+    {
+        srcIdx = 0;
+        dstIdx = 1;
+        if (need_chr_conv)
+        {
+            init_desc_cfmt_convert(&c->desc[index], &c->slice[srcIdx], &c->slice[dstIdx], pal);
+            ++index;
+            srcIdx = dstIdx;
+        }
+
+        dstIdx = FFMAX(num_ydesc, num_cdesc);
+        init_desc_chscale(&c->desc[index], &c->slice[srcIdx], &c->slice[dstIdx], c->hChrFilter, c->hChrFilterPos, c->hChrFilterSize, c->chrXInc);
+    }
     return 1;
 }
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index eaaefc3..4770f13 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -375,9 +375,11 @@ static int swscale(SwsContext *c, const uint8_t *src[],
     int numDesc = c->numDesc;
     int lumStart = 0;
-    int lumEnd = c->needs_hcscale ?
numDesc / 2 : numDesc;
+    int lumEnd = c->descIndex[0];
+    int chrStart = lumEnd;
+    int chrEnd = c->descIndex[1];
     SwsSlice *src_slice = &c->slice[lumStart];
-    SwsSlice *dst_slice = &c->slice[lumEnd];
+    SwsSlice *dst_slice = &c->slice[c->numSlice-1];
     SwsFilterDescriptor *desc = c->desc;
     int16_t **line_pool[4];
@@ -549,17 +551,22 @@ static int swscale(SwsContext *c, const uint8_t *src[],
                 src[2] + (lastInChrBuf + 1 - chrSrcSliceY) * srcStride[2],
                 src[3] + (lastInChrBuf + 1 - chrSrcSliceY) * srcStride[3],
             };
+            int i;
             chrBufIndex++;
             av_assert0(chrBufIndex < 2 * vChrBufSize);
             av_assert0(lastInChrBuf + 1 - chrSrcSliceY < (chrSrcSliceH));
             av_assert0(lastInChrBuf + 1 - chrSrcSliceY >= 0);
             // FIXME replace parameters through context struct (some at least)
-
+#if NEW_FILTER
+            for (i = chrStart; i < chrEnd; ++i)
+                desc[i].process(c, &desc[i], lastInChrBuf + 1, 1);
+#else
             if (c->needs_hcscale)
                 hcscale(c, chrUPixBuf[chrBufIndex], chrVPixBuf[chrBufIndex], chrDstW, src1, chrSrcW, chrXInc, hChrFilter, hChrFilterPos, hChrFilterSize, formatConvBuffer, pal);
+#endif
             lastInChrBuf++;
             DEBUG_BUFFERS("\t\tchrBufIndex %d: lastInChrBuf: %d\n", chrBufIndex, lastInChrBuf);
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index f7d4f56..0ba37bd 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -323,6 +323,7 @@ typedef struct SwsContext {
     uint16_t *inv_gamma;
     int numDesc;
+    int descIndex[2];
     int numSlice;
     struct SwsSlice *slice;
     struct SwsFilterDescriptor *desc;
--
1.9.1

From 231b77ce7fbb84e381211253ba9c546a33ba194a Mon Sep 17 00:00:00 2001
From: Pedro Arthur <bygran...@gmail.com>
Date: Fri, 3 Jul 2015 10:55:45 -0300
Subject: [PATCH 08/11] swscale: fix seg fault when accessing src slice
---
 libswscale/slice.c            | 79 ++++++++++++++++++++++++++-----------------
 libswscale/swscale.c          | 5 ++-
 libswscale/swscale_internal.h | 3 +-
 3 files changed, 54 insertions(+), 33 deletions(-)
diff --git a/libswscale/slice.c b/libswscale/slice.c
index
770b974..8cc071e 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -1,7 +1,7 @@
 #include "swscale_internal.h"
-static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lumLines, int chrLines, int h_sub_sample, int v_sub_sample)
+static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lumLines, int chrLines, int h_sub_sample, int v_sub_sample, int ring)
 {
     int i;
     int err = 0;
@@ -15,10 +15,11 @@ static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lumLines, int c
     s->h_chr_sub_sample = h_sub_sample;
     s->v_chr_sub_sample = v_sub_sample;
     s->fmt = fmt;
+    s->is_ring = ring;
     for (i = 0; i < 4; ++i)
     {
-        s->plane[i].line = av_malloc_array(sizeof(uint8_t*), size[i]);
+        s->plane[i].line = av_malloc_array(sizeof(uint8_t*), size[i] * ( ring == 0 ? 1 : 2));
         if (!s->plane[i].line)
         {
             err = AVERROR(ENOMEM);
@@ -45,7 +46,7 @@ static void free_slice(SwsSlice *s)
     av_freep(&s->plane[i].line);
 }
-int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int lumY, int lumH, int chrY, int chrH, int skip)
+int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int lumY, int lumH, int chrY, int chrH)
 {
     int i = 0;
@@ -53,30 +54,39 @@ int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int src
                           chrY,
                           chrY,
                           lumY};
-
-    const int stride1[4] = {stride[0],
-                            stride[1] << skip,
-                            stride[2] << skip,
-                            stride[3]};
-
-    const int height[4] = {lumH,
-                           chrH,
-                           chrH,
-                           lumH};
+
+    const int end[4] = {lumY +lumH,
+                        chrY + chrH,
+                        chrY + chrH,
+                        lumY + lumH};
     s->width = srcW;
     for (i = 0; i < 4; ++i)
     {
         int j;
-        int lines = height[i];
+        int lines = end[i];
         lines = s->plane[i].available_lines < lines ?
s->plane[i].available_lines : lines;
-        s->plane[i].sliceY = start[i];
-        s->plane[i].sliceH = lines;
+        if (end[i] > s->plane[i].sliceY+s->plane[i].sliceH)
+        {
+            if (start[i] <= s->plane[i].sliceY+1)
+                s->plane[i].sliceY = FFMIN(start[i], s->plane[i].sliceY);
+            else
+                s->plane[i].sliceY = start[i];
+            s->plane[i].sliceH = end[i] - s->plane[i].sliceY;
+        }
+        else
+        {
+            if (end[i] >= s->plane[i].sliceY)
+                s->plane[i].sliceH = s->plane[i].sliceY + s->plane[i].sliceH - start[i];
+            else
+                s->plane[i].sliceH = end[i] - start[i];
+            s->plane[i].sliceY = start[i];
+        }
-        for (j = 0; j < lines; j+= 1 << skip)
-            s->plane[i].line[j] = src[i] + (start[i] + j) * stride1[i];
+        for (j = start[i]; j < lines; j+= 1)
+            s->plane[i].line[j] = src[i] + (start[i] + j) * stride[i];
     }
@@ -191,13 +201,14 @@ static int lum_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int
     ConvertInstance * instance = desc->instance;
     uint32_t * pal = instance->pal;
-    int sp = sliceY - desc->src->plane[0].sliceY;
+    int sp0 = sliceY - desc->src->plane[0].sliceY;
+    int sp1 = (sliceY >> desc->src->v_chr_sub_sample) - desc->src->plane[1].sliceY;
     int dp = sliceY - desc->dst->plane[0].sliceY;
-    const uint8_t * src[4] = { desc->src->plane[0].line[sp],
-                               desc->src->plane[1].line[sp],
-                               desc->src->plane[2].line[sp],
-                               desc->src->plane[3].line[sp]};
+    const uint8_t * src[4] = { desc->src->plane[0].line[sp0],
+                               desc->src->plane[1].line[sp1],
+                               desc->src->plane[2].line[sp1],
+                               desc->src->plane[3].line[sp0]};
     uint8_t * dst = desc->dst->plane[0].line[0/*dp*/];
     desc->dst->plane[0].sliceY = sliceY;
@@ -304,13 +315,15 @@ static int chr_convert(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int
     ConvertInstance * instance = desc->instance;
     uint32_t * pal = instance->pal;
-    int sp = sliceY - desc->src->plane[1].sliceY;
+    int sp0 = (sliceY - (desc->src->plane[0].sliceY >> desc->src->v_chr_sub_sample)) << desc->src->v_chr_sub_sample;
+    int sp1 = sliceY - desc->src->plane[1].sliceY;
     int dp = sliceY -
desc->dst->plane[1].sliceY;
-    const uint8_t * src[4] = { desc->src->plane[0].line[sp],
-                               desc->src->plane[1].line[sp],
-                               desc->src->plane[2].line[sp],
-                               desc->src->plane[3].line[sp]};
+    const uint8_t * src[4] = { desc->src->plane[0].line[sp0],
+                               desc->src->plane[1].line[sp1],
+                               desc->src->plane[2].line[sp1],
+                               desc->src->plane[3].line[sp0]};
+
     uint8_t * dst1 = desc->dst->plane[1].line[0/*dp*/];
     uint8_t * dst2 = desc->dst->plane[2].line[0/*dp*/];
@@ -374,6 +387,7 @@ int ff_init_filters(SwsContext * c)
     int need_lum_conv = c->lumToYV12 || c->readLumPlanar || c->alpToYV12 || c->readAlpPlanar;
     int need_chr_conv = c->chrToYV12 || c->readChrPlanar;
     int srcIdx, dstIdx;
+    int maxSize;
     uint32_t * pal = usePal(c->srcFormat) ? c->pal_yuv : (uint32_t*)c->input_rgb2yuv_table;
@@ -390,9 +404,11 @@ int ff_init_filters(SwsContext * c)
     c->desc = av_malloc_array(sizeof(SwsFilterDescriptor), c->numDesc);
     c->slice = av_malloc_array(sizeof(SwsSlice), c->numSlice);
-    for (i = 0; i < c->numSlice-1; ++i)
-        alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrSrcHSubSample, c->chrSrcVSubSample);
-    alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrDstHSubSample, c->chrDstVSubSample);
+    maxSize = FFMAX(c->vLumFilterSize, c->vChrFilterSize << c->chrSrcVSubSample);
+    alloc_slice(&c->slice[0], c->srcFormat, c->srcH, c->chrSrcH, c->chrSrcHSubSample, c->chrSrcVSubSample, 0);
+    for (i = 1; i < c->numSlice-1; ++i)
+        alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrSrcHSubSample, c->chrSrcVSubSample, 0);
+    alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrDstHSubSample, c->chrDstVSubSample, 0);
     index = 0;
     srcIdx = 0;
@@ -450,3 +466,4 @@ int ff_free_filters(SwsContext *c)
     }
     return 1;
 }
+
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 4770f13..457a29a 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -451,6 +451,10 @@ static int
swscale(SwsContext *c, const uint8_t *src[],
     }
     lastDstY = dstY;
+    ff_init_slice_from_src(src_slice, (uint8_t**)src, srcStride, c->srcW,
+                           srcSliceY, srcSliceH,
+                           chrSrcSliceY, chrSrcSliceH);
+
     for (; dstY < dstH; dstY++) {
         const int chrDstY = dstY >> c->chrDstVSubSample;
@@ -508,7 +512,6 @@ static int swscale(SwsContext *c, const uint8_t *src[],
         line_pool[2] = &chrVPixBuf[chrBufIndex + 1];
         line_pool[3] = alpPixBuf ? &alpPixBuf[lumBufIndex + 1] : NULL;
-        ff_init_slice_from_src(src_slice, (uint8_t**)src, srcStride, c->srcW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf, lastInChrBuf + 1, lastChrSrcY - lastInChrBuf, 0);
         ff_init_slice_from_lp(dst_slice, (uint8_t ***)line_pool, dstW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf, lastInChrBuf + 1, lastChrSrcY - lastInChrBuf);
 #endif
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index 0ba37bd..1055a73 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -932,6 +932,7 @@ typedef struct SwsSlice
     int width;
     int h_chr_sub_sample;
     int v_chr_sub_sample;
+    int is_ring;
     enum AVPixelFormat fmt;
     SwsPlane plane[MAX_SLICE_PLANES];
 } SwsSlice;
@@ -960,7 +961,7 @@ typedef struct ScaleInstance
     int xInc;
 } ScaleInstance;
-int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int lumY, int lumH, int chrY, int chrH, int skip);
+int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int lumY, int lumH, int chrY, int chrH);
 int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int lumY, int lumH, int chrY, int chrH);
 int ff_init_filters(SwsContext *c);
 int ff_free_filters(SwsContext *c);
--
1.9.1

From b25f212eeb1c8892b345b86715dcb9fca28edcdd Mon Sep 17 00:00:00 2001
From: Pedro Arthur <bygran...@gmail.com>
Date: Sun, 5 Jul 2015 21:42:57 -0300
Subject: [PATCH 09/11] swscale: WIP SwsSlice ring buffer
---
 libswscale/slice.c            | 78 +++++++++++++++++++++++++++++++++++++++----
 libswscale/swscale.c          | 39
++++++++++++++++++++++++++++++++++++++--
 libswscale/swscale_internal.h | 1 +
 3 files changed, 109 insertions(+), 9 deletions(-)
diff --git a/libswscale/slice.c b/libswscale/slice.c
index 8cc071e..ca31df3 100644
--- a/libswscale/slice.c
+++ b/libswscale/slice.c
@@ -46,6 +46,57 @@ static void free_slice(SwsSlice *s)
     av_freep(&s->plane[i].line);
 }
+static int alloc_lines(SwsSlice *s, int width)
+{
+    int i;
+    for (i = 0; i < 4; ++i)
+    {
+        int n = s->plane[i].available_lines;
+        int j;
+        for (j = 0; j < n; ++j)
+        {
+            s->plane[i].line[j] = av_mallocz(width);
+            if (s->is_ring)
+                s->plane[i].line[j+n] = s->plane[i].line[j];
+        }
+    }
+    return 1;
+}
+
+int ff_rotate_slice(SwsSlice *s, int lum, int chr)
+{
+    int i;
+    if (lum)
+    {
+        for (i = 0; i < 4; i+=3)
+        {
+            int n = s->plane[i].available_lines;
+            int l = s->plane[i].sliceH;
+
+            if (l+lum >= n * 2)
+            {
+                s->plane[i].sliceY += n;
+                s->plane[i].sliceH -= n;
+            }
+        }
+    }
+    if (chr)
+    {
+        for (i = 1; i < 3; ++i)
+        {
+            int n = s->plane[i].available_lines;
+            int l = s->plane[i].sliceH;
+
+            if (l+chr >= n * 2)
+            {
+                s->plane[i].sliceY += n;
+                s->plane[i].sliceH -= n;
+            }
+        }
+    }
+    return 1;
+}
+
 int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int srcW, int lumY, int lumH, int chrY, int chrH)
 {
     int i = 0;
@@ -171,6 +222,7 @@ static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int
     if (c->lumConvertRange)
         c->lumConvertRange((int16_t*)dst[dst_pos], dstW);
+    desc->dst->plane[0].sliceH += 1;
     if (desc->alpha)
     {
@@ -180,7 +232,7 @@ static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int
         src_pos = sliceY - desc->src->plane[3].sliceY;
         dst_pos = sliceY - desc->dst->plane[3].sliceY;
-
+        desc->dst->plane[3].sliceH += 1;
         if (!c->hyscale_fast) {
             c->hyScale(c, (int16_t*)dst[dst_pos], dstW, (const uint8_t *)src[src_pos], instance->filter,
@@ -306,6 +358,8 @@ static int chr_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int
     if (c->chrConvertRange)
c->chrConvertRange((uint16_t*)dst1[dst_pos1], (uint16_t*)dst2[dst_pos2], dstW);
+    desc->dst->plane[1].sliceH += 1;
+    desc->dst->plane[2].sliceH += 1;
     return 1;
 }
@@ -387,10 +441,13 @@ int ff_init_filters(SwsContext * c)
     int need_lum_conv = c->lumToYV12 || c->readLumPlanar || c->alpToYV12 || c->readAlpPlanar;
     int need_chr_conv = c->chrToYV12 || c->readChrPlanar;
     int srcIdx, dstIdx;
-    int maxSize;
+    int dst_stride = FFALIGN(c->dstW * sizeof(int16_t) + 66, 16);
     uint32_t * pal = usePal(c->srcFormat) ? c->pal_yuv : (uint32_t*)c->input_rgb2yuv_table;
+    if (c->dstBpc == 16)
+        dst_stride <<= 1;
+
     num_ydesc = need_lum_conv ? 2 : 1;
     num_cdesc = c->needs_hcscale ? (need_chr_conv ? 2 : 1) : 0;
@@ -404,19 +461,23 @@ int ff_init_filters(SwsContext * c)
     c->desc = av_malloc_array(sizeof(SwsFilterDescriptor), c->numDesc);
     c->slice = av_malloc_array(sizeof(SwsSlice), c->numSlice);
-    maxSize = FFMAX(c->vLumFilterSize, c->vChrFilterSize << c->chrSrcVSubSample);
+
     alloc_slice(&c->slice[0], c->srcFormat, c->srcH, c->chrSrcH, c->chrSrcHSubSample, c->chrSrcVSubSample, 0);
     for (i = 1; i < c->numSlice-1; ++i)
+    {
         alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrSrcHSubSample, c->chrSrcVSubSample, 0);
-    alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrDstHSubSample, c->chrDstVSubSample, 0);
+        alloc_lines(&c->slice[i], FFALIGN(c->srcW*2+78, 16));
+    }
+    alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrDstHSubSample, c->chrDstVSubSample, 1);
+    alloc_lines(&c->slice[i], dst_stride);
     index = 0;
     srcIdx = 0;
     dstIdx = 1;
     // temp slice for color space conversion
-    if (need_lum_conv || need_chr_conv)
-        init_slice_1(&c->slice[dstIdx], c->formatConvBuffer, (c->formatConvBuffer + FFALIGN(c->srcW*2+78, 16)), c->srcW, 0, c->vLumFilterSize);
+    //if (need_lum_conv || need_chr_conv)
+    //    init_slice_1(&c->slice[dstIdx], c->formatConvBuffer, (c->formatConvBuffer + FFALIGN(c->srcW*2+78, 16)), c->srcW, 0,
c->vLumFilterSize);
     if (need_lum_conv)
     {
@@ -467,3 +528,8 @@ int ff_free_filters(SwsContext *c)
     return 1;
 }
+
+
+
+
+
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 457a29a..7d6c3dd 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -455,6 +455,16 @@ static int swscale(SwsContext *c, const uint8_t *src[],
                            srcSliceY, srcSliceH,
                            chrSrcSliceY, chrSrcSliceH);
+    dst_slice->plane[0].sliceY = lastInLumBuf + 1;
+    dst_slice->plane[1].sliceY = lastInChrBuf + 1;
+    dst_slice->plane[2].sliceY = lastInChrBuf + 1;
+    dst_slice->plane[3].sliceY = lastInLumBuf + 1;
+
+    dst_slice->plane[0].sliceH =
+    dst_slice->plane[1].sliceH =
+    dst_slice->plane[2].sliceH =
+    dst_slice->plane[3].sliceH = 0;
+    dst_slice->width = dstW;
     for (; dstY < dstH; dstY++) {
         const int chrDstY = dstY >> c->chrDstVSubSample;
@@ -479,10 +489,20 @@ static int swscale(SwsContext *c, const uint8_t *src[],
         int enough_lines;
         // handle holes (FAST_BILINEAR & weird filters)
-        if (firstLumSrcY > lastInLumBuf)
+        if (firstLumSrcY > lastInLumBuf) {
             lastInLumBuf = firstLumSrcY - 1;
-        if (firstChrSrcY > lastInChrBuf)
+            dst_slice->plane[0].sliceY = lastInLumBuf + 1;
+            dst_slice->plane[3].sliceY = lastInLumBuf + 1;
+            dst_slice->plane[0].sliceH =
+            dst_slice->plane[3].sliceH = 0;
+        }
+        if (firstChrSrcY > lastInChrBuf) {
             lastInChrBuf = firstChrSrcY - 1;
+            dst_slice->plane[1].sliceY = lastInChrBuf + 1;
+            dst_slice->plane[2].sliceY = lastInChrBuf + 1;
+            dst_slice->plane[1].sliceH =
+            dst_slice->plane[2].sliceH = 0;
+        }
         av_assert0(firstLumSrcY >= lastInLumBuf - vLumBufSize + 1);
         av_assert0(firstChrSrcY >= lastInChrBuf - vChrBufSize + 1);
@@ -506,7 +526,7 @@ static int swscale(SwsContext *c, const uint8_t *src[],
 #define NEW_FILTER 1
-#if NEW_FILTER
+#if 0 //NEW_FILTER
         line_pool[0] = &lumPixBuf[lumBufIndex + 1];
         line_pool[1] = &chrUPixBuf[chrBufIndex + 1];
         line_pool[2] = &chrVPixBuf[chrBufIndex + 1];
@@ -515,6 +535,7 @@ static int swscale(SwsContext *c, const uint8_t *src[],
ff_init_slice_from_lp(dst_slice, (uint8_t ***)line_pool, dstW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf, lastInChrBuf + 1, lastChrSrcY - lastInChrBuf);
 #endif
+        ff_rotate_slice(dst_slice, lastLumSrcY - lastInLumBuf, lastChrSrcY - lastInChrBuf);
         // Do horizontal scaling
         while (lastInLumBuf < lastLumSrcY) {
             const uint8_t *src1[4] = {
@@ -599,11 +620,23 @@ static int swscale(SwsContext *c, const uint8_t *src[],
         }
         {
+
+#if NEW_FILTER
+            const int16_t **lumSrcPtr = (const int16_t **)(void*) dst_slice->plane[0].line + dst_slice->plane[0].sliceH - vLumFilterSize;
+            const int16_t **chrUSrcPtr = (const int16_t **)(void*) dst_slice->plane[1].line + dst_slice->plane[1].sliceH - vChrFilterSize;
+            const int16_t **chrVSrcPtr = (const int16_t **)(void*) dst_slice->plane[2].line + dst_slice->plane[2].sliceH - vChrFilterSize;
+            const int16_t **alpSrcPtr = (CONFIG_SWSCALE_ALPHA && alpPixBuf) ?
+                (const int16_t **)(void*) dst_slice->plane[3].line + dst_slice->plane[3].sliceH - vLumFilterSize : NULL;
+#else
             const int16_t **lumSrcPtr = (const int16_t **)(void*) lumPixBuf + lumBufIndex + firstLumSrcY - lastInLumBuf + vLumBufSize;
             const int16_t **chrUSrcPtr = (const int16_t **)(void*) chrUPixBuf + chrBufIndex + firstChrSrcY - lastInChrBuf + vChrBufSize;
             const int16_t **chrVSrcPtr = (const int16_t **)(void*) chrVPixBuf + chrBufIndex + firstChrSrcY - lastInChrBuf + vChrBufSize;
             const int16_t **alpSrcPtr = (CONFIG_SWSCALE_ALPHA && alpPixBuf) ?
(const int16_t **)(void*) alpPixBuf + lumBufIndex + firstLumSrcY - lastInLumBuf + vLumBufSize : NULL; +#endif + + + int16_t *vLumFilter = c->vLumFilter; int16_t *vChrFilter = c->vChrFilter; diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index 1055a73..d832343 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -965,5 +965,6 @@ int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int src int ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int lumY, int lumH, int chrY, int chrH); int ff_init_filters(SwsContext *c); int ff_free_filters(SwsContext *c); +int ff_rotate_slice(SwsSlice *s, int lum, int chr); #endif /* SWSCALE_SWSCALE_INTERNAL_H */ -- 1.9.1 From 355520ce3da6a52acc684cb339244010fc474e5f Mon Sep 17 00:00:00 2001 From: Pedro Arthur <bygran...@gmail.com> Date: Wed, 8 Jul 2015 17:04:34 -0300 Subject: [PATCH 10/11] swscale: fix seg fault when dst format is grayscale --- libswscale/slice.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 47 insertions(+), 3 deletions(-) diff --git a/libswscale/slice.c b/libswscale/slice.c index ca31df3..9286ac2 100644 --- a/libswscale/slice.c +++ b/libswscale/slice.c @@ -432,6 +432,47 @@ static int init_desc_chscale(SwsFilterDescriptor *desc, SwsSlice *src, SwsSlice return 1; } +static void fill_ones(SwsSlice *s, int n, int is16bit) +{ + int i; + for (i = 0; i < 4; ++i) + { + int j; + int size = s->plane[i].available_lines; + for (int j = 0; j < size; ++j) + { + int k; + int end = is16bit ? 
n>>1: n; + + if (is16bit) + for (k = 0; k < end; ++k) + ((int32_t*)(s->plane[i].line[j]))[k] = 1<<18; + else + for (k = 0; k < end; ++k) + ((int16_t*)(s->plane[i].line[j]))[k] = 1<<14; + } + } +} + +static int no_chr_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH) +{ + desc->dst->plane[1].sliceY = sliceY + sliceH - desc->dst->plane[1].available_lines; + desc->dst->plane[1].sliceH = desc->dst->plane[1].available_lines; + desc->dst->plane[2].sliceY = sliceY + sliceH - desc->dst->plane[2].available_lines; + desc->dst->plane[2].sliceH = desc->dst->plane[2].available_lines; + return 0; +} + +static int init_desc_no_chr(SwsFilterDescriptor *desc, SwsSlice * src, SwsSlice *dst) +{ + desc->src = src; + desc->dst = dst; + desc->alpha = 0; + desc->instance = NULL; + desc->process = &no_chr_scale; + return 0; +} + int ff_init_filters(SwsContext * c) { int i; @@ -449,7 +490,7 @@ int ff_init_filters(SwsContext * c) dst_stride <<= 1; num_ydesc = need_lum_conv ? 2 : 1; - num_cdesc = c->needs_hcscale ? (need_chr_conv ? 2 : 1) : 0; + num_cdesc = /*c->needs_hcscale ? */(need_chr_conv ? 
2 : 1)/* : 0*/; c->numSlice = FFMAX(num_ydesc, num_cdesc) + 1; c->numDesc = num_ydesc + num_cdesc; @@ -470,6 +511,7 @@ int ff_init_filters(SwsContext * c) } alloc_slice(&c->slice[i], c->srcFormat, c->vLumFilterSize, c->vChrFilterSize, c->chrDstHSubSample, c->chrDstVSubSample, 1); alloc_lines(&c->slice[i], dst_stride); + fill_ones(&c->slice[i], dst_stride>>1, c->dstBpc == 16); index = 0; srcIdx = 0; @@ -494,7 +536,6 @@ int ff_init_filters(SwsContext * c) ++index; - if (c->needs_hcscale) { srcIdx = 0; dstIdx = 1; @@ -506,7 +547,10 @@ int ff_init_filters(SwsContext * c) } dstIdx = FFMAX(num_ydesc, num_cdesc); - init_desc_chscale(&c->desc[index], &c->slice[srcIdx], &c->slice[dstIdx], c->hChrFilter, c->hChrFilterPos, c->hChrFilterSize, c->chrXInc); + if (c->needs_hcscale) + init_desc_chscale(&c->desc[index], &c->slice[srcIdx], &c->slice[dstIdx], c->hChrFilter, c->hChrFilterPos, c->hChrFilterSize, c->chrXInc); + else + init_desc_no_chr(&c->desc[index], &c->slice[srcIdx], &c->slice[dstIdx]); } return 1; -- 1.9.1 From 1179f1220750b4dd8c273fa5f418a24750a77a96 Mon Sep 17 00:00:00 2001 From: Pedro Arthur <bygran...@gmail.com> Date: Wed, 8 Jul 2015 22:05:21 -0300 Subject: [PATCH 11/11] swscale: added slice lines freeing functions +removed unused functions --- libswscale/slice.c | 122 +++++++++++++++++------------------------- libswscale/swscale.c | 10 ---- libswscale/swscale_internal.h | 1 + 3 files changed, 51 insertions(+), 82 deletions(-) diff --git a/libswscale/slice.c b/libswscale/slice.c index 9286ac2..04633e3 100644 --- a/libswscale/slice.c +++ b/libswscale/slice.c @@ -1,7 +1,47 @@ #include "swscale_internal.h" +static void free_lines(SwsSlice *s) +{ + int i; + for (i = 0; i < 4; ++i) + { + int n = s->plane[i].available_lines; + int j; + for (j = 0; j < n; ++j) + { + av_freep(&s->plane[i].line[j]); + if (s->is_ring) + s->plane[i].line[j+n] = NULL; + } + } + s->should_free_lines = 0; +} + +static int alloc_lines(SwsSlice *s, int width) +{ + int i; + s->should_free_lines 
= 1; + + for (i = 0; i < 4; ++i) + { + int n = s->plane[i].available_lines; + int j; + for (j = 0; j < n; ++j) + { + s->plane[i].line[j] = av_mallocz(width); + if (!s->plane[i].line[j]) + { + free_lines(s); + return AVERROR(ENOMEM); + } + if (s->is_ring) + s->plane[i].line[j+n] = s->plane[i].line[j]; + } + } + return 1; +} -static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lumLines, int chrLines, int h_sub_sample, int v_sub_sample, int ring) +static int alloc_slice(SwsSlice *s, enum AVPixelFormat fmt, int lumLines, int chrLines, int h_sub_sample, int v_sub_sample, int ring) { int i; int err = 0; @@ -16,15 +56,21 @@ static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lumLines, int c s->v_chr_sub_sample = v_sub_sample; s->fmt = fmt; s->is_ring = ring; + s->should_free_lines = 0; for (i = 0; i < 4; ++i) { - s->plane[i].line = av_malloc_array(sizeof(uint8_t*), size[i] * ( ring == 0 ? 1 : 2)); + int j; + int n = size[i] * ( ring == 0 ? 1 : 2); + s->plane[i].line = av_malloc_array(sizeof(uint8_t*), n); if (!s->plane[i].line) { err = AVERROR(ENOMEM); break; } + for (int j = 0; j < n; ++j) + s->plane[i].line[j] = NULL; + s->plane[i].available_lines = size[i]; s->plane[i].sliceY = 0; s->plane[i].sliceH = 0; @@ -42,27 +88,12 @@ static int alloc_slice(SwsSlice * s, enum AVPixelFormat fmt, int lumLines, int c static void free_slice(SwsSlice *s) { int i; + if (s->should_free_lines) + free_lines(s); for (i = 0; i < 4; ++i) av_freep(&s->plane[i].line); } -static int alloc_lines(SwsSlice *s, int width) -{ - int i; - for (i = 0; i < 4; ++i) - { - int n = s->plane[i].available_lines; - int j; - for (j = 0; j < n; ++j) - { - s->plane[i].line[j] = av_mallocz(width); - if (s->is_ring) - s->plane[i].line[j+n] = s->plane[i].line[j]; - } - } - return 1; -} - int ff_rotate_slice(SwsSlice *s, int lum, int chr) { int i; @@ -144,59 +175,6 @@ int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int src return 1; } -int 
ff_init_slice_from_lp(SwsSlice *s, uint8_t ***linesPool, int dstW, int lumY, int lumH, int chrY, int chrH) -{ - int i; - const int start[4] = {lumY, - chrY, - chrY, - lumY}; - - const int height[4] = {lumH, - chrH, - chrH, - lumH}; - s->width = dstW; - for (i = 0; i < 4; ++i) - { - int j; - int lines = height[i]; - lines = s->plane[i].available_lines < lines ? s->plane[i].available_lines : lines; - - s->plane[i].sliceY = start[i]; - s->plane[i].sliceH = lines; - - for (j = 0; j < lines; ++j) - { - uint8_t * v = linesPool[i] ? linesPool[i][j] : NULL; - s->plane[i].line[j] = v; - } - - } - return 1; -} - -static int init_slice_1(SwsSlice *s, uint8_t *v, uint8_t *v2, int dstW, int sliceY, int sliceH) -{ - int i; - uint8_t *ptr[4] = {v, v, v2, v2}; - s->width = dstW; - for (i = 0; i < 4; ++i) - { - int j; - int lines = s->plane[i].available_lines; - - s->plane[i].sliceY = sliceY; - s->plane[i].sliceH = lines; - - for (j = 0; j < lines; ++j) - s->plane[i].line[j] = ptr[i]; - - } - return 1; -} - - static int lum_h_scale(SwsContext *c, SwsFilterDescriptor *desc, int sliceY, int sliceH) { ScaleInstance *instance = desc->instance; diff --git a/libswscale/swscale.c b/libswscale/swscale.c index 7d6c3dd..5facb99 100644 --- a/libswscale/swscale.c +++ b/libswscale/swscale.c @@ -525,16 +525,6 @@ static int swscale(SwsContext *c, const uint8_t *src[], #define NEW_FILTER 1 - -#if 0 //NEW_FILTER - line_pool[0] = &lumPixBuf[lumBufIndex + 1]; - line_pool[1] = &chrUPixBuf[chrBufIndex + 1]; - line_pool[2] = &chrVPixBuf[chrBufIndex + 1]; - line_pool[3] = alpPixBuf ? 
&alpPixBuf[lumBufIndex + 1] : NULL; - - ff_init_slice_from_lp(dst_slice, (uint8_t ***)line_pool, dstW, lastInLumBuf + 1, lastLumSrcY - lastInLumBuf, lastInChrBuf + 1, lastChrSrcY - lastInChrBuf); - -#endif ff_rotate_slice(dst_slice, lastLumSrcY - lastInLumBuf, lastChrSrcY - lastInChrBuf); // Do horizontal scaling while (lastInLumBuf < lastLumSrcY) { diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index d832343..82713fd 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -933,6 +933,7 @@ typedef struct SwsSlice int h_chr_sub_sample; int v_chr_sub_sample; int is_ring; + int should_free_lines; enum AVPixelFormat fmt; SwsPlane plane[MAX_SLICE_PLANES]; } SwsSlice; -- 1.9.1
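
For reference, the pointer mirroring that alloc_lines() sets up for ring slices (line[j+n] = line[j]) can be sketched in isolation. This is only an illustration under assumptions, not the code from the patch: RingPlane, ring_plane_alloc(), ring_plane_free() and ring_plane_window() are hypothetical names, and plain calloc()/free() stand in for av_mallocz()/av_freep(). The point it tries to show is that, with every line pointer stored twice, any run of up to n consecutive lines can be read as one contiguous pointer array without wrap-around handling.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one plane of a ring slice:
 * n real line buffers, 2 * n pointer slots. */
typedef struct RingPlane {
    uint8_t **line;       /* 2 * available_lines pointers */
    int available_lines;  /* number of distinct line buffers */
} RingPlane;

/* Free every line once, clear both copies of its pointer, then free the table. */
static void ring_plane_free(RingPlane *p)
{
    int j;
    if (!p->line)
        return;
    for (j = 0; j < p->available_lines; ++j) {
        free(p->line[j]);
        p->line[j] = NULL;
        p->line[j + p->available_lines] = NULL;
    }
    free(p->line);
    p->line = NULL;
}

/* Allocate n zeroed lines of `width` bytes and mirror each pointer at j + n.
 * Returns 0 on success, -1 on allocation failure (nothing is leaked). */
static int ring_plane_alloc(RingPlane *p, int n, int width)
{
    int j;
    assert(n > 0 && width > 0);
    p->available_lines = n;
    p->line = calloc((size_t)2 * n, sizeof(*p->line));
    if (!p->line)
        return -1;
    for (j = 0; j < n; ++j) {
        p->line[j] = calloc((size_t)width, 1);
        if (!p->line[j]) {
            ring_plane_free(p);   /* unallocated slots are still NULL */
            return -1;
        }
        p->line[j + n] = p->line[j];
    }
    return 0;
}

/* Return `count` consecutive line pointers starting at ring position `first`.
 * Because of the mirrored half, the window never needs to wrap. */
static uint8_t **ring_plane_window(const RingPlane *p, int first, int count)
{
    assert(first >= 0 && count >= 0 && count <= p->available_lines);
    return p->line + (first % p->available_lines);
}
```

With n = 3, a window starting at position 2 yields lines 2, 0, 1 in that order, which is the kind of contiguous view the vertical scaler expects when it takes plane[i].line plus an offset.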
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel