On Tue, Jul 28, 2015 at 06:06:04PM -0300, Pedro Arthur wrote: > All these weird bugs were caused by the mmx code expecting the chroma U > and V buffer to be contiguous in memory. As I was allocating the slice > lines U > and V separately the mmx code was using some random memory. > In the attached patch there is a fix for it. > I also added some documentation to the new structs.
very good, this works much better one remaining issue i could find is ./ffplay -i matrixbench_mpeg2.mpg -vf scale=96:96,format=nv12,scale=128:128:flags=1 shows a green line at the right border it happens also with some other formats but seems specific to flags=1 -cpuflags 0 doesnt help do you think this patch would be ready to push to main ffmpeg once this (and any other remaining) issues are fixed or is there still some speed loss ? [...] > +#define FREE_FILTERS_ON_ERROR(err, ctx) if ((err) < 0) { \ > + ff_free_filters((ctx)); \ > + return (err); \ > + } you can avoid the macro by using a goto yes i know goto sucks but such error cleanup is one of the very few cases where using a goto is IMO a good idea [...] > diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h > index 2299aa5..af4b635 100644 > --- a/libswscale/swscale_internal.h > +++ b/libswscale/swscale_internal.h > @@ -269,6 +269,9 @@ typedef void (*yuv2anyX_fn)(struct SwsContext *c, const > int16_t *lumFilter, > const int16_t **alpSrc, uint8_t **dest, > int dstW, int y); > > +struct SwsSlice; > +struct SwsFilterDescriptor; > + > /* This struct should be aligned on at least a 32-byte boundary. */ > typedef struct SwsContext { > /** > @@ -319,6 +322,12 @@ typedef struct SwsContext { > uint16_t *gamma; > uint16_t *inv_gamma; > > + int numDesc; > + int descIndex[2]; > + int numSlice; > + struct SwsSlice *slice; > + struct SwsFilterDescriptor *desc; > + > uint32_t pal_yuv[256]; > uint32_t pal_rgb[256]; > > @@ -908,4 +917,75 @@ static inline void fillPlane16(uint8_t *plane, int > stride, int width, int height > } > } > > +#define MAX_SLICE_PLANES 4 > + > +/// Slice plane > +typedef struct SwsPlane > +{ > + int available_lines; ///< max number of lines that can be hold by > this plane > + int sliceY; ///< index of first line > + int sliceH; ///< number of lines > + uint8_t **line; ///< line buffer > + uint8_t **tmp; ///< Tmp line buffer used by mmx code > +} SwsPlane; > + > +/** > + * Struct which defines a slice of an image to be scaled or a output for > + * a scaled slice. > + * A slice can also be used as intermediate ring buffer for scaling steps. > + */ > +typedef struct SwsSlice > +{ > + int width; ///< Slice line width > + int h_chr_sub_sample; ///< horizontal chroma subsampling factor > + int v_chr_sub_sample; ///< vertical chroma subsampling factor > + int is_ring; ///< flag to identify if this slice is a ring > buffer > + int should_free_lines; ///< flag to identify if there are dynamic > allocated lines > + enum AVPixelFormat fmt; ///< planes pixel format > + SwsPlane plane[MAX_SLICE_PLANES]; ///< color planes > +} SwsSlice; > + > +/** > + * Struct which holds all necessary data for processing a slice. > + * A processing step can be a color conversion or horizontal/vertical > scaling. > + */ > +typedef struct SwsFilterDescriptor > +{ > + SwsSlice *src; ///< Source slice > + SwsSlice *dst; ///< Output slice > + > + int alpha; ///< Flag for processing alpha channel > + void *instance; ///< Filter instance data > + > + /// Function for processing input slice sliceH lines starting from line > sliceY > + int (*process)(SwsContext *c, struct SwsFilterDescriptor *desc, int > sliceY, int sliceH); > +} SwsFilterDescriptor; > + > +/// Color conversion instance data > +typedef struct ColorContext > +{ > + uint32_t *pal; > +} ColorContext; > + > +/// Scaler instance data > +typedef struct FilterContext > +{ > + uint16_t *filter; > + int *filter_pos; > + int filter_size; > + int xInc; > +} FilterContext; This looks fine in what is in there but i think it should be restructured a bit. There should be a struct that contains only constant fields which describes a type/class of filter, like a palette->yuv filter or whatever (constant not in the sense of const but that they are the same for all instances) then a generic context for filter instances is needed which would contain all generic fields a instance needs and then a "private" context for a filter instance which would contain type/class specific fields ColorContext / FilterContext would match that private contexts already i think though FilterContext sounds a bit too generic for a private context in a "shared" header SwsFilterDescriptor though mixes "class" and instance parts the process() pointer i assume would be the same for all instances you could also look at the AVFilter struct for a general idea of such a "const" descriptor > + > +// warp input lines in the form (src + width*i + j) to slice format > (line[i][j]) > +int ff_init_slice_from_src(SwsSlice * s, uint8_t *src[4], int stride[4], int > srcW, int lumY, int lumH, int chrY, int chrH); > + > +// Initialize scaler filter descriptor chain > +int ff_init_filters(SwsContext *c); > + > +// Free all filter data > +int ff_free_filters(SwsContext *c); > + > +// function for applying ring buffer logic into slice s this should be a bit more verbose, "applying ring buffer logic" is not clear in what it does > +int ff_rotate_slice(SwsSlice *s, int lum, int chr); [...] Thanks -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Asymptotically faster algorithms should always be preferred if you have asymptotical amounts of data
signature.asc
Description: Digital signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel