On Tue, Sep 30, 2014 at 11:42:20PM +1000, Peter Ross wrote: > +/** > + * Calculate FS44 ratio > + */ > +#define DSD_FS44(sample_rate) (sample_rate / 44100) > + > +/** > + * Calculate DST frame size > + * @return samples per frame (1-bit samples) > + */ > +#define DST_SAMPLES_PER_FRAME(sample_rate) (588 * DSD_FS44(sample_rate))
A header for just these seems a bit overkill. Also I'd prefer static inline functions over macros when there is no reason to use macros. > +static int read_map(GetBitContext *gb, Table *t, unsigned int > map[DST_MAX_CHANNELS], int channels) > +{ > + int ch; > + t->elements = 1; > + if (!get_bits1(gb)){ > + map[0] = 0; > + for (ch = 1; ch < channels; ch++) { > + int bits = av_log2(t->elements) + 1; > + map[ch] = get_bits(gb, bits); > + if (map[ch] == t->elements) { > + t->elements++; > + if (t->elements >= DST_MAX_ELEMENTS) > + return AVERROR_INVALIDDATA; > + } else if (map[ch] > t->elements) { > + return AVERROR_INVALIDDATA; > + } > + } > + } else { > + memset(map, 0, sizeof(*map) * DST_MAX_CHANNELS); > + } Maybe I just don't understand how this should be used (in that case a comment would be nice), but I find it suspicious that the if part initializes channels elements while the else part does initialize DST_MAX_CHANNELS elements. Personally I also prefer to have one-liners always in the "if" branch. > +static av_always_inline int get_sr_golomb_dst(GetBitContext *gb, unsigned > int k) > +{ > +#if 0 > + /* 'run_length' upper bound is not specified; we can never be sure it > will fit into get_bits cache */ > + int v = get_ur_golomb(gb, k, INT_MAX, 0); The code is too hard to read for the time of day (a comment on those different golomb variants sure would help), but sure that get_ur_golomb_shorten doesn't happen to do what you need? > + if (x >= 0) > + c -= (x + 4) / 8; > + else > + c += (-x + 3) / 8; Uh, isn't this actually c -= (x + 4) >> 3; ? > +static uint8_t prob_dst_x_bit(int c) > +{ > + return (ff_reverse[c & 127] >> 1) + 1; > +} Since this seems to be in the inner loop, making your own table might be worth it... > +static void build_filter(int16_t table[DST_MAX_ELEMENTS][16][256], const > Table *fsets) > +{ > + int i, j, k, l; > + for (i = 0; i < fsets->elements; i++) { > + int length = fsets->length[i]; > + for (j = 0; j < 16; j++) { > + int total = FFMAX(0, FFMIN(length - j * 8, 8)); av_clip? > + for (k = 0; k < 256; k++) { > + int v = 0; > + for (l = 0; l < total; l++) > + v += (((k >> l) & 1) * 2 - 1) * fsets->coeff[i][j * 8 + > l]; Is this faster in a relevant way than something more readable like coeff = fsets->coeff[i][j * 8 + l]; v += k & (1 << l) ? coeff : -coeff; ? > + unsigned int half_prob[DST_MAX_CHANNELS]; > + unsigned int map_ch_to_felem[DST_MAX_CHANNELS]; > + unsigned int map_ch_to_pelem[DST_MAX_CHANNELS]; > + DECLARE_ALIGNED(16, uint8_t, status)[DST_MAX_CHANNELS][16]; > + DECLARE_ALIGNED(16, int16_t, filter)[DST_MAX_ELEMENTS][16][256]; At least the last one is quite large I think. Please consider putting it in the context or so instead. That's also less problematic with alignment. > + if (!(avpkt->data[0] & 1)) { > + if (frame->nb_samples > avpkt->size - 1) > + av_log(avctx, AV_LOG_WARNING, "short frame"); > + memcpy(frame->data[0], avpkt->data + 1, FFMIN(frame->nb_samples * > avctx->channels, avpkt->size - 1)); Hm, shouldn't there be a check for avpkt->size > 0 somewhere first? > + if ((ret = init_get_bits8(&gb, avpkt->data, avpkt->size)) < 0) > + if ((ret = read_map(&gb, &fsets, map_ch_to_felem, avctx->channels)) < 0) > + if ((ret = read_map(&gb, &probs, map_ch_to_pelem, avctx->channels)) > < 0) I still think putting assignments into an if is really bad idea all around. > + if (same) { > + probs.elements = fsets.elements; > + memcpy(map_ch_to_pelem, map_ch_to_felem, sizeof(map_ch_to_felem)); > + } else { > + avpriv_request_sample(avctx, "Same_Mapping=0"); > + return ret; > + } Simpler as > if (!same) { > avpriv_request_sample(avctx, "Same_Mapping=0"); > return ret; > } and no else. Also, paranoia: for memcpy preferably use sizeof(dst) over sizeof(src), the effects are less bad if it goes wrong. > + uint64_t * s = (uint64_t*)status[ch]; That looks like a strict aliasing violation. > + v = ((predict >> 15) ^ residual) & 1; > + frame->data[0][ (i >> 3) * avctx->channels + ch] |= v << (7 > - (i & 0x7 )); > + > +#if HAVE_BIGENDIAN > + /* FIXME: not tested */ > + s[0] = (s[0] << 1) | ((s[1] >> 63) & 1); > + s[1] = (s[1] << 1) | v; > +#else > + s[1] = (s[1] << 1) | ((s[0] >> 63) & 1); > + s[0] = (s[0] << 1) | v; > +#endif Haven't fully understood this, but wouldn't it be more efficient especially on 32 bit systems to first collect e.g. 8 bits and only then move them into s? Especially if unrolling would be viable, in which case e.g. 7 - (i & 0x7 ) would become a constant as well. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel