On 8/31/14, Clement Boesch <u...@pkh.me> wrote: > On Fri, Aug 29, 2014 at 06:22:36PM +0000, Paul B Mahol wrote: >> Signed-off-by: Paul B Mahol <one...@gmail.com> >> --- >> doc/filters.texi | 62 ++++++ >> libavfilter/Makefile | 1 + >> libavfilter/af_silenceremove.c | 478 >> +++++++++++++++++++++++++++++++++++++++++ >> libavfilter/allfilters.c | 1 + >> 4 files changed, 542 insertions(+) >> create mode 100644 libavfilter/af_silenceremove.c >> >> diff --git a/doc/filters.texi b/doc/filters.texi >> index cca15fc..ea7da88 100644 >> --- a/doc/filters.texi >> +++ b/doc/filters.texi >> @@ -1881,6 +1881,68 @@ ffmpeg -i silence.mp3 -af >> silencedetect=noise=0.0001 -f null - >> @end example >> @end itemize >> >> +@section silenceremove >> + >> +Removes silence from the beginning, middle or end of the audio. > > We tend to use the "Remove silence ..." form (no 's').
Changed. > > Also, what is exactly the meaning of "from" here? Shouldn't it be "at the > beginning, in the middle or at the end of..."? (Note: as you know, I'm not > a native English speaker). Neither I am. > >> + >> +The filter accepts the following options: >> + >> +@table @option >> +@item start_periods >> +This value is used to indicate if audio should be trimmed at beginning >> of >> +the audio. A value of zero indicates no silence should be trimmed from >> the >> +beginning. When specifying an non-zero value, it trims audio up until it > > "*a* non-zero value"? Changed. > >> +finds non-silence. Normally, when trimming silence from beginning of >> audio >> +the @var{start_periods} will be @code{1} but it can be increased to >> higher >> +values to trim all audio up to specific count of non-silence periods. >> + > > Please specify the default value, ditto below Done. > >> +@item start_duration >> +Specify amount of time that non-silence must be detected before it stops > > Specify *the* amount of time Done. > >> +trimming audio. By increasing the duration, burst of noise can be >> treated >> +as silence and trimmed off. >> + >> +@item start_threshold >> +This indicate what sample value should be treated as silence. For >> digital >> +audio, a value of @code{0} may be fine but for audio recorded from >> analog, >> +you may wish to increase the value to account for background noise. >> + >> +@item stop_periods >> +Set the count for trimming silence from the end of audio. >> +To remove silence from the middle of a file, specify a >> @var{stop_periods} >> +that is negative. This value is the threated as a positive value and is >> also > > is *then* *treated*? Done. > > drop the "also" maybe Done. > >> +used to indicate the effect should restart processing as specified by >> +@var{start_periods}, making it suitable for removing periods of silence >> +in the middle of the audio. >> + >> +@item stop_duration >> +Specify a duration of silence that must exist before audio is not copied >> any >> +more. By specifying a higher duration, silence that is wanted can be >> left >> +in the audio. >> + >> +@item stop_threshold >> +This indicate what sample value should be treated as silence but for >> +trimming silence from the end of audio. >> + >> +@item leave_silence >> +This indicate that @var{stop_duration} length of audio should be left >> intact >> +at the beginning of each period of silence. >> +For example, if you want to remove long pauses between words but do not >> want >> +to remove the pauses completely. >> + >> +@end table >> + > > I must say I'm extremely confused at all of this; more examples would be > welcome if you don't mind. > >> +@subsection Examples >> + >> +@itemize >> +@item >> +The following example shows how this filter can be used to start >> recording > > *a* recording? Done. > >> +that does not contain the delay at the start which usually occurs >> between > >> +pressing the record button and the star of the performance: > > start? Fixed. > > [...] >> +typedef struct SilenceRemoveContext { >> + const AVClass *class; >> + > >> + int mode; > > Could be made an enum Done. > > [...] >> +static const AVOption silenceremove_options[] = { >> + { "start_periods", NULL, OFFSET(start_periods), AV_OPT_TYPE_INT, >> {.i64=0}, 0, 9000, FLAGS }, >> + { "start_duration", NULL, OFFSET(start_duration), >> AV_OPT_TYPE_DURATION, {.i64=0}, 0, 9000, FLAGS }, >> + { "start_threshold", NULL, OFFSET(start_threshold), >> AV_OPT_TYPE_DOUBLE, {.dbl=0}, 0, 1, FLAGS }, >> + { "stop_periods", NULL, OFFSET(stop_periods), AV_OPT_TYPE_INT, >> {.i64=0}, -9000, 9000, FLAGS }, >> + { "stop_duration", NULL, OFFSET(stop_duration), >> AV_OPT_TYPE_DURATION, {.i64=0}, 0, 9000, FLAGS }, >> + { "stop_threshold", NULL, OFFSET(stop_threshold), >> AV_OPT_TYPE_DOUBLE, {.dbl=0}, 0, 1, FLAGS }, >> + { "leave_silence", NULL, OFFSET(leave_silence), AV_OPT_TYPE_INT, >> {.i64=0}, 0, 1, FLAGS }, > ^^^^ > this column makes me sad :( > >> + { NULL } >> +}; >> + >> +AVFILTER_DEFINE_CLASS(silenceremove); >> + >> +#define SILENCE_TRIM 0 >> +#define SILENCE_TRIM_FLUSH 1 >> +#define SILENCE_COPY 2 >> +#define SILENCE_COPY_FLUSH 3 >> +#define SILENCE_STOP 4 >> + > [...] >> +static int config_input(AVFilterLink *inlink) >> +{ >> + AVFilterContext *ctx = inlink->dst; >> + SilenceRemoveContext *s = ctx->priv; >> + > >> + s->window_size = (inlink->sample_rate / 50) * inlink->channels; > > what is 50? Number used for window size calculation. > >> + s->window = av_malloc_array(s->window_size, sizeof(*s->window)); >> + >> + clear_rms(s); >> + >> + s->start_duration = av_rescale(s->start_duration, >> inlink->sample_rate, >> + AV_TIME_BASE); >> + s->stop_duration = av_rescale(s->stop_duration, >> inlink->sample_rate, >> + AV_TIME_BASE); >> + >> + if (s->start_duration) { >> + s->start_holdoff = av_malloc_array(s->start_duration, >> + sizeof(*s->start_holdoff) * >> + inlink->channels); >> + if (!s->start_holdoff) >> + return AVERROR(ENOMEM); >> + } >> + >> + s->start_holdoff_offset = 0; >> + s->start_holdoff_end = 0; >> + s->start_found_periods = 0; >> + >> + if (s->stop_duration) { >> + s->stop_holdoff = av_malloc_array(s->stop_duration, >> + sizeof(*s->stop_holdoff) * >> + inlink->channels); >> + if (!s->stop_holdoff) >> + return AVERROR(ENOMEM); >> + } >> + >> + s->stop_holdoff_offset = 0; >> + s->stop_holdoff_end = 0; >> + s->stop_found_periods = 0; >> + >> + if (s->start_periods) >> + s->mode = SILENCE_TRIM; >> + else >> + s->mode = SILENCE_COPY; >> + >> + return 0; >> +} >> + >> +static int config_output(AVFilterLink *outlink) >> +{ >> + outlink->flags |= FF_LINK_FLAG_REQUEST_LOOP; >> + >> + return 0; >> +} >> + > > nit: you probably can merge the 2 config callbacks > >> +static int above_threshold(double value, double threshold) >> +{ >> + return value > threshold; >> +} >> + > > Is this really necessary? Removed. > >> +static double compute_rms(SilenceRemoveContext *s, double sample) >> +{ >> + double new_sum; >> + >> + new_sum = s->rms_sum; >> + new_sum -= *s->window_current; >> + new_sum += sample * sample; >> + > >> + return sqrt(new_sum / s->window_size); > > Can't you remove the sqrt() and the div by updating the threshold > reference instead once (th=square(th*s->window_size)) ? I believe it will > kind of make the code faster. That can be done later. > >> +} >> + >> +static void update_rms(SilenceRemoveContext *s, double sample) >> +{ >> + s->rms_sum -= *s->window_current; >> + *s->window_current = sample * sample; >> + s->rms_sum += *s->window_current; >> + >> + s->window_current++; >> + if (s->window_current >= s->window_end) >> + s->window_current = s->window; >> +} >> + > >> +static void flush(AVFrame *out, AVFilterLink *outlink, >> + int *nb_samples_written, int *ret) >> +{ >> + if (*nb_samples_written) { >> + out->nb_samples = *nb_samples_written / outlink->channels; >> + *ret = ff_filter_frame(outlink, out); >> + *nb_samples_written = 0; >> + } else { >> + av_frame_free(&out); >> + } >> +} > > I suppose you don't want to return the ret value because it avoids > overwriting its already set value? > >> + >> +static int filter_frame(AVFilterLink *inlink, AVFrame *in) >> +{ >> + AVFilterContext *ctx = inlink->dst; >> + AVFilterLink *outlink = ctx->outputs[0]; >> + SilenceRemoveContext *s = ctx->priv; >> + int i, j, threshold, ret = 0; >> + int nbs, nb_samples_read, nb_samples_written; >> + double *obuf, *ibuf = (double *)in->data[0]; >> + AVFrame *out; >> + >> + nb_samples_read = nb_samples_written = 0; >> + >> + switch (s->mode) { >> + case SILENCE_TRIM: > >> +silence_trim: > > Can't you split all these cases/labels into function and do return func() > instead of all these weird gotos indirections? I would prefer to do that later. > > [...] >> +static av_cold void uninit(AVFilterContext *ctx) >> +{ >> + SilenceRemoveContext *s = ctx->priv; >> + >> + av_freep(&s->start_holdoff); >> + av_freep(&s->stop_holdoff); > > I misses the free of window I believe Fixed. > > [...] > > No other comment from me, assuming updates for Changelog & RELEASE_NOTES > and avfilter bump. > > -- > Clement B. > _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel