It adds a new audio filter for running audio transcriptions with the whisper
model.
Documentation and examples are included in the patch.
Signed-off-by: Vittorio Palmisano
---
configure | 5 +
doc/filters.texi | 107 +
libavfilter/Makefile | 2
> Hi Vittorio
>
> On Thu, Jul 17, 2025 at 10:51:57AM +0200, Vittorio Palmisano wrote:
> > It adds a new audio filter for running audio transcriptions with the
> > whisper model.
> > Documentation and examples are included in the patch.
> >
> > Signe
It adds a new audio filter for running audio transcriptions with the whisper
model.
Documentation and examples are included in the patch.
Signed-off-by: Vittorio Palmisano
---
configure | 5 +
doc/filters.texi | 101
libavfilter/Makefile | 2
> > +
> > +memcpy(wctx->audio_buffer, wctx->audio_buffer + end_pos,
> > + end_pos * sizeof(float));
>
> sizeof(*wctx->audio_buffer) is more robust than float
But end_pos is not necessarily equal to the audio_buffer size; it
could be lower.
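The review comment concerns the element size rather than the element count: `sizeof(*wctx->audio_buffer)` evaluates to the size of one sample, so `end_pos` still decides how many samples are moved. A minimal sketch of that idea follows. The helper `drop_consumed` and its parameters are hypothetical and are not taken from the patch, and the use of `memmove` (because source and destination ranges may overlap when shifting data inside one buffer) is an assumption about the intended behaviour, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: discard the first
 * `consumed` samples of `buf` and shift the remaining ones to the front.
 * `fill` is the number of valid samples currently in the buffer.
 * Returns the new fill level. */
static size_t drop_consumed(float *buf, size_t fill, size_t consumed)
{
    assert(consumed <= fill);
    size_t remaining = fill - consumed;

    /* sizeof(*buf) is the size of ONE element (here sizeof(float)), not of
     * the whole buffer, so the byte count stays correct if the sample type
     * ever changes. memmove() tolerates overlapping ranges; memcpy() does not. */
    memmove(buf, buf + consumed, remaining * sizeof(*buf));
    return remaining;
}
```

For example, calling `drop_consumed(buf, 6, 2)` on a six-sample buffer leaves the last four samples at the start of `buf` and returns 4.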
>
> not sure how others think of this, but
Thanks, I've applied your suggestions.
Signed-off-by: Vittorio Palmisano
---
configure | 5 +
doc/filters.texi | 106 +
libavfilter/Makefile | 2 +
libavfilter/af_whisper.c | 454 +++
libavfilter/allfilters.c | 2 +
5 files changed, 569 inser
> Leaving out parameter names is a C++ thing, it's not allowed in C.
>
Ok, I've added some modifications and fixed the empty transcription output.
--
/Vittorio Palmisano/
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https:/
> While the filter provides great value, the accelerating pace of AI innovation
> raises concerns about its longevity. Given how rapidly newer models emerge,
> is there a risk of this filter becoming deprecated in the near term?
I think that the design of the whisper.cpp library allows us to
[...]->next_pts = AV_NOPTS_VALUE;
> > +
> > +wctx->avio_context = NULL;
>
> aren't things already initialized to 0?
Yes, maybe we can keep the AV_NOPTS_VALUE assignment (it is not zero).
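The distinction can be illustrated with a small standalone sketch. It assumes a context that is allocated zero-filled, as the reviewer describes, so pointer fields need no explicit NULL assignment, while a sentinel such as `AV_NOPTS_VALUE` (the most negative `int64_t` in libavutil) still has to be set by hand. `SketchContext`, `sketch_alloc` and `SKETCH_NOPTS_VALUE` are hypothetical names used only for illustration and are not part of the patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for AV_NOPTS_VALUE, which libavutil defines as the most
 * negative int64_t value. Redefined here so the sketch builds alone. */
#define SKETCH_NOPTS_VALUE INT64_MIN

/* Hypothetical context holding only the two fields discussed above. */
typedef struct SketchContext {
    void   *avio_context;
    int64_t next_pts;
} SketchContext;

/* Allocate a zero-filled context and set only the non-zero sentinel. */
static SketchContext *sketch_alloc(void)
{
    SketchContext *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;

    /* avio_context is already NULL after calloc() on common platforms,
     * so an explicit assignment would be redundant. next_pts is not:
     * zero is a valid timestamp, so the "unset" marker must be assigned. */
    ctx->next_pts = SKETCH_NOPTS_VALUE;
    return ctx;
}
```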
--
/Vittorio Palmisano/
It adds a new audio filter for running audio transcriptions with the whisper
model.
Documentation and examples are included in the patch.
Signed-off-by: Vittorio Palmisano
---
configure | 5 +
doc/filters.texi | 105 +
libavfilter/Makefile | 2
Hi, I've added some changes to improve the VAD mechanism.
You can find the changes here too:
https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/17/files
Signed-off-by: Vittorio Palmisano
---
configure | 5 +
doc/filters.texi | 106 +
libavfilter/Makefile | 2 +
libavfilter/af_whisper.c
> > +@item gpu_device
> > +The GPU device to use.
> > +Default value: @code{"0"}
>
> is this always a number?
> if so, the documentation could say that
Yes, it is the device index.
> > +@item destination
> > +If set, the transcription output will be sent to the specified file or URL
> > +(use one
It adds a new audio filter for running audio transcriptions with the whisper
model.
Documentation and examples are included in the patch.
Signed-off-by: Vittorio Palmisano
---
configure | 5 +
doc/filters.texi | 107 +
libavfilter/Makefile | 2
Hi,
I've applied some changes and created a pull request:
https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20022
>
> > +frames = FFMAX(0, FFMIN(frames, wctx->audio_buffer_fill_size));
>
> I would call it samples, sample_count or nb_samples
>
> why are you clipping the number of samples?
>
> I assum
Update: the correct time base is stored in inlink->time_base, not
in frame->time_base.
On Wed, Jul 23, 2025 at 12:19 PM Vittorio Palmisano
wrote:
>
> > > To understand why this is a problem, consider some audio input device
> > > which samples at 16 kHz. This har
I've found that:
frame->time_base=1/48000
frame->sample_rate=16000
Using `1000 * frame->pts * frame->time_base` returns wrong results.
The only way to get the correct value seems to be `1000 * frame->pts /
frame->sample_rate`.
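The numbers above are consistent with the timestamps being counted in units of 1/16000 s while being scaled by a 1/48000 time base, which gives a result three times too small. The sketch below reproduces that arithmetic in isolation. `SketchRational` and `pts_to_ms` are hypothetical stand-ins so the example builds without libavutil, and the assumption that pts counts samples at 16 kHz is inferred from the observation above rather than confirmed. Inside FFmpeg, `av_rescale_q(pts, inlink->time_base, (AVRational){1, 1000})` performs this kind of conversion with overflow protection.

```c
#include <stdint.h>

/* Hypothetical stand-in for AVRational so the sketch builds alone. */
typedef struct SketchRational {
    int num;
    int den;
} SketchRational;

/* Convert a timestamp expressed in units of `tb` seconds to milliseconds.
 * Plain 64-bit integer arithmetic; no overflow protection, and the
 * result is truncated toward zero. */
static int64_t pts_to_ms(int64_t pts, SketchRational tb)
{
    return pts * 1000 * tb.num / tb.den;
}
```

With `pts = 16000` (one second of audio if pts counts 16 kHz samples), a time base of 1/16000 yields 1000 ms, while a time base of 1/48000 yields 333 ms.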
--
/Vittorio Palmisano/