Re: [FFmpeg-user] Whisper in ffmpeg 8

MacFH - C E Macfarlane - News Thu, 14 Aug 2025 14:30:33 -0700

On 2025-08-14 22:23, Rob Hallam wrote:


On Thu, 14 Aug 2025 at 22:15, Bernhard Döbler <[email protected]> wrote:


Yesterday, news made the rounds that ffmpeg 8 is going to be released
soon, and that it will include Whisper, an AI program that can
understand speech and create subtitles.

Their github page https://github.com/ggml-org/whisper.cpp says they
offer a handful of models.

Model   Disk     Mem
tiny    75 MiB   ~273 MB
base    142 MiB  ~388 MB
small   466 MiB  ~852 MB
medium  1.5 GiB  ~2.1 GB
large   2.9 GiB  ~3.9 GB


There is a commit [1] adding Whisper support [2]. As the docs note you
will need to provide a model.
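
As a rough illustration of what "providing a model" looks like in practice,
here is a minimal Python sketch that assembles an ffmpeg command line using
the whisper filter. The option names (model, language, queue, destination,
format) follow the filter documentation linked as [2]; the file names, the
default values and the helper name build_whisper_command are placeholders
made up for this example, and the command only works with a binary that was
built with Whisper support.

```python
import shlex


def build_whisper_command(input_path, model_path, srt_path,
                          language="en", queue=3):
    """Assemble an ffmpeg argument list that writes subtitles to srt_path.

    Hypothetical helper: paths containing ':' or other filtergraph
    special characters would need escaping, which is not handled here.
    """
    # Filter options are joined with ':' like any other ffmpeg filter.
    filter_spec = (
        f"whisper=model={model_path}"
        f":language={language}"
        f":queue={queue}"
        f":destination={srt_path}"
        f":format=srt"
    )
    # -vn drops video; "-f null -" discards the audio output, since the
    # subtitles are written by the filter itself.
    return ["ffmpeg", "-i", input_path, "-vn",
            "-af", filter_spec, "-f", "null", "-"]


# Placeholder file names; the model file comes from the whisper.cpp project.
cmd = build_whisper_command("input.mp4", "ggml-base.en.bin", "output.srt")
print(shlex.join(cmd))
```

The sketch only prints the command; pass the list to subprocess.run() to
actually execute it.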

How does this work? Will all of this be compiled into the ffmpeg binary?


An --enable-whisper configure option has been added (default: no) [3],
so whether it is included is up to whoever compiles your binary, and you
provide the model yourself.

[1]: https://github.com/FFmpeg/FFmpeg/commit/13ce36fef98a3f4e6d8360c24d6b8434cbb8869b
[2]: https://ffmpeg.org/ffmpeg-filters.html#whisper-1
[3]: https://github.com/FFmpeg/FFmpeg/blob/47c6af7d299c96b2e65f5f10526e0f34e00b23c8/configure#L339
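
Since the option is off by default, it can be worth checking whether a
given binary was configured with it. A minimal Python sketch, assuming
ffmpeg is on PATH and that "ffmpeg -buildconf" lists the configure flags
one per line; the helper names has_whisper and ffmpeg_buildconf are
invented for this example:

```python
import shutil
import subprocess


def has_whisper(buildconf_text):
    """Return True if the configuration text lists --enable-whisper."""
    # Split on whitespace so the flag is matched as a whole token.
    return "--enable-whisper" in buildconf_text.split()


def ffmpeg_buildconf(ffmpeg="ffmpeg"):
    """Return the build configuration reported by the given ffmpeg binary."""
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-buildconf"],
        capture_output=True, text=True,
    )
    # Combine both streams in case a build reports on stderr (assumption).
    return result.stdout + result.stderr


if __name__ == "__main__":
    if shutil.which("ffmpeg") is None:
        print("ffmpeg not found on PATH")
    else:
        print("whisper enabled:", has_whisper(ffmpeg_buildconf()))
```

Another way to check would be to look for "whisper" in the output of
"ffmpeg -filters".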

Enlarging the question somewhat, is there existing AI that could be used
to process existing recordings that contain both speech and music, and
highlight or extract the areas, say by creating cut points, that contain
music?


Does anyone here know if this is possible?

_______________________________________________
ffmpeg-user mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".
