https://bugs.kde.org/show_bug.cgi?id=506100
Bug ID: 506100
Summary: Whisper Transcoding not working due to {language} error
Classification: Applications
Product: kdenlive
Version First Reported In: 25.04.1
Platform: Ubuntu
OS: Linux
Status: REPORTED
Severity: normal
Priority: NOR
Component: User Interface & Miscellaneous
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
SUMMARY
When trying to transcribe any sound using the installed transcribe plugin, the
plugin crashes after a short while with a Python backtrace in the logs.
STEPS TO REPRODUCE
1. Start Kdenlive
2. Add a sound file and select it
3. Click Transcribe
4. Wait
5. See Logs
OBSERVED RESULT
```
/home/d0/.var/app/org.kde.kdenlive/data/kdenlive/venv/lib/python3.12/site-packages/whisper/transcribe.py:124: UserWarning: Performing inference on CPU when CUDA is available
  warnings.warn("Performing inference on CPU when CUDA is available")
Traceback (most recent call last):
  File "/app/share/kdenlive/scripts/whisper/whispertotext.py", line 192, in <module>
    sys.exit(main())
             ^^^^^^
  File "/app/share/kdenlive/scripts/whisper/whispertotext.py", line 174, in main
    result = run_whisper(source, model, device, task, jobArgs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/share/kdenlive/scripts/whisper/whispertotext.py", line 126, in run_whisper
    result = loadedModel.transcribe(source, **transcribe_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/d0/.var/app/org.kde.kdenlive/data/kdenlive/venv/lib/python3.12/site-packages/whisper/transcribe.py", line 155, in transcribe
    tokenizer = get_tokenizer(
                ^^^^^^^^^^^^^^
  File "/home/d0/.var/app/org.kde.kdenlive/data/kdenlive/venv/lib/python3.12/site-packages/whisper/tokenizer.py", line 380, in get_tokenizer
    raise ValueError(f"Unsupported language: {language}")
ValueError: Unsupported language: language
```
EXPECTED RESULT
The progress bar should complete and the transcription should be displayed
after a while.
SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Kubuntu 24.04
KDE Plasma Version: 6.3.2
KDE Frameworks Version:
Qt Version:
ADDITIONAL INFORMATION
Since I am an avid Python developer, I also found the culprit:
https://github.com/KDE/kdenlive/blob/965d9ffad5e0848bf0d6657df32489c555124d86/data/scripts/whisper/whispertotext.py#L166
On line 166 you do:
    language = args.language
Then on line 172:
    jobArgs = f"language={language} "
There are two problems here:
[1] The value of args.language isn't simply "English"; it already contains the
variable name, so it is "language=English". When you then set jobArgs in line
172, jobArgs doesn't become "language=English" but "language=language=English",
which is obviously wrong.
[2] The Whisper API doesn't expect the string "English"; it expects an ISO-639-1
language code (e.g. "en" for English or "de" for German). This means there
needs to be a mapping function somewhere.
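Problem [1] can be reproduced in isolation. The following is only a sketch: it assumes args.language arrives as "language=English" (as I observed), and the final split on "=" is my guess at how the argument string is later parsed, not something I have confirmed in the script. If that guess is right, it would explain why the error message reads "Unsupported language: language".

```python
# Assumption: this is the value args.language holds when the script is invoked.
arg_language = "language=English"

# Same f-string as on line 172 of whispertotext.py.
job_args = f"language={arg_language} "

# The prefix is now doubled.
print(repr(job_args))

# Hypothetical parsing step (not taken from the script): splitting the token
# on "=" and taking the second element yields the word "language" itself.
parsed_value = job_args.strip().split("=")[1]
print(parsed_value)
```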
Adding the following below line 166 makes it work for me:
    # Take the latter part of "language=English" by splitting at the equals sign
    language = language.split("=")[1]
    # Map language to two-letter ISO-639-1 code
    LANGUAGE_TO_ISO = {
        "English": "en",
        "German": "de",
        # More needs to be added here
    }
    language = LANGUAGE_TO_ISO.get(language)
Please note that this may not be usable in production, since it will fail if
language doesn't contain "=". For production code it would probably be better
to fix this upstream in the part that sends the language string and make it
send a proper ISO-639-1 two-letter code from the start. But those who just
need it to work and have this issue could edit the file.
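For anyone who wants a slightly more defensive local workaround, here is a minimal sketch that tolerates a missing "=" and unknown language names. The function name normalize_language and the lowercase mapping are my own illustrative choices, not part of Kdenlive or Whisper, and the mapping is deliberately incomplete.

```python
from typing import Optional

# Hypothetical, incomplete mapping from lowercase language name to ISO-639-1 code.
LANGUAGE_TO_ISO = {
    "english": "en",
    "german": "de",
    # More needs to be added here
}


def normalize_language(raw: Optional[str]) -> Optional[str]:
    """Return a two-letter code for values like "language=English", or None."""
    if not raw:
        return None
    # rpartition keeps the whole string as the last element when "=" is absent,
    # so both "language=English" and plain "English" are handled.
    value = raw.rpartition("=")[2].strip().lower()
    if not value:
        return None
    # Pass through values that already are one of the known codes.
    if value in LANGUAGE_TO_ISO.values():
        return value
    # Unknown names map to None instead of raising.
    return LANGUAGE_TO_ISO.get(value)


print(normalize_language("language=English"))
print(normalize_language("German"))
print(normalize_language("language=Klingon"))
```

Returning None for unknown input is a design choice on my part; the caller would then have to decide whether to omit the language argument or report an error.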