>> can you please explain how I can recreate the files *.tiktoken?
>> There seem to be some sources missing ...
>
> The two files in question are 50k lines of ASCII text that seem to be
> some kind of index / vocabulary, and I have no idea how they were
> created.

Perhaps there are some clues to be had in the reimplementation at
https://github.com/ggerganov/whisper.cpp/ - or perhaps its authors
know?
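
As for what the files contain: they look like the vocabulary format
used by tiktoken, where each line holds a base64-encoded token and its
integer rank, separated by a space. Below is a minimal sketch for
inspecting and round-tripping such a file, assuming that layout holds
for the files in question. The function names and the three-line sample
are made up for illustration, and this does not show how the ranks were
originally derived.

```python
import base64


def parse_tiktoken(text):
    """Map token bytes to integer rank.

    Assumes one "<base64 token> <rank>" pair per non-empty line.
    """
    ranks = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        token_b64, rank = line.split()
        ranks[base64.b64decode(token_b64)] = int(rank)
    return ranks


def dump_tiktoken(ranks):
    """Serialize a token-to-rank mapping back to text, ordered by rank."""
    return "".join(
        f"{base64.b64encode(token).decode('ascii')} {rank}\n"
        for token, rank in sorted(ranks.items(), key=lambda kv: kv[1])
    )


# Hypothetical three-line sample in the assumed format.
sample = "IQ== 0\nIg== 1\nIw== 2\n"
ranks = parse_tiktoken(sample)
print(ranks)
```

If the dump of a parsed file is byte-identical to the original, the
file carries nothing beyond the token-to-rank mapping, which would
narrow the question down to where that mapping came from.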

...and perhaps you might be interested in packaging that C++
reimplementation too/instead? ;-)


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
