On Thursday, September 22, 2022 9:20:46 AM MST Agustin Martin wrote:
> First of all, I am curious about the reasons behind this new format,
> the problems it deals with and its advantages. I assume they are valid
> enough, but they imply yet another spellchecking engine/format. We
> currently have good old ispell, aspell and hunspell. vim has its own
> spellchecker engine using its own format, with dicts that can be
> created from old myspell2 dicts. We did not add vim format dicts (from
> aspell dicts sources) since there seems to be some work to make vim
> use hunspell directly. And now these bdict dicts.

The .bdic format is specified by the upstream Chromium project, and is required by anything that is based on Chromium's code, like Qt WebEngine. I do not know why they went with a proprietary binary format, but I would assume that if they went to so much trouble to not use the standard Hunspell format there must have been something to make it worthwhile, like some performance improvement. Perhaps I am giving Google too much credit for having logical reasons instead of making arbitrary decisions.

> From your info and proposed locations seems that these dicts are
> arch:all, ¿is that true?

I have not seen anything to indicate they are not arch:all, although that probably depends on how the binary data is processed. There is a possibility of an endianness issue.

> Another question is what happens with affix files, which I see are
> used at build time, ¿are they used (from their path) at runtime or is
> all the info (dic+aff) bundled into the bdic file? If explicit affix
> files are still required at runtime, both bdic and aff files should
> probably be in the same dir. Otherwise I am more for a separate
> location. In this case, since bdic dicts seem to be more generic than
> just a qtwebengine issue and they are indeed created from hunspell
> files I would go for a rather generic name (maybe something like
> /usr/share/hunspell-bdic or something without the hunspell name?)

The .bdic binary file contains all the information from the .dic and .aff files, so neither of them is needed by Qt WebEngine at runtime. As such, I think a dedicated directory for the .bdic files is best.

My personal motivation for getting these dictionaries into Debian is that I am the developer of Privacy Browser, which is a web browser based on Qt WebEngine. The PC version is currently in a pre-alpha state.

https://www.stoutner.com/privacy-browser-pc/[1]

When adding spell checking functionality, I realized that these dictionaries were not already packaged.
The little bit of poking around that I did showed that Arch Linux packages them, but I do not know if other distributions do so.

https://archlinux.org/todo/packaging-qtwebengine-dictionaries/[2]

There are a number of existing web browsers in Debian based on Qt WebEngine that could take advantage of the presence of these .bdic dictionaries. A non-exhaustive list includes: Konqueror, Falkon, qutebrowser, and angelfish. If it ends up being feasible for Chromium to also use a system-wide .bdic location, then any Chromium fork would also benefit.

Once Privacy Browser reaches an alpha release, my intention is to maintain a Debian package for it. I have the option of integrating the .bdics directly into the program's personal data folders, but that seems like a suboptimal approach, because anything else on the system that wanted to use them would need its own copy.

When the binary dictionaries are installed in the correct system-wide folder, any Qt WebEngine program can utilize them with a single line of code that specifies which dictionary to use (only one can be active at a time). Of course, the program would also probably need to provide a GUI where the user can select which dictionary they would like to be active, which involves more than a single line of code.

-- 
Soren Stoutner
so...@stoutner.com

--------
[1] https://www.stoutner.com/privacy-browser-pc/
[2] https://archlinux.org/todo/packaging-qtwebengine-dictionaries/
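
As a rough illustration of the dictionary-selection step mentioned above, here is a minimal Python sketch of how a program might enumerate the installed .bdic files so that a GUI can offer them to the user. It is only a sketch under stated assumptions: the directory /usr/share/hunspell-bdic is just the name floated earlier in this thread and has not been agreed on, the function name list_bdic_languages is hypothetical, and the sketch assumes that the file name without the .bdic extension is what the application would hand to Qt WebEngine as the language name.

```python
from pathlib import Path

# Hypothetical location: the name suggested in this thread, not a settled path.
BDIC_DIR = "/usr/share/hunspell-bdic"


def list_bdic_languages(dict_dir):
    """Return the sorted base names of the .bdic files found in dict_dir.

    Assumption: the base name (for example "en-US" for "en-US.bdic") is the
    language name an application would later pass to its spell checker.
    """
    directory = Path(dict_dir)
    if not directory.is_dir():
        # No dictionary directory installed: nothing to offer the user.
        return []
    return sorted(
        entry.stem
        for entry in directory.iterdir()
        if entry.is_file() and entry.suffix == ".bdic"
    )


if __name__ == "__main__":
    # Print one candidate language per line, e.g. to fill a selection menu.
    for language in list_bdic_languages(BDIC_DIR):
        print(language)
```

A system-wide directory keeps this lookup identical for every Qt WebEngine program, which is the main argument for not bundling private copies of the dictionaries in each application.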