Hi,
I have no affiliation with GSoC or any other coding program.
I am looking at the thesaurus files (I am not a coder), with a view to
providing an updated technical.dic thesaurus with many new terms and a clear
upgrade/merge/integrate path from an external data source (Wikipedia). It has
603 lines of references, as opposed to the current 378 lines, and each term is
(possibly) searchable directly on Wikipedia. After that, I may look at a
concept/design for integrating a (hypothetical) web search into the XML (help,
thesaurus) interface: right-click on an item in help or in Bayram Cicek’s
search interface and you get an option to search the internet or a Wikipedia
article (as a concept).
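To make the merge path and the per-term Wikipedia lookup concrete, here is a
minimal Python sketch. It assumes the terms are available as plain strings, one
per line, and it does not parse any technical.dic header. The function names
and the sample terms are hypothetical; the search URL uses Wikipedia's public
Special:Search page.

```python
from urllib.parse import quote


def merge_terms(current, additions):
    """Return the current terms followed by new ones, skipping exact duplicates."""
    seen = set()
    merged = []
    for term in list(current) + list(additions):
        cleaned = term.strip()
        # Skip blank lines and terms that were already kept.
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            merged.append(cleaned)
    return merged


def wikipedia_search_url(term):
    """Build a Wikipedia search URL for one term (spaces become %20)."""
    return "https://en.wikipedia.org/wiki/Special:Search?search=" + quote(term)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real technical.dic.
    current = ["API", "bitmap", "checksum"]
    additions = ["bitmap", " codec ", "", "hash table"]
    for term in merge_terms(current, additions):
        print(term, wikipedia_search_url(term))
```

Matching is exact and case-sensitive on purpose, since technical terms can
differ only by capitalization; a case-insensitive merge would need a different
key.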
Anyway, I deleted the two foreign-language thesaurus files, hu_AkH11.dic and
sl.dic, after looking at the bug reports mentioned below, but I quickly
realized that the .dic files were needed when compiling. Blank placeholders
(empty text files) work too, though.
I have since edited the code references to remove hu_AkH11.dic, and it now
compiles OK without even a placeholder (empty) file.
Aside from Linguistic.backup..xcs, line 217, where else is hu_AkH11.dic
referenced? I looked through the references below and I believe that it is
bloat. I will try compiling with HU language support if people think that is
useful.
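The quoted reply below suggests "git grep" for finding such references. As a
rough stand-in for cases where git is not at hand, this minimal Python sketch
walks a source tree and reports every line that mentions one of the file names.
The function name and the skip list are hypothetical, and it does a plain
substring match only.

```python
import os


def find_references(root, names, skip_dirs=(".git",)):
    """Yield (path, line_number, line) for each line under root mentioning any name."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories we do not want to descend into.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if any(name in line for name in names):
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue


# Example use from the top of a checkout:
# for path, number, line in find_references(".", ["hu_AkH11.dic", "sl.dic"]):
#     print(f"{path}:{number}: {line}")
```

Unlike "git grep", this also scans untracked and generated files, so a clean
checkout gives the most comparable results.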
I would expect then that even with HU language and dictionary support
installed, the (original) hu_AkH11.dic thesaurus will not exist or be called
for. I don’t speak Hungarian though.
I can see that sl.dic is integrated into the unit test, so I am not touching
it today. I will put the text back into the placeholder file for my next build.
«en_US or other language builds get these files unnecessarily, the only task is
fixing our packaging.» OK, how can I help with packaging?
László, do you have a local repo for your LO code for the en_US spelling
dictionary? Your language code is separate from this specific technical.dic
thesaurus, yes?
Thanks
Alex Tao
Tao Submarines and Systems
Chios, Aegean Sea
>Thursday, June 29, 2023 1:44 AM +03:00 from Németh László <
>nem...@numbertext.org >:
>
>Hi,
>Andras Timar < tima...@gmail.com > wrote (on Wed, Jun 28, 2023, 17:55):
>>Hi Alex,
>>On Wed, Jun 28, 2023 at 5:15PM Alex < taosubmari...@mail.ru > wrote:
>>>Hi everyone
>>>
>>>Today I am trying to determine how to remove two unwanted wordbook files
>>>from libreoffice/extras/source/wordbook:
>>>hu_AkH11.dic and sl.dic.
>>>These (incomplete) foreign-language .dic files should be removed, unless
>>>they are used in some unit test.
>>>See bugs 139961, 68576, etc.
>>>
>>>Can they be removed? OK?
>>>
>>
>>I'm not sure if it's OK. We added these dictionaries for a reason. It's
>>better to ask the maintainers first (I CC-ed them).
>>From the technical point of view, if you remove the files from source, and
>>all references to them, the build should pass. Maybe you need a clean build
>>from scratch. Use "git grep sl.dic" and "git grep hu_AkH11.dic" commands,
>>they are more reliable than opengrok.
>
>You can remove hu_AkH11.dic with the following git command:
>
>$ git revert 6247c966942a0e43320a234302a67c1f92c2eea7
>
>This is because the file was added with that commit:
>$ git log libreoffice/extras/source/wordbook/hu_AkH11.dic
>commit 6247c966942a0e43320a234302a67c1f92c2eea7
>
>But these are not unwanted dictionaries, as András wrote.
>
>In theory, they are packaged only with their language builds, sl-SI and hu-HU.
>If not, i.e. en_US or other language builds get these files unnecessarily, the
>only task is fixing our packaging. If the packaging problem is related to some
>Linux distributions, I believe, our task is only to report that in their bug
>trackers.
>
>Is this a GSoC project? I haven't found information about the planned
>improvement of the (en_US?) thesaurus or the thesaurus code base.
>(By the way, I had an interesting improvement here: English stemming and
>affixation during thesaurus usage by adding extra language data to the en_US
>spelling dictionary. Unfortunately, by accident this was removed by the recent
>maintainer.)
>
>Best regards,
>László
>
>
>>
>>Best regards,
>>Andras
>>
>>