On 03.07.2010 at 14:44, Jürgen Spitzmüller wrote:

> Stephan Witt wrote:
>> I tried to incorporate some thesaurus files from OpenOffice. It seems like
>> the file format has changed and is incompatible now. So the old thesaurus
>> .idx and .dat files are found but not valid and then LyX crashes. To avoid
>> this I added the checks above.
> 
> I suppose you have checked the MyThes license to see whether you may modify the
> sources.

No, I haven't. So we can move the check one level up - to Thesaurus.cpp?
Would this be ok with you? (patch attached)
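
For illustration, the check from the attached patch can be pulled out into a small stand-alone function. This is only a sketch: the names looksLikeVersion1Index and demoVersionCheck are hypothetical and exist neither in LyX nor in MyThes, and the "comma in the first line means version 1" rule is simply the heuristic the patch uses, not a documented property of the file format.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of LyX or MyThes): returns true when the
// first line of an index stream contains a comma. The patch treats this as
// the sign of an old "version 1" thesaurus index that should be skipped.
bool looksLikeVersion1Index(std::istream & is)
{
	std::string first_line;
	if (!std::getline(is, first_line))
		return false; // empty or unreadable: no reason to reject it here
	return first_line.find(',') != std::string::npos;
}

// Small self-check on in-memory data; the file contents are made up.
void demoVersionCheck()
{
	std::istringstream with_comma("some,entry\n");
	std::istringstream without_comma("UTF-8\n42\n");
	assert(looksLikeVersion1Index(with_comma));
	assert(!looksLikeVersion1Index(without_comma));
}
```

Taking a std::istream instead of a file name keeps the check testable without touching the disk; in Thesaurus.cpp the stream would be the ifstream opened on the idx file.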

>> But the question is: where to get current
>> .idx and .dat files from?
> 
> The newer OpenOffice archives are just zip files that contain, amongst others,
> the *.idx and *.dat files. Works fine here.

Yes, but one has to check the contents carefully. Some are fine and others are
not.
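
One part of such a check is making sure every index file comes with a matching data file. The sketch below mirrors the approach of the attached patch (take the idx file whose base name contains the language code, then look for a dat file whose name contains that base name), but works on a plain list of file names. All function names here are hypothetical. Unlike the patch, it moves on to the next idx file when no matching dat file is found.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: file name without its last extension.
std::string baseName(std::string const & file)
{
	std::string::size_type const dot = file.rfind('.');
	return dot == std::string::npos ? file : file.substr(0, dot);
}

// Hypothetical helper: true if 'file' ends with "." + ext.
bool hasExtension(std::string const & file, std::string const & ext)
{
	std::string const suffix = "." + ext;
	return file.size() > suffix.size()
		&& file.compare(file.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the first (idx, dat) pair for 'lang', or two empty strings.
std::pair<std::string, std::string>
pickThesaurusPair(std::vector<std::string> const & files, std::string const & lang)
{
	assert(!lang.empty()); // an empty code would match every file name
	for (std::size_t i = 0; i < files.size(); ++i) {
		if (!hasExtension(files[i], "idx"))
			continue;
		std::string const base = baseName(files[i]);
		if (base.find(lang) == std::string::npos)
			continue;
		// Look for a data file that shares the index file's base name.
		for (std::size_t j = 0; j < files.size(); ++j) {
			if (hasExtension(files[j], "dat")
			    && files[j].find(base) != std::string::npos)
				return std::make_pair(files[i], files[j]);
		}
	}
	return std::make_pair(std::string(), std::string());
}
```

Matching on the base name instead of only the language code avoids pairing an idx file with a dat file from a different thesaurus for the same language.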
I collected the dictionaries and thesauri for the following languages from
different sources (OpenOffice downloads and openSUSE RPMs):
* US English
* German
* French
* Italian
* Russian
* Spanish
* Portuguese
* Polish

If I build a package with all these languages bundled, it comes to about
75 MByte without Qt.

Stephan

Index: src/Thesaurus.cpp
===================================================================
--- src/Thesaurus.cpp   (revision 34745)
+++ src/Thesaurus.cpp   (working copy)
@@ -29,6 +29,7 @@
 
 #include <algorithm>
 #include <cstring>
+#include <fstream>
 
 using namespace std;
 using namespace lyx::support;
@@ -91,10 +92,21 @@
        FileNameList const data_files = base.dirList("dat");
        string idx;
        string data;
+       string basename;
 
        LYXERR(Debug::FILES, "thesaurus path: " << path);
        for (FileNameList::const_iterator it = idx_files.begin(); it != idx_files.end(); ++it) {
-               if (contains(it->onlyFileName(), to_ascii(lang))) {
+               basename = it->onlyFileNameWithoutExt();
+               if (contains(basename, to_ascii(lang))) {
+                       ifstream ifs(it->absFileName().c_str());
+                       if (ifs) {
+                               string s;
+                               getline(ifs,s);
+                               if (s.find_first_of(',') != string::npos) {
+                                       LYXERR(Debug::FILES, "ignore version1 thesaurus idx file: " << it->absFileName());
+                                       continue;
+                               }
+                       }
                        idx = it->absFileName();
                        LYXERR(Debug::FILES, "selected thesaurus idx file: " << idx);
                        break;
@@ -104,7 +116,7 @@
                return make_pair(string(), string());
        }
        for (support::FileNameList::const_iterator it = data_files.begin(); it != data_files.end(); ++it) {
-               if (contains(it->onlyFileName(), to_ascii(lang))) {
+               if (contains(it->onlyFileName(), basename)) {
                        data = it->absFileName();
                        LYXERR(Debug::FILES, "selected thesaurus data file: " << data);
                        break;
@@ -164,6 +176,8 @@
 
 bool Thesaurus::thesaurusInstalled(docstring const & lang) const
 {
+       if (thesaurusAvailable(lang))
+               return true;
        pair<string, string> files = d->getThesaurus(lang);
        return (!files.first.empty() && !files.second.empty());
 }
