Here is the second revision of my "Portable Spell Checker Interface Library". Will someone still be willing to work on the Ispell module, as you said you would earlier (http://www.mail-archive.com/lyx-devel@lists.lyx.org/msg08808.html)? Thanks.

Portable Spell Checker Interface Library
Kevin Atkinson
[EMAIL PROTECTED]
March 9, 2000 (Revision 2)

1 Goal

The goal of the library is to provide a generic interface to the spell checker libraries installed on the system.

2 Overview

The Pspell library contains two main classes and several helper classes. The two main classes are PspellConfig and PspellManager. The PspellConfig class is used to set initial defaults and to change spell checker specific options. The PspellManager class does most of the real work. It is responsible for managing the dictionaries, checking if a word is in the dictionary, and coming up with suggestions, among other things.

There are many helper classes. The important ones are PspellWordList, PspellMutableWordList, and Pspell*Emulation. The PspellWordList class is used for accessing the suggestion list, as well as the personal and session word lists currently in use. The PspellMutableWordList class is used to manage the personal, and perhaps other, word lists. The Pspell*Emulation classes are used for iterating through a list.

A C interface will also be provided, as well as a few STL-like helper classes for those who prefer more modern C++.

3 Usage

When your application first starts you should get a new configuration class with the command:

    PspellConfig * spell_config = new_pspell_config();

which will create a new PspellConfig class. It is allocated with new and it is your responsibility to delete it with either delete_pspell_config or the standard C++ delete. Once you have the config class you should set some variables. The most important one is the language variable. To do so use the command:

    spell_config->replace("lang", "en-US");

which will set the default language to American English.
The language is expected to be the standard two letter ISO 639 language code, with an optional two letter ISO 3166 country code after a dash or underscore. Other things you might want to set are the encoding of "char" strings, the preferred spell checker to use, the search path for dictionaries, and the like.

Whenever a new document is created a new PspellManager class should also be created. There should be one manager class per document. To create a new manager class use the command:

    PspellManager * spell_checker = new_pspell_manager(spell_config);

which will create a new PspellManager class using the defaults found in spell_config. If for some reason you want to use different defaults simply clone spell_config and change the setting like so:

    PspellConfig * spell_config2 = spell_config->clone();
    spell_config2->replace("lang", "nl");
    PspellManager * spell_checker = new_pspell_manager(spell_config2);
    delete_pspell_config(spell_config2);

Once the manager class is created you can use the check method to see if a word in the document is correct like so:

    bool correct = spell_checker->check(<word>);

<word> can be any one of const char *, const u16int *, or const u32int *, where u16int and u32int are the unsigned 16 and 32 bit integer types on the current platform respectively. Strings of const char * are expected to use iso8859-1 or some other 8-bit character set as determined by the current language in use. Other encodings such as UTF-8 are allowed, but they must be explicitly set via a configuration option before the first use. Strings of const u16int * and const u32int * are expected to be in Unicode.

If the word is not correct then the suggest method can be used to come up with likely replacements:
    PspellWordList & suggestions = spell_checker->suggest(<word>);
    PspellStringEmulation * elements = suggestions.elements();
    const char * word;
    while ( (word = elements->next()) != NULL ) {
      // add to suggestion list
    }
    delete elements;

(It is also possible to access elements as const u16int * or const u32int *. See the class reference section for how to do so.)

Once a replacement is made the store_repl method should be used to communicate the replacement pair back to the spell checker (see section 7.1 for why). Its usage is as follows:

    spell_checker->store_repl(<misspelled word>, <correctly spelled word>);

If the user decides to add the word to the session or personal dictionary, the word can be added using the add_to_session or add_to_personal method respectively, like so:

    spell_checker->add_to_session(<word>);
    spell_checker->add_to_personal(<word>);

It is better to let the spell checker manage these words rather than doing it yourself, so that the words have a chance of appearing in the suggestion list.

Finally, when the document is closed the PspellManager class should be deleted like so:

    delete_pspell_manager(spell_checker);

The standard C++ delete may also be used.

4 Class Reference

Methods that return a bool generally return false on error and true otherwise. To find out what went wrong use the error_num and error_message methods. Unless otherwise stated, methods that return a const char * will return null on error. The character string returned is only valid until the next method which returns a const char * is called.

STRING is used to represent one of const char *, const u16int *, or const u32int *.

All methods are virtual and abstract, thus these classes are really abstract base classes. Therefore you cannot simply store the object directly. In order to make copies of the objects use the clone and assign methods if they are provided.

4.1 PspellConfig

The PspellConfig class is used to hold configuration information. It has a set of keys which it will accept.
Inserting or even trying to look at a key that it does not know will produce an error. Extra accepted keys can be added with the set_extra method.

    PspellConfig * clone() const
    void assign(const PspellConfig *)
        If the two objects are not of the exact same type the behavior of assign is undefined.
    int error_num()
    const char * error_message()
        The string is valid until the next error.
    void set_extra(const PspellKeyInfo * begin, const PspellKeyInfo * end)
    const PspellKeyInfo * keyinfo(const char * key) const
    PspellKeyInfoEmulation * possible_elements(bool include_extra = true) const
    const char * get_default(const char * key) const
    PspellStringPairEmulation * elements() const
    bool insert(const char * key, const char * value)
        Insert will NOT overwrite an existing entry.
    bool replace(const char * key, const char * value)
    bool remove(const char * key)

All the retrieve methods will

    1. return the default if the value is not set
    2. give an error if the key is not known
    3. give an error if the value is not in the right format

    const char * retrieve(const char * key) const
    const char * retrieve_list(const char * key) const
    bool retrieve_list(const char * key, PspellMutableContainer &) const
    int retrieve_bool(const char * key) const
        Returns -1 on error, 0 if false, 1 if true.
    int retrieve_int(const char * key) const
        Returns -1 on error.

    PspellConfig * new_pspell_config()
        Returns a new config class for setting things up before a manager class is created.
    delete_pspell_config(PspellConfig *)
        Deletes a PspellConfig class. You can also use the standard C++ delete.

4.2 PspellManager

This class is responsible for keeping track of the dictionaries, coming up with suggestions, and the like. Its methods are NOT meant to be used by multiple threads and/or documents. If you wish to have more than one language per document, simply have multiple manager classes for each document, but DO NOT share a manager class between more than one document.
Almost all of the manipulation of options is done via the Config class, thus this class has precious few methods.

    int error_num()
    const char * error_message()
        The string is valid until the next error.
    PspellConfig & config()
    const PspellConfig & config() const
        The config returned is NOT the same object as the one you pass in.
    const char * lang_name() const
    bool check(STRING) const
    bool add_to_personal(STRING)
    bool add_to_session(STRING)
    PspellWordList & master_word_list() const
    PspellWordList & personal_word_list() const
    PspellWordList & session_word_list() const
        Because the word lists may potentially have to convert from non-Unicode to Unicode or vice versa, the pointer returned by the emulation is only valid until the next call.
    bool save_all_wls()
    void clear_session()
    PspellWordList & suggest(STRING)
        The suggestion list and the elements in it are only valid until the next call to suggest.
    bool store_repl(STRING mis, STRING cor)

    PspellManager * new_pspell_manager(const PspellConfig * config)
        Returns a new manager class, allocated with new, based on the settings in config.
    delete_pspell_manager(const PspellManager *)
        Deletes a PspellManager class. You may also use the standard C++ delete.

4.3 PspellWordList

    bool empty() const
    int size() const
    PspellStringEmulation * elements() const
    PspellShortUniStringEmulation * short_uni_elements() const
    PspellUniStringEmulation * uni_elements() const

4.4 PspellMutableWordList (public PspellWordList)

    int error_num()
    const char * error_message()
        The string is valid until the next error.
    bool add(STRING)
    bool clear_all()
    bool save()

    PspellMutableWordList * new_pspell_personal_word_list(PspellConfig *)
        Returns a new personal word list so that you can manage it.
    delete_pspell_mutable_word_list(PspellMutableWordList *)
        Deletes a PspellMutableWordList. You may also use the standard C++ delete.

4.5 PspellEmulation

    PspellEmulation * clone() const
    void assign(const PspellEmulation *)
        If the two objects are not of the exact same type the behavior of assign is undefined.
    delete_pspell_emulation(PspellEmulation *)
        Deletes a PspellEmulation. You may also use the standard C++ delete.

4.6 Pspell*Emulation (public PspellEmulation)

All emulations have the following two methods:

    <type> next()
    bool at_end() const

where <type> is specific to the particular emulation, as given by the following table:

    Name                             Type
    PspellStringEmulation            const char *
    PspellShortUniStringEmulation    const u16int *
    PspellUniStringEmulation         const u32int *
    PspellKeyInfoEmulation           PspellKeyInfo *
    PspellStringPairEmulation        PspellStringPair

4.7 Other minor classes

    class PspellMutableContainer {
    public:
      virtual void insert(const char *) = 0;
      virtual void remove(const char *) = 0;
      virtual void clear() = 0;
      PspellMutableContainer();
    };

    enum PspellKeyInfoType {Bool, String, Int, List};

    struct PspellKeyInfo {
      const char * name;
      PspellKeyInfoType type;
      const char * def;
      const char * desc; // null if internal value
    };

    struct PspellStringPair {
      const char * first;
      const char * second;
    };

5 C Interface

An extern "C" interface will also be provided. Methods will be mapped to functions in the following manner:

    <class name in lowercase with underscores>_<method name>([const] <Class> *, <other parameters if any>)

For example "PspellManager::lang_name() const" would become "pspell_manager_lang_name(const PspellManager *)". For methods which overload based on the string type, the u16int and u32int methods will be mapped the same way with a final _16 or _32 added to the function name. For example "PspellManager::check(const u16int *)" would have a function name of pspell_manager_check_16. Methods that return a bool will instead return an int in the C interface.

6 Modern C++ Helper Classes

An almost forward iterator class will be provided to wrap the Pspell*Emulation classes in. It is almost a forward iterator because two iterators cannot be compared to each other, except to check whether an iterator is at the end.

I strongly recommend the use of auto_ptr with all pointers returned.
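To make the "almost forward iterator" idea concrete, here is a rough sketch of how such a wrapper might look. It is not part of the proposed interface: StringEmulation, VectorEmulation, EmulationIterator and collect are hypothetical names made up for this example, and StringEmulation only mimics the next()/at_end() shape of PspellStringEmulation so the sketch can run without the library. It also assumes a C++11 compiler and uses std::unique_ptr where the text says auto_ptr, since auto_ptr was deprecated in C++11 and removed in C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for PspellStringEmulation: next() returns the
// next word, or a null pointer once the list is exhausted.
class StringEmulation {
public:
  virtual ~StringEmulation() {}
  virtual const char * next() = 0;
  virtual bool at_end() const = 0;
};

// Hypothetical in-memory emulation, present only so the sketch can run.
class VectorEmulation : public StringEmulation {
public:
  explicit VectorEmulation(const std::vector<std::string> & words)
    : words_(words), pos_(0) {}
  const char * next() override {
    return pos_ < words_.size() ? words_[pos_++].c_str() : nullptr;
  }
  bool at_end() const override { return pos_ >= words_.size(); }
private:
  std::vector<std::string> words_;
  std::size_t pos_;
};

// "Almost" forward iterator: it can be advanced and dereferenced, but the
// only comparison it supports is the check against the end position.
class EmulationIterator {
public:
  // A default-constructed iterator is already at the end.
  EmulationIterator() : current_(nullptr) {}
  // Takes ownership of the emulation and fetches the first element.
  explicit EmulationIterator(StringEmulation * e)
    : emulation_(e), current_(e->next()) {}
  const char * operator*() const {
    assert(current_ != nullptr);
    return current_;
  }
  EmulationIterator & operator++() {
    assert(emulation_);
    current_ = emulation_->next();
    return *this;
  }
  bool at_end() const { return current_ == nullptr; }
private:
  std::unique_ptr<StringEmulation> emulation_;  // freed automatically
  const char * current_;  // only used until the next call to next()
};

// Copies every word out of an emulation; the iterator deletes it afterwards.
std::vector<std::string> collect(StringEmulation * e) {
  std::vector<std::string> out;
  for (EmulationIterator it(e); !it.at_end(); ++it)
    out.push_back(*it);
  return out;
}
```

With something like this, the suggestion loop from section 3 would need no explicit delete, because the smart pointer inside the iterator frees the emulation when the iterator goes out of scope.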
All pointers returned that you are responsible for freeing can be deleted with the standard C++ delete.

These helper classes will be provided in separate header files so those who do not wish to use them will not have to.

7 Rationale

7.1 store_repl method

This method is needed because Aspell (http://aspell.sourceforge.net/) is able to learn from users' misspellings. For example, on the first pass a user misspells beginning as beging, so Aspell suggests:

    begging, begin, being, Beijing, bagging, ....

However the user then tries "begning" and Aspell suggests

    beginning, beaning, begging, ...

so the user selects beginning. Then, later on in the document, the user misspells it as begng (NOT beging). Normally Aspell would suggest:

    began, begging, begin, begun, ....

However, because it knows the user misspelled beginning as beging, it will instead suggest:

    beginning, began, begging, begin, begun ...

I myself often misspelled beginning (and still do) as something close to begging, and too many times wind up writing sentences such as "begging with ....".

8 Timeframe

An alpha version of this interface should be available by the end of March 2000 or the beginning of April. An Aspell (http://aspell.sourceforge.net) module will also be provided. I am hoping someone else will come up with the Ispell (http://fmg-www.cs.ucla.edu/fmg-members/geoff/ispell.html) module. Modules for other spell checkers are more than welcome.

9 Future

Future versions of the interface will provide better support for multilingual documents, as well as methods for spell checking whole regions of text. Letting the spell checker check a whole region of text will allow it to skip over formatting commands, URLs, and the like.

10 Feedback

As always, feedback is most appreciated.
I can be contacted at [EMAIL PROTECTED]

11 Other Formats

This document is available in several other formats:

    Format   Location
    HTML     http://pspell.sourceforge.net/interface.html
    Text     http://pspell.sourceforge.net/interface.txt
    TeX      http://pspell.sourceforge.net/interface.tex
    PS       http://pspell.sourceforge.net/interface.ps
    Dvi      http://pspell.sourceforge.net/interface.dvi
    LyX      http://pspell.sourceforge.net/interface.lyx

---
Kevin Atkinson
[EMAIL PROTECTED]
http://metalab.unc.edu/kevina/