Oleg, as I mentioned earlier, I have a different .affix file that I got from Andrew, along with the stop file. I get no errors when creating the dictionary with that one, but I get nothing out of ts_lexize. The size of that one is 406,219 bytes, and the size of the hunspell one (the first) is 406,229 bytes.
A little too close, don't you think? It might be that the Arabic hunspell (ayaspell) affix file is damaged on some lines and that the one I got from Andrew is a fixed version. Just wanted to let you know.

/ Moe

On Mon, Feb 2, 2009 at 3:25 PM, Mohamed <mohamed5432154...@gmail.com> wrote:

> Ok, thank you Oleg.
> I have another dictionary package, which is also a conversion to hunspell:
>
> http://wiki.services.openoffice.org/wiki/Dictionaries#Arabic_.28North_Africa_and_Middle_East.29
> (Conversion of Buckwalter's Arabic morphological analyser) 2006-02-08
>
> Running that gives me this error (again in the affix file):
>
> ERROR: wrong affix file format for flag
> CONTEXT: line 560 of configuration file "C:/Program Files/PostgreSQL/8.3/share/tsearch_data/arabic_utf8_alias.affix": "PFX 1013 Y 6
> "
>
> / Moe
>
> On Mon, Feb 2, 2009 at 2:41 PM, Oleg Bartunov <o...@sai.msu.su> wrote:
>
>> Mohamed,
>>
>> We are looking into the problem.
>>
>> Oleg
>>
>> On Mon, 2 Feb 2009, Mohamed wrote:
>>
>>> No, I don't. But ts_lexize doesn't return anything, so I figured there must be an error somewhere.
>>> I think we are using the same dictionary, except that I am using the stopwords file and a different affix file, because using the hunspell (ayaspell) .aff gives me this error:
>>>
>>> ERROR: wrong affix file format for flag
>>> CONTEXT: line 42 of configuration file "C:/Program Files/PostgreSQL/8.3/share/tsearch_data/hunarabic.affix": "PFX Aa Y 40
>>>
>>> / Moe
>>>
>>> On Mon, Feb 2, 2009 at 12:13 PM, Daniel Chiaramello <daniel.chiarame...@golog.net> wrote:
>>>
>>>> Hi Mohamed.
>>>>
>>>> I don't know where you got the dictionary. I tried the OpenOffice one (the Ayaspell one) myself without success, and I had no Arabic stopwords file.
>>>>
>>>> Renaming the file is supposed to be enough (I did it successfully for a Thai dictionary): the ".aff" file becomes the ".affix" one.
>>>> When I tried to create the dictionary:
>>>>
>>>> CREATE TEXT SEARCH DICTIONARY ar_ispell (
>>>>     TEMPLATE = ispell,
>>>>     DictFile = ar_utf8,
>>>>     AffFile = ar_utf8,
>>>>     StopWords = english
>>>> );
>>>>
>>>> I had an error:
>>>>
>>>> ERREUR: mauvais format de fichier affixe pour le drapeau
>>>> CONTEXTE : ligne 42 du fichier de configuration « /usr/share/pgsql/tsearch_data/ar_utf8.affix » : « PFX Aa Y 40
>>>>
>>>> (which means "bad format of affix file for flag", line 42 of the configuration file)
>>>>
>>>> Do you have an error when creating your dictionary?
>>>>
>>>> Daniel
>>>>
>>>> Mohamed wrote:
>>>>
>>>> I have run into some problems here.
>>>> I am trying to implement Arabic full-text search on three columns.
>>>>
>>>> To create a dictionary I have a hunspell dictionary and an Arabic stop file.
>>>>
>>>> CREATE TEXT SEARCH DICTIONARY hunspell_dic (
>>>>     TEMPLATE = ispell,
>>>>     DictFile = hunarabic,
>>>>     AffFile = hunarabic,
>>>>     StopWords = arabic
>>>> );
>>>>
>>>> 1) The problem is that the hunspell package contains a .dic and a .aff file, but the configuration requires a .dict and a .affix file. I have tried changing the extensions, but with no success.
>>>>
>>>> 2) ts_lexize('hunspell_dic', 'ARABIC WORD') returns nothing.
>>>>
>>>> 3) How can I convert my .dic and .aff to valid .dict and .affix files?
>>>>
>>>> 4) I have read that when using dictionaries, a word that is not recognized by any dictionary will not be indexed. I find that troublesome. I would like everything but the stop words to be indexed. I guess this might be a step that I am not ready for yet, but I just wanted to put it out there.
>>>>
>>>> Also, I would like to know what the process of implementing full-text search looks like, from config to search.
>>>>
>>>> Create a dictionary, then a text search configuration, add the dictionary to the configuration, index the columns with GIN or GiST ...
>>>>
>>>> What does a search look like?
>>>> Does it match against the GIN/GiST index?
>>>> Has that index been built using the dictionary/configuration, or is the dictionary only used on search phrases?
>>>>
>>>> / Moe
>>
>> Regards,
>> Oleg
>> _____________________________________________________________
>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
>> Sternberg Astronomical Institute, Moscow University, Russia
>> Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
>> phone: +007(495)939-16-83, +007(495)939-23-83
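
For the "from config to search" question in the original post, the usual flow might be sketched in SQL roughly as follows. This is only a sketch under assumptions, not a tested setup for this dictionary: the table docs, its columns title, summary and body, the configuration name arabic_cfg and the index name docs_fts_idx are all hypothetical, and hunspell_dic is the dictionary from the thread, assumed to load without errors.

```sql
-- Hypothetical configuration name; start from the built-in "simple" configuration.
CREATE TEXT SEARCH CONFIGURATION arabic_cfg (COPY = simple);

-- Send non-ASCII word tokens to hunspell_dic first. The "simple" dictionary is
-- listed second as a fallback, so words that hunspell_dic does not recognize
-- are passed on to it and still get indexed.
ALTER TEXT SEARCH CONFIGURATION arabic_cfg
    ALTER MAPPING FOR word, hword, hword_part
    WITH hunspell_dic, simple;

-- Hypothetical table holding the three columns to be searched.
CREATE TABLE docs (
    id      serial PRIMARY KEY,
    title   text,
    summary text,
    body    text
);

-- Expression index: the tsvector is computed with arabic_cfg at index time.
CREATE INDEX docs_fts_idx ON docs USING gin (
    to_tsvector('arabic_cfg',
                coalesce(title, '') || ' ' || coalesce(summary, '') || ' ' || coalesce(body, ''))
);

-- Search: the query text is normalized with the same configuration, and the
-- WHERE expression matches the index expression so the planner can use the index.
SELECT id, title
FROM docs
WHERE to_tsvector('arabic_cfg',
                  coalesce(title, '') || ' ' || coalesce(summary, '') || ' ' || coalesce(body, ''))
      @@ plainto_tsquery('arabic_cfg', 'ARABIC WORD');
```

In this sketch the configuration, and therefore the dictionary, is applied twice: once when the tsvector is built for the index and once when the search phrase is turned into a tsquery. Both sides need to name the same configuration for the lexemes to match.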