Hmm, I have found a small bug:
When there is a compound affix with a zero-length search pattern (which
should not happen!), the ispell dictionary ignores all other compound affixes.
The original affix file contains
flag ~\`:
E > -E,NINGS#~ avskrive > avskrivnings-
Z Y Z Y
Norwegian (Nynorsk and Bokmaal) ispell dictionaries are available from
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
I didn't test them.
Oleg
On Fri, 17 Feb 2006, Teodor Sigaev wrote:
Very strange...
~% file tsearch/dict/ispell_no/norwegian.dict
tsearch/dict/ispell_no/n
BTW, if you took the Norwegian dictionary from
http://folk.uio.no/runekl/dictionary.html, then try building it from the OpenOffice
sources (http://lingucomponent.openoffice.org/spell_dic.html, tsearch2/my2ispell).
I found mails in my archive which say that Norwegians prefer the OpenOffice
one.
--
Very strange...
~% file tsearch/dict/ispell_no/norwegian.dict
tsearch/dict/ispell_no/norwegian.dict: ISO-8859 C program text
~% file tsearch/dict/ispell_no/norwegian.aff
tsearch/dict/ispell_no/norwegian.aff: ISO-8859 English text
Can you place those files somewhere where I can download them
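The `file` output above reports both dictionary files as ISO-8859. A rough way to double-check whether a dictionary or affix file is valid UTF-8 is to try decoding its bytes. This is only a sketch: the function name `looks_like_utf8` and the sample word are made up for illustration and are not part of tsearch2.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8.

    Note: pure ASCII also decodes as UTF-8, so True does not prove
    the file contains any non-ASCII characters at all.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample: the same Norwegian word in two encodings.
sample_latin1 = "blåbær".encode("iso-8859-1")
sample_utf8 = "blåbær".encode("utf-8")

print(looks_like_utf8(sample_latin1))
print(looks_like_utf8(sample_utf8))
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to the same function.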
Hello,
Thanks for your efforts, I still don't get it to work.
I now tried the norwegian example. My encoding is ISO-8859 (I never
used UTF-8, because I thought it would be slower, the thread name is
a bit misleading).
So I am using an ISO-8859-9 database:
~/cvs/ssd% psql -l
Name
On 1/30/06, Oleg Bartunov wrote:
> On Fri, 27 Jan 2006, Harald Armin Massa wrote:
>
> > Teodor,
> >
> >>
> >> To all: Maybe we should put all of Snowball's stemmers (for all available
> >> languages and encodings) into the tsearch2 directory?
> >
> >
> > Yes, that would be VERY helpful. Up to now I do n
On Fri, 27 Jan 2006, Harald Armin Massa wrote:
Teodor,
To all: Maybe we should put all of Snowball's stemmers (for all available
languages and encodings) into the tsearch2 directory?
Yes, that would be VERY helpful. Up to now I do not dare to use tsearch2
because "get stemmer here, get dictionar
contrib_regression=# insert into pg_ts_dict values (
'norwegian_ispell',
(select dict_init from pg_ts_dict where dict_name='ispell_template'),
'DictFile="/usr/local/share/ispell/norsk.dict" ,'
'AffFile ="/usr/local/share/ispell/norsk.aff"',
(select d
Teodor,
> To all: Maybe we should put all of Snowball's stemmers (for all available
> languages and encodings) into the tsearch2 directory?
Yes, that would be VERY helpful. Up to now I do not dare to use tsearch2 because "get stemmer here, get dictionary there..."
Harald
--
GHUM Harald Massa
persuadere et progra
Alexander,
could you try tsearch2 from CVS HEAD?
tsearch2 in 8.1.X doesn't support UTF-8 and works for some people
only by accident :)
Oleg
On Fri, 27 Jan 2006, Alexander Presber wrote:
Tsearch/ispell is not able to break this word into parts, because of the
"s" in "Produktion/s/interva
I should add that, with the minimal dictionary and .aff file,
"vertrags" gets reduced alright, dropping the trailing 's':
tstest=# SELECT tsearch2.ts_debug('vertrags');
ts_debug
-
(german,lword,"La
Tsearch/ispell is not able to break this word into parts, because
of the "s" in "Produktion/s/intervall". Misspelling the word as
"Produktionintervall" fixes it:
There should be affixes marked as 'affix in the middle of a compound word'.
The flag is '~'; for an example, look in the norsk dictionary:
flag ~\\:
[^S
On Wed, 23 Nov 2005, Hannes Dorbath wrote:
Hi,
I'm on PG 8.0.4, initDB and locale set to de_DE.UTF-8, FreeBSD.
My TSearch config is based on "Tsearch2 and Unicode/UTF-8" by Markus Wollny
(http://tinyurl.com/a6po4).
The following files are used:
http://hannes.imos.net/german.med [U
Tsearch/ispell is not able to break this word into parts, because of the
"s" in "Produktion/s/intervall". Misspelling the word as
"Produktionintervall" fixes it:
There should be affixes marked as 'affix in the middle of a compound word'.
The flag is '~'; for an example, look in the norsk dictionary:
flag ~\\:
[^S]
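To illustrate why the linking "s" matters, here is a toy sketch of compound splitting. It is not the tsearch2/ispell algorithm: the function `split_compound`, the stem set and the `linkers` parameter are all hypothetical, and a real dictionary would take the allowed linking elements from the affixes flagged with '~'.

```python
def split_compound(word, stems, linkers=("s",)):
    """Split `word` into known stems, optionally skipping a linking
    element between two parts. Returns a list of stems or None."""
    word = word.lower()
    if not word:
        return None
    # Try the longest known prefix first.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head not in stems:
            continue
        rest = word[end:]
        if not rest:
            return [head]
        # Try the remainder as-is, then with a linking element removed.
        candidates = [rest]
        for link in linkers:
            if rest.startswith(link) and len(rest) > len(link):
                candidates.append(rest[len(link):])
        for cand in candidates:
            tail = split_compound(cand, stems, linkers)
            if tail is not None:
                return [head] + tail
    return None


# Hypothetical mini-dictionary, for illustration only.
stems = {"produktion", "intervall"}

# Without any linking element the correctly spelled word cannot be split,
# while the misspelled form without the "s" can.
print(split_compound("Produktionsintervall", stems, linkers=()))
print(split_compound("Produktionintervall", stems, linkers=()))
# Allowing "s" as a linking element makes the correct spelling splittable.
print(split_compound("Produktionsintervall", stems))
```

The first call mirrors the behaviour described in the mail: with no linking affix defined, only the misspelled "Produktionintervall" breaks into parts.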
Hi,
I'm on PG 8.0.4, initDB and locale set to de_DE.UTF-8, FreeBSD.
My TSearch config is based on "Tsearch2 and Unicode/UTF-8" by Markus
Wollny (http://tinyurl.com/a6po4).
The following files are used:
http://hannes.imos.net/german.med [UTF-8]
http://hannes.imos.net/german.aff