Re: [GENERAL] Text search dictionary vs. the C locale

2017-07-02 Thread twoflower
Tom Lane-2 wrote > Presumably the problem is that the dictionary file parsing functions reject > anything that doesn't satisfy t_isalpha() (unless it matches t_isspace()) > and in C locale that's not going to accept very much. That's what I also guessed and the fact that setting lc-ctype=en_US.UTF-8

Re: [GENERAL] Text search dictionary vs. the C locale

2017-07-02 Thread Gmail
> On Jul 2, 2017, at 10:06 AM, Tom Lane wrote: > > twoflower writes: >> I am having problems creating an Ispell-based text search dictionary for >> Czech language. > >> Issuing the following command: > >> create text search dictionary czech_ispell ( >> template = ispell, >> dictfile = czec

Re: [GENERAL] Text search dictionary vs. the C locale

2017-07-02 Thread Gmail
Sent from my iPad > On Jul 2, 2017, at 10:06 AM, Tom Lane wrote: > > twoflower writes: >> I am having problems creating an Ispell-based text search dictionary for >> Czech language. > >> Issuing the following command: > >> create text search dictionary czech_ispell ( >> template = ispell,

Re: [GENERAL] Text search dictionary vs. the C locale

2017-07-02 Thread Tom Lane
twoflower writes: > I am having problems creating an Ispell-based text search dictionary for > Czech language. > Issuing the following command: > create text search dictionary czech_ispell ( > template = ispell, > dictfile = czech_ispell, > affFile = czech_ispell > ); > ends with > ERROR

Re: [GENERAL] Text search dictionary vs. the C locale

2017-07-02 Thread twoflower
Initializing the cluster with initdb   --locale=C   --lc-ctype=en_US.UTF-8   --lc-messages=en_US.UTF-8   --lc-monetary=en_US.UTF-8   --lc-numeric=en_US.UTF-8   --lc-time=en_US.UTF-8   --encoding=UTF8 allows me to use my text search dictionary. Now it only remains to see whether index creation wi

Re: [GENERAL] text search synonym dictionary anomaly with numbers

2011-11-27 Thread Richard Greenwood
To answer my own question - my synonym dictionary was not being applied to '1st' because '1st' is a numword, not an asciiword, and my synonym dictionary was not mapped to numword. To map a dictionary to a token class: ALTER TEXT SEARCH CONFIGURATION english ALTER MAPPING FOR numword WITH my_synonym_dic

Re: [GENERAL] text search synonym dictionary anomaly with numbers

2011-11-27 Thread Richard Greenwood
Oleg, Thank you. I am sure that you have identified my problem. \dF+ english (output below) lists my dictionary which is named 'rwg_synonym' before numword so I would have thought that my dictionary would have normalized '1st' to '1' before the numword dictionary was reached. Maybe this question

Re: [GENERAL] text search synonym dictionary anomaly with numbers

2011-11-27 Thread Oleg Bartunov
Richard, you should check your mapping - '1st' belongs to 'numword' and may be processed in a different way than 'first' or '1'. Oleg On Sat, 26 Nov 2011, Richard Greenwood wrote: I am working with street address data in which 'first st' has been entered as '1 st' and so on. So I have created

Re: [GENERAL] Text search parser's treatment of URLs and emails

2011-02-01 Thread Bruce Momjian
I have added this as a TODO: * Improve handling of plus signs in email address user names, and perhaps improve URL parsing * http://archives.postgresql.org/pgsql-hackers/2010-10/msg00772.php -

Re: [GENERAL] Text search parser's treatment of URLs and emails

2010-10-12 Thread Bruce Momjian
Thom Brown wrote: > Hi, > > I noticed that if I run this: > > SELECT alias, description, token FROM > ts_debug('http://www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary'); > > I get: > > alias | description | token > --+-

Re: [GENERAL] Text search parser's treatment of URLs and emails

2010-09-29 Thread Thom Brown
On 8 September 2010 21:48, Thom Brown wrote: > Hi, > > I noticed that if I run this: > > SELECT alias, description, token FROM > ts_debug('http://www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary'); > > I get: > >  alias   |  description  |                              t

Re: [GENERAL] Text search

2010-03-16 Thread Chris Roffler
Richard thanks for the pointers unfortunately its not just attribute names. Here is what I am thinking of doing; In a first step I run a query SELECT id FROM time_series WHERE to_tsvector(xml_string) @@ to_tsquery( anystring ); then I load the actual xml string into memory for each i

Re: [GENERAL] Text search

2010-03-16 Thread Richard Huxton
On 16/03/10 13:49, Richard Huxton wrote: You could run an xslt transform over the xml fragments and extract what you want and then use tsearch to index that, I suppose. Similarly, you might be able to do the same via xslt and xquery. Actually, if it's only attribute names you're interested in y

Re: [GENERAL] Text search

2010-03-16 Thread Richard Huxton
On 16/03/10 12:36, Chris Roffler wrote: Richard I tried all that and you can see it on this thread, there are some limitations on indexs on xpath work http://archives.postgresql.org/pgsql-general/2010-03/msg00270.php OK - I'v

Re: [GENERAL] Text search

2010-03-16 Thread Chris Roffler
Richard I tried all that and you can see it on this thread, there are some limitations on indexs on xpath work http://archives.postgresql.org/pgsql-general/2010-03/msg00270.php On Tue, Mar 16, 2010 at 2:21 PM, Richard Huxton wr

Re: [GENERAL] Text search

2010-03-16 Thread Richard Huxton
On 16/03/10 10:29, Chris Roffler wrote: I have a text column in a table. We store XML in this column. Now I want to search for tags and values select * from where to_tsvector('english',xml_column) @@ to_tsquery('Citi Bank') This works fine but it also works for any tag as long as the nam

Re: [GENERAL] text search in 8.1

2010-02-22 Thread Albe Laurenz
AI Rumman wrote: > I have a plan to upgrade database, but right now I have to > use text search indexing for performance improvement. > > Following is the rpm status of my server: > > [r...@vcrmdev01 ~]# rpm -qa|grep postgres > postgresql-8.1.11-1.el5_1.1 > postgresql-python-8.1.11-1.el5_1.1 >

Re: [GENERAL] text search in 8.1

2010-02-22 Thread AI Rumman
I have a plan to upgrade database, but right now I have to use text search indexing for performance improvement. Following is the rpm status of my server: [r...@vcrmdev01 ~]# rpm -qa|grep postgres postgresql-8.1.11-1.el5_1.1 postgresql-python-8.1.11-1.el5_1.1 postgresql-server-8.1.11-1.el5_1.1 po

Re: [GENERAL] text search in 8.1

2010-02-22 Thread David Fetter
On Mon, Feb 22, 2010 at 02:47:00PM +0600, AI Rumman wrote: > Does Postgresql 8.1 support Full Text Search? > If yes, please provide the link about documentation. It's available as an add-on, but since 8.1 is so close to its end of life, consider moving to 8.4 first, or if the project is out past Q

Re: [GENERAL] Text search without dictionary ?

2009-06-09 Thread Oleg Bartunov
Marc, we'll probably add option to simple dictionary for 8.5, but I think if you were able to write your own parser it'd be not difficult to write 'simplest' dictionary, which does nothing. Just take simple dictionary and remove lowercasing :) Oleg On Tue, 9 Jun 2009, Marc Mamin wrote: Hello,

Re: [GENERAL] Text search, ERROR: invalid byte sequence for encoding "UTF8": 0xe9640a

2009-02-03 Thread Oleg Bartunov
James, you forgot to convert files to UTF8. iconv -f ISO8859-1 -t utf8 en_GB.dic > en_gb.dict iconv -f ISO8859-1 -t utf8 en_GB.aff > en_gb.affix Oleg On Tue, 3 Feb 2009, James Dooley wrote: I downloaded the hunspell en_GB from http://wiki.services.openoffice.org/wiki/Dictionaries#English_.28A

Re: [GENERAL] Text search, ERROR: invalid byte sequence for encoding "UTF8": 0xe9640a

2009-02-03 Thread James Dooley
It's postgresql-8.3.5-2 (windows) On Tue, Feb 3, 2009 at 4:37 PM, Tom Lane wrote: > James Dooley writes: > > and when building the Ispell dictionary I got the following error > > > ERROR: invalid byte sequence for encoding "UTF8": 0xe9640a > > What PG version? 8.3.x before 8.3.4 had some pr

Re: [GENERAL] Text search, ERROR: invalid byte sequence for encoding "UTF8": 0xe9640a

2009-02-03 Thread Tom Lane
James Dooley writes: > and when building the Ispell dictionary I got the following error > ERROR: invalid byte sequence for encoding "UTF8": 0xe9640a What PG version? 8.3.x before 8.3.4 had some problems in this area. regards, tom lane

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
On Thu, Jan 29, 2009 at 6:53 PM, Tommy Gildseth wrote: > Thanks a lot. Exceptional response time :D > Less than 2.5 hours from problem reported, till a patch was made. Don't > think there's many projects or commercial products that can compete with > that ;-) Oh, wait , it still has to go through

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth
Teodor Sigaev wrote: I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, patch is attached and I'm going to commit it Thanks a lot. Exceptional response time :D Less than 2.5 hours from problem reported, till a patch was made. Don't think there's many projects or commer

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev
char" issue? Does this affect the old contrib/tsearch2 code? Checked - No, that was improvement for 8.3 :). -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Teodor Sigaev writes: > Tom Lane wrote: >> Hmm, seems it's not so much a "64 bit" error as a "signed vs unsigned >> char" issue? > Yes, but I don't understand why it worked in 32-bit box. You were casting to unsigned int. So the offset added to the base pointer for, say, 255 in the char would

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Gregory Stark writes: > I really think he should just change all the "unsigned int" into "unsigned > char" and not do the type punning with pointer casts. That's just evil. Oh, I see. That would work too, but I don't really see that it's a huge improvement. What *would* be an improvement IMHO i

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Gregory Stark
Tom Lane writes: > Gregory Stark writes: >> Maybe I'm missing something but I don't understand how this fixes the >> problem. >> s is a "char*" so type punning it to an unsigned char * before dereferencing >> it is really the same as casting it to unsigned char directly > > No, it isn't. If c

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Teodor Sigaev writes: > Tom Lane wrote: >> Please try to make the commits in the next eight hours, as we have >> release wraps scheduled for tonight. > Minor versions or beta of 8.4? This is just back-branch update releases. 8.4 beta is still a good ways off :-( regards

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
On Thu, Jan 29, 2009 at 4:06 PM, Gregory Stark wrote: > Gregory Stark writes: > Ah, I understand how this fixes the problem. You were casting to unsigned > *int* not unsigned char so it was sign extending first and then overflowing. :) > It still seems to me if you put a few "unsigned" in varia
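
The sign-extension effect discussed in this thread can be reproduced in isolation. What follows is a minimal sketch, not the PostgreSQL source: the function names flag_index_buggy and flag_index_fixed are hypothetical, and the input is declared as signed char so the behaviour is the same on every platform (plain char is only signed on some of them, which is why the crash did not show up everywhere).

```c
#include <assert.h>

/* Hypothetical helper mirroring the pre-patch cast "(unsigned int) *s".
 * A negative signed char is first promoted (sign-extended) to int and
 * only then converted to unsigned int, so a byte such as 0xE9 becomes
 * a huge value instead of 233. */
unsigned int flag_index_buggy(const signed char *s)
{
    return (unsigned int) *s;
}

/* Hypothetical helper mirroring the patched form "*(unsigned char *) s".
 * The byte is read as unsigned char, so the result is always 0..255. */
unsigned int flag_index_fixed(const signed char *s)
{
    unsigned int idx = *(const unsigned char *) s;

    assert(idx < 256);          /* safe to use as an index into a 256-entry table */
    return idx;
}
```

Calling both helpers on a byte with the high bit set (for example 0xE9) gives 233 from the fixed version and a value far above 255 from the buggy one, which is the out-of-range index behind the segfault discussed above.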

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Gregory Stark
Gregory Stark writes: > Teodor Sigaev writes: > >> I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, >> patch >> is attached and I'm going to commit it > ... > >> !Conf->flagval[(unsigned int) *s] = (unsigned char) val; > ... >> !Conf->flagval[*(unsigned char*) s]

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev
To be honest, looking through that file, I am quite worried about few points. I don't know too much about insights of ispell, but I see few suspicious things in mkSPNode too. I generally don't want to get involve in reviewing code for stuff I don't know, But if Teodor (and Oleg) don't mind, I can

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Devrim GÜNDÜZ
On Thu, 2009-01-29 at 19:00 +0300, Teodor Sigaev wrote: > > Please try to make the commits in the next eight hours, as we have > > release wraps scheduled for tonight. > > Minor versions or beta of 8.4? Minor versions. -- Devrim GÜNDÜZ, RHCE devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gun

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Gregory Stark writes: > Maybe I'm missing something but I don't understand how this fixes the problem. > s is a "char*" so type punning it to an unsigned char * before dereferencing > it is really the same as casting it to unsigned char directly No, it isn't. If char is signed then you'll get qu

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev
Tom Lane wrote: Teodor Sigaev writes: I reproduced the bug with a help of Grzegorz's point for 64-bit box. Hmm, seems it's not so much a "64 bit" error as a "signed vs unsigned char" issue? Yes, but I don't understand why it worked in 32-bit box. Does this affect the old contrib/tsear

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Gregory Stark
Teodor Sigaev writes: > I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, patch > is attached and I'm going to commit it ... > ! Conf->flagval[(unsigned int) *s] = (unsigned char) val; ... > ! Conf->flagval[*(unsigned char*) s] = (unsigned char) val; Maybe I'm mis

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Teodor Sigaev writes: > I reproduced the bug with a help of Grzegorz's point for 64-bit box. Hmm, seems it's not so much a "64 bit" error as a "signed vs unsigned char" issue? Does this affect the old contrib/tsearch2 code? Please try to make the commits in the next eight hours, as we have rele

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
On Thu, Jan 29, 2009 at 3:32 PM, Teodor Sigaev wrote: > > >> Than I have quite few notes about that function: >> - affix is not checked on entry, and should be unsigned, > > Could be Assert( affix>=0 && affix < Conf->nAffixData ) > wouldn't that crash pg backend too ? The structure that this file

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
On Thu, Jan 29, 2009 at 3:26 PM, Teodor Sigaev wrote: > I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, > patch is attached and I'm going to commit it :) To be honest, looking through that file, I am quite worried about few points. I don't know too much about insights of

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev
Than I have quite few notes about that function: - affix is not checked on entry, and should be unsigned, Could be Assert( affix>=0 && affix < Conf->nAffixData ) - for sake of safety uint32_t should be used instead of unsigned int, in the cast see patch - there should be some safety limit
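
The two suggestions in this exchange (check affix on entry, and index the flag table with an unsigned byte) can be combined in a small sketch. This is an illustration under stated assumptions, not the code that was committed: DemoDict, its fields, and demo_compound_flags are hypothetical stand-ins for IspellDict and makeCompoundFlags, and the standard assert() stands in for PostgreSQL's Assert().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the dictionary structure. */
typedef struct
{
    const char  **affix_data;     /* array of flag strings */
    int           n_affix_data;   /* number of entries in affix_data */
    unsigned char flagval[256];   /* flag bits per byte value */
} DemoDict;

/* OR together the flag bits of every byte in the selected flag string. */
uint32_t demo_compound_flags(const DemoDict *conf, int affix)
{
    uint32_t    flag = 0;
    const char *str;

    /* Range check on entry, as suggested in the thread. */
    assert(affix >= 0 && affix < conf->n_affix_data);

    str = conf->affix_data[affix];
    while (str && *str)
    {
        /* Read the byte as unsigned char so the index stays in 0..255. */
        flag |= conf->flagval[*(const unsigned char *) str];
        str++;
    }
    return flag;
}
```

Note that assert() aborts the process when the check fails, so in a server it only helps in debug builds; a production path would more likely report an error instead.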

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev
I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, patch is attached and I'm going to commit it -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ *** src/backend/tsearch/

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
if it's static uint32 makeCompoundFlags(IspellDict *Conf, int affix) { uint32 flag = 0; char *str = Conf->AffixData[affix]; while (str && *str) { flag |= Conf->flagval[(unsigned int) *str]; str++; } ret

Re: [GENERAL] Text search name and name synonims dictionary

2009-01-29 Thread Igor Katson
Oleg Bartunov wrote: On Thu, 29 Jan 2009, Igor Katson wrote: I have a column, containing the name of the user and there is a need to organize an indexed search on this column. As far as I understand, I need to use the full-text search capabilities of postgres. I would like to attach a dictio

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth
Teodor Sigaev wrote: How do I make a backtrace? - if you have coredump, just execute gdb /PATH1/postgres gdb /PATH2/core and type bt. Linux doesn't make core by default, so you allow to do it by ulimit -c unlimited for postgresql user - connect to db, and attach gdb to backend process: gdb /P

Re: [GENERAL] Text search name and name synonims dictionary

2009-01-29 Thread Oleg Bartunov
On Thu, 29 Jan 2009, Igor Katson wrote: I have a column, containing the name of the user and there is a need to organize an indexed search on this column. As far as I understand, I need to use the full-text search capabilities of postgres. I would like to attach a dictionary, containing many

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
to get coredump: sudo su - postgres ulimit -c unlimited pg_ctl restart Also, Teodor - mind the fact, that his machine is 64, and you've tested it on 32bits.

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev
How do I make a backtrace? - if you have coredump, just execute gdb /PATH1/postgres gdb /PATH2/core and type bt. Linux doesn't make core by default, so you allow to do it by ulimit -c unlimited for postgresql user - connect to db, and attach gdb to backend process: gdb /PATH1/postgres BACKEND

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth
I tried without specifying a StopWords-list as well, but same thing happens. Teodor Sigaev wrote: Could you provide a backtrace? Do you use unchanged norwegian.stop file? I'm not able to reproduce the bug - postgres just works. Tommy Gildseth wrote: While trying to create a new dictionary for

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth
Yes, originaly I used a customized norwegian.stop-file, but I changed that to the one that comes with PostgreSQL, and got the same error. How do I make a backtrace? Teodor Sigaev wrote: Could you provide a backtrace? Do you use unchanged norwegian.stop file? I'm not able to reproduce the bug -

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth
It works on one of my servers: SELECT version(); version - PostgreSQL 8.3.4 on i686-pc-linux-gnu, compiled by GCC cc (GCC) 3.2.3 20030502 (Red

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev
Could you provide a backtrace? Do you use unchanged norwegian.stop file? I'm not able to reproduce the bug - postgres just works. Tommy Gildseth wrote: While trying to create a new dictionary for use with PostgreSQL text search, I get a segfault. My Postgres version is 8.3.5 -- Teodor Sigaev

Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Oleg Bartunov
Tommy, I tried your example and didn't find any problem. My postgresql version is 8.3.3 and I didn't use stopwords, since I don't have them. arxiv=# select version(); version PostgreSQL 8.3.

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Oleg Bartunov
On Tue, 27 Jan 2009, Tommy Gildseth wrote: sorry, I don't know norwegian, what do you mean ? Did you complain that no_ispell doesn't recognize these words ? Yes, I'm sorry, I should have explained better. The words hemsedalsdans, hengesmykke and lærdalsbrua, are "concatenations" of the words

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Tommy Gildseth
Oleg Bartunov wrote: On Tue, 27 Jan 2009, Tommy Gildseth wrote: Tommy Gildseth wrote: Oleg Bartunov wrote: Have you read http://www.postgresql.org/docs/current/static/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY We suggest to use dictionaries which come with openoffice, hunspell

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Oleg Bartunov
On Tue, 27 Jan 2009, Tommy Gildseth wrote: Tommy Gildseth wrote: Oleg Bartunov wrote: Have you read http://www.postgresql.org/docs/current/static/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY We suggest to use dictionaries which come with openoffice, hunspell, probably has bette

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Tommy Gildseth
Tommy Gildseth wrote: Oleg Bartunov wrote: Have you read http://www.postgresql.org/docs/current/static/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY We suggest to use dictionaries which come with openoffice, hunspell, probably has better support of composite words. Thanks, tha

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Tommy Gildseth
Oleg Bartunov wrote: Have you read http://www.postgresql.org/docs/current/static/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY We suggest to use dictionaries which come with openoffice, hunspell, probably has better support of composite words. Thanks, that knocked me onto the r

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Oleg Bartunov
Have you read http://www.postgresql.org/docs/current/static/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY We suggest to use dictionaries which come with openoffice, hunspell, probably has better support of composite words. On Tue, 27 Jan 2009, Tommy Gildseth wrote: Oleg Bartunov wr

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Tommy Gildseth
Oleg Bartunov wrote: On Tue, 27 Jan 2009, Tommy Gildseth wrote: I'm trying to figure out how to use PostgreSQL's fulltext search with an ispell dictionary. I'm having a bit of trouble figuring out where this norwegian.dict comes from though. When I install the norwegian ispell dictionary, i ge

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Oleg Bartunov
On Tue, 27 Jan 2009, Tommy Gildseth wrote: I'm trying to figure out how to use PostgreSQL's fulltext search with an ispell dictionary. I'm having a bit of trouble figuring out where this norwegian.dict comes from though. When I install the norwegian ispell dictionary, i get 4 files, nb.aff, nb

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Tommy Gildseth
Andreas Wenk wrote: Tommy Gildseth wrote: I'm trying to figure out how to use PostgreSQL's fulltext search with an ispell dictionary. I'm having a bit of trouble figuring out where this norwegian.dict comes from though. When I install the norwegia

Re: [GENERAL] Text search with ispell

2009-01-27 Thread Andreas Wenk
Tommy Gildseth wrote: > I'm trying to figure out how to use PostgreSQL's fulltext search with an > ispell dictionary. I'm having a bit of trouble figuring out where this > norwegian.dict comes from though. > When I install the norwegian ispell dictio

Re: [GENERAL] Text search configuration

2008-09-01 Thread Oleg Bartunov
On Tue, 2 Sep 2008, Pedro Stavrinides wrote: Hi All, This is my first post to this mailing list, so apologies if I am sending my message to the incorrect place... My question: I have configured a very basic text search for our application, but got stuck at the point where I need the search vec

Re: [GENERAL] Text search with multiple tables

2008-05-01 Thread Mont Rothstein
I found a way to do this but I don't know if there is a better way. What I did was to create a separate index on each table and construct a query like: SELECT * FROM a WHERE (to_tsvector(...) @@ to_tsquery(...)) OR primaryKey IN (SELECT distinct(foreign_key) FROM b WHERE to_tsvector(...) @@ to_tsq

Re: [GENERAL] Text Search Configuration Problem

2008-04-05 Thread Oleg Bartunov
Kevin, it looks like you use UTF-8, so the problem in .aff file, which contains cyrillic comments :) I converted files into UTF-8 encoding using iconv. Oleg On Thu, 3 Apr 2008, Kevin Reynolds wrote: I'm using Postgresql version 8.3.1 on CentOS 5 and am following the steps in section 12.7 of

Re: [GENERAL] Text Search Configuration Problem

2008-04-05 Thread Tom Lane
Kevin Reynolds writes: > I get the following error: > ERROR: invalid byte sequence for encoding "UTF8": 0xe0c020 > HINT: This error can also happen if the byte sequence does not match the > encoding expected by the server, which is controlled by "client_encoding".

Re: [GENERAL] Text Search zero padding

2008-02-29 Thread Oleg Bartunov
On Fri, 29 Feb 2008, Richard Huxton wrote: Oleg Bartunov wrote: On Thu, 28 Feb 2008, Richard Greenwood wrote: So far my best idea is to create a tsvector column containing both padded and non-padded versions of the value. i.e. put both R1234 and R0001234 into the tsvector column. This seems p

Re: [GENERAL] Text Search zero padding

2008-02-29 Thread Richard Huxton
Oleg Bartunov wrote: On Thu, 28 Feb 2008, Richard Greenwood wrote: So far my best idea is to create a tsvector column containing both padded and non-padded versions of the value. i.e. put both R1234 and R0001234 into the tsvector column. This seems pretty brute force, and I am pretty new to tex
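
The "index both forms" idea quoted above needs a way to derive the unpadded value from the padded one. A minimal sketch of that step follows, assuming it is done in application code before the text is handed to the database; strip_zero_padding is a hypothetical helper, not a PostgreSQL function, and it assumes values shaped like R0001234 (a non-digit prefix followed by zero-padded digits).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copy "in" to "out" with the zero padding of its numeric part removed,
 * e.g. "R0001234" -> "R1234". At least one digit is always kept, so
 * "R0000000" becomes "R0". Returns 0 on success, -1 if "out" is too small. */
int strip_zero_padding(const char *in, char *out, size_t out_size)
{
    size_t i = 0;   /* read position in "in" */
    size_t n = 0;   /* write position in "out" */

    assert(in != NULL && out != NULL);

    /* Copy the non-digit prefix unchanged. */
    while (in[i] != '\0' && !isdigit((unsigned char) in[i]))
    {
        if (n + 1 >= out_size)
            return -1;
        out[n++] = in[i++];
    }

    /* Skip a leading zero only while another digit follows it. */
    while (in[i] == '0' && isdigit((unsigned char) in[i + 1]))
        i++;

    /* Copy the remaining characters. */
    while (in[i] != '\0')
    {
        if (n + 1 >= out_size)
            return -1;
        out[n++] = in[i++];
    }

    if (out_size == 0)
        return -1;
    out[n] = '\0';
    return 0;
}
```

The original and the stripped string could then both be added to the indexed text, so that a search for either form matches.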

Re: [GENERAL] Text Search zero padding

2008-02-29 Thread Oleg Bartunov
On Thu, 28 Feb 2008, Richard Greenwood wrote: I am using text search across multiple columns. Two of the columns have values that have zero padding - sort of. The values look like R0001234 (1 char followed by 7 digits, zero padded). Users are accustomed to searching with and without the zero paddi

Re: [GENERAL] Text Search zero padding

2008-02-28 Thread Tom Lane
"Richard Greenwood" writes: > I am using text search across multiple columns. Two of the columns > have values that have zero padding - sort of. The values look like > R0001234 (1 char followed by 7 digits, zero padded). Users are > accustomed to searching with and without the zer

Re: [GENERAL] Text Search vs MYSQL vs Lucene

2004-09-09 Thread David Garamond
Steve Atkins wrote: What would be performance of pgSQL text search vs MySQL vs Lucene (flat file) for a 2 terabyte db? thanks for any comments. My experience with tsearch2 has been that indexing even moderately large chunks of data is too slow to be feasible. Moderately large meaning tens of megab

Re: [GENERAL] Text Search vs MYSQL vs Lucene

2004-09-09 Thread Vic Cekvenich
It would be at least a dual Opteron 64 with 4 gigs of RAM running Fedora with a huge RAID of striped drives as a single volume. A similar system and types of queries would be this: http://marc.theaimsgroup.com So I guess a table scan. .V Shridhar Daithankar wrote: On Thursday 09 Sep 2004 6:26 pm, Vic Cekvenich w

Re: [GENERAL] Text Search vs MYSQL vs Lucene

2004-09-09 Thread Steve Atkins
On Thu, Sep 09, 2004 at 07:56:20AM -0500, Vic Cekvenich wrote: > What would be performance of pgSQL text search vs MySQL vs Lucene (flat > file) for a 2 terabyte db? > thanks for any comments. My experience with tsearch2 has been that indexing even moderately large chunks of data is too slow to

Re: [GENERAL] Text Search vs MYSQL vs Lucene

2004-09-09 Thread Shridhar Daithankar
On Thursday 09 Sep 2004 6:26 pm, Vic Cekvenich wrote: > What would be performance of pgSQL text search vs MySQL vs Lucene (flat > file) for a 2 terabyte db? Well, it depends upon lot of factors. There are few questions to be asked here.. - What is your hardware and OS configuration? - What type o

Re: [GENERAL] Text search

2001-03-14 Thread Richard Huxton
From: "Jan Ploski" > Hi, > > Thanks for all your remarks regarding 7.1's stability. I'm going to get > it running as our mail/news database backend and come back with bug > reports ;-) > > Another question which comes to my mind: what would be the best way to > implement a se