The following bug has been logged online:

Bug reference:      5219
Logged by:          Kenaniah Cerny
Email address:      kenan...@gmail.com
PostgreSQL version: 8.4.1
Operating system:   Centos5.2 -- Linux 2.6.18-92.1.10.el5 #1 SMP i686 athlon i386 GNU/Linux
Description:        Segfault in to_tsvector
Details:

Full backtrace: http://pgsql.privatepaste.com/5411abf8f3

The issue takes place running this query:
http://pgsql.privatepaste.com/35064cbba8

Crash is attributed to this index definition:

CREATE INDEX "anime_titles_idx_name_simple_text"
    ON "public"."anime_titles"
    USING gin ((to_tsvector('simple'::regconfig, name)));

I believe the issue is caused by possibly non-UTF-8 data. Both the server
and the client (a PHP script using PDO's pgsql driver) are using UTF-8.

The string causing this issue is stored in the database in a text field and
looks like this:
http://s801.photobucket.com/albums/yy299/kenaniah972/?action=view&current=issue.png

After output into an HTML input field and resubmission through Firefox, the
string that is passed through to the DB looks like this:
http://s801.photobucket.com/albums/yy299/kenaniah972/?action=view&current=submitted.png
(The &# characters were manually omitted in submission)

I don't profess to know anything about encodings, but I don't think this is
valid UTF-8 input. I might be wrong. All I do know is that this causes the
to_tsvector part of the gin index to throw a segfault in the insert
statement, rather than returning an invalid UTF-8 input error or just plain
working.

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
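
Since the report is unsure whether the offending string is valid UTF-8, one way to check on the client side is to take the raw bytes and attempt a strict decode. The following is only a minimal sketch in Python (not the PHP/PDO client from the report); the helper name utf8_error and the sample byte strings are hypothetical and are not the data from the screenshots.

```python
def utf8_error(data: bytes):
    """Return None if data is valid UTF-8, else (byte offset, reason)."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that failed to decode
        return exc.start, exc.reason
    return None


# Hypothetical samples, NOT the string from the bug report.
valid = "caf\u00e9".encode("utf-8")   # ends in a complete 2-byte sequence
truncated = valid[:-1]                # final multi-byte sequence cut short
bad_start = b"\xff"                   # 0xFF can never begin a UTF-8 sequence

for sample in (valid, truncated, bad_start):
    print(sample, utf8_error(sample))
```

If the bytes taken from the text field pass a check like this, the input itself is well-formed and the encoding of the data is less likely to be the whole story; if they fail, the reported offset shows where the malformed sequence begins.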