Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

Bruce Momjian Fri, 29 May 2009 11:17:20 -0700

Tom Lane wrote:
> Josh Berkus <j...@agliodbs.com> writes:
> > Bruce,
> >> The ordering of the lexemes was changed:
> 
> > What does that get us in terms of performance etc.?
> 
> It was changed to support partial-match tsvector queries.  Without it,
> a partial match query would have to scan entire tsvectors instead
> of applying binary search.  I don't know if Oleg and Teodor did any
> actual performance tests on the size of the hit, but it seems like
> it could be pretty awful for large documents.


I started thinking about the performance issues of the tsvector changes.
Teodor gave me this code for conversion that basically does:

        qsort_arg((void *) ARRPTR(t), t->size, sizeof(WordEntry),
                  cmpLexeme, (void *) t);

So every cast has to do a sort, which would perform poorly for a large
document.  Because we are not storing the sorted result, the sort is
repeated on every access;  this might be an unacceptable performance
burden.
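
To make the cost concrete, here is a rough standalone sketch, not the
actual tsvector code.  It uses plain C strings and the standard qsort()
in place of WordEntry and qsort_arg(), and all the function names are
made up for illustration.  The length-first comparator is only my
assumption about what the old ordering looked like.  The point is that
a prefix lookup on an array already stored in bytewise order is a
binary search plus a short walk, while the cast path has to copy and
re-sort the whole array first and then throws the sorted copy away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the old ordering (an assumption, for
 * illustration only): shorter lexemes first, ties broken bytewise. */
int cmp_length_first(const void *a, const void *b)
{
    const char *sa = *(const char *const *) a;
    const char *sb = *(const char *const *) b;
    size_t la = strlen(sa), lb = strlen(sb);

    if (la != lb)
        return (la < lb) ? -1 : 1;
    return strcmp(sa, sb);
}

/* Plain bytewise ordering: lexemes sharing a prefix end up adjacent. */
int cmp_bytewise(const void *a, const void *b)
{
    return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* Index of the first lexeme >= prefix in a bytewise-sorted array. */
size_t lower_bound(const char *const *lex, size_t n, const char *prefix)
{
    size_t lo = 0, hi = n;

    assert(prefix != NULL);
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (strcmp(lex[mid], prefix) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Stored-sorted case: binary search to the start of the run of
 * matching lexemes, then walk the run. */
size_t count_prefix_sorted(const char *const *lex, size_t n,
                           const char *prefix)
{
    size_t plen = strlen(prefix);
    size_t i = lower_bound(lex, n, prefix);
    size_t count = 0;

    while (i < n && strncmp(lex[i], prefix, plen) == 0)
    {
        count++;
        i++;
    }
    return count;
}

/* Cast-on-access case: copy, re-sort (O(n log n)), search, and discard
 * the sorted copy, so the next access pays for the sort again.
 * Returns 0 if the copy cannot be allocated (sketch-level handling). */
size_t count_prefix_with_resort(const char *const *lex, size_t n,
                                const char *prefix)
{
    const char **copy = malloc(n * sizeof *copy);
    size_t count;

    if (copy == NULL)
        return 0;
    memcpy(copy, lex, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_bytewise);
    count = count_prefix_sorted(copy, n, prefix);
    free(copy);
    return count;
}
```

Both functions return the same answer for the same lexemes;  the only
difference is that count_prefix_with_resort() repeats the sort on every
call, which is the per-access burden described above.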

So, one idea would be, instead of a cast, have pg_migrator rebuild the
tsvector columns with ALTER TABLE, so then the 8.4 index code could be
used.  But then we might as well just tell the users to migrate the
tsvector tables themselves, which is how pg_migrator behaves now.

Obviously we are still trying to figure out the best way to handle data
type changes;  I think as soon as we figure out a plan for tsvector we
can use that method for future changes.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
