On Thu, 14 Jun 2007, Tom Lane wrote:
Oleg Bartunov <[EMAIL PROTECTED]> writes:
You're correct. But we can't defend users from all possible errors.
Other side, that we need somehow to help user to identify what fts
configuration was used to produce tsvector. For example, comment on
tsvector column would be useful, but we don't know how to do this
automatically.
Yeah, I was wondering about that too. The only way we could relax the
superuser, you-better-know-what-you're-doing restriction on changing
configurations would be if we had a way to identify which tsvector
columns needed to be updated. Right now that's pretty hard to find out
because the references to configurations are buried in the bodies of
trigger functions. That whole trigger-function business is not the
nicest part of tsearch2, either ... it'd be better if we could automate
tsvector maintenance more.
yes, trigger function is a complex stuff, our tsearch() trigger is an
example of automated stuff. It could be written very easy on plpgsql,
for example.
=# create function my_update() returns trigger as
$$
BEGIN
NEW.fts=
setweight( to_tsvector('english',NEW.t1),'A') || ' ' ||
setweight( to_tsvector('english',NEW.t2),'B');
RETURN NEW;
END;
$$
language plpgsql;
One thing I was thinking about is that rather than storing a physical
tsvector column, people might index a "virtual" column using functional
indexes:
create index ... (to_tsvector('english', big_text_col))
which could be queried
select ... where to_tsvector('english', big_text_col) @@ tsquery
this is already possible for gin index
create index gin_text_idx on test using gin (
( coalesce(to_tsvector(title),'') || coalesce(to_tsvector(body),'') )
);
apod=# select title from test where
(coalesce(to_tsvector(title),'') || coalesce(to_tsvector(body),'') ) @@
to_tsquery('supernovae') order by sdate desc limit 10;
Assuming that the index is lossy, the index condition would have to be
rechecked, so to_tsvector() would have to be recomputed, but only at the
rows identified as candidate matches by the index. The I/O savings from
eliminating the heap's tsvector column might counterbalance the extra
CPU for recomputing tsvectors. Or not, but in any case this is
attractive because it doesn't need any handmade maintenance support like
a trigger --- the regular index maintenance code does it all.
I'm afraid it wouldn't work for all cases. We already have headline() function
which had to reparse document to produce text snippet and it's very slow
and eats most select time.
ALso, trigger stuff is a normal machinery for databases.
It strikes me that we could play the same kind of game we played to make
nextval() references to sequences be recognized as dependencies on
sequences. Invent a "regconfig" OID type that's just like regclass
except it handles OIDs of ts_config entries instead of pg_class entries,
and make the first argument of to_tsvector be one of those:
create index ... (to_tsvector('english'::regconfig, big_text_col))
Now dependency.c can be taught to recognize the regconfig Const as
depending on the referenced ts_config entry, and voila we have a
pg_depend entry showing that the index depends on the configuration.
What we actually do about it is another question, but this at least
gets the knowledge into the system.
interesting. And \di could display all configuration stuff for text search
indexes ?
[ thinks some more... ] If we revived the GENERATED AS patch,
you could imagine computing tsvector columns via "GENERATED AS
to_tsvector('english'::regconfig, big_text_col)" instead of a
trigger, and then again you've got the dependency exposed where
the system can see it. I don't wanna try to do that for 8.3,
but it might be a good path to pursue in future, instead of assuming
that triggers will be the way forevermore.
Thoughts?
No way with standard. GENERATED AS says that "all columns references in an
expression associated with a generated column must be to columns of the base
table containing that generated column."
tsvector could be result of rather complex select involving several tables.
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq