We'll present in Prague some improvements in FTS. Unfortunately, we have
only several minutes during lighting talk. In short, we improved GIN to store additional information, coordinates for fts, for example and return ordered by rank search results, which gave us performance better than sphynx. It's just a prototype, but we already got median at 8 msec for 6 mln classifieds.

We didn't tested for long documents yet.

Regards,
Oleg

On Wed, 10 Oct 2012, Fran?ois Beausoleil wrote:


Le 2012-10-09 ? 17:38, Shane Hathaway a ?crit :

Hello,

The database has a text index of around 200,000 documents. Investigation 
revealed that text queries are slow only when using ts_rank or ts_rank_cd.  
Without a ts_rank function, any query is answered within 200ms or so; with 
ts_rank function, queries take up to 30 seconds.  Deeper investigation using 
gprof showed that the problem is probably not ts_rank or ts_rank_cd, but the 
fact that those functions retrieve thousands of TOASTed tsvectors.

Is the query perhaps doing something like this:

SELECT ...
FROM table
WHERE tsvectorcol @@ plainto_tsquery('...')
ORDER BY ts_rank(...)

If so, ts_rank() is run for every document. What you should do instead is:

SELECT *
FROM (
   SELECT ...
   FROM table
   WHERE tsvectorcol @@ plainto_tsquery('...')) AS t1
ORDER BY ts_rank(...)

Notice the ts_rank() is on the outer query, which means it'll only run on the 
subset of documents which match the query. This is explicitly mentioned in the 
docs:

"""Ranking can be expensive since it requires consulting the tsvector of each matching 
document, which can be I/O bound and therefore slow. Unfortunately, it is almost impossible to avoid since 
practical queries often result in large numbers of matches."""

(last paragraph of) 
http://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-RANKING

Hope that helps!
Fran?ois Beausoleil



        Regards,
                Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to