On Thu, Feb 28, 2019 at 1:50 PM Nicolas Grilly <nico...@gardentechno.com> wrote:
> On Thu, Feb 28, 2019 at 1:24 PM Chris Travers <chris.trav...@gmail.com>
> wrote:
>
>> 1. a) TB-scale full text search systems.
>>    b) PostgreSQL's full text search is quite capable but not so
>> powerful that it can completely replace Lucene-based systems. So you have
>> to consider complexity vs functionality if you are tying with other data
>> that is already in PostgreSQL. Note further that my experience with at
>> least ElasticSearch is that it is easier to scale something built on
>> multiple PostgreSQL instances into the PB range than it is to scale
>> ElasticSearch into the PB range.
>>    c) Solr or ElasticSearch
>>
>
> One question about your use of PostgreSQL for a TB-scale full-text search
> system: Did you order search results using ts_rank or ts_rank_cd? I'm
> asking because in my experience, PostgreSQL full-text search is extremely
> efficient, until you need ranking. It's because the indexes don't contain
> the necessary information for ranking, and because of this the heap has to
> be consulted, which implies a lot of random IO.
>
> I'd be curious to know a bit more about your experience in this regard.
>

Where I did this on the TB scale, we had some sort of ranking but it was
not based on ts_rank.

On the PB scale systems I work on now, it is distributed, and we don't
order in PostgreSQL (or anywhere else, though if someone wants to write to
disk and sort, they can do this I guess)

>
> Regards,
>
> Nicolas Grilly
>
> PS: A potential solution to the performance issue I mentioned is this PG
> extension: https://github.com/postgrespro/rum
>

-- 
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more