Bruce Momjian <br...@momjian.us> writes:
> Is this a TODO?

AFAIR nothing's been done about the speed issue, so yes.  I didn't
like the idea of creating a user-visible knob when the speed issue
might be fixable with internal algorithm improvements, but we never
followed up on this in either fashion.

			regards, tom lane

> ---------------------------------------------------------------------------
> On Tue, Aug 23, 2011 at 10:31:42PM -0400, Tom Lane wrote:
>> Sushant Sinha <sushant...@gmail.com> writes:
>>> Doesn't this force the headline to be taken from the first N words of
>>> the document, independent of where the match was?  That seems rather
>>> unworkable, or at least unhelpful.
>>
>>> In headline generation function, we don't have any index or knowledge of
>>> where the match is. We discover the matches by first tokenizing and then
>>> comparing the matches with the query tokens. So it is hard to do
>>> anything better than first N words.
>>
>> After looking at the code in wparser_def.c a bit more, I wonder whether
>> this patch is doing what you think it is.  Did you do any profiling to
>> confirm that tokenization is where the cost is?  Because it looks to me
>> like the match searching in hlCover() is at least O(N^2) in the number
>> of tokens in the document, which means it's probably the dominant cost
>> for any long document.  I suspect that your patch helps not so much
>> because it saves tokenization costs as because it bounds the amount of
>> effort spent in hlCover().
>>
>> I haven't tried to do anything about this, but I wonder whether it
>> wouldn't be possible to eliminate the quadratic blowup by saving more
>> state across the repeated calls to hlCover().  At the very least, it
>> shouldn't be necessary to find the last query-token occurrence in the
>> document from scratch on each and every call.
>>
>> Actually, this code seems probably flat-out wrong: won't every
>> successful call of hlCover() on a given document return exactly the same
>> q value (end position), namely the last token occurrence in the
>> document?  How is that helpful?
>>
>> 			regards, tom lane
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
> --
>   Bruce Momjian  <br...@momjian.us>        http://momjian.us
>   EnterpriseDB                             http://enterprisedb.com
>
>   + It's impossible for everything to be true. +
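
For illustration only, here is a minimal C sketch of the quadratic
behaviour described in the quoted message, and of the "save more state
across the repeated calls" idea.  It is a toy model, not the code in
wparser_def.c: the is_match array and the functions last_match_rescan(),
count_visits_stateless() and count_visits_cached() are hypothetical
names invented for this example, and the real hlCover() tracks positions
per query item rather than one flag per token.  The counter tallies only
the tokens examined while locating the last match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a parsed document: is_match[i] is true when
 * token i matches some query term.
 *
 * Scan [start, n) and return the index of the last matching token, or n
 * if there is none.  Every token examined is added to *visited.
 */
static size_t
last_match_rescan(const bool *is_match, size_t n, size_t start,
                  size_t *visited)
{
    size_t      last = n;

    assert(is_match != NULL || n == 0);
    for (size_t i = start; i < n; i++)
    {
        (*visited)++;
        if (is_match[i])
            last = i;
    }
    return last;
}

/* Step past the first matching token at or after start. */
static size_t
advance_past_match(const bool *is_match, size_t n, size_t start)
{
    while (start < n && !is_match[start])
        start++;
    return start + 1;
}

/* Stateless variant: every call rescans the tail for the last match. */
static size_t
count_visits_stateless(const bool *is_match, size_t n)
{
    size_t      visited = 0;
    size_t      start = 0;

    while (start < n)
    {
        if (last_match_rescan(is_match, n, start, &visited) == n)
            break;              /* no match left in the tail */
        start = advance_past_match(is_match, n, start);
    }
    return visited;
}

/*
 * Cached variant: the last match is located once.  That is safe because
 * the last match in [start, n) is the global last match for as long as
 * start <= last.
 */
static size_t
count_visits_cached(const bool *is_match, size_t n)
{
    size_t      visited = 0;
    size_t      last = last_match_rescan(is_match, n, 0, &visited);
    size_t      start = 0;

    while (last != n && start <= last)
        start = advance_past_match(is_match, n, start);
    return visited;
}
```

With a document of N = 8 tokens that all match, the stateless variant
examines 8 + 7 + ... + 1 = 36 tokens while the cached one examines 8;
in that case the totals grow as N(N+1)/2 versus N.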