FYI, I have kept this email from 2011 about poor performance of parsed
words in headline generation. If someone wants to research it, please
do so:
http://www.postgresql.org/message-id/1314117620.3700.12.camel@dragflick
---
On Wed, Aug 15, 2012 at 11:09:18PM +0530, Sushant Sinha wrote:
> I will do the profiling and present the results.
Sushant, do you have any profiling results on this issue from August?
---
>
> On Wed, 2012-08-15 at 12:45 -0
I will do the profiling and present the results.
On Wed, 2012-08-15 at 12:45 -0400, Tom Lane wrote:
> Bruce Momjian writes:
> > Is this a TODO?
>
> AFAIR nothing's been done about the speed issue, so yes. I didn't
> like the idea of creating a user-visible knob when the speed issue
> might be f
Bruce Momjian writes:
> Is this a TODO?
AFAIR nothing's been done about the speed issue, so yes. I didn't
like the idea of creating a user-visible knob when the speed issue
might be fixable with internal algorithm improvements, but we never
followed up on this in either fashion.
This might indicate that the hlCover() item is resolved.
---
On Wed, Aug 24, 2011 at 10:08:11AM +0530, Sushant Sinha wrote:
>
>
> Actually, this code seems probably flat-out wrong: won't every
> successful call of
Is this a TODO?
---
On Tue, Aug 23, 2011 at 10:31:42PM -0400, Tom Lane wrote:
> Sushant Sinha writes:
> >> Doesn't this force the headline to be taken from the first N words of
> >> the document, independent of where the ma
>
> Actually, this code seems probably flat-out wrong: won't every
> successful call of hlCover() on a given document return exactly the same
> q value (end position), namely the last token occurrence in the
> document? How is that helpful?
>
> regards, tom lane
>
There is
Sushant Sinha writes:
>> Doesn't this force the headline to be taken from the first N words of
>> the document, independent of where the match was? That seems rather
>> unworkable, or at least unhelpful.
> In headline generation function, we don't have any index or knowledge of
> where the match
> > Here is a simple patch that limits the number of words during the
> > tokenization phase and puts an upper-bound on the headline generation.
>
> Doesn't this force the headline to be taken from the first N words of
> the document, independent of where the match was? That seems rather
> unwor
Excerpts from Tom Lane's message of Tue Aug 23 15:59:18 -0300 2011:
> Sushant Sinha writes:
> > Given a document and a query, the goal of headline generation is to
> > produce text excerpts in which the query appears.
>
> ... right ...
>
> > Here is a simple patch that limits the number of words
Sushant Sinha writes:
> Given a document and a query, the goal of headline generation is to
> produce text excerpts in which the query appears.
... right ...
> Here is a simple patch that limits the number of words during the
> tokenization phase and puts an upper-bound on the headline generatio
Given a document and a query, the goal of headline generation is to
produce text excerpts in which the query appears. Currently, headline
generation in postgres proceeds in the following steps:
1. Tokenize the document and obtain the lexemes
2. Decide on lexemes that should be the part of the head
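The trade-off discussed in this thread (capping the number of words during tokenization versus finding the match wherever it occurs) can be illustrated with a toy sketch. This is not the PostgreSQL implementation and does not model hlCover(); the names tokenize, headline, max_words, and context are hypothetical, and falling back to the leading words when no match is found is an assumption made only for illustration.

```python
import re


def tokenize(document, max_words=None):
    """Split a document into lowercase word tokens.

    max_words is a hypothetical cap: when set, only the first
    max_words tokens are kept, which bounds the work done later.
    """
    words = re.findall(r"\w+", document.lower())
    return words if max_words is None else words[:max_words]


def headline(document, query, max_words=None, context=2):
    """Return an excerpt around the first query-term match.

    If no query term appears among the (possibly capped) tokens,
    fall back to the leading words of the document (an assumption
    of this sketch, not documented PostgreSQL behavior).
    """
    words = tokenize(document, max_words)
    terms = {t.lower() for t in query.split()}
    for i, word in enumerate(words):
        if word in terms:
            start = max(0, i - context)
            return " ".join(words[start:i + context + 1])
    return " ".join(words[:2 * context + 1])


doc = "one two three four five six seven match eight nine"
uncapped = headline(doc, "match")             # scans the whole document
capped = headline(doc, "match", max_words=5)  # scans only the first 5 words
```

With the cap in place the excerpt no longer contains the query term, because the only match lies beyond the first five words. That is the concern raised in the replies: the cap bounds the cost, but the headline ends up coming from the start of the document regardless of where the match is.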