Weird ranking results with ts_rank

2019-11-14 Thread Javier Ayres
Hi everybody.

I'm implementing a solution that uses PostgreSQL's full text search
capabilities and I have come across a particular set of results for ts_rank
that don't seem to make sense according to the documentation. I have tried
the following queries in PostgreSQL 10, 11 and 12.
In both cases only the word "box" matches, but adding a non-matching
word with OR to the query increases the rank. If I keep adding more
non-matching words with OR, the rank starts to decrease again. I would
have expected the second query to have the highest score, with the score
decreasing as I add more non-matching words.
Is there something I'm not understanding?

Thanks.

postgres=# select ts_rank(to_tsvector('search for a text box'),
to_tsquery('circle | lot <-> box'));
   ts_rank
-------------
 0.020264236
(1 row)

postgres=# select ts_rank(to_tsvector('search for a text box'),
to_tsquery('lot <-> box'));
 ts_rank
---------
   1e-20
(1 row)

-- 
Javier Ayres
Data Engineer
+1 855 636 5811 <+18556365811> - sophilabs.co


Re: Weird ranking results with ts_rank

2019-11-18 Thread Javier Ayres
Oh I see. I was working as if no match was the same as ts_rank=0.

Great advice. Thank you very much.
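
For anyone who finds this thread later, here is a minimal sketch of how I
understand the advice quoted below: test for a match with @@ first and
only call ts_rank on rows that pass. The "docs" and "q" CTEs, the sample
rows, and the explicit 'english' configuration are made up for
illustration; they are not from my real schema.

```sql
-- Sketch only: rank rows that actually match, instead of relying on
-- ts_rank to return 0 for non-matching rows.
-- "docs", "q", and the sample rows are hypothetical.
WITH docs(id, body) AS (
    VALUES (1, 'search for a text box'),
           (2, 'a lot box for storage')
),
q AS (
    SELECT to_tsquery('english', 'circle | lot <-> box') AS query
)
SELECT d.id,
       ts_rank(to_tsvector('english', d.body), q.query) AS rank
FROM docs d
CROSS JOIN q
-- The @@ filter decides whether a row matches at all;
-- ts_rank is only used to order the rows that do.
WHERE to_tsvector('english', d.body) @@ q.query
ORDER BY rank DESC;
```

Row 1 is the document from my original queries, which does not match
according to the @@ check quoted below, so it should be filtered out
before ts_rank is ever evaluated for it.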

On Sat, Nov 16, 2019 at 2:22 PM Jeff Janes wrote:

> On Fri, Nov 15, 2019 at 1:31 AM Javier Ayres wrote:
>
>> Hi everybody.
>>
>> I'm implementing a solution that uses PostgreSQL's full text search
>> capabilities and I have come across a particular set of results for ts_rank
>> that don't seem to make sense according to the documentation.
>>
>
> While the documentation doesn't come out and say so, my interpretation is
> that ts_rank assumes there is a match in the first place, and by
> implication is undefined/unspecified if there is no match.
>
> select to_tsvector('search for a text box') @@ to_tsquery('circle | lot <-> box');
>  ?column?
> ----------
>  f
> (1 row)
>
> Cheers,
>
> Jeff
>

