Rich Shepard <[email protected]> wrote:

> select p.person_nbr, p.company_nbr, c.next_contact
> from people as p, contacts as c
> where c.next_contact >= '2025-11-01'
> group by p.person_nbr, p.company_nbr, c.next_contact
> order by p.person_nbr, p.company_nbr, max(c.next_contact);
>
> returns all contacts rather than only the latest one.

DISTINCT ON might be what you're looking for. It's an extremely useful
feature in postgres and well worth understanding. Here's a nice
explanation, that's similar to your case:
https://www.geekytidbits.com/postgres-distinct-on/

select distinct on (p.person_nbr) p.person_nbr, p.company_nbr, c.next_contact
from people as p join contacts as c using (person_nbr)
where c.next_contact >= '2025-11-01' order by  p.person_nbr, c.next_contact;

Using the following test data:
create table people(person_nbr, company_nbr) as values (1, 1), (2, 1), (3, 2);

create table contacts(person_nbr, next_contact) as values
(1, '2025-10-31'::date), (1, '2025-11-01'), (1, '2025, 11, 02'),
(3, '2025-11-02'::date), (3, '2025-11-03');

Here's the results:
person_nbr | company_nbr | next_contact
------------+-------------+--------------
          1 |           1 | 2025-11-01
          3 |           2 | 2025-11-02

DISTINCT ON can use an index matching the ORDER BY.
create index on contacts(person_nbr, next_contact);

Add some data to make the index worth it to the planner:
insert into contacts select i, '2025-11-05'::date + j from
generate_series(4, 100) i, generate_series(1, 100) j;
analyze contacts;

Check the query plan:
explain select distinct on (p.person_nbr) ...

This is what you want to see:
         ->  Index Only Scan using
contacts_person_nbr_next_contact_idx on contacts c
               Index Cond: (next_contact >= '2025-11-01'::date)

Thanks, Bernice


Reply via email to