Thank you for your help, Tom. You are right. I added an index on employee.name (by making the column unique), and now postgres can scan the employee table in pre-sorted order and exit early without joining more rows.
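(Side note in case anyone reproducing this doesn't want a unique constraint on name: I believe a plain, non-unique btree index on employee.name should give the planner the same presorted scan. A minimal sketch, with a hypothetical index name:

```sql
-- Hypothetical alternative to declaring "name text unique":
-- a plain btree index on employee.name should also let the planner
-- return rows already ordered by name and stop after the LIMIT.
create index employee_name_idx on employee (name);
```
)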
Just sharing the tweak I made to the example, if anyone else is interested in a quick test. I also populated 1 million rows so the example is no longer a toy demo.

```sql
drop table if exists department;
drop table if exists employee;

create table department(
  id int primary key,
  name text);

create table employee(
  id int primary key,
  name text unique,
  department_id int);

INSERT INTO department (id, name)
SELECT i+1, 'department' || i+1
FROM generate_series(0, 9) AS i;

INSERT INTO employee (id, name, department_id)
SELECT i+1, 'name' || i+1, i % 10 + 1
FROM generate_series(0, 999999) AS i;

analyze department;
analyze employee;

explain analyze
select *
from employee
left outer join department on employee.department_id = department.id
order by employee.name
limit 10;
```

And here is the plan:

```
                                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..1.36 rows=10 width=34) (actual time=0.017..0.030 rows=10 loops=1)
   ->  Nested Loop Left Join  (cost=0.57..78630.06 rows=1000000 width=34) (actual time=0.016..0.028 rows=10 loops=1)
         ->  Index Scan using employee_name_key on employee  (cost=0.42..54855.68 rows=1000000 width=18) (actual time=0.008..0.015 rows=10 loops=1)
         ->  Memoize  (cost=0.15..0.16 rows=1 width=16) (actual time=0.001..0.001 rows=1 loops=10)
               Cache Key: employee.department_id
               Cache Mode: logical
               Hits: 6  Misses: 4  Evictions: 0  Overflows: 0  Memory Usage: 1kB
               ->  Index Scan using department_pkey on department  (cost=0.14..0.15 rows=1 width=16) (actual time=0.001..0.001 rows=1 loops=4)
                     Index Cond: (id = employee.department_id)
 Planning Time: 0.189 ms
 Execution Time: 0.045 ms
(11 rows)
```

Personally, I still hope that someday postgres can push the `limit` node down together with the `sort` node when certain conditions are met, so that there's no need to add an index :D

Thank you again for your help!

On Mon, 17 Feb 2025 at 18:01, Tom Lane <t...@sss.pgh.pa.us> wrote:
> WU Yan <4wu...@gmail.com> writes:
> > Hello everyone, I am still learning postgres planner and performance
> > optimization, so please kindly point out if I missed something obvious.
>
> An index on employee.name would likely help here.  Even if we had
> an optimization for pushing LIMIT down through a join (which you
> are right, we don't) it could not push the LIMIT through a sort step.
> So you need presorted output from the scan of "employee".  I think
> this example would behave better with that.  You may also need to
> test with non-toy amounts of data to get the plan you think is
> better: an example with only half a dozen rows is going to be
> swamped by startup costs.
>
> regards, tom lane
>