Thanks for the explanation.

Tom Lane <t...@sss.pgh.pa.us> wrote on Sun, Jan 14, 2024 at 23:46:
> Ron Johnson <ronljohnso...@gmail.com> writes:
> > On Sun, Jan 14, 2024 at 6:18 AM Yongtao Huang <yongtaoh2...@gmail.com>
> > wrote:
> >> gpadmin=# create table t1 (c1 int, c2 text);
> >> CREATE TABLE
> >> gpadmin=# explain (costs off, verbose) select distinct c1 from t1;
> >>          QUERY PLAN
> >> -----------------------------
> >>  HashAggregate
> >>    Output: c1
> >>    Group Key: t1.c1
> >>    ->  Seq Scan on public.t1
> >>          Output: c1, c2   <---- pay attention <---- !!!
> >> (5 rows)
> >>
> >> My question is why scan all columns in PG 16.01?
> >
> > You can't scan just one column of a row-oriented table.
> >
> > The real question is why it mentions c2.
>
> The planner did that so that the SeqScan step doesn't have to
> perform a projection: it can just return (a pointer to)
> the physical tuple it found in the table, without doing extra
> work to form a tuple containing only c1.  The upper HashAgg
> step won't really care.  See use_physical_tlist() in createplan.c.
>
> What I'm confused about is why 9.4 didn't do the same.
> That optimization heuristic is very old, and certainly
> would be applied by 9.4 in some circumstances.  Testing
> says the behavior in this specific case changed at 9.6.
> I'm not quite interested enough to drill down further...
>
> regards, tom lane
>
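
For anyone who wants to see the heuristic in action, here is a minimal sketch, assuming the same t1 table as in the original report. The second query is my own addition for contrast, not something from the thread: my understanding is that a Sort node copies its input tuples, so the planner would probably ask the scan below it for a narrow target list instead of the physical tuple. I have left the plan output out because it can differ between versions.

```sql
-- Same table as in the original report.
create table t1 (c1 int, c2 text);

-- Case from the thread: the HashAggregate does not care about extra
-- columns, so the Seq Scan may list every column of t1 in "Output:"
-- (the physical tuple, no projection step).
explain (costs off, verbose) select distinct c1 from t1;

-- Contrast case (an assumption on my side, not from the thread):
-- with a Sort above the scan, check whether the Seq Scan "Output:"
-- line is reduced to c1 only.
explain (costs off, verbose) select c1 from t1 order by c1;
```

Comparing the "Output:" line of the Seq Scan node in the two plans should show whether the physical target list was used in each case.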