On Fri, Mar 13, 2020 at 12:09 PM Dmitry Dolgov <9erthali...@gmail.com> wrote: > I've also noticed that for partitioned tables every partition is > evaluated separately. IIRC they structure cannot differ, does it > makes sense then?
Good point. At some point, we had discussed only collecting the columns for one of the child partitions and then using that for all partitions. It is definitely worthwhile to try that optimization. Another interesting > example is Values Scan (e.g. in an insert statements with multiple > records), can an abstract table AM user leverage information about > columns in it? > For ValuesScan, I can't see a use case yet for including those in scanCols, since it is not required to scan the existing columns in the table, but I am open to a use case. > > One case, where I believe columns were missing, is statements with > returning: > > INSERT INTO foo (col1) > VALUES ('col1'), ('col2'), ('col3') > RETURNING *; > > Looks like in this situation there is only expression in reltarget is > for col1, but returning result contains all columns. > > So, you are correct, for INSERT and DELETE queries with RETURNING, the scanCols should only include columns needed to INSERT or DELETE data. For DELETE RETURNING, the RETURNING expression is executed separately in ExecDelete, so scanCols should basically reflect only what needs to be scanned to evaluate the qual. For INSERT RETURNING, the scanCols should only reflect those columns needing to be scanned to perform the INSERT. There are several reasons why different kinds of RETURNING queries have too many scanCols. Soumyadeep and I did some investigation into this. Given: t1(a, b) t2(a, b, c) For: INSERT INTO t1(a) VALUES (1) RETURNING *; There is no need to scan t1(a) in order to satisfy the RETURNING expression here. Because this INSERT doesn't go through make_one_rel(), scanCols shouldn't be populated. For: INSERT INTO t2 SELECT a,a,a FROM t1 RETURNING *; For this INSERT, the patch correctly identifies that no columns from t2 need be scanned and only t1(a) needs be scanned. For: DELETE FROM t1 WHERE a = 2 RETURNING *; scanCols correctly has only t1(a) which is needed to evaluate the qual. For: DELETE FROM t1 USING t2 where a = a RETURNING *; scanCols should have t1(a) and t2(a), however, it has t1(a) and t2(a,b,c). This is because in preprocess_targetlist(), all of the returningList vars from t2 are added to the query tree processed_tlist to make sure the RETURNING expression can be evaluated later. However, the query tree processed_tlist items for each rel seem to be added to the RelOptInfo for t2 later, so, in extract_scan_columns(), we mistakenly add these to the scanCols. This is tricky to change because we don't want to change what gets added to the RelOptInfo->reltarget->exprs (I think), so, we'd need some way to know which columns are from the RETURNING expression, which are from the qual, etc. And, we'd need to know the query was a DELETE (since an UPDATE would need all the columns anyway with the current API, for example). This is pretty different than the current logic in extract_scan_cols(). A separate issue with DELETE USING RETURNING queries scanCols arises with partition tables. Given this additional table: t(a, b, c) partition table with partitions tp1(a, b, c) and tp2(a, b, c) the same query above with different relations DELETE FROM t USING t1 WHERE a = a RETURNING *; scanCols will say it requires t(a,b,c) and t1(a,b) (all of the columns in both relations). t1 columns are wrong for the same reason as in the non-partition example query described above. However, the partition table scanCols are wrong for a related but different reason. This, however, is much more easily fixed. The same code in to add returningList to processed_tlist for a querytree in preprocess_targetlist() doesn't handle the case of an inherited table or partition child (since their result_relation is set to imitate a SELECT instead of an UPDATE/DELETE/INSERT). I started a different thread on this topic [1]. Lastly is UPDATE...RETURNING queries. UPDATE queries will require scan of all of the columns with the current tuple_table_lock() API (mentioned upthread). > And just out of curiosity, what do you think about table AM specific > columns e.g. ctid, xmin/xmax etc? I mean, they're not included into > scanCols and should not be since they're heap AM related. But is there a > chance that there would be some AM specific columns relevant to e.g. > the columnar storage that would also make sense to put into scanCols? > Maybe tid? But, I'm not sure. What do you think? [1] https://www.postgresql.org/message-id/caakru_y+7qx4jzvfovsbe9t9_2c4kk1bdda139oq6ca5b-l...@mail.gmail.com