Konstantin Orlov created IGNITE-25366: -----------------------------------------
Summary: Documentation. Describe output of EXPLAIN command Key: IGNITE-25366 URL: https://issues.apache.org/jira/browse/IGNITE-25366 Project: Ignite Issue Type: Improvement Components: documentation ai3 Reporter: Konstantin Orlov Let's update documentation with page describing general concept of EXPLAIN output, which is provided below. Every relational operator is described with a name and set of attributes: {code} NameOfTheRelationalOperator attribute1: value1 attribute2: value2 {code} Let’s take a look at few examples: {code} TableScan // Full table access table: PUBLIC.EMP1 // Name of the table in question fields: [NAME, SALARY] // List of columns to return est: (rows=1) // Estimated number of rows returned IndexScan // Index scan table: PUBLIC.TEST_TBL // Name of the table in question index: IDX_DESC // Name of the index in question type: SORTED // Type of the index fields: [C1] // List of columns to return collation: [C1 DESC] // Collation of the index aka order of sorting est: (rows=1) // Estimated number of rows returned Sort collation: [C1 DESC NULLS LAST] // Collation to sort input rows est: (rows=1) // Estimated number of rows returned {code} Name represents a particular algorithm used to execute relation operators ({{TableScan}} vs {{IndexScan}}, {{HashJoin}} vs {{MergeJoin}} vs {{NestedLoopJoin}}, etc). The set of attributes depends on the particular relational operator. A query plan is represented by a tree-like structure which is the composition of nodes described above. This tree describes a data flow, where rows are passed from leaves to a plan root node (root node is the topmost node, it also has no indentation). Let's take a look at few examples: {code} // simple ordered select where desired order matches collation of existing index EXPLAIN PLAN FOR SELECT c1 FROM test_tbl ORDER BY c1 DESC NULLS FIRST Exchange distribution: single est: (rows=1) IndexScan table: PUBLIC.TEST_TBL index: IDX_DESC type: SORTED fields: [C1] collation: [C1 DESC] est: (rows=1) // similar query, but desired order doesn't match index collation. Mind the additional // SORT node. EXPLAIN PLAN FOR SELECT c1 FROM test_tbl ORDER BY c1 DESC NULLS LAST Exchange distribution: single est: (rows=1) Sort collation: [C1 DESC NULLS LAST] est: (rows=1) TableScan table: PUBLIC.TEST_TBL fields: [C1] est: (rows=1) // Mind the "fetch" attribute of Sort node. It denotes TopN sort algorithm, // implying that only N nodes will be kept in memory. This also implies // that only N rows from every node will be transferred over Exchange. SELECT * FROM test ORDER BY pk FETCH FIRST ? ROWS ONLY Limit fetch: ?0 est: (rows=1) Exchange distribution: single est: (rows=1) Sort collation: [PK ASC] fetch: ?0 est: (rows=1) TableScan table: PUBLIC.TEST fields: [PK, COL0] est: (rows=1) // Similar query, but Limit node wasn't pushed down the exchange. This implies // that the whole dataset will be transferred over an Exchange. SELECT * FROM test OFFSET ? ROWS FETCH FIRST ? ROWS ONLY Limit offset: ?0 fetch: ?1 est: (rows=1) Exchange distribution: single est: (rows=1) TableScan table: PUBLIC.TEST fields: [PK, COL0] est: (rows=1) // More complex plan EXPLAIN PLAN FOR SELECT U.UserName, P.ProductName, R.ReviewText, R.Rating FROM Users U, Reviews R, Products P WHERE U.UserID = R.UserID AND R.ProductID = P.ProductID AND P.ProductName = 'Product_' || ?::varchar; Project fields: [USERNAME, PRODUCTNAME, REVIEWTEXT, RATING] exprs: [USERNAME, PRODUCTNAME, REVIEWTEXT, RATING] est: (rows=16650) HashJoin condition: =(USERID0, USERID) joinType: inner est: (rows=16650) HashJoin condition: =(PRODUCTID, PRODUCTID0) joinType: inner est: (rows=16650) Exchange distribution: single est: (rows=50000) TableScan table: PUBLIC.REVIEWS fields: [PRODUCTID, USERID, REVIEWTEXT, RATING] est: (rows=50000) Exchange distribution: single est: (rows=1)665 TableScan table: PUBLIC.PRODUCTS filters: =(PRODUCTNAME, ||(_UTF-8'Product_', CAST(?0):VARCHAR CHARACTER SET "UTF-8")) fields: [PRODUCTID, PRODUCTNAME] est: (rows=1665) Exchange distribution: single est: (rows=10000) TableScan table: PUBLIC.USERS fields: [USERID, USERNAME] est: (rows=10000) {code} We also should provide exhaustive list of relational operators their attributes. -- This message was sent by Atlassian Jira (v8.20.10#820010)