Re: [PR] IGNITE-25366 Documentation. Describe output of EXPLAIN command [ignite-3]

via GitHub Thu, 03 Jul 2025 00:52:37 -0700


xtern commented on code in PR #6163:
URL: https://github.com/apache/ignite-3/pull/6163#discussion_r2182110337



##########
docs/_docs/sql-reference/explain-operators-list.adoc:
##########
@@ -0,0 +1,505 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+= List Of Operators
+
+This section enumerates all operators with their semantic and supported 
attributes.
+
+== ColocatedHashAggregate
+
+The aggregate operation groups input data on one or more sets of grouping 
keys, calculating each aggregation function for each combination of grouping 
key.
+Colocated aggregate assumes that the data is already distributed according to 
grouping keys, therefore aggregation can be completed locally in a single pass.
+The hash aggregate operation maintains a hash table for each grouping set to 
coalesce equivalent tuples.
+The output rows are composed as follow: first come columns participated in 
grouping keys in the order they enumerated in `group` attribute, then come 
results of accumulators in the order they enumerated in `aggregation` attribute.
+
+Attributes:
+
+- `group`: Set of grouping columns.
+- `groupSets`: List of group key definitions for advanced grouping, like CUBE 
or ROLLUP.
+Optional.
+- `aggregation`: List of accumulators.
+- `fieldNames`: List of names of columns in produced rows.
+Optional.
+- `est`: Estimated number of output rows.
+
+== ColocatedSortAggregate
+
+The aggregate operation groups input data on one or more sets of grouping 
keys, calculating each aggregation function for each combination of grouping 
key.
+Colocated aggregate assumes that the data is already distributed according to 
grouping keys, therefore aggregation can be completed locally in a single pass.
+The sort aggregate operation leverages data ordered by the grouping 
expressions to calculate data each grouping set tuple-by-tuple in streaming 
fashion.
+The output rows are composed as follow: first come columns participated in 
grouping keys in the order they enumerated in `group` attribute, then come 
results of accumulators in the order they enumerated in `aggregation` attribute.
+
+Attributes:
+
+- `group`: Set of grouping columns.
+- `groupSets`: List of group key definitions for advanced grouping, like CUBE 
or ROLLUP.
+Optional.
+- `aggregation`: List of accumulators.
+- `collation`: List of columns and expected order of sorting this operator is 
rely on.
+- `fieldNames`: List of names of columns in produced rows.
+Optional.
+- `est`: Estimated number of output rows.
+
+== MapHashAggregate
+
+The aggregate operation groups input data on one or more sets of grouping 
keys, calculating each aggregation function for each combination of grouping 
key.
+Map aggregate is a first phase of 2-phase aggregation.
+During first phase, data is pre-aggregated, and result is sent to the where 
REDUCE is executed.
+The hash aggregate operation maintains a hash table for each grouping set to 
coalesce equivalent tuples.
+The output rows are composed as follow: first come columns participated in 
grouping keys in the order they enumerated in `group` attribute, then come 
results of accumulators in the order they enumerated in `aggregation` attribute.
+
+Attributes:
+
+- `group`: Set of grouping columns.
+- `groupSets`: List of group key definitions for advanced grouping, like CUBE 
or ROLLUP.
+Optional.
+- `aggregation`: List of accumulators.
+- `fieldNames`: List of names of columns in produced rows.
+Optional.
+- `est`: Estimated number of output rows.
+
+== ReduceHashAggregate
+
+The aggregate operation groups input data on one or more sets of grouping 
keys, calculating each aggregation function for each combination of grouping 
key.
+Reduce aggregate is a second phase of 2-phase aggregation.
+During second phase, all pre-aggregated data is merged together, and final 
result is returned.
+The hash aggregate operation maintains a hash table for each grouping set to 
coalesce equivalent tuples.
+The output rows are composed as follow: first come columns participated in 
grouping keys in the order they enumerated in `group` attribute, then come 
results of accumulators in the order they enumerated in `aggregation` attribute.
+
+Attributes:
+
+- `group`: Set of grouping columns.
+- `groupSets`: List of group key definitions for advanced grouping, like CUBE 
or ROLLUP.
+Optional.
+- `aggregation`: List of accumulators.
+- `fieldNames`: List of names of columns in produced rows.
+Optional.
+- `est`: Estimated number of output rows.
+
+== MapSortAggregate
+
+The aggregate operation groups input data on one or more sets of grouping 
keys, calculating each aggregation function for each combination of grouping 
key.
+Map aggregate is a first phase of 2-phase aggregation.
+During first phase, data is pre-aggregated, and result is sent to the where 
REDUCE is executed.
+The sort aggregate operation leverages data ordered by the grouping 
expressions to calculate data each grouping set tuple-by-tuple in streaming 
fashion.
+The output rows are composed as follow: first come columns participated in 
grouping keys in the order they enumerated in `group` attribute, then come 
results of accumulators in the order they enumerated in `aggregation` attribute.
+
+Attributes:
+
+- `group`: Set of grouping columns.
+- `groupSets`: List of group key definitions for advanced grouping, like CUBE 
or ROLLUP.
+Optional.
+- `aggregation`: List of accumulators.
+- `collation`: List of columns and expected order of sorting this operator is 
rely on.
+- `fieldNames`: List of names of columns in produced rows.
+Optional.
+- `est`: Estimated number of output rows.
+
+== ReduceSortAggregate
+
+The aggregate operation groups input data on one or more sets of grouping 
keys, calculating each aggregation function for each combination of grouping 
key.
+Reduce aggregate is a second phase of 2-phase aggregation.
+During second phase, all pre-aggregated data is merged together, and final 
result is returned.
+The sort aggregate operation leverages data ordered by the grouping 
expressions to calculate data each grouping set tuple-by-tuple in streaming 
fashion.
+The output rows are composed as follow: first come columns participated in 
grouping keys in the order they enumerated in `group` attribute, then come 
results of accumulators in the order they enumerated in `aggregation` attribute.
+
+Attributes:
+
+- `group`: Set of grouping columns.
+- `groupSets`: List of group key definitions for advanced grouping, like CUBE 
or ROLLUP.
+Optional.
+- `aggregation`: List of accumulators.
+- `collation`: List of columns and expected order of sorting this operator is 
rely on.
+- `fieldNames`: List of names of columns in produced rows.
+Optional.
+- `est`: Estimated number of output rows.
+
+== ColocatedIntersect
+
+Returns all records from the primary input that are present in every secondary 
input.
+If `all` is `true`, then for each specific record returned, the output 
contains min(m, n1, n2, …, n) copies.
+Otherwise duplicates are eliminated.
+
+Attributes:
+
+- `all`: If `true`, then output may contains duplicates.
+- `fieldNames`: List of names of columns in produced rows.
+Optional.
+- `est`: Estimated number of output rows.
+
+== ColocatedMinus
+
+Returns all records from the primary input excluding any matching records from 
secondary inputs.
+If `all` is `true`, then for each specific record returned, the output 
contains max(0, m - sum(n1, n2, …, n)) copies.
+Otherwise duplicates are eliminated.
+
+Attributes:
+
+- `all`: If `true`, then output may contains duplicates.
+- `fieldNames`: List of names of columns in produced rows.
+Optional.
+- `est`: Estimated number of output rows.
+
+== MapIntersect
+
+Returns all records from the primary input that are present in every secondary 
input.
+Map intersect is a first phase of 2-phase computation.
+During first phase, data is pre-aggregated, and result is sent to the where 
REDUCE is executed.
+
+Attributes:
+
+- `all`: If `true`, then output may contains duplicates.

Review Comment:
   Here and below
   ```suggestion
   - `all`: If `true`, then output may contain duplicates.
   ```



