xtern commented on code in PR #6163: URL: https://github.com/apache/ignite-3/pull/6163#discussion_r2182110337
########## docs/_docs/sql-reference/explain-operators-list.adoc: ########## @@ -0,0 +1,505 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. += List Of Operators + +This section enumerates all operators with their semantic and supported attributes. + +== ColocatedHashAggregate + +The aggregate operation groups input data on one or more sets of grouping keys, calculating each aggregation function for each combination of grouping key. +Colocated aggregate assumes that the data is already distributed according to grouping keys, therefore aggregation can be completed locally in a single pass. +The hash aggregate operation maintains a hash table for each grouping set to coalesce equivalent tuples. +The output rows are composed as follow: first come columns participated in grouping keys in the order they enumerated in `group` attribute, then come results of accumulators in the order they enumerated in `aggregation` attribute. + +Attributes: + +- `group`: Set of grouping columns. +- `groupSets`: List of group key definitions for advanced grouping, like CUBE or ROLLUP. +Optional. +- `aggregation`: List of accumulators. +- `fieldNames`: List of names of columns in produced rows. +Optional. +- `est`: Estimated number of output rows. 
+ +== ColocatedSortAggregate + +The aggregate operation groups input data on one or more sets of grouping keys, calculating each aggregation function for each combination of grouping key. +Colocated aggregate assumes that the data is already distributed according to grouping keys, therefore aggregation can be completed locally in a single pass. +The sort aggregate operation leverages data ordered by the grouping expressions to calculate data each grouping set tuple-by-tuple in streaming fashion. +The output rows are composed as follow: first come columns participated in grouping keys in the order they enumerated in `group` attribute, then come results of accumulators in the order they enumerated in `aggregation` attribute. + +Attributes: + +- `group`: Set of grouping columns. +- `groupSets`: List of group key definitions for advanced grouping, like CUBE or ROLLUP. +Optional. +- `aggregation`: List of accumulators. +- `collation`: List of columns and expected order of sorting this operator is rely on. +- `fieldNames`: List of names of columns in produced rows. +Optional. +- `est`: Estimated number of output rows. + +== MapHashAggregate + +The aggregate operation groups input data on one or more sets of grouping keys, calculating each aggregation function for each combination of grouping key. +Map aggregate is a first phase of 2-phase aggregation. +During first phase, data is pre-aggregated, and result is sent to the where REDUCE is executed. +The hash aggregate operation maintains a hash table for each grouping set to coalesce equivalent tuples. +The output rows are composed as follow: first come columns participated in grouping keys in the order they enumerated in `group` attribute, then come results of accumulators in the order they enumerated in `aggregation` attribute. + +Attributes: + +- `group`: Set of grouping columns. +- `groupSets`: List of group key definitions for advanced grouping, like CUBE or ROLLUP. +Optional. +- `aggregation`: List of accumulators. 
+- `fieldNames`: List of names of columns in produced rows. +Optional. +- `est`: Estimated number of output rows. + +== ReduceHashAggregate + +The aggregate operation groups input data on one or more sets of grouping keys, calculating each aggregation function for each combination of grouping key. +Reduce aggregate is a second phase of 2-phase aggregation. +During second phase, all pre-aggregated data is merged together, and final result is returned. +The hash aggregate operation maintains a hash table for each grouping set to coalesce equivalent tuples. +The output rows are composed as follow: first come columns participated in grouping keys in the order they enumerated in `group` attribute, then come results of accumulators in the order they enumerated in `aggregation` attribute. + +Attributes: + +- `group`: Set of grouping columns. +- `groupSets`: List of group key definitions for advanced grouping, like CUBE or ROLLUP. +Optional. +- `aggregation`: List of accumulators. +- `fieldNames`: List of names of columns in produced rows. +Optional. +- `est`: Estimated number of output rows. + +== MapSortAggregate + +The aggregate operation groups input data on one or more sets of grouping keys, calculating each aggregation function for each combination of grouping key. +Map aggregate is a first phase of 2-phase aggregation. +During first phase, data is pre-aggregated, and result is sent to the where REDUCE is executed. +The sort aggregate operation leverages data ordered by the grouping expressions to calculate data each grouping set tuple-by-tuple in streaming fashion. +The output rows are composed as follow: first come columns participated in grouping keys in the order they enumerated in `group` attribute, then come results of accumulators in the order they enumerated in `aggregation` attribute. + +Attributes: + +- `group`: Set of grouping columns. +- `groupSets`: List of group key definitions for advanced grouping, like CUBE or ROLLUP. +Optional. 
+- `aggregation`: List of accumulators. +- `collation`: List of columns and expected order of sorting this operator is rely on. +- `fieldNames`: List of names of columns in produced rows. +Optional. +- `est`: Estimated number of output rows. + +== ReduceSortAggregate + +The aggregate operation groups input data on one or more sets of grouping keys, calculating each aggregation function for each combination of grouping key. +Reduce aggregate is a second phase of 2-phase aggregation. +During second phase, all pre-aggregated data is merged together, and final result is returned. +The sort aggregate operation leverages data ordered by the grouping expressions to calculate data each grouping set tuple-by-tuple in streaming fashion. +The output rows are composed as follow: first come columns participated in grouping keys in the order they enumerated in `group` attribute, then come results of accumulators in the order they enumerated in `aggregation` attribute. + +Attributes: + +- `group`: Set of grouping columns. +- `groupSets`: List of group key definitions for advanced grouping, like CUBE or ROLLUP. +Optional. +- `aggregation`: List of accumulators. +- `collation`: List of columns and expected order of sorting this operator is rely on. +- `fieldNames`: List of names of columns in produced rows. +Optional. +- `est`: Estimated number of output rows. + +== ColocatedIntersect + +Returns all records from the primary input that are present in every secondary input. +If `all` is `true`, then for each specific record returned, the output contains min(m, n1, n2, …, n) copies. +Otherwise duplicates are eliminated. + +Attributes: + +- `all`: If `true`, then output may contains duplicates. +- `fieldNames`: List of names of columns in produced rows. +Optional. +- `est`: Estimated number of output rows. + +== ColocatedMinus + +Returns all records from the primary input excluding any matching records from secondary inputs. 
+If `all` is `true`, then for each specific record returned, the output contains max(0, m - sum(n1, n2, …, n)) copies. +Otherwise duplicates are eliminated. + +Attributes: + +- `all`: If `true`, then output may contains duplicates. +- `fieldNames`: List of names of columns in produced rows. +Optional. +- `est`: Estimated number of output rows. + +== MapIntersect + +Returns all records from the primary input that are present in every secondary input. +Map intersect is a first phase of 2-phase computation. +During first phase, data is pre-aggregated, and result is sent to the where REDUCE is executed. + +Attributes: + +- `all`: If `true`, then output may contains duplicates. Review Comment: Here and below ```suggestion - `all`: If `true`, then output may contain duplicates. ```