Great! Thanks for pushing this work.
Looking forward to the pull requests.
Best,
Jark
On Fri, 8 Jan 2021 at 17:57, Sebastian Liu wrote:
> Hi Jark,
>
> Cool, following your suggestions I have created three related subtasks
> under Flink-20791.
> Hope to assign these subtasks to me too, when you
Hi Jark,
Cool, following your suggestions I have created three related subtasks
under Flink-20791.
Hope to assign these subtasks to me too, when you have time. And I
will push forward the relevant implementation.
Jark Wu 于2021年1月8日周五 下午12:30写道:
> Hi Sebastian,
>
> I assigned the issue to you. B
Hi Sebastian,
I assigned the issue to you. But I suggest creating sub-tasks under this
issue. Because I think this would be a big contribution.
For example, you can split it into:
1. Introduce SupportsAggregatePushDown interface
2. Support SupportsAggregatePushDown in planner
3. Support SupportsAg
Hi Jark,
Seems that we have reached the agreement on the proposal. Could you please
help to assign the below jira ticket to me?
https://issues.apache.org/jira/browse/FLINK-20791
Jark Wu 于2021年1月7日周四 上午10:25写道:
> Thanks for updating the design doc.
> It looks good to me.
>
> Best,
> Jark
>
> On
Thanks for updating the design doc.
It looks good to me.
Best,
Jark
On Thu, 7 Jan 2021 at 10:16, Jingsong Li wrote:
> Sounds good to me.
>
> We don't have to worry about future changes, because it has covered all
> the capabilities of calcite aggregation.
>
> Best,
> Jingsong
>
> On Thu, Jan 7,
Sounds good to me.
We don't have to worry about future changes, because it has covered all the
capabilities of calcite aggregation.
Best,
Jingsong
On Thu, Jan 7, 2021 at 12:14 AM Sebastian Liu wrote:
> Hi Jark,
>
> Sounds good to me. For better scalability in the future, we could add the
> Agg
Hi Jark,
Sounds good to me. For better scalability in the future, we could add the
AggregateExpression.
```
public class AggregateExpression implements ResolvedExpression {
private final FunctionDefinition functionDefinition;
private final List args;
private final @Nullable CallExpres
Hi Liu, Jingsong,
Regarding the agg with filter, I think in theory we can support pushing
such a pattern into source.
We don't need to support it in the first version, but in the long term, we
can support it.
The designed interface should be future proof.
Considering filter arg and distinct flag
Hi Jingsong, Jark,
Thx so much for our discussion, and the cases mentioned above are really
worthy for further discussion.
1. For aggregate with filter expressions: eg: select COUNT(1) FILTER(WHERE
cc_call_center_sk > 3) from call_center;
For the current Blink Planner, the optimized plan will be:
> I think filter expressions and grouping sets are semantic arguments
instead of utilities. If we want to push them into sources, the connector
developers should be aware of them.Wrapping them in a context implicitly is
error-prone that the existing connector will produce wrong results when
upgradi
I think filter expressions and grouping sets are semantic arguments instead
of utilities.
If we want to push them into sources, the connector developers should be
aware of them.
Wrapping them in a context implicitly is error-prone that the existing
connector will produce wrong results
when upgradi
Hi,
I'm also curious about aggregate with filter (COUNT(1) FILTER(WHERE d >
1)). Can we push it down? I'm not sure that a single call expression can
express it, and how we should embody it and convey it to users.
Best,
Jingsong
On Wed, Jan 6, 2021 at 1:36 PM Jingsong Li wrote:
> Hi Jark,
>
> I
Hi Jark,
I don't want to limit this interface to LocalAgg Push down. Actually,
sometimes, we can push whole aggregation to source too.
So, this rule can do something more advanced. For example, we can push down
group sets to source too, for the SQL: "GROUP BY GROUPING SETS (f1, f2)".
Then, we nee
I think this may be over designed. We should have confidence in the
interface we design, the interface should be stable.
Wrapping things in a big context has a cost of losing user convenience.
Foremost, we don't see any parameters to add in the future. Do you know any
potential parameters?
Best,
J
Hi Sebastian,
Well, I mean:
`boolean applyAggregates(int[] groupingFields, List
aggregateExpressions, DataType producedDataType);`
VS
```
boolean applyAggregates(Aggregation agg);
interface Aggregation {
int[] groupingFields();
List aggregateExpressions();
DataType producedDataType();
}
``
Hi Jinsong,
Thx a lot for your suggestion. These points really need to be clear in the
proposal.
For the semantic problem, I think the main point is the different returned
data types
for the target aggregate function and the row format returned by the
underlying storage.
That's why we provide the
Thanks for the update.
The proposal looks good to me now.
Best,
Jark
On Tue, 5 Jan 2021 at 14:44, Jingsong Li wrote:
> Thanks for your proposal! Sebastian.
>
> +1 for SupportsAggregatePushDown. The above wonderful discussion has
> solved many of my concerns.
>
> ## Semantic problems
>
> We may
Thanks for your proposal! Sebastian.
+1 for SupportsAggregatePushDown. The above wonderful discussion has solved
many of my concerns.
## Semantic problems
We may need to add some mechanisms or comments, because as far as I know,
the semantics of each database is actually different, which may nee
Hi Jark,
Thx for your further feedback and help. The interface of
SupportsAggregatePushDown may indeed need some adjustments.
For (1) Agree: Yeah, the upstream only need to know if the TableSource can
handle all of the aggregates.
It's better to just return a boolean type to indicate whether all
Thanks for the clarification. I have resolved all of the comments and added
a conclusion section.
Looking forward to the further feedback from our community. If we get
consensus on the design doc, I can push the implementation related work.
Kurt Young 于2021年1月5日周二 上午11:04写道:
> Sorry for the typ
I'm also +1 for idea#2.
Regarding to the updated interface,
Result applyAggregates(List aggregateExpressions,
int[] groupSet, DataType aggOutputDataType);
final class Result {
private final List acceptedAggregates;
private final List remainingAggregates;
}
I have following co
Sorry for the typo -_-!
I meant idea #2.
Best,
Kurt
On Tue, Jan 5, 2021 at 10:59 AM Sebastian Liu wrote:
> Hi Kurt,
>
> Thx a lot for your feedback. If local aggregation is more like a physical
> operator rather than logical
> operator, I think your suggestion should be idea #2 which handle a
Hi Kurt,
Thx a lot for your feedback. If local aggregation is more like a physical
operator rather than logical
operator, I think your suggestion should be idea #2 which handle all in the
physical optimization phase?
Looking forward for the further discussion.
Kurt Young 于2021年1月5日周二 上午9:52写道:
Local aggregation is more like a physical operator rather than logical
operator. I would suggest going with idea #1.
Best,
Kurt
On Wed, Dec 30, 2020 at 8:31 PM Sebastian Liu wrote:
> Hi Jark, Thx a lot for your quick reply and valuable suggestions.
> For (1): Agree: Since we are in the period
Hi Jark, Thx a lot for your quick reply and valuable suggestions.
For (1): Agree: Since we are in the period of upgrading the new table
source api,
we really should consider the new interface for the new optimize rule. If
the new rule
doesn't use the new api, we'll have to upgrade it sooner or late
Hi Sebastian,
Thanks for the proposal. I think this is a great improvement for Flink SQL.
I went through the design doc and have following thoughts:
1) Flink has deprecated the legacy TableSource in 1.11 and proposed a new
set of DynamicTableSource interfaces. Could you update your proposal to
u
26 matches
Mail list logo