Hi forideal,

Currently, dimension table join requires at least one join key. That is the
`u_id` in your example. The join key will be used as lookup key.
If you have some additional filters on dimension table, that's fine, Flink
will help to filter the fetched data.
That means Flink supports following dimension join query:

select score_a ... left join ... source_table.u_id=dim_u_score.u_id
where dim_u_score.score_b > 1;


At present, dimension table join doesn't pushdown filters into source, so
if the data associated to the given lookup key is very large, it will have
a high IO cost.
However, filter pushdown into lookup table is in the roadmap.

Best,
Jark







On Mon, 27 Apr 2020 at 20:08, forideal <fszw...@163.com> wrote:

> Hello, my friend.
>
> I have a dimension table.
>
> create table dim_u_score( u_id bigint, varchar, score_a double, score_b 
> double) with {xxx}In a scene
>
> The condition of lookup is fliter score_a > 0.9
>
> In another scenario
>
> The condition of lookup is fliter score_b > 1
>
> In Flink, at present, lookup join can use on to pass key values, such as
>
> select score_a ... left join ... source_table.u_id=dim_u_score.u_id
>
> If so, what should I do?
>
> If not, can I say that I can create multiple tables with conditions to use
> when it comes?
>
> such as
>
> create table dim_u_score_filter_a( u_id bigint, varchar, score_a double, 
> score_b double) with{"filter_condition"="score_a > 0.9 "}create table 
> dim_u_score_filter_b( u_id bigint, varchar, score_a double, score_b 
> double)with {"filter_condition"="fliter score_b > 1 "}
>
> Then, in the process of lookup, push down to the specific execution engine
> to complete the lookup filter.
>
>
>
>

Reply via email to