Re: how to query the output of the scalar table function

Darshan Singh Wed, 04 Apr 2018 10:09:43 -0700

Thanks Fabian

We are going to replace all scalar functions with the table functions.


Thanks

On Wed, Apr 4, 2018 at 12:16 PM, Fabian Hueske <[email protected]> wrote:

> Hi Darshan,
>
> What you observe is the result of what's supposed to be an optimization.
> By fusing the two select() calls, we reduce the number of operators in the
> resulting plan (one MapFunction less).
> This optimization is only applied for ScalarFunctions but not for
> TableFunctions.
> With a better cost-modeling that increases the cost of user-defined
> ScalarFunctions it should be possible to "convince" the optimizer to not
> fuse the operators.
>
> For now, you could either use the TableFunction approach or convert the
> result of the first select() into a DataStream (or DataSet depending on
> your setup) and register that again as a table.
> That would split the plan into two parts which are independently optimized
> and hence the select() operators would not be merged.
>
> Best, Fabian
>
>
>
> 2018-03-30 22:07 GMT+02:00 Darshan Singh <[email protected]>:
>
>> Hi,
>>
>> I am not able to find what is best way to query the output of a scalar
>> table function.
>>
>> Suppose I have table which has column col1 which is string.
>>
>> I have a scalar function and returns a POJO
>> {col1V1 String, col1V2 String , col1V3 String}.
>>
>>
>> I am using following.
>>
>> so table.select("sf(col1) as sfc")
>> .select("sfc.get('col1V1') as v1, sfc.get('col1V2') as v2 ,
>> sfc.get('col1V13') as v3 ")
>>
>> It is working fine but strangely enough it calls the sf 3 times(each per
>> get) for same row. However, If I use table function it calls just once.
>>
>> So to me it seems that scalar function is very expensive so if I have 10
>> columns and the computation is expensive it will do 10 times. Thats why I
>> thought maybe there is a better way to return pojo from scalar function
>> rather than what I have been using.
>>
>> If this is the best way then I wonder if scalar functions should be used
>> to return single value only.
>>
>> Thanks
>>
>
>

Re: how to query the output of the scalar table function

Reply via email to