Hi,
did you try to use a different order? Core module first and then the
Hive module?
The compatibility layer should work sufficiently for regular Hive UDFs
that don't aggregate data. Hive aggregation functions should also work
well in batch scenarios. However, in streaming pipelines the aggregate
functions need to be able to consume updates (such as retractions, as in
your case).
In summary: ideally, for simple functions such as SUM or COUNT, you
should use the core functions instead of the Hive ones. Using Hive
aggregate functions in streaming can lead to issues if the input
operator is not insert-only.
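To make the order explicit, something like the following sketch should work. Note the caveat: the `LOAD MODULE` and `USE MODULES` SQL statements only exist in Flink 1.13+; on 1.12 the same effect needs the Table API's `loadModule`/`unloadModule` calls instead. The Hive version string is a placeholder for your own setup.

```sql
-- The core module is loaded first by default; loading the Hive module
-- afterwards keeps core ahead of Hive in function resolution.
LOAD MODULE hive WITH ('hive-version' = '2.3.4');

-- Function resolution follows the listed order: core first, then Hive.
USE MODULES core, hive;

-- SUM now resolves to the core implementation rather than
-- HiveGenericUDAF, while Hive UDFs remain available as a fallback.
SELECT SUM(1) FROM xxx;
```

With this order, built-in functions such as SUM, COUNT, and the internal [plus] resolve to the core implementations, and only functions that the core module does not provide fall through to the Hive module.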
Regards,
Timo
On 08.09.21 06:47, vtygoss wrote:
Hi, Flink Community!
I ran into a problem using a Flink 1.12.0 standalone cluster with a Hive
catalog.
scene 1:
- module: hive module
- execute sql: select sum(1) from xxx
- exception: *org.apache.flink.table.api.TableException: Required
built-in function [plus] could not be found in any catalog.*
scene 2:
- module: hive module and core module
- execute sql: select sum(1)
- exception: *org.apache.flink.table.api.ValidationException: Could not
find an implementation method 'retract' in class 'class
org.apache.flink.table.functions.hive.HiveGenericUDAF' for function
'sum' that matches the following signature:*
*void
retract(org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationBuffer,
java.lang.Integer)*
scene 3:
- module: core module
- execute sql: select sum(1)
- no exception, but Hive UDFs are unavailable.
So is there a way to use Hive UDFs together with the core functions
while avoiding these exceptions?
Thank you for any suggestions.
Best Regards!