Hi all, I am looking into ways to combine array data with a UDAF. The schema of the raw data table is:
CREATE TABLE IF NOT EXISTS array_data (
  session_id string,
  properties array<struct<name : string, value : string>>
);

I would like to aggregate this table with a UDAF called "array_combine":

SELECT session_id, array_combine(properties) as combined_properties
FROM array_data
GROUP BY session_id;

For example, suppose the array_data table has two records:

session_id1, [{"name":"aaa","value":"111"}, {"name":"bbb","value":"222"}]
session_id1, [{"name":"ccc","value":"333"}, {"name":"ddd","value":"444"}]

After combining, the result should be a single record:

session_id1, [{"name":"aaa","value":"111"}, {"name":"bbb","value":"222"}, {"name":"ccc","value":"333"}, {"name":"ddd","value":"444"}]

However, when I debug the UDAF, the "iterate" and "merge" functions receive LazyArray objects as their parameters:

public void iterate(AggregationBuffer agg, Object[] parameters)
public void merge(AggregationBuffer agg, Object partial)

I have two questions:

(1) Why is the object not an ArrayList? I checked the input ObjectInspector in the "init" function, and it is a StandardListObjectInspector:

public ObjectInspector init(Mode m, ObjectInspector[] parameters)

(2) Is there an easy way to combine two LazyArray objects into one in the "iterate" and "merge" functions? It seems that I have to create a new LazyArray object, but I don't know the separator, nullSequence, and escapeChar values of the original LazyArray object, and I don't know enough to build a LazyArray of a complex type (array<struct<name : string, value : string>>).

Could anyone help me with this? Thanks in advance.

Best Regards,
Eric
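
P.S. To make the expected behaviour concrete, here is a minimal plain-Java sketch of what "iterate" and "merge" should do with the two example records. It does not use Hive at all: ArrayCombineSketch, CombineBuffer, prop and combineExample are hypothetical names made up for illustration, and each struct is modelled as a Map. In a real GenericUDAFEvaluator, the elements would presumably be read through the input ObjectInspector and copied into standard Java objects (for example with ObjectInspectorUtils.copyToStandardObject) instead of building a new LazyArray, but that part is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ArrayCombineSketch {

    /** Hypothetical stand-in for the UDAF's AggregationBuffer. */
    public static final class CombineBuffer {
        public final List<Map<String, String>> combined = new ArrayList<>();
    }

    /** Models iterate(): append a copy of every struct found in one input row. */
    public static void iterate(CombineBuffer buf, List<Map<String, String>> properties) {
        if (properties == null) {
            return; // a NULL array contributes nothing
        }
        for (Map<String, String> struct : properties) {
            // Copy each element so the buffer does not hold on to reusable input objects.
            buf.combined.add(struct == null ? null : new LinkedHashMap<>(struct));
        }
    }

    /** Models merge(): a partial result is itself a list, so it is appended the same way. */
    public static void merge(CombineBuffer buf, List<Map<String, String>> partial) {
        iterate(buf, partial);
    }

    /** Helper that builds one {name, value} struct. */
    public static Map<String, String> prop(String name, String value) {
        Map<String, String> struct = new LinkedHashMap<>();
        struct.put("name", name);
        struct.put("value", value);
        return struct;
    }

    /** Runs the two example records through two partial buffers and one final buffer. */
    public static List<Map<String, String>> combineExample() {
        CombineBuffer partial1 = new CombineBuffer();
        iterate(partial1, Arrays.asList(prop("aaa", "111"), prop("bbb", "222")));

        CombineBuffer partial2 = new CombineBuffer();
        iterate(partial2, Arrays.asList(prop("ccc", "333"), prop("ddd", "444")));

        CombineBuffer total = new CombineBuffer();
        merge(total, partial1.combined);
        merge(total, partial2.combined);
        return total.combined;
    }

    public static void main(String[] args) {
        System.out.println(combineExample());
    }
}
```

The point of the sketch is that the buffer only ever holds plain Java lists, so nothing in it depends on the separator, nullSequence, or escapeChar of the incoming objects.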