Hi Arvid, Hi Sergey,
thanks for your feedback. I updated the FLIP accordingly, but let me
answer your questions here as well:
> Are we going to enforce that the name is a valid class name? What is
> happening if it's not a correct name?
> What are the implications of using a class that is not in the
> classpath in Table API? It looks to me that the name is metadata-only
> until we try to access the objects directly in Table/DataStream API.
Names are not enforced or validated. They are pure metadata, as mentioned
in Section 2.1. We fall back to Row as the conversion class if the name
cannot be resolved in the current classpath. So when staying in the SQL
ecosystem (i.e. not switching to Table API, DataStream API, or UDFs),
the class does not need to be present.
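
To make the fallback rule above concrete, here is a minimal sketch, assuming
the class name is looked up via the regular class loader. It is not Flink's
actual implementation: ConversionClassResolver and RowStandIn are hypothetical
names, and RowStandIn merely stands in for Flink's Row so that the snippet has
no dependencies.

```java
// Illustrative sketch only; not taken from the Flink code base.
final class ConversionClassResolver {

    // Hypothetical stand-in for Flink's Row type (assumption, keeps the sketch dependency-free).
    static final class RowStandIn {}

    // Returns the class with the given name if it is on the classpath,
    // otherwise the Row stand-in. The name itself is never validated.
    static Class<?> resolveOrFallback(String className) {
        try {
            // initialize=false: only the Class object is needed, no static initializers
            return Class.forName(
                    className, false, ConversionClassResolver.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            return RowStandIn.class;
        }
    }

    public static void main(String[] args) {
        // A class that is present resolves to itself.
        System.out.println(resolveOrFallback("java.lang.String").getName());
        // A class that is absent falls back to the Row stand-in.
        System.out.println(resolveOrFallback("com.example.User").getName());
    }
}
```

In this sketch an unresolvable name is not an error; it only changes which
class is used once objects leave the SQL ecosystem.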
> Should Expressions.objectOf(String, Object... kv); also have an
> overload where you can put in the StructuredType in case where
> the class is not in the CP?
That makes a lot of sense. I added a DataTypes.STRUCTURED(String,
Field...) method and an Expressions.objectOf(String, Object...).
> What is the expected outcome of supplying fewer keys than defined
> in the structured type? Are we going to make use of nullability here?
> If so, *_INSERT and *_REMOVE may have some use.
Currently, we go with the most conservative approach, which means that
all keys need to be present. We could defer a more lenient logic to
future work.
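
As an illustration of the conservative rule above, the following sketch checks
that alternating key/value arguments cover every declared field. It assumes
keys are plain strings compared by exact name; StrictKeyCheck and its methods
are hypothetical and not part of the FLIP or of Flink.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; names and behavior are assumptions, not Flink API.
final class StrictKeyCheck {

    // Returns the declared field names that do not appear among the supplied keys.
    static Set<String> missingFields(List<String> declaredFields, Object... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("Expected alternating key/value arguments");
        }
        Set<String> missing = new LinkedHashSet<>(declaredFields);
        // Keys sit at even positions, values at odd positions.
        for (int i = 0; i < keyValues.length; i += 2) {
            missing.remove(String.valueOf(keyValues[i]));
        }
        return missing;
    }

    // Conservative rule: fail as soon as any declared field is not supplied.
    static void requireAllFields(List<String> declaredFields, Object... keyValues) {
        Set<String> missing = missingFields(declaredFields, keyValues);
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException("Missing fields: " + missing);
        }
    }

    public static void main(String[] args) {
        List<String> declared = List.of("name", "age");
        requireAllFields(declared, "name", "Bob", "age", 42); // passes
        System.out.println(missingFields(declared, "name", "Bob"));
    }
}
```

A more lenient variant could fill the missing fields with NULL instead of
failing, which is the direction hinted at for future work.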
> Talking about nullability: Is there some option to make the declared
> fields NOT NULL? If so, could you amend one example to show that?
> (Grammar? implies that it's not possible)
NOT NULL is supported, similar to ROW<i INT NOT NULL>. I adjusted one of
the examples.
> One bigger concern is around the naming. For me, OBJECT is used for
> semi-structured types that are open. Your FLIP implies a closed design
> and that you want to add an open OBJECT later. I asked ChatGPT about
> other DB implementations and it seems like STRUCT is used more often
> (see below). So, I'd propose to call it STRUCT<...>, STRUCT_OF,
> structOf, UPDATE_STRUCT, and updateStruct respectively.
Naming is hard. I was also torn between STRUCT, STRUCTURED, and OBJECT.
In Flink, the ROW type is effectively our STRUCT type, because it works
fully position-based. Structured types might become name-based in the
future for better schema evolution, so they are closer to an OBJECT type.
This was my reason for choosing OBJECT_OF (typed to a class name with
fixed fields) vs. OBJECT (semi-structured without fixed fields). Snowflake
also uses OBJECT(i INT) (for structured types) and OBJECT (for
semi-structured types). Also, both structured and semi-structured types
can then share functions such as UPDATE_OBJECT().
What do others think?
Thanks,
Timo
On 22.04.25 12:08, Sergey Nuyanzin wrote:
Thanks for driving this, Timo.
The FLIP seems reasonable to me.
I have one minor question/clarification:
do I understand correctly that after this FLIP we can execute
`typeof` against the result of `OBJECT_OF`,
for instance
SELECT typeof(OBJECT_OF(
'com.example.User',
'name', 'Bob',
'age', 42
));
should return `STRUCTURED<'com.example.User', name STRING, age INT>`
?
On Tue, Apr 22, 2025 at 10:57 AM Timo Walther <twal...@apache.org> wrote:
Hi everyone,
I would like to ask again for feedback on this FLIP. It is a rather
small change but with a big impact on usability for structured data.
Are there any objections? Otherwise I would like to continue with voting
soon.
Thanks,
Timo
On 10.04.25 07:54, Timo Walther wrote:
Hi everyone,
I would like to start a discussion about FLIP-520: Simplify
StructuredType handling [1].
Flink SQL already supports structured types in the engine, serializers,
UDFs, and connector interfaces. However, currently only the Table API is
able to make use of them. While UDFs can take objects as input and
return types, it is actually quite inconvenient to use them in
transformations.
This FLIP fixes some immediate blockers in the use of structured types.
Looking forward to feedback.
Cheers,
Timo
[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-520%3A+Simplify+StructuredType+handling