Hi Andreas,

First of all, I would highly recommend converting non-structured types to structured types as early as possible, as this gives the planner more opportunities to optimize the plan.
Have you tried:

Table users = batchTableEnvironment.fromDataSet(usersDataset)
    .select("getField(f0, userName) as userName, f0");
Table other = batchTableEnvironment.fromDataSet(otherDataset)
    .select("getField(f0, userName) as user, f1");
Table result = other.join(users, "user = userName");

Note that the string variant of select() takes all projected fields in a single comma-separated expression string.
You could also look at how the
org.apache.flink.formats.avro.AvroRowDeserializationSchema class is
implemented; internally it converts an Avro record into a structured Row.
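
To illustrate what I mean by converting early, here is a minimal plain-Java sketch. It deliberately uses no Flink or Avro classes: a Map stands in for the GenericRecord, and KeyedRow, getField, toRows, join and record are hypothetical names made up for this example. It only shows the idea of extracting the join key into a real column once and then joining on that column; it is not how Flink executes the join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these names are Flink or Avro APIs.
class StructuredJoinSketch {

    // Hypothetical structured row: the join key is a real, named column,
    // and the original record is carried along (like also selecting "f0").
    static final class KeyedRow {
        final String userName;
        final Map<String, Object> record;

        KeyedRow(String userName, Map<String, Object> record) {
            this.userName = userName;
            this.record = record;
        }
    }

    // Stand-in for the getField UDF: reads one field from an opaque record.
    static String getField(Map<String, Object> record, String field) {
        Object value = record.get(field);
        return value == null ? null : value.toString();
    }

    // Step 1: extract the key once, as early as possible.
    static List<KeyedRow> toRows(List<Map<String, Object>> records) {
        List<KeyedRow> rows = new ArrayList<>();
        for (Map<String, Object> record : records) {
            rows.add(new KeyedRow(getField(record, "userName"), record));
        }
        return rows;
    }

    // Step 2: inner equi-join on the extracted column (a simple hash join).
    static List<KeyedRow[]> join(List<KeyedRow> other, List<KeyedRow> users) {
        Map<String, List<KeyedRow>> usersByName = new HashMap<>();
        for (KeyedRow user : users) {
            if (user.userName != null) {
                usersByName.computeIfAbsent(user.userName, k -> new ArrayList<>()).add(user);
            }
        }
        List<KeyedRow[]> result = new ArrayList<>();
        for (KeyedRow o : other) {
            if (o.userName == null) {
                continue; // null keys never match, as in SQL equality
            }
            for (KeyedRow u : usersByName.getOrDefault(o.userName, Collections.emptyList())) {
                result.add(new KeyedRow[] {o, u});
            }
        }
        return result;
    }

    // Small helper to build a sample record for the demo.
    static Map<String, Object> record(String userName, String payload) {
        Map<String, Object> r = new HashMap<>();
        r.put("userName", userName);
        r.put("payload", payload);
        return r;
    }

    public static void main(String[] args) {
        List<KeyedRow> users = toRows(Arrays.asList(record("alice", "u1"), record("bob", "u2")));
        List<KeyedRow> other = toRows(Arrays.asList(record("alice", "o1"), record("carol", "o2")));
        for (KeyedRow[] pair : join(other, users)) {
            System.out.println(pair[0].record + " joined with " + pair[1].record);
        }
    }
}
```

In the Table API the equivalent of toRows would be the select(...) projection (or a map to a Row with explicit type information), after which the join condition only refers to plain columns.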
Hope this helps.
Best,
Dawid
On 03/01/2020 23:16, Hailu, Andreas wrote:
>
> Hi folks,
>
>
>
> I’m trying to join two Tables which are composed of complex types,
> Avro’s GenericRecord to be exact. I have to use a custom UDF to
> extract fields out of the record and I’m having some trouble on how to
> do joins on them as I need to call this UDF to read what I need.
> Example below:
>
>
>
> batchTableEnvironment.registerFunction("getField", new
> GRFieldExtractor()); // GenericRecord field extractor
>
> Table users = batchTableEnvironment.fromDataSet(usersDataset); //
> Converting from some pre-existing DataSet
>
> Table otherDataset = batchTableEnvironment.fromDataSet(someOtherDataset);
>
> Table userNames = users.select("getField(f0, userName)"); // This is how
> the UDF is used, as GenericRecord is a complex type requiring you to
> invoke a get() method on the field you’re interested in. Here we call
> get() on the field ‘userName’.
>
>
>
> I’d like to do something using the Table API similar to the query
> “SELECT * from otherDataset WHERE otherDataset.userName =
> users.userName”. How is this done?
>
>
>
> Best,
>
> Andreas
