Dear XQ and Ahmed, We plan to use dynamic schemas and I tried to implement it by myself. But I found maybe this is a blocker, Beam's Coder system generally requires a *fixed schema* to serialize and deserialize data efficiently. When working with dynamic schemas that are determined at runtime, this can become a blocker. Is my understanding correct? Could you give some suggestions on how to solve this?
Best regards, Luyao On Thu, Jan 23, 2025 at 10:01 AM XQ Hu <x...@google.com> wrote: > +kaikua...@gmail.com explicitly. Please check the response. > > On Thu, Jan 23, 2025 at 10:45 AM Ahmed Abualsaud < > ahmedabuals...@apache.org> wrote: > >> Hey Luyao! >> >> Thanks for reaching out. Beam's Iceberg connector supports dynamic >> destinations so long as they have equivalent Schemas. There's no support >> for dynamic schemas yet, i.e. there's no `getDataSchema(<destination>)` >> method. >> >> There's a feature request on Github: >> https://github.com/apache/beam/issues/33724. >> >> > I don't think Managed.write support with DynamicDestinations >> >> ManagedIO does support dynamic destinations, in the format described >> here: >> https://cloud.google.com/dataflow/docs/guides/managed-io#dynamic-destinations. >> It also does not support dynamic schemas yet. >> >> Best, >> Ahmed >> >> On 2025/01/23 04:02:57 Ling Li wrote: >> > Hi team, >> > >> > Happy New Year! I have a question regarding >> > the "org.apache.beam:beam-sdks-java-io-iceberg:2.61.0". >> > >> > Does the org.apache.beam.sdk.schemas.Schema getDataSchema(String >> > destination) of class DynamicDestinations already exist? >> > >> > Our team tried to use DynamicDestinations to decide which iceberg table >> to >> > write in the runtime but failed. >> > >> > >> > Attached is my code class DynamicIcebergDestinations. Our problem is >> that >> > we don't have a universal schema that can match all events to be >> > implemented for getDataSchema(). We need to use a parameter to get the >> > correct schema for each event, so we want to use: getDataSchema(String >> > destination). But seems getDataSchema(String destination) is not >> > implemented for org.apache.beam.sdk.io.iceberg.DynamicDestinations. >> Since >> > getDataSchema() will be automatically called and it cannot return a >> > universal schema, there is no way for us to >> > use IcebergIO.writeRows(icebergCatalogConfig).to(dynamicDestinations). I >> > also tried to use managed I/O connector, but I don't think >> > Managed.write support >> > with DynamicDestinations. Could you give some urgent guidance? >> > >> > Thank you so much. >> > >> > Sincerely, >> > Luyao >> > >> >