Hi JB,

Thanks for sharing. I have a few questions:

1. Does Apache Camel rely on other engines (e.g., Spark or Flink) for any
   processing, or is it fully self-contained?
2. What potential challenges or limitations do you foresee? For example,
   does it generate too many commits and/or small files, considering its
   use cases (IoT, event streaming)? Can Camel cache ingested data and
   write it to the Iceberg table as a batch?
3. How do you recommend handling schema evolution in Iceberg tables when
   integrating with Camel routes?

Yufei

On Tue, May 21, 2024 at 6:06 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Hi folks,
>
> I'm working on an Iceberg component for Apache Camel:
> https://github.com/jbonofre/iceberg/tree/CAMEL/camel/camel-iceberg/src/main
>
> Apache Camel is an integration framework, supporting a lot of
> components and EIPs (Enterprise Integration Patterns, like Content
> Based Router, Splitter, Aggregator, Content Enricher, ...).
> Camel is very popular in a lot of use cases, like IoT, system
> integration, event streaming, ...
>
> It provides a Camel component with:
> - a Camel consumer endpoint (from) to read data from Iceberg
>   tables/views (scan) and create a Camel exchange
> - a Camel producer endpoint (to) to write data (from a Camel exchange)
>   to Iceberg tables/views
>
> For instance, you can write a Camel route like this (using the
> Spring/Blueprint DSL, for instance):
>
>   <from uri="jms:queue:foo"/>
>   <process ref="#convertToIcebergRecords"/> <!-- optional depending on the exchange message body -->
>   <to uri="iceberg:my_table?catalog=#ref"/>
>
> This route is event driven, consuming messages from the foo JMS queue
> (from Apache ActiveMQ for instance), and writing the message body to
> the my_table Iceberg table (it's possible to use a router or multicast
> EIPs to send the exchange to different tables).
> NB: for the from (consumer endpoint), you can use any Camel component
> (https://camel.apache.org/components/4.4.x/).
>
> You can also consume (scan) data from an Iceberg table, and send the
> generated Exchange to any endpoint/route:
>
>   <from uri="iceberg:my_table?catalog=#ref"/>
>   <process ref="#convertFromIcebergRecords"/> <!-- optional depending on the next steps in the route -->
>   <wireTap uri="direct:tap"/>
>   <to uri="mongodb:myDB?database=mydb&amp;collection=foo&amp;operation=insert"/>
>
> This route generates exchanges from the my_table Iceberg table, uses the
> wiretap EIP and stores the data into a MongoDB database/collection.
>
> Although I started the component in the Iceberg repo, I think it would
> make more sense to host it in Camel (just as Apache Beam contains the
> Iceberg IO).
> Thoughts?
>
> Comments are welcome!
>
> NB: on a related topic, I created
> https://github.com/apache/iceberg/pull/10365
>
> Regards
> JB
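
To make question 2 concrete, here is a minimal sketch in plain Java of the kind of batching being asked about: records are buffered and handed to a sink once per batch, so that one commit covers many messages instead of one commit per message. `BatchBuffer`, its methods, and the sink are hypothetical names for illustration only; they are not part of Camel or Iceberg. In a real route, Camel's Aggregator EIP with a completion size or timeout might play this role, but that is an assumption, not something the component is known to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not a Camel or Iceberg API): buffers records and
 * passes them to a sink in batches of at most maxBatchSize.
 */
final class BatchBuffer<T> {
    private final int maxBatchSize;
    // The sink stands in for "append these records and commit once".
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();
    private int flushCount = 0;

    BatchBuffer(int maxBatchSize, Consumer<List<T>> sink) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        this.maxBatchSize = maxBatchSize;
        this.sink = sink;
    }

    /** Buffers one record; flushes automatically when the batch is full. */
    synchronized void add(T record) {
        pending.add(record);
        if (pending.size() >= maxBatchSize) {
            flush();
        }
    }

    /** Sends any buffered records to the sink as one batch; no-op if empty. */
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<T> batch = new ArrayList<>(pending); // copy so the sink owns its list
        pending.clear();
        sink.accept(batch);
        flushCount++;
    }

    /** Number of batches handed to the sink so far. */
    synchronized int flushCount() {
        return flushCount;
    }
}
```

A size-only trigger like this would leave a partial batch waiting indefinitely on a quiet queue, so a real implementation would also need a time-based flush; that trade-off between latency and commit/small-file count is the core of the question.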