Hi JB,

Thanks for sharing. Got a few questions:

   1. Does Apache Camel rely on other engines, e.g., Spark or Flink, for any
   processing, or is it fully self-contained?
   2. What potential challenges or limitations do you foresee? For
   example, given its use cases (IoT, event streaming), does it generate
   too many commits and/or small files? Can Camel buffer ingested data
   and write it to the Iceberg table as a batch?
   3. How do you recommend handling schema evolution in Iceberg tables when
   integrating with Camel routes?
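
To make question 2 concrete, here is roughly the batching I have in mind. It
is only a minimal sketch in plain Java: `BatchingWriter` and its sink are
hypothetical names, not anything from Camel or Iceberg, and the sink simply
stands in for "append these records and commit once". Perhaps the Aggregator
EIP you mention could play this role in a route, but that is an assumption on
my side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Camel or Iceberg: buffers records and
// hands them to a sink in batches, so each batch could map to one commit.
final class BatchingWriter<T> {
    private final int maxBatchSize;
    private final Consumer<List<T>> sink; // stand-in for "append + commit"
    private final List<T> buffer = new ArrayList<>();

    BatchingWriter(int maxBatchSize, Consumer<List<T>> sink) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        this.maxBatchSize = maxBatchSize;
        this.sink = sink;
    }

    // Buffer one record; flush as soon as the batch is full.
    void add(T record) {
        buffer.add(record);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    // Emit whatever is buffered as a single batch (no-op when empty).
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // copy so the sink owns its batch
        buffer.clear();
    }
}
```

With a batch size of 3, seven incoming events would produce two full batches
plus one final batch of a single record on an explicit `flush()`, i.e. three
commits instead of seven. A real setup would presumably also need a
time-based flush so that slow streams do not hold data indefinitely.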

Yufei


On Tue, May 21, 2024 at 6:06 AM Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi folks,
>
> I'm working on an Iceberg component for Apache Camel:
> https://github.com/jbonofre/iceberg/tree/CAMEL/camel/camel-iceberg/src/main
>
> Apache Camel is an integration framework, supporting a lot of
> components and EIPs (Enterprise Integration Patterns, like Content
> Based Router, Splitter, Aggregator, Content Enricher, ...).
> Camel is very popular in a lot of use cases, like IoT, system
> integration, event streaming, ...
>
> The component provides:
> - a Camel consumer endpoint (from) to read data from Iceberg
> tables/views (scan) and create a Camel exchange
> - a Camel producer endpoint (to) to write data (from Camel exchange)
> to Iceberg tables/views
>
> For instance, you can write a Camel route like this (using the
> spring/blueprint DSL for instance):
>
> <from uri="jms:queue:foo"/>
> <process ref="#convertToIcebergRecords"/> <!-- optional depending on
> the exchange message body -->
> <to uri="iceberg:my_table?catalog=#ref"/>
>
> This route is event-driven: it consumes messages from the foo JMS queue
> (from Apache ActiveMQ for instance) and writes each message body to
> the my_table Iceberg table (it's possible to use a router or multicast
> EIP to send the exchange to different tables).
> NB: for the from (consumer endpoint), you can use any Camel component
> (https://camel.apache.org/components/4.4.x/).
>
> You can also consume (scan) data from an Iceberg table, and send the
> generated Exchange to any endpoint/route:
>
> <from uri="iceberg:my_table?catalog=#ref"/>
> <process ref="#convertFromIcebergRecords"/> <!-- optional depending on
> the next steps in the route -->
> <wireTap uri="direct:tap"/>
> <to uri="mongodb:myDB?database=mydb&amp;collection=foo&amp;operation=insert"/>
>
> This route generates exchanges from the my_table Iceberg table, applies
> the Wire Tap EIP, and stores the data in a MongoDB database/collection.
>
> Although I started the component in the Iceberg repo, I think it would
> make more sense to host it in Camel (just as Apache Beam contains the
> Iceberg IO).
> Thoughts?
>
> Comments are welcome!
>
> NB: on a related topic, I created
> https://github.com/apache/iceberg/pull/10365
>
> Regards
> JB
>
