Re: [DISCUSS] camel-iceberg component

Manish Malhotra Tue, 21 May 2024 22:09:55 -0700

Can Camel be used as an alternative to Flink?


On Tue, May 21, 2024 at 10:17 AM Ryan Blue <b...@tabular.io> wrote:

> This is an interesting idea. What is the use case and where should this
> live? I'm unfamiliar with Camel and I'm not sure what the normal thing is.
> At least in the Iceberg community, we generally avoid adding connectors
> unless there is a clear use case and demand for them. We don't want to add
> code that needs to be maintained but isn't used.
>
> On Tue, May 21, 2024 at 10:15 AM Yufei Gu <flyrain...@gmail.com> wrote:
>
>> Hi JB,
>>
>> Thanks for sharing. Got a few questions:
>>
>>    1. Does Apache Camel rely on other engines, e.g., Spark or Flink for
>>    any processing, or is it fully self-contained?
>>    2. What are the potential challenges or limitations you foresee? For
>>    example, does it generate too many commits and/or small files
>>    considering its use cases (IoT, event streaming)? Can Camel cache ingested
>>    data and write it to the Iceberg table as a batch?
>>    3. How do you recommend handling schema evolution in Iceberg tables
>>    when integrating with Camel routes?
>>
>> Yufei
>>
>>
>> On Tue, May 21, 2024 at 6:06 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>> Hi folks,
>>>
>>> I'm working on an Iceberg component for Apache Camel:
>>>
>>> https://github.com/jbonofre/iceberg/tree/CAMEL/camel/camel-iceberg/src/main
>>>
>>> Apache Camel is an integration framework, supporting a lot of
>>> components and EIPs (Enterprise Integration Patterns, like Content
>>> Based Router, Splitter, Aggregator, Content Enricher, ...).
>>> Camel is very popular in a lot of use cases, like IoT, system
>>> integration, event streaming, ...
>>>
>>> This component provides:
>>> - a Camel consumer endpoint (from) to read data from Iceberg
>>> tables/views (scan) and create a Camel exchange
>>> - a Camel producer endpoint (to) to write data (from Camel exchange)
>>> to Iceberg tables/views
>>>
>>> For instance, you can write a Camel route like this (using the
>>> Spring/Blueprint XML DSL):
>>>
>>> <from uri="jms:queue:foo"/>
>>> <process ref="#convertToIcebergRecords"/> <!-- optional depending on
>>> the exchange message body -->
>>> <to uri="iceberg:my_table?catalog=#ref"/>
>>>
>>> This route is event driven: it consumes messages from the foo JMS queue
>>> (from Apache ActiveMQ for instance) and writes each message body to the
>>> my_table Iceberg table (it's possible to use the router or multicast
>>> EIPs to send the exchange to different tables).
>>> NB: for the from (consumer endpoint), you can use any Camel component
>>> (https://camel.apache.org/components/4.4.x/).
>>>
>>> You can also consume (scan) data from an Iceberg table, and send the
>>> generated Exchange to any endpoint/route:
>>>
>>> <from uri="iceberg:my_table?catalog=#ref"/>
>>> <process ref="#convertFromIcebergRecords"/> <!-- optional depending on
>>> the next steps in the route -->
>>> <wireTap uri="direct:tap"/>
>>> <to uri="mongodb:myDB?database=mydb&amp;collection=foo&amp;operation=insert"/>
>>>
>>> This route generates exchanges from the my_table Iceberg table, uses the
>>> wiretap EIP, and stores the data in a MongoDB database/collection.
>>>
>>> Although I started the component in the Iceberg repo, I think it would make
>>> more sense to host it in Camel (just as Apache Beam hosts the Iceberg
>>> IO).
>>> Thoughts?
>>>
>>> Comments are welcome!
>>>
>>> NB: on a related topic, I created
>>> https://github.com/apache/iceberg/pull/10365
>>>
>>> Regards
>>> JB
>>>
>>
>
> --
> Ryan Blue
> Tabular
>
