Re: [DISCUSS] camel-iceberg component

Jean-Baptiste Onofré Wed, 22 May 2024 00:58:08 -0700

Hi Manish

No, Camel is not an alternative to Spark or Flink: Camel is not a
query engine. It's more a "complement" to Kafka Connect.


Regards
JB

On Wed, May 22, 2024 at 7:09 AM Manish Malhotra
<manish.malhotra.w...@gmail.com> wrote:
>
> Can Camel be used as an alternative to Flink?
>
>
> On Tue, May 21, 2024 at 10:17 AM Ryan Blue <b...@tabular.io> wrote:
>>
>> This is an interesting idea. What is the use case and where should this 
>> live? I'm unfamiliar with Camel and I'm not sure what the normal thing is. 
>> At least in the Iceberg community, we generally avoid adding connectors 
>> unless there is a clear use case and demand for them. We don't want to add 
>> code that needs to be maintained but isn't used.
>>
>> On Tue, May 21, 2024 at 10:15 AM Yufei Gu <flyrain...@gmail.com> wrote:
>>>
>>> Hi JB,
>>>
>>> Thanks for sharing. Got a few questions:
>>>
>>> Does Apache Camel rely on other engines, e.g., Spark or Flink for any 
>>> processing, or is it fully self-contained?
>>> What are the potential challenges or limitations you foresee? For example, 
>>> does it generate too many commits and/or small files considering its use 
>>> cases (IoT, event streaming)? Can Camel cache ingestion data, and write it 
>>> to the Iceberg table as a batch?
>>> How do you recommend handling schema evolution in Iceberg tables when 
>>> integrating with Camel routes?
>>>
>>> Yufei
>>>
>>>
>>> On Tue, May 21, 2024 at 6:06 AM Jean-Baptiste Onofré <j...@nanthrax.net> 
>>> wrote:
>>>>
>>>> Hi folks,
>>>>
>>>> I'm working on an Iceberg component for Apache Camel:
>>>> https://github.com/jbonofre/iceberg/tree/CAMEL/camel/camel-iceberg/src/main
>>>>
>>>> Apache Camel is an integration framework, supporting a lot of
>>>> components and EIPs (Enterprise Integration Patterns, like Content
>>>> Based Router, Splitter, Aggregator, Content Enricher, ...).
>>>> Camel is very popular in a lot of use cases, like IoT, system
>>>> integration, event streaming, ...
>>>>
>>>> This provides a Camel component with:
>>>> - a Camel consumer endpoint (from) to read data from Iceberg
>>>> tables/views (scan) and create a Camel exchange
>>>> - a Camel producer endpoint (to) to write data (from Camel exchange)
>>>> to Iceberg tables/views
>>>>
>>>> For instance, you can write a Camel route like this (using the
>>>> Spring/Blueprint XML DSL):
>>>>
>>>> <from uri="jms:queue:foo"/>
>>>> <process ref="#convertToIcebergRecords"/> <!-- optional depending on
>>>> the exchange message body -->
>>>> <to uri="iceberg:my_table?catalog=#ref"/>
>>>>
>>>> This route is event-driven, consuming messages from the foo JMS queue
>>>> (from Apache ActiveMQ for instance), and writing the message body to
>>>> the my_table Iceberg table (it's possible to use a router or multicast
>>>> EIP to send the exchange to different tables).
>>>> NB: for the from (consumer endpoint), you can use any Camel component
>>>> (https://camel.apache.org/components/4.4.x/).
>>>>
>>>> You can also consume (scan) data from an Iceberg table, and send the
>>>> generated Exchange to any endpoint/route:
>>>>
>>>> <from uri="iceberg:my_table?catalog=#ref"/>
>>>> <process ref="#convertFromIcebergRecords"/> <!-- optional depending on
>>>> the next steps in the route -->
>>>> <wireTap uri="direct:tap"/>
>>>> <to uri="mongodb:myDB?database=mydb&amp;collection=foo&amp;operation=insert"/>
>>>>
>>>> This route generates exchanges from the my_table Iceberg table, uses the
>>>> wiretap EIP and stores the data into a MongoDB database/collection.
>>>>
>>>> Although I started the component in the Iceberg repo, I think it would
>>>> make more sense to have it at Camel (as Apache Beam contains the Iceberg
>>>> IO).
>>>> Thoughts?
>>>>
>>>> Comments are welcome!
>>>>
>>>> NB: on a related topic, I created 
>>>> https://github.com/apache/iceberg/pull/10365
>>>>
>>>> Regards
>>>> JB
>>
>>
>>
>> --
>> Ryan Blue
>> Tabular
