Hi Omar,

That's the plan (see the last section in my previous email). I just wanted to bring some attention to it in the Iceberg community :)
Regards
JB

On Wed, May 22, 2024 at 10:01 AM Omar Al-Safi <o...@oalsafi.com> wrote:
>
> IMO the Camel Iceberg component should live in the Camel repo. It can be part
> of the Camel components registry in Camel.
>
> On Wed, May 22, 2024 at 9:58 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>>
>> Hi Manish,
>>
>> No, Camel is not an alternative to Spark or Flink: Camel is not a
>> query engine. It's more a "complement" to Kafka Connect.
>>
>> Regards
>> JB
>>
>> On Wed, May 22, 2024 at 7:09 AM Manish Malhotra
>> <manish.malhotra.w...@gmail.com> wrote:
>> >
>> > Can Camel be used as an alternative to Flink?
>> >
>> >
>> > On Tue, May 21, 2024 at 10:17 AM Ryan Blue <b...@tabular.io> wrote:
>> >>
>> >> This is an interesting idea. What is the use case and where should this
>> >> live? I'm unfamiliar with Camel, so I'm not sure what the normal approach
>> >> is. At least in the Iceberg community, we generally avoid adding
>> >> connectors unless there is a clear use case and demand for them. We don't
>> >> want to add code that needs to be maintained but isn't used.
>> >>
>> >> On Tue, May 21, 2024 at 10:15 AM Yufei Gu <flyrain...@gmail.com> wrote:
>> >>>
>> >>> Hi JB,
>> >>>
>> >>> Thanks for sharing. Got a few questions:
>> >>>
>> >>> Does Apache Camel rely on other engines, e.g., Spark or Flink, for any
>> >>> processing, or is it fully self-contained?
>> >>> What are the potential challenges or limitations you foresee? For
>> >>> example, does it generate too many commits and/or small files
>> >>> considering its use cases (IoT, event streaming)? Can Camel cache
>> >>> ingested data and write it to the Iceberg table as a batch?
>> >>> How do you recommend handling schema evolution in Iceberg tables when
>> >>> integrating with Camel routes?
>> >>>
>> >>> Yufei
>> >>>
>> >>>
>> >>> On Tue, May 21, 2024 at 6:06 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>> >>> wrote:
>> >>>>
>> >>>> Hi folks,
>> >>>>
>> >>>> I'm working on an Iceberg component for Apache Camel:
>> >>>> https://github.com/jbonofre/iceberg/tree/CAMEL/camel/camel-iceberg/src/main
>> >>>>
>> >>>> Apache Camel is an integration framework, supporting a lot of
>> >>>> components and EIPs (Enterprise Integration Patterns, like Content
>> >>>> Based Router, Splitter, Aggregator, Content Enricher, ...).
>> >>>> Camel is very popular in a lot of use cases, like IoT, system
>> >>>> integration, event streaming, ...
>> >>>>
>> >>>> The component provides:
>> >>>> - a Camel consumer endpoint (from) to read data from Iceberg
>> >>>> tables/views (scan) and create a Camel exchange
>> >>>> - a Camel producer endpoint (to) to write data (from a Camel exchange)
>> >>>> to Iceberg tables/views
>> >>>>
>> >>>> For instance, you can write a Camel route like this (using the
>> >>>> Spring/Blueprint DSL):
>> >>>>
>> >>>> <from uri="jms:queue:foo"/>
>> >>>> <process ref="#convertToIcebergRecords"/> <!-- optional depending on
>> >>>> the exchange message body -->
>> >>>> <to uri="iceberg:my_table?catalog=#ref"/>
>> >>>>
>> >>>> This route is event driven, consuming messages from the foo JMS queue
>> >>>> (from Apache ActiveMQ, for instance) and writing the message body to
>> >>>> the my_table Iceberg table (it's possible to use the Router or Multicast
>> >>>> EIPs to send the exchange to different tables).
>> >>>> NB: for the from (consumer endpoint), you can use any Camel component
>> >>>> (https://camel.apache.org/components/4.4.x/).
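
For readers more used to Camel's Java DSL, the JMS-to-Iceberg route quoted above could look roughly like the sketch below. This is only an illustration: the iceberg: endpoint URI is the one shown in the thread (the component is still a work in progress), and the inline processor is just a placeholder for the hypothetical convertToIcebergRecords conversion.

import org.apache.camel.builder.RouteBuilder;

public class JmsToIcebergRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        // consume messages from the foo JMS queue (backed by Apache ActiveMQ, for instance)
        from("jms:queue:foo")
            // optional conversion step: turn the message body into Iceberg records
            // (placeholder for a convertToIcebergRecords processor/bean)
            .process(exchange -> { /* convert exchange.getMessage().getBody() here */ })
            // write the exchange body to the my_table Iceberg table
            .to("iceberg:my_table?catalog=#ref");
    }
}

As noted in the thread, any Camel consumer component could replace the jms: endpoint as the source of this route.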
>> >>>>
>> >>>> You can also consume (scan) data from an Iceberg table, and send the
>> >>>> generated exchange to any endpoint/route:
>> >>>>
>> >>>> <from uri="iceberg:my_table?catalog=#ref"/>
>> >>>> <process ref="#convertFromIcebergRecords"/> <!-- optional depending on
>> >>>> the next steps in the route -->
>> >>>> <wireTap uri="direct:tap"/>
>> >>>> <to uri="mongodb:myDB?database=mydb&amp;collection=foo&amp;operation=insert"/>
>> >>>>
>> >>>> This route generates exchanges from the my_table Iceberg table, uses the
>> >>>> Wire Tap EIP, and stores the data into a MongoDB database/collection.
>> >>>>
>> >>>> Although I started the component in the Iceberg repo, I think it would
>> >>>> make more sense to have it at Camel (just as Apache Beam contains the
>> >>>> Iceberg IO).
>> >>>> Thoughts?
>> >>>>
>> >>>> Comments are welcome!
>> >>>>
>> >>>> NB: on a related topic, I created
>> >>>> https://github.com/apache/iceberg/pull/10365
>> >>>>
>> >>>> Regards
>> >>>> JB
>> >>
>> >>
>> >> --
>> >> Ryan Blue
>> >> Tabular
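
Similarly, the Iceberg-to-MongoDB route in the quoted message above could be sketched in the Java DSL. Again this is only an illustration: the iceberg: and mongodb: endpoint URIs are taken from the thread, and the conversion processor is a placeholder for the hypothetical convertFromIcebergRecords bean.

import org.apache.camel.builder.RouteBuilder;

public class IcebergToMongoRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        // scan the my_table Iceberg table and create exchanges from the results
        from("iceberg:my_table?catalog=#ref")
            // optional conversion step, depending on what the next steps expect
            // (placeholder for a convertFromIcebergRecords processor/bean)
            .process(exchange -> { /* adapt the Iceberg records here */ })
            // Wire Tap EIP: send a copy of the exchange to the direct:tap route
            .wireTap("direct:tap")
            // insert the data into the foo collection of the mydb MongoDB database
            .to("mongodb:myDB?database=mydb&collection=foo&operation=insert");
    }
}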