It seems reasonable to keep camel-iceberg inside the Camel project, which already has many integration components. +1 for that.

On Wed, May 22, 2024 at 8:58 AM Ajantha Bhat <ajanthab...@gmail.com> wrote:

> +1,
>
> It is always good to have new ways to ingest data as an Iceberg table.
>
> - Ajantha
>
> On Wed, May 22, 2024 at 7:32 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
>> Hi Omar,
>>
>> That's the plan (see the last section in my previous email). I just
>> wanted to bring it to the attention of the Iceberg community :)
>>
>> Regards
>> JB
>>
>> On Wed, May 22, 2024 at 10:01 AM Omar Al-Safi <o...@oalsafi.com> wrote:
>> >
>> > IMO the Camel Iceberg component should live in the Camel repo. It can
>> > be part of the Camel components registry.
>> >
>> > On Wed, May 22, 2024 at 9:58 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>> >>
>> >> Hi Manish
>> >>
>> >> No, Camel is not an alternative to Spark or Flink: Camel is not a
>> >> query engine. It's more a "complement" to Kafka Connect.
>> >>
>> >> Regards
>> >> JB
>> >>
>> >> On Wed, May 22, 2024 at 7:09 AM Manish Malhotra
>> >> <manish.malhotra.w...@gmail.com> wrote:
>> >> >
>> >> > Can Camel be used as an alternative to Flink?
>> >> >
>> >> >
>> >> > On Tue, May 21, 2024 at 10:17 AM Ryan Blue <b...@tabular.io> wrote:
>> >> >>
>> >> >> This is an interesting idea. What is the use case and where should
>> >> >> this live? I'm unfamiliar with Camel and I'm not sure what the
>> >> >> normal thing is. At least in the Iceberg community, we generally
>> >> >> avoid adding connectors unless there is a clear use case and demand
>> >> >> for them. We don't want to add code that needs to be maintained but
>> >> >> isn't used.
>> >> >>
>> >> >> On Tue, May 21, 2024 at 10:15 AM Yufei Gu <flyrain...@gmail.com> wrote:
>> >> >>>
>> >> >>> Hi JB,
>> >> >>>
>> >> >>> Thanks for sharing. Got a few questions:
>> >> >>>
>> >> >>> Does Apache Camel rely on other engines, e.g., Spark or Flink, for
>> >> >>> any processing, or is it fully self-contained?
>> >> >>> What are the potential challenges or limitations you foresee? For
>> >> >>> example, does it generate too many commits and/or small files
>> >> >>> considering its use cases (IoT, event streaming)? Can Camel cache
>> >> >>> ingestion data, and write it to the Iceberg table as a batch?
>> >> >>> How do you recommend handling schema evolution in Iceberg tables
>> >> >>> when integrating with Camel routes?
>> >> >>>
>> >> >>> Yufei
>> >> >>>
>> >> >>>
>> >> >>> On Tue, May 21, 2024 at 6:06 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>> >> >>>>
>> >> >>>> Hi folks,
>> >> >>>>
>> >> >>>> I'm working on an Iceberg component for Apache Camel:
>> >> >>>> https://github.com/jbonofre/iceberg/tree/CAMEL/camel/camel-iceberg/src/main
>> >> >>>>
>> >> >>>> Apache Camel is an integration framework, supporting a lot of
>> >> >>>> components and EIPs (Enterprise Integration Patterns, like Content
>> >> >>>> Based Router, Splitter, Aggregator, Content Enricher, ...).
>> >> >>>> Camel is very popular in a lot of use cases, like IoT, system
>> >> >>>> integration, event streaming, ...
>> >> >>>>
>> >> >>>> This work provides a Camel component with:
>> >> >>>> - a Camel consumer endpoint (from) to read data from Iceberg
>> >> >>>>   tables/views (scan) and create a Camel exchange
>> >> >>>> - a Camel producer endpoint (to) to write data (from a Camel
>> >> >>>>   exchange) to Iceberg tables/views
>> >> >>>>
>> >> >>>> For instance, you can write a Camel route like this (using the
>> >> >>>> spring/blueprint DSL for instance):
>> >> >>>>
>> >> >>>> <from uri="jms:queue:foo"/>
>> >> >>>> <process ref="#convertToIcebergRecords"/> <!-- optional depending on the exchange message body -->
>> >> >>>> <to uri="iceberg:my_table?catalog=#ref"/>
>> >> >>>>
>> >> >>>> This route is event driven, consuming messages from the foo JMS
>> >> >>>> queue (from Apache ActiveMQ for instance), and writing the message
>> >> >>>> body to the my_table Iceberg table (it's possible to use router or
>> >> >>>> multicast EIPs to send the exchange to different tables).
>> >> >>>> NB: for the from (consumer endpoint), you can use any Camel
>> >> >>>> component (https://camel.apache.org/components/4.4.x/).
>> >> >>>>
>> >> >>>> You can also consume (scan) data from an Iceberg table, and send
>> >> >>>> the generated Exchange to any endpoint/route:
>> >> >>>>
>> >> >>>> <from uri="iceberg:my_table?catalog=#ref"/>
>> >> >>>> <process ref="#convertFromIcebergRecords"/> <!-- optional depending on the next steps in the route -->
>> >> >>>> <wireTap uri="direct:tap"/>
>> >> >>>> <to uri="mongodb:myDB?database=mydb&amp;collection=foo&amp;operation=insert"/>
>> >> >>>>
>> >> >>>> This route generates exchanges from the my_table Iceberg table,
>> >> >>>> uses the wiretap EIP and stores the data into a MongoDB
>> >> >>>> database/collection.
>> >> >>>>
>> >> >>>> Although I started the component in the Iceberg repo, I think it
>> >> >>>> would make more sense to have it in Camel (as Apache Beam contains
>> >> >>>> the Iceberg IO).
>> >> >>>> Thoughts?
>> >> >>>>
>> >> >>>> Comments are welcome!
>> >> >>>>
>> >> >>>> NB: on a related topic, I created
>> >> >>>> https://github.com/apache/iceberg/pull/10365
>> >> >>>>
>> >> >>>> Regards
>> >> >>>> JB
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Ryan Blue
>> >> >> Tabular