Re: REST catalog high availability

2024-12-18 Thread Fokko Driesprong
Hey Vladimir, Thanks for raising this thread. I'm also reluctant to add this to the application layer. We would also need to support this with the other clients that are out there. Did you give JB's suggestion around the PoolingHttpClientConnectionManager a try? Kind regards, Fokko On Tue, 17 Dec

Re: REST catalog high availability

2024-12-17 Thread Vladimir Ozerov
Hi Jean, Thanks for the response; I agree with all points. For reference, you mentioned Apache Ignite: I worked on it for many years and used to be an active committer/PMC member there. This project is a very good example of how multiple failures to keep the complexity under control significantly slow

Re: REST catalog high availability

2024-12-17 Thread Jean-Baptiste Onofré
Hi Vladimir, As I said in my previous email, I can already "inject" the PoolingHttpClientConnectionManager in the client, so technically speaking, I think it's doable. We can always document how to use that with several endpoints. I understand your points and they make sense. However, implem

Re: REST catalog high availability

2024-12-17 Thread Vladimir Ozerov
Hi, Thank you for the feedback. I understand the concerns about adding more and more features to the protocol, especially if they might be implemented elsewhere. Every added bit of complexity should have a clear cost/benefit ratio. Iceberg is becoming the de facto standard for multiple workload

Re: REST catalog high availability

2024-12-09 Thread Yufei Gu
Load balancing operates at a different layer than APIs, with various implementations available, such as etcd and ZooKeeper. I’d prefer to avoid introducing additional complexity at the web service API level. Yufei On Mon, Dec 9, 2024 at 8:35 AM Jean-Baptiste Onofré wrote: > Hi Vladimir, > > As

Re: REST catalog high availability

2024-12-09 Thread Jean-Baptiste Onofré
Hi Vladimir, As you said, today it's possible to use a LB in front of multiple instances (using nginx, ELB, ...). I think it's pretty easy to set up, and it sits at the "infrastructure" level. As it's possible to plug the Apache HttpClient 5 client into the Iceberg REST client, I think it's possible to inject PoolingHttpClientCo

Re: REST catalog high availability

2024-12-09 Thread Samrose Ahmed
In my opinion, this is unnecessary; it's well solved by load balancers/proxies. On Mon, Dec 9, 2024, 8:12 AM Vladimir Ozerov wrote: > Hi, > > Catalog is a critical part of Iceberg infrastructure and may require > highly available setup. In similar services (e.g., HMS, etc) this is often > done a

REST catalog high availability

2024-12-09 Thread Vladimir Ozerov
Hi, The catalog is a critical part of Iceberg infrastructure and may require a highly available setup. In similar services (e.g., HMS) this is often done as follows:
1. Start several service instances
2. Decide which one is the coordinator via etcd, ZooKeeper, Ratis, etc.
3. Expose HA endpoint