On Wed, Dec 13, 2023 at 18:03, Matteo Merli <mme...@apache.org> wrote:
>
> --
> Matteo Merli
> <mme...@apache.org>
>
>
> On Wed, Dec 13, 2023 at 8:20 AM Christophe Bornet <bornet.ch...@gmail.com>
> wrote:
>
> > Thanks Matteo for bringing up this subject.
> >
> > I share Lari's concerns regarding the move from glibc to musl in
> > terms of security, performance, and compatibility with the JVM.
> > Extensive performance tests will have to be done.
>
> Alpine is the *most* used base image across the board; thousands of
> projects are using it with Java.
>
> Barring the fact that, yes, extensive performance/stress/compatibility
> tests will be performed, can you share any specific security, performance
> or JVM compatibility issue?
>
> All the native libraries we are using, from Netty and RocksDB to the
> BookKeeper performance tricks, already provide musl versions or are
> compatible.
>
That's good to know.
For Netty, I think netty-transport-native-epoll is only built against
glibc
(https://netty.io/wiki/native-transports.html#using-the-linux-native-transport).
Is there a workaround?
Other than that, there is the DNS caching issue Lari mentioned.

> > Also, last time I tried to use Alpine with a Python project, it was a
> > nightmare as the support was very poor (libs not working with musl, no
> > wheels for musl). So we decided to move back to glibc. It was some
> > years ago so maybe the situation is better now, but it's worth checking
> > before imposing it on Python Function developers.
>
> Pulsar Python client (which is included in the image) already has pre-built
> binaries for Alpine that we publish both for x86-64 as well as for arm64.
> Same goes for all the dependencies.
>

My concern is for user Pulsar Functions.
For instance, numpy got musl wheels on PyPI only very recently. The first
release to have wheels for musl came out in June this year
(https://pypi.org/project/numpy/1.25.0/#files).
Compiling Python libraries can be a very tedious process when there are
no existing wheels for an environment.
Other popular libraries may not have their wheels yet. Or maybe they do,
I don't know. I just want to point out this aspect for consideration.

> > Maybe a debian-slim image could be considered as a thinner image than
> > ubuntu, even if not as thin as alpine?
>
> There are literally 100 CVEs open on the current debian-slim base image
> (just the base with nothing else installed), including HIGH and CRITICAL.
>
> debian:11-slim (debian 11.8)
> Total: 100 (UNKNOWN: 0, LOW: 72, MEDIUM: 15, HIGH: 11, CRITICAL: 2)
>
> Full list: https://gist.github.com/merlimat/65407426e1d1b1be5afdff62555470c2

Let's forget debian-slim 😄!

Note that I'm not at all against using Alpine. I just want to point out
some difficulties I had in past experiences. As I said, this was some
years ago and maybe Alpine can now be considered the go-to for Linux
Docker images.
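
To make the wheel question concrete, here is a minimal sketch of how one could check whether a given list of wheel files would install on Alpine without compiling. It relies only on the wheel filename convention: the platform tag is the last dash-separated field, and musl builds carry a `musllinux_*` tag (PEP 656). The package name `examplepkg` and the helper names `platform_tags` and `installs_on_musl_without_build` are hypothetical, chosen for illustration; the filenames would have to come from the project's actual file listing on PyPI.

```python
def platform_tags(wheel_filename: str) -> list[str]:
    """Return the platform tags encoded in a wheel filename."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {wheel_filename}")
    stem = wheel_filename[: -len(".whl")]
    # The platform tag is the last dash-separated field of the name.
    # A compressed tag set joins several platform tags with dots.
    return stem.rsplit("-", 1)[-1].split(".")


def installs_on_musl_without_build(wheel_filenames: list[str]) -> bool:
    """True if at least one wheel is usable on a musl-based system.

    Assumption for this sketch: only the platform tag is checked.
    The Python and ABI tags, and the musl version encoded in the
    tag, are ignored.
    """
    for name in wheel_filenames:
        for tag in platform_tags(name):
            # "any" marks a pure-Python wheel with no native code.
            if tag == "any" or tag.startswith("musllinux_"):
                return True
    return False


# Hypothetical file listings for an imaginary package.
glibc_only = [
    "examplepkg-1.0.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
]
with_musl = glibc_only + [
    "examplepkg-1.0.0-cp311-cp311-musllinux_1_1_x86_64.whl",
]

print(installs_on_musl_without_build(glibc_only))
print(installs_on_musl_without_build(with_musl))
```

A library that only publishes `manylinux` wheels would fall back to a source build on Alpine, which is the tedious case described above.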
> >
> > Regards
> >
> > Christophe
> >
> > On Wed, Dec 13, 2023 at 14:16, Lari Hotari <lhot...@apache.org> wrote:
> > >
> > > +1
> > >
> > > Before switching to Alpine completely, it would be worth running
> > > extensive system tests in production-like environments.
> > >
> > > Alpine comes with musl, which makes the JVM behave slightly differently.
> > >
> > > One of the common DNS issues with Alpine was fixed in May 2023 with the
> > > Alpine 3.18 release. Alpine finally got full DNS protocol support,
> > > which matters when there are DNS responses larger than 512 bytes [1].
> > >
> > > Alpine 3.18 comes with musl 1.2.4, which has TCP fallback in the DNS
> > > resolver. The official Kubernetes docs also contain the
> > > recommendation [2] to upgrade Alpine to 3.18+ (newest is currently
> > > 3.19) on Kubernetes. I'm no longer concerned about possible DNS
> > > resolution issues with Alpine.
> > >
> > > However, one remaining concern related to DNS is the lack of local DNS
> > > caching in Alpine. In Pulsar, most of the DNS resolution happens with
> > > Netty's DNS resolver, which has caching. I'm not sure what the broader
> > > impact could be when switching to Alpine, which doesn't have DNS
> > > caching at the OS level. In Kubernetes environments, most DNS lookups
> > > go through a lot of search domains, and that puts a lot of load on the
> > > DNS server unless clients do caching. It is possible to have a local
> > > caching DNS server in Alpine [1], but that doesn't seem to be very
> > > convenient.
> > >
> > > The third area where there are differences in musl is malloc. It's
> > > hard to know beforehand how the different malloc algorithm impacts the
> > > actual resident memory (RSS) usage. Different malloc algorithms handle
> > > memory fragmentation in different ways and there are behavioral
> > > differences. System testing could help verify the actual impact.
> > >
> > > -Lari
> > >
> > > 1 - https://bell-sw.com/blog/how-to-deal-with-alpine-dns-issues/#mcetoc_1gtd8v3lt2b
> > > 2 - https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known-issues
> > >
> > > On 2023/12/12 18:58:49 Matteo Merli wrote:
> > > > Hello,
> > > >
> > > > I've created a new proposal to switch Pulsar base docker images from
> > > > Ubuntu to Alpine Linux.
> > > >
> > > > Details and motivation in the PIP:
> > > > https://github.com/apache/pulsar/pull/21716
> > > >
> > > > Matteo
> > > >
> > > > --
> > > > Matteo Merli
> > > > <mme...@apache.org>
> > > >