On Wed, Dec 13, 2023 at 18:03, Matteo Merli <mme...@apache.org> wrote:
>
> --
> Matteo Merli
> <mme...@apache.org>
>
>
> On Wed, Dec 13, 2023 at 8:20 AM Christophe Bornet <bornet.ch...@gmail.com>
> wrote:
>
> > Thanks Matteo for bringing up this subject.
> >
> > I share the concerns of Lari regarding the move from glibc to musl in
> > terms of security, performance, and compatibility with the JVM. Extensive
> > performance tests will have to be done.
> >
>
> Alpine is the *most* used base image across the board, thousands of
> projects are using it with Java.
>
> Barring the fact that, yes, extensive performance/stress/compatibility
> tests will be performed, can you share any specific security, performance
> or JVM compatibility issue?
>
> All the native libraries we are using, from Netty, RocksDB, to BookKeeper
> performance tricks are already providing musl versions or are compatible.
>

That's good to know.
For Netty, I think netty-transport-native-epoll is only built against glibc
(https://netty.io/wiki/native-transports.html#using-the-linux-native-transport).
Is there a workaround?
Other than that, there is the DNS caching issue Lari mentioned.

>
>
> > Also, last time I tried to use alpine with a Python project, it was a
> > nightmare as the support was very poor (libs not working with musl, no
> > wheels for musl). So we decided to move back to glibc. It was some
> > years ago so maybe the situation is better now but it's worth checking
> > before imposing it on Python Function developers.
> >
>
> Pulsar Python client (which is included in the image) already has pre-built
> binaries for Alpine that we publish both for x86-64 as well as for arm64.
> Same goes for all the dependencies.
>
My concern is for user Pulsar Functions. For instance, numpy got musl
wheels on PyPI only very recently. The first release to ship wheels for
musl came out in June this year (https://pypi.org/project/numpy/1.25.0/#files).
Compiling Python libraries can be a very tedious process when there
are no existing wheels for an environment.
Other popular libraries may not have musl wheels yet. Or maybe they
do, I don't know. I just want to point out this aspect for consideration.
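
To make this easier to check for a given library, here is a minimal
sketch, assuming the standard wheel filename layout
(name-version[-build]-python-abi-platform.whl). It reports whether any
file in a release carries a musllinux platform tag. The function names
and the sample filenames are hypothetical; the real list of filenames
would have to be copied from the project's files page on PyPI.

```python
from typing import Iterable, List


def wheel_platform_tags(filename: str) -> List[str]:
    """Return the platform tags encoded in a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # A wheel name has 5 fields, or 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The last field is the platform tag; compressed tag sets are dot-separated.
    return parts[-1].split(".")


def has_musl_wheel(filenames: Iterable[str]) -> bool:
    """True if at least one wheel in the list targets musl (musllinux_*)."""
    return any(
        tag.startswith("musllinux_")
        for name in filenames
        if name.endswith(".whl")  # skip sdists and other non-wheel files
        for tag in wheel_platform_tags(name)
    )


# Hypothetical release files, for illustration only.
glibc_only = [
    "examplepkg-1.0.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "examplepkg-1.0.0.tar.gz",
]
with_musl = glibc_only + ["examplepkg-1.0.0-cp311-cp311-musllinux_1_1_x86_64.whl"]

print(has_musl_wheel(glibc_only))
print(has_musl_wheel(with_musl))
```

A release without a musllinux wheel would have to be compiled from source
inside the Alpine image, which is the tedious case I am worried about.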
>
> > Maybe a debian-slim image could be considered as a thinner image than
> > ubuntu, even if not as thin as alpine?
> >
>
> There are literally 100 CVEs open on the current debian-slim base image
> (just the base with nothing else installed), including HIGH and CRITICAL ones.
>
> debian:11-slim (debian 11.8)
> Total: 100 (UNKNOWN: 0, LOW: 72, MEDIUM: 15, HIGH: 11, CRITICAL: 2)
>
> Full list: https://gist.github.com/merlimat/65407426e1d1b1be5afdff62555470c2

Let's forget debian-slim 😄!


Note that I'm not at all against using Alpine. I just want to point out
some difficulties I ran into in the past.
As I said, this was some years ago, and maybe Alpine can now be
considered the go-to for Linux Docker images.

> >
> > Regards
> >
> > Christophe
> >
> > On Wed, Dec 13, 2023 at 14:16, Lari Hotari <lhot...@apache.org> wrote:
> > >
> > > +1
> > >
> > > Before switching to Alpine completely, it would be worth running
> > extensive system tests in production-like environments.
> > >
> > > Alpine comes with musl, which makes the JVM behave slightly differently.
> > >
> > > One of the common DNS issues with Alpine was fixed in May 2023 with the
> > Alpine 3.18 release. Alpine finally got full DNS protocol support that
> > impacts usage when there are DNS responses larger than 512 bytes [1].
> > >
> > > Alpine 3.18 comes with musl 1.2.4 with TCP fallback in the DNS resolver.
> > The official Kubernetes docs also contain the recommendation [2] to upgrade
> > Alpine to 3.18+ (newest is currently 3.19) on Kubernetes. I'm no longer
> > concerned about possible DNS resolution issues with Alpine.
> > >
> > > However, one remaining concern related to DNS is the lack of local DNS
> > caching in Alpine. In Pulsar, most of the DNS resolution happens with
> > Netty's DNS resolver that has caching. I'm not sure what the broader impact
> > could be when switching to Alpine that doesn't have DNS caching at the OS
> > level. In Kubernetes environments, most DNS lookups go through a lot of
> > search domains and it puts a lot of load on the DNS server unless clients
> > do caching. It is possible to have a local caching DNS server in Alpine
> > [1], but that doesn't seem to be very convenient.
> > >
> > > The third area where there are differences in musl is in malloc. It's
> > hard to know beforehand how the different malloc algorithm impacts the
> > actual resident memory (RSS) usage. Different malloc algorithms handle
> > memory fragmentation in different ways and there are behavioral
> > differences. System testing could help verify the actual impact.
> > >
> > > -Lari
> > >
> > > 1 -
> > https://bell-sw.com/blog/how-to-deal-with-alpine-dns-issues/#mcetoc_1gtd8v3lt2b
> > > 2 -
> > https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known-issues
> > >
> > > On 2023/12/12 18:58:49 Matteo Merli wrote:
> > > > Hello,
> > > >
> > > > I've created a new proposal to switch Pulsar base docker images from
> > Ubuntu
> > > > to Alpine Linux.
> > > >
> > > > Details and motivation in the PIP:
> > > > https://github.com/apache/pulsar/pull/21716
> > > >
> > > > Matteo
> > > >
> > > > --
> > > > Matteo Merli
> > > > <mme...@apache.org>
> > > >
> >
