According to Red Hat, their latest tagged release for UBI9.3, 9.3-1552, has four
moderate CVEs
(https://catalog.redhat.com/software/containers/ubi9/ubi/615bcf606feffc5384e8452e).
There is also the option of basing the Pulsar image on the UBI9-minimal image
(https://catalog.redhat.com/software/containers/ubi9/ubi-minimal/615bd9b4075b022acc111bf5),
which may have a better security footprint.

Thank You,

Alex Hall
<ah...@teknoluxion.com>

-----Original Message-----
From: Matteo Merli <matteo.me...@gmail.com>
Sent: Thursday, February 15, 2024 12:55 PM
To: dev@pulsar.apache.org
Subject: Re: Re: [DISCUSS] PIP-324: Alpine Docker images

Hi Alex,

the situation for UBI9 doesn't look much different from Ubuntu:

registry.access.redhat.com/ubi9/ubi (redhat 9.3)
Total: 166 (UNKNOWN: 0, LOW: 138, MEDIUM: 28, HIGH: 0, CRITICAL: 0)

Full list: https://gist.github.com/merlimat/ba96b91ea49709bb218ddc3906bb9e95
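
For a side-by-side view of the scan summaries quoted in this thread, here is a minimal Python sketch. It assumes only the "Total: N (SEVERITY: n, ...)" line format shown above; the helper names parse_summary and medium_or_worse are hypothetical, and the three summary lines are copied verbatim from the messages in this thread.

```python
import re

# Summary lines copied verbatim from this thread (labels are the image names given there).
SUMMARIES = {
    "apachepulsar/pulsar:3.2.0 (ubuntu 22.04)":
        "Total: 243 (UNKNOWN: 0, LOW: 146, MEDIUM: 93, HIGH: 4, CRITICAL: 0)",
    "registry.access.redhat.com/ubi9/ubi (redhat 9.3)":
        "Total: 166 (UNKNOWN: 0, LOW: 138, MEDIUM: 28, HIGH: 0, CRITICAL: 0)",
    "merlimat/pulsar:3.3.0-SNAPSHOT-f2a91a1 (alpine 3.19.1)":
        "Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)",
}


def parse_summary(line):
    """Parse one 'Total: N (SEV: n, ...)' line into (total, {severity: count})."""
    match = re.fullmatch(r"Total: (\d+) \((.*)\)", line.strip())
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    total = int(match.group(1))
    counts = {}
    for part in match.group(2).split(","):
        severity, count = part.split(":")
        counts[severity.strip()] = int(count)
    # Sanity check: the per-severity counts should add up to the stated total.
    if sum(counts.values()) != total:
        raise ValueError(f"counts do not add up to total in: {line!r}")
    return total, counts


def medium_or_worse(counts):
    """Number of findings rated MEDIUM, HIGH, or CRITICAL."""
    return sum(counts.get(sev, 0) for sev in ("MEDIUM", "HIGH", "CRITICAL"))


for image, line in SUMMARIES.items():
    total, counts = parse_summary(line)
    print(f"{image}: total={total}, medium_or_worse={medium_or_worse(counts)}")
```

Counting MEDIUM and above separately is just one way to slice the numbers; a security team may apply a different threshold.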


--
Matteo Merli
<matteo.me...@gmail.com>


On Thu, Feb 15, 2024 at 9:10 AM Alexander Hall <ah...@teknoluxion.com.invalid> 
wrote:

> Reviving a previous tangent from this discussion. Using UBI9 as a base
> is also a great option. Some end-users use that as a base and copy the
> files from the pulsar and pulsar-all containers as an upstream source.
>
> -Alex H
>
> -----Original Message-----
> From: Matteo Merli <matteo.me...@gmail.com>
> Sent: Wednesday, February 14, 2024 2:01 PM
> To: david.chris...@discordapp.com.invalid
> Cc: dev@pulsar.apache.org
> Subject: Re: Re: [DISCUSS] PIP-324: Alpine Docker images
>
> Reviving the discussion thread.
>
>
> > For Netty, I think netty-transport-native-epoll is only built
> > against glibc
> > (https://netty.io/wiki/native-transports.html#using-the-linux-native-transport).
> > Is there a workaround?
>
> Yes, there is a workaround for Netty. It works perfectly fine by
> including the glibc compatibility library. Same for the Kinesis producer
> (side note: the Kinesis SDK is the worst train wreck I've seen in many,
> many years: it's a C++ binary that is spawned from Java and communicates
> through a pipe... anyway, it works fine with the glibc compatibility lib).
>
> > Other than that, there is the DNS caching issue Lari mentioned.
>
> I think the DNS issue was already solved a few releases ago. In any
> case, it wouldn't affect Pulsar/BK since we use the Netty DNS client.
> In the same way, I believe that the JDK also doesn't use the
> glibc-provided DNS client: that's why we configure the DNS cache
> directly in the JVM configuration.
>
> >> - Using a smaller base image like Alpine can save space. The relative
> >> size of the JRE image for Alpine is about 45% smaller than the
> >> equivalent Ubuntu slim image.
> >> - The Ubuntu image has a few tens of CVEs in it, as reported by an
> >> automated container CVE scan tool, compared to 0 in Alpine.
> > These seem reasonable, but the true magnitude of benefit is likely
> > lower in practice. The pulsar-all images are 2.7GB in size, so saving
> > 166MB on the base + JRE install translates to just a 6% smaller image.
> > Unless we expect other installed packages part of pulsar-all to gain
> > additional space savings on Alpine, this difference seems very
> > marginal in practice.
>
> `pulsar-all` is a topic for a separate discussion (I actually think we
> should discontinue that image).
>
> For the `pulsar` image:
>  * apache/pulsar:3.2.0 (which already does not include Presto anymore): 919 MB
>  * Alpine image (WIP): 505 MB
>
> There are additional ways we should explore to further reduce the
> image size (e.g., removing unused JDK modules, Python packages, etc.).
>
> > Security-wise, I took a cursory look at the CVEs, and many of them
> > are in libraries that aren’t used in a Pulsar deployment/are
> > difficult to envision a practical exploit scenario. Automated
> > scanning tool results should be taken with a grain of salt - they
> > generate a lot of alerts, and many public container images throw off
> > these CVE alerts nowadays. The counterargument is that only a
> > fraction of the libraries indicated are even loaded at runtime, only
> > some fraction of those end up potentially being exploitable, and
> > only a smaller fraction have no fix/workaround. This isn’t to say
> > reducing the vulnerability surface by using an image with less cruft
> > in it is not a worthwhile endeavor -- I do think we should try to
> > tackle it -- but I’m simply trying to be realistic about what our
> > actual gains will be from switching to Alpine.
>
> Even though the CVEs might not be a "real" security issue, or might not
> be exploitable in the context of Pulsar, that is really not how any
> security team would look at it. From their perspective, it becomes
> unmanageable to check and understand every single CVE to assess the
> potential specific threat.
>
> This is a real problem, and it causes a lot of headaches in getting the
> Pulsar distribution taken seriously from a security-posture perspective.
>
> Just have a glance at the security CVE issues in our last Pulsar
> release, released just a few days ago:
>
> apachepulsar/pulsar:3.2.0 (ubuntu 22.04)
> Total: 243 (UNKNOWN: 0, LOW: 146, MEDIUM: 93, HIGH: 4, CRITICAL: 0)
>
> Compare with Pulsar image based on Alpine:
>
> merlimat/pulsar:3.3.0-SNAPSHOT-f2a91a1 (alpine 3.19.1)
> Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)
>
> Full list here:
> https://gist.github.com/merlimat/ee7534992b21cae0b04c8c63f64456ff
>
> The above are all issues coming from Ubuntu base image.
>
> > It’s also worth mentioning we’d be moving away from other large
> > open-source big data projects in a way. Spark [2], Flink [3], Kafka
> > [4], Elasticsearch [5], and Trino [6] are based on Temurin/Ubuntu/ubi.
> > In my brief search, I didn’t find familiar names of tools in the big
> > data ecosystem with official images based on Alpine.
> > Distroless would also remove almost everything from our base images,
> > minimizing space, reducing the vulnerability surface, and by
> > extension, reducing the CVE alerts from automated tooling. Apache
> > Druid [7] has used Distroless for a while in their official images. We
> > could achieve the same aims without any risk from musl/glibc, DNS
> > quirks, or other hiccups that Alpine may have.
>
>
> Regarding the OpenJDK distribution, the Amazon Corretto team
> publishes well-tested and supported Alpine packages. See
> https://aws.amazon.com/corretto
>
> I have created a WIP/draft PR to show the potential changes:
> https://github.com/apache/pulsar/pull/22054
>
> The image already passes all the integration tests and has been tested
> for a few weeks in a test cluster.
>
> I have pushed a Docker image for preview purposes:
> merlimat/pulsar:3.3.0-SNAPSHOT-f2a91a1
>
> https://hub.docker.com/layers/merlimat/pulsar/3.3.0-SNAPSHOT-f2a91a1/images/sha256-2d94832618bf30c02baa269bdf943c8f37aa5430258b7b4018f37ed120abb17a?context=explore
>
> Thanks,
> Matteo
>
> --
> Matteo Merli
> <matteo.me...@gmail.com>
>
>
> On Wed, Dec 20, 2023 at 12:49 PM David Christle
> <david.chris...@discordapp.com.invalid> wrote:
>
> > Are we sure the move to Alpine is worth the extensive performance
> > testing and the risk of issues? Sticking with a popular glibc image
> > like Temurin, Ubuntu/Debian, or ubi-minimal (mentioned also in this
> > discussion) seems like a better path to me, without the risk of
> > glibc vs musl issues. Using Distroless seems like another good
> > potential option, as it would achieve the same aims as the Alpine
> > move, with less potential risk.
> >
> > The DNS issues seen with Alpine are worth paying strong attention to.
> > Someone running a Pulsar deployment using the images could have a
> > very difficult time debugging library/glibc vs musl/DNS issues, due
> > to their low-level nature. A fix for the DNS issue only landed less
> > than a year ago [1]. Unless we have a compelling reason for Alpine,
> > it may be safer to wait for more adoption/testing before choosing it
> > for the official Pulsar images.
> >
> > The two main arguments in the PIP are:
> >
> > - Using a smaller base image like Alpine can save space. The
> > relative size of the JRE image for Alpine is about 45% smaller than
> > the equivalent Ubuntu slim image.
> >
> > - The Ubuntu image has a few tens of CVEs in it, as reported by an
> > automated container CVE scan tool, compared to 0 in Alpine.
> >
> >
> > These seem reasonable, but the true magnitude of benefit is likely
> > lower in practice. The pulsar-all images are 2.7GB in size, so
> > saving 166MB on the base + JRE install translates to just a 6% smaller
> > image. Unless we expect other installed packages part of pulsar-all to
> > gain additional space savings on Alpine, this difference seems very
> > marginal in practice.
> >
> > Security-wise, I took a cursory look at the CVEs, and many of them
> > are in libraries that aren’t used in a Pulsar deployment/are
> > difficult to envision a practical exploit scenario. Automated
> > scanning tool results should be taken with a grain of salt - they
> > generate a lot of alerts, and many public container images throw off
> > these CVE alerts nowadays. The counterargument is that only a
> > fraction of the libraries indicated are even loaded at runtime, only
> > some fraction of those end up potentially being exploitable, and
> > only a smaller fraction have no fix/workaround. This isn’t to say
> > reducing the vulnerability surface by using an image with less cruft
> > in it is not a worthwhile endeavor -- I do think we should try to
> > tackle it -- but I’m simply trying to be realistic about what our
> > actual gains will be from switching to Alpine.
> >
> > It’s also worth mentioning we’d be moving away from other large
> > open-source big data projects in a way. Spark [2], Flink [3], Kafka
> > [4], Elasticsearch [5], and Trino [6] are based on Temurin/Ubuntu/ubi.
> > In my brief search, I didn’t find familiar names of tools in the big
> > data ecosystem with official images based on Alpine.
> >
> > Distroless would also remove almost everything from our base images,
> > minimizing space, reducing the vulnerability surface, and by
> > extension, reducing the CVE alerts from automated tooling. Apache
> > Druid [7] has used Distroless for a while in their official images.
> > We could achieve the same aims without any risk from musl/glibc, DNS
> > quirks, or other hiccups that Alpine may have.
> >
> > Regards,
> > David
> >
> >
> > [1] https://gitlab.alpinelinux.org/alpine/tsc/-/issues/43#note_295556
> > [2] Apache Spark - Temurin -
> > https://github.com/apache/flink-docker/tree/master/1.18
> > [3] Apache Flink - Temurin -
> > https://github.com/apache/flink-docker/tree/master/1.18
> > [4] KIP-975: Docker Image for Apache Kafka - Temurin -
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-975%3A+Docker+Image+for+Apache+Kafka
> > [5] Elasticsearch - Ubuntu & ubi-minimal -
> > https://github.com/elastic/elasticsearch/blob/bdde29720a9e37224a90e5f186abbcbc73ff9351/distribution/docker/README.md
> > [6] Trino - ubi, after moving from Ubuntu -
> > https://hub.docker.com/layers/trinodb/trino/435/images/sha256-9540a785c31c4ba9ad099ad99ae06ccd5ccca506e39b7d557effe1482309e05d
> > [7] Apache Druid - Distroless -
> > https://github.com/apache/druid/blob/e373f6269251655f5be93ce895aee8dee8cc67dd/distribution/docker/Dockerfile#L4
> >
> >
> > On 2023/12/13 17:06:12 Matteo Merli wrote:
> > > I don't think the compatibility for downstream users is going to
> > > be a big problem:
> > >  1. Most users don't need to modify the Pulsar image in a
> > >     significant way
> > >  2. If they do, they won't be using the "latest" tag, but rather a
> > >     specific version
> > >  3. Users who are dependent on the Ubuntu base image can stay on
> > >     the 3.0 LTS release branch for the entire LTS lifespan
> > >
> > > I would avoid supporting 2 images at the same time because it
> > > would make it very hard to properly test them both.
> > >
> > >
> > > --
> > > Matteo Merli
> > > <mm...@apache.org>
> > >
> > >
> > > On Tue, Dec 12, 2023 at 8:57 PM Zixuan Liu <zi...@apache.org> wrote:
> > >
> > > > +1.
> > > >
> > > > It is a good idea to use the Alpine image to run Pulsar, as
> > > > it is more secure.
> > > >
> > > > However, switching images may affect downstream users, and I am
> > > > wondering if it is possible to provide multiple docker tags:
> > > >   - latest: using the Ubuntu image
> > > >   - alpine: using the Alpine image
> > > >
> > > > Thanks,
> > > > Zixuan
> > > >
> > > > On Wed, Dec 13, 2023 at 12:24, Yunze Xu <xy...@apache.org> wrote:
> > > >
> > > > > +1 from me. Alpine Linux is much more lightweight than Ubuntu.
> > > > >
> > > > > Thanks,
> > > > > Yunze
> > > > >
> > > > > On Wed, Dec 13, 2023 at 3:00 AM Matteo Merli
> > > > > <mm...@apache.org> wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > I've created a new proposal to switch Pulsar base docker
> > > > > > images from Ubuntu to Alpine Linux.
> > > > > >
> > > > > > Details and motivation in the PIP:
> > > > > > https://github.com/apache/pulsar/pull/21716
> > > > > >
> > > > > > Matteo
> > > > > >
> > > > > > --
> > > > > > Matteo Merli
> > > > > > <mm...@apache.org>
> > > > >
> > > >
> > >
>