I think we can start with our own project, and if this attempt stabilizes well, it might come to be considered a best practice and spread on its own. Since we will have a 1-week buffer of our own, we can at least avoid running into the issue ourselves, and if this approach sounds attractive to other projects, they will adopt it regardless of whether it is escalated or not.
That said, I'm not actively working on sub-projects of Spark, but I wonder what the main difference would be for sub-projects if they adopted this policy? It's just a question, not a push for enforcement. I mainly care about the main repository.

On Thu, Apr 23, 2026 at 7:53 AM Tian Gao via dev <[email protected]> wrote:

> So are you suggesting that we don't enforce this 1-week buffer for all
> Apache projects? I agree that a legitimate Apache project release is
> well-vetted and generally safe, but there could be situations where a
> release is maliciously executed by stealing identities of people who have
> access to make releases - that's where many supply chain attacks occur.
> Moreover, it would be more difficult to enforce this (whether for LLM or
> for human) to treat Apache projects differently. Also I think a 7-day delay
> to accept an Apache project release is not a big deal for us.
>
> Regarding the Spark-related projects, we don't need to enforce the policy
> for them.
>
> I think for supply chain attacks, we are defending ourselves not only
> against package developers, but more importantly, we are defending
> ourselves against potential loopholes in the release process. We must
> assume that there could be something wrong during the release process of
> any project.
>
> Tian
>
> On Wed, Apr 22, 2026 at 3:32 PM Dongjoon Hyun <[email protected]> wrote:
>
>> To be clear, this discussion should be applied to Apache Spark main
>> repository only.
>>
>> https://github.com/apache/spark
>>
>> It's because subprojects need to consume Apache Spark releases ASAP. For
>> example, Apache Spark K8s Operator will upgrade its dependency on the same
>> day of Apache Spark release because we trust our release process (including
>> vote).
>>
>> In addition, probably, we may want to extend our exceptions to include
>> all ASF project releases (Apache Hadoop, Avro, Parquet, ORC, Kafka, ...)
>> which have established community vote process.
>>
>> Dongjoon.
>>
>> On 2026/04/22 22:21:41 Dongjoon Hyun wrote:
>> > Thank you for the suggestion.
>> >
>> > +1 for the general predefined (1-week) grace-period policy sounds good to me.
>> >
>> > For the exception cases, I believe we can let the PMC members make the final decision on merge timing like the PMC members decides the `Blocker` level priority of JIRA issues already.
>> >
>> > If we have a voted policy, it would be great if we can add the policy to AGENTS.md explicitly to apply the policy from the PR steps.
>> >
>> > Best,
>> > Dongjoon.
>> >
>> > On 2026/04/22 20:47:24 Steve Loughran wrote:
>> > > 7 days is long enough to catch most (all?) malicious attacks.
>> > >
>> > > Regarding developers, there's a strong case to be made for only doing
>> > > builds and especially tests in isolated containers, even though artifacts
>> > > will leak across shared containers through a shared maven repo. It still
>> > > limits the damage malicious binaries can do.
>> > >
>> > > On Tue, 21 Apr 2026 at 23:58, Jungtaek Lim <[email protected]>
>> > > wrote:
>> > >
>> > > > +1
>> > > >
>> > > > We tend to consider that merging to master branch gives some time to bake
>> > > > before releasing. But we (Spark devs) are people who build Spark and
>> > > > run some tests against the master branch almost day to day. For us, there
>> > > > is literally no time for these library upgrades to be baked - we are
>> > > > exposed to any kind of potential CVE from these library upgrades.
>> > > >
>> > > > It's arguable whether we should stay up to date with the recent release
>> > > > version for dependencies, but that'd probably be uneasy to make consensus;
>> > > > there is a clear trade-off. The current proposal sounds to me as a good
>> > > > compromise - IMHO delaying by 2 weeks (14 days) seems reasonable, but
>> > > > strict 1 week (7 days) is better than nothing if anyone is concerned 2
>> > > > weeks is too long.
>> > > >
>> > > > On Tue, Apr 21, 2026 at 9:45 PM Szehon Ho <[email protected]> wrote:
>> > > >
>> > > >> +1 make sense to me as well. We should of course be fast for security
>> > > >> upgrades, but make sense to avoid such eager upgrades for the rest of
>> > > >> the hundreds of Spark dependencies, due to the increased supply chain
>> > > >> attack risks in the ecosystem.
>> > > >>
>> > > >> Thanks
>> > > >> Szehon
>> > > >>
>> > > >> On Tue, Apr 21, 2026 at 3:32 AM Wenchen Fan <[email protected]> wrote:
>> > > >>
>> > > >>> Thanks for starting this discussion! I did a data analysis a while ago
>> > > >>> but didn't have time to act on it. The analysis shows:
>> > > >>>
>> > > >>> *58* maven dep upgrades in the last 3 months.
>> > > >>> *46%* (27/58) within 7 days of release
>> > > >>> ≤7d    : 27 / 58 (47%)
>> > > >>> 8d–30d : 12 / 58 (21%)
>> > > >>> >30d   : 19 / 58 (32%)
>> > > >>>
>> > > >>> You can find the raw data in the attached file. This does look a bit
>> > > >>> aggressive. I build Spark locally everyday, and I believe I'm not the only
>> > > >>> one. Having a couple of weeks as the buffer time is a good idea to protect
>> > > >>> developers like me from potential supply chain attacks.
>> > > >>>
>> > > >>> On Tue, Apr 21, 2026 at 6:24 AM Hyukjin Kwon <[email protected]>
>> > > >>> wrote:
>> > > >>>
>> > > >>>> SGTM I think it's good practice to give a couple of weeks before the
>> > > >>>> upgrade
>> > > >>>>
>> > > >>>> On Tue, 21 Apr 2026 at 07:13, Tian Gao via dev <[email protected]>
>> > > >>>> wrote:
>> > > >>>>
>> > > >>>>> Hi, I want to start a discussion about our dependency upgrade policy
>> > > >>>>> for active development.
>> > > >>>>>
>> > > >>>>> Our current dependency upgrade (mostly for Java, but Python should be
>> > > >>>>> included too) is a bit spontaneous. People find that a dependency has a new
>> > > >>>>> version available and we just do the upgrade.
>> > > >>>>>
>> > > >>>>> This raises concerns about potential supply chain attacks. We already
>> > > >>>>> established a few sets of rules (including pinning the github action
>> > > >>>>> versions) to avoid the supply chain attack, but manually upgrading the
>> > > >>>>> dependency version too eagerly could also be risky.
>> > > >>>>>
>> > > >>>>> It normally takes time for a bad release to be recognized, so I think
>> > > >>>>> we should set a buffer time before upgrading to the latest version. For
>> > > >>>>> example, we can wait a week or two after the latest release before we set
>> > > >>>>> our development dependency to it. This could reduce the possibility of
>> > > >>>>> being impacted by malicious releases, or just give them enough time to fix
>> > > >>>>> their own severe bugs.
>> > > >>>>>
>> > > >>>>> The cost for this policy is very low - it barely impacts us if we
>> > > >>>>> can’t use the “latest” version of dependencies.
>> > > >>>>>
>> > > >>>>> Of course, there should be exceptions when dependency upgrades include
>> > > >>>>> security fixes for known vulnerabilities; we should upgrade as fast as
>> > > >>>>> possible.
>> > > >>>>>
>> > > >>>>> Tian
>> > > >>>>>
>> > > >>>>
>> > > >>> ---------------------------------------------------------------------
>> > > >>> To unsubscribe e-mail: [email protected]
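
The grace-period rule discussed in this thread (wait about a week after a dependency release, except when the upgrade fixes a known vulnerability) could be checked mechanically. The following is only a minimal sketch under stated assumptions: the names `GRACE_PERIOD` and `may_upgrade` are hypothetical, the 7-day value is taken from the "1 week" figure above rather than from any voted policy, and nothing here is an existing Spark tool.

```python
from datetime import date, timedelta

# Assumed value: the thread mentions both 7 and 14 days; 7 is used here.
GRACE_PERIOD = timedelta(days=7)


def may_upgrade(released_on: date, today: date,
                fixes_known_vulnerability: bool = False) -> bool:
    """Hypothetical helper: is a dependency release old enough to adopt?

    Security fixes for known vulnerabilities bypass the waiting period,
    mirroring the exception proposed in the thread.
    """
    if fixes_known_vulnerability:
        return True
    return today - released_on >= GRACE_PERIOD


if __name__ == "__main__":
    today = date(2026, 4, 23)
    print(may_upgrade(date(2026, 4, 20), today))  # released 3 days earlier
    print(may_upgrade(date(2026, 4, 10), today))  # released 13 days earlier
    print(may_upgrade(date(2026, 4, 22), today, fixes_known_vulnerability=True))
```

Whether PMC-approved exceptions or ASF project releases should also bypass the check is left open here, as it is in the discussion.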
