Yep. All for moving forward fast, even if it means some of the non-MUST things get deferred. I tried to sort things into "MUST/SHOULD/NICE" to make it clear what the highest priority is - I think you got it mostly right. The "_" thing is really about making sure that the right tools (flit in this case) have min-versions set, recent enough to produce good naming.
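A minimal sketch of the "_" naming rule, following the normalization described in PEP 625 and the wheel naming spec: runs of "-", "_" and "." collapse into a single "_" and the name is lowercased. The helper names `normalize_dist_name` and `expected_sdist_filename` are made up for illustration, and the version string is passed through untouched (no PEP 440 normalization is attempted here).

```python
import re


def normalize_dist_name(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into one '_' and lowercase the result."""
    return re.sub(r"[-_.]+", "_", name).lower()


def expected_sdist_filename(name: str, version: str) -> str:
    """Build the sdist file name from a project name and an already-final version."""
    return f"{normalize_dist_name(name)}-{version}.tar.gz"


if __name__ == "__main__":
    # Prints the underscore form that recent flit/hatch/build produce.
    print(expected_sdist_filename("apache-burr", "0.41.0-incubating"))
```

A check like this could run in a release script to catch an sdist produced by tooling that is too old.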
This is one more thing that I should have stressed - I think **some** part of it is how you make sure the tooling is recent enough. The approach in Airflow is that we ALWAYS have a min version set for a dependency. "Released 6 months ago" is a good rule of thumb. I think that if you follow the "env setup" and the env has some "min versions", that solves at least some of the issues I found in the release. The "_" in the name is definitely going to be fixed if the min-version is set appropriately.

J.

On Sun, Nov 30, 2025 at 8:27 PM Elijah ben Izzy <[email protected]> wrote:

> @Jarek -- thanks, this is very clear (and absolutely worth getting right even if it means a delay in release).
>
> I want to sum up just to make sure I understand the high-level -- here are the themes I'm picking up:
>
> 1. License on everything -- JSON is the exception but that's why we have .rat excludes (tpl looked like another JSON which I think is why I missed it)
> 2. Underscores versus dashes for consistency to avoid trouble later on
> 3. A consistent, documented internal opinion on what *is* and *isn't* source
> 4. Clean up weird stuff (I.E. the bento burr submodule, that probably should just be removed from the repo)
> 5. Anything to make development + verification easier on the developer (and any downstream consumer of the source)
> 6. More documentation overall
>
> I think (3) and (6) are pretty big value-adds, I.E. where I should focus some time. High-level, for this project, I want to throw this out:
> 1. Docs are *not* source -- not included in distribution
> 2. Tests *are* source -- why? These let the developer download + run in a self-verification attempt
>
> This is pretty justifiable IMO:
>
> *Our best practice across the various projects I maintain is to always run the tests from the source on the installed wheel.
> For each Python version and platform in our CI matrix, we do a clean sdist and wheel build (with build, of course), install the wheel in a clean env, and then run the tests from the source against that, using either a src dir, tox and/or python -I, plus pytest --import-mode=importlib, to ensure isolation from the source tree and we’re always using the installed copy. It isn’t as important that the tests themselves work when packaged for end-distribution, but rather that the code under test works.*
>
> Going to take a bit of time later/this week to prep this and might reach out with more questions. Otherwise I'll also be reading over the resources to ensure that nothing slipped up.
>
> Cheers,
> Elijah
>
> On Sun, Nov 30, 2025 at 8:31 AM Jarek Potiuk <[email protected]> wrote:
>
> > -1 for now, sorry.
> >
> > Reviewed:
> >
> > * signatures OK
> > * checksums OK
> > * licences NOK
> > * reproducibility from sources
> >
> > I think there is the .gitmodule problem that should be solved, also lack of -source.tar.gz explicitly is not really good I think..
> >
> > Several reasons:
> >
> > 1) Lack of explicit source package (this is "almost -1" for me, because formally speaking the .sdist package is fulfilling the letter of the source package, but IMHO it does not necessarily fulfill the spirit).
> >
> > I think it's not very clear which package is "source" and which are "convenience/binary" packages. From what I see, the .tar.gz is **something between** source package and the .sdist. It **looks** like an sdist package (with PKG_INFO) - but also it contains "tests" - which is unusual for sdist packages (however there is a big debate about it [1]). The requirement for "source" packages published by the ASF is that it contains all the sources needed to build code and tests [2] (which your .sdist file has, so that's cool) - it seems to some extent it follows the expectation.
> > I think it must be clear which of the packages is the "-source" one, and naming it like that and keeping it separate from .sdist is a good idea.
> >
> > We also in Airflow - for quite a while - took some of our .sdist files as "source" releases when we released only some of the distributions that are part of the monorepo. When we did it in the past - in Airflow we explicitly mentioned in our emails that those .sdist packages are the "source" packages as expected by the ASF [3]. But eventually we entirely gave up on it (a few weeks ago), because we opted in to include essentially **everything** that is in the source repo of ours (we are essentially using git archive to produce the source-tar.gz). The main reason was that if we **only** release .sdist, some of our important code (such as sources for docs) was not published when we released only .sdists.
> >
> > The .sdist of yours misses quite a number of files from the repo:
> >
> > * big number of examples
> > * docs sources - I think this is an important miss - while docs are
> > * telemetry folder
> > * .github and .gitmodules (are those gitmodules necessary to build the project?)
> >
> > It's likely that those files are excluded deliberately and something that you do not **want** to release at all, but I find it a bit strange to remove docs and many examples. It seems that those who unpack sources from the official source package cannot do all the same things as people who check it out from repo TAG. If someone takes it as "source" and never looks at the GitHub repo - they will miss important sources (like docs sources) that IMHO is something that the users **should** have. Generally users should be able to do the same with the "-source.tar.gz" as what they can when they do `git checkout TAG` in your repo.
> >
> > The AI-generated (undoubtedly but that's ok ;) documentation in README.md describes what goes in and out but it does not explain WHY. I think if you **really** want to exclude some files from your source distribution you should explain WHY in the documentation.
> >
> > Just to add a bit of context. You might think that the "-source.tar.gz" file is not that important, as nearly nobody will use it. Which is a fair assessment ("nearly nobody") - but those who do are the important users - those are downstream packagers, who might want to include burr in distros for example. Many of the distros that are out there use the officially signed and checksummed packages to build and install their packages. For example this is what conda might want to do. Or Debian maintainers. Those are important users and we need to make sure that they can do it easily. That's the safest bet to produce explicitly "-source.tar.gz" as a "git archive" result IMHO - and not exclude things that you would normally commit to the repo (note that you can have generated code committed to your repo - and there is "no compiled code in your repo" - so that would probably be the only thing to exclude (if your build process rebuilds those generated files automatically)). This can be done via .gitattributes, as we do in Airflow [4].
> >
> > 2) The .gitmodules thing is the final reason why I gave -1. I am not sure - it's not clear - whether the BentoBurr mentioned there is needed to build the project or not. This project is not only archived, but also misses LICENCE information, so while it is actually **excluded** from .sdist package, I think it should be either removed from the repo or included in -sources.tar.gz - generally an ASF project should not depend on any project which has an unknown licence.
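The "git archive" route for the "-source.tar.gz" can be sketched as below. This is only an illustration under assumptions: the artifact name in the usage note is hypothetical, and the helper names are made up. `git archive` itself skips paths marked `export-ignore` in .gitattributes and does not pull in submodule contents.

```python
import subprocess


def git_archive_command(tag: str, prefix: str, output: str) -> list[str]:
    """Return the git command that archives everything tracked at `tag`."""
    return [
        "git",
        "archive",
        "--format=tar.gz",
        f"--prefix={prefix}/",  # top-level directory inside the tarball
        f"--output={output}",
        tag,
    ]


def build_source_tarball(tag: str, prefix: str, output: str) -> None:
    """Run git archive in the current repository; raises if git fails."""
    subprocess.run(git_archive_command(tag, prefix, output), check=True)
```

Run from a checkout of the repo, a call such as `build_source_tarball("v0.41.0", "apache-burr-0.41.0-incubating-source", "apache-burr-0.41.0-incubating-source.tar.gz")` would write a tarball matching what `git checkout TAG` gives, minus anything marked `export-ignore`.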
> >
> > 3) At least in Airflow we are using `shasum -a 512 FILE` and it produces SHASUM + name of the file, which I think is a good idea to have in the .sha512 file. Also something that can be improved in the future.
> >
> > The Shasum are good, but when I diff on what shasum produces, we have this:
> >
> > < 77ad9cf9ddf508645d094ae18efce76482ff86339ffd2cd9dfe46af5d0545bdfa949c00ccc7beb3f6ae5f2c65523cc1a3db9a7425921c86fde5c4d54eb893111  apache_burr-0.41.0-py3-none-any.whl
> > ---
> > > 77ad9cf9ddf508645d094ae18efce76482ff86339ffd2cd9dfe46af5d0545bdfa949c00ccc7beb3f6ae5f2c65523cc1a3db9a7425921c86fde5c4d54eb893111
> > Checking apache-burr-0.41.0-incubating.tar.gz.sha512
> > 1c1
> > < 2e755584eb71fcede377d92f67024e3694cee4729da55e8b8d5b8739388c9046438e40cd2428003cca1e11a7b40abb897371d608db1ce3c0638d266c3de2c50a  apache-burr-0.41.0-incubating.tar.gz
> > ---
> > > 2e755584eb71fcede377d92f67024e3694cee4729da55e8b8d5b8739388c9046438e40cd2428003cca1e11a7b40abb897371d608db1ce3c0638d266c3de2c50a
> >
> > 4) files with unknown licences in the .sdist file (since it looks like -sources). This is also quite hard -1 because of the .tpl file.
> >
> > There are a number of files with unapproved licenses (I unpacked the .tar.gz and downloaded and ran the https://dist.apache.org/repos/dist/release/creadur/apache-rat-0.17/ on the directory). While I understand why .jsonl files do not have licence (json cannot contain comments), the best way to deal with that is to add .rat-excludes file in your repo - see Airflow one [5] and make it part of the source package. This way you can add -E .rat-excludes and it will exclude those files from check. The .tpl file seems to be a JINJA template and those files allow for comments and can easily embed license information that will be excluded in the final generated json file.
> >
> > ! Unapproved: 23 A count of unapproved licenses.
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-1-giraffe/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-2-geography/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-3-physics/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-4-philosophy/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-5-jokes/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-6-demonstrate-errors/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-1-giraffe/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-2-geography/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-3-physics/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-4-philosophy/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-5-jokes/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-6-demonstrate-errors/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-1-food/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-2-work-history/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-3-activities/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-4-everything/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-1/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-10/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-100/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-42/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-50/log.jsonl
> > ! /burr/tracking/server/s3/deployment/terraform/templates/ecs/burr_app.json.tpl
> >
> > 5) Bad naming of `sdist` file.
> >
> > I am not sure how you produced the .sdist file (again - no release instructions) but when I tried to build it and compare what's in my .sdist and your .sdist, I got it quite different because the name of my package (tried it with flit, hatch and build packages) is (correctly) *apache_burr-0.41.0-incubating.tar.gz* and yours was *apache-burr-0.41.0-incubating.tar.gz*. We used to have the same in Airflow and it caused us some serious problems when it comes to links to our .sdist packages, and general difference of .whl vs. sdist. **Some** old tooling used to produce such names (old setuptools and old flit) but this has since been properly implemented by both. The thing is that the .sdist package name SHOULD contain the normalized distribution name - which replaces all sequences of "_-." with a single "_" and lowercases it [6] (unlike package names in PyPI, this follows the binary wheel naming normalization, which uses "_" rather than "-" in the package name [7]).
> >
> > 6) Easier setup of the env
> >
> > I noticed some small issue with the env when preparing the release - missing `cli` extra when setting up the venv to build release. I fixed it in [8] - also proposed a small addition of dev dependency group (might split it if needed) and proposed that you might use some more modern standardised features of packaging like dependency groups and inline script metadata. See details in the PR - we can discuss it there.
> >
> > 7) Reproducibility from sources:
> >
> > I tried to rebuild both .sdist and .whl package following the instructions and initially I have not compiled the UI and got them missing (of course) - I understand that full automation with custom build hook is deferred for later (which is OK) - but (as expected) the files in the package have different mtime.
> > This can be easily fixed with hard-coding the SOURCE_DATE_EPOCH variable before the build [9] and since you are already using instructions and scripts, that should be an easy addition in your docs. In airflow we have a prek commit that automatically regenerates the date when release notes change but at the beginning the mtime to be used can be simply hard-coded to basically any date. This way whoever follows your release process will have it closer to a truly reproducible package and diffoscope will start showing useful diffs in case there are some [10]
> >
> > Summary of things:
> >
> > MUST
> > * .tpl licence adding - 4)
> > * explain (or likely remove) the .gitmodule BentoBurr reference - 2)
> > * explicit rules in docs about why you exclude certain files from source package - 1)
> > * separate -source.tar.gz package with all files including docs and likely all files (subject to rules about exclusion above) - 1)
> >
> > SHOULD:
> > * proper naming of sdist artifacts (with _) (needs newer flit simply and doc update) - 5)
> > * add .rat-excludes that will allow to use RAT to verify the official source packages - 4)
> >
> > NICE TO HAVE:
> > * shasum with filename - 3)
> > * simplify the env setup with inline metadata, dev dependency groups (support for those already in uv, hatch and others) - 6)
> > * reproducibility setup - 7)
> >
> > [1] Debate about whether "tests" and "docs" should be included in .sdist https://discuss.python.org/t/should-sdists-include-docs-and-tests/14578/26
> > [2] What should be included in source packages of ASF - https://www.apache.org/legal/release-policy.html#source-packages
> > [3] Example email where Airflow PMC explicitly pointed to .sdist packages being "source" packages (see the description of .sdist files) https://lists.apache.org/thread/8ob972qkd7sy6k1pn5nskc2x0yjx2t2y
> > [4] The .gitattributes file in Airflow repo https://github.com/apache/airflow/blob/main/.gitattributes
> > [5] RAT excludes in Airflow repo https://github.com/apache/airflow/blob/main/.rat-excludes
> > [6] PEP-625 Filename of a Source Distribution - https://peps.python.org/pep-0625/
> > [7] Binary packages distribution name normalization - https://packaging.python.org/en/latest/specifications/binary-distribution-format/#escaping-and-unicode
> > [8] PR to fix missing cli extra and improving dev-env to use it https://github.com/apache/burr/pull/604
> > [9] Flit reproducibility https://flit.pypa.io/en/stable/reproducible.html
> > [10] Diffoscope - tool to show reproducibility issues https://diffoscope.org/
> >
> > J.
> >
> > On Sun, Nov 30, 2025 at 5:02 AM Elijah ben Izzy <[email protected]> wrote:
> >
> > > Hi all! Trying again!
> > >
> > > This is a call for a vote on releasing Apache Burr 0.41.0-incubating Release Candidate 2.
> > >
> > > This release includes the following changes (see CHANGELOG for details).
> > > See all commits since prior release:
> > > - https://github.com/apache/burr/compare/burr-0.40.2...main
> > >
> > > Key changes include:
> > > - pool-based async PG persister
> > > - multiple UI updates
> > > - Apache compatible licenses/build processes
> > > - bug fixes, typing, etc...
> > >
> > > The artifacts for this release candidate can be found at:
> > > https://dist.apache.org/repos/dist/dev/incubator/burr/0.41.0-incubating-RC2/
> > >
> > > The Git tag to be voted upon is: v0.41.0
> > >
> > > The release hash is 11783ba58f8c5bd161118976ced791a2f5bd78f3
> > >
> > > Release artifacts are signed with the following key:
> > > BB8B72B34AB9A664A109AA17A76CF4C80E4E5355
> > > The KEYS file is available at:
> > > https://downloads.apache.org/incubator/burr/KEYS
> > >
> > > Please download, verify, and test the release candidate. For testing use your best judgement. The following may suffice:
> > >
> > > 1. Build/run the UI following the instructions in scripts/README.md
> > > 2. Run the tests in tests/
> > > 3. Import into a jupyter notebook and play around
> > >
> > > Highly encourage you to pip install from source, run `burr` and play with the UI (some UI bugs I recently discovered will be filed)
> > >
> > > The vote will run for a minimum of 72 hours.
> > > Please vote:
> > >
> > > [ ] +1 Release this package as Apache Burr 0.41.0-incubating
> > > [ ] +0 No opinion
> > > [ ] -1 Do not release this package because... (Please provide a reason)
> > >
> > > Checklist for reference:
> > > [ ] Download links are valid.
> > > [ ] Checksums and signatures.
> > > [ ] LICENSE/NOTICE files exist
> > > [ ] No unexpected binary files
> > > [ ] All source files have ASF headers
> > > [ ] Can compile from source
> > >
> > > On behalf of the Apache Burr PPMC,
> > >
> > > Elijah ben Izzy ([email protected])
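For the "Checksums and signatures" item in the checklist, here is a minimal sketch of a SHA-512 check in Python. It assumes the `.sha512` file starts with the hex digest, with or without a trailing file name. The function names are illustrative and not part of any Burr or ASF tooling. `shasum_line` mimics the "digest, two spaces, file name" layout that `shasum -a 512 FILE` prints, using only the base name of the file.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path) -> str:
    """Hex SHA-512 digest of a file, read in chunks to keep memory flat."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def shasum_line(path: Path) -> str:
    """Format the digest as digest + two spaces + file name."""
    return f"{sha512_of(path)}  {path.name}"


def verify(artifact: Path, checksum_file: Path) -> bool:
    """Compare the artifact's digest with the first token of the checksum file."""
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

Because `verify` only reads the first token, it accepts both a bare digest and a digest followed by a file name.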
