Looks good to me!

On Wed, Dec 24, 2025, 7:53 AM Elijah ben Izzy <[email protected]> wrote:
> @Jarek + others -- I wanted to get your take on this. I'm pushing to get it out soon (finally have a bit more focus time) and I drafted this. What does the pod think about what is/isn't source?
>
> I'm generally happy with this, but I think the key part is that this is documented/clear. See table at the top:
>
> https://github.com/apache/burr/blob/20f7790dd60cec8e397184cfe0b2aaa564f49f48/scripts/README.md
>
> On Sun, Nov 30, 2025 at 12:58 PM Jarek Potiuk <[email protected]> wrote:
>
> > Yep. All for moving forward fast even if it means some of the things that are non-MUST will be deferred. I tried to sort things into "MUST/SHOULD/NICE" to make it clear what is the highest priority - I think you got it mostly right. The "_" thing is really about making sure that the right tools (flit in this case) have some min-versions - recent enough to produce good naming.
> >
> > This is one more thing that I should have stressed - I think **some** part of it is the way you make sure the tooling is recent enough. The approach in Airflow is that we ALWAYS have a min version set for a dependency. "Released 6 months ago" is a good "rule of thumb". I think that if you follow the "env setup" and the env has some "min versions" - that solves at least some part of the issues I found in the release. The "_" in the name is definitely going to be fixed if min-version is set appropriately.
> >
> > J.
> >
> > On Sun, Nov 30, 2025 at 8:27 PM Elijah ben Izzy <[email protected]> wrote:
> >
> > > @Jarek -- thanks, this is very clear (and absolutely worth getting right even if it means a delay in release).
> > >
> > > I want to sum up just to make sure I understand the high-level -- here are the themes I'm picking up:
> > >
> > > 1. License on everything -- JSON is the exception but that's why we have .rat excludes (tpl looked like another JSON which I think is why I missed it)
> > > 2. Underscores versus dashes for consistency to avoid trouble later on
> > > 3. A consistent, documented internal opinion on what *is* and *isn't* source
> > > 4. Clean up weird stuff (i.e. the bento burr submodule, that probably should just be removed from the repo)
> > > 5. Anything to make development + verification easier on the developer (and any downstream consumer of the source)
> > > 6. More documentation overall
> > >
> > > I think (3) and (6) are pretty big value-adds, i.e. where I should focus some time. High-level, for this project, I want to throw this out:
> > >
> > > 1. Docs are *not* source -- not included in distribution
> > > 2. Tests *are* source -- why? These let the developer download + run in a self-verification attempt
> > >
> > > This is pretty justifiable IMO:
> > >
> > > *Our best practice across the various projects I maintain is to always run the tests from the source on the installed wheel. For each Python version and platform in our CI matrix, we do a clean sdist and wheel build (with build, of course), install the wheel in a clean env, and then run the tests from the source against that, using either a src dir, tox and/or python -I, plus pytest --import-mode=importlib, to ensure isolation from the source tree and we’re always using the installed copy. It isn’t as important that the tests themselves work when packaged for end-distribution, but rather that the code under test works.*
> > >
> > > Going to take a bit of time later/this week to prep this and might reach out with more questions. Otherwise I'll also be reading over the resources to ensure that nothing slipped up.
> > >
> > > Cheers,
> > > Elijah
> > >
> > > On Sun, Nov 30, 2025 at 8:31 AM Jarek Potiuk <[email protected]> wrote:
> > >
> > > > -1 for now, sorry.
> > > >
> > > > Reviewed:
> > > >
> > > > * signatures OK
> > > > * checksums OK
> > > > * licences NOK
> > > > * reproducibility from sources
> > > >
> > > > I think there is the .gitmodules problem that should be solved, also the lack of an explicit -source.tar.gz is not really good I think.
> > > >
> > > > Several reasons:
> > > >
> > > > 1) Lack of explicit source package (this is "almost -1" for me, because formally speaking the .sdist package is fulfilling the letter of the source package, but IMHO it does not necessarily fulfill the spirit).
> > > >
> > > > I think it's not very clear which package is "source" and which are "convenience/binary" packages. From what I see, the .tar.gz is **something between** source package and the .sdist. It **looks** like an sdist package (with PKG-INFO) - but also it contains "tests" - which is unusual for sdist packages (however there is a big debate about it [1]). The requirement for "source" packages published by the ASF is that it contains all the sources needed to build code and tests [2] (which your .sdist file has, so that's cool) - it seems to some extent it follows the expectation. I think it must be clear which of the packages is the "-source" one, and naming it like that and keeping it separate from .sdist is a good idea.
> > > >
> > > > We also in Airflow - for quite a while - took some of our .sdist files as "source" releases when we released only some of the distributions that are part of the monorepo. When we did it in the past - in Airflow we explicitly mentioned in our emails that those .sdist packages are the "source" packages as expected by the ASF [3]. But eventually we entirely gave up on it (a few weeks ago), because we opted to include essentially **everything** that is in the source repo of ours (we are essentially using git archive to produce the source-tar.gz). The main reason was that if we **only** release .sdist, some of our important code (such as sources for docs) was not published.
> > > >
> > > > The .sdist of yours misses quite a number of files from the repo:
> > > >
> > > > * big number of examples
> > > > * docs sources - I think this is an important miss - while docs are
> > > > * telemetry folder
> > > > * .github and .gitmodules (are those gitmodules necessary to build the project?)
> > > >
> > > > It's likely that those files are excluded deliberately and are something that you do not **want** to release at all, but I find it a bit strange to remove docs and many examples. It seems that those who unpack sources from the official source package cannot do all the same things as people who check it out from repo TAG. If someone takes it as "source" and never looks at the GitHub repo - they will miss important sources (like docs sources) that IMHO are something that the users **should** have. Generally users should be able to do the same with the "-source.tar.gz" as what they can when they do `git checkout TAG` in your repo.
> > > >
> > > > The AI-generated (undoubtedly but that's ok ;) documentation in README.md describes what goes in and out but it does not explain WHY. I think if you **really** want to exclude some files from your source distribution you should explain WHY in the documentation.
> > > >
> > > > Just to add a bit of context. You might think that the "-source.tar.gz" file is not that important, as nearly nobody will use it. Which is a fair assessment ("nearly nobody") - but those who do are the important users - those are downstream packagers, who might want to include burr in distros for example. Many of the distros that are out there use the officially signed and checksummed packages to build and install their packages. For example this is what conda might want to do. Or Debian maintainers. Those are important users and we need to make sure that they can do it easily. The safest bet IMHO is to produce an explicit "-source.tar.gz" as a "git archive" result - and not exclude things that you would normally commit to the repo (note that you can have generated code committed to your repo - and there is "no compiled code in your repo" - so that would probably be the only thing to exclude (if your build process rebuilds those generated files automatically)). This can be done via .gitattributes, as in Airflow [4].
> > > >
> > > > 2) The .gitmodules thing is the final reason why I gave -1. I am not sure - it's not clear - whether the BentoBurr project mentioned there is needed to build the project or not. This project is not only archived, but also misses LICENCE information, so while it is actually **excluded** from the .sdist package, I think it should be either removed from the repo or included in -sources.tar.gz - generally an ASF project should not depend on any project which has an unknown licence.
> > > >
> > > > 3) At least in Airflow we are using `shasum -a 512 FILE` and it produces SHASUM + name of the file, which I think is a good idea to have in the .sha512 file. Also something that can be improved in the future.
> > > >
> > > > The shasums are good, but when I diff against what shasum produces, we have this:
> > > >
> > > > < 77ad9cf9ddf508645d094ae18efce76482ff86339ffd2cd9dfe46af5d0545bdfa949c00ccc7beb3f6ae5f2c65523cc1a3db9a7425921c86fde5c4d54eb893111  apache_burr-0.41.0-py3-none-any.whl
> > > > ---
> > > > > 77ad9cf9ddf508645d094ae18efce76482ff86339ffd2cd9dfe46af5d0545bdfa949c00ccc7beb3f6ae5f2c65523cc1a3db9a7425921c86fde5c4d54eb893111
> > > > Checking apache-burr-0.41.0-incubating.tar.gz.sha512
> > > > 1c1
> > > > < 2e755584eb71fcede377d92f67024e3694cee4729da55e8b8d5b8739388c9046438e40cd2428003cca1e11a7b40abb897371d608db1ce3c0638d266c3de2c50a  apache-burr-0.41.0-incubating.tar.gz
> > > > ---
> > > > > 2e755584eb71fcede377d92f67024e3694cee4729da55e8b8d5b8739388c9046438e40cd2428003cca1e11a7b40abb897371d608db1ce3c0638d266c3de2c50a
> > > >
> > > > 4) Files with unknown licences in the .sdist file (since it looks like -sources). This is also quite a hard -1 because of the .tpl file.
> > > >
> > > > There are a number of files with unapproved licenses (I unpacked the .tar.gz and downloaded and ran the https://dist.apache.org/repos/dist/release/creadur/apache-rat-0.17/ on the directory). While I understand why .jsonl files do not have a licence (json cannot contain comments), the best way to deal with that is to add a .rat-excludes file in your repo - see the Airflow one [5] - and make it part of the source package. This way you can add -E .rat-excludes and it will exclude those files from the check. The .tpl file seems to be a JINJA template and those files allow for comments and can easily embed license information that will be excluded in the final generated json file.
> > > >
> > > > ! Unapproved: 23 A count of unapproved licenses.
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-1-giraffe/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-2-geography/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-3-physics/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-4-philosophy/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-5-jokes/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-6-demonstrate-errors/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-1-giraffe/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-2-geography/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-3-physics/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-4-philosophy/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-5-jokes/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-6-demonstrate-errors/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-1-food/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-2-work-history/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-3-activities/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-4-everything/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-1/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-10/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-100/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-42/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-50/log.jsonl
> > > > ! /burr/tracking/server/s3/deployment/terraform/templates/ecs/burr_app.json.tpl
> > > >
> > > > 5) Bad naming of `sdist` file.
> > > >
> > > > I am not sure how you produced the .sdist file (again - no release instructions) but when I tried to build it and compare what's in my .sdist and your .sdist, I got quite a different result because the name of my package (tried it with flit, hatch and build packages) is (correctly) *apache_burr-0.41.0-incubating.tar.gz* and yours was *apache-burr-0.41.0-incubating.tar.gz*. We used to have the same in Airflow and it caused us some serious problems when it comes to links to our .sdist packages, and general difference of .whl vs. sdist. **Some** old tooling used to produce such names (old setuptools and old flit) but this has since been properly implemented by both. The thing is that the .sdist package name SHOULD contain the normalized distribution name - which replaces all sequences of "_-." with a single "_" and lowercases it [6] (unlike package names in PyPI, this follows the binary wheel naming normalization which uses "_" rather than "-" in the package name [7]).
> > > >
> > > > 6) Easier setup of the env
> > > >
> > > > I noticed a small issue with the env when preparing the release - a missing `cli` extra when setting up the venv to build the release. I fixed it in [8] - also proposed a small addition of a dev dependency group (might split it if needed) and proposed that you might use some more modern standardised features of packaging like dependency groups and inline script metadata. See details in the PR - we can discuss it there.
> > > >
> > > > 7) Reproducibility from sources:
> > > >
> > > > I tried to rebuild both the .sdist and .whl packages following the instructions and initially I had not compiled the UI and got them missing (of course) - I understand that full automation with a custom build hook is deferred for later (which is OK) - but (as expected) the files in the package have different mtime. This can be easily fixed by hard-coding the SOURCE_DATE_EPOCH variable before the build [9] and since you are already using instructions and scripts, that should be an easy addition in your docs. In Airflow we have a prek commit that automatically regenerates the date when release notes change but at the beginning the mtime to be used can be simply hard-coded to basically any date. This way whoever follows your release process will have it closer to a truly reproducible package and diffoscope will start showing useful diffs in case there are some [10].
> > > >
> > > > Summary of things:
> > > >
> > > > MUST:
> > > > * .tpl licence adding - 4)
> > > > * explain (or likely remove) the .gitmodules BentoBurr reference - 2)
> > > > * explicit rules in docs about why you exclude certain files from source package - 1)
> > > > * separate -source.tar.gz package with all files including docs and likely all files (subject to rules about exclusion above) - 1)
> > > >
> > > > SHOULD:
> > > > * proper naming of sdist artifacts (with _) (simply needs newer flit and doc update) - 5)
> > > > * add .rat-excludes that will allow RAT to be used to verify the official source packages - 4)
> > > >
> > > > NICE TO HAVE:
> > > > * shasum with filename - 3)
> > > > * simplify the env setup with inline metadata, dev dependency groups (support for those already in uv, hatch and others) - 6)
> > > > * reproducibility setup - 7)
> > > >
> > > > [1] Debate about whether "tests" and "docs" should be included in .sdist - https://discuss.python.org/t/should-sdists-include-docs-and-tests/14578/26
> > > > [2] What should be included in source packages of ASF - https://www.apache.org/legal/release-policy.html#source-packages
> > > > [3] Example email where Airflow PMC explicitly pointed to .sdist packages being "source" packages (see the description of .sdist files) - https://lists.apache.org/thread/8ob972qkd7sy6k1pn5nskc2x0yjx2t2y
> > > > [4] The .gitattributes file in Airflow repo - https://github.com/apache/airflow/blob/main/.gitattributes
> > > > [5] RAT excludes in Airflow repo - https://github.com/apache/airflow/blob/main/.rat-excludes
> > > > [6] PEP-625 Filename of a Source Distribution - https://peps.python.org/pep-0625/
> > > > [7] Binary packages distribution name normalization - https://packaging.python.org/en/latest/specifications/binary-distribution-format/#escaping-and-unicode
> > > > [8] PR to fix missing cli extra and improving dev-env to use it - https://github.com/apache/burr/pull/604
> > > > [9] Flit reproducibility - https://flit.pypa.io/en/stable/reproducible.html
> > > > [10] Diffoscope - tool to show reproducibility issues - https://diffoscope.org/
> > > >
> > > > J.
> > > >
> > > > On Sun, Nov 30, 2025 at 5:02 AM Elijah ben Izzy <[email protected]> wrote:
> > > >
> > > > > Hi all! Trying again!
> > > > >
> > > > > This is a call for a vote on releasing Apache Burr 0.41.0-incubating Release Candidate 2.
> > > > >
> > > > > This release includes the following changes (see CHANGELOG for details).
> > > > > See all commits since prior release:
> > > > > - https://github.com/apache/burr/compare/burr-0.40.2...main
> > > > >
> > > > > Key changes include:
> > > > > - pool-based async PG persister
> > > > > - multiple UI updates
> > > > > - Apache compatible licenses/build processes
> > > > > - bug fixes, typing, etc...
> > > > >
> > > > > The artifacts for this release candidate can be found at:
> > > > >
> > > > > https://dist.apache.org/repos/dist/dev/incubator/burr/0.41.0-incubating-RC2/
> > > > >
> > > > > The Git tag to be voted upon is: v0.41.0
> > > > >
> > > > > The release hash is 11783ba58f8c5bd161118976ced791a2f5bd78f3
> > > > >
> > > > > Release artifacts are signed with the following key:
> > > > > BB8B72B34AB9A664A109AA17A76CF4C80E4E5355
> > > > > The KEYS file is available at:
> > > > > https://downloads.apache.org/incubator/burr/KEYS
> > > > >
> > > > > Please download, verify, and test the release candidate. For testing use your best judgement. The following may suffice:
> > > > >
> > > > > 1. Build/run the UI following the instructions in scripts/README.md
> > > > > 2. Run the tests in tests/
> > > > > 3. Import into a jupyter notebook and play around
> > > > >
> > > > > Highly encourage you to pip install from source, run `burr` and play with the UI (some UI bugs I recently discovered will be filed)
> > > > >
> > > > > The vote will run for a minimum of 72 hours.
> > > > > Please vote:
> > > > >
> > > > > [ ] +1 Release this package as Apache Burr 0.41.0-incubating
> > > > > [ ] +0 No opinion
> > > > > [ ] -1 Do not release this package because... (Please provide a reason)
> > > > >
> > > > > Checklist for reference:
> > > > > [ ] Download links are valid.
> > > > > [ ] Checksums and signatures.
> > > > > [ ] LICENSE/NOTICE files exist
> > > > > [ ] No unexpected binary files
> > > > > [ ] All source files have ASF headers
> > > > > [ ] Can compile from source
> > > > >
> > > > > On behalf of the Apache Burr PPMC,
> > > > >
> > > > > Elijah ben Izzy ([email protected])
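On the sdist naming point from the review (item 5): the normalization rule described there, where every run of "-", "_" and "." becomes a single "_" and the name is lowercased, can be sketched in a few lines. This is only an illustration of the rule as stated in the thread, not what flit or any other build backend actually runs; the function names are hypothetical, and the version string is passed through untouched.

```python
import re


def normalize_dist_name(name: str) -> str:
    """Replace each run of '-', '_' or '.' with one '_' and lowercase the result."""
    return re.sub(r"[-_.]+", "_", name).lower()


def sdist_filename(name: str, version: str) -> str:
    """Hypothetical helper: '<normalized name>-<version>.tar.gz'.

    The version is used verbatim; no version normalization is attempted here.
    """
    return f"{normalize_dist_name(name)}-{version}.tar.gz"


if __name__ == "__main__":
    # Name and version taken from the release candidate discussed in this thread.
    print(sdist_filename("apache-burr", "0.41.0-incubating"))
```

A check like this could run in a release script to catch a dash-named artifact before it is uploaded.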

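On the checksum point (item 3 in the review): the difference in the diff is that `shasum -a 512 FILE` prints the digest followed by two spaces and the file name, while the published files contain the digest only. A minimal sketch of writing the same "digest plus file name" line from Python follows. The function names and the `.sha512` suffix handling are assumptions for illustration, not the project's actual release script.

```python
import hashlib
from pathlib import Path


def format_checksum_line(hexdigest: str, filename: str) -> str:
    """Digest, two spaces, file name: the layout shown on the '<' side of the diff."""
    return f"{hexdigest}  {filename}"


def sha512_line(path: Path) -> str:
    """Hash the file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha512()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return format_checksum_line(digest.hexdigest(), path.name)


def write_checksum_file(path: Path) -> Path:
    """Hypothetical helper: write '<artifact>.sha512' next to the artifact."""
    out = path.with_name(path.name + ".sha512")
    out.write_text(sha512_line(path) + "\n")
    return out
```

With the file name included, a verifier should be able to run `shasum -a 512 -c` on the `.sha512` file directly instead of comparing digests by hand.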