Yep. All for moving forward fast, even if it means some of the things that
are non-MUST will be deferred. I tried to sort things into
"MUST/SHOULD/NICE" to make it clear what has the highest priority
- I think you got it mostly right. The "_" thing is really about making sure
that the right tools (flit in this case) have some min-versions - recent
enough to produce good naming.

This is one more thing that I should have stressed - I think **some** part
of it is the way you make sure the tooling is recent enough. The approach in
Airflow is that we ALWAYS have a min version set for a dependency. "Released
6 months ago" is a good rule of thumb. I think that if you follow the "env
setup" and the env has some "min versions", that solves at least some of
the issues I found in the release. The "_" in the name is definitely
going to be fixed if the min-version is set appropriately.
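
A minimal sketch of the "released 6 months ago" rule of thumb, in Python.
The versions and release dates below are made up for illustration (they are
not real release data for flit or any other tool), and `pick_min_version` is
a hypothetical helper, not something that exists in Airflow or Burr:

```python
from datetime import date, timedelta
from typing import Optional


def parse_version(version: str) -> tuple:
    """Turn a plain dotted version like '3.12.0' into a sortable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def pick_min_version(releases: dict, today: date, min_age_days: int = 183) -> Optional[str]:
    """Return the newest version that was released at least `min_age_days` ago."""
    cutoff = today - timedelta(days=min_age_days)
    old_enough = [v for v, released in releases.items() if released <= cutoff]
    # Compare numerically: a plain string sort would rank "3.9.0" above "3.11.0".
    return max(old_enough, key=parse_version, default=None)


# Hypothetical release history of a build tool (illustrative dates only).
example_releases = {
    "3.9.0": date(2024, 1, 10),
    "3.10.0": date(2024, 11, 5),
    "3.11.0": date(2025, 3, 1),
    "3.12.0": date(2025, 10, 20),
}

minimum = pick_min_version(example_releases, today=date(2025, 11, 30))
print(f"minimum version to require: {minimum}")
```

The result would then go into the build requirements as a lower bound. A
real setup would more likely parse versions with the `packaging` library
than with the naive tuple parsing assumed here.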

J.


On Sun, Nov 30, 2025 at 8:27 PM Elijah ben Izzy <
[email protected]> wrote:

> @Jarek -- thanks, this is very clear (and absolutely worth getting right
> even if it means a delay in release).
>
> I want to sum up just to make sure I understand the high-level -- here are
> the themes I'm picking up:
>
> 1. License on everything -- JSON is the exception but that's why we have
> .rat excludes (tpl looked like another JSON which I think is why I missed
> it)
> 2. Underscores versus dashes for consistency to avoid trouble later on
> 3. A consistent, documented internal opinion on what *is* and *isn't*
> source
> 4. Clean up weird stuff (I.E. the bento burr submodule, that probably
> should just be removed from the repo)
> 5. Anything to make development + verification easier on the developer (and
> any downstream consumer of the source)
> 6. More documentation overall
>
>
> I think (3) and (6) are pretty big value-adds, i.e. where I should focus
> some time. High-level, for this project, I want to throw this out:
> 1. Docs are *not* source -- not included in distribution
> 2. Tests *are* source -- why? These let the developer download + run in a
> self-verification attempt
>
> This is pretty justifiable IMO:
>
> *Our best practice across the various projects I maintain is to always run
> the tests from the source on the installed wheel. For each Python version
> and platform in our CI matrix, we do a clean sdist and wheel build (with
> build, of course), install the wheel in a clean env, and then run the tests
> from the source against that, using either a src dir, tox and/or python -I,
> plus pytest --import-mode=importlib, to ensure isolation from the source
> tree and we’re always using the installed copy. It isn’t as important that
> the tests themselves work when packaged for end-distribution, but rather
> that the code under test works.*
>
> Going to take a bit of time later/this week to prep this and might reach
> out with more questions. Otherwise I'll also be reading over the resources
> to ensure that nothing slipped up.
>
> Cheers,
> Elijah
>
> On Sun, Nov 30, 2025 at 8:31 AM Jarek Potiuk <[email protected]> wrote:
>
> > -1 for now, sorry.
> >
> > Reviewed:
> >
> > * signatures OK
> > * checksums  OK
> > * licences NOK
> > * reproducibility from sources
> >
> >
> > I think there is the .gitmodules problem that should be solved; also, the
> > lack of an explicit -source.tar.gz is not really good, I think.
> >
> > Several reasons:
> >
> > 1) Lack of explicit source package (this is "almost -1" for me, because
> > formally speaking the .sdist package fulfills the letter of the source
> > package requirement, but IMHO it does not necessarily fulfill the spirit).
> >
> > I think it's not very clear which package is "source" and which are
> > "convenience/binary" packages. From what I see, the .tar.gz is
> > **something between** a source package and an .sdist. It **looks** like
> > an sdist package (with PKG-INFO) - but it also contains "tests" - which is
> > unusual for sdist packages (however there is a big debate about it [1]).
> > The requirement for "source" packages published by the ASF is that they
> > contain all the sources needed to build code and tests [2] (which your
> > .sdist file has, so that's cool) - so to some extent it follows the
> > expectation. I think it must be clear which of the packages is the
> > "-source" one, and naming it like that and keeping it separate from the
> > .sdist is a good idea.
> >
> > We also in Airflow - for quite a while - took some of our .sdist files as
> > "source" releases when we released only some of the distributions that
> > are part of the monorepo. When we did it in the past, we explicitly
> > mentioned in our emails that those .sdist packages are the "source"
> > packages as expected by the ASF [3]. But eventually we entirely gave up on
> > it (a few weeks ago), because we opted to include essentially
> > **everything** that is in our source repo (we are essentially using git
> > archive to produce the source-tar.gz). The main reason was that when we
> > released **only** .sdists, some of our important code (such as sources
> > for docs) was not published.
> >
> > The .sdist of yours misses quite a number of files from the repo:
> >
> > * a large number of examples
> > * docs sources - I think this is an important miss - while docs are
> > * telemetry folder
> > * .github and .gitmodules (are those gitmodules necessary to build the
> > project?)
> >
> > It's likely that those files are excluded deliberately and are something
> > that you do not **want** to release at all, but I find it a bit strange to
> > remove docs and many examples. It seems that those who unpack sources from
> > the official source package cannot do all the same things as people who
> > check it out from the repo TAG. If someone takes it as "source" and never
> > looks at the GitHub repo, they will miss important sources (like docs
> > sources) that IMHO the users **should** have. Generally users should be
> > able to do the same with the "-source.tar.gz" as they can when they do
> > `git checkout TAG` in your repo.
> >
> > The AI-generated (undoubtedly, but that's ok ;) documentation in README.md
> > describes what goes in and out, but it does not explain WHY. I think if
> > you **really** want to exclude some files from your source distribution,
> > you should explain WHY in the documentation.
> >
> > Just to add a bit of context. You might think that the "-source.tar.gz"
> > file is not that important, as nearly nobody will use it. Which is a fair
> > assessment ("nearly nobody") - but those who do are the important users -
> > those are downstream packagers, who might want to include burr in distros
> > for example. Many of the distros that are out there use the officially
> > signed and checksummed packages to build and install their packages. For
> > example this is what conda might want to do. Or Debian maintainers. Those
> > are important users and we need to make sure that they can do it easily.
> > The safest bet is to produce an explicit "-source.tar.gz" as a "git
> > archive" result IMHO - and not exclude things that you would normally
> > commit to the repo (note that you can have generated code committed to
> > your repo - and there is "no compiled code in your repo" - so that would
> > probably be the only thing to exclude, if your build process rebuilds
> > those generated files automatically). This can be done via .gitattributes,
> > as in Airflow [4].
> >
> > 2) The .gitmodules thing is the final reason why I gave -1. I am not
> > sure - it's not clear - whether the BentoBurr project it mentions is
> > needed to build the project or not. That project is not only archived, but
> > also misses LICENCE information, so while it is actually **excluded** from
> > the .sdist package, I think it should be either removed from the repo or
> > included in -sources.tar.gz - generally an ASF project should not depend
> > on any project which has an unknown licence.
> >
> > 3) At least in Airflow we are using `shasum -a 512 FILE` and it produces
> > SHASUM + name of the file, which I think is a good idea to have in .asc
> > file. Also something that can be improved in the future.
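
A minimal Python sketch of a checksum line in that "digest + file name"
layout. The helper names and the throwaway demo file are hypothetical, and
the sketch assumes the checksum is taken from inside the artifact's
directory, so only the bare file name is recorded:

```python
import hashlib
import tempfile
from pathlib import Path


def sha512_line(path: Path) -> str:
    """Return '<hex digest>  <file name>' for the given file."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        # Read in chunks so large release artifacts are not loaded at once.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return f"{digest.hexdigest()}  {path.name}"


def matches(path: Path, checksum_line: str) -> bool:
    """Check both the digest and the recorded file name against the file."""
    return sha512_line(path) == checksum_line.strip()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for a real release artifact.
    artifact = Path(tmp) / "example-artifact.tar.gz"
    artifact.write_bytes(b"demo payload")
    line = sha512_line(artifact)
    ok = matches(artifact, line)
    renamed_ok = matches(artifact, line.replace("example-artifact", "other-name"))

print(line)
```

Recording the file name next to the digest means a checksum that was copied
next to the wrong artifact fails the comparison instead of passing silently.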
> >
> > The checksums are good, but when I diff on what shasum produces, we have
> > this:
> >
> > < 77ad9cf9ddf508645d094ae18efce76482ff86339ffd2cd9dfe46af5d0545bdfa949c00ccc7beb3f6ae5f2c65523cc1a3db9a7425921c86fde5c4d54eb893111  apache_burr-0.41.0-py3-none-any.whl
> > ---
> > > 77ad9cf9ddf508645d094ae18efce76482ff86339ffd2cd9dfe46af5d0545bdfa949c00ccc7beb3f6ae5f2c65523cc1a3db9a7425921c86fde5c4d54eb893111
> > Checking apache-burr-0.41.0-incubating.tar.gz.sha512
> > 1c1
> > < 2e755584eb71fcede377d92f67024e3694cee4729da55e8b8d5b8739388c9046438e40cd2428003cca1e11a7b40abb897371d608db1ce3c0638d266c3de2c50a  apache-burr-0.41.0-incubating.tar.gz
> > ---
> > > 2e755584eb71fcede377d92f67024e3694cee4729da55e8b8d5b8739388c9046438e40cd2428003cca1e11a7b40abb897371d608db1ce3c0638d266c3de2c50a
> >
> > 4) Files with unknown licences in the .sdist file (since it looks like
> > -sources). This is also quite a hard -1 because of the .tpl file.
> >
> > There are a number of files with unapproved licenses (I unpacked the
> > .tar.gz, then downloaded and ran
> > https://dist.apache.org/repos/dist/release/creadur/apache-rat-0.17/ on
> > the directory). While I understand why .jsonl files do not have a licence
> > (JSON cannot contain comments), the best way to deal with that is to add
> > a .rat-excludes file in your repo - see the Airflow one [5] - and make it
> > part of the source package. This way you can add -E .rat-excludes and it
> > will exclude those files from the check. The .tpl file seems to be a Jinja
> > template, and those files allow comments and can easily embed license
> > information that will be excluded from the final generated json file.
> >
> > ! Unapproved:         23    A count of unapproved licenses.
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-1-giraffe/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-2-geography/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-3-physics/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-4-philosophy/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-5-jokes/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot/chat-6-demonstrate-errors/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-1-giraffe/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-2-geography/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-3-physics/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-4-philosophy/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-5-jokes/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-6-demonstrate-errors/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-1-food/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-2-work-history/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-3-activities/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-4-everything/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-1/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-10/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-100/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-42/log.jsonl
> > ! /burr/tracking/server/demo_data/demo_counter/count-to-50/log.jsonl
> > ! /burr/tracking/server/s3/deployment/terraform/templates/ecs/burr_app.json.tpl
> >
> > 5) Bad naming of the `sdist` file.
> >
> > I am not sure how you produced the .sdist file (again - no release
> > instructions), but when I tried to build it and compare what's in my
> > .sdist and your .sdist, I got quite a different result, because the name
> > of my package (I tried it with flit, hatch and build) is (correctly)
> > *apache_burr-0.41.0-incubating.tar.gz* and yours was
> > *apache-burr-0.41.0-incubating.tar.gz*. We used to have the same in
> > Airflow and it caused us some serious problems when it comes to links to
> > our .sdist packages, and the general difference of .whl vs. sdist.
> > **Some** old tooling used to produce such names (old setuptools and old
> > flit), but this has since been properly implemented by both. The thing is
> > that the distribution name in the .sdist file name SHOULD be normalized -
> > which replaces all sequences of "_-." with a single "_" and lowercases
> > it [6] (unlike package names on PyPI, this follows the binary wheel naming
> > normalization, which uses "_" rather than "-" in the package name [7]).
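
A minimal Python sketch of that normalization, using the regular expression
given in the binary distribution format specification [7]. The function
names are hypothetical, and the version string is passed through untouched
(a plain "0.41.0" is assumed here, without the "-incubating" suffix):

```python
import re


def normalize_distribution_name(name: str) -> str:
    """Collapse every run of '-', '_' and '.' into one '_' and lowercase."""
    return re.sub(r"[-_.]+", "_", name).lower()


def sdist_filename(name: str, version: str) -> str:
    """Build '<normalized name>-<version>.tar.gz' for a source distribution."""
    return f"{normalize_distribution_name(name)}-{version}.tar.gz"


print(sdist_filename("apache-burr", "0.41.0"))
```

With a normalized name, the only "-" left in the file name separates the
name from the version, so the two parts can be split without guessing.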
> >
> > 6) Easier setup of the env
> >
> > I noticed a small issue with the env when preparing the release - a
> > missing `cli` extra when setting up the venv to build the release. I fixed
> > it in [8] - I also proposed a small addition of a dev dependency group
> > (might split it if needed) and proposed that you might use some more
> > modern standardised packaging features like dependency groups and inline
> > script metadata. See details in the PR - we can discuss it there.
> >
> > 7) Reproducibility from sources:
> >
> > I tried to rebuild both the .sdist and .whl packages following the
> > instructions; initially I had not compiled the UI and got the UI files
> > missing (of course) - I understand that full automation with a custom
> > build hook is deferred for later (which is OK) - but (as expected) the
> > files in the package have different mtimes. This can be easily fixed by
> > hard-coding the SOURCE_DATE_EPOCH variable before the build [9], and since
> > you are already using instructions and scripts, that should be an easy
> > addition to your docs. In Airflow we have a prek hook that automatically
> > regenerates the date when release notes change, but at the beginning the
> > mtime to be used can simply be hard-coded to basically any date. This way
> > whoever follows your release process will have it closer to a truly
> > reproducible package, and diffoscope will start showing useful diffs in
> > case there are some [10].
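
A minimal Python sketch of why a fixed SOURCE_DATE_EPOCH makes archives
comparable. This is not how flit implements it; `build_archive`, the file
names and the timestamp are all hypothetical, and the archive is built in
memory only to show that pinned timestamps give byte-identical output:

```python
import gzip
import io
import os
import tarfile


def build_archive(files: dict) -> bytes:
    """Build a .tar.gz in memory, taking every timestamp from SOURCE_DATE_EPOCH."""
    epoch = int(os.environ["SOURCE_DATE_EPOCH"])
    buffer = io.BytesIO()
    # mtime pins the timestamp stored in the gzip header itself.
    with gzip.GzipFile(filename="", fileobj=buffer, mode="wb", mtime=epoch) as gz:
        with tarfile.open(fileobj=gz, mode="w") as tar:
            # Sorted order keeps the member layout stable between runs.
            for name in sorted(files):
                info = tarfile.TarInfo(name)
                info.size = len(files[name])
                info.mtime = epoch  # same mtime for every member
                tar.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


# An arbitrary fixed timestamp, hard-coded as suggested above.
os.environ["SOURCE_DATE_EPOCH"] = "1764460800"
sources = {"pkg/__init__.py": b"", "pkg/core.py": b"VALUE = 1\n"}

first = build_archive(sources)
second = build_archive(sources)
print("identical builds:", first == second)
```

Once the timestamps are pinned like this, any remaining difference between
two builds points at real content changes rather than at build time.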
> >
> > Summary of things:
> >
> > MUST
> > * add the licence to the .tpl file - 4)
> > * explain (or likely remove) the .gitmodules BentoBurr reference - 2)
> > * explicit rules in docs about why you exclude certain files from the
> > source package - 1)
> > * separate -source.tar.gz package with all files including docs and likely
> > all files (subject to the rules about exclusion above) - 1)
> >
> > SHOULD:
> > * proper naming of sdist artifacts (with _) (simply needs a newer flit and
> > a doc update) - 5)
> > * add .rat-excludes that will allow using RAT to verify the official
> > source packages - 4)
> >
> > NICE TO HAVE:
> > * shasum with filename - 3)
> > * simplify the env setup with inline metadata, dev dependency groups
> > (support for those already in uv, hatch and others) - 6)
> > * reproducibility setup - 7)
> >
> >
> >
> > [1] Debate about whether "tests" and "docs" should be included in .sdist
> > https://discuss.python.org/t/should-sdists-include-docs-and-tests/14578/26
> > [2] What should be included in source packages of ASF -
> > https://www.apache.org/legal/release-policy.html#source-packages
> > [3] Example email where Airflow PMC explicitly pointed to .sdist packages
> > being "source" packages (see the description of .sdist files)
> > https://lists.apache.org/thread/8ob972qkd7sy6k1pn5nskc2x0yjx2t2y
> > [4] The .gitattributes file in Airflow repo
> > https://github.com/apache/airflow/blob/main/.gitattributes
> > [5] RAT excludes in Airflow repo
> > https://github.com/apache/airflow/blob/main/.rat-excludes
> > [6] PEP-625 Filename of a Source Distribution -
> > https://peps.python.org/pep-0625/
> > [7] Binary packages distribution name normalization -
> > https://packaging.python.org/en/latest/specifications/binary-distribution-format/#escaping-and-unicode
> > [8] PR to fix missing cli extra and improving dev-env to use it
> > https://github.com/apache/burr/pull/604
> > [9] Flit reproducibility
> > https://flit.pypa.io/en/stable/reproducible.html
> > [10] Diffoscope - tool to show reproducibility issues
> > https://diffoscope.org/
> >
> > J.
> >
> >
> >
> >
> > On Sun, Nov 30, 2025 at 5:02 AM Elijah ben Izzy <
> > [email protected]> wrote:
> >
> > > Hi all! Trying again!
> > >
> > >
> > > This is a call for a vote on releasing Apache Burr 0.41.0-incubating
> > > Release Candidate 2.
> > >
> > > This release includes the following changes (see CHANGELOG for
> > > details).
> > > See all commits since prior release:
> > > - https://github.com/apache/burr/compare/burr-0.40.2...main
> > >
> > > Key changes include:
> > > - pool-based async PG persister
> > > - multiple UI updates
> > > - Apache compatible licenses/build processes
> > > - bug fixes, typing, etc...
> > >
> > > The artifacts for this release candidate can be found at:
> > >
> > > https://dist.apache.org/repos/dist/dev/incubator/burr/0.41.0-incubating-RC2/
> > >
> > > The Git tag to be voted upon is: v0.41.0
> > >
> > > The release hash is 11783ba58f8c5bd161118976ced791a2f5bd78f3
> > >
> > > Release artifacts are signed with the following key:
> > > BB8B72B34AB9A664A109AA17A76CF4C80E4E5355
> > > The KEYS file is available at:
> > > https://downloads.apache.org/incubator/burr/KEYS
> > >
> > > Please download, verify, and test the release candidate. For testing
> > > use your best judgement. The following may suffice:
> > >
> > > 1. Build/run the UI following the instructions in scripts/README.md
> > > 2. Run the tests in tests/
> > > 3. Import into a jupyter notebook and play around
> > >
> > > Highly encourage you to pip install from source, run `burr` and play
> > > with the UI (some UI bugs I recently discovered will be filed)
> > >
> > > The vote will run for a minimum of 72 hours.
> > > Please vote:
> > >
> > > [ ] +1 Release this package as Apache Burr 0.41.0-incubating
> > > [ ] +0 No opinion
> > > [ ] -1 Do not release this package because... (Please provide a reason)
> > >
> > > Checklist for reference:
> > > [ ] Download links are valid.
> > > [ ] Checksums and signatures.
> > > [ ] LICENSE/NOTICE files exist
> > > [ ] No unexpected binary files
> > > [ ] All source files have ASF headers
> > > [ ] Can compile from source
> > >
> > > On behalf of the Apache Burr PPMC,
> > >
> > > Elijah ben Izzy ([email protected])
> > >
> >
>
