Looks good to me!

On Wed, Dec 24, 2025, 7:53 AM Elijah ben Izzy <
[email protected]> wrote:

> @Jarek + others -- I wanted to get your take on this. I'm pushing to get it
> out soon (finally have a bit more focus time) and I drafted this. What does
> the pod think about what is/isn't source?
>
> I'm generally happy with this, but I think the key part is that this is
> documented/clear. See table at the top:
>
>
> https://github.com/apache/burr/blob/20f7790dd60cec8e397184cfe0b2aaa564f49f48/scripts/README.md
>
>
> On Sun, Nov 30, 2025 at 12:58 PM Jarek Potiuk <[email protected]> wrote:
>
> > Yep. All for moving forward fast even if it means some of the things that
> > are non-MUST will be deferred. I tried to sort things into
> > "MUST/SHOULD/NICE" in a way that makes it clear what is the highest
> > priority - I think you got it mostly right. The "_" thing is really about
> > making sure that the right tools (flit in this case) have some
> > min-versions - recent enough to produce good naming.
> >
> > This is one more thing that I should have stressed - I think **some**
> > part of it is the way you make sure the tooling is recent enough. The
> > approach in Airflow is that we ALWAYS have a min version set for a
> > dependency. "Released 6 months ago" is a good rule of thumb. I think that
> > if you follow the "env setup" and the env has some "min versions", that
> > solves at least some of the issues I found in the release. The "_" in the
> > name is definitely going to be fixed if the min-version is set
> > appropriately.
> >
> > J.
> >
> >
> > On Sun, Nov 30, 2025 at 8:27 PM Elijah ben Izzy <
> > [email protected]> wrote:
> >
> > > @Jarek -- thanks, this is very clear (and absolutely worth getting right
> > > even if it means a delay in release).
> > >
> > > I want to sum up just to make sure I understand the high-level -- here
> > > are the themes I'm picking up:
> > >
> > > 1. License on everything -- JSON is the exception but that's why we have
> > > .rat excludes (tpl looked like another JSON which I think is why I
> > > missed it)
> > > 2. Underscores versus dashes for consistency to avoid trouble later on
> > > 3. A consistent, documented internal opinion on what *is* and *isn't*
> > > source
> > > 4. Clean up weird stuff (i.e. the bento burr submodule, that probably
> > > should just be removed from the repo)
> > > 5. Anything to make development + verification easier on the developer
> > > (and any downstream consumer of the source)
> > > 6. More documentation overall
> > >
> > >
> > > I think (3) and (6) are pretty big value-adds, i.e. where I should focus
> > > some time. High-level, for this project, I want to throw this out:
> > > 1. Docs are *not* source -- not included in distribution
> > > 2. Tests *are* source -- why? These let the developer download + run in
> > > a self-verification attempt
> > >
> > > This is pretty justifiable IMO:
> > >
> > > *Our best practice across the various projects I maintain is to always
> > > run the tests from the source on the installed wheel. For each Python
> > > version and platform in our CI matrix, we do a clean sdist and wheel
> > > build (with build, of course), install the wheel in a clean env, and
> > > then run the tests from the source against that, using either a src dir,
> > > tox and/or python -I, plus pytest --import-mode=importlib, to ensure
> > > isolation from the source tree and we’re always using the installed
> > > copy. It isn’t as important that the tests themselves work when packaged
> > > for end-distribution, but rather that the code under test works.*
> > >
> > > Going to take a bit of time later/this week to prep this and might reach
> > > out with more questions. Otherwise I'll also be reading over the
> > > resources to ensure that nothing slipped up.
> > >
> > > Cheers,
> > > Elijah
> > >
> > > On Sun, Nov 30, 2025 at 8:31 AM Jarek Potiuk <[email protected]> wrote:
> > >
> > > > -1 for now, sorry.
> > > >
> > > > Reviewed:
> > > >
> > > > * signatures OK
> > > > * checksums  OK
> > > > * licences NOK
> > > > * reproducibility from sources
> > > >
> > > >
> > > > I think there is the .gitmodules problem that should be solved; also,
> > > > the lack of an explicit -source.tar.gz is not really good, I think.
> > > >
> > > > Several reasons:
> > > >
> > > > 1) Lack of explicit source package (this is "almost -1" for me, because
> > > > formally speaking the .sdist package fulfills the letter of the source
> > > > package requirement, but IMHO it does not necessarily fulfill the
> > > > spirit).
> > > >
> > > > I think it's not very clear which package is "source" and which are
> > > > "convenience/binary" packages. From what I see, the .tar.gz is
> > > > **something between** a source package and an .sdist. It **looks** like
> > > > an sdist package (with PKG-INFO) - but it also contains "tests", which
> > > > is unusual for sdist packages (however, there is a big debate about it
> > > > [1]). The requirement for "source" packages published by the ASF is
> > > > that they contain all the sources needed to build the code and tests
> > > > [2] (which your .sdist file has, so that's cool) - so to some extent it
> > > > follows the expectation. I think it must be clear which of the packages
> > > > is the "-source" one, and naming it like that and keeping it separate
> > > > from the .sdist is a good idea.
> > > >
> > > > In Airflow we also - for quite a while - used some of our .sdist files
> > > > as "source" releases when we released only some of the distributions
> > > > that are part of the monorepo. When we did that, we explicitly
> > > > mentioned in our emails that those .sdist packages were the "source"
> > > > packages as expected by the ASF [3]. But eventually (a few weeks ago)
> > > > we gave up on it entirely, because we opted to include essentially
> > > > **everything** that is in our source repo (we are essentially using
> > > > git archive to produce the source-tar.gz). The main reason was that
> > > > when we released **only** .sdists, some of our important code (such as
> > > > the sources for docs) was not published.
> > > >
> > > > The .sdist of yours misses quite a number of files from the repo:
> > > >
> > > > * a large number of examples
> > > > * docs sources - I think this is an important miss - while docs are
> > > > * telemetry folder
> > > > * .github and .gitmodules (are those gitmodules necessary to build
> the
> > > > project?)
> > > >
> > > > It's likely that those files are excluded deliberately and are
> > > > something that you do not **want** to release at all, but I find it a
> > > > bit strange to remove docs and many examples. It seems that those who
> > > > unpack sources from the official source package cannot do all the same
> > > > things as people who check it out from the repo TAG. If someone takes
> > > > it as "source" and never looks at the GitHub repo, they will miss
> > > > important sources (like docs sources) that IMHO the users **should**
> > > > have. Generally, users should be able to do the same with the
> > > > "-source.tar.gz" as they can when they do `git checkout TAG` in your
> > > > repo.
> > > >
> > > > The AI-generated (undoubtedly, but that's ok ;) documentation in
> > > > README.md describes what goes in and out, but it does not explain WHY.
> > > > I think if you **really** want to exclude some files from your source
> > > > distribution, you should explain WHY in the documentation.
> > > >
> > > > Just to add a bit of context. You might think that the "-source.tar.gz"
> > > > file is not that important, as nearly nobody will use it. Which is a
> > > > fair assessment ("nearly nobody") - but those who do are the important
> > > > users - those are downstream packagers, who might want to include burr
> > > > in distros, for example. Many of the distros that are out there use the
> > > > officially signed and checksummed packages to build and install their
> > > > packages. For example, this is what conda might want to do. Or Debian
> > > > maintainers. Those are important users and we need to make sure that
> > > > they can do it easily. The safest bet IMHO is to produce an explicit
> > > > "-source.tar.gz" as a "git archive" result - and not exclude things
> > > > that you would normally commit to the repo. (Note that you can have
> > > > generated code committed to your repo - and there is "no compiled code
> > > > in your repo" - so that would probably be the only thing to exclude, if
> > > > your build process rebuilds those generated files automatically.)
> > > > Exclusions can be done via .gitattributes, as in Airflow [4].
> > > >
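A rough sketch of the "git archive" approach described above. The helper names, the tag, and the output file name are illustrative assumptions, not Burr's or Airflow's actual release tooling. Paths marked `export-ignore` in .gitattributes are left out of the archive by git itself, so the exclusion rules live in the repo.

```python
import subprocess


def git_archive_command(tag: str, name: str, output: str) -> list[str]:
    """Build a `git archive` invocation for a -source.tar.gz of `tag`.

    Hypothetical helper: `name` becomes the top-level directory in the archive.
    """
    return [
        "git", "archive",
        "--format=tar.gz",
        f"--prefix={name}/",
        "-o", output,
        tag,
    ]


def build_source_archive(tag: str, name: str, output: str) -> None:
    # Must be run inside a git checkout that contains `tag`.
    subprocess.run(git_archive_command(tag, name, output), check=True)
```

For example, `build_source_archive("v0.41.0", "apache-burr-0.41.0-incubating", "apache-burr-0.41.0-incubating-source.tar.gz")` would archive the tagged tree; the output name here is only a guess at a reasonable convention.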
> > > > 2) The .gitmodules thing is the final reason why I gave -1. It's not
> > > > clear to me whether the BentoBurr project mentioned there is needed to
> > > > build the project or not. That project is not only archived, but also
> > > > lacks LICENCE information, so while it is actually **excluded** from
> > > > the .sdist package, I think it should be either removed from the repo
> > > > or included in -sources.tar.gz - generally an ASF project should not
> > > > depend on any project which has an unknown licence.
> > > >
> > > > 3) At least in Airflow we are using `shasum -a 512 FILE`, which
> > > > produces the SHA-512 checksum + the name of the file, and I think that
> > > > is a good thing to have in the .sha512 file. Something that can be
> > > > improved in the future.
> > > >
> > > > The checksums are good, but when I diff them against what shasum
> > > > produces, we have this:
> > > >
> > > > < 77ad9cf9ddf508645d094ae18efce76482ff86339ffd2cd9dfe46af5d0545bdfa949c00ccc7beb3f6ae5f2c65523cc1a3db9a7425921c86fde5c4d54eb893111  apache_burr-0.41.0-py3-none-any.whl
> > > > ---
> > > > > 77ad9cf9ddf508645d094ae18efce76482ff86339ffd2cd9dfe46af5d0545bdfa949c00ccc7beb3f6ae5f2c65523cc1a3db9a7425921c86fde5c4d54eb893111
> > > > Checking apache-burr-0.41.0-incubating.tar.gz.sha512
> > > > 1c1
> > > > < 2e755584eb71fcede377d92f67024e3694cee4729da55e8b8d5b8739388c9046438e40cd2428003cca1e11a7b40abb897371d608db1ce3c0638d266c3de2c50a  apache-burr-0.41.0-incubating.tar.gz
> > > > ---
> > > > > 2e755584eb71fcede377d92f67024e3694cee4729da55e8b8d5b8739388c9046438e40cd2428003cca1e11a7b40abb897371d608db1ce3c0638d266c3de2c50a
> > > >
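For illustration, a minimal Python sketch that writes a checksum file in the "digest, two spaces, file name" layout that `shasum -a 512 FILE` prints. The function names are made up for this example, and the sketch records only the base name of the file, which is an assumption about what the release scripts would want.

```python
import hashlib
from pathlib import Path


def sha512_line(path: str) -> str:
    """Return 'HEXDIGEST  FILENAME' for the file at `path`."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large artifacts are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return f"{digest.hexdigest()}  {Path(path).name}"


def write_sha512_file(path: str) -> str:
    """Write FILE.sha512 next to FILE and return the checksum file's path."""
    target = f"{path}.sha512"
    Path(target).write_text(sha512_line(path) + "\n")
    return target
```

With the file name included, the published checksum file should diff cleanly against what a verifier gets from running `shasum -a 512` on the downloaded artifact in the same directory.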
> > > > 4) Files with unknown licences in the .sdist file (since it looks like
> > > > -sources). This is also quite a hard -1 because of the .tpl file.
> > > >
> > > > There are a number of files with unapproved licenses (I unpacked the
> > > > .tar.gz, then downloaded and ran
> > > > https://dist.apache.org/repos/dist/release/creadur/apache-rat-0.17/ on
> > > > the directory). While I understand why .jsonl files do not have a
> > > > licence (JSON cannot contain comments), the best way to deal with that
> > > > is to add a .rat-excludes file to your repo - see the Airflow one [5] -
> > > > and make it part of the source package. This way you can add
> > > > -E .rat-excludes and it will exclude those files from the check. The
> > > > .tpl file seems to be a Jinja template, and those files allow for
> > > > comments and can easily embed license information that will be
> > > > excluded from the final generated JSON file.
> > > >
> > > > ! Unapproved:         23    A count of unapproved licenses.
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-1-giraffe/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-2-geography/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-3-physics/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-4-philosophy/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-5-jokes/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot/chat-6-demonstrate-errors/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-1-giraffe/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-2-geography/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-3-physics/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-4-philosophy/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-5-jokes/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_chatbot_with_traces/chat-6-demonstrate-errors/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-1-food/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-2-work-history/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-3-activities/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_conversational-rag/rag-4-everything/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-1/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-10/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-100/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-42/log.jsonl
> > > > ! /burr/tracking/server/demo_data/demo_counter/count-to-50/log.jsonl
> > > > ! /burr/tracking/server/s3/deployment/terraform/templates/ecs/burr_app.json.tpl
> > > >
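To make the idea of an exclude list concrete, here is a simplified stand-in for the kind of check RAT performs. It is not Apache RAT and does not read the real .rat-excludes format; the glob-style patterns, the marker string, and the function name are assumptions chosen for the example.

```python
import fnmatch
from pathlib import Path

# Assumed marker: the opening words of the standard ASF source header.
ASF_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def missing_license(root: str, exclude_globs: list[str]) -> list[str]:
    """List files under `root` whose first 2 KiB lacks the ASF header marker."""
    flagged = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root).as_posix()
        # Skip anything matching an exclude pattern (e.g. "*.jsonl").
        if any(fnmatch.fnmatch(relative, pattern) for pattern in exclude_globs):
            continue
        head = path.read_bytes()[:2048].decode("utf-8", errors="replace")
        if ASF_MARKER not in head:
            flagged.append(relative)
    return flagged
```

Run against the unpacked archive with `["*.jsonl"]` as the exclude list, a check like this would stop reporting the demo data and leave only files such as the .tpl template, which can carry a header in a template comment.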
> > > > 5) Bad naming of `sdist` file.
> > > >
> > > > I am not sure how you produced the .sdist file (again - no release
> > > > instructions), but when I tried to build it and compare what's in my
> > > > .sdist and your .sdist, the results were quite different, because the
> > > > name of my package (tried it with flit, hatch and build) is (correctly)
> > > > *apache_burr-0.41.0-incubating.tar.gz* and yours was
> > > > *apache-burr-0.41.0-incubating.tar.gz*. We used to have the same in
> > > > Airflow, and it caused us some serious problems when it comes to links
> > > > to our .sdist packages, and a general difference between .whl and
> > > > sdist names. **Some** old tooling used to produce such names (old
> > > > setuptools and old flit), but this has since been properly implemented
> > > > by both. The thing is that the .sdist file name SHOULD contain the
> > > > normalized distribution name - which replaces all runs of "-", "_" and
> > > > "." with a single "_" and lowercases the result [6] (unlike package
> > > > names on PyPI, this follows the binary wheel naming normalization,
> > > > which uses "_" rather than "-" in the package name [7]).
> > > >
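The normalization rule above is small enough to sketch directly. The function name is hypothetical, and the version string is passed through unchanged, since the thread only discusses the name part.

```python
import re


def normalized_sdist_filename(distribution: str, version: str) -> str:
    """Return an sdist file name with the distribution name normalized.

    Runs of "-", "_" and "." in the name collapse to a single "_",
    and the result is lowercased.
    """
    name = re.sub(r"[-_.]+", "_", distribution).lower()
    return f"{name}-{version}.tar.gz"
```

Called with `"apache-burr"` and `"0.41.0-incubating"`, this yields the underscore form of the file name that the newer build tools produce.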
> > > > 6) Easier setup of the env
> > > >
> > > > I noticed a small issue with the env when preparing the release - a
> > > > missing `cli` extra when setting up the venv to build the release. I
> > > > fixed it in [8] - I also proposed a small addition of a dev dependency
> > > > group (might split it if needed) and proposed that you might use some
> > > > more modern standardised packaging features like dependency groups and
> > > > inline script metadata. See details in the PR - we can discuss it
> > > > there.
> > > >
> > > > 7) Reproducibility from sources:
> > > >
> > > > I tried to rebuild both the .sdist and .whl packages following the
> > > > instructions. Initially I had not compiled the UI and got the UI files
> > > > missing (of course) - I understand that full automation with a custom
> > > > build hook is deferred for later (which is OK) - but (as expected) the
> > > > files in the package have different mtimes. This can be easily fixed by
> > > > hard-coding the SOURCE_DATE_EPOCH variable before the build [9], and
> > > > since you are already using instructions and scripts, that should be an
> > > > easy addition to your docs. In Airflow we have a prek hook that
> > > > automatically regenerates the date when release notes change, but at
> > > > the beginning the mtime to be used can simply be hard-coded to
> > > > basically any date. This way whoever follows your release process will
> > > > get closer to a truly reproducible package, and diffoscope will start
> > > > showing useful diffs in case there are some [10].
> > > >
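One way to pin the timestamp from a release script, sketched under assumptions: the constant below is an arbitrary fixed date (2025-01-01T00:00:00Z), the function names are invented for the example, and the build step assumes the `build` package is installed in the release environment.

```python
import os
import subprocess
import sys

# Assumption: any fixed timestamp works; this one is 2025-01-01T00:00:00Z.
FIXED_EPOCH = 1735689600


def reproducible_env(epoch: int = FIXED_EPOCH) -> dict[str, str]:
    """Copy the current environment and pin SOURCE_DATE_EPOCH."""
    env = dict(os.environ)
    env["SOURCE_DATE_EPOCH"] = str(epoch)
    return env


def build_distributions(project_dir: str) -> None:
    # Runs `python -m build` on the project with the pinned timestamp.
    subprocess.run(
        [sys.executable, "-m", "build", project_dir],
        env=reproducible_env(),
        check=True,
    )
```

Two builds run this way from the same checkout should then carry the same file mtimes, so any remaining differences reported by diffoscope point at something other than timestamps.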
> > > > Summary of things:
> > > >
> > > > MUST
> > > > * .tpl licence adding - 4)
> > > > * explain (or likely remove) the .gitmodules BentoBurr reference - 2)
> > > > * explicit rules in docs about why you exclude certain files from the
> > > > source package - 1)
> > > > * separate -source.tar.gz package with all files including docs and
> > > > likely all files (subject to rules about exclusion above) - 1)
> > > >
> > > > SHOULD:
> > > > * proper naming of sdist artifacts (with _) (simply needs newer flit
> > > > and a doc update) - 5)
> > > > * add .rat-excludes that will allow using RAT to verify the official
> > > > source packages - 4)
> > > >
> > > > NICE TO HAVE:
> > > > * shasum with filename - 3)
> > > > * simplify the env setup with inline metadata, dev dependency groups
> > > > (support for those already in uv, hatch and others) - 6)
> > > > * reproducibility setup - 7)
> > > >
> > > >
> > > >
> > > > [1] Debate about whether "tests" and "docs" should be included in .sdist
> > > > https://discuss.python.org/t/should-sdists-include-docs-and-tests/14578/26
> > > > [2] What should be included in source packages of ASF -
> > > > https://www.apache.org/legal/release-policy.html#source-packages
> > > > [3] Example email where Airflow PMC explicitly pointed to .sdist
> > > > packages being "source" packages (see the description of .sdist files)
> > > > https://lists.apache.org/thread/8ob972qkd7sy6k1pn5nskc2x0yjx2t2y
> > > > [4] The .gitattributes file in Airflow repo
> > > > https://github.com/apache/airflow/blob/main/.gitattributes
> > > > [5] RAT excludes in Airflow repo
> > > > https://github.com/apache/airflow/blob/main/.rat-excludes
> > > > [6] PEP-625 Filename of a Source Distribution -
> > > > https://peps.python.org/pep-0625/
> > > > [7] Binary packages distribution name normalization -
> > > > https://packaging.python.org/en/latest/specifications/binary-distribution-format/#escaping-and-unicode
> > > > [8] PR to fix missing cli extra and improving dev-env to use it
> > > > https://github.com/apache/burr/pull/604
> > > > [9] Flit reproducibility
> > > > https://flit.pypa.io/en/stable/reproducible.html
> > > > [10] Diffoscope - tool to show reproducibility issues
> > > > https://diffoscope.org/
> > > >
> > > > J.
> > > >
> > > >
> > > >
> > > >
> > > > On Sun, Nov 30, 2025 at 5:02 AM Elijah ben Izzy <
> > > > [email protected]> wrote:
> > > >
> > > > > Hi all! Trying again!
> > > > >
> > > > >
> > > > > This is a call for a vote on releasing Apache Burr
> > > > > 0.41.0-incubating Release Candidate 2.
> > > > >
> > > > > This release includes the following changes (see CHANGELOG for
> > > > > details).
> > > > > See all commits since prior release:
> > > > > - https://github.com/apache/burr/compare/burr-0.40.2...main
> > > > >
> > > > > Key changes include:
> > > > > - pool-based async PG persister
> > > > > - multiple UI updates
> > > > > - Apache compatible licenses/build processes
> > > > > - bug fixes, typing, etc...
> > > > >
> > > > > The artifacts for this release candidate can be found at:
> > > > > https://dist.apache.org/repos/dist/dev/incubator/burr/0.41.0-incubating-RC2/
> > > > >
> > > > > The Git tag to be voted upon is: v0.41.0
> > > > >
> > > > > The release hash is 11783ba58f8c5bd161118976ced791a2f5bd78f3
> > > > >
> > > > > Release artifacts are signed with the following key:
> > > > > BB8B72B34AB9A664A109AA17A76CF4C80E4E5355
> > > > > The KEYS file is available at:
> > > > > https://downloads.apache.org/incubator/burr/KEYS
> > > > >
> > > > > Please download, verify, and test the release candidate. For testing
> > > > > use your best judgement. The following may suffice:
> > > > >
> > > > > 1. Build/run the UI following the instructions in scripts/README.md
> > > > > 2. Run the tests in tests/
> > > > > 3. Import into a jupyter notebook and play around
> > > > >
> > > > > Highly encourage you to pip install from source, run `burr` and play
> > > > > with the UI (some UI bugs I recently discovered will be filed)
> > > > >
> > > > > The vote will run for a minimum of 72 hours.
> > > > > Please vote:
> > > > >
> > > > > [ ] +1 Release this package as Apache Burr 0.41.0-incubating
> > > > > [ ] +0 No opinion
> > > > > [ ] -1 Do not release this package because... (Please provide a
> > > > > reason)
> > > > >
> > > > > Checklist for reference:
> > > > > [ ] Download links are valid.
> > > > > [ ] Checksums and signatures.
> > > > > [ ] LICENSE/NOTICE files exist
> > > > > [ ] No unexpected binary files
> > > > > [ ] All source files have ASF headers
> > > > > [ ] Can compile from source
> > > > >
> > > > > On behalf of the Apache Burr PPMC,
> > > > >
> > > > > Elijah ben Izzy ([email protected])
> > > > >
> > > >
> > >
> >
>
