Re: [OE-core] [bitbake-devel] 'vendor' fetching discussion cont.

Richard Purdie via lists.openembedded.org Thu, 13 Feb 2025 12:52:02 -0800

On Thu, 2025-02-13 at 17:33 +0100, Stefan Herbrechtsmeier wrote:
>  Am 13.02.2025 um 11:43 schrieb Richard Purdie via lists.openembedded.org:
> 
> > Most of the concerns I've seen are about how easy it is to understand
> > what is going on behind the scenes. The move of code to OE and
> > splitting everything into multiple tasks/stages does do that to some
> > extent but it does it in a way which I think is going to create a new
> > and different set of problems.
> >
> Okay, but please keep in mind that some of my OE patches are
> reasonably independent of the native bitbake fetcher, and it is
> possible to integrate the steps from the early class into the fetch
> task.


I appreciate that, and I appreciate the desire to push things into OE
as it appears easier. However, it can lead to much looser APIs and less
structured code, and I'm wary of it here as we would create a
two-layered system which I think will be harder to understand (and
hence harder to debug and use).

  
> > I'm therefore wondering if there is a different way. The changes
> > I'm wondering about would be to:
> > 
> > a) embrace the single SRC_URI entry
> > 
> > b) require a checksum of the internal "URL list" that is included
> > in SRC_URI, much in the same way that we have checksums of
> > tarballs.
> >  
> The list isn't fixed because it depends on the configured package
> manager proxy or registry. We have to remove this feature. But the
> user could use a PREMIRROR to redirect the upstream proxy to their
> private proxy.

If that is true we have a huge problem.

By list I mean a list of something like (component, version) pairs
where component uniquely identifies the component and version is a
specific verifiable version of that component. If we can create that
list, we can checksum it and use it as above. If we can't create that
list, we have no idea what is in our builds and we may as well give up
as it isn't reproducible.
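
To make that concrete, here is a minimal sketch of checksumming such a
list. The function name and the example component are hypothetical and
nothing here is existing bitbake API; the canonical JSON encoding is
just one possible choice.

```python
import hashlib
import json


def component_list_checksum(components):
    """Return a SHA-256 hex digest over (component, version) pairs.

    The pairs are sorted first so the digest does not depend on the
    order in which a resolver happened to emit them.
    """
    # Hypothetical canonical form: a sorted JSON array of [name, version].
    canonical = json.dumps(
        sorted(list(pair) for pair in components),
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up component list, purely for illustration.
resolved = [("example-lib", "1.2.3"), ("another-lib", "0.4.0")]
print(component_list_checksum(resolved))
```

The resulting digest is what a recipe would carry alongside the single
SRC_URI entry, in the same way it carries a tarball checksum today.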

> > For better or worse, we have low trust in the underlying tools to
> > get this right (they are getting better).
> >  
>  
> We don't need to trust the tools. We parse the lock file and enrich
> it with fixed values. The resolve is deterministic. The output only
> depends on the resolve function, variable values and lock file
> content.

This assumes the "resolve" always does the same thing. I'm afraid
experience shows these can have issues. I'd much rather we have some
kind of backup in the system which tells us whether we did get the same
resolution, which is what this checksum represents.

> > c) if the checksum doesn't match, we know something went wrong and
> > error
> >  
> Can you please elaborate on this point? We already check the
> integrity of the lock file and we deterministically resolve the
> SRC_URIs.

See above. I'd like to know that the list of components and versions we
resolve everything to matches what we expect it to look like.
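
As a rough sketch of that check (the names are hypothetical and this is
not existing fetcher code), the resolved list is reduced to a digest
and compared against the value recorded in the recipe, and any mismatch
is an error:

```python
import hashlib
import json


class ResolutionMismatch(Exception):
    """Raised when the resolved component list is not the expected one."""


def verify_resolution(resolved, expected_sha256):
    """Error out if the resolved (component, version) list has changed."""
    # Assumed canonical form: sorted JSON array of [name, version] pairs.
    canonical = json.dumps(
        sorted(list(pair) for pair in resolved),
        separators=(",", ":"),
    )
    actual = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    if actual != expected_sha256:
        raise ResolutionMismatch(
            "component list checksum mismatch: "
            f"expected {expected_sha256}, got {actual}"
        )
    return actual
```

This is independent of whether the resolve step is believed to be
deterministic; it only reports when the outcome differs from what was
recorded.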

> > d) require the new modules to write the URL list into a known
> > location as part of unpack
> > 
>  Why is this needed? The generated SRC_URIs could be resolved via
> fetcher.expanded_urldata().

If someone is trying to debug what the code did or what it resolved
things to, suggesting they run python functions to work it out will be
a poor user experience. If on the other hand they know the result is
always stored in WORKDIR/xyz/ABC, they know where and what to look at.
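
For illustration only, writing the list out could be as simple as the
sketch below. The file name and JSON layout are assumptions made for
the example, and a temporary directory stands in for the real location
under WORKDIR.

```python
import json
import os
import tempfile


def write_component_list(components, outdir,
                         filename="resolved-components.json"):
    """Write sorted (component, version) pairs to outdir/filename."""
    os.makedirs(outdir, exist_ok=True)
    path = os.path.join(outdir, filename)
    rows = [{"component": name, "version": version}
            for name, version in sorted(components)]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2)
        f.write("\n")
    return path


# Usage: a temporary directory stands in for the known location.
with tempfile.TemporaryDirectory() as workdir:
    out = write_component_list([("example-lib", "1.2.3")],
                               os.path.join(workdir, "xyz"))
    with open(out, encoding="utf-8") as f:
        print(f.read())
```

A plain text file like this can be inspected or diffed between builds
without running any python by hand.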

The user experience of using this code will make or break its
adoption.

> > e) add the ability to add custom hooks in the fetch process to
> > handle the cases of needing to alter the flow for patching the
> > components list
> >  
>  I'm afraid this will be complicated since PATCH is applied in S and
> not in UNPACKDIR.

Then we should work out how to handle that. We could, for example,
allow recipes to specify the top-level directory to apply patches from?


> > f) create new tools that allow the fetcher to be stepped through
> > and for example partially run, or run with clear debug output
> > showing what was happening at each stage (show the list of
> > components?). This may be standalone tools, maybe a devtool module,
> > I don't know. We may want to make the fetch/unpack logs more useful
> > in general as right now you don't get much useful data about what
> > it is doing.
> > 
> > 
> > If we do those things, where does that get us? How much buy in do
> > our different stakeholders have?
> >
> > FWIW I am leaning towards having this code in the bitbake fetcher
> > as a first class citizen as to do otherwise is going to create
> > layers of abstraction and we probably have enough of those already.
> >  
>  
> The advantage is that the SRC_URI still contains the dependencies if
> you expand the urldata. On the other hand, the integration of the
> patches in the fetcher sounds complicated.

Patches would stay where they are in the system in do_patch and use the
code in OE-Core. I'm just thinking we could add some hooks in the fetch
process to allow adjustment of things like the resolved component list.
It doesn't have to be a patch, it could be a function that is passed
the data.
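
A very rough sketch of what I mean by a hook, with made-up names and a
made-up lock entry format rather than any existing fetcher interface:

```python
def resolve_components(lock_entries, hooks=()):
    """Turn lock file entries into (component, version) pairs.

    Each hook is a plain function that is passed the current list and
    returns the (possibly adjusted) list.
    """
    components = [(e["name"], e["version"]) for e in lock_entries]
    for hook in hooks:
        components = list(hook(components))
    return components


def pin_example_lib(components):
    """Example hook: force a specific version of one component."""
    return [(name, "1.2.4" if name == "example-lib" else version)
            for name, version in components]


# Hypothetical lock file content, for illustration only.
lock = [{"name": "example-lib", "version": "1.2.3"},
        {"name": "another-lib", "version": "0.4.0"}]
print(resolve_components(lock, hooks=[pin_example_lib]))
```

Presumably any checksum of the list would be taken after the hooks have
run, so that it covers what actually ends up in the build.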

Cheers,

Richard

