Re: [OE-core] 'vendor' fetching discussion cont.

Bruce Ashfield via lists.openembedded.org Thu, 13 Feb 2025 09:34:39 -0800

I replied to the other threads before seeing this one. Feel free to leave
those replies unanswered so that we can unify the discussion here.


On Thu, Feb 13, 2025 at 5:43 AM Richard Purdie <
richard.pur...@linuxfoundation.org> wrote:

> I've pulled this to a separate email/thread since I'd like to take a
> slight step back and put some different ideas into the mix as well as
> explain where my own thoughts are right now. I'm also doing this so
> that we can focus the discussion and give others a place to catch up
> from. For that reason I may restate some information below.
>
> We have two fundamentally different approaches:
>
> a) a single entry in SRC_URI with magic behind the scenes expanding
> this into a list of dependencies
>
> b) multiple entries in SRC_URI generated with tooling
>
> gitsm is similar to a), partly because it can contain recursive
> references to other submodules so the second approach wouldn't work.
> crates are handled with the b) approach today.
>
> Developers in general are more comfortable with b) since they can more
> easily see what is going on but it is also the harder one since it
> deviates most from the underlying tools and has two sets of data that
> need to be kept in sync.
>
> Some feel the .inc files enabling b) are effectively machine generated
> and could be removed with the work happening transparently instead,
> simplifying the recipes and commits. This allows easier use of the
> underlying tooling too.
>
> I think firstly, we need to document some key principles. Behind the
> scenes, any given url should expand to a defined list of components and
> that list has to be deterministic and not "floating", i.e. always the
> same regardless of changes in any registry or other upstream. If there
> are changes, they need to be detected and there needs to be a hard
> error. Also, if the code sees things declared in a way they could vary,
> that also needs to be a hard error.
>

And of course it has to expand into something that is compatible with our
mirroring. That was implied, but I wanted to say it explicitly :)
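
To make the "no floating references" principle concrete, here is a minimal
sketch of a check that fails hard when a component is not pinned. It is not
existing bitbake code: check_pinned, FloatingReferenceError and the
(name, ref) tuples are hypothetical, and "pinned" is assumed to mean a full
40-character commit hash.

```python
import re

# Assumption: "pinned" means a full 40-character lowercase hex commit hash.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


class FloatingReferenceError(Exception):
    """Raised when a component is declared in a way that could vary."""


def check_pinned(components):
    """Raise on the first (name, ref) pair whose ref is not a full hash."""
    for name, ref in components:
        if not FULL_SHA.fullmatch(ref):
            raise FloatingReferenceError(
                f"{name}: ref '{ref}' is not a full commit hash and could float"
            )


pinned = [("libfoo", "a" * 40), ("libbar", "0123456789abcdef" * 2 + "01234567")]
check_pinned(pinned)  # passes silently

try:
    # A branch name can move upstream, so it is rejected outright.
    check_pinned(pinned + [("libbaz", "main")])
except FloatingReferenceError as err:
    print(f"hard error: {err}")
```

A branch or tag name is rejected here even if it currently resolves to the
expected commit, since the point is to refuse anything that could vary.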


> Most of the concerns I've seen are about how easy it is to understand
> what is going on behind the scenes. The move of code to OE and
> splitting everything into multiple tasks/stages does do that to some
> extent but it does it in a way which I think is going to create a new
> and different set of problems.
>
> I'm therefore wondering if there is a different way. The changes I'm
> wondering about would be to:
>
> a) embrace the single SRC_URI entry
>

I still wonder how we'd be able to debug and/or override parts of
the single SRC_URI entry. Would you consider a lock file or a language
dependency file that could be overwritten from recipe space to be
a single SRC_URI entry? If so, I can get on board with that.

I still prefer expanding the dependencies into some sort of base /
simple fetch format, but a single file that describes all the dependencies
is close enough. As long as there's a way to inspect what the file
was processed into for fetching, there is some visibility in times
of need.

The remaining question for me is: how recursive are the dependencies
in the file referenced by the SRC_URI? If each line in that single file
is expanded into multiple further dependencies, then visibility into the
final list is low, and the reproducibility and mirroring of what eventually
gets fetched need to be guaranteed as well. That of course
isn't different from the issues which could arise with gitsm.
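
As a rough illustration of the recursion concern, the sketch below flattens
a dependency graph into one sorted, de-duplicated list that could then be
inspected and mirrored. Everything in it is an assumption for illustration:
flatten_dependencies, the children_of callback and the toy GRAPH stand in
for whatever a real lock file or submodule listing would provide.

```python
def flatten_dependencies(roots, children_of):
    """Expand top-level entries into the full, sorted list of fetch URLs.

    children_of is a hypothetical callback returning the direct
    dependencies of one entry.
    """
    seen = set()
    stack = list(roots)
    while stack:
        url = stack.pop()
        if url in seen:
            continue  # already expanded; this also guards against cycles
        seen.add(url)
        stack.extend(children_of(url))
    return sorted(seen)


# Toy dependency graph standing in for a real lock file.
GRAPH = {
    "git://example.com/app;rev=aaa": ["git://example.com/libfoo;rev=bbb"],
    "git://example.com/libfoo;rev=bbb": ["git://example.com/libbar;rev=ccc"],
    "git://example.com/libbar;rev=ccc": [],
}

flat = flatten_dependencies(
    ["git://example.com/app;rev=aaa"], lambda u: GRAPH.get(u, [])
)
for url in flat:
    print(url)
```

Sorting the result means the same inputs always produce the same list,
however deep the recursion goes.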


> b) require a checksum of the internal "URL list" that is included
> in SRC_URI, much in the same way that we have checksums of tarballs.
> For better or worse, we have low trust in the underlying tools to get
> this right (they are getting better).
>

This would be a checksum of the fully expanded dependencies of
the SRC_URI entry?
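
If that reading is right, one way it could work is sketched below: hash a
canonical form of the expanded URL list and compare it against a value
carried in the recipe. The function names, the example URLs and the choice
of sha256 over a sorted, newline-joined list are all assumptions, not
anything the fetcher does today.

```python
import hashlib


def url_list_checksum(urls):
    """Return the sha256 hex digest of the sorted, newline-joined URL list."""
    canonical = "\n".join(sorted(urls)) + "\n"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_url_list(urls, expected):
    """Raise if the expanded list does not match the recorded checksum."""
    actual = url_list_checksum(urls)
    if actual != expected:
        raise ValueError(
            f"URL list checksum mismatch: expected {expected}, got {actual}"
        )


urls = [
    "https://example.com/crates/foo-1.0.0.crate",
    "https://example.com/crates/bar-2.3.1.crate",
]
recorded = url_list_checksum(urls)  # what the recipe would carry
verify_url_list(list(reversed(urls)), recorded)  # order does not matter

try:
    # An extra component appearing upstream changes the checksum.
    verify_url_list(urls + ["https://example.com/crates/baz-0.1.0.crate"], recorded)
except ValueError as err:
    print("hard error:", err)
```

Sorting before hashing keeps the checksum stable when the resolver emits
the same components in a different order.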


>
> c) if the checksum doesn't match, we know something went wrong and
> raise an error
>
> d) require the new modules to write the URL list into a known location
> as part of unpack
>

Aha. That answers the question that I had above.


> e) add the ability to add custom hooks in the fetch process to handle
> the cases of needing to alter the flow for patching the components list
>

Or potentially detect that the output of d) has already been supplied, and
skip the dependency resolution?
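
A sketch of that idea, under the assumption that the URL list from d) is a
JSON file at some agreed path: if the file is already there, use it as-is;
otherwise resolve and write it out. load_or_resolve, the resolve callback
and the url-list.json name are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_or_resolve(list_path, resolve):
    """Return (urls, resolved), using a supplied list when one exists.

    list_path is the hypothetical "known location" from d); resolve is a
    callback standing in for the real dependency resolution.
    """
    if list_path.exists():
        # A list was supplied (for example by a layer): skip resolution.
        return json.loads(list_path.read_text()), False
    urls = sorted(resolve())
    list_path.parent.mkdir(parents=True, exist_ok=True)
    list_path.write_text(json.dumps(urls, indent=2) + "\n")
    return urls, True


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "unpack" / "url-list.json"
    first = load_or_resolve(
        path, lambda: ["git://example.com/b", "git://example.com/a"]
    )
    # The second call finds the file and never invokes the resolver.
    second = load_or_resolve(path, lambda: ["git://example.com/ignored"])
    print(first)
    print(second)
```

A supplied list would still need to pass the same checksum and pinning
checks as a resolved one, otherwise this becomes a way around them.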


> f) create new tools that allow the fetcher to be stepped through and
> for example partially run, or run with clear debug output showing what
> was happening at each stage (show the list of components?). This may be
> standalone tools, maybe a devtool module, I don't know. We may want to
> make the fetch/unpack logs more useful in general as right now you
> don't get much useful data about what it is doing.
>
>
> If we do those things, where does that get us? How much buy in do our
> different stakeholders have?
>

I think it is getting closer!

If there's a way to see all of the individual fetches, and change those
fetches, then it solves most of the issues that I've been using git://
fetches for in my go recipes.

Bruce



>
> FWIW I am leaning towards having this code in the bitbake fetcher as a
> first class citizen as to do otherwise is going to create layers of
> abstraction and we probably have enough of those already.
>
> Cheers,
>
> Richard

-- 
- Thou shalt not follow the NULL pointer, for chaos and madness await thee
at its end
- "Use the force Harry" - Gandalf, Star Trek II

