I've pulled this to a separate email/thread since I'd like to take a
slight step back and put some different ideas into the mix as well as
explain where my own thoughts are right now. I'm also doing this so
that we can focus the discussion and give others a place to catch up
from. For that reason I may restate some information below.

We have two fundamentally different approaches:

a) a single entry in SRC_URI with magic behind the scenes expanding
this into a list of dependencies

b) multiple entries in SRC_URI generated with tooling

gitsm is similar to a), partly because it can contain recursive
references to other submodules so the second approach wouldn't work.
crates are handled with the b) approach today.

Developers in general are more comfortable with b) since they can more
easily see what is going on, but it is also the harder approach since
it deviates most from the underlying tools and has two sets of data
that need to be kept in sync.

Some feel the .inc files enabling b) are effectively machine generated
and could be removed with the work happening transparently instead,
simplifying the recipes and commits. This allows easier use of the
underlying tooling too.

Firstly, I think we need to document some key principles. Behind the
scenes, any given url should expand to a defined list of components and
that list has to be deterministic and not "floating", i.e. always the
same regardless of changes in any registry or other upstream. If there
are changes, they need to be detected and there needs to be a hard
error. Also, if the code sees things declared in a way they could vary,
that also needs to be a hard error.
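
To make the "not floating" principle concrete, here is a minimal sketch
of the kind of check I mean. It is not existing bitbake code: the
component dictionaries, the check_pinned() name and the exception are
all made up for illustration, and it assumes git components are pinned
by a full 40-character commit hash and everything else by a sha256sum.

```python
import re


class FloatingComponentError(Exception):
    """Raised when a component could resolve differently over time."""


# Assumption: git revisions are full 40-character SHA-1 hashes and
# archives carry a 64-character SHA-256 checksum.
_GIT_REV = re.compile(r"[0-9a-f]{40}")
_SHA256 = re.compile(r"[0-9a-f]{64}")


def check_pinned(components):
    """Return the list unchanged, or raise if any entry is floating."""
    for comp in components:
        url = comp["url"]
        if url.startswith("git://"):
            # A branch or tag name can move, so only a full hash passes.
            if not _GIT_REV.fullmatch(comp.get("rev", "")):
                raise FloatingComponentError(
                    f"{url}: revision is not a full commit hash")
        elif not _SHA256.fullmatch(comp.get("sha256sum", "")):
            raise FloatingComponentError(
                f"{url}: missing or malformed sha256sum")
    return components
```

With this, an entry declared with rev "main" fails immediately rather
than silently resolving to whatever upstream points at that day.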

Most of the concerns I've seen are about how easy it is to understand
what is going on behind the scenes. The move of code to OE and
splitting everything into multiple tasks/stages does address that to
some extent, but it does so in a way which I think is going to create a
new and different set of problems.

I'm therefore wondering if there is a different way. The changes I'm
wondering about would be to:

a) embrace the single SRC_URI entry

b) require a checksum of the internal "URL list" that is included
in SRC_URI, much in the same way that we have checksums of tarballs.
For better or worse, we have low trust in the underlying tools to get
this right (they are getting better).

c) if the checksum doesn't match, we know something went wrong and
raise a hard error

d) require the new modules to write the URL list into a known location
as part of unpack

e) add the ability to add custom hooks in the fetch process to handle
the cases of needing to alter the flow for patching the components list

f) create new tools that allow the fetcher to be stepped through and,
for example, partially run, or run with clear debug output showing what
was happening at each stage (show the list of components?). These may
be standalone tools, maybe a devtool module, I don't know. We may want
to make the fetch/unpack logs more useful in general as right now you
don't get much useful data about what the fetcher is doing.
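
As a rough illustration of b) and c), the checksum of the internal URL
list could be computed along these lines. This is only a sketch under
my own assumptions: the function names and the exception are
hypothetical, the list is sorted before hashing so that ordering
differences in the underlying tool don't change the result, and SHA-256
is picked simply because we already use it for tarballs.

```python
import hashlib


class UrlListChecksumError(Exception):
    """Raised when the expanded URL list does not match the recipe."""


def url_list_checksum(urls):
    """Hash the expanded URL list, one URL per line, in sorted order."""
    digest = hashlib.sha256()
    for url in sorted(urls):
        digest.update(url.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def verify_url_list(urls, expected):
    """Return the checksum if it matches, otherwise fail hard."""
    actual = url_list_checksum(urls)
    if actual != expected:
        raise UrlListChecksumError(
            f"URL list checksum mismatch: expected {expected}, got {actual}")
    return actual
```

Whether the list should be sorted or hashed in the order the tool
produced it is an open question. Sorting hides reordering, which may or
may not be something we want to detect.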


If we do those things, where does that get us? How much buy-in do our
different stakeholders have?

FWIW I am leaning towards having this code in the bitbake fetcher as a
first class citizen, since doing otherwise is going to create more
layers of abstraction and we probably have enough of those already.

Cheers,

Richard







