Re: [yocto] [Openembedded-architecture] [OE-core] Core workflow: sstate for all, bblock/bbunlock, tools for why is sstate not being reused?

Mark Hatle Mon, 06 Nov 2023 11:44:50 -0800


On 11/5/23 1:43 PM, Adrian Freihofer wrote:

On Sat, 2023-11-04 at 11:09 +0000, Richard Purdie wrote:

On Sat, 2023-11-04 at 11:29 +0100, adrian.freiho...@gmail.com wrote:

Hi Alex, hi Richard

After some internal discussions, I would like to clarify my previous
answers on this topic.

  * Usually there are two different workflows
     - application developers: could use an SDK with a locked sstate-cache.
     - Yocto/BSP developers: need an unlocked SDK. They change the recipes.
  * A locked SDK
     - can work with setscene from SSTATE_MIRRORS
     - setscene does caching in the SSTATE_DIR (no issue about that)
     - But network problems can occur during the initial build because
       bitbake executes many independent setscene tasks. Opening so many
       independent connections slows down the build, especially if the
       server treats them as a denial of service attack.
     - The denial of service problem is difficult to solve because each
       setscene task runs in its own bitbake task. Reusing a connection to
       download multiple sstate artifacts seems almost impossible.
       This is much easier to solve with a separate sstate download script.


FWIW, we did have a similar issue with do_fetch overloading
servers/proxies/ISPs and added:

do_fetch[number_threads] = "4"

Finding the right place to put a thread limit on overall setscene tasks
is harder but in theory possible. Or perhaps a "network capable tasks"
thread limit?

Is the overload caused by the initial query of sstate presence, or,
does it happen when the setscene tasks themselves run?


The most extreme situation is probably bitbake --setscene-only with an
empty TMPDIR. Each of the setscene tasks establishes a new connection.
A server receives so many connections that it treats them as a denial
of service attack by throttling. A separate script would allow the same
connection to be reused to download all the required artifacts.
Limiting the number of threads does not really solve the issue because
the same number of connections is still opened in quick succession.
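
A minimal sketch of what such a separate download script could look like,
in Python. It only illustrates the connection reuse: a small, fixed number
of workers each open one connection and keep pulling artifacts from a
shared queue. The open_connection factory and its fetch()/close() methods
are hypothetical placeholders for whatever transport is used (sftp, https);
nothing here is an existing bitbake API.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def download_all(artifacts, open_connection, max_connections=4):
    """Fetch every artifact using at most max_connections connections.

    open_connection is a caller-supplied factory (hypothetical) returning an
    object with fetch(path) and close(). Each worker opens one connection
    and reuses it for every artifact it takes from the shared queue.
    """
    queue = deque(artifacts)
    if not queue:
        return {}
    results = {}

    def worker():
        conn = open_connection()
        try:
            while True:
                try:
                    path = queue.popleft()  # deque.popleft is thread-safe
                except IndexError:
                    return  # queue drained, this worker is done
                results[path] = conn.fetch(path)
        finally:
            conn.close()

    workers = min(max_connections, len(queue))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker) for _ in range(workers)]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return results
```

With this shape the server sees at most max_connections connections no
matter how many artifacts are requested.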

  * An unlocked SDK
     - Tries to download the sstate cache for changed recipes and their
       dependencies, which obviously can't work.
     - The useless download requests slow down the build considerably and
       cause a high load on the servers without any benefit.


Is this sstate over http(s) or something else? I seem to remember you
mentioning sftp. If this were using sftp, it would be horribly slow as
it was designed for a light overhead "does this exist?" check which
http(s) can manage well.


Yes, we are evaluating sftp. You are right, it is not optimal from a
performance point of view. For example S3 is much faster. A compromise
is to set up a limited number of parallel sftp connections. This has
worked very well so far.

The question of why we use sftp brings us to a larger topic that is
probably relevant for almost all Yocto users, but not for the Yocto
project itself: Security.

There is usually a git server infrastructure that makes it possible to
protect Git repositories with finely graded access policies. As the
sstate-cache contains the same source code, the protection concept for
the Git repositories must also be applied to the sstate-cache
artifacts.

First of all, user authentication is required for the sstate-mirror.
An obvious idea is to use the same user authentication for the
sstate-cache server as for the Git server. In addition to https, ssh is
also often used for git repositories. SSH even offers some advantages in
terms of user-friendliness and security (if an ssh agent is used). This
consideration finally leads us to use the sftp protocol for the sstate
mirror. This is also relatively easy to administer: Simply copy the
user's public ssh keys from the git server to the sftp server.

While being able to support ssh (or a related protocol) is useful, you need to
also remember that MANY MANY organizations absolutely block SSH access through
their firewalls. So _requiring_ sftp would be bad. Allowing its usage would
be good.

As for logging in, https is transport 'security' but not authentication without
additional helpers. I think it's absolutely reasonable to say https access
either needs an external helper for authentication purposes or it's
un-authenticated. If you want authentication (internal to the company), then
ssh/sftp or similar using the ssh-agent (or similar) should be the suggested
approach.

We don't want to exclude anyone, but we want to be clear on the limitations
based on an organization's specific choice.

If one then wants to scale an sstate-cache server for many different
projects and users, one quickly wishes for an option for authorization
at artifact level. Ideally, the access rights to the source code would
be completely transferred to the associated sstate artifacts. For such
an authorization, the sstate mirror server would need the SRC_URI
that was used to build the sstate artifact. With this information,
it could ask the Git server whether or not a user has access to all
source code repositories to grant or deny access to a particular sstate
artifact. It should not be forgotten that the access rights to the Git
repositories can change.
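
To make the idea concrete, here is a minimal Python sketch of such a
per-artifact check. It assumes the mirror stores, next to each artifact,
the list of source repositories taken from the recipe's SRC_URI, and that
the Git server can be asked per user and repository. Both may_download and
has_repo_access are hypothetical names, not an existing API.

```python
def may_download(user, artifact_repos, has_repo_access):
    """Return True only if user may read every repository behind the artifact.

    artifact_repos: repository URLs recorded for the artifact (assumed to be
    derived from the recipe's SRC_URI and stored alongside the artifact).
    has_repo_access: callable(user, repo) -> bool that asks the Git server.
    It is evaluated on every request, so revoked rights take effect at once.
    """
    if not artifact_repos:
        # No recorded sources: deny rather than guess (a design assumption).
        return False
    return all(has_repo_access(user, repo) for repo in artifact_repos)
```

Asking the Git server on each request, instead of caching the answer,
is what covers the case of access rights changing over time.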

In my experience you do not use _one_ sstate-cache for multiple projects (at an
organization level); each project is responsible for its own cache. This
prevents even the possibility that one project could use code not intended for it.

From a more generic Yocto Project perspective, this means you really want to
use a hierarchy of sstate-caches. (Maybe not a true hierarchy.) I.e. I use YP,
so I get the YP sstate-cache for the base functionality. I use
meta-openembedded, so I want the meta-openembedded cache... project A, I want
the project's cache, OE and YP caches as well.. project B, I want that project's
cache, OE and YP caches. Project C? I might want its cache, Project A,
Project B, and OE and YP. You can see this gets complicated quickly.

If this isn't intended or isn't a good idea, then alternatives need to be
provided for this. Everyone always ends up with an upstream provider (or
providers) be it YP, OE, OSVs, ISVs, local company resources, etc. How do we
manage this and keep it aligned?

Bring in hash equivalency and PR service and things get complicated. The
sstate-cache itself is NOT separable from those services. There are ways to
decouple them, but they can be 'extreme'. I.e. turn off hash-equivalency, no
need for a hash-equivalency service. Don't cache the do_package_write* files,
no PR service.... (but even that isn't foolproof due to git AUTOINC... so you
end up seeding the AUTOINC with static entries or some other method...)

All of these items need to be dealt with and documented together. My PERSONAL
preference (without knowing any specific implementation details) is that the
contents of hash-equivalency and PR service are somehow stored with the
sstate-cache.

One possible way this could be done.. System starts up, determines it needs
something it doesn't have, then goes out and checks if an updated index is
present. If it is, it downloads it and adds it to its hash equivalency server.
If no index is present, it can then look for the file, let's say,
"sstate:....link". If that comes back, we know we have an equivalency; it's
downloaded, added to the local database, and then the pointed-to file is
retrieved (.siginfo and .tar.xz or whatever). This would ensure that the index
is an optimization, but not a requirement, and would allow a "live"
sstate-cache while losing some performance. (This doesn't negate any of the
comments about rights or possibility to DoS a server via too many connections!)
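
A rough Python sketch of that lookup order, purely to illustrate the flow.
Every name here (resolve, fetch_index, fetch_link, fetch_artifact) is
hypothetical, and the hash database is reduced to a plain dict. One
assumption beyond the description above: the link file is also tried when
an index exists but does not cover the requested hash.

```python
def resolve(task_hash, local_equiv, fetch_index, fetch_link, fetch_artifact):
    """Return the artifact for task_hash, or None when the mirror has nothing.

    local_equiv: dict mapping a task hash to its equivalent hash (local DB).
    fetch_index(): returns a dict of equivalences, or None if no index exists.
    fetch_link(h): returns the hash a ".link" file points to, or None.
    fetch_artifact(h): returns the artifact, or None if it is not present.
    """
    if task_hash not in local_equiv:
        index = fetch_index()
        if index is not None:
            # The index is only an optimization: merge it into the local DB.
            local_equiv.update(index)
        if task_hash not in local_equiv:
            # No index, or index without this hash: try the per-hash link file.
            target = fetch_link(task_hash)
            if target is not None:
                local_equiv[task_hash] = target
    # Fall back to the task hash itself when no equivalence is known.
    return fetch_artifact(local_equiv.get(task_hash, task_hash))
```

The mirror stays usable without any index at all; the index merely saves
the per-hash link requests.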

Still have to solve the PR service problem, but this could get 'seeded' via the
associated do_write_package siginfo or similar.. and for the AUTOINC, seed it
from the siginfo file for a given hash?

Doing something like the above could then allow the order specified in
SSTATE_MIRRORS to be used to truly indicate the order things are resolved and
loaded.
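
As a small illustration of "order means priority", here is a Python sketch
in which the first mirror in the list that has the artifact wins. The
mirror names, the artifact names and the has_artifact lookups are made up
for the example; this is not how bitbake parses SSTATE_MIRRORS.

```python
def resolve_in_order(artifact, mirrors):
    """Return the name of the first mirror that has artifact, or None.

    mirrors: ordered list of (name, has_artifact) pairs, most specific first;
    has_artifact is a callable(artifact) -> bool (hypothetical lookup).
    """
    for name, has_artifact in mirrors:
        if has_artifact(artifact):
            return name
    return None


# Hypothetical layering for "project A": project cache first, then OE, then YP.
project_a = {"app-a"}
oe = {"libfoo"}
yp = {"glibc", "libfoo"}
mirrors = [
    ("project-a", project_a.__contains__),
    ("oe", oe.__contains__),
    ("yp", yp.__contains__),
]
```

Here "libfoo" exists in both the OE and YP caches, and the OE copy is
used simply because that mirror is listed first.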


Recently we've been wondering about teaching the hashequiv server about
"presence", which would then mean the build would only query things
that stood a good chance of existing.

Yes, that sounds very interesting. There is probably even more metadata
of this kind which could be provided by the hashserver to improve
the management of a shared sstate mirror.

Would it make sense to include e.g. the SRC_URI in the hashserv
database and extend the hashserver's API to also provide meta data e.g.
for the authorization of the sstate-mirror? Or is security and
authorization something which should be handled independently from hash
equivalence?

The more I've thought about this, any sort of query directly to a remote
hashservice seems more and more problematic.. Local hash database, absolutely
needed as an optimization.

There is a second problem. In my org, for instance, it's easy for me to request
an https server where I can serve files to the public. But asking our IT to
support a hash equivalency (and PR) server? This will likely take months of
negotiation, possible security review, mitigation process, etc etc etc.. and no
guarantee that it will actually get approved. I expect other people will be in a
similar situation.

Another topic where additional metadata about the sstate-cache seems
to be beneficial is sstate-mirror retention. Knowing which artifact was
compiled for which tag or commit of the bitbake layer could help to
wipe out some artifacts which are not needed anymore.
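
A minimal Python sketch of such a retention rule, assuming the mirror
records for each artifact the layer commits or tags it was built for. The
function name and the metadata layout are hypothetical. Artifacts without
any recorded metadata are deliberately kept, since nothing is known about
them.

```python
def artifacts_to_delete(artifact_commits, retained_commits):
    """Return, sorted, the artifacts built only for commits no longer retained.

    artifact_commits: dict mapping artifact name -> collection of layer
    commits or tags it was built for (assumed to be recorded at upload time).
    retained_commits: set of commits/tags whose artifacts must be kept.
    """
    retained = set(retained_commits)
    return sorted(
        name
        for name, commits in artifact_commits.items()
        # Keep artifacts with no metadata; delete only known-stale ones.
        if commits and not (set(commits) & retained)
    )
```

An artifact shared between an old and a retained commit survives, because
one retained reference is enough to keep it.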

     - A script which gets a list of sstate artifacts from bitbake and then
       does an upfront download works much better
        + The script runs only when the user calls it or the SDK gets
          bootstrapped
        + The script uses a reasonable amount of parallel connections which
          are re-used for more than one artifact download


Explaining to users they need to do X before Y quickly gets tiring,
both for people explaining it and the people doing it trying to
remember. I'd really like to get to a point where the system "does the
right thing" if we can.

I don't believe the problems you describe are insurmountable. If you
are using sftp, that is going to be a big chunk of the problem as the
system assumes something faster is available. Yes, I've taken patches
to make sftp work but it isn't recommended at all. I appreciate there
would be reasons why you use sftp but if it is possible to get a list
of "available sstate" via other means, it would improve things.

  * Idea for a smart lock/unlock implementation
     - From a user's perspective a locked vs. an unlocked SDK does not make
       much sense. It makes more sense if the SDK would automatically
       download the sstate-cache if it is expected to be available.
       Let's think about an implementation (which allows to override the
       logic) to switch from automatic to manual mode:

SSTATE_MIRRORS_ENABLED ?= "${is_sstate_mirror_available()}"


What determines this availability? I worry that is something very
fragile and specific to your use case. It is also not an all or nothing
binary thing.


It would probably be better to query a hashserver whether an artifact is
present.

       In our case the sstate mirror is expected to provide all artifacts
       for tagged commits and for some git branches of the layer
       repositories.
       The sstate is obviously not usable for a "dirty" git layer
       repository.


That isn't correct and isn't going to work. If I make a single change
locally, there is a good chance that 99.9% of the sstate could still be
valid in some cases. Forcing the user through 10 hours of rebuild when
potentially that much was available is a really really bad user
experience.


Maybe there is a better idea.

       That's what the is_sstate_mirror_available function
       could check to automatically enable and disable lazy downloads.
     - If is_sstate_mirror_available() returns false, it should still be
       possible to initiate an sstate-cache download manually.

  * Terminology

     - Older Yocto Releases:
        + eSDK means an installer which provides a different environment with
          different tools
        + The eSDK was static, with a locked sstate cache
        + Was for one MACHINE, for one image...
     - Newer Yocto Releases:
        + The bitbake environment offers all features of the eSDK installer. I
          consider this as already implemented with meta-ide-support and
          build-sysroots.


Remember bblock and bbunlock too. These provide a way to fix or unlock
specific sections of the codebase. Usually a developer has a pretty
good idea of which bits they want to allow to change. I don't think
people have yet realised/explored the potential these offer.


Yes, I also started thinking about the possibilities we would get for
the SDK if there is a hash-server, or an even more generic metadata
server for the sstate-cache, in the middle of the infrastructure
picture. It would probably solve some challenges for which I have not
found a solution so far.

Using the standard download model/approach we already have a "generic" metadata
server approach (and standard download URI supported by bitbake). The
specialized approaches (prserver/hashserver) are where we run into issues
because it's no longer "generic" and well understood by others. Need to figure
out a way for this all to work and allow the most "reasonable" re-use we can.


--Mark


Thank you for your response.

Adrian

Cheers,

Richard

