On Tue, Jun 18, 2024 at 8:48 AM Marta Rybczynska <rybczyn...@gmail.com> wrote: > > Hello, > This is my first round of the review, before applying the code. > > On Mon, Jun 10, 2024 at 11:45 PM Joshua Watt via lists.openembedded.org > <JPEWhacker=gmail....@lists.openembedded.org> wrote: > > | +SPDX_PROFILES ?= "core build software simpleLicensing security" > > I would like to discuss this choice by default. People who do security won't > need the 'build' profile (and this likely adds > much data). On the other hand, people who do licensing won't need the > 'security' profile. > > It would be really nice to generate a basic SPDX without doing the build...
I think we should look at that later just to keep the scope reasonable; This is purposely trying to stick close to what SPDX 2.2 gives us. > >> >> + >> +SPDX_HASH_ALGORITHMS ?= "sha256 sha1" > > > Any reason to still use sha1? It is obsolete. We could use sha3 instead if we > want to have multiple ones. I've dropped this in the latest version, in favor of only using sha256 to make my life easier :) > > >> >> >> + >> + >> +python do_create_spdx() { >> + import oe.sbom30 >> + import oe.spdx30 >> + from pathlib import Path >> + from contextlib import contextmanager >> + import oe.cve_check >> + from datetime import datetime >> + >> + >> + def set_var_field(var, obj, name, package=None): >> + val = None >> + if package: >> + val = d.getVar("%s:%s" % (var, package)) >> + >> + if not val: >> + val = d.getVar(var) >> + >> + if val: >> + setattr(obj, name, val) >> + >> + deploydir = Path(d.getVar("SPDXDEPLOY")) >> + deploy_dir_spdx = Path(d.getVar("DEPLOY_DIR_SPDX")) >> + spdx_workdir = Path(d.getVar("SPDXWORK")) >> + include_sources = d.getVar("SPDX_INCLUDE_SOURCES") == "1" >> + pkg_arch = d.getVar("SSTATE_PKGARCH") >> + is_native = bb.data.inherits_class("native", d) or >> bb.data.inherits_class("cross", d) >> + >> + build_objset = oe.sbom30.ObjectSet.new_objset(d, d.getVar("PN")) >> + >> + build = build_objset.new_sub_build("recipe", "recipe") >> + build_objset.doc.rootElement.append(build) >> + >> + build_objset.set_is_native(is_native) >> + >> + for var in (d.getVar('SPDX_CUSTOM_ANNOTATION_VARS') or "").split(): >> + new_annotation( >> + d, >> + build_objset, >> + build, >> + "%s=%s" % (var, d.getVar(var)), >> + oe.spdx30.AnnotationType.other >> + ) >> + >> + build_inputs = set() >> + >> + # Add CVEs >> + cve_by_status = {} >> + cve_by_status["Patched"] = {cve: build_objset.new_cve_vuln(cve) for cve >> in oe.cve_check.get_patched_cves(d)} >> + >> + for cve in (d.getVarFlags("CVE_STATUS") or {}): >> + status, _, _ = oe.cve_check.decode_cve_status(d, cve) >> + # Already added above >> + if status == "Patched": >> + continue >> + >> + cve_by_status.setdefault(status, {})[cve] = >> build_objset.new_cve_vuln(cve) >> + >> + cpe_ids = oe.cve_check.get_cpe_ids(d.getVar("CVE_PRODUCT"), >> d.getVar("CVE_VERSION")) > > > This will be adding only those from patches and everything with CVE_STATUS. > In some cases (extra-exclusions) vulnerabilities > that have no real link with the given package. Also, the list isn't complete, > does not contain most of the unpatched ones. > >> +do_create_spdx[vardepsexclude] += "BB_NUMBER_THREADS" >> +# NOTE: depending on do_unpack is a hack that is necessary to get it's >> dependencies for archive the source >> +addtask do_create_spdx after do_populate_sysroot do_package do_packagedata >> do_unpack do_collect_spdx_deps do_package_write_ipk do_pacakge_write_deb >> do_package_write_rpm before do_populate_sdk do_build do_rm_work >> + > > > This is really late... I fear that if we get that in, it will be too hard to > make the generation mode modulable and output > certain profiles at a different place than the whole build. With the same > problem for the world build as we have today > with 2.2. Yep, but I think we need to focus on that after the initial implementation. It's a solvable problem, but it's going to to be tricky so it will take time, and I'd rather get SPDX 3 support out early. > >> +def spdxid_hash(*items): >> + h = hashlib.md5() >> + for i in items: >> + if isinstance(i, oe.spdx30.Element): >> + h.update(i._id.encode("utf-8")) >> + else: >> + h.update(i.encode("utf-8")) >> + return h.hexdigest() >> + > > > Can we avoid MD5? It's not cryptographic, so it should not matter. I don't want the hash strings to be _terribly_ long, so unless you have some other short hashing algorithm, I'd rather stick with this. >> >> + def new_cve_vuln(self, cve): >> + v = oe.spdx30.security_Vulnerability() >> + v._id = self.new_spdxid("vulnerability", cve) >> + v.creationInfo = self.doc.creationInfo >> + >> + v.externalIdentifier.append( >> + oe.spdx30.ExternalIdentifier( >> + externalIdentifierType=oe.spdx30.ExternalIdentifierType.cve, >> + identifier=cve, >> + identifierLocator=[ >> + f"https://cve.mitre.org/cgi-bin/cvename.cgi?name={cve}", >> + f"https://www.cve.org/CVERecord?id={cve}", >> + ], > > > Better to link to the JSON format https://cveawg.mitre.org/api/cve/{cve} Will do, thanks. Should that be a third entry, or the only entry, or.... ? > >> >> + ) >> + ) >> + return self.add(v) >> + >> + def new_vex_patched_relationship(self, from_, to): >> + self.add( >> + oe.spdx30.security_VexFixedVulnAssessmentRelationship( >> + _id=self.new_spdxid("vex-fixed", spdxid_hash(from_, *to)), >> + creationInfo=self.doc.creationInfo, >> + from_=from_, >> + relationshipType=oe.spdx30.RelationshipType.fixedIn, >> + to=to, >> + security_vexVersion=VEX_VERSION, >> + ) >> + ) > > > For machine-readable entries, this should be linking to a proof of a fix, for > example in > a relationship to the fix file. Yes, I think we can do that, but we need cve_check.py to be able to give us the file names (or hashes?) so we can correlate them.... I'll log a bug to do that later. > >> >> + >> + def new_vex_unpatched_relationship(self, from_, to): >> + self.add( >> + oe.spdx30.security_VexAffectedVulnAssessmentRelationship( >> + _id=self.new_spdxid("vex-affected", spdxid_hash(from_, >> *to)), >> + creationInfo=self.doc.creationInfo, >> + from_=from_, >> + relationshipType=oe.spdx30.RelationshipType.affects, >> + to=to, >> + security_vexVersion=VEX_VERSION, >> + ) >> + ) >> + >> + def new_vex_ignored_relationship(self, from_, to, *, impact_statement): >> + self.add( >> + oe.spdx30.security_VexNotAffectedVulnAssessmentRelationship( >> + _id=self.new_spdxid( >> + "vex-not-affected", spdxid_hash(impact_statement, >> from_, *to) >> + ), >> + creationInfo=self.doc.creationInfo, >> + from_=from_, >> + relationshipType=oe.spdx30.RelationshipType.doesNotAffect, >> + to=to, >> + security_vexVersion=VEX_VERSION, >> + security_impactStatement=impact_statement, >> + ) >> + ) > > > Missing conversion of 'Ignored' to 'justificationType' which is necessary to > allow machine processing of VEX statements. With 'impactStatement' only, > which is freeform text, this is human readable only. Ya, I think I can add that based on the not-applicable-platform or not-applicable-config > > That's it in the first pass. There will be more when I can have a look in the > output. > > Regards, > Marta
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#200876): https://lists.openembedded.org/g/openembedded-core/message/200876 Mute This Topic: https://lists.openembedded.org/mt/106602521/21656 Group Owner: openembedded-core+ow...@lists.openembedded.org Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-