From b492079c33b4179d8d4153da6106a958c23a3400 Mon Sep 17 00:00:00 2001 From: Seth Michael Larson Date: Thu, 30 Jan 2025 16:38:30 -0600 Subject: [PATCH] PEP 770: Add sections for Users, Projects, and SCA tools in 'How to Teach' (#4222) --- peps/pep-0770.rst | 141 +++++++++++++++++++++++++++++++++++++--------- 1 file changed, 114 insertions(+), 27 deletions(-) diff --git a/peps/pep-0770.rst b/peps/pep-0770.rst index c7f50a17381..7c25d862e7b 100644 --- a/peps/pep-0770.rst +++ b/peps/pep-0770.rst @@ -15,14 +15,19 @@ Post-History: Abstract ======== +Almost all Python packages today are accurately measurable by software +composition analysis (SCA) tools. For projects that are not accurately +measurable, there is no existing mechanism to annotate a Python package +with composition data to improve measurability. + Software Bill-of-Materials (SBOM) is a technology-and-ecosystem-agnostic method for describing software composition, provenance, heritage, and more. -SBOMs are used as inputs for software composition analysis (SCA) tools, -such as scanners for vulnerabilities and licenses, and have been gaining -traction in global software regulations and frameworks. +SBOMs are used as inputs for SCA tools, such as scanners for vulnerabilities and +licenses, and have been gaining traction in global software regulations and +frameworks. This PEP proposes using SBOM documents included in Python packages as a -means to improve software measurability for Python packages. +means to improve automated software measurability for Python packages. The changes will update the `Core Metadata specification `__ to version 2.5. @@ -141,37 +146,60 @@ In addition to the above, an informational PEP will be created for tools consuming included SBOM documents and other Python package metadata to generate complete SBOM documents for Python packages. +Terminology +----------- + +This section describes terminology used later in the document: + +.. 
glossary:: + + root SBOM directory + The directory under which SBOM files are stored in a + :term:`project source tree`, :term:`distribution archive` + or :term:`installed project`. + Also, the root directory that their paths + recorded in the :ref:`Sbom-File <770-spec-sbom-file-field>` + :term:`Core Metadata field` are relative to. + Defined to be the :term:`project root directory` + for a :term:`project source tree` or + :term:`source distribution `; + and a subdirectory named ``sboms`` of + the directory containing the :term:`built metadata`— + i.e., the ``.dist-info/sboms`` directory— + for a :term:`Built Distribution` or :term:`installed project`. + .. _770-spec-core-metadata: Core Metadata ------------- +.. _770-spec-sbom-file-field: + Add ``Sbom-File`` field ~~~~~~~~~~~~~~~~~~~~~~~ -The ``Sbom-File`` is an optional Core Metadata field. Each instance contains a -string representation of the path of an SBOM document. The path is located -within the project source tree, relative to the project root directory. It is a +The ``Sbom-File`` is a new optional Core Metadata field. Each instance contains a +string representation of the path to an SBOM document. The path is specified +relative to the :term:`root SBOM directory` for all project types. It is a multi-use field that MAY appear zero or more times and each instance lists the path to one such file. Files specified under this field are SBOM documents that are distributed with the package. As `specified by this PEP <#770-spec-project-formats>`__, its value is also -that file's path relative to the root SBOM directory in both installed projects -and the standardized Distribution Package types. +that file's path relative to the :term:`root SBOM directory` in both installed +projects and the standardized Distribution Package types. 
If an ``Sbom-File`` is listed in a :term:`Source Distribution ` or :term:`Built Distribution`'s Core Metadata: * That file MUST be included in the :term:`distribution archive` at the - specified path relative to the root SBOM directory. + specified path relative to the :term:`root SBOM directory`. * Installers MUST install the file with the :term:`project` at that same relative path. -* Inside the root SBOM directory, packaging tools MUST reproduce the directory - structure under which the source files are located relative to the project - root. The root SBOM directory is - `specified in a later section <#770-spec-project-formats>`__. +* Inside the :term:`root SBOM directory`, packaging tools MUST reproduce the + directory structure under which the source files are located relative to the + project root. * Path delimiters MUST be the forward slash character (``/``), and parent directory indicators (``..``) MUST NOT be used. @@ -191,10 +219,10 @@ This PEP specifies changes to the project's source metadata under a Add ``sbom-files`` key ~~~~~~~~~~~~~~~~~~~~~~ -A new ``sbom-files`` key is added to the ``[project]`` table for specifying -paths in the project source tree relative to ``pyproject.toml`` to file(s) -containing SBOMs to be distributed with the package. This key corresponds to the -``Sbom-File`` fields in the Core Metadata. +A new optional ``sbom-files`` key is added to the ``[project]`` table for +specifying paths in the project source tree relative to ``pyproject.toml`` to +file(s) containing SBOMs to be distributed with the package. This key +corresponds to the ``Sbom-File`` fields in the Core Metadata. Its value is an array of strings which MUST contain valid glob patterns, as specified below: @@ -371,18 +399,76 @@ of this standard. The details of this standard are most important to either maintainers of Python packages and developers of SCA tools such as SBOM generation tools and vulnerability scanners. 
-Most Python packages don't contain code from other software components and thus -are already measurable by SCA tools without the need of this standard or -additional SBOM documents. Pure-Python packages are about `~90% `__ -of popular packages on PyPI. +What do Python package maintainers need to know? +------------------------------------------------ + +Python package metadata can already describe the top-level software included in +a package archive, but what if a package archive contains other software +components beyond the top-level software? For example, the Python wheel for +"Pillow" contains a handful of other software libraries bundled inside, like +``libjpeg``, ``libpng``, ``libwebp``, and so on. This scenario is where this PEP +is most useful, for adding metadata about bundled software to a Python package. + +Some build tools may be able to automatically annotate bundled dependencies. +Typically, tools can automatically annotate bundled dependencies when those +dependencies come from a "packaging ecosystem" (such as PyPI, Linux distros, +Crates.io, NPM, etc). + +For packages that cannot be automatically annotated, if the package author +wishes to provide an SBOM, the approach is to generate or author SBOM files +and then include those files using ``pyproject.toml``: + + .. code-block:: toml + + [project] + ... + sbom-files = [ + "sboms/bom.cdx.json" + ] + +For projects manually specifying an SBOM document, the challenge will be +keeping the document up to date. The CPython project has some +`customized tooling `__ +for this task, but it can likely be generalized into a tool reusable by other +projects. + +What do SBOM tool authors need to know? +--------------------------------------- -For projects that do contain other software components, documentation will be -added to the Python Packaging User Guide for how to specify and maintain -SBOM documents for Python packages in source code.
+Developers of SBOM generation tooling will need to know about the existence +of this PEP and that Python packages may begin publishing SBOM documents +within package archives. This information needs to be included as a part of +generating an SBOM document for a particular Python package or Python +environment. A follow-up informational PEP will be authored to describe how to transform Python packaging metadata, including the mechanism described in this PEP, -into an SBOM document describing Python packages. +into an SBOM document describing Python packages. Once the informational PEP is +complete, tracking issues will be opened specifically linking to the +informational PEP to spur the adoption of PEP 770 by SBOM tools. + +A `benchmark is being created `__ +to compare the outputs of different SBOM tools when run with various Python +packaging inputs (package archive, installed package, environment, container +image) and to track the progress of different SBOM generation +tools. This benchmark will indicate where tools have gaps in support +of this PEP and Python packages. + +What do users of SBOM documents need to know? +--------------------------------------------- + +Many users of this PEP won't know of its existence; instead, their software +composition analysis tools, SBOM tools, or vulnerability scanners will simply +begin giving more comprehensive information after an upgrade. For users that are +interested in the sources of this new information, the "tool" field of SBOM +metadata already provides linkages to the projects generating their SBOMs. + +For users who need SBOM documents describing their open source dependencies, the +first step should always be "create them yourself". Using the benchmarks above, +a list of tools that are known to be accurate for Python packages can be +documented and recommended to users.
For projects that require +additional manual SBOM annotation, tips for contributing this data and tools for +maintaining the data can be recommended. Reference Implementation ======================== @@ -433,7 +519,8 @@ Open Issues Conditional project source SBOM files ------------------------------------- -How can a project specify an SBOM file that is conditional? Under what circumstances would an SBOM document be conditional? +How can a project specify an SBOM file that is conditional? Under what +circumstances would an SBOM document be conditional? References ==========