feat: Blog posts for OSSF Scorecard launch (#363)
Signed-off-by: John McBride <[email protected]>
Showing 6 changed files with 363 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,128 @@ | ||
--- | ||
title: "Introducing OpenSSF Scorecard for OpenSauced" | ||
tags: ["open source security foundation", "openssf", "openssf scorecard", "open source", "open source compliance", "open source security"] | ||
authors: jpmcb | ||
slug: introducing-ossf-scorecard | ||
description: "Learn how OpenSauced integrates OpenSSF Scorecard to enhance open source security and compliance." | ||
--- | ||
|
||
In September of 2022, the European Commission proposed the [“Cyber Resilience Act”](https://digital-strategy.ec.europa.eu/en/policies/cyber-resilience-act), | ||
commonly called the CRA: a new piece of legislation that requires anyone providing | ||
digital products in the EU to meet certain security and compliance requirements. | ||
|
||
<!-- truncate --> | ||
|
||
But there’s a catch: before the CRA, companies providing or distributing software | ||
would often take on much of the risk of ensuring that safe and reliable software | ||
was shipped to end users. Now, software maintainers further down the supply | ||
chain will have to carry more of that weight. Some open source maintainers may | ||
not only need to meet specific requirements, but also provide an up-to-date | ||
security profile of their project. | ||
|
||
[As the Linux Foundation puts it](https://www.linuxfoundation.org/blog/understanding-the-cyber-resilience-act): | ||
|
||
> The Act shifts much of the security burden onto those who develop software, | ||
as opposed to the users of software. This can be justified by two assumptions: | ||
first, software developers know best how to mitigate vulnerabilities and distribute | ||
patches; and second, it’s easier to mitigate vulnerabilities at the source than | ||
requiring users to do so. | ||
|
||
There’s a lot to unpack in the CRA, and it’s still not clear how individual open | ||
source projects, maintainers, foundations, or companies will be directly impacted. | ||
But it’s clear that the broader open source ecosystem needs easier ways to understand | ||
the security risk of projects deep within dependency chains. With all that in mind, | ||
we are very excited to introduce the OpenSSF Scorecard ratings within the OpenSauced | ||
platform. | ||
|
||
## What is the OpenSSF Scorecard? | ||
|
||
The OpenSSF is [the Open Source Security Foundation](https://openssf.org/): a multidisciplinary group of | ||
software developers, industry leaders, security professionals, researchers, and | ||
government liaisons. The OpenSSF aims to enable the broader open source ecosystem | ||
“to secure open source software for the greater public good.” They interface with | ||
critical personnel across the software industry to fight for a safer technological | ||
future. | ||
|
||
[The OpenSSF Scorecard project](https://github.com/ossf/scorecard) is an effort | ||
to unify the best practices open source maintainers and consumers should use to | ||
judge if their code, practices, and dependencies are safe. Ultimately, the “scorecard” | ||
command line interface gives anyone the capability to inspect repositories, run “checks” | ||
against those repos, and derive an overall score for the risk profile of that project. | ||
It’s a very powerful software tool that gives you a general picture of where a piece | ||
of software is considered risky. It can also be a great starting point for any open | ||
source maintainer to develop better practices and find out where they may need to | ||
make improvements. By providing a standardized approach to assessing open source | ||
security and compliance, the Scorecard helps organizations more easily identify | ||
supply chain risks and regulatory requirements. | ||
|
||
## OpenSauced OpenSSF Scorecards | ||
|
||
Using the scorecard command line interface as a cornerstone, we’ve built infrastructure | ||
and tooling to enable OpenSauced to capture scores for nearly all repositories on | ||
GitHub. Anything over a 6 or a 7 is generally considered safe to use with no glaring | ||
issues. Scores of 9 or 10 are doing phenomenally well. And projects with lower scores | ||
should be inspected closely to understand what’s gone wrong. | ||
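|
||
If you want to act on these bands in your own tooling, a minimal sketch could look like the following. The `riskBand` helper and its exact cutoffs (9 and 6) are assumptions picked to mirror the rough guidance above; they are not part of Scorecard or OpenSauced.

```go
package main

import "fmt"

// riskBand maps an aggregate Scorecard score (0 to 10) to a rough label.
// The cutoffs are assumptions chosen to mirror the guidance in this post.
func riskBand(score float64) string {
	switch {
	case score >= 9:
		return "excellent"
	case score >= 6:
		return "generally safe"
	default:
		return "inspect closely"
	}
}

func main() {
	for _, s := range []float64{9.4, 6.8, 3.1} {
		fmt.Printf("%.1f -> %s\n", s, riskBand(s))
	}
}
```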
|
||
Scorecards are enabled across all repositories. With this integration, we aim to | ||
make it easier for software maintainers to understand the security posture of their | ||
project and for software consumers to be assured that their dependencies are safe | ||
to use. | ||
|
||
Starting today, you can see the score for any project within individual [Repository Pages](https://opensauced.pizza/docs/features/repo-pages/). | ||
For example, in [kubernetes/kubernetes](https://app.opensauced.pizza/s/kubernetes/kubernetes), | ||
we can see the project is safe for use: | ||
|
||
![Kubernetes Scorecard](../../static/img/kubernetes-scorecard.png) | ||
|
||
Let’s look at another example: [crossplane/crossplane](https://app.opensauced.pizza/s/crossplane/crossplane). | ||
These maintainers are doing an awesome job of ensuring they are following best | ||
practices for open source security and compliance!! | ||
|
||
![Crossplane Scorecard](../../static/img/crossplane-scorecard.png) | ||
|
||
The checks that the OpenSSF Scorecard runs cover a wide range of common | ||
open source security practices, both “in code” and in the maintenance of the | ||
project: for example, whether code review best practices are followed, whether “dangerous | ||
workflows” are present (like untrusted code being checked out and run during CI/CD runs), | ||
whether the project is actively maintained, whether releases are signed, and many more. | ||
|
||
## The Future of OpenSSF Scorecards at OpenSauced | ||
|
||
We plan to bring the OpenSSF Scorecard to more of the OpenSauced platform, as we | ||
aim to be the definitive place for open source security and compliance for maintainers | ||
and consumers. As part of that, we’ll be adding more detail to the OpenSSF Scorecard, | ||
showing how individual checks are ranked: | ||
|
||
![Future Scorecard](../../static/img/future-scorecard.png) | ||
|
||
We’ll also be bringing OpenSSF Scorecard to our premium offering, [Workspaces](https://opensauced.pizza/docs/features/workspaces/): | ||
|
||
![Bottlerocket Scorecard Workspace](../../static/img/future-scorecard-workspaces.png) | ||
|
||
Within a Workspace, you’ll soon be able to see how the projects you are tracking | ||
stack up against each other on open source security and compliance. You can use | ||
the OpenSSF score together with all the Workspace insights | ||
and metrics, in one single dashboard, to get a good idea of what’s happening within | ||
a set of repositories and what their security posture is. In this example, I’m tracking | ||
all the repositories within the bottlerocket-os org on GitHub, a security-focused, | ||
Linux-based operating system. I can see that each of the repositories has a good | ||
rating, which gives me greater confidence in the maintenance status and security | ||
posture of this ecosystem. This also gives stakeholders and maintainers of Bottlerocket | ||
a bird’s-eye snapshot of the compliance and maintenance status of the | ||
entire org. | ||
|
||
As the CRA and similar regulations push more of the security burden onto developers, | ||
tools like the OpenSSF Scorecard become invaluable. They offer a standardized, accessible | ||
way to assess and improve the security of open source projects, helping maintainers | ||
meet new compliance requirements and giving software consumers confidence in their | ||
choices. | ||
|
||
Looking ahead, we're committed to expanding these capabilities at OpenSauced. By | ||
providing comprehensive security insights, from individual repository scores to | ||
organization-wide overviews in Workspaces, we're working to create a more secure | ||
and transparent open source ecosystem: one where anyone in the open source community | ||
can better understand their software dependencies and feel empowered to make a meaningful | ||
change if needed, and where open source maintainers have helpful tools to better | ||
maintain their projects. | ||
|
||
Stay saucy! |
235 changes: 235 additions & 0 deletions
blog/2024/2024-08-08-ossf-scorecard-technical-deep-dive.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,235 @@ | ||
--- | ||
title: "Using Kubernetes jobs to scale OpenSSF Scorecard" | ||
tags: ["open source security foundation", "openssf", "openssf scorecard", "open source", "open source compliance", "open source security", "kubernetes", "kubernetes jobs"] | ||
authors: jpmcb | ||
slug: ossf-scorecard-technical-deep-dive | ||
description: "Learn how OpenSauced uses Kubernetes to scale the OpenSSF Scorecard." | ||
unlisted: true | ||
--- | ||
|
||
We recently released integrations with the [OpenSSF Scorecard on the OpenSauced platform](https://opensauced.pizza/blog/introducing-ossf-scorecard). | ||
The OpenSSF Scorecard is a powerful Go command line interface that anyone can use | ||
to begin understanding the security of their projects and dependencies. It runs | ||
several checks for dangerous workflows, CI/CD best practices, whether the project is | ||
still maintained, and much more. This enables software builders and consumers to | ||
understand their security posture, deduce if a project is safe to use, and see where | ||
improvements to security practices need to be made. | ||
|
||
<!-- truncate --> | ||
|
||
But one of our goals with integrating the OpenSSF Scorecard into the OpenSauced | ||
platform was to make this available to the broader open source ecosystem. | ||
If it’s a repository on GitHub, we wanted to be able to display a score for it. | ||
This meant scaling the Scorecard CLI to target nearly any repository on GitHub. | ||
Much easier said than done! | ||
|
||
In this blog post, let’s dive into how we did that using Kubernetes and what technical | ||
decisions we made while implementing this integration. | ||
|
||
## Technical decisions | ||
|
||
We knew that we would need to build a cron-type microservice that would frequently | ||
update scores across a myriad of repositories: the true question was how we would | ||
do that. It wouldn't make sense to run the scorecard CLI ad hoc: the platform could | ||
too easily get overwhelmed, and we wanted to be able to do deeper analysis on scores | ||
across the open source ecosystem, even for repositories whose OpenSauced repo page hasn’t been | ||
visited recently. Initially, we looked at using the Scorecard Go library as a direct | ||
dependency and running scorecard checks within a single, monolithic microservice. | ||
We also considered using serverless jobs to run one-off scorecard containers that | ||
would give back the results for individual repositories. | ||
|
||
The approach we ended up landing on, which marries simplicity, flexibility, and | ||
power, is to use Kubernetes Jobs at scale, all managed by a “scheduler” Kubernetes | ||
controller microservice. Instead of building a deeper code integration with scorecard, | ||
running one-off Kubernetes Jobs gives us the same benefits as a serverless approach, | ||
but with reduced cost since we’re managing it all directly on our Kubernetes cluster. | ||
Jobs also offer a lot of flexibility in how they run: they can have long, extended | ||
timeouts, they can use disk, and like any other Kubernetes paradigm, they can have | ||
multiple pods doing different tasks. | ||
|
||
Let’s break down the individual components of this system and see how they work | ||
in depth: | ||
|
||
## Building the Kubernetes controller | ||
|
||
The first and biggest part of this system is the “scorecard-k8s-scheduler”: a Kubernetes | ||
controller-like microservice that kicks off new jobs on-cluster. While this microservice | ||
follows many of the principles, patterns, and methods used when building a traditional | ||
Kubernetes controller or operator, it does not watch for or mutate custom resources | ||
on the cluster. Its function is to simply kick off Kubernetes Jobs that run the Scorecard | ||
CLI and gather finished job results. | ||
|
||
Let’s look first at the main control loop in the Go code. This microservice uses | ||
the Kubernetes Client-Go library to interface directly with the cluster the microservice | ||
is running on: this is often referred to as an in-cluster config and client. Within | ||
the code, after bootstrapping an in-cluster config and client, we poll for repositories | ||
in our database that need updating. Once some repos are found, we kick off Kubernetes | ||
jobs on individual worker “threads” that will wait for each job to finish. | ||
|
||
```go | ||
// buffered channel used as a counting semaphore to cap concurrent workers | ||
sem := make(chan bool, numConcurrentJobs) | ||
|
||
// continuous control loop | ||
for { | ||
	// acquire a slot: blocks while numConcurrentJobs goroutines are running | ||
	sem <- true | ||
|
||
	go func() { | ||
		// release the slot for this goroutine when done | ||
		defer func() { | ||
			<-sem | ||
		}() | ||
|
||
		// grab repo needing update, start scorecard Kubernetes Job on-cluster, | ||
		// wait for results, etc. etc. | ||
|
||
		// sleep the configured amount of time to relieve backpressure | ||
		time.Sleep(backoff) | ||
	}() | ||
} | ||
``` | ||
|
||
This “infinite control loop” method, with a buffered channel, is a common way in | ||
Go to do something continuously while using only a configured number of goroutines. | ||
The “numConcurrentJobs” variable sets the capacity of the buffered channel, which | ||
caps how many Go funcs can run at any one time: the loop places a value on the | ||
channel before starting each worker, and the worker takes one off when it finishes, | ||
so the loop blocks whenever the channel is full. Since the buffered channel is | ||
a shared resource that all goroutines use, I often like to think of | ||
this as a semaphore: a resource, much like a mutex, except that a fixed number of | ||
goroutines can hold it at once. In our production environment, we run many of these | ||
workers in the scheduler at once. Since the actual scheduler isn’t | ||
very computationally heavy and will just kick off jobs and wait for results to eventually | ||
surface, we can push the envelope of what this scheduler can manage. We also have | ||
a built-in backoff system that attempts to relieve pressure when needed: this system | ||
will increment the configured “backoff” value if there are errors or if there are | ||
no repos found to calculate the score for. This ensures we’re not continuously | ||
slamming our database with queries and the scorecard scheduler itself can remain | ||
in a “waiting” state, not taking up precious compute resources on the cluster. | ||
|
||
Within the control loop, we do a few things: first, we query our database for repositories | ||
needing their scorecard updated. This is a simple database query that is based on | ||
some timestamp metadata we watch for and have indexes on. Once a configured amount | ||
of time passes since the last score was calculated for a repo, it will bubble up | ||
to be crunched by a Kubernetes Job running the Scorecard CLI. | ||
|
||
## Kicking off Scorecard jobs | ||
|
||
Next, once we have a repo to get the score for, we kick off a Kubernetes Job using | ||
the “gcr.io/openssf/scorecard” image. Bootstrapping this job in Go code using Client-Go | ||
looks very similar to how it would look with YAML, just using the various libraries | ||
and APIs available via “k8s.io” imports and doing it programmatically: | ||
|
||
```go | ||
// defines the Kubernetes Job and its spec | ||
job := &batchv1.Job{ | ||
	// structs and details for the actual Job including metav1.ObjectMeta and batchv1.JobSpec | ||
} | ||
|
||
// create the actual Job on cluster using the in-cluster config and client | ||
return s.clientset.BatchV1().Jobs(ScorecardNamespace).Create(ctx, job, metav1.CreateOptions{}) | ||
``` | ||
|
||
After the job is created, we wait for it to signal it has completed or errored. | ||
Much like with kubectl, Client-Go offers a helpful way to “watch” resources and | ||
observe their state when they change: | ||
|
||
```go | ||
// watch the Job by name on cluster | ||
watcher, err := s.clientset.BatchV1().Jobs(ScorecardNamespace).Watch(ctx, metav1.ListOptions{ | ||
	FieldSelector: "metadata.name=" + jobName, | ||
}) | ||
if err != nil { | ||
	return err // assumes the enclosing func returns an error | ||
} | ||
defer watcher.Stop() | ||
|
||
// continuously pop off the watch results channel for job status | ||
for event := range watcher.ResultChan() { | ||
	// wait for job success, error, or other states | ||
} | ||
``` | ||
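|
||
Inside that loop, deciding whether a Job is finished usually comes down to reading its status conditions. The sketch below shows one way to do that using a local `jobCondition` type that mirrors just the two fields of `batchv1.JobCondition` it needs; `jobOutcome` is a hypothetical helper, not the scheduler's code.

```go
package main

import "fmt"

// jobCondition mirrors the two fields of batchv1.JobCondition used here.
type jobCondition struct {
	Type   string // "Complete" or "Failed" mark the terminal states
	Status string // "True", "False", or "Unknown"
}

// jobOutcome reports "succeeded", "failed", or "running" based on the
// first terminal condition whose status is "True".
func jobOutcome(conds []jobCondition) string {
	for _, c := range conds {
		if c.Status != "True" {
			continue
		}
		switch c.Type {
		case "Complete":
			return "succeeded"
		case "Failed":
			return "failed"
		}
	}
	return "running"
}

func main() {
	fmt.Println(jobOutcome(nil))
	fmt.Println(jobOutcome([]jobCondition{{Type: "Complete", Status: "True"}}))
	fmt.Println(jobOutcome([]jobCondition{{Type: "Failed", Status: "True"}}))
}
```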
|
||
Finally, once we have a successful job completion, we can grab the results from | ||
the Job’s pod logs, which will have the actual JSON results from the scorecard | ||
CLI! Once we have those results, we can upsert the scores back into the database | ||
and mutate any necessary metadata to signal to our other microservices or the | ||
OpenSauced API that there’s a new score! | ||
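|
||
Decoding those logs can be sketched like this. The field names (`repo.name`, `score`, `checks[].name`, `checks[].score`, `checks[].reason`) reflect the CLI's JSON output as we understand it, so treat them as an assumption and verify against the version you run; `parseResult` is a hypothetical helper and the sample payload is hand-written for illustration, not real output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checkResult and scorecardResult cover only the fields this sketch reads.
type checkResult struct {
	Name   string `json:"name"`
	Score  int    `json:"score"`
	Reason string `json:"reason"`
}

type scorecardResult struct {
	Repo struct {
		Name string `json:"name"`
	} `json:"repo"`
	Score  float64       `json:"score"`
	Checks []checkResult `json:"checks"`
}

// parseResult decodes the JSON captured from a finished Job's pod logs.
func parseResult(logs []byte) (scorecardResult, error) {
	var r scorecardResult
	if err := json.Unmarshal(logs, &r); err != nil {
		return scorecardResult{}, fmt.Errorf("decoding scorecard output: %w", err)
	}
	return r, nil
}

// sampleLogs is a hand-written payload for illustration, not real output.
const sampleLogs = `{
  "repo": {"name": "github.com/example/project"},
  "score": 7.5,
  "checks": [
    {"name": "Code-Review", "score": 8, "reason": "illustrative reason"},
    {"name": "Dangerous-Workflow", "score": 10, "reason": "illustrative reason"}
  ]
}`

func main() {
	r, err := parseResult([]byte(sampleLogs))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s scored %.1f across %d checks\n", r.Repo.Name, r.Score, len(r.Checks))
}
```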
|
||
As mentioned before, the scorecard-k8s-scheduler can have any number of concurrent | ||
jobs running at once: in our production setting, a large number of jobs run in | ||
parallel, all managed by this microservice. The intent is to be able to update scores | ||
every 2 weeks across nearly all repositories on GitHub. With this kind of scale, we hope | ||
to be able to provide powerful tooling and insights to any open source maintainer | ||
or consumer! | ||
|
||
## Role-based access control | ||
|
||
The “scheduler” microservice ends up being a small part of this whole system: anyone | ||
familiar with Kubernetes controllers knows that there are additional pieces of Kubernetes | ||
infrastructure that are needed to make the system work. In our case, we needed some | ||
role-based access control (RBAC) to enable our microservice to create Jobs on the cluster. | ||
|
||
First, we need a service account: this is the account that will be used by the | ||
scheduler and have access controls bound to it: | ||
|
||
```yaml | ||
apiVersion: v1 | ||
kind: ServiceAccount | ||
metadata: | ||
  name: scorecard-sa | ||
  namespace: scorecard-ns | ||
``` | ||
We place this service account in our “scorecard-ns” namespace where all this runs. | ||
Next, we need to have a role and role binding for the service account. This includes | ||
the actual access controls (including being able to create Jobs, view pod logs, etc.):
```yaml | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: Role | ||
metadata: | ||
  name: scorecard-scheduler-role | ||
  namespace: scorecard-ns | ||
rules: | ||
  - apiGroups: ["batch"] | ||
    resources: ["jobs"] | ||
    verbs: ["create", "delete", "get", "list", "watch", "patch", "update"] | ||
  - apiGroups: [""] | ||
    resources: ["pods", "pods/log"] | ||
    verbs: ["get", "list", "watch"] | ||
|
||
--- | ||
|
||
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: RoleBinding | ||
metadata: | ||
  name: scorecard-scheduler-role-binding | ||
  namespace: scorecard-ns | ||
subjects: | ||
  - kind: ServiceAccount | ||
    name: scorecard-sa | ||
    namespace: scorecard-ns | ||
roleRef: | ||
  kind: Role | ||
  name: scorecard-scheduler-role | ||
  apiGroup: rbac.authorization.k8s.io | ||
``` | ||
You might be asking yourself “Why do I need to give this service account access
to get pods and pod logs? Isn’t that an overextension of the access controls?”
Remember! Jobs have pods, and in order to get the pod logs that have the actual
results of the scorecard CLI, we must be able to list the pods from a job and then
read their logs!

The second part of this, the “RoleBinding”, is where we actually attach the Role
to the service account. This service account can then be used when kicking off
new jobs on the cluster.

All in all, this architecture allows us to use the flexibility and power of serverless-like setups
while still taking advantage of the cost savings and existing infrastructure we have
with Kubernetes. Using existing paradigms and components can be a great way to unlock
existing capabilities you already have within your platform of choice!

Huge shout out to [Alex Ellis](https://github.com/alexellis) and his excellent [run-job controller](https://github.com/alexellis/run-job):
this was a huge inspiration and reference for correctly using Client-Go with Jobs!

Stay saucy! |