GitHub Advanced Security - Code Scanning, Secret Scanning & Dependabot Bulk Enablement Tooling

Purpose

The purpose of this tool is to help enable GitHub Advanced Security (GHAS) across multiple repositories in an automated way. There will be times when you need the ability to enable Code Scanning (CodeQL), Secret Scanning, Dependabot Alerts, and/or Dependabot Security Updates across various repositories, and you don't want to click buttons manually or drop a GitHub Workflow for CodeQL into every repository. Doing this is manual and painstaking. The purpose of this utility is to help automate these manual tasks.

Context

The primary motivator for this utility is CodeQL. It is incredibly time-consuming to enable CodeQL across multiple repositories. Additionally, no API allows write access to the .github/workflows/ directory, so teams end up writing their own scripts with varying results. This tool provides a tried and proven way of doing that.

Secret Scanning and Dependabot are also hard to enable if you only want them on specific repositories rather than on everything. This tool allows you to do that easily.

What does this tooling do?

There are two main actions this tool does:

Part One:

Goes and collects repositories that will have Code Scanning (CodeQL)/Secret Scanning/Secret Scanning Push Protection/Dependabot Alerts/Dependabot Security Updates/Actions enabled. There are three main ways these repositories are collected.

  • Collect the repositories where the primary language matches a specific value. For example, if you provide JavaScript, all repositories whose primary language is JavaScript will be collected.
  • Collect the repositories to which a user has administrative access, or a GitHub App has access.

If you select option 1, the script will return all repositories in the language you specify (which you have access to). If you select option 2, the script will return all repositories you are an administrator over. In both cases, the collected repositories are stored within a repos.json file. The third option is to define the repos.json manually. We don't recommend this, but it's possible. If you want to go down this path, first run one of the above options to collect repository information automatically, look at the structure, and build your own file following that format.
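The language-based collection described above can be pictured as a simple filter. The following is only a minimal sketch, not the tool's actual implementation: the `RepoSummary` shape and the `filterByLanguage` helper are hypothetical names introduced for illustration, and the case-insensitive comparison is an assumption.

```typescript
// Hypothetical shape: only the fields this sketch needs.
interface RepoSummary {
  fullName: string; // e.g. "my-org/my-repo"
  primaryLanguage: string | null; // null when no language was detected
}

// Keep repositories whose primary language matches `language`, ignoring case.
// An empty `language` keeps everything (assumed to mirror an empty filter).
function filterByLanguage(repos: RepoSummary[], language: string): RepoSummary[] {
  const wanted = language.trim().toLowerCase();
  if (wanted === "") return repos;
  return repos.filter((r) => (r.primaryLanguage ?? "").toLowerCase() === wanted);
}

const sample: RepoSummary[] = [
  { fullName: "my-org/web", primaryLanguage: "JavaScript" },
  { fullName: "my-org/api", primaryLanguage: "Go" },
  { fullName: "my-org/docs", primaryLanguage: null },
];

// Only "my-org/web" has JavaScript as its primary language.
console.log(filterByLanguage(sample, "javascript").map((r) => r.fullName));
```

The real script also checks access rights and existing CodeQL results, which this sketch leaves out.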

Part Two:

Loops over the repositories found within the repos.json file and enables Code Scanning (CodeQL)/Secret Scanning/Secret Scanning Push Protection/Dependabot Alerts/Dependabot Security Updates/Actions.

  • If you pick Code Scanning:
    • Loops over the repositories found within the repos.json file. A pull request gets created on that repository with the codeql-analysis-${language}.yml found in the bin/workflows directory.
    • The ${language} will be replaced at runtime with the primary language of the repository.
    • For convenience, all pull requests made will be stored within the prs.txt file, where you can see and manually review the pull requests after the script has run.
  • If you pick Secret Scanning:
    • Loops over the repositories found within the repos.json file. Secret Scanning is then enabled on these repositories.
  • If you pick Push Protection:
    • Loops over the repositories found within the repos.json file. Secret Scanning Push Protection is then enabled on these repositories.
  • If you pick Dependabot Alerts:
    • Loops over the repositories found within the repos.json file. Dependabot Alerts is then enabled on these repositories.
  • If you pick Dependabot Security Updates:
    • Loops over the repositories found within the repos.json file. Dependabot Security Updates is then enabled on these repositories.
  • If you pick Actions:
    • Loops over the repositories found within the repos.json file. Actions is then enabled on these repositories.
    • This is useful if you want to ensure that the Code Scanning workflow can run and Actions isn't disabled.
  • If you pick Create Issue:
    • Loops over the repositories found within the repos.json file. An issue will be created with the text found in the ./src/utils/text/issueText.ts file.
    • This alerts repository maintainers that a pull request for CodeQL was created, along with other helpful resources.
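To illustrate how the ${language} placeholder in the workflow file name might be resolved, here is a minimal sketch. The template string comes from this README; the `workflowFileFor` helper and the rule of lowercasing the repository's primary language are assumptions, not the tool's confirmed behaviour.

```typescript
// Template taken from the README. In a plain (non-template) string the
// "${language}" text is literal, so it can be replaced at runtime.
const WORKFLOW_TEMPLATE = "codeql-analysis-${language}.yml";

// Hypothetical helper: substitute the repository's primary language.
// Lowercasing is an assumption made for this sketch.
function workflowFileFor(primaryLanguage: string): string {
  // split/join replaces the placeholder without any regex escaping.
  return WORKFLOW_TEMPLATE.split("${language}").join(primaryLanguage.toLowerCase());
}

console.log(workflowFileFor("JavaScript")); // codeql-analysis-javascript.yml
```

Languages such as c# or c++ may need a different mapping to a file name; check the files in the bin/workflows directory rather than relying on this sketch.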

Prerequisites

  • Node v20 or higher installed.
  • Yarn*
  • TypeScript
  • Git installed on the (user's) machine running this tool.
  • A Personal Access Token (PAT) that has at least admin access over the repositories you want to enable Code Scanning on, or GitHub App credentials that have access to those repositories.
  • Some basic software development skills, e.g., being able to navigate a terminal or command prompt.

* You can use npm, but for the sake of this README.md we standardise the commands on yarn. They are easily replaceable with npm commands.

Set up Instructions

  1. Clone this repository onto your local machine.

    git clone https://github.com/NickLiffen/ghas-enablement.git
  2. Change the directory to the repository you have just cloned.

    cd ghas-enablement
  3. Generate your chosen authentication strategy. You can use either a GitHub App or a Personal Access Token (PAT). The GitHub App needs read and write permissions for administration, Code scanning alerts, contents, issues, pull requests, and workflows. The GitHub PAT needs access to repo, workflow and read:org only. If you are running yarn run getOrgs, you will also need the read:enterprise scope.

  4. Copy the .env.sample to .env. On a Mac, this can be done via the following terminal command:

    cp .env.sample .env
  5. Update the .env with the required values. Please pick one of the authentication methods for interacting with GitHub. You can either fill in the GITHUB_API_TOKEN with a PAT that has access to the Org. OR, fill in all the values required for a GitHub App. Note: It is recommended to pick the GitHub App choice if running on thousands of repositories, as this gives you more API requests versus a PAT.

    • If using a GitHub App, either paste in the value as-is in the APP_PRIVATE_KEY in the field surrounded by double quotes (the key will take up multiple lines), or convert the private key to a single line surrounded in double quotes by replacing the new line character with \n (In VS Code on Mac, you can use ⌃ + Enter to find/replace the new line character)
  6. Update the GITHUB_ORG value found within the .env. Remove the XXXX and replace that with the name of the GitHub Organisation you would like to use as part of this script. NOTE: If you are running this across multiple organisations within an enterprise, you can leave the GITHUB_ORG variable unset and instead set the GITHUB_ENTERPRISE variable to the name of the enterprise. You can then run yarn run getOrgs, which will collect all the organisations dynamically, so you don't have to hardcode one. However, for most use cases, simply hardcoding the specific org within the GITHUB_ORG variable where you would like this script to run will work.

  7. Update the LANGUAGE_TO_CHECK value found within the .env. Remove the XXXX and replace that with the language you would like to use as a filter when collecting repositories. Note: Please make sure these are lowercase values, such as: javascript, typescript, python, go, ruby, c#, c++, java, or kotlin

  8. Decide what you want to enable. Update the ENABLE_ON value to choose what you want to enable on the repositories found within the repos.json. This can be one or multiple values. If you are enabling just code scanning (CodeQL), set ENABLE_ON=codescanning. If you are enabling everything, set ENABLE_ON=codescanning,secretscanning,pushprotection,dependabot,dependabotupdates,actions. You can pick any combination of these values. The format is a comma-separated list.

  9. OPTIONAL: Update the CREATE_ISSUE value to true/false depending on if you would like to create an issue explaining the purpose of the PR. We recommend this, as it will help explain why the PR was created; and give some context. However, this is optional. The text which is in the issue can be modified and found here: ./src/utils/text/.

  10. OPTIONAL: If you are a GHES customer, then you will need to set the GHES env to true and then set GHES_SERVER_BASE_URL to the URL of your GHES instance, e.g., https://octodemo.com.

  11. OPTIONAL: If you are planning to enable features at the Organization level using yarn run enableOrg, you can additionally append automatic (ENABLE_ON=...,automatic) to also turn on Automatically enable for new repositories for each product.

  12. If you are enabling Code Scanning (CodeQL), check the codeql-analysis.yml file. This is a sample file; please configure this file to suit your repositories' needs.

  13. Run yarn install or npm install, which will install the necessary dependencies.

  14. Run yarn run build or npm run build, which will create the JavaScript bundle from TypeScript.
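As an illustration of the comma-separated ENABLE_ON format from step 8, the sketch below parses and validates such a value. It is not the tool's own parser: `parseEnableOn` and `isFeature` are hypothetical helpers, and the lenient trimming and lowercasing are assumptions. The accepted names are the ones listed in this README, including automatic, which only applies to yarn run enableOrg.

```typescript
// Values documented in this README for ENABLE_ON.
const KNOWN_FEATURES = [
  "codescanning",
  "secretscanning",
  "pushprotection",
  "dependabot",
  "dependabotupdates",
  "actions",
  "automatic", // only meaningful with `yarn run enableOrg`
] as const;

type Feature = (typeof KNOWN_FEATURES)[number];

// Type guard: narrows an arbitrary string to one of the known feature names.
function isFeature(value: string): value is Feature {
  return (KNOWN_FEATURES as readonly string[]).indexOf(value) !== -1;
}

// Parse "a,b,c" into a de-duplicated list, rejecting unknown names early
// so that a typo does not silently skip a feature.
function parseEnableOn(raw: string): Feature[] {
  const selected: Feature[] = [];
  for (const part of raw.split(",")) {
    const name = part.trim().toLowerCase();
    if (name === "") continue; // tolerate trailing commas
    if (!isFeature(name)) throw new Error(`Unknown ENABLE_ON value: "${name}"`);
    if (selected.indexOf(name) === -1) selected.push(name);
  }
  return selected;
}

console.log(parseEnableOn("codescanning, secretscanning,pushprotection"));
```

Failing fast on an unknown name is a design choice of this sketch; the tool itself may handle unrecognised values differently.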

How to use?

There are two simple steps to run:

Step One

The first step is collecting the repositories you would like to run this script on. You have three options, as mentioned above. Option 1 is automated and finds all the repositories within an organisation based on the language you specify. Option 2 is automated and finds all the repositories within an organisation that you have admin access to. Option 3 is a manual entry of the repositories you would like to run this script on. See more information below.

OPTION 1 (Preferred)

# In the `.env` set the `LANGUAGE_TO_CHECK=` to the language. E.G.: `javascript`, `typescript`, `python`, `go`, `ruby`, `c#`, `c++`, `java`, or `kotlin`
yarn run getRepos # or npm run getRepos

When using GitHub Actions, we commonly find (especially for non-build languages such as JavaScript) that the codeql-analysis.yml file is repeatable and consistent across multiple repositories of the same language. About 80% of the time, teams can reuse the same workflow files for the same language. For Java and C++, that number drops to about 60% of the time. We recommend enabling Code Scanning in bulk via language because the codeql-analysis.yml file you propose within the pull request then has the highest chance of being accurate. Even if the file needs changing, the team reviewing the pull request would likely only need to make small changes. We recommend you run this command first to get a list of repositories to enable Code Scanning on. After running the command, you are welcome to modify the ./bin/repos.json file. Just make sure it's a valid JSON file before saving.

This script only returns repositories where CodeQL results have not already been uploaded to code scanning. If any CodeQL results have been uploaded to a repository's code scanning feature, that repository will not be included in this list. The motivation behind this is to avoid raising pull requests on repositories where CodeQL has already been enabled.

OPTION 2

# In the `.env` leave the `LANGUAGE_TO_CHECK=` empty to pull in all repos
yarn run getRepos # or npm run getRepos

Similar to option one, another automated approach is to enable by user access (i.e., enable for all repositories the user/PAT has administrative access to). This approach will be a little less accurate, as the codeql-analysis.yml file will almost certainly need changing between a Python project and a Java project (if you are enabling CodeQL). But the file you propose is going to be a good start. After running the command, you are welcome to modify the ./bin/repos.json file. Just make sure it's a valid JSON file before saving.

This script only returns repositories where CodeQL results have not already been uploaded to code scanning. If any CodeQL results have been uploaded to a repository's code scanning feature, that repository will not be included in this list. The motivation behind this is to avoid raising pull requests on repositories where CodeQL has already been enabled.

OPTION 3

Create a file called repos.json within the ./bin/ directory. This file needs to have an array of organization objects, each with its own array of repository objects. The structure of the objects should look like this:

[
  {
    "login": "string <org>",
    "repos":
    [
      {
        "createIssue": "boolean",
        "enableCodeScanning": "boolean",
        "enableDependabot": "boolean",
        "enableDependabotUpdates": "boolean",
        "enablePushProtection": "boolean",
        "enableSecretScanning": "boolean",
        "enableActions": "boolean",
        "repo": "string <org/repo>"
      }
    ]
  }
]

As you can see, each repository object takes a number of boolean keys (createIssue, enableCodeScanning, enableDependabot, enableDependabotUpdates, enablePushProtection, enableSecretScanning, and enableActions) along with a single string key, repo. Set repo to the name of the repository where you would like to run this script. Set enableDependabot to true if you would also like to enable Dependabot Alerts on that repo; set it to false if you do not. The same goes for enableDependabotUpdates for Dependabot Security Updates, enableSecretScanning for Secret Scanning, enablePushProtection for Secret Scanning Push Protection, enableCodeScanning for Code Scanning (CodeQL), and enableActions for Actions. Finally, set createIssue to true if you would like to create an issue on the repository with the text found in the ./src/utils/text/issueText.ts file to supplement the PR.
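If you do write repos.json by hand, a quick structural check can catch mistakes before you run the tool. The sketch below mirrors the keys described above as TypeScript types and adds a small validator. `RepoEntry`, `OrgEntry` and `validateReposJson` are hypothetical names for illustration only, and treating every key as required is an assumption.

```typescript
// Types mirroring the repos.json structure described in this README.
interface RepoEntry {
  repo: string; // "<org>/<repo>"
  createIssue: boolean;
  enableCodeScanning: boolean;
  enableDependabot: boolean;
  enableDependabotUpdates: boolean;
  enablePushProtection: boolean;
  enableSecretScanning: boolean;
  enableActions: boolean;
}

interface OrgEntry {
  login: string;
  repos: RepoEntry[];
}

const BOOLEAN_KEYS = [
  "createIssue",
  "enableCodeScanning",
  "enableDependabot",
  "enableDependabotUpdates",
  "enablePushProtection",
  "enableSecretScanning",
  "enableActions",
] as const;

// Returns human-readable problems; an empty list means the data looks valid.
function validateReposJson(data: unknown): string[] {
  if (!Array.isArray(data)) return ["top level must be an array of organizations"];
  const problems: string[] = [];
  data.forEach((org: unknown, i: number) => {
    const o = org as Partial<OrgEntry> | null;
    if (o === null || typeof o !== "object" || typeof o.login !== "string") {
      problems.push(`[${i}].login must be a string`);
      return;
    }
    if (!Array.isArray(o.repos)) {
      problems.push(`[${i}].repos must be an array`);
      return;
    }
    o.repos.forEach((repo: unknown, j: number) => {
      const r = repo as Record<string, unknown> | null;
      if (r === null || typeof r !== "object") {
        problems.push(`[${i}].repos[${j}] must be an object`);
        return;
      }
      if (typeof r["repo"] !== "string") problems.push(`[${i}].repos[${j}].repo must be a string`);
      for (const key of BOOLEAN_KEYS) {
        if (typeof r[key] !== "boolean") problems.push(`[${i}].repos[${j}].${key} must be a boolean`);
      }
    });
  });
  return problems;
}

const candidate: unknown = [
  {
    login: "my-org",
    repos: [
      {
        repo: "my-org/web",
        createIssue: false,
        enableCodeScanning: true,
        enableDependabot: true,
        enableDependabotUpdates: false,
        enablePushProtection: true,
        enableSecretScanning: true,
        enableActions: true,
      },
    ],
  },
];

console.log(validateReposJson(candidate)); // an empty list: no problems found
```

In practice you would pass the result of JSON.parse on the contents of ./bin/repos.json instead of the inline `candidate` value.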

NOTE: The account that generated the PAT needs to have write access or higher over any repository that you include within the repos key.

Step Two

Run the script which enables Code Scanning (and/or Dependabot Alerts/Dependabot Security Updates/Secret Scanning) on your repository by running:

yarn run start # or npm run start

This will run a script, and you should see output text appearing on your screen.

After the script has run, please head to your ~/Desktop directory and delete the tempGitLocations directory that has been automatically created.
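If you prefer to script this cleanup, the following minimal sketch removes the directory with Node's built-in fs module. `removeTempGitLocations` is a hypothetical helper, not part of the tool, and it assumes the default ~/Desktop location mentioned above; pass a different base directory if you configured one.

```typescript
import { existsSync, rmSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Removes <baseDir>/tempGitLocations if it exists.
// Returns true when a directory was deleted, false when there was nothing to do.
function removeTempGitLocations(baseDir: string = join(homedir(), "Desktop")): boolean {
  const target = join(baseDir, "tempGitLocations");
  if (!existsSync(target)) return false;
  // recursive: delete the directory and everything inside it.
  rmSync(target, { recursive: true, force: true });
  return true;
}
```

Calling removeTempGitLocations() with no argument targets the Desktop directory of the current user. The deletion is permanent, so double-check the path before running it.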

Running this with the CLI

The cli.sh is a Bash script that wraps the functionality of this tool. It is intended to augment the tool and make it easier to use in order to achieve a controlled GHAS rollout. The script is located in the root directory of this repository.

The CLI helps fulfil the following use cases:

Use Case 1: Enable GHAS on all repositories in all your organizations

  1. Create a GitHub Personal Access Token (PAT).
  2. Start the CLI by running ./cli.sh in the root directory of this repository.
  3. Select the 5. Configure option to create your .env
  • Fill in the PAT
  • Fill in the GHAS features you would like to enable
  • Fill in the programming languages you want to filter on (optional)
  • If you are using a GitHub Enterprise Server, fill in the URL of your instance (leave empty if you are using GitHub.com)
  • Select your temporary directory (or leave it to the tool to create one)
  4. Select the 1. Get Organizations in your Enterprise option to get a list of all your organizations.
  5. Select the 2. Enable features for Organization - For all repos at once option to enable selected GHAS features.
  6. When asked to select an organization, type all to select all organizations.
  7. Select the 4. Print progress option to check the progress and confirm that all organizations have been processed.

Use Case 2: Enable GHAS on all repositories in a controlled fashion: organization by organization

  1. Create a GitHub Personal Access Token (PAT).
  2. Start the CLI by running ./cli.sh in the root directory of this repository.
  3. Select the 5. Configure option to create your .env
  • Fill in the PAT
  • Fill in the GHAS features you would like to enable
  • Fill in the programming languages you want to filter on (optional)
  • If you are using a GitHub Enterprise Server, fill in the URL of your instance (leave empty if you are using GitHub.com)
  • Select your temporary directory (or leave it to the tool to create one)
  4. Select the 1. Get Organizations in your Enterprise option to get a list of all your organizations.
  5. Select the 2. Enable features for Organization - For all repos at once option to enable selected GHAS features.
  6. When asked to select an organization, type next to see the next 10 organizations that GHAS hasn't been enabled on.
  7. Type in the name of the organization you want to enable selected GHAS features on.
  8. Select the 4. Print progress option to check the progress and confirm that all organizations have been processed.
  9. Repeat steps 5-7 until you have processed all organizations.

Notes:

  • The CLI will automatically create a .env file in the root directory of this repository. This file contains the configuration for the tool. You can edit this file to change the configuration of the tool manually. You don't need to run the 5. Configure option to change the configuration.
  • You can choose to use 3. Enable features for Organization - Per repo instead of 2. Enable features for Organization - For all repos at once. This option will make more API calls, as it enables the features per repository. The result will be the same if you are a GHEC user. However, if you are a GHES user, you will not be able to enable dependabotupdates with 3. Enable features for Organization - Per repo.
  • The 2. Enable features for Organization - For all repos at once works for GHES 3.7 and above.
  • Enabling codescanning with the 2. Enable features for Organization - For all repos at once option will use the [Code Scanning Default Setup](https://github.blog/2023-01-09-default-setup-a-new-way-to-enable-github-code-scanning/). This is not yet available on GHES.

Running this within a Codespace?

There are some key considerations that you will need to put into place if you are running this script within a GitHub Codespace:

  1. You will need to add the following snippet to the .devcontainer/devcontainer.json:
  "codespaces": {
    "repositories": [
      {
        "name": "<ORG_NAME>/*",
        "permissions": "write-all"
      }
    ]
  }

You need this within your .devcontainer/devcontainer.json file because the GITHUB_TOKEN tied to the Codespace must be able to access other repositories within your organisation which this script may interact with. You will need to create a new Codespace after you have added the above and pushed it to your repository.

You do not need to do the above if you are not running it from a Codespace.

Running as a (scheduled) GitHub workflow

Since this tool uses a PAT or GitHub App authentication wherever authentication is required, it can be run unattended. The example below shows how you could run the tool in a scheduled GitHub workflow. Instead of using the .env file, you can configure all the variables from the .env.sample directly as environment variables. This allows you to (easily) make use of GitHub Actions secrets for the PAT or GitHub App credentials.

on:
  schedule:
    - cron: "5 16 * * 1"

env:
  APP_ID: ${{ secrets.GHAS_ENABLEMENT_APP_ID }}
  APP_CLIENT_ID: ${{ secrets.GHAS_ENABLEMENT_APP_CLIENT_ID }}
  APP_CLIENT_SECRET: ${{ secrets.GHAS_ENABLEMENT_APP_CLIENT_SECRET }}
  APP_PRIVATE_KEY: ${{ secrets.GHAS_ENABLEMENT_APP_PRIVATE_KEY }}
  ENABLE_ON: "codescanning,secretscanning,pushprotection,dependabot,dependabotupdates,actions"
  DEBUG: "ghas:*"
  CREATE_ISSUE: "false"
  GHES: "false"
  # Organization specific variables
  APP_INSTALLATION_ID: "12345678"
  GITHUB_ORG: "my-target-org"

jobs:
  enable-security-javascript:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          repository: NickLiffen/ghas-enablement
      - name: Get dependencies and configure
        run: |
          yarn
          git config --global user.name "ghas-enablement"
          git config --global user.email "[email protected]"
      - name: Enable security on organization (javascript)
        run: |
          npm run getRepos
          npm run start
        env:
          LANGUAGE_TO_CHECK: "javascript"
          TEMP_DIR: ${{ github.workspace }}

You can duplicate the last step for the other languages commonly used within your enterprise/organisation. If you didn't configure the tool as a GitHub App, you can remove all the APP_* variables and set GITHUB_API_TOKEN instead. The example above relies on the sample CodeQL workflow file for JavaScript included in this repository. Alternatively, you could add this workflow to a repository containing your customized CodeQL workflow files and use those to overwrite the samples.

Found an Issue?

Create an issue within the repository and mention @nickliffen. Key things to include in your issue:

  • Whether you are running on Windows, Linux, Codespaces or Mac.
  • What version of NodeJS you are running.
  • Any logs that appeared when you ran into the issue.

Want to Contribute?

Great! Open an issue, describe what feature you want to create, and make sure to mention @nickliffen.