Add enterprise-wide code scanning alerts for Enterprise Server and GHAE (#3)

* start work on ghes/ghae support

* add csv files to gitignore

* add enterprise report function

* add enterprise-scope code scanning reporting

* update readme

* add dependency review check

* mess with line length in linter

* mess with linter

* still messing with linter
some-natalie authored May 10, 2022
1 parent d15982a commit bc9ffe2
Showing 8 changed files with 217 additions and 42 deletions.
2 changes: 1 addition & 1 deletion .github/linters/.markdownlint.json
@@ -1,4 +1,4 @@
 {
-  "MD013": false,
+  "line-length": false,
   "MD033": { "allowed_elements": ["br"] }
 }
14 changes: 14 additions & 0 deletions .github/workflows/dependency-review.yml
@@ -0,0 +1,14 @@
+name: "Dependency Review"
+on: [pull_request]
+
+permissions:
+  contents: read
+
+jobs:
+  dependency-review:
+    runs-on: ubuntu-latest
+    steps:
+      - name: "Checkout Repository"
+        uses: actions/checkout@v3
+      - name: "Dependency Review"
+        uses: actions/dependency-review-action@v1
3 changes: 2 additions & 1 deletion .github/workflows/linter.yml
@@ -47,12 +47,13 @@ jobs:
# Run Linter against code base #
################################
- name: Lint Code Base
uses: github/super-linter@v4
uses: github/super-linter/slim@v4
env:
VALIDATE_ALL_CODEBASE: false
DEFAULT_BRANCH: main
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
VALIDATE_DOCKERFILE_HADOLINT: true
VALIDATE_GITHUB_ACTIONS: true
VALIDATE_MARKDOWN: true
MARKDOWN_CONFIG_FILE: .markdownlint.json
VALIDATE_PYTHON_BLACK: true
3 changes: 3 additions & 0 deletions .gitignore
@@ -130,3 +130,6 @@ dmypy.json

 # Notes, etc.
 swap.md
+
+# CSV files
+*.csv
18 changes: 9 additions & 9 deletions README.md
@@ -32,7 +32,7 @@ An example of use is below. Note that the custom inputs, such as if you are wan

 ```yaml
 - name: CSV export
-  uses: some-natalie/ghas-to-csv@v0.2.0
+  uses: some-natalie/ghas-to-csv@v0.3.0
   env:
     GITHUB_PAT: ${{ secrets.PAT }} # if you need to set a custom PAT
 - name: Upload CSV
@@ -43,21 +43,17 @@ An example of use is below. Note that the custom inputs, such as if you are wan
     if-no-files-found: error
 ```
-## But it doesn't do THIS THING
-The API docs are [here](https://docs.github.com/en/enterprise-cloud@latest) and pull requests are welcome! :heart:
 ## Reporting
 | | GitHub Enterprise Cloud | GitHub Enterprise Server (3.4) | GitHub AE (M2) | Notes |
 | --- | --- | --- | --- | --- |
 | Secret scanning | :white_check_mark: Repo<br>:white_check_mark: Org<br>:white_check_mark: Enterprise | :white_check_mark: Repo<br>:white_check_mark: Org<br>:white_check_mark: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:x: Enterprise | [API docs](https://docs.github.com/en/enterprise-cloud@latest/rest/reference/secret-scanning) |
-| Code scanning | :white_check_mark: Repo<br>:white_check_mark: Org<br>:x: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:x: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:x: Enterprise | [API docs](https://docs.github.com/en/enterprise-cloud@latest/rest/reference/code-scanning) |
+| Code scanning | :white_check_mark: Repo<br>:white_check_mark: Org<br>:x: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:curly_loop: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:curly_loop: Enterprise | [API docs](https://docs.github.com/en/enterprise-cloud@latest/rest/reference/code-scanning) |
 | Dependabot | :x: | :x: | :x: | Waiting on [this API](https://github.com/github/roadmap/issues/495) to :ship: |
 :information_source: All of this reporting requires either public repositories or a GitHub Advanced Security license.
-:information_source: Any item with a :curly_loop: needs some looping logic, since repositories are supported and not higher-level ownership (like orgs or enterprises). How this looks won't differ much between GHAE or GHES. In both cases, you'll need an enterprise admin PAT to access the `all_organizations.csv` or `all_repositories.csv` report from `stafftools/reports`, then looping over it in the appropriate scope. That will tell you about the existence of everything, but not give you permission to access it. To do that, you'll need to use `ghe-org-admin-promote` in GHES ([link](https://docs.github.com/en/enterprise-server@3.4/admin/configuration/configuring-your-enterprise/command-line-utilities#ghe-org-admin-promote))
+:information_source: Any item with a :curly_loop: needs some looping logic, since repositories are supported and not higher-level ownership (like orgs or enterprises). How this looks won't differ much between GHAE or GHES. In both cases, you'll need an enterprise admin PAT to access the `all_organizations.csv` or `all_repositories.csv` report from `stafftools/reports`, then looping over it in the appropriate scope. That will tell you about the existence of everything, but not give you permission to access it. To do that, you'll need to use `ghe-org-admin-promote` in GHES ([link](https://docs.github.com/en/enterprise-server@latest/admin/configuration/configuring-your-enterprise/command-line-utilities#ghe-org-admin-promote)) to own all organizations within the server.

 ## Using this with Flat Data

@@ -79,7 +75,7 @@ jobs:
       - name: Check out repo
         uses: actions/checkout@v3
       - name: CSV export
-        uses: some-natalie/ghas-to-csv@v0.2.0
+        uses: some-natalie/ghas-to-csv@v0.3.0
         env:
           GITHUB_PAT: ${{ secrets.PAT }} # needed if not running against the current repository
           SCOPE_NAME: "OWNER-NAME/REPO-NAME" # repository name, needed only if not running against the current repository
@@ -121,6 +117,10 @@ jobs:
 nginx-pid/
 ```

-## Notes
+## But it doesn't do THIS THING
+
+The API docs are [here](https://docs.github.com/en/enterprise-cloud@latest) and pull requests are welcome! :heart:
+
+## Other notes

 [GitHub Copilot](https://copilot.github.com/) wrote most of the Python code in this project. I mostly just structured the files/functions, wrote some docstrings, accounted for the differences in API versions across the products, and edited what it gave me. :heart:
14 changes: 13 additions & 1 deletion main.py
@@ -17,7 +17,7 @@
 """

 # Import modules
-from src import code_scanning, secret_scanning
+from src import code_scanning, enterprise, secret_scanning
 import os

 # Read in config values
@@ -26,6 +26,11 @@
 else:
     api_endpoint = os.environ.get("GITHUB_API_ENDPOINT")

+if os.environ.get("GITHUB_SERVER_URL") is None:
+    url = "https://github.com"
+else:
+    url = os.environ.get("GITHUB_SERVER_URL")
+
 if os.environ.get("GITHUB_PAT") is None:
     github_pat = os.environ.get("GITHUB_TOKEN")
 else:
@@ -49,6 +54,13 @@
         api_endpoint, github_pat, scope_name
     )
     secret_scanning.write_enterprise_secrets_list(secrets_list)
+    # code scanning
+    if enterprise.get_enterprise_version(api_endpoint) != "GHEC":
+        repo_list = enterprise.get_repo_report(url, github_pat)
+        cs_list = code_scanning.list_enterprise_code_scanning_alerts(
+            api_endpoint, github_pat, repo_list
+        )
+        code_scanning.write_enterprise_cs_list(cs_list)
 elif report_scope == "organization":
     # code scanning
     cs_list = code_scanning.list_org_code_scanning_alerts(
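The configuration block in main.py repeats one pattern: read an environment variable and fall back to a default when it is unset. The same fallback can be sketched more compactly. `read_config` is a hypothetical helper, not part of this commit, and the `https://api.github.com` default is an assumption inferred from the comparison in `src/enterprise.py`, since that branch of main.py is not visible in the diff.

```python
import os


def read_config(env=None):
    """Hypothetical helper (not in the commit): resolve settings with defaults."""
    env = os.environ if env is None else env
    return {
        # Assumed default; the diff only shows the non-default branch.
        "api_endpoint": env.get("GITHUB_API_ENDPOINT", "https://api.github.com"),
        # Same default as the block this commit adds to main.py.
        "url": env.get("GITHUB_SERVER_URL", "https://github.com"),
        # Prefer a custom PAT, else the workflow token. Unlike the `is None`
        # check in main.py, `or` also falls back on an empty string.
        "github_pat": env.get("GITHUB_PAT") or env.get("GITHUB_TOKEN"),
    }
```

Passing a plain dict instead of `os.environ` makes the fallback easy to check without touching the real environment.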
153 changes: 123 additions & 30 deletions src/code_scanning.py
@@ -22,26 +22,20 @@ def list_repo_code_scanning_alerts(api_endpoint, github_pat, repo_name):
url = "{}/repos/{}/code-scanning/alerts?per_page=100&page=1".format(
api_endpoint, repo_name
)
response = requests.get(
url,
headers={
"Authorization": "token {}".format(github_pat),
"Accept": "application/vnd.github.v3+json",
},
)
headers = {
"Authorization": "token {}".format(github_pat),
"Accept": "application/vnd.github.v3+json",
}
response = requests.get(url, headers=headers)
if response.status_code == 404:
return "need permission to access,{}".format(repo_name) # don't have permission
if response.status_code == 403:
return "need to enable GHAS,{}".format(repo_name) # no GHAS
response_json = response.json()
while "next" in response.links.keys():
response = requests.get(
response.links["next"]["url"],
headers={
"Authorization": "token {}".format(github_pat),
"Accept": "application/vnd.github.v3+json",
},
)
response = requests.get(response.links["next"]["url"], headers=headers)
response_json.extend(response.json())

print("Found {} code scanning alerts in {}".format(len(response_json), repo_name))

# Return code scanning alerts
return response_json
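The pagination loop in this function follows the `next` relation that `requests` parses from the `Link` response header into `response.links`. A minimal sketch of that loop in isolation, assuming a hypothetical helper `fetch_all_pages` that is not in the commit. It takes the HTTP getter as a parameter so it can be exercised without network access, and it omits the 404/403 handling shown above.

```python
def fetch_all_pages(url, headers, get=None):
    """Hypothetical helper: collect JSON list results across all pages."""
    if get is None:
        import requests  # imported lazily so a stub getter can be injected

        get = requests.get
    results = []
    while url:
        response = get(url, headers=headers)
        results.extend(response.json())
        # `response.links` is empty on the last page, which ends the loop.
        url = response.links.get("next", {}).get("url")
    return results
```

Building `headers` once and passing it to every request is the same refactoring this hunk applies to the committed code.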

@@ -131,22 +125,14 @@ def list_org_code_scanning_alerts(api_endpoint, github_pat, org_name):
     url = "{}/orgs/{}/code-scanning/alerts?per_page=100&page=1".format(
         api_endpoint, org_name
     )
-    response = requests.get(
-        url,
-        headers={
-            "Authorization": "token {}".format(github_pat),
-            "Accept": "application/vnd.github.v3+json",
-        },
-    )
+    headers = {
+        "Authorization": "token {}".format(github_pat),
+        "Accept": "application/vnd.github.v3+json",
+    }
+    response = requests.get(url, headers=headers)
     response_json = response.json()
     while "next" in response.links.keys():
-        response = requests.get(
-            response.links["next"]["url"],
-            headers={
-                "Authorization": "token {}".format(github_pat),
-                "Accept": "application/vnd.github.v3+json",
-            },
-        )
+        response = requests.get(response.links["next"]["url"], headers=headers)
         response_json.extend(response.json())

     print("Found {} code scanning alerts in {}".format(len(response_json), org_name))
@@ -235,3 +221,110 @@ def write_org_cs_list(cs_list):
                         str(cs["repository"]["private"]),
                     ]
                 )
+
+
+def list_enterprise_code_scanning_alerts(api_endpoint, github_pat, repo_list):
+    """
+    Get a list of all code scanning alerts on a given enterprise.
+    Inputs:
+    - API endpoint (for GHES/GHAE compatibility)
+    - PAT of appropriate scope
+    - Repository list in "org/repo" format (from enterprise.get_repo_report)
+    Outputs:
+    - List of _all_ code scanning alerts in enterprise that PAT user can access
+    Notes:
+    - Use `ghe-org-admin-promote` to gain ownership of all organizations.
+    - Personal repos will not be reported on, as they cannot use code scanning.
+    """
+
+    alerts = []
+    while True:
+        try:
+            repo_name = next(repo_list) # skip the header by putting this up front
+            alerts.append(
+                list_repo_code_scanning_alerts(api_endpoint, github_pat, repo_name)
+            )
+        except StopIteration:
+            break
+        except Exception as e:
+            print(e)
+    return alerts
+
+
+def write_enterprise_cs_list(cs_list):
+    """
+    Write a list of code scanning alerts to a csv file.
+    Inputs:
+    - List from list_enterprise_code_scanning_alerts function, which contains
+      strings and lists of dictionaries for the alerts.
+    Outputs:
+    - CSV file of code scanning alerts
+    - CSV file of repositories not accessible or without code scanning enabled
+    """
+
+    for alert_list in cs_list:
+        if type(alert_list) == list:
+            print(alert_list)
+            with open("cs_list.csv", "a") as f:
+                writer = csv.writer(f)
+                writer.writerow(
+                    [
+                        "number",
+                        "created_at",
+                        "html_url",
+                        "state",
+                        "fixed_at",
+                        "dismissed_by",
+                        "dismissed_at",
+                        "dismissed_reason",
+                        "rule_id",
+                        "rule_severity",
+                        "rule_tags",
+                        "rule_description",
+                        "rule_name",
+                        "tool_name",
+                        "tool_version",
+                        "most_recent_instance_ref",
+                        "most_recent_instance_state",
+                        "most_recent_instance_sha",
+                        "instances_url",
+                    ]
+                )
+                for cs in alert_list: # loop through each alert in the list
+                    if cs["state"] == "open":
+                        cs["fixed_at"] = "none"
+                        cs["dismissed_by"] = "none"
+                        cs["dismissed_at"] = "none"
+                        cs["dismissed_reason"] = "none"
+                    writer.writerow(
+                        [
+                            cs["number"],
+                            cs["created_at"],
+                            cs["html_url"],
+                            cs["state"],
+                            cs["fixed_at"],
+                            cs["dismissed_by"],
+                            cs["dismissed_at"],
+                            cs["dismissed_reason"],
+                            cs["rule"]["id"],
+                            cs["rule"]["severity"],
+                            cs["rule"]["tags"],
+                            cs["rule"]["description"],
+                            cs["rule"]["name"],
+                            cs["tool"]["name"],
+                            cs["tool"]["version"],
+                            cs["most_recent_instance"]["ref"],
+                            cs["most_recent_instance"]["state"],
+                            cs["most_recent_instance"]["commit_sha"],
+                            cs["instances_url"],
+                        ]
+                    )
+        else:
+            with open("excluded_repos.csv", "a") as g:
+                writer = csv.writer(g)
+                writer.writerow([alert_list])
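As committed, `write_enterprise_cs_list` opens `cs_list.csv` in append mode inside the loop, so it appears to write the header row once per repository rather than once per file. One possible alternative is to separate the mixed results first and write a single header. `split_results`, `write_alerts`, and the shortened `COLUMNS` list are hypothetical names for illustration, not part of the commit.

```python
import csv
import io

# Abbreviated for the sketch; the committed function writes 19 columns.
COLUMNS = ["number", "state", "html_url"]


def split_results(cs_list):
    """Separate per-repository alert lists from the error strings."""
    alerts, excluded = [], []
    for item in cs_list:
        if isinstance(item, list):
            alerts.extend(item)  # flatten alerts from every repository
        else:
            excluded.append(item)  # e.g. "need to enable GHAS,org/repo"
    return alerts, excluded


def write_alerts(alerts, stream):
    """Write one header row, then one row per alert."""
    writer = csv.writer(stream)
    writer.writerow(COLUMNS)
    for alert in alerts:
        writer.writerow([alert.get(col, "none") for col in COLUMNS])


if __name__ == "__main__":
    sample = [[{"number": 1, "state": "open", "html_url": "u"}], "need to enable GHAS,org/repo"]
    found, skipped = split_results(sample)
    buffer = io.StringIO()
    write_alerts(found, buffer)
    print(buffer.getvalue())
```

Writing to any file-like stream, rather than a fixed filename, keeps the sketch testable. The committed code's fixed `cs_list.csv` and `excluded_repos.csv` names could be passed in by the caller.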
52 changes: 52 additions & 0 deletions src/enterprise.py
@@ -0,0 +1,52 @@
+# This holds all the logic for the various enterprise differences.
+
+# Imports
+import csv
+from time import sleep
+import requests
+
+
+def get_enterprise_version(api_endpoint):
+    """
+    Get the version of GitHub Enterprise. It'll be used to account for
+    differences between GHES and GHAE and GHEC, like the organization secret
+    scanning API not existing outside GHEC.
+    GitHub AE returns "GitHub AE" as of M2
+    GHES returns the version of GHES that's installed (e.g. "3.4.0")
+    """
+    if api_endpoint != "https://api.github.com":
+        url = "{}/meta".format(api_endpoint)
+        response = requests.get(url)
+        if "installed_version" in response.json():
+            return response.json()["installed_version"]
+        else:
+            return "unknown version of GitHub"
+    else:
+        return "GHEC"
+
+
+def get_repo_report(url, github_pat):
+    """
+    Get the `all_repositories.csv` report from GHES / GHAE.
+    """
+    headers = {
+        "Accept": "application/vnd.github.v3+json",
+        "Authorization": "token {}".format(github_pat),
+    }
+    url = "{}/stafftools/reports/all_repositories.csv".format(url)
+    response = requests.get(url, headers=headers)
+    if response.status_code == 202: # report needs to be generated
+        while response.status_code == 202:
+            print("Waiting a minute for the report to be generated ...")
+            sleep(60)
+            response = requests.get(url, headers=headers)
+    elif response.status_code == 200: # report is ready
+        print("Report is ready! Reading it now ...")
+        for row in csv.reader(response.text.splitlines()): # skip user repos
+            if row[2] == "Organization":
+                yield "{}/{}".format(row[3], row[5])
+            else:
+                pass
+    else: # something went wrong with fetching the report
+        exit("Error: {}".format(response.status_code))
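In `get_repo_report` the 200 branch is an `elif` of the 202 check, so when the first request returns 202 the function waits for the report but then seems to finish without reading it. One way to avoid that is to keep polling separate from parsing. A minimal sketch under that assumption follows. `wait_for_report` and `parse_repo_report` are hypothetical names, and the column positions are copied from the committed code rather than from any documentation.

```python
import csv
from time import sleep


def parse_repo_report(report_text):
    """Yield "org/repo" for organization-owned rows of all_repositories.csv."""
    for row in csv.reader(report_text.splitlines()):
        # Positions mirror the committed code: owner type in column 2,
        # owner name in column 3, repository name in column 5. The header
        # row and user-owned repositories fail the comparison and are skipped.
        if len(row) > 5 and row[2] == "Organization":
            yield "{}/{}".format(row[3], row[5])


def wait_for_report(fetch, delay=60, max_attempts=10):
    """Call `fetch()` until it stops returning HTTP 202, then return the response."""
    for _ in range(max_attempts):
        response = fetch()
        if response.status_code != 202:
            return response
        sleep(delay)
    raise TimeoutError("report was not ready after {} attempts".format(max_attempts))
```

A caller could then pass `lambda: requests.get(url, headers=headers)` as `fetch`, check for status 200, and feed `response.text` to `parse_repo_report`. The attempt limit is an addition of this sketch, since the committed loop waits indefinitely.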
