Infrastructure

Greg Bowler edited this page Nov 25, 2022 · 3 revisions

This document outlines the non-code part of the technical product.

There are two extremes of infrastructure:

  1. index.php on a $5/month virtual server. Deployments are done by copying index.php onto the server.
  2. An auto-scaling Kubernetes cluster, defined in a hosted resource inventory service, with load balanced pods secured within a web application firewall node, cost-optimised by a realtime flow log analysis, with certificates automatically rolled and applied by a security posture management service, where the application code runs in elastic load balanced containers, the database is hosted in an in-memory managed relational database, and user data is stored on block storage backed by an object warehouse (on a separate transit gateway with its own threat detection service). Deployments are automated by hooking into the continuous integration pipeline with blue:green fleet servers.

When launching a product it's important to plan the infrastructure so it's simple, understandable and maintainable by the team at the start of the product's life, while also keeping the door open to extension when it's necessary in the future.

There's nothing wrong with starting with point 1 and adding extra steps when required, but there's everything wrong with starting at point 2 (where there are more servers than there are customers). By the way, the text from point 2 is taken from a subcontractor's quote on a product that I worked on recently. I personally can't understand how a product could ever require that kind of infrastructure, but it's really common to see startups assume they do.

What do we need?

  1. Code must be tested before any deployment is made. This is already automated thanks to GitHub Actions.
  2. Deployments need to be predictable - should we deploy every change that's merged into the master branch, or should we have a manual release process? For this project, I'm tempted to use a slower, manual release, bundling a known number of milestones into an update that we can blog about, but I also see the benefit of having rolling releases. This is a point for discussion.
  3. Deployments need a plan B - if something goes wrong, it needs to be easy to get back to the last working state.
  4. Servers are cattle, not pets - each time a deployment is made, the whole project should be set up, installed and made live on a new server, after which the old server can be culled. This ensures there are no assumptions made in the deployment/infrastructure, and makes it super easy to understand if it's done from scratch each time.
  5. Infrastructure should be automated wherever possible - with cloud computing as mature as it is these days, it's possible to define all steps in code, leaving no room for the error of a misclick in a config panel somewhere.
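
Point 3's plan B can be very small if the previous server is left running: rolling back is just repointing traffic. Here's a minimal sketch assuming DigitalOcean with a reserved IP in front of the droplets - the IP, droplet ID and URL are all hypothetical, and with DRY_RUN=1 (the default here) the commands are printed rather than executed:

```shell
#!/bin/sh
# Rollback sketch: repoint the reserved IP at the last known-good droplet.
# RESERVED_IP and LAST_GOOD_ID are hypothetical placeholder values.
set -eu

DRY_RUN=${DRY_RUN:-1}
run() {
    # Print the command in dry-run mode; execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

RESERVED_IP="203.0.113.10"  # the IP the domain points at (hypothetical)
LAST_GOOD_ID="12345678"     # ID of the previous, known-good droplet

# Point live traffic back at the previous droplet...
run doctl compute reserved-ip-action assign "$RESERVED_IP" "$LAST_GOOD_ID"

# ...and check the site responds before calling the rollback done.
run curl --fail "https://example.com/"
```

Because the whole rollback is one IP reassignment, "getting back to the last working state" doesn't depend on re-deploying anything.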

Starting out

// TODO:
// 1) Provision a baseline image with DigitalOcean's doctl command.
// 2) Deploy to a new server using the existing image.
// 3) Switch the dynamic IP to the new server.
// 4) Smoke test on live.
// 5) Cull the old server (maybe leave one old server running so there's always a backup).
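
The steps above could end up looking something like this. It's only a sketch under assumptions not decided yet: a baseline snapshot ID, a DigitalOcean reserved IP standing in for the "dynamic IP", and hypothetical droplet/host names. With DRY_RUN=1 (the default here) every command is printed instead of executed:

```shell
#!/bin/sh
# Sketch of the cattle-not-pets deploy cycle on DigitalOcean.
# SNAPSHOT_ID, RESERVED_IP and the droplet/host names are hypothetical.
set -eu

DRY_RUN=${DRY_RUN:-1}
run() {
    # Print the command in dry-run mode; execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

SNAPSHOT_ID="12345678"           # baseline image snapshot (hypothetical)
RESERVED_IP="203.0.113.10"       # IP the domain points at (hypothetical)
OLD_DROPLET="app-20221101093000" # previous server to cull (hypothetical)
NEW_DROPLET="app-$(date +%Y%m%d%H%M%S)"

# 1) Provision a new droplet from the baseline image.
run doctl compute droplet create "$NEW_DROPLET" \
    --image "$SNAPSHOT_ID" --size s-1vcpu-1gb --region lon1 --wait

# 2) Install the project on the new droplet (placeholder deploy step).
run ssh "deploy@$NEW_DROPLET" "cd /srv/app && git pull"

# 3) Switch the reserved IP to the new droplet. In practice this takes the
#    droplet's numeric ID, found via `doctl compute droplet list`.
run doctl compute reserved-ip-action assign "$RESERVED_IP" "$NEW_DROPLET"

# 4) Smoke test against the live address.
run curl --fail "https://$RESERVED_IP/"

# 5) Cull the old droplet, keeping one previous server as a fallback.
run doctl compute droplet delete "$OLD_DROPLET" --force
```

Because every run builds a fresh droplet from the same snapshot, nothing on the old server is load-bearing - which is exactly the cattle-not-pets property described above.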

Deployment pipeline

// TODO:
// 1) Push.
// 2) GitHub Actions runs the tests.
// 3) Deploy master to the dev server / releases to the live server.
// 4) Smoke test.
// 5) Post-deploy actions.
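
The branch-vs-release split in the pipeline could be a small script that the CI job runs after the tests pass. A sketch only - GITHUB_REF is the one thing GitHub Actions really does provide, while the server names and paths are hypothetical, and DRY_RUN=1 (the default here) prints the commands instead of executing them:

```shell
#!/bin/sh
# Sketch: pick a deploy target from the ref GitHub Actions is building.
# Tagged releases go to live; master goes to the dev server.
set -eu

DRY_RUN=${DRY_RUN:-1}
run() {
    # Print the command in dry-run mode; execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

# In a real Actions job GITHUB_REF is already set; default it for the sketch.
GITHUB_REF=${GITHUB_REF:-refs/heads/master}

case "$GITHUB_REF" in
    refs/tags/*)       TARGET="live.example.com" ;; # releases go live
    refs/heads/master) TARGET="dev.example.com"  ;; # master goes to dev
    *) echo "Not a deployable ref: $GITHUB_REF"; exit 0 ;;
esac

# Deploy, smoke test, then run post-deploy actions on the chosen target.
run rsync -az --delete ./ "deploy@$TARGET:/srv/app/"
run curl --fail "https://$TARGET/health"
run ssh "deploy@$TARGET" "/srv/app/bin/post-deploy"
```

This shape keeps the rolling-vs-manual question from point 2 open: rolling releases mean deploying master straight to live, while the manual process just means live waits for a tag.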

Future planning

// TODO: Ongoing document that timestamps each infrastructure change, recording what changed, how and why.
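
One possible shape for an entry in that document - the date and content here are made up, just to show the what/how/why structure:

```
## 2022-11-25

What: Moved deployments from manual copying to the doctl-based script.
How:  New deploy script in the repo, run from the CI pipeline.
Why:  Removes the manual step and makes rollback predictable.
```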