Scalability

Phoenix has been designed with scalability in mind. In fact, the reason we split the project into separate APIs was to be able to deploy the public service on faster and larger machines to absorb the high volume of traffic we handle daily at RTL.

For scaling the application, we adopted an opinionated setup for our production Kubernetes cluster in AWS. We have the following definition:

  • 1 m5.xlarge instance where we deployed both the worker and the internal services
  • 4 c5.2xlarge instances where we deployed the public service

The above machines sit in two different AutoscalingGroups, with node taints and tolerations in place. This helps each deployment target the correct instances, since the Kubernetes scheduler places pods according to those taints and node labels.
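
As a rough illustration, a Deployment pinned to one of these node groups could look like the sketch below. The label key `node-group`, the taint key `dedicated`, and the resource names are hypothetical placeholders, not the values from our chart:

```yaml
# Hypothetical sketch: scheduling the public service onto its own
# node group. Label/taint keys and names are illustrative only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: phoenix-public
spec:
  replicas: 4
  selector:
    matchLabels:
      app: phoenix-public
  template:
    metadata:
      labels:
        app: phoenix-public
    spec:
      # Only consider nodes labeled for the public service.
      nodeSelector:
        node-group: public
      # Tolerate the taint applied to that AutoscalingGroup so the
      # scheduler is allowed to place these pods there.
      tolerations:
        - key: dedicated
          operator: Equal
          value: public
          effect: NoSchedule
      containers:
        - name: phoenix-public
          image: phoenix/public:latest
```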

The public service also has a HorizontalPodAutoscaler (HPA) that scales the application when CPU or memory usage goes over a certain threshold. We did not make the HPA publicly available in our chart, since it is tailored to our Kubernetes cluster.
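
For reference, a more generic HPA could look roughly like the following sketch. The Deployment name, replica bounds, and utilization thresholds are illustrative assumptions, not our production values:

```yaml
# Hypothetical sketch of a generic HPA for the public service.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: phoenix-public
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: phoenix-public
  minReplicas: 4
  maxReplicas: 12
  metrics:
    # Scale out when average CPU utilization exceeds 70%.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # Scale out when average memory utilization exceeds 80%.
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```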

If you feel like it, you can write a more generic HPA and open up a PR ❤️
