open source v2
nathanielparke committed Oct 19, 2020
1 parent dc7673b commit 35a7a33
Showing 34 changed files with 1,002 additions and 1 deletion.
5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -127,3 +127,8 @@ dmypy.json

# Pyre type checker
.pyre/

deploy/id_rsa
*.retry
*.key
.idea/
56 changes: 55 additions & 1 deletion README.md
@@ -1 +1,55 @@
# validators
# Validators
## Motivation
This repository is meant to serve as an example of how to run a Solana validator.
It does not give specifics on the architecture of Solana, and should not be used as a substitute for Solana's documentation.
It is highly recommended to read [Solana's Documentation](https://docs.solana.com/running-validator) about running a validator.
This repository should be used in conjunction with Solana's guide. It provides practical
real-world examples of cluster setup, and should act as a starting point for participating
in mainnet validation.

This repository gives two examples of potential validator setups. The first is a
single node validator that can be used as an entry point for querying on-chain Solana data, or
validating transactions.
The second is a cluster of Solana validators that are load balanced by an nginx server. Active
health checks are available only in nginx's commercial version, NGINX Plus. A configuration
for active health checks is also included.

The end goal of this guide is to have a Solana validator cluster running in a cloud
environment.

## Overview of setups
- run a single validator
- run a cluster of validators
## Running a single validator
#### Instance configuration
##### Choosing an instance type
Solana's documentation recommends choosing a node type with the highest number of cores possible ([see here](https://docs.solana.com/running-validator/validator-reqs)).
Additionally, the Solana mainnet utilizes GPUs to increase network throughput. Solana's documentation
recommends using Nvidia Turing or Volta family GPUs, which are available through most cloud providers.

This guide was tested on [Amazon AWS g4dn.16xlarge instances](https://aws.amazon.com/ec2/instance-types/g4/) running the
Ubuntu 18.04 Deep Learning AMI. These instances provide Nvidia T4 GPUs with a balance of high network
throughput and CPU resources.

##### Instance network configuration
After provisioning an instance it is important to configure network whitelists to be compatible
with a validator's network usage. Solana nodes communicate via a gossip protocol. This protocol takes
place over a port range specified upon validator startup. For this guide we will set that port range to
8000-8012. Be sure to whitelist network traffic on whichever port range you choose.
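
As a rough sketch of the whitelisting step above, the helper below prints firewall rules for the chosen port range so they can be reviewed before being applied. It assumes the host firewall is `ufw` (this guide does not name one, and a cloud security group would need equivalent rules) and that both UDP and TCP should be opened across the range. The function name `gossip_firewall_rules` is hypothetical.

```shell
#!/bin/sh
# Print (do not apply) ufw rules for a validator's port range.
# Assumptions: ufw is the firewall in use, and both UDP and TCP are opened.
gossip_firewall_rules() {
    first_port="${1:-8000}"   # start of the range used in this guide
    last_port="${2:-8012}"    # end of the range used in this guide
    for proto in udp tcp; do
        # ufw writes port ranges as first:last and requires a protocol
        echo "ufw allow ${first_port}:${last_port}/${proto}"
    done
}

gossip_firewall_rules 8000 8012
```

Printing the rules instead of running them keeps the sketch safe to try on any machine; once the output looks right, each line can be run with `sudo`.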

Validator RPC servers also bind to configurable ports. This guide will set RPC servers to use port 8899
for standard HTTP requests and 8900 for websocket connections.
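
Once the RPC port is reachable, a quick probe can confirm that the validator is answering. The sketch below queries the `/health` path, the same path that `deploy/check_slot_distance.sh` in this repository requests. It assumes a healthy node replies with the literal body `ok`; the function names `health_status` and `probe_rpc` are hypothetical.

```shell
#!/bin/sh
# Classify the body returned by the validator's /health endpoint.
# Assumption: a healthy node answers with the literal body "ok".
health_status() {
    case "$1" in
        ok) echo "healthy" ;;
        "") echo "unreachable" ;;
        *)  echo "unhealthy: $1" ;;
    esac
}

# Probe a validator's RPC port; the defaults mirror the ports in this guide.
probe_rpc() {
    rpc_host="${1:-localhost}"
    rpc_port="${2:-8899}"
    # --max-time keeps the probe from hanging on a filtered port;
    # "|| true" leaves the body empty instead of aborting when curl fails.
    body="$(curl --silent --max-time 5 "http://${rpc_host}:${rpc_port}/health" || true)"
    health_status "$body"
}
```

Running `probe_rpc localhost 8899` on the validator itself checks the RPC server directly; running it from another machine also exercises the firewall rules for port 8899.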

#### Setting up a single validator
Once an instance has been deployed and is accessible over SSH, we can use Ansible to run some basic setup
scripts. Ansible works by inspecting the contents of a `hosts.yaml` file, which defines the inventory of servers to which one can deploy.
To make our servers accessible to Ansible, add your server's network location to the validators block in `deploy/hosts.yaml`.
This will indicate that the specified server is part of the `validators` group, which will contain our validator machines.
`deploy/setup.yaml` contains a set of common setup steps for configuring a server from the base OS image. You can run these
setup steps using:
```
# run this from the /deploy directory
ansible-playbook -i hosts.yaml -l validators setup.yaml
```
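
The contents of `deploy/hosts.yaml` are not shown in this README, so the following is only a sketch of what a minimal inventory with a `validators` group could look like, written from the shell. The hostname `validator-1.test.net` is borrowed from the example nginx configuration, `ansible_user: ubuntu` is an assumption based on the Ubuntu AMI, and the function name `write_inventory` and the default output path are hypothetical.

```shell
#!/bin/sh
# Write a minimal Ansible YAML inventory with a single `validators` group.
# Assumptions: one host, reachable as validator-1.test.net, SSH user ubuntu.
write_inventory() {
    inventory_path="${1:-hosts.example.yaml}"
    cat > "$inventory_path" <<'EOF'
validators:
  hosts:
    validator-1.test.net:
      ansible_user: ubuntu
EOF
    # Echo the path so callers can chain further commands on it
    echo "$inventory_path"
}
```

After calling `write_inventory hosts.example.yaml`, running `ansible-inventory -i hosts.example.yaml --graph` should list the host under the `validators` group, which is the group the `-l validators` flag above selects.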

## Running a cluster of validators
7 changes: 7 additions & 0 deletions deploy/ansible.cfg
@@ -0,0 +1,7 @@
[defaults]
inventory = ./hosts.yaml
forks = 100
interpreter_python = auto

[ssh_connection]
pipelining = True
5 changes: 5 additions & 0 deletions deploy/check_slot_distance.sh
@@ -0,0 +1,5 @@
#!/bin/bash -e

pssh=$(which parallel-ssh || which pssh)

$pssh -h <(ansible all --list-hosts -i hosts.yaml | tail -n+2) -i -l ubuntu -- 'curl http://localhost:8899/health'
5 changes: 5 additions & 0 deletions deploy/etc/common/dnsmasq.d/local.conf
@@ -0,0 +1,5 @@
# Increase cache size
cache-size=4096

# Cache negative replies even if they do not have TTLs
neg-ttl=10
82 changes: 82 additions & 0 deletions deploy/etc/common/nginx/nginx.conf
@@ -0,0 +1,82 @@
user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
worker_rlimit_nofile 80000;

events {
worker_connections 50000;
# multi_accept on;
}

http {

##
# Basic Settings
##

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
# server_tokens off;

server_names_hash_bucket_size 128;
# server_name_in_redirect off;

include /etc/nginx/mime.types;
default_type application/octet-stream;

client_max_body_size 100m;

proxy_busy_buffers_size 32k;
proxy_buffers 128 8k;

##
# SSL Settings
##

ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers on;
ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384;
ssl_session_timeout 100m;
ssl_session_cache shared:SSL:10m;
ssl_session_tickets off;
ssl_dhparam /etc/nginx/ssl/dhparam.pem;

##
# Logging Settings
##

log_format main_ext '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$host" sn="$server_name" '
'rt=$request_time '
'ua="$upstream_addr" us="$upstream_status" '
'ut="$upstream_response_time" ul="$upstream_response_length" '
'cs=$upstream_cache_status '
'msec=$msec '
'aid="$upstream_http_account_id"';
access_log /var/log/nginx/access.log main_ext;
error_log /var/log/nginx/error.log warn;

##
# Gzip Settings
##

gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_buffers 16 8k;
gzip_http_version 1.1;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

##
# Virtual Host Configs
##

include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}
9 changes: 9 additions & 0 deletions deploy/etc/common/nginx/sites-enabled/status
@@ -0,0 +1,9 @@
server {
listen 127.0.0.1:81;
server_name 127.0.0.1;
location /nginx_status {
stub_status on;
allow 127.0.0.1;
deny all;
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload";
add_header X-Frame-Options sameorigin;
add_header X-Content-Type-Options nosniff;
add_header X-XSS-Protection "1; mode=block";
add_header Content-Security-Policy block-all-mixed-content;
13 changes: 13 additions & 0 deletions deploy/etc/common/nginx/ssl/dhparam.pem
@@ -0,0 +1,13 @@
-----BEGIN DH PARAMETERS-----
MIICCAKCAgEAwSMA4EB8xhOeTzV+UMAG7fGVvE75S7WqUMG83YC0hXuefpNY0w5b
wQkM5ffkNiIa/lv2W+SqR2WRoh0M6xI0HdUdVKVkNYyWqBKRW4fjh+hbYMar8FCM
TFibDHoNU+40Z9bwKWWeURZAQj9yCA0dbXCkv7nIuVrWTHBMHtNt9quMvqevZPoU
wL6N004E9pjlEogH4PX/H+o08xGicNtlJXsU0rd2Xev9URo/8IU92qocBjUiUvow
yRUJaufmqfT5IV+ezLUCV1yC2UOj0BA3sNVdFNS8MUIIJWWUfLspXHE0iQjNJuW6
HOmj9sMwVWjnuRjpMza6wNi+CAaKgzI8YrfABd/PtRl9bxztGRXTaLK+ecRlUbq3
l++SLu3mX7GfoACxHhAxQAoaDsZZMgqvsI23DP5FHCCMSQGw6r/dJuZ4q4b8qjWX
u6eOY+ZBg4FIYiMsHcgNcNPGKoLf/YQ3L3EAl9iRb2dXPza5QW9pLzoGLRC94EIT
Wq2hthOqJPsiEihc2gBaV5sdcbO+tqf4XhtbWLKMVDt91TSYzukdrlE5rnFpmvr5
0ze5saNI1tsAgpL8UmJkjpT19VUF6eTv7wpc2gAklel+kUTlJ1rjwja2uq+zNDI5
dzt6iXs1SHgY6wkn9orNPAmWFRoKkaLJgmWFeJHIqp14opS4ZESaSiMCAQI=
-----END DH PARAMETERS-----
12 changes: 12 additions & 0 deletions deploy/etc/lb/nginx/sites-available/dashboard.conf
@@ -0,0 +1,12 @@
server {
listen 30000;
location /api {
api;
}
location = /dashboard.html {
root /usr/share/nginx/html;
}
# location /swagger-ui {
# root /usr/share/nginx/html;
# }
}
73 changes: 73 additions & 0 deletions deploy/etc/lb/nginx/sites-available/validator-health-checks.conf
@@ -0,0 +1,73 @@
upstream validator_backend {
zone validator_backend 512k;
least_conn;
keepalive 8192;
server validator-1.test.net:8899 max_fails=20 fail_timeout=2;
server validator-2.test.net:8899 max_fails=20 fail_timeout=2;
}

upstream validator_ws_backend {
zone validator_ws_backend 512k;
least_conn;
server validator-1.test.net:8900 max_fails=20 fail_timeout=2;
server validator-2.test.net:8900 max_fails=20 fail_timeout=2;
}

server {
listen 80;
server_name validator-lb.test.net;
status_zone http_status_zone;

location / {
try_files /nonexistent @$http_upgrade;
}

location @websocket {
proxy_pass http://validator_ws_backend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
health_check uri=/health port=9090;
}

location @ {
proxy_pass http://validator_backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_next_upstream error timeout non_idempotent;
proxy_next_upstream_timeout 5;
proxy_next_upstream_tries 5;
health_check uri=/health port=9090;
}
}

server {
listen 443 ssl;
server_name validator-lb.test.net;
status_zone https_status_zone;

# TLS is enabled by the ssl parameter on the listen directive
ssl_certificate /etc/ssl/certs/test.net.pem;
ssl_certificate_key /etc/ssl/private/test.net.key;
ssl_client_certificate /etc/ssl/certs/cloudflare.pem;
ssl_verify_client on;

location / {
try_files /nonexistent @$http_upgrade;
}

location @websocket {
proxy_pass http://validator_ws_backend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
health_check uri=/health port=9090;
}

location @ {
proxy_pass http://validator_backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
health_check uri=/health port=9090;
}
}
65 changes: 65 additions & 0 deletions deploy/etc/lb/nginx/sites-available/validator.conf
@@ -0,0 +1,65 @@
upstream validator_backend {
least_conn;
keepalive 8192;
server validator-1.test.net:8899 max_fails=20 fail_timeout=2;
server validator-2.test.net:8899 max_fails=20 fail_timeout=2;
}

upstream validator_ws_backend {
least_conn;
server validator-1.test.net:8900 max_fails=20 fail_timeout=2;
server validator-2.test.net:8900 max_fails=20 fail_timeout=2;
}

server {
listen 80;
server_name validator-lb.test.net;

location / {
try_files /nonexistent @$http_upgrade;
}

location @websocket {
proxy_pass http://validator_ws_backend/$1;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}

location @ {
proxy_pass http://validator_backend/$1;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_next_upstream error timeout non_idempotent;
proxy_next_upstream_timeout 5;
proxy_next_upstream_tries 5;
}
}

server {
listen 443 ssl;
server_name validator-lb.test.net;

# TLS is enabled by the ssl parameter on the listen directive
ssl_certificate /etc/ssl/certs/test.net.pem;
ssl_certificate_key /etc/ssl/private/test.net.key;
ssl_client_certificate /etc/ssl/certs/cloudflare.pem;
ssl_verify_client on;

location / {
try_files /nonexistent @$http_upgrade;
}

location @websocket {
proxy_pass http://validator_ws_backend/$1;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}

location @ {
proxy_pass http://validator_backend/$1;
proxy_http_version 1.1;
proxy_set_header Connection "";
}
}
1 change: 1 addition & 0 deletions deploy/etc/lb/ssl/certs/test.net.pem
@@ -0,0 +1 @@
# insert ssl certificate here
1 change: 1 addition & 0 deletions deploy/etc/lb/ssl/private/test.net.key
@@ -0,0 +1 @@
# insert ssl private key here
30 changes: 30 additions & 0 deletions deploy/etc/validator/nginx/sites-available/validator.conf
@@ -0,0 +1,30 @@
upstream validator_backend {
keepalive 8192;
server localhost:8899 max_fails=20 fail_timeout=2;
}

upstream validator_ws_backend {
least_conn;
server localhost:8900 fail_timeout=2;
}

server {
listen 80;

location / {
try_files /nonexistent @$http_upgrade;
}

location @websocket {
proxy_pass http://validator_ws_backend/$1;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}

location @ {
proxy_pass http://validator_backend/$1;
proxy_http_version 1.1;
proxy_set_header Connection "";
}
}
3 changes: 3 additions & 0 deletions deploy/etc/validator/sysctl.d/20-solana-mmaps.conf
@@ -0,0 +1,3 @@
# Increase memory mapped files limit
# https://docs.solana.com/running-validator/validator-start#manual
vm.max_map_count = 500000
6 changes: 6 additions & 0 deletions deploy/etc/validator/sysctl.d/20-solana-udp-buffers.conf
@@ -0,0 +1,6 @@
# Increase UDP buffer size
# https://docs.solana.com/running-validator/validator-start#manual
net.core.rmem_default = 134217728
net.core.rmem_max = 134217728
net.core.wmem_default = 134217728
net.core.wmem_max = 134217728