From 3449fdb7be7b1a25c52f51e44d9e63007ac18f68 Mon Sep 17 00:00:00 2001 From: Alex Ruzenhack Date: Thu, 18 Jan 2024 18:50:22 +0000 Subject: [PATCH 1/4] chore: update README.md with more information for deploying process --- README.md | 39 +++++++++++++++++++++++++++++++++++---- 1 file changed, 35 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index e5f079ec..ee5cc1d5 100644 --- a/README.md +++ b/README.md @@ -14,10 +14,41 @@ If you add a new API, be careful to add ```request.header.origin``` as cacheKey, ## Deploying -Deploys are automated using Github Actions and Flux. +Deploys are automated using Github Actions and Flux in combination. A deployment is made for 3 environments: `dev`, `testnet` and `mainnet`; and 2 systems: lambdas and deamon. -To deploy to the `testnet` environment, simply commit to the `main` branch. +Watch logs and alerts related to the deployment of the services in [SOP Logs](https://github.com/HathorNetwork/ops-tools/blob/master/docs/sops/hathor-explorer-service.md#logs). -To deploy to the `mainnet` environment, create a release in Github using a tag in the format `v0.0.0` +### Systems -Currently we do not have an automated mechanism to be warned if some automated deployment fails. So, after triggering the deploy, you should check if a commit was made by `fluxcdbot` in https://github.com/HathorNetwork/ops-tools/commits/master, updating the project's manifests with the new Docker image tag. \ No newline at end of file +There are 2 systems. + +#### Lambdas + +It happens by the make script method `deploy-lambdas-ci`, which build and deploy the source code to lambda services in the AWS. + +#### Deamon + +It happens by the make script method `deploy-deamons`, which build a docker image and push it to ECR, in the `hathor-explorer-service` repository. Once an image is pushed to ECR, the [fluxcdbot] identifies the lastest version tag and commit it in the ops-tools, in the hathor-explorer-service's kubernetes manifest. 
When the new definition is commited, a scheme sync happens in the kubernetes and new pods for the new version are lifted up. + +>[!TIP] +>After a deploy job finish with success in the GitHub action, look at the [Ops-Tools repository commits](https://github.com/HathorNetwork/ops-tools/commits/master) and check if the new hathor-explorer-service image for the new version is commited. + +### Environments + +There are 3 environments. + +#### Dev + +Every push to `dev` branch triggers the deployment for both systems in the `dev` environment. + +#### Testnet + +Every push to `main` branch triggers the deployment for both systems in the `testnet` environment. + +#### Mainnet + +Every push of a tag in the format `v0.0.0` triggers the deployment for both systems in the `mainnet` environment. + +### Troubleshooting + +Look for issues and possible solutions in the [On-Call Incidents repository](https://github.com/HathorNetwork/on-call-incidents/issues?q=is:issue+explorer-service), [Internal Issues](https://github.com/HathorNetwork/internal-issues/issues?q=is:issue+is:open+explorer-service), or in the repository itself. From 4d96729848d0c3b657f6f93dfeb28d89d33d2069 Mon Sep 17 00:00:00 2001 From: Alex Ruzenhack Date: Tue, 30 Jan 2024 15:49:11 +0000 Subject: [PATCH 2/4] chore: migrate Deployment section to SOP --- README.md | 39 +-------------------------------------- 1 file changed, 1 insertion(+), 38 deletions(-) diff --git a/README.md b/README.md index ee5cc1d5..f2ca0c1b 100644 --- a/README.md +++ b/README.md @@ -14,41 +14,4 @@ If you add a new API, be careful to add ```request.header.origin``` as cacheKey, ## Deploying -Deploys are automated using Github Actions and Flux in combination. A deployment is made for 3 environments: `dev`, `testnet` and `mainnet`; and 2 systems: lambdas and deamon. - -Watch logs and alerts related to the deployment of the services in [SOP Logs](https://github.com/HathorNetwork/ops-tools/blob/master/docs/sops/hathor-explorer-service.md#logs). 
-
-### Systems
-
-There are 2 systems.
-
-#### Lambdas
-
-It happens by the make script method `deploy-lambdas-ci`, which build and deploy the source code to lambda services in the AWS.
-
-#### Deamon
-
-It happens by the make script method `deploy-deamons`, which build a docker image and push it to ECR, in the `hathor-explorer-service` repository. Once an image is pushed to ECR, the [fluxcdbot] identifies the lastest version tag and commit it in the ops-tools, in the hathor-explorer-service's kubernetes manifest. When the new definition is commited, a scheme sync happens in the kubernetes and new pods for the new version are lifted up.
-
->[!TIP]
->After a deploy job finish with success in the GitHub action, look at the [Ops-Tools repository commits](https://github.com/HathorNetwork/ops-tools/commits/master) and check if the new hathor-explorer-service image for the new version is commited.
-
-### Environments
-
-There are 3 environments.
-
-#### Dev
-
-Every push to `dev` branch triggers the deployment for both systems in the `dev` environment.
-
-#### Testnet
-
-Every push to `main` branch triggers the deployment for both systems in the `testnet` environment.
-
-#### Mainnet
-
-Every push of a tag in the format `v0.0.0` triggers the deployment for both systems in the `mainnet` environment.
-
-### Troubleshooting
-
-Look for issues and possible solutions in the [On-Call Incidents repository](https://github.com/HathorNetwork/on-call-incidents/issues?q=is:issue+explorer-service), [Internal Issues](https://github.com/HathorNetwork/internal-issues/issues?q=is:issue+is:open+explorer-service), or in the repository itself.
+See in the [SOP](https://github.com/HathorNetwork/ops-tools/blob/master/docs/sops/hathor-explorer-service.md#deployment).
From a4426d2d111ac1687be8c4d85220ced948cc85cd Mon Sep 17 00:00:00 2001 From: Luis Helder Date: Fri, 16 Feb 2024 08:48:38 -0300 Subject: [PATCH 3/4] chore: use hathor-core's health endpoint (#311) --- gateways/clients/hathor_core_client.py | 1 + gateways/healthcheck_gateway.py | 8 +-- .../unit/gateways/test_healthcheck_gateway.py | 14 ++-- tests/unit/usecases/test_healthcheck.py | 70 +++++++++---------- usecases/get_healthcheck.py | 44 ++++++++---- 5 files changed, 77 insertions(+), 60 deletions(-) diff --git a/gateways/clients/hathor_core_client.py b/gateways/clients/hathor_core_client.py index 8f2ec5f5..78b75229 100644 --- a/gateways/clients/hathor_core_client.py +++ b/gateways/clients/hathor_core_client.py @@ -24,6 +24,7 @@ TX_ACC_WEIGHT_ENDPOINT = "/v1a/transaction_acc_weight" VERSION_ENDPOINT = "/v1a/version" FEATURE_ENDPOINT = "/v1a/feature" +HEALTH_ENDPOINT = "/v1a/health" class HathorCoreAsyncClient: diff --git a/gateways/healthcheck_gateway.py b/gateways/healthcheck_gateway.py index 6d8631a3..2d8703af 100644 --- a/gateways/healthcheck_gateway.py +++ b/gateways/healthcheck_gateway.py @@ -3,7 +3,7 @@ from common.configuration import ELASTIC_INDEX from gateways.clients.cache_client import CacheClient from gateways.clients.elastic_search_client import ElasticSearchClient -from gateways.clients.hathor_core_client import VERSION_ENDPOINT, HathorCoreAsyncClient +from gateways.clients.hathor_core_client import HEALTH_ENDPOINT, HathorCoreAsyncClient from gateways.clients.wallet_service_db_client import WalletServiceDBClient # The default lambda timeout for the Healtcheck Lambda is set to @@ -32,11 +32,11 @@ def __init__( wallet_service_db_client or WalletServiceDBClient() ) - async def get_hathor_core_version(self) -> Optional[dict]: - """Retrieve hathor-core version information""" + async def get_hathor_core_health(self) -> Optional[dict]: + """Retrieve hathor-core health information""" return await self.hathor_core_async_client.get( - VERSION_ENDPOINT, 
timeout=HEALTHCHECK_CLIENT_TIMEOUT_IN_SECONDS + HEALTH_ENDPOINT, timeout=HEALTHCHECK_CLIENT_TIMEOUT_IN_SECONDS ) def ping_redis(self) -> bool: diff --git a/tests/unit/gateways/test_healthcheck_gateway.py b/tests/unit/gateways/test_healthcheck_gateway.py index 6ef9ca83..3adb31f5 100644 --- a/tests/unit/gateways/test_healthcheck_gateway.py +++ b/tests/unit/gateways/test_healthcheck_gateway.py @@ -18,15 +18,15 @@ def setUp(self): wallet_service_db_client=self.wallet_service_db_client, ) - async def test_get_hathor_core_version(self): - async def mock_get_hathor_core_version(endpoint, **kwargs): - return {"version": "0.39.0"} + async def test_get_hathor_core_health(self): + async def mock_get_hathor_core_health(endpoint, **kwargs): + return {"status": "pass"} - self.hathor_core_async_client.get.side_effect = mock_get_hathor_core_version - result = await self.healthcheck_gateway.get_hathor_core_version() - self.assertEqual(result, {"version": "0.39.0"}) + self.hathor_core_async_client.get.side_effect = mock_get_hathor_core_health + result = await self.healthcheck_gateway.get_hathor_core_health() + self.assertEqual(result, {"status": "pass"}) self.hathor_core_async_client.get.assert_called_once_with( - "/v1a/version", timeout=5 + "/v1a/health", timeout=5 ) def test_ping_redis(self): diff --git a/tests/unit/usecases/test_healthcheck.py b/tests/unit/usecases/test_healthcheck.py index acc1ba63..fa59d36d 100644 --- a/tests/unit/usecases/test_healthcheck.py +++ b/tests/unit/usecases/test_healthcheck.py @@ -18,11 +18,11 @@ def setUp(self): ) def test_all_components_healthy(self): - async def mock_get_hathor_core_version(): - return {"version": "0.38.0"} + async def mock_get_hathor_core_health(): + return {"status": "pass"} - self.mock_healthcheck_gateway.get_hathor_core_version.side_effect = ( - mock_get_hathor_core_version + self.mock_healthcheck_gateway.get_hathor_core_health.side_effect = ( + mock_get_hathor_core_health ) 
self.mock_healthcheck_gateway.ping_wallet_service_db.return_value = (True, "1") self.mock_healthcheck_gateway.ping_redis.return_value = True @@ -80,11 +80,11 @@ async def mock_get_hathor_core_version(): ) def test_hathor_core_returns_error(self): - async def mock_get_hathor_core_version(): + async def mock_get_hathor_core_health(): return {"error": "Unable to connect"} - self.mock_healthcheck_gateway.get_hathor_core_version.side_effect = ( - mock_get_hathor_core_version + self.mock_healthcheck_gateway.get_hathor_core_health.side_effect = ( + mock_get_hathor_core_health ) self.mock_healthcheck_gateway.ping_wallet_service_db.return_value = (True, "1") self.mock_healthcheck_gateway.ping_redis.return_value = True @@ -142,11 +142,11 @@ async def mock_get_hathor_core_version(): ) def test_wallet_service_db_raises_exception(self): - async def mock_get_hathor_core_version(): - return {"version": "0.38.0"} + async def mock_get_hathor_core_health(): + return {"status": "pass"} - self.mock_healthcheck_gateway.get_hathor_core_version.side_effect = ( - mock_get_hathor_core_version + self.mock_healthcheck_gateway.get_hathor_core_health.side_effect = ( + mock_get_hathor_core_health ) self.mock_healthcheck_gateway.ping_wallet_service_db.side_effect = Exception( "Unable to connect" @@ -206,11 +206,11 @@ async def mock_get_hathor_core_version(): ) def test_wallet_service_db_reports_unhealthy(self): - async def mock_get_hathor_core_version(): - return {"version": "0.38.0"} + async def mock_get_hathor_core_health(): + return {"status": "pass"} - self.mock_healthcheck_gateway.get_hathor_core_version.side_effect = ( - mock_get_hathor_core_version + self.mock_healthcheck_gateway.get_hathor_core_health.side_effect = ( + mock_get_hathor_core_health ) self.mock_healthcheck_gateway.ping_wallet_service_db.return_value = ( False, @@ -271,11 +271,11 @@ async def mock_get_hathor_core_version(): ) def test_redis_raises_exception(self): - async def mock_get_hathor_core_version(): - return 
{"version": "0.38.0"} + async def mock_get_hathor_core_health(): + return {"status": "pass"} - self.mock_healthcheck_gateway.get_hathor_core_version.side_effect = ( - mock_get_hathor_core_version + self.mock_healthcheck_gateway.get_hathor_core_health.side_effect = ( + mock_get_hathor_core_health ) self.mock_healthcheck_gateway.ping_wallet_service_db.return_value = (True, "1") self.mock_healthcheck_gateway.ping_redis.side_effect = Exception( @@ -335,11 +335,11 @@ async def mock_get_hathor_core_version(): ) def test_redis_reports_unhealthy(self): - async def mock_get_hathor_core_version(): - return {"version": "0.38.0"} + async def mock_get_hathor_core_health(): + return {"status": "pass"} - self.mock_healthcheck_gateway.get_hathor_core_version.side_effect = ( - mock_get_hathor_core_version + self.mock_healthcheck_gateway.get_hathor_core_health.side_effect = ( + mock_get_hathor_core_health ) self.mock_healthcheck_gateway.ping_wallet_service_db.return_value = (True, "1") self.mock_healthcheck_gateway.ping_redis.return_value = False @@ -397,11 +397,11 @@ async def mock_get_hathor_core_version(): ) def test_elasticsearch_raises_exception(self): - async def mock_get_hathor_core_version(): - return {"version": "0.38.0"} + async def mock_get_hathor_core_health(): + return {"status": "pass"} - self.mock_healthcheck_gateway.get_hathor_core_version.side_effect = ( - mock_get_hathor_core_version + self.mock_healthcheck_gateway.get_hathor_core_health.side_effect = ( + mock_get_hathor_core_health ) self.mock_healthcheck_gateway.ping_wallet_service_db.return_value = (True, "1") self.mock_healthcheck_gateway.ping_redis.return_value = True @@ -459,11 +459,11 @@ async def mock_get_hathor_core_version(): ) def test_elasticsearch_reports_unhealthy(self): - async def mock_get_hathor_core_version(): - return {"version": "0.38.0"} + async def mock_get_hathor_core_health(): + return {"status": "pass"} - self.mock_healthcheck_gateway.get_hathor_core_version.side_effect = ( - 
mock_get_hathor_core_version + self.mock_healthcheck_gateway.get_hathor_core_health.side_effect = ( + mock_get_hathor_core_health ) self.mock_healthcheck_gateway.ping_wallet_service_db.return_value = (True, "1") self.mock_healthcheck_gateway.ping_redis.return_value = True @@ -521,11 +521,11 @@ async def mock_get_hathor_core_version(): ) def test_elasticsearch_report_yellow(self): - async def mock_get_hathor_core_version(): - return {"version": "0.38.0"} + async def mock_get_hathor_core_health(): + return {"status": "pass"} - self.mock_healthcheck_gateway.get_hathor_core_version.side_effect = ( - mock_get_hathor_core_version + self.mock_healthcheck_gateway.get_hathor_core_health.side_effect = ( + mock_get_hathor_core_health ) self.mock_healthcheck_gateway.ping_wallet_service_db.return_value = (True, "1") self.mock_healthcheck_gateway.ping_redis.return_value = True diff --git a/usecases/get_healthcheck.py b/usecases/get_healthcheck.py index d57daf78..5a2a9ac1 100644 --- a/usecases/get_healthcheck.py +++ b/usecases/get_healthcheck.py @@ -6,6 +6,7 @@ HealthcheckCallbackResponse, HealthcheckDatastoreComponent, HealthcheckHTTPComponent, + HealthcheckStatus, ) from common.configuration import ( @@ -54,18 +55,33 @@ def __init__( self.healthcheck.add_component(component) async def _get_fullnode_health(self): - # TODO: We need to use the hathor-core's /health endpoint when it's available - version_response = await self.healthcheck_gateway.get_hathor_core_version() + health_response = await self.healthcheck_gateway.get_hathor_core_health() - if "error" in version_response: + if "error" in health_response: return HealthcheckCallbackResponse( - status="fail", - output=f"Fullnode healthcheck errored: {version_response['error']}", + status=HealthcheckStatus.FAIL, + output=f"Fullnode healthcheck errored: {health_response['error']}", + ) + + status = health_response["status"] + + # Here we're assuming that a 'warn' status will be considered as unhealthy + is_healthy = status == 
HealthcheckStatus.PASS + is_unhealthy = status in [HealthcheckStatus.FAIL, HealthcheckStatus.WARN] + + if is_unhealthy: + output = f"Fullnode is not healthy: {str(health_response)}" + elif is_healthy: + output = "Fullnode is healthy" + else: + status = HealthcheckStatus.FAIL + output = ( + f"Fullnode returned an unexpected health status: {str(health_response)}" ) return HealthcheckCallbackResponse( - status="pass", - output="Fullnode is healthy", + status=status, + output=output, ) async def _get_wallet_service_db_health(self): @@ -73,12 +89,12 @@ async def _get_wallet_service_db_health(self): is_healthy, output = self.healthcheck_gateway.ping_wallet_service_db() except Exception as e: return HealthcheckCallbackResponse( - status="fail", + status=HealthcheckStatus.FAIL, output=f"Wallet service DB healthcheck errored: {repr(e)}", ) return HealthcheckCallbackResponse( - status="pass" if is_healthy else "fail", + status=HealthcheckStatus.PASS if is_healthy else HealthcheckStatus.FAIL, output="Wallet service DB is healthy" if is_healthy else f"Wallet service DB didn't respond as expected: {output}", @@ -89,12 +105,12 @@ async def _get_redis_health(self): is_healthy = self.healthcheck_gateway.ping_redis() except Exception as e: return HealthcheckCallbackResponse( - status="fail", + status=HealthcheckStatus.FAIL, output=f"Redis healthcheck errored: {repr(e)}", ) return HealthcheckCallbackResponse( - status="pass" if is_healthy else "fail", + status=HealthcheckStatus.PASS if is_healthy else HealthcheckStatus.FAIL, output="Redis is healthy" if is_healthy else "Redis reported as unhealthy", ) @@ -103,13 +119,13 @@ async def _get_elasticsearch_health(self): elasticsearch_info = self.healthcheck_gateway.get_elasticsearch_health() except Exception as e: return HealthcheckCallbackResponse( - status="fail", + status=HealthcheckStatus.FAIL, output=f"Elasticsearch healthcheck errored: {repr(e)}", ) if elasticsearch_info["status"] == "red": return HealthcheckCallbackResponse( - 
status="fail", + status=HealthcheckStatus.FAIL, output=f"Elasticsearch is not healthy: {str(elasticsearch_info)}", ) if elasticsearch_info["status"] == "yellow": @@ -119,7 +135,7 @@ async def _get_elasticsearch_health(self): ) return HealthcheckCallbackResponse( - status="pass", + status=HealthcheckStatus.PASS, output=str(elasticsearch_info), ) From 3b91199d41b14da0166ce3f64c6548a3f46da425 Mon Sep 17 00:00:00 2001 From: Luis Helder Date: Fri, 23 Feb 2024 09:59:02 -0300 Subject: [PATCH 4/4] bump: v0.14.0 (#312) --- package-lock.json | 4 ++-- package.json | 2 +- pyproject.toml | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/package-lock.json b/package-lock.json index cf3168bd..fd4877d5 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "hathor-explorer-service", - "version": "0.13.0", + "version": "0.14.0", "lockfileVersion": 2, "requires": true, "packages": { "": { "name": "hathor-explorer-service", - "version": "0.13.0", + "version": "0.14.0", "license": "MIT", "dependencies": { "@apidevtools/swagger-cli": "^4.0.4", diff --git a/package.json b/package.json index b08ca4b6..a10be040 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "hathor-explorer-service", - "version": "0.13.0", + "version": "0.14.0", "description": "Hathor Explorer Service Serverless deps", "dependencies": { "@apidevtools/swagger-cli": "^4.0.4", diff --git a/pyproject.toml b/pyproject.toml index f678d71e..940066f1 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [tool.poetry] name = "hathor-explorer-service" -version = "0.13.0" +version = "0.14.0" description = "" authors = ["Hathor Labs "] license = "MIT"
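
Note on PATCH 3/4: the status mapping introduced in `_get_fullnode_health` can be read in isolation as the sketch below. It is a minimal standalone illustration, not the code from the patch. `Status` is a local stand-in for the `HealthcheckStatus` enum (assumed here to be string-valued with `pass`, `warn` and `fail`), `classify_fullnode_health` is a hypothetical helper name, and the use of `.get("status")` is this sketch's choice, whereas the patch indexes `health_response["status"]` directly.

```python
from enum import Enum
from typing import Tuple


class Status(str, Enum):
    """Local stand-in for HealthcheckStatus (assumption: string-valued members)."""

    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"


def classify_fullnode_health(health_response: dict) -> Tuple[str, str]:
    """Hypothetical helper: map a hathor-core health payload to (status, output)."""
    # A client-side error payload is always reported as a failure.
    if "error" in health_response:
        return (
            Status.FAIL.value,
            f"Fullnode healthcheck errored: {health_response['error']}",
        )

    # Assumption of this sketch: a missing key is treated like an unknown status.
    status = health_response.get("status")

    # As in the patch, 'warn' is considered unhealthy but keeps its own status.
    if status in (Status.FAIL, Status.WARN):
        return status, f"Fullnode is not healthy: {health_response}"
    if status == Status.PASS:
        return status, "Fullnode is healthy"

    # Anything else is unexpected and is downgraded to 'fail'.
    return (
        Status.FAIL.value,
        f"Fullnode returned an unexpected health status: {health_response}",
    )
```

With these assumptions, a payload of `{"status": "pass"}` is reported as healthy, `{"status": "warn"}` keeps the `warn` status with a "not healthy" message, and any unrecognized status value is turned into `fail`.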