Chatie API Server Down Accident Report #73

Open
huan opened this issue Jul 15, 2021 · 0 comments

huan commented Jul 15, 2021

Token Service Discovery Service Accident

Our Wechaty puppet service discovery service experienced an outage starting at 11 am on Jun 15.

  1. 11 am: the service went down because its SSL certificate had expired.
  2. 2 pm: having noticed the problem around noon and started working on it, we found that port 80 on the server could not be reached from the public internet.
  3. 2:30 pm: the service was brought back in a degraded state by switching to Heroku dynos. It is degraded because we need two dynos to serve more than 1,300 concurrent WebSocket connections, and a token is registered to one dyno but not the other. As a result, the token service sometimes returns 404; retrying 1-2 times should return the right result.
  4. 10 pm: the service was moved back to Azure on a newly created server, which fixed the unreachable port 80. (The problem might be related to an Azure bug, because we could not make port 80 on the old server reachable from the internet.)
  5. 11 pm: the server was fully restored.
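
The retry advice in step 3 can be sketched roughly as follows. This is only a minimal illustration, not the actual Wechaty client code: `lookup_with_retry`, the `lookup` callable, and the `(status, body)` response shape are hypothetical names chosen for this example, and the real discovery endpoint is deliberately not shown.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical response shape: (HTTP status code, body or None).
Response = Tuple[int, Optional[str]]


def lookup_with_retry(
    lookup: Callable[[], Response],
    retries: int = 2,
    delay_seconds: float = 0.5,
) -> Response:
    """Call `lookup` once, then retry up to `retries` more times on 404.

    During the degraded period a request may land on the dyno that does
    not hold the token, so a 404 is treated as possibly transient.
    """
    status, body = lookup()
    for _ in range(retries):
        if status != 404:
            break
        time.sleep(delay_seconds)
        status, body = lookup()
    return status, body


if __name__ == "__main__":
    # Simulate two dynos: the first request misses, the second one hits.
    simulated = iter([(404, None), (200, "token-host-info")])
    print(lookup_with_retry(lambda: next(simulated), delay_seconds=0))
```

Passing the lookup in as a callable keeps the sketch independent of any particular HTTP library. A persistent 404 is still returned to the caller after the retries are used up, so a token that is genuinely unregistered is not hidden.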