From 752a787282f209671670490af264b9e3727e6447 Mon Sep 17 00:00:00 2001
From: Georgiana Dolocan
Date: Wed, 22 Nov 2023 14:07:16 +0200
Subject: [PATCH 1/2] Add 30m checklist

---
 projects/managed-hubs/index.md | 9 ++++
 projects/managed-hubs/support.md | 6 ++-
 .../timeboxed-initial-ticket-evaluation.md | 43 +++++++++++++++++++
 3 files changed, 56 insertions(+), 2 deletions(-)
 create mode 100644 projects/managed-hubs/timeboxed-initial-ticket-evaluation.md

diff --git a/projects/managed-hubs/index.md b/projects/managed-hubs/index.md
index b0c10a5a..75009445 100644
--- a/projects/managed-hubs/index.md
+++ b/projects/managed-hubs/index.md
@@ -6,3 +6,12 @@ The Managed JupyterHub Service is an ongoing service to **sustain and scale** a
 
 **[`docs.2i2c.org`](https://docs.2i2c.org) has most of the information about this service**.
 The sections here contain information that is more relevant to 2i2c team members (like support process documentation).
+
+```{toctree}
+:maxdepth: 2
+showcase-hub
+sales
+support
+timeboxed-initial-ticket-evaluation
+incidents
+```
\ No newline at end of file
diff --git a/projects/managed-hubs/support.md b/projects/managed-hubs/support.md
index f260c41b..969c80ed 100644
--- a/projects/managed-hubs/support.md
+++ b/projects/managed-hubs/support.md
@@ -177,11 +177,11 @@ This process is carried out in an ongoing basis by the {term}`Support Stewards`.
 
 The goal of the non-incident response process is to bring standardization to our support response. This simple workflow tries to battle the bias towards a reactive response whereas it is also bringing some common patterns so all of our non-incident support responses are cohesive and shared among our support stewards. The current iteration of the workflow states each step and who should be responsible/accountable for the specific step, plus some other clarifications.
 
-When a new ticket lands in Freshdesk under the support group and it is not an incident, you should follow the following steps:
+When a new ticket lands in Freshdesk under the support group and it is not an incident, we aim to respond within 24 working hours with a suggested next action. The following steps should be followed when resolving a ticket:
 
 1. `Who: Support steward`
 
-   **Respond within 24 working hours**. Acknowledge receipt of the support request and let the {term}`Community Representative` know a time-boxed investigation will start soon. Please request any additional information you may need to be able to reproduce the issue in step 2.
+   **First 24h initial ticket evaluation**. In the first 24h after a support ticket is opened, you should do an initial evaluation of the ticket and ask the {term}`Community Representative` for any additional information you may need.
 
 2. `Who: Support steward`
 
@@ -189,6 +189,8 @@ When a new ticket lands in Freshdesk under the support group and it is not an in
    1. If you resolve the issue, then jump to the "Confirm resolution" step 7.
    2. If you don't believe you can resolve the issue (or you couldn't) in 30 minutes, jump to the next step.
 
+   Follow the guide at [](support:timeboxed-evaluation) to try to reach a decision.
+
 3. `Who: Support Steward`
 
    **Open an engineering issue**. If this is a {term}`Change Request` or {term}`Guidance Request` and/or you cannot resolve the issue within 30 minutes, then open a support issue for the team to discuss.
diff --git a/projects/managed-hubs/timeboxed-initial-ticket-evaluation.md b/projects/managed-hubs/timeboxed-initial-ticket-evaluation.md
new file mode 100644
index 00000000..2d71d28e
--- /dev/null
+++ b/projects/managed-hubs/timeboxed-initial-ticket-evaluation.md
@@ -0,0 +1,43 @@
+(support:timeboxed-evaluation)=
+# Initial timeboxed (30m) ticket resolution checklist
+
+In the [non-incident support response process](https://compass.2i2c.org/projects/managed-hubs/support/#non-incident-response-process), an initial 30m time-boxed ticket resolution process is documented.
+
+The support triagers use this 30m time interval to try and resolve a ticket before opening a follow-up issue about it.
+
+The next sections represent an incomplete initial checklist that the support triager can follow in order to resolve the ticket or decide on opening a tracking issue about it, with the context they gained during this investigation.
+
+The steps to follow depend greatly on the type of ticket. To simplify, only three big ticket categories will be addressed.
+
+## Category 1: Something is not working
+
+```{important}
+If something is not working, you might be dealing with an incident, so depending on the scale of the issue and its nature, you might want to consider following the [Incident Response Process](https://compass.2i2c.org/projects/managed-hubs/incidents/#incident-response-process).
+```
+
+1. ✅ Ask for any additional info that might be needed
+1. ✅ Check the [](https://infrastructure.2i2c.org/howto/troubleshoot/logs/kubectl-logs)
+1. ✅ Check the [](https://infrastructure.2i2c.org/howto/troubleshoot/logs/cloud-logs)
+1. ✅ Save any of the logs that look useful
+1. ✅ Check if there's any of the issues described at [](troubleshooting)
+    1. ❌ If not, then open a new GitHub issue, sharing as much context from the previous steps as possible and continue with the [non-incident response process](https://compass.2i2c.org/projects/managed-hubs/support/#non-incident-response-process)
+
+## Category 2: New feature requested
+```{list-table}
+:widths: 30
+:header-rows: 1
+
+* - Is the feature requested documented at [](hub-features)?
+* - ☑ Yes? Then enable it after checking it is in the scope of the contract.
+* - ▫️ No? Then open a GitHub tracking issue about it and continue following the non-incident process.
+```
+
+## Category 3: Technical advice
+```{list-table}
+:widths: 30
+:header-rows: 1
+
+* - Is the question about an area that the support triager has insight into?
+* - ☑ Yes? Then answer the ticket.
+* - ▫️ No? Then open a GitHub tracking issue about it and continue following the non-incident process
+```
From 13de848c9b565c3dbae74ba87195d472b93af9c1 Mon Sep 17 00:00:00 2001
From: Georgiana Dolocan
Date: Thu, 23 Nov 2023 11:00:29 +0200
Subject: [PATCH 2/2] Fix links to relevant infra docs

---
 .../timeboxed-initial-ticket-evaluation.md | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/projects/managed-hubs/timeboxed-initial-ticket-evaluation.md b/projects/managed-hubs/timeboxed-initial-ticket-evaluation.md
index 2d71d28e..02624d8e 100644
--- a/projects/managed-hubs/timeboxed-initial-ticket-evaluation.md
+++ b/projects/managed-hubs/timeboxed-initial-ticket-evaluation.md
@@ -16,10 +16,15 @@ If something is not working, you might be dealing with an incident, so depending
 ```
 
 1. ✅ Ask for any additional info that might be needed
-1. ✅ Check the [](https://infrastructure.2i2c.org/howto/troubleshoot/logs/kubectl-logs)
-1. ✅ Check the [](https://infrastructure.2i2c.org/howto/troubleshoot/logs/cloud-logs)
+1. ✅ Check if the errors being reported are listed in this incomplete list of [the most commonly seen errors](https://infrastructure.2i2c.org/howto/troubleshoot/logs/common-errors/).
+1. ✅ Depending on the issue being experienced, you should check the relevant logs:
+
+    🟡 via cloud-agnostic tools like [kubectl or the deployer](https://infrastructure.2i2c.org/howto/troubleshoot/logs/kubectl-logs), which provide details about the currently running components
+
+    🟡 or search [the logs via the console](https://infrastructure.2i2c.org/howto/troubleshoot/logs/cloud-logs), which can be useful for digging out information about components that is persisted for a longer time span (30d in GCP's case).
+
 1. ✅ Save any of the logs that look useful
-1. ✅ Check if there's any of the issues described at [](troubleshooting)
+1. ✅ Check if you are dealing with any of [the most commonly seen problems](https://infrastructure.2i2c.org/sre-guide/common-problems-solutions/) and try to fix it.
     1. ❌ If not, then open a new GitHub issue, sharing as much context from the previous steps as possible and continue with the [non-incident response process](https://compass.2i2c.org/projects/managed-hubs/support/#non-incident-response-process)
 
 ## Category 2: New feature requested
@@ -28,8 +33,8 @@ If something is not working, you might be dealing with an incident, so depending
 :header-rows: 1
 
 * - Is the feature requested documented at [](hub-features)?
-* - ☑ Yes? Then enable it after checking it is in the scope of the contract.
-* - ▫️ No? Then open a GitHub tracking issue about it and continue following the non-incident process.
+* - ✅ Yes? Then enable it after checking it is in the scope of the contract.
+* - ❌ No? Then open a GitHub tracking issue about it and continue following the non-incident process.
 ```
 
 ## Category 3: Technical advice
@@ -38,6 +43,6 @@ If something is not working, you might be dealing with an incident, so depending
 :header-rows: 1
 
 * - Is the question about an area that the support triager has insight into?
-* - ☑ Yes? Then answer the ticket.
-* - ▫️ No? Then open a GitHub tracking issue about it and continue following the non-incident process
+* - ✅ Yes? Then answer the ticket.
+* - ❌ No? Then open a GitHub tracking issue about it and continue following the non-incident process
 ```