Outage - Do you have an unplanned outage process?
Last updated by Chloe Lin [SSW] 12 months ago.
As a SysAdmin, you will come across many unplanned outages. Some will impact BAU (business as usual), while others will be minor service outages. Do you know what to do when one occurs?
For planned outages, see Outage - Do you have a planned outage process?
Below is a process for unplanned outages. Some common sense is required here: treat an event as an outage if services that affect BAU work are disrupted and/or hardware has failed.
Hardware Outage:
- Firewall
- Switch
- Blade Servers
- SAN Storage
- UPS
Service Outage:
- Active Directory Domain Services
- O365 Services; Teams, SharePoint, Exchange, OneDrive
- File Servers
- SQL Servers
- IIS Servers
Determining what services are disrupted
Many tools can be used for device monitoring, e.g. WhatsUp Gold, SolarWinds, SCOM. In any of them, you would do the following:
- Log in to the monitoring service
- Check to see what services are down
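If your monitoring tool is itself unavailable, a quick manual check can stand in for the two steps above. The following is a minimal sketch only, assuming that a successful TCP connection is a good enough signal that a service is up. The host names, ports, and the `Service`, `tcp_probe`, and `find_disrupted` names are all hypothetical and are not part of any of the monitoring products listed.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Service:
    """One monitored endpoint. Names and addresses are hypothetical."""
    name: str
    host: str
    port: int


def tcp_probe(service: Service, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the service succeeds."""
    try:
        with socket.create_connection((service.host, service.port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def find_disrupted(services: Iterable[Service],
                   probe: Callable[[Service], bool] = tcp_probe) -> List[str]:
    """Return the names of services whose probe failed, in input order."""
    return [s.name for s in services if not probe(s)]


# Hypothetical inventory (assumption): replace with your own hosts and ports.
SERVICES = [
    Service("Active Directory Domain Services (LDAP)", "dc01.example.internal", 389),
    Service("File Server (SMB)", "fs01.example.internal", 445),
    Service("SQL Server", "sql01.example.internal", 1433),
    Service("IIS Server (HTTPS)", "web01.example.internal", 443),
]
```

Calling `find_disrupted(SERVICES)` returns the list of names to bring to the conference call. The probe is passed in as a parameter so a different check (for example an HTTP request) can be swapped in without changing the rest of the sketch.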
First contact
After you have determined which services have been disrupted, call your SysAdmin team and organize a quick conference call. This allows you to discuss the outage before making any changes or fixes that could make it worse.
Key discussion points:
- What services have been disrupted?
- What is the impact of these disruptions?
- Is an email to everyone in your company required?
- What are your next steps?
What if you cannot reach anyone?
If you cannot reach anyone, move on to the Email section.
Email
If the discussion determined that an email needs to be sent to your entire company, or you could not contact anyone and have decided one is necessary, send an email in the following format:
To: SSWAll
Subject: SysAdmins – Outage Notice
A separate email needs to be sent to SysAdmins outlining what was discussed on the call. If no one was contactable, please proceed with what you have determined on your own.
To: SysAdmins
Subject: SysAdmins – Outage Notice
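Both notices share the same subject line and differ only in recipient, so they can be drafted from one helper. The sketch below uses Python's standard `email.message.EmailMessage`. The email addresses, the `build_notice` name, and the body layout are assumptions made for illustration, since the process above specifies only the To and Subject fields.

```python
from email.message import EmailMessage
from typing import List

# Subject line taken from the format given above.
SUBJECT = "SysAdmins – Outage Notice"

# Hypothetical addresses standing in for the SSWAll and SysAdmins lists.
ALL_STAFF = "sswall@example.com"
SYSADMINS = "sysadmins@example.com"


def build_notice(to: str, sender: str, disrupted: List[str],
                 impact: str, next_steps: str) -> EmailMessage:
    """Draft an outage notice. The body layout is an assumption, not a standard."""
    msg = EmailMessage()
    msg["To"] = to
    msg["From"] = sender
    msg["Subject"] = SUBJECT
    lines = ["Services disrupted:"]
    lines += [f"- {name}" for name in disrupted]
    lines += ["", f"Impact: {impact}", f"Next steps: {next_steps}"]
    msg.set_content("\n".join(lines))
    return msg
```

The resulting message can be handed to `smtplib.SMTP(...).send_message(msg)`, assuming you have an SMTP relay that is not itself affected by the outage.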
Next steps did NOT resolve the issue
If you have completed your tasks but the issue has not been resolved, try to contact the SysAdmin team again and send an updated 'To Myself' email.
Next steps resolved the issue
If your actions have resolved the issue, notify everyone that the services have been restored and update your 'To Myself' email.
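The two outcomes above amount to a simple branch, which can be written down as a checklist helper for a runbook script. This is only an illustrative sketch, and the `follow_up_actions` name is hypothetical.

```python
from typing import List


def follow_up_actions(resolved: bool) -> List[str]:
    """Return the follow-up steps for the outcome of your next steps."""
    if resolved:
        return [
            "Notify everyone that the services have been restored",
            "Update your 'To Myself' email",
        ]
    return [
        "Try to contact the SysAdmin team again",
        "Send an updated 'To Myself' email",
    ]
```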