Do you know how to name a new server, export configuration data, or fix an annoying bug that keeps reappearing? Well-prepared IT professionals find this information in a runbook.
Runbooks are standardized documents, procedures, and references that explain the most common recurring IT tasks. Rather than figuring out the same issue again and again, you can refer to the runbook to find the best way to handle the work. With documentation in place for training, you can also delegate tasks and onboard employees much more effectively. As an SRE, it is important to know how to create runbooks and how to benefit from them.
When it comes to your time, the less of it you spend figuring out how to do something, the better for your business efficiency and productivity, and for your own sanity.
Keep reading to learn more about runbooks, different documentation options, and some processes you can use in your own business.
Getting To Know the Two Runbook Options
Runbook is a broad term. It usually refers to one of two types of documentation. The first is general documentation, which a sysadmin updates whenever a procedure is created or changes. The second is specialized technical documentation, which is written for one system, one use case, or one team.
A single sysadmin or IT department will likely have several runbooks to help with routine tasks, and these serve as a reference manual for all types of special cases.
Runbook documentation is precise. It is specific to the systems your business runs and the configurations you have created. A generic process cannot cater to your current procedures and systems, but it can serve as a template or demonstration that you fill in with your common day-to-day tasks.
Specialized documentation in particular will vary from one organization to the next.
Disaster Recovery Runbook
Besides the system-specific documentation, most businesses and organizations also create documentation for specific use cases. The most important of these is disaster recovery, which must be executed properly and quickly.
Many businesses only create a disaster recovery plan after an issue has occurred. The top causes of costly downtime include network failures, hardware failures, and power outages. You have to know how to respond to each of these incidents.
Tips for Creating a Runbook
Each runbook is unique, and its content will be specific to your organization’s needs. However, the methodology is the same as for any other kind of process creation. The first step is to determine which procedures should be documented in the runbook. After creating that list, write each procedure up in detail. Once you have field-tested a process, you can make optimizations and updates as needed.
According to a former Google employee, a runbook should be made up of seven sections.
- Overview: An overview of the service: what it is, who owns it, the main contacts, and the steps for reporting bugs or other information.
- Build: The right way to build the software that makes up the service, where it can be downloaded, where the code repository is located, and more. Make sure to include instructions on how a new developer can get started.
- Deploy: The right way to deploy the software, including how to build a server from scratch and more.
- Common Tasks: Instructions for things like common issues and their solutions.
- Pager Playbook: A list of every alert the monitoring system could generate for the service, so users know what to do when one appears.
- DR: The disaster recovery plans and procedures. If the system or machine failed or died, how would you fail over to the cold or hot spare?
- SLA: A Service Level Agreement is a contract made with customers. It usually includes things like the RTO (recovery time objective), RPO (recovery point objective), and uptime goal.
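One way to keep a runbook honest is to check a draft against the seven sections listed above. The following is a minimal Python sketch, not a prescribed tool: the section names come from the list, while the function names and the sample draft are hypothetical and only there for illustration.

```python
# Section names are taken from the seven-section list above.
# Everything else here (function names, the sample draft) is hypothetical.
REQUIRED_SECTIONS = [
    "Overview",
    "Build",
    "Deploy",
    "Common Tasks",
    "Pager Playbook",
    "DR",
    "SLA",
]


def missing_sections(runbook: dict[str, str]) -> list[str]:
    """Return required sections that are absent or contain only whitespace."""
    return [
        name
        for name in REQUIRED_SECTIONS
        if not runbook.get(name, "").strip()
    ]


def render_skeleton() -> str:
    """Build a plain-text template with one heading per required section."""
    return "\n\n".join(f"{name}\nTODO: fill in" for name in REQUIRED_SECTIONS)


# Hypothetical draft for an imaginary service.
draft = {
    "Overview": "Billing API, owned by the payments team.",
    "Deploy": "   ",  # whitespace only, so it counts as missing
    "SLA": "Uptime goal 99.9%.",
}
print(missing_sections(draft))
# ['Build', 'Deploy', 'Common Tasks', 'Pager Playbook', 'DR']
```

A check like this could run whenever a runbook is updated, so that an empty Pager Playbook or DR section is noticed before an incident rather than during one.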
How you structure these sections is up to you. However, it is a good idea to keep the non-interactive documents in your cloud storage. After that, you can use your preferred workflow app to record and assign the most common tasks.
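When you write up the SLA section, it can help to turn the uptime goal into a concrete downtime budget, since "99.9%" is harder to reason about than "about 43 minutes a month". The sketch below shows that arithmetic in Python. The function name, the 99.9% goal, and the 30-day period are illustrative assumptions, not values from any particular agreement.

```python
from datetime import timedelta


def downtime_budget(uptime_percent: float, period: timedelta) -> timedelta:
    """Allowed downtime within `period` for an uptime goal given in percent."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    # Budget = period * (1 - uptime fraction).
    return period * (1 - uptime_percent / 100)


# Hypothetical goal: 99.9% uptime measured over a 30-day month.
budget = downtime_budget(99.9, timedelta(days=30))
print(round(budget.total_seconds() / 60, 1))  # 43.2 (minutes)
```

Comparing this budget with the RTO in the same section is a quick sanity check: if a single recovery is allowed to take longer than the whole monthly budget, the two numbers contradict each other.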
Once you know what to include, you can create a runbook for any action or piece of software. Keep the tips and advice here in mind to get the best results from the documentation you create.