Businesses rely on IT systems and data for daily operation and even survival. Threats from bad actors, physical infrastructure failure, data corruption, and human error are increasing, and they persist in both the traditional datacenter and the cloud.
Businesses strive to keep systems running, and plan for restoring services as quickly as possible when things inevitably fail. Any restoration or recovery must happen quickly with minimal data loss. But how do we qualify ‘running’, how do we measure ‘quickly’, and how do we determine ‘minimal’? What’s more, how do these definitions and goals fit within our real-world budgets?
Business Continuity and Disaster Recovery (BCDR) goals are designed for and measured by how quickly and how completely losses can be mitigated. The Recovery Time Objective (RTO) defines the expectation of how quickly a system, service or data must be restored; in other words, the maximum acceptable downtime. The Recovery Point Objective (RPO) defines how current any recovered data or system must be once restored. RPO inherently describes the acceptable amount of data loss.
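To make the two objectives concrete, here is a minimal Python sketch that checks a single recovery against an RTO and an RPO. The class and function names, and the example values of four hours and fifteen minutes, are illustrative assumptions and not part of any Azure API.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Hypothetical container for the two BCDR targets."""
    rto: timedelta  # maximum acceptable downtime
    rpo: timedelta  # maximum acceptable data loss, expressed as a time window


def evaluate_recovery(objectives, downtime, data_loss):
    """Report whether an observed recovery stayed within each objective."""
    return {
        "rto_met": downtime <= objectives.rto,
        "rpo_met": data_loss <= objectives.rpo,
    }


# Assumed targets: back online within 4 hours, lose at most 15 minutes of data.
objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(minutes=15))

# An example incident: restored in 3 hours, but 30 minutes of data was lost.
result = evaluate_recovery(
    objectives,
    downtime=timedelta(hours=3),
    data_loss=timedelta(minutes=30),
)
print(result)  # {'rto_met': True, 'rpo_met': False}
```

In this example the service came back in time, yet the recovery still missed its goals because the restored data was older than the RPO allows. The two objectives are measured independently.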
Prevention is the most cost-effective way of avoiding disaster. Azure includes a rich suite of products to protect systems, documents, data, and identities. Azure also provides for redundancy at every turn, helping make your environment as resilient as possible. Backups protect from data loss and corruption, while replication protects from system loss or disruption. Most businesses will choose to deploy both solutions in addition to standard preventative measures.
Backups create a historical series of recovery points over time. The resources that are protected by backups can be returned to any one of those points in time, or any of those backups can be recovered and inspected as an archival record. Restoring from backup involves inherent data loss: everything that changed between the last good restore point and the point of failure is gone. Your RPO will determine your backup frequency, and your need for historical recovery points will determine your retention policies.
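The data loss from a restore can be estimated as the gap between the failure and the last good restore point. The Python sketch below works through that arithmetic for a hypothetical daily backup schedule. The function names, the dates, and the four-hour RPO are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta


def last_good_restore_point(restore_points, failure_time):
    """Return the most recent restore point taken at or before the failure."""
    candidates = [p for p in restore_points if p <= failure_time]
    return max(candidates) if candidates else None


def data_loss_window(restore_points, failure_time):
    """Time between the last good restore point and the failure."""
    point = last_good_restore_point(restore_points, failure_time)
    if point is None:
        raise ValueError("no restore point precedes the failure")
    return failure_time - point


# Hypothetical schedule: one backup per day at 01:00 on January 1 to 3.
restore_points = [datetime(2024, 1, day, 1, 0) for day in range(1, 4)]
failure = datetime(2024, 1, 3, 13, 30)

loss = data_loss_window(restore_points, failure)
print(loss)  # 12:30:00

# Assumed RPO of 4 hours: a daily schedule cannot satisfy it in this case.
rpo = timedelta(hours=4)
print(loss <= rpo)  # False
```

With backups taken at a fixed interval, the worst-case loss approaches the full interval, so the backup interval generally needs to be no longer than the RPO.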
Azure Backup protects files, system state, SQL databases, and other workloads, and can target systems in physical environments, on other hypervisors, and in Azure. Retention policies are managed without a need to worry about full, incremental, or differential backups. During recovery, the administrator is presented with a list of backups over time from which to restore. Depending on the type of resource that was protected, the administrator can create a new VM, restore a volume (mount a VHD), or restore a database. With a few clicks you’ll have quick, reliable access to your data.
Replication creates a real-time copy of a system or information which can be brought online to replace a lost resource very quickly. It is used to bring the latest, single snapshot of the protected resource back into service as quickly as possible. Replication offers minimal data loss, but no historical data points.
Azure Site Recovery (ASR) creates a synchronized, standby copy of your systems whether they are physical, Hyper-V, VMware, AWS, or other Azure VMs. Should a replicated resource go offline, an Azure administrator can quickly bring the system back online in Azure.
Restorations and Disaster Recovery operations do not happen during low-stress times. We all know the story about the company that kept backups diligently for years, only to find out their tapes were corrupted. Having the right tools and plans in place will make sure you have what you need to recover. Having the operations in those plans committed to muscle memory gives confidence to the team, the executives, and the shareholders.