Disaster Recovery Plan Notes
We have engineered the CustomerGauge system so that, in the event of a hardware or software outage, we can perform a “fast failover” to minimize disruption.
- Auto Scaling: We take advantage of the Amazon EC2 infrastructure to “auto scale”. For example, we typically run 3 servers in each geography to serve the CustomerGauge application. On a busy day, the system automatically launches up to 8 servers to absorb load spikes, normally when CPU usage exceeds 30%.
- This infrastructure also lets us rapidly deploy an instance of CustomerGauge in one of several Amazon data centres should a disaster strike in one centre.
- Client data is stored in Amazon RDS server infrastructure. Each client has a separate database, which is backed up approximately every hour with a “snapshot” image. Amazon RDS automatically patches the database software and backs up the databases, storing the backups for a standard 15-day retention period and enabling point-in-time recovery.
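The auto-scaling behaviour described in the first bullet above can be illustrated with a minimal Python sketch. This is not the actual Amazon EC2 Auto Scaling configuration: the function name, the one-server-per-evaluation step, and the reading of "up to 8 servers" as a cap on the total are all assumptions made for illustration. Scaling back in is not described in these notes, so it is not modelled.

```python
def desired_server_count(current: int, cpu_percent: float,
                         min_servers: int = 3, max_servers: int = 8,
                         scale_out_threshold: float = 30.0) -> int:
    """Return the server count after one (hypothetical) scaling evaluation.

    Assumptions: one server is added per evaluation while CPU usage is above
    the threshold, and the fleet is kept between min_servers and max_servers.
    """
    target = current + 1 if cpu_percent > scale_out_threshold else current
    # Clamp to the configured fleet size limits.
    return max(min_servers, min(max_servers, target))


if __name__ == "__main__":
    print(desired_server_count(3, 45.0))  # CPU above 30%: one server is added
    print(desired_server_count(8, 90.0))  # already at the cap: stays at 8
```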
We offer our Enterprise clients the Multi-AZ data deployment as standard. When we provision this Multi-AZ DB Instance, Amazon RDS automatically creates a primary DB instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ). Each AZ runs on its own physically distinct, independent infrastructure, and is engineered to be highly reliable. In case of an infrastructure failure (for example, instance crash, storage failure, or network disruption), Amazon RDS performs an automatic failover to the standby so that we can resume database operations as soon as the failover is complete. From a technical point of view, the endpoint for the DB Instance remains the same after a failover, so the CustomerGauge application resumes database operation without the need for manual administrative intervention.
- This “fast failover” typically takes around 2 to 3 minutes, and we test it approximately once every 3 months.
- You have access to your data 24×7 should you wish to export it for your site.
- For maximum disaster recovery, you may want to consider a backup stored in another geography, should data privacy laws allow. We can arrange this; please contact us for further details.
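Because the database endpoint stays the same after a Multi-AZ failover, an application generally only needs to keep retrying its connection until the standby has taken over. The following Python sketch shows one way this could look. It is not CustomerGauge's actual code: `connect_with_retry`, its parameters, and the use of `ConnectionError` are assumptions for illustration (real database drivers raise their own exception types). The defaults of 40 attempts at 5-second intervals give a retry window of a little over three minutes, chosen to cover the typical failover time quoted above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(connect: Callable[[], T],
                       max_attempts: int = 40,
                       delay_seconds: float = 5.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() until it succeeds or max_attempts is reached.

    connect is any zero-argument callable that opens a connection to the
    (unchanged) database endpoint and raises ConnectionError on failure.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < max_attempts:
                sleep(delay_seconds)  # wait before trying the same endpoint again
    raise ConnectionError(
        f"database still unreachable after {max_attempts} attempts"
    ) from last_error
```

Passing `sleep` as a parameter is a design choice that keeps the sketch testable without real delays.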
We provide “Incident Reports” to clients in the event of outages or loss of service.
- Typical SLA on surveys and API: 99.95% for client-facing surveys (roughly 20 minutes of downtime per month)
- Typical SLA on dashboard and reporting system: 99.5% per month; planned maintenance is performed at weekends wherever possible
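To make the SLA percentages above concrete, the following Python sketch converts an availability percentage into the downtime it permits per month. The function name and the 30-day month are assumptions for illustration; contractual SLA calculations may define the measurement period and exclusions (such as planned maintenance) differently.

```python
def allowed_downtime_minutes(sla_percent: float, days_in_month: int = 30) -> float:
    """Minutes of downtime per month permitted by an availability SLA."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


if __name__ == "__main__":
    # 99.95% over a 30-day month: about 21.6 minutes
    print(round(allowed_downtime_minutes(99.95), 1))
    # 99.5% over a 30-day month: about 216 minutes (3.6 hours)
    print(round(allowed_downtime_minutes(99.5), 1))
```

Under these assumptions, 99.95% corresponds to about 21.6 minutes per month, which is consistent with the roughly 20 minutes quoted for surveys.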
We are able to agree specific terms and conditions for clients above a certain financial commitment, and are happy to do so if the client is entering into a contract of significant size and defined duration.