A question we rarely get, though we believe it should be asked, is: how does your service ensure uptime?
The first level of failover is data center failover. If a data submission endpoint is unavailable, traffic fails over to a submission endpoint in another region.
What does "unavailable" mean for a data submission endpoint? Unified Logging checks its data submission endpoints for availability every 30 seconds. The check confirms that every resource the endpoint needs to accept messages is available, not just that the endpoint can be reached.
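As a rough illustration of that idea, here is a minimal Python sketch of an availability check that looks at every dependency rather than connectivity alone, and then picks the first healthy region. The names (`endpoint_is_available`, `pick_endpoint`, `region-a`, `region-b`) and the individual checks are hypothetical and are not Unified Logging's actual code. A scheduler would run something like this every 30 seconds.

```python
from typing import Callable, Dict, Optional

# Hypothetical type: each named check returns True when its resource is healthy.
ResourceChecks = Dict[str, Callable[[], bool]]


def endpoint_is_available(checks: ResourceChecks) -> bool:
    """Return True only if every resource check for the endpoint passes."""
    for check in checks.values():
        try:
            if not check():
                return False
        except Exception:
            # A check that raises is treated the same as a failed check.
            return False
    return True


def pick_endpoint(endpoints: Dict[str, ResourceChecks]) -> Optional[str]:
    """Return the first available endpoint in preference order, or None."""
    for name, checks in endpoints.items():
        if endpoint_is_available(checks):
            return name
    return None
```

In this sketch, an endpoint that is reachable but whose cache check fails counts as unavailable, so `pick_endpoint` moves on to the next region.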
A more granular level of failover answers a different question: what should happen if the data cache that the data submission endpoint depends on to accept messages fails?
Unified Logging has a double failover in this scenario. The data cache is a critical part of accepting messages as fast as possible. If the cache fails, the endpoint first fails over to retrieving the data it needs from the database. If the database is unavailable, it then fails over to retrieving a copy of the data from Azure Blob Storage.
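The cache, then database, then Blob Storage order can be sketched as an ordered fallback chain. This is a simplified Python illustration under assumed names (`load_with_failover`, `AllSourcesUnavailable`, and the loader callables are hypothetical), not the service's real implementation. Each loader stands in for whatever client call actually reads from that source.

```python
from typing import Any, Callable, List, Sequence, Tuple


class AllSourcesUnavailable(Exception):
    """Raised when every source in the chain has failed."""


def load_with_failover(
    sources: Sequence[Tuple[str, Callable[[], Any]]]
) -> Tuple[str, Any]:
    """Try each (name, loader) in order and return the first success."""
    errors: List[str] = []
    for name, loader in sources:
        try:
            return name, loader()
        except Exception as exc:
            # Record the failure and fall through to the next source.
            errors.append(f"{name}: {exc}")
    raise AllSourcesUnavailable("; ".join(errors))
```

A caller would pass the sources in priority order, for example `[("cache", read_cache), ("database", read_db), ("blob", read_blob)]`, where the three loader functions are placeholders for the real reads.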
Why a double failover?
A data cache failure is not unheard of, and neither is a database outage (even in a clustered environment), so a third level of failover was put in place for Unified Logging to ensure message submission would always be available.