We have become aware of a core network issue with one of our providers. Updates to follow.
Our report on the incident is as follows.
Packet loss affecting worldwide connectivity.
One of our carriers had notified us that maintenance would be conducted between 12am and 6am; we had made provisions for this and expected our systems to switch over automatically while that carrier was down.
However, the failover did not occur as planned, because our other carrier also appeared to be affected.
Our internal and external monitoring probes immediately reported a fault.
The BGP sessions could not be re-established despite our efforts. As a last resort, an on-site technician rebooted both routers, one after the other. This re-established the BGP sessions and connectivity was restored.
We believe there may be commonality between the carriers (shared fibre/conduits/backhauls). An investigation has been launched to determine why and how both were affected simultaneously.
Services are being closely monitored, and we are likely to conduct some failover tests throughout the week to verify the failover and guard against future failures.