Some ISPs are reporting connectivity issues.
Post-Mortem
Our report from the incident is as follows.
Issue
Minor network outage
Outage Length
3 seconds
Underlying cause
One of our transit providers (Cogent) experienced a router failure within their network. Increasing CPU usage on their core router caused packets to be progressively dropped.
Symptoms
Our external monitoring probes reported the fault immediately. Some customers whose traffic was routed over Cogent experienced an extremely brief window (<1 minute) of slow page loads or server inaccessibility.
Resolution
Once the packet loss threshold was hit, our internal device that measures BGP latency and packet loss automatically de-preferenced Cogent among the available BGP routes. With Cogent de-preferenced, traffic continued to flow out over our remaining carriers as normal.
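The threshold-based de-preferencing described above can be illustrated with a minimal sketch. This is not the logic of our actual measuring device: the 5% loss threshold, the local-preference values, the second carrier name and all function names are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass

# Assumed value for illustration only; the real threshold is not stated here.
LOSS_THRESHOLD = 0.05


@dataclass
class Carrier:
    name: str
    local_pref: int      # higher value = more preferred route
    loss: float = 0.0    # measured packet loss as a fraction (0.0 to 1.0)


def depreference_lossy(carriers, threshold=LOSS_THRESHOLD, penalty_pref=0):
    """Lower the preference of every carrier whose loss meets the threshold."""
    for carrier in carriers:
        if carrier.loss >= threshold:
            carrier.local_pref = penalty_pref
    return carriers


def best_carrier(carriers):
    """Return the carrier with the highest remaining preference."""
    return max(carriers, key=lambda c: c.local_pref)


# Hypothetical measurements: Cogent is dropping packets, "CarrierB" is healthy.
carriers = [
    Carrier("Cogent", local_pref=200, loss=0.12),
    Carrier("CarrierB", local_pref=100, loss=0.0),
]
depreference_lossy(carriers)
selected = best_carrier(carriers)
```

In this sketch the lossy carrier stays in the list with a lowered preference rather than being deleted, so it would only carry traffic again if no better route remained.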
Convergence took less than 5 seconds, but route propagation at other ISPs may have taken a couple of minutes, which is why some customers may have experienced a slightly longer outage.
Our automation and monitoring systems behaved exactly as designed for this disaster scenario and recovered from the carrier failure in less than 5 seconds.