All systems are operational

Past Incidents

Sunday 4th March 2012

Network: Pingdom reporting connectivity issues

We use two forms of monitoring: Pingdom (an external service) and our own monitoring platform (also external).

Within the last 15 minutes, we have received several Pingdom notifications reporting connectivity dropping and immediately coming back up. However, this does not correspond with our own monitoring reports.

Both Pingdom’s monitoring service and Pingdom’s FPT are showing strange results; however, other third-party services are reporting no issues.

At the moment, we are investigating what is going on, but it looks to be an issue with Pingdom rather than our connectivity. Enquiries are under way.
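
For readers curious how the cross-check works: the value of running two independent monitors is that a false alarm from one (as we suspect with Pingdom here) can be verified against the other. Below is a minimal sketch of the kind of external HTTP probe such a service runs; the URL, probe count and timeout are illustrative assumptions, not our actual monitoring configuration.

    # Minimal external uptime probe (Python 3), illustrative only.
    # TARGET, CHECKS and TIMEOUT are placeholders, not our real settings.
    import time
    import urllib.request

    TARGET = "https://www.example.com/"   # hypothetical endpoint to probe
    CHECKS = 5                            # number of probes to run
    TIMEOUT = 10                          # seconds before a probe counts as down

    failures = 0
    for i in range(CHECKS):
        start = time.monotonic()
        try:
            with urllib.request.urlopen(TARGET, timeout=TIMEOUT) as resp:
                elapsed = time.monotonic() - start
                print(f"check {i + 1}: HTTP {resp.status} in {elapsed:.2f}s")
        except Exception as exc:
            failures += 1
            print(f"check {i + 1}: DOWN ({exc})")
        time.sleep(1)

    print(f"{failures}/{CHECKS} probes failed")

If a probe like this reports failures from one vantage point while another external monitor sees none, the discrepancy usually points at the monitor or its network path rather than at the monitored service, which is exactly what we suspect here.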

Saturday 3rd March 2012

No incidents reported

Friday 2nd March 2012

No incidents reported

Thursday 1st March 2012

Network: 27/02/2012 downtime explanation

Our report on the incident of 27/02/2012 is as follows.
 
Issue

DDoS attack on our transit provider’s network

Underlying cause

External high-volume attack from multiple sources targeting a customer subnet

Symptoms

Intermittent loss of service on multiple subnets

Resolution

  1. 9:31pm 27/02/2012: the network monitoring and NOC team saw a sustained DDoS attack on the network of around 3-4Gbit per second from around 2,000-3,000 hosts. Traffic was received over all four carriers from both sites
  2. 9:41pm: port security violation limits were hit on one of our carrier upstreams, which took that carrier offline and increased the load on the remaining carriers
  3. 10:05pm: the affected IP subnet block was identified and network engineers began null routing it from the network core (an illustrative sketch of this follows the list)
  4. 10:30pm: Delta House network connectivity was restored
  5. 10:41pm: partial traffic was restored on the Reynolds House network
  6. 10:45pm: the full network was restored, DDoS traffic was being held back by the border routers and full customer services were restored
  7. 2:30am: the border routers were amended to drop further packets from the attack
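
As a rough illustration of the null routing carried out in step 3, the sketch below installs a blackhole route on a Linux host with iproute2, so traffic destined for the targeted prefix is silently discarded. It is a simplified stand-in for the vendor-specific configuration applied on our core and border routers, and the prefix used is a documentation range, not a real customer subnet.

    # Illustrative null routing (blackholing) of a subnet on a Linux host.
    # Requires root; our core routers use vendor-specific equivalents.
    import subprocess

    ATTACKED_SUBNET = "192.0.2.0/24"  # placeholder for the targeted customer subnet

    def null_route(prefix: str) -> None:
        """Install a blackhole route so packets to `prefix` are dropped."""
        subprocess.run(["ip", "route", "add", "blackhole", prefix], check=True)

    def remove_null_route(prefix: str) -> None:
        """Withdraw the blackhole route once the attack has subsided."""
        subprocess.run(["ip", "route", "del", "blackhole", prefix], check=True)

    if __name__ == "__main__":
        null_route(ATTACKED_SUBNET)

The principle is the same at any scale: drop traffic to the attacked prefix as close to the network edge as possible so it cannot consume shared bandwidth, then withdraw the route once the attack subsides.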

From the information gathered so far, the evidence points to a single attack directed at one customer.

The team are still looking through logs and progressing the incident with the relevant authorities, and further measures are being put in place to mitigate such attacks in future.

Wednesday 29th February 2012

No incidents reported

Tuesday 28th February 2012

Network: Core issue resolved

We have had a brief chat with the data centre team, and the root cause of last night's downtime is believed to be a broad DDoS attack across a number of subnets peripheral to our own network, but substantial enough to saturate the 10Gb uplinks to our peers.

A formal investigation is under way at present; however, we have been assured that our own connectivity should not be affected any further.

We would like to apologise for the outage last night, which spanned 11 minutes in total, but we hope our proactive response to the situation and the clarity of information throughout were of some benefit to concerned customers.

We are currently discussing means of preventing this from happening again; however, as the attack was not directed at subnets within our own network, it will still be hard to mitigate.

For reliability and performance, we hand off BGP to our upstream provider, who uses multiple peers and handles external (internet) routes on our behalf. However, this was our downfall: when another customer of theirs fell victim to a DDoS attack, it saturated the shared transit uplinks, affecting the entire data centre.

We do not doubt our current peers/transit providers; they have served us well, with 3 years of 100% network connectivity, and we have full faith in their ability to deal with future issues.

Monday 27th February 2012

Network: Experiencing high packet loss

Connectivity was mostly restored after a few small windows of downtime, but routes are flapping at the moment.

Engineers are still working on a resolution and on identifying the root cause, but at present we are awaiting updates.

What we know

The issue is outside of Sonassi Hosting’s network: our transit provider is experiencing difficulties at the data centre, which is not something we can remedy ourselves. They have engineers on site working on a fix.

We still have 100% power and 100% cooling, and our internal network (from the edge inwards) is 100% functional; however, inbound/outbound national routes are flapping.
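
For customers who want to measure the impact from their own location, a simple loss check such as the sketch below (which wraps the system ping utility) is enough to see the intermittent drops caused by the flapping routes. The target address and probe count are placeholders for illustration, not an address we are asking you to test against.

    # Rough packet-loss check by wrapping the system `ping` utility (Linux flags).
    # TARGET and COUNT are placeholders for the example.
    import subprocess

    TARGET = "192.0.2.1"   # placeholder address; substitute your own host
    COUNT = 20             # number of echo requests to send

    def packet_loss(target: str, count: int) -> float:
        """Return the fraction of ICMP echo requests that received no reply."""
        received = 0
        for _ in range(count):
            # `-c 1 -W 1`: one probe with a 1 second timeout (adjust on non-Linux).
            result = subprocess.run(
                ["ping", "-c", "1", "-W", "1", target],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            if result.returncode == 0:
                received += 1
        return 1 - received / count

    if __name__ == "__main__":
        print(f"packet loss to {TARGET}: {packet_loss(TARGET, COUNT):.0%}")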

First ever significant outage

This is our first ever significant outage in 3 years of operations and certainly not what our clients are accustomed to.

We would like to reassure all customers that we will remain available here and on Twitter (@sonassi, @sonassihosting) if you want to talk to us directly.

Network: Experiencing high packet loss

The issue looks to be much further upstream than our network. Technicians are on site from our transit provider and are working hard to achieve a resolution.

Another update will be given within the hour.

Network: Experiencing high packet loss

Still under investigation, but it looks like a DDoS attack across a number of subnets. We are looking to null route the affected subnets whilst we investigate.
