Title: Service Not Available for All Customers Due to Network Firewall Software Defect and DDOS Attack
Date/Time: Friday, December 15, 2017, 9:55 AM to 1:15 PM Eastern US Time (200 Minutes)
Description/Customer Impact: The Quick Base platform became unavailable to all customers due to a software defect in our network firewall that caused it to effectively stop passing traffic between the Internet and the Quick Base platform. The defect was activated by a Network Time Protocol (NTP) amplification distributed denial of service (DDOS) attack (https://www.cloudflare.com/learning/ddos/ntp-amplification-ddos-attack/).
There was no breach of the security apparatus protecting the Quick Base platform and no theft of any data.
There was no loss of data during the incident.
Resolution: We worked with our colocation hosting provider to identify the specific type of offending traffic coming from the DDOS attack: UDP traffic on port 123, the port used by the Network Time Protocol (NTP). Our hosting provider then routed all Internet traffic destined for the Quick Base platform through their secure “scrubbing network,” which filtered out the DDOS traffic and passed all other traffic through to the Quick Base platform. Once the traffic filtering took effect, the network firewall immediately returned to normal operation. With that, we were able to safely reopen access to Quick Base for customers.
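To illustrate the kind of filtering described above, here is a minimal sketch in Python. It is not the hosting provider's actual scrubbing implementation, which is not described in this report. The `Packet` type and the `is_ntp_flood` and `scrub` functions are hypothetical names introduced only for this example, and matching on either source or destination port 123 is an assumption.

```python
from dataclasses import dataclass

NTP_PORT = 123  # UDP port used by the Network Time Protocol


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified view of a packet header."""
    protocol: str   # e.g. "udp" or "tcp"
    src_port: int
    dst_port: int


def is_ntp_flood(packet: Packet) -> bool:
    """Return True for UDP traffic on port 123.

    Assumption: we match on either the source or the destination port,
    since the report only says "UDP traffic on port 123".
    """
    return (
        packet.protocol.lower() == "udp"
        and NTP_PORT in (packet.src_port, packet.dst_port)
    )


def scrub(packets: list[Packet]) -> list[Packet]:
    """Drop the offending traffic and pass everything else through."""
    return [p for p in packets if not is_ntp_flood(p)]


if __name__ == "__main__":
    sample = [
        Packet("udp", 123, 40000),   # reflected NTP response: dropped
        Packet("tcp", 51000, 443),   # ordinary HTTPS request: passed
        Packet("udp", 53, 33000),    # other UDP traffic: passed by this filter
    ]
    for packet in scrub(sample):
        print(packet)
```

The point of the sketch is only that the filter is narrow: it removes one protocol and port combination and leaves all other traffic untouched.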
Note that the Quick Base platform does not allow UDP traffic, but our firewall still has to receive that traffic in order to drop it. The volume of UDP traffic that triggered the software defect in the firewall was less than one-tenth of the traffic volume that our carrier-grade firewalls are rated to handle. When working as designed, our firewalls should therefore have handled the DDOS attack easily, without any noticeable impact on the Quick Base platform or our customers.
1. We are actively working with our firewall vendor on a patch for the software defect that was at the root of this incident. We do not yet have an ETA from the vendor for the software fix. In the meantime, we are also working with them to determine alternative mitigation steps we can take to return a firewall to normal service if the defect is activated again.
2. We are actively working with our hosting provider to ensure future DDOS attack traffic can be more quickly routed through their secure scrubbing network.
3. We have been actively testing a DDOS protection service from Cloudflare since October and were previously planning a platform-wide rollout starting in late January. Cloudflare would have automatically prevented this specific DDOS attack. We are working to accelerate the rollout of Cloudflare and hope to have more news on that in early January.
4. We spent considerable time during the incident working to remediate what we first believed was a hardware failure in our highly available firewall implementation, including switching to our backup data center. The aforementioned firewall defect masked the true nature of the issue. We have already made several adjustments to our monitoring systems that will allow us to diagnose the root cause more quickly in the future.
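The reasoning in the note above (traffic well under rated capacity, yet the firewall stopped passing traffic) can be sketched as a simple diagnostic heuristic. This is an illustration only, not a description of our actual monitoring changes. The function name, the labels it returns, the units, and the 0.8 overload threshold are all hypothetical.

```python
def classify_firewall_outage(
    observed_volume: float,
    rated_volume: float,
    passing_traffic: bool,
    overload_ratio: float = 0.8,  # hypothetical threshold, not a vendor figure
) -> str:
    """Give a rough first guess at why a firewall is not passing traffic.

    observed_volume and rated_volume must use the same unit (for example
    packets per second); the unit itself does not matter for the ratio.
    """
    if rated_volume <= 0:
        raise ValueError("rated_volume must be positive")
    if passing_traffic:
        return "healthy"
    utilization = observed_volume / rated_volume
    if utilization >= overload_ratio:
        # Traffic is near or above what the device is rated for.
        return "overload"
    # The device is well within its rating but still not forwarding:
    # look for a software defect or hardware fault rather than capacity.
    return "suspected-defect"


if __name__ == "__main__":
    # Roughly the situation in this incident: traffic at less than
    # one-tenth of rated capacity, firewall not passing traffic.
    print(classify_firewall_outage(1.0, 16.0, passing_traffic=False))
```

A check of this kind does not identify the defect itself. It only separates "the device is overwhelmed" from "the device is failing at a load it should handle", which is the distinction that was hard to see during the incident.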
Document Edit History:
Version 1.0, Created December 19, 2017