Notes on Distributed Denial of Service (DDoS)
Hari Balakrishnan
November 2005

Machines on the Internet are notoriously insecure, and it is relatively
straightforward to compromise them. Once compromised, these "zombie" or "bot"
hosts can be used to mount a variety of attacks on Internet services and Web
sites. The creativity of attacks observed in the wild has been impressive.

There are numerous kinds of DDoS attacks. We divide them into three broad
categories; each category has several examples.

1. Packet floods: Packet flooding attacks (e.g., TCP SYN floods, ping floods,
ICMP reply floods) attempt to exhaust link bandwidth (usually at a server's
access link).

2. Kernel state/resource attacks: These attacks target kernel state at a
machine (e.g., causing a server to create a large number of TCP control
blocks, use up other kernel resources, etc.).

3. Attacks on higher-level services running on a machine: Examples include
making search or other queries at a site that involve expensive database
operations, or reducing a server's available CPU or memory by causing it to
perform expensive operations.

These attacks are increasingly carried out by criminal elements whose intent
goes beyond mere "bragging rights". For instance, extortion by "cybermafia"
is an increasingly significant motive for attacks; several recent FBI cases
attest to this trend.

DDoS has motivated many recent research and commercial projects. We divide
the flood of work in this area into two broad categories:

1. Attack prevention schemes.
2. Attack detection schemes.

Attack prevention:
------------------

-- Ingress filters. RFC 2827 advocates ingress filtering to prevent source IP
address spoofing. If widely deployed at the "edges" of the Internet, such
filters can greatly reduce the damage done by spoofed addresses. Deployment
today appears to be mixed: an edge ISP has little incentive to deploy filters
to protect a remote site that isn't even a customer, so the filters aren't
widely deployed. But there does not appear to be much spoofing going on
(thresholds for detecting anomalous activity at an edge are high), and
because there isn't much spoofing, ISPs don't feel the need to deploy
filters.

-- Forced three-way handshakes. If ingress filters are not widely deployed,
and if source address spoofing is common, then the general principle for
protecting server resources (e.g., kernel state) is to force a three-way
handshake and create state only after the handshake completes. As long as the
attacker cannot receive packets at the spoofed address, this scheme prevents
spoofed packets from causing damage at a server. Protecting link bandwidth is
not possible with this method, of course, and given the ease with which
machines are compromised, most attacks appear to end up saturating links with
traffic from thousands of (unspoofed) machines anyway.
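To make the handshake principle concrete, here is a minimal sketch in the
spirit of SYN cookies, the best-known instance of this idea. The MAC
construction and bit layout below are assumptions for exposition, not the
actual Linux scheme (which, e.g., also encodes the negotiated MSS in the
cookie).

    import hashlib, hmac, time

    SECRET = b"per-server secret; rotate periodically"

    def syn_cookie(src, sport, dst, dport, client_isn):
        """Encode the connection 4-tuple into the server's initial
        sequence number, so no state is kept when the SYN arrives."""
        t = int(time.time()) >> 6      # coarse timestamp (64 s ticks)
        msg = f"{src}:{sport}-{dst}:{dport}-{client_isn}-{t}".encode()
        mac = hmac.new(SECRET, msg, hashlib.sha256).digest()
        # low 8 timestamp bits + 24 bits of MAC = 32-bit "cookie" ISN
        return ((t & 0xFF) << 24) | int.from_bytes(mac[:3], "big")

    def ack_completes_handshake(src, sport, dst, dport, client_isn, ack):
        """On the final ACK, recompute the cookie; allocate a TCP
        control block only if it matches, i.e., only after the
        three-way handshake has really completed."""
        cookie = (ack - 1) & 0xFFFFFFFF   # ACK acknowledges server ISN + 1
        t_now = int(time.time()) >> 6
        for t in (t_now, t_now - 1):      # tolerate one tick of delay
            msg = f"{src}:{sport}-{dst}:{dport}-{client_isn}-{t}".encode()
            mac = hmac.new(SECRET, msg, hashlib.sha256).digest()
            if cookie == (((t & 0xFF) << 24) | int.from_bytes(mac[:3], "big")):
                return True
        return False

A spoofing attacker never sees the cookie, so it cannot produce a valid ACK,
and the server commits no memory until a valid ACK arrives.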
-- Filters/pushback: Deploy filters based on schemes that detect anomalous
activity. E.g., an Intrusion Detection System (IDS) on a site's network
detects an anomaly and causes filters to be installed. Commercial products
from Mazu, Arbor, etc. work in this fashion. (Anomalies are detected using a
variety of schemes and heuristics.) The protection mechanism itself is quite
simple: filters implemented in router access control lists. Research versions
of this idea include "pushback" and AITF, both of which advocate pushing
filters as close to the attack source as possible. Of course, one needs to
know where the attack traffic is coming from. If source addresses are not
spoofed, this part is trivial (a victim still may not have the right or
ability to get filters installed at third-party sites, but at least it can
install local filters). But what happens when source addresses _are_ spoofed?
IP traceback schemes address this problem (see "Attack detection" below).

In general, for filters/pushback to really work, one needs a way to
authenticate requests for filter deployment. Making that work at filters far
from the victim, at locations with which the victim has no commercial or
other relationship, isn't easy. Hence, in practice, these schemes are
deployed near the victim (unless the filter requests are made at the behest
of entities elsewhere on the attack path).

-- Capabilities: In operating systems, ACLs and capabilities are duals, and
recent work has shown the same duality for protecting networks against DDoS.
One such scheme is TVA. The high-level idea in these schemes is for a sender
to first ask the destination for permission to send packets to it. If the
destination agrees, it grants a {\em capability} to the sender. The nice
property of this capability is that during the request phase, routers along
the path from sender to destination place information in the packets, and the
resulting capability is conceptually just the sequence of these {\em
pre-capabilities}. The capability can be verified by each router and cannot
easily be forged.

The TVA details are as follows. Each router places a pre-capability in the
request packet:

    timestamp | H(src, dst, timestamp, router_secret)

where router_secret changes (slowly) with time. The sender then includes the
sequence of these pre-capabilities in each packet it sends. The only way the
sender could have obtained this capability is if the destination sent it back
(or if there was some malicious node on the path, but such a node can do far
worse damage anyway). Each router verifies its portion of the capability in
every packet (modulo caching optimizations, discussed in the paper).

The other important detail is that the destination-granted capability is not
valid for all time, but only allows N bytes to be sent in time T. Both N and
T are included in the response to the sender requesting a capability, and the
actual capability returned to the requestor is:

    timestamp | H(pre-capability-sequence, N, T)

I.e., the requestor gets back this capability along with N and T. It's easy
to see that a malicious sender can't forge N and T to be bigger than what the
destination granted. (This discussion presumes that Internet routing is
working properly and that request packets sent to a destination can't be
diverted to some malicious party. In any case, if some router on the true
path between sender and destination did not correctly place its
pre-capability, that router will either drop packets not containing the
proper capability or send them at low priority.) The TVA paper also discusses
how to deal with the wrap-around of its 8-bit timestamps.
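To make the construction above concrete, here is a minimal sketch of the two
computations. The field widths, encodings, and single shared timestamp are
assumptions for exposition; the real TVA design truncates these values to fit
a compact header and caches verified capabilities.

    import hashlib, hmac

    def pre_capability(src, dst, ts, router_secret):
        # timestamp | H(src, dst, timestamp, router_secret), truncated
        msg = f"{src}|{dst}|{ts}".encode()
        return (ts, hmac.new(router_secret, msg, hashlib.sha256).digest()[:8])

    def capability(pre_caps, N, T, ts):
        # timestamp | H(pre-capability-sequence, N, T): computed by the
        # destination and returned, along with N and T, to the requestor.
        msg = b"".join(mac for _, mac in pre_caps) + f"|{N}|{T}".encode()
        return (ts, hashlib.sha256(msg).digest()[:8])

    def router_verify(pkt, src, dst, router_secret):
        """A router (a) recomputes its own pre-capability from packet
        fields and its secret, checking that it appears in the carried
        sequence, and (b) recomputes the capability hash to check that
        N and T were not inflated by the sender."""
        mine = pre_capability(src, dst, pkt["ts"], router_secret)
        cap = capability(pkt["pre_caps"], pkt["N"], pkt["T"], pkt["ts"])
        return mine in pkt["pre_caps"] and cap == pkt["cap"]

Because each pre-capability is keyed by a router secret, a sender cannot mint
a capability for a path it never probed, and any change to N or T invalidates
the destination's hash.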
More issues:

-- How do we monitor that a host with a capability to send N bytes in time T
to a destination does not violate that quota? The problem is that the space
for this task at routers can be exhausted. The solution in TVA is to monitor
only "high rate" flows, i.e., those sending faster than a certain "leaky
bucket" rate. The idea is similar in spirit to the credit-based
implementation of fair queueing we saw several lectures ago. (A leaky-bucket
sketch appears at the end of these notes.)

-- What about link bandwidth exhaustion attacks, in which K bad nodes
mutually grant each other capabilities to send huge numbers of packets across
various links? The solution, ultimately, is to rely on some form of fair
queueing.

-- How do we cope with route changes? The solution is to re-request a
capability.

Attack detection:
-----------------

-- Detecting anomalies, intrusions, etc. at IDSs. Detection could directly
trigger filters, or could trigger traceback (because one may want to know
where an attack is coming from, for out-of-band or in-band defenses).

-- Traceback. Two ideas (sketches of both appear at the end of these notes):

Idea 1: Carry state in packets, as in probabilistic traceback. Suppose each
router writes its address into the IP packet header with probability $p$,
along with a distance field $d$ that starts at zero and is incremented by
each subsequent router, so the receiver learns how far away each marking
router is. Over a sequence of packets, the receiver can obtain a list of all
routers on the path. (It is easy to work out the expected number of packets
needed to hear from all routers: by a coupon-collector-style argument,
roughly $\ln(d)/(p(1-p)^{d-1})$ packets suffice in expectation for a path of
length $d$, since the farthest router's mark survives unoverwritten with
probability $p(1-p)^{d-1}$.) This idea is at the core of probabilistic IP
traceback schemes. An improvement on this "node sampling" scheme is "edge
sampling", in which pairs of routers write the "start" and "end" of a link
with probability $p$. One drawback of these schemes is that they require
changes to the IP header format.

Idea 2: Maintain "compressed" packet-digest state efficiently in routers.
This is the idea in the hash-based traceback paper describing the SPIE
(Source Path Isolation Engine) system. Each router maintains a Bloom filter:
it applies k hash functions to (invariant portions of) each packet and sets k
bits in a 2^n-bit array (each hash function produces an n-bit value). It
turns out that modern routers can handle the necessary memory sizes, and the
scheme is fairly efficient as long as traceback requests aren't too common.
The Snoeren et al. paper discusses a concrete engineering design of this
approach.

In response to learning where attacks are coming from, filters/pushback can
be placed at the appropriate network locations. Capability schemes presume
that the receiver knows what capability to grant a sender based on an initial
packet (or initial set of packets); it is not clear that one can always do
that. In contrast, schemes that place filters do so {\em after} detecting an
anomaly.
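As referenced above, here is a minimal leaky-bucket sketch of the kind of
rate check TVA uses to decide which flows deserve per-flow monitoring. The
class name and parameters are illustrative, not from the paper.

    import time

    class LeakyBucket:
        """A flow conforms as long as it sends below `rate` bytes/s,
        with bursts of up to `burst` bytes. A TVA-style router would
        promote a flow to per-flow accounting only once it exceeds
        this rate, bounding the state needed for enforcement."""
        def __init__(self, rate, burst):
            self.rate, self.burst = rate, burst
            self.level = 0.0                 # current bucket fill (bytes)
            self.last = time.monotonic()

        def conforms(self, nbytes):
            now = time.monotonic()
            # Drain the bucket at `rate` bytes/s since the last packet.
            self.level = max(0.0, self.level - (now - self.last) * self.rate)
            self.last = now
            if self.level + nbytes > self.burst:
                return False                 # over the leaky-bucket rate
            self.level += nbytes
            return True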
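Next, a sketch of the edge-sampling marking procedure from Idea 1 (after
Savage et al.). The packet is modeled as a dict with start/end/dist fields;
the real scheme compresses these marks into the scarce bits of the IP header
(the 16-bit fragment-ID field) across multiple packets.

    import random

    def edge_sample(pkt, router_addr, p=0.04):
        """With probability p, start a new edge at this router;
        otherwise complete or extend the mark already in the packet."""
        if random.random() < p:
            pkt["start"], pkt["end"], pkt["dist"] = router_addr, None, 0
        elif "dist" in pkt:
            if pkt["dist"] == 0:
                pkt["end"] = router_addr  # this router closes the edge
            pkt["dist"] += 1              # edge's distance from the victim
        return pkt

The victim collects (start, end, dist) marks from many packets and chains the
edges in increasing order of distance to reconstruct the attack path.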
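Finally, a minimal Bloom-filter packet digest in the spirit of SPIE (Idea 2).
The hash choice and table size are assumptions; SPIE digests the packet's
invariant header fields plus a payload prefix, and ages out filters over
time.

    import hashlib

    class PacketDigest:
        """2^n-bit Bloom filter over packets: k hash functions each
        yield an n-bit index; membership tests can return false
        positives but never false negatives."""
        def __init__(self, n=20, k=3):
            self.n, self.k = n, k
            self.bits = bytearray(2 ** n // 8)

        def _indices(self, pkt_bytes):
            for i in range(self.k):            # k independent hashes
                h = hashlib.sha256(bytes([i]) + pkt_bytes).digest()
                yield int.from_bytes(h[:4], "big") % (2 ** self.n)

        def insert(self, pkt_bytes):
            for idx in self._indices(pkt_bytes):
                self.bits[idx // 8] |= 1 << (idx % 8)

        def saw(self, pkt_bytes):
            return all((self.bits[idx // 8] >> (idx % 8)) & 1
                       for idx in self._indices(pkt_bytes))

A traceback query then asks each router whether its recent digests contain
the attack packet; the answer may be a false positive, but a router that
forwarded the packet will never answer "no".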