Notes on Distributed Denial of Service (DDoS)
Hari Balakrishnan
November 2005

Machines on the Internet are notoriously insecure, and it is relatively
straightforward to compromise them. Once compromised, these "zombie" or "bot"
hosts can be used to mount a variety of attacks on Internet services and Web
sites. The creativity of attacks observed in the wild has been impressive.

There are numerous kinds of DDoS attacks. We divide them into three broad
categories; each category has several examples.

1. Packet floods: Packet flooding attacks (e.g., TCP SYN floods, ping floods,
ICMP reply floods) attempt to exhaust link bandwidth (usually at a server's
access link).

2. Kernel state/resource attacks: These attacks target kernel state at a
machine (e.g., causing a server to create a large number of TCP control
blocks, use up other kernel resources, etc.).

3. Attacks on higher-level services running on a machine: Examples include
making search or other queries at a site that involve expensive database
operations, or reducing a server's available CPU or memory by causing it to
perform expensive operations.

These attacks are increasingly carried out by criminal elements whose intent
goes beyond mere "bragging rights". For instance, extortion by "cybermafia"
is an increasingly significant motive for attacks; several recent FBI cases
attest to this trend.

DDoS has motivated many recent research and commercial projects. We divide
the flood of work in this area into two broad categories:

1. Attack prevention schemes.
2. Attack detection schemes.

Attack prevention:
------------------

-- Ingress filters. RFC 2827 advocates ingress filtering to prevent source IP
address spoofing. If widely deployed at the "edges" of the Internet, such
filters can greatly reduce the damage done by spoofed addresses. Deployment
today appears to be mixed: an edge ISP has little incentive to deploy filters
to protect a remote site that isn't even a customer, so the filters aren't
widely deployed. But there does not appear to be much spoofing going on
(thresholds for detecting anomalous activity at an edge are high), and
because there isn't much spoofing, ISPs don't feel the need to deploy
filters.

-- Forced three-way handshakes. If ingress filters are not widely deployed,
and if source address spoofing is common, then the general principle for
protecting server resources (e.g., kernel state) is to force a three-way
handshake and create state only after the handshake completes. As long as the
attacker cannot receive packets at the spoofed address, this scheme prevents
spoofed packets from causing damage at a server. Protecting link bandwidth is
not possible with this method, of course, and given the ease with which
machines are compromised, most attacks appear to end up saturating links with
traffic from thousands of (unspoofed) machines anyway.
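To make the handshake principle concrete, here is a minimal sketch in the
spirit of SYN cookies, the best-known instance of this idea. The MAC
construction and bit layout below are assumptions for exposition, not the
actual Linux scheme (which, e.g., also encodes the negotiated MSS in the
cookie).

    import hashlib, hmac, time

    SECRET = b"per-server secret; rotate periodically"

    def syn_cookie(src, sport, dst, dport, client_isn):
        """Encode the connection 4-tuple into the server's initial
        sequence number, so no state is kept when the SYN arrives."""
        t = int(time.time()) >> 6      # coarse timestamp (64 s ticks)
        msg = f"{src}:{sport}-{dst}:{dport}-{client_isn}-{t}".encode()
        mac = hmac.new(SECRET, msg, hashlib.sha256).digest()
        # low 8 timestamp bits + 24 bits of MAC = 32-bit "cookie" ISN
        return ((t & 0xFF) << 24) | int.from_bytes(mac[:3], "big")

    def ack_completes_handshake(src, sport, dst, dport, client_isn, ack):
        """On the final ACK, recompute the cookie; allocate a TCP
        control block only if it matches, i.e., only after the
        three-way handshake has really completed."""
        cookie = (ack - 1) & 0xFFFFFFFF   # ACK acknowledges server ISN + 1
        t_now = int(time.time()) >> 6
        for t in (t_now, t_now - 1):      # tolerate one tick of delay
            msg = f"{src}:{sport}-{dst}:{dport}-{client_isn}-{t}".encode()
            mac = hmac.new(SECRET, msg, hashlib.sha256).digest()
            if cookie == (((t & 0xFF) << 24) | int.from_bytes(mac[:3], "big")):
                return True
        return False

A spoofing attacker never sees the cookie, so it cannot produce a valid ACK,
and the server commits no memory until a valid ACK arrives.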
-- Filters/pushback: Deploy filters based on schemes that detect anomalous
activity. E.g., an Intrusion Detection System (IDS) on a site's network
detects an anomaly and causes filters to be installed. Commercial products
from Mazu, Arbor, etc. work in this fashion. (Anomalies are detected using a
variety of schemes and heuristics.) The protection mechanism itself is quite
simple: filters implemented in router access control lists. Research versions
of this idea include "pushback" and AITF, both of which advocate pushing
filters as close to the attack source as possible. Of course, one needs to
know where the attack traffic is coming from. If source addresses are not
spoofed, this part is trivial (a victim still may not have the right or
ability to get filters installed at third-party sites, but at least it can
install local filters). But what happens when source addresses _are_ spoofed?
IP traceback schemes address this problem (see "Attack detection" below).

In general, for filters/pushback to really work, one needs a way to
authenticate requests for filter deployment. Making that work at filters far
from the victim, at locations with which the victim has no commercial or
other relationship, isn't easy. Hence, in practice, these schemes are
deployed near the victim (unless the filter requests are made at the behest
of entities elsewhere on the attack path).

-- Capabilities: In operating systems, ACLs and capabilities are duals, and
recent work has shown the same duality for protecting networks against DDoS.
One such scheme is TVA. The high-level idea in these schemes is for a sender
to first ask the destination for permission to send packets to it. If the
destination agrees, it grants a {\em capability} to the sender. The nice
property of this capability is that during the request phase, routers along
the path from sender to destination place information in the packets, and the
resulting capability is conceptually just the sequence of these {\em
pre-capabilities}. The capability can be verified by each router and cannot
easily be forged.

The TVA details are as follows. Each router places a pre-capability in the
request packet:

    timestamp | H(src, dst, timestamp, router_secret)

where router_secret changes (slowly) with time. The sender then includes the
sequence of these pre-capabilities in each packet it sends. The only way the
sender could have obtained this capability is if the destination sent it back
(or if there was some malicious node on the path, but such a node can do far
worse damage anyway). Each router verifies its portion of the capability in
every packet (modulo caching optimizations, discussed in the paper).

The other important detail is that the destination-granted capability is not
valid for all time, but only allows N bytes to be sent in time T. Both N and
T are included in the response to the sender requesting a capability, and the
actual capability returned to the requestor is:

    timestamp | H(pre-capability-sequence, N, T)

I.e., the requestor gets back this capability along with N and T. It's easy
to see that a malicious sender can't forge N and T to be bigger than what the
destination granted. (This discussion presumes that Internet routing is
working properly and that request packets sent to a destination can't be
diverted to some malicious party. In any case, if some router on the true
path between sender and destination did not correctly place its
pre-capability, that router will either drop packets not containing the
proper capability or send them at low priority.) The TVA paper also discusses
how to deal with the wrap-around of its 8-bit timestamps.
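To make the construction above concrete, here is a minimal sketch of the two
computations. The field widths, encodings, and single shared timestamp are
assumptions for exposition; the real TVA design truncates these values to fit
a compact header and caches verified capabilities.

    import hashlib, hmac

    def pre_capability(src, dst, ts, router_secret):
        # timestamp | H(src, dst, timestamp, router_secret), truncated
        msg = f"{src}|{dst}|{ts}".encode()
        return (ts, hmac.new(router_secret, msg, hashlib.sha256).digest()[:8])

    def capability(pre_caps, N, T, ts):
        # timestamp | H(pre-capability-sequence, N, T): computed by the
        # destination and returned, along with N and T, to the requestor.
        msg = b"".join(mac for _, mac in pre_caps) + f"|{N}|{T}".encode()
        return (ts, hashlib.sha256(msg).digest()[:8])

    def router_verify(pkt, src, dst, router_secret):
        """A router (a) recomputes its own pre-capability from packet
        fields and its secret, checking that it appears in the carried
        sequence, and (b) recomputes the capability hash to check that
        N and T were not inflated by the sender."""
        mine = pre_capability(src, dst, pkt["ts"], router_secret)
        cap = capability(pkt["pre_caps"], pkt["N"], pkt["T"], pkt["ts"])
        return mine in pkt["pre_caps"] and cap == pkt["cap"]

Because each pre-capability is keyed by a router secret, a sender cannot mint
a capability for a path it never probed, and any change to N or T invalidates
the destination's hash.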
More issues:

-- How do we monitor that a host with a capability to send N bytes in time T
to a destination does not violate that quota? The problem is that the space
for this task at routers can be exhausted. The solution in TVA is to monitor
only "high rate" flows, i.e., those sending faster than a certain "leaky
bucket" rate. The idea is similar in spirit to the credit-based
implementation of fair queueing we saw several lectures ago. (A leaky-bucket
sketch appears at the end of these notes.)

-- What about link bandwidth exhaustion attacks, in which K bad nodes
mutually grant each other capabilities to send huge numbers of packets across
various links? The solution, ultimately, is to rely on some form of fair
queueing.

-- How do we cope with route changes? The solution is to re-request a
capability.

Attack detection:
-----------------

-- Detecting anomalies, intrusions, etc. at IDSs. Detection could directly
trigger filters, or could trigger traceback (because one may want to know
where an attack is coming from, for out-of-band or in-band defenses).

-- Traceback. Two ideas (sketches of both appear at the end of these notes):

Idea 1: Carry state in packets, as in probabilistic traceback. Suppose each
router writes its address into the IP packet header with probability $p$,
along with a distance field $d$ that starts at zero and is incremented by
each subsequent router, so the receiver learns how far away each marking
router is. Over a sequence of packets, the receiver can obtain a list of all
routers on the path. (It is easy to work out the expected number of packets
needed to hear from all routers: by a coupon-collector-style argument,
roughly $\ln(d)/(p(1-p)^{d-1})$ packets suffice in expectation for a path of
length $d$, since the farthest router's mark survives unoverwritten with
probability $p(1-p)^{d-1}$.) This idea is at the core of probabilistic IP
traceback schemes. An improvement on this "node sampling" scheme is "edge
sampling", in which pairs of routers write the "start" and "end" of a link
with probability $p$. One drawback of these schemes is that they require
changes to the IP header format.

Idea 2: Maintain "compressed" packet-digest state efficiently in routers.
This is the idea in the hash-based traceback paper describing the SPIE
(Source Path Isolation Engine) system. Each router maintains a Bloom filter:
it applies k hash functions to (invariant portions of) each packet and sets k
bits in a 2^n-bit array (each hash function produces an n-bit value). It
turns out that modern routers can handle the necessary memory sizes, and the
scheme is fairly efficient as long as traceback requests aren't too common.
The Snoeren et al. paper discusses a concrete engineering design of this
approach.

In response to learning where attacks are coming from, filters/pushback can
be placed at the appropriate network locations. Capability schemes presume
that the receiver knows what capability to grant a sender based on an initial
packet (or initial set of packets); it is not clear that one can always do
that. In contrast, schemes that place filters do so {\em after} detecting an
anomaly.
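As referenced above, here is a minimal leaky-bucket sketch of the kind of
rate check TVA uses to decide which flows deserve per-flow monitoring. The
class name and parameters are illustrative, not from the paper.

    import time

    class LeakyBucket:
        """A flow conforms as long as it sends below `rate` bytes/s,
        with bursts of up to `burst` bytes. A TVA-style router would
        promote a flow to per-flow accounting only once it exceeds
        this rate, bounding the state needed for enforcement."""
        def __init__(self, rate, burst):
            self.rate, self.burst = rate, burst
            self.level = 0.0                 # current bucket fill (bytes)
            self.last = time.monotonic()

        def conforms(self, nbytes):
            now = time.monotonic()
            # Drain the bucket at `rate` bytes/s since the last packet.
            self.level = max(0.0, self.level - (now - self.last) * self.rate)
            self.last = now
            if self.level + nbytes > self.burst:
                return False                 # over the leaky-bucket rate
            self.level += nbytes
            return True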
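Next, a sketch of the edge-sampling marking procedure from Idea 1 (after
Savage et al.). The packet is modeled as a dict with start/end/dist fields;
the real scheme compresses these marks into the scarce bits of the IP header
(the 16-bit fragment-ID field) across multiple packets.

    import random

    def edge_sample(pkt, router_addr, p=0.04):
        """With probability p, start a new edge at this router;
        otherwise complete or extend the mark already in the packet."""
        if random.random() < p:
            pkt["start"], pkt["end"], pkt["dist"] = router_addr, None, 0
        elif "dist" in pkt:
            if pkt["dist"] == 0:
                pkt["end"] = router_addr  # this router closes the edge
            pkt["dist"] += 1              # edge's distance from the victim
        return pkt

The victim collects (start, end, dist) marks from many packets and chains the
edges in increasing order of distance to reconstruct the attack path.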
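Finally, a minimal Bloom-filter packet digest in the spirit of SPIE (Idea 2).
The hash choice and table size are assumptions; SPIE digests the packet's
invariant header fields plus a payload prefix, and ages out filters over
time.

    import hashlib

    class PacketDigest:
        """2^n-bit Bloom filter over packets: k hash functions each
        yield an n-bit index; membership tests can return false
        positives but never false negatives."""
        def __init__(self, n=20, k=3):
            self.n, self.k = n, k
            self.bits = bytearray(2 ** n // 8)

        def _indices(self, pkt_bytes):
            for i in range(self.k):            # k independent hashes
                h = hashlib.sha256(bytes([i]) + pkt_bytes).digest()
                yield int.from_bytes(h[:4], "big") % (2 ** self.n)

        def insert(self, pkt_bytes):
            for idx in self._indices(pkt_bytes):
                self.bits[idx // 8] |= 1 << (idx % 8)

        def saw(self, pkt_bytes):
            return all((self.bits[idx // 8] >> (idx % 8)) & 1
                       for idx in self._indices(pkt_bytes))

A traceback query then asks each router whether its recent digests contain
the attack packet; the answer may be a false positive, but a router that
forwarded the packet will never answer "no".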