DQE: Distributed Quota Enforcement

Note: here is the supplement to our conference paper.

Description

DQE is a spam control system, built as an academic research project.

What's the high-level idea?

DQE is a proposal based on quotas (or bankable postage; the term was introduced by Microsoft Research's Penny Black project in this paper). The goal is to limit the number of messages any sender can send by making it expensive to send mail in bulk. In general, quota-based proposals work as follows: every sender gets a quota of stamps. How this quota is determined varies among proposals; options include proof of CPU or memory cycles (as in the Penny Black project), having an email account with an ISP (as in the SHRED project), having a driver's license, etc. The sending host or its email server attaches a stamp to each email message, and the receiving host or its email server tests the incoming stamp by asking a quota enforcer whether the enforcer has seen the stamp before. If not, the receiving host infers that the stamp is "fresh" and then cancels it by asking the enforcer to store a record of the stamp. The receiving host delivers only messages with fresh stamps to the human user; messages with used stamps are assumed to be spam. The hope is that allocating reasonable quotas to everyone and then enforcing those quotas would cripple spammers, who need huge volumes to be profitable, while leaving legitimate users largely unaffected.

Here is a depiction of the DQE system architecture (TEST and SET are the messages that the recipient uses to test stamps for freshness and then to cancel them):


DQE Architecture

What do certificates and stamps look like?

The Quota Allocators (QAs) have distinct public/private key pairs (QApub,QApriv); the QApub are well known. A participant S constructs public/private key pair (Spub,Spriv) and presents Spub to a QA. The QA determines a quota for S and returns the signed certificate to S (the notation {A}B means that string A is signed with key B):
CS={ Spub,expirationtime,quota}QApriv.
This certificate expresses to the world that sender S has the right to "mint" quota stamps in a given era (e.g., per day).

Each stamp has the form
{ CS,{i, t}Spriv}.
Each i in [1,quota] is supposed to be used no more than once in the current epoch. t is a unique identifier of the current epoch.

Given a stamp and given QApub, a receiver can check whether a given stamp is authentic and "below quota". To check that the stamp has not been used before, the receiver communicates with the enforcer, as mentioned above.

How is DQE different from the other quota (or bankable postage) schemes?

It differs in two major ways (and many minor ways):

Of course, there are many other spam-fighting proposals. Our conference paper, below, mentions some of them. (A related comment is that DQE probably fits into this litany somewhere!)

Goals and Technical Challenges

We separate the goals and technical challenges into two categories: general requirements for DQE and specific challenges for the enforcer. These requirements all concern quota enforcement; indeed, in this work we mostly do not address quota allocation. The reason for this focus is that these two are different concerns: the former is a purely technical matter while the latter involves social, economic, and policy factors.

General DQE Requirements

No false positives   Our high-level goal is reliable email. We assume reused stamps indicate spam. Thus, a fresh stamp must never appear to have been used before.
Untrusted enforcer   We do not know the likely economic model of the enforcer, whether monolithic (i.e., owned and operated by a single entity) or federated (i.e., many organizations with an interest in spam control donate resources to a distributed system). No matter what model is adopted, it would be wise to design the system so that clients place minimal trust in the infrastructure.
Privacy   To reduce (already daunting) deployment hurdles, we seek to preserve the current "semantics" of email. In particular, queries of the quota enforcer should not identify email senders (otherwise, the enforcer knows which senders are communicating with which receivers, violating email's privacy model), and a receiver should not be able to use a stamp to prove to a third party that a sender communicated with it.

Challenges for the Enforcer

Scalability   The enforcer must scale to current and future email volumes. Studies estimate that 80-90 billion emails will be sent daily this year (see, for example,
a report from IDC). We set an initial target of 100 billion daily messages (an average of about 1.2 million stamp checks per second) and strive to keep pace with future growth. To cope with these rates, the enforcer must be composed of many hosts.
Fault-tolerance   Given the required number of hosts, it is highly likely that some subset will experience crash faults (e.g., be down) or Byzantine faults (e.g., become subverted). The enforcer should be robust to these faults. In particular, it should guarantee no more than a small amount of stamp reuse, despite such failures.
High throughput   To control management and hardware costs, we wish to minimize the required number of machines, which requires maximizing throughput.
Attack-resilience   Spammers will have a strong incentive to cripple the enforcer; it should thus resist denial-of-service (DoS) and resource exhaustion attacks.
Mutually untrusting nodes   In both federated and monolithic enforcer organizations, nodes could be compromised. In the federated case, even when the nodes are uncompromised, they may not trust each other. Thus, in either case, besides being untrusted (by clients), nodes should also be untrusting (of other nodes), even as they do storage operations for each other.

Our conference paper at NSDI 2006, below, describes how DQE addresses the above challenges.

Papers

The following papers give more information about DQE. The first is a position paper for a workshop. The second is a full-length conference paper that substantially updates the workshop paper with a much simpler (and more practical) design. The conference paper also has an analysis, implementation, and evaluation. The third paper is a supplement to the conference paper; it contains a few explanations that we were not able to fit into the conference paper.
If you would like to cite this work, please cite the NSDI conference paper. Thank you!

Talks

Software

Here are snapshots of the code that implements the components described in Sections 5 and 6 of our NSDI paper. Here is a HOWTO for getting the code working. The HOWTO is still quite rough, so please email us with questions, feedback, bugs, etc.

People

Michael Walfish   J.D. Zamfirescu   Hari Balakrishnan   David Karger   Scott Shenker

Funding

This work is supported by the National Science Foundation under grants CNS-0225660 and CNS-0520241, by British Telecom, and by an NDSEG Graduate Fellowship.

Contact Us

We welcome comments, questions, and feedback. Please send mail to