Blog

# Cryptography for Mere Mortals #5

An occasional feature, Cryptography for Mere Mortals attempts to provide clear, accessible answers to questions about cryptography for those who are not cryptographers or mathematicians.

Q: How do we know whether a given encryption solution will produce unique results—that is, that for each unique input, a unique output will result? If, say, two SSNs both encrypt to the same value, we’ll have a huge mess on our hands!

A: This is one of the reasons that a security proof is important. It’s easy to say “Sure, it’ll be unique”, but without a careful, peer-reviewed security proof, such a statement has no meaning. The next question is, obviously, “What is a security proof?”

The security of cryptographic schemes is often witnessed by the absence of attacks: a scheme is considered to be secure if it has been around for a long period, has received widespread attention, and could nonetheless not be significantly weakened by any attack. Famous examples of this “negative” security argument are the Data Encryption Standard (DES) and RSA. As opposed to this, provable security is often considered as a decisive advantage for a cryptographic scheme. In this context, it is proved (in a precise mathematical sense) that the scheme is secure, provided some assumptions are satisfied. These “basic” assumptions are typically the security of the primitives the scheme is based on (e.g., the Diffie-Hellman key agreement), or the difficulty of some well-known problem (e.g., factorization, discrete log computation, … ). Typically, security is proved by exhibiting a reduction that turns a breaking of the scheme into a solution of the underlying problem. By contraposition, this proves that, if the underlying problem is actually hard, then the scheme is secure.

In other words, a security proof is just like the geometric proofs we did back in high school: given some basic assumptions that nobody is likely to argue with (“A straight line segment can be drawn joining any two points” et al.), logically show that a given algorithm has specific properties. Peer review of security proofs simply provides improved assurance that the proof is correct: even without any deliberate misrepresentation, the process is complex enough that it makes sense for other, perhaps smarter people to check for mistakes.

One of the things that a good security proof will demonstrate, for any encryption algorithm that is intended to be reversible (that is, to allow decryption), is uniqueness of outputs. A non-unique output would be a collision. Collisions are not just bad, they’re crossing-the-streams bad, because they mean that you cannot reliably decrypt: if x and y both encrypt to z, then when you decrypt z, will you get x or y? You likely want this to be deterministic!

It’s important to note that “no collisions” does not necessarily translate to “input never equals output”. For example, if you use Format-Preserving Encryption on a gender field (M/F), there are only two possibilities:

· M maps to F

· M maps to M

(and vice-versa for F). This is one of the reasons that Voltage SecureData requires specification of a flag to use FPE on very short data: such cases are considered cryptographically weak, in that methods such as statistical analysis may allow an attacker to figure out the matching plaintext for that field (there are ways around this, which will be covered in a later post).

But this goes beyond such obvious cases: if you use FPE on, say, a four-digit number, then it is possible—not likely, but possible—that at least one of those values will encrypt to itself (a “fixpoint”, in crypto-speak). This is not a flaw in the algorithm: indeed, to a cryptographer, such behavior is not only obvious, but desirable. They even have a fun term for an algorithm that does not contain any fixpoints: a derangement.

Note that the input set of four-digit values has 10,000 elements (0 through 9999); accordingly, the output set must have 10,000 elements, so there’s no obvious action that can be taken if a fixpoint is found—changing the value means a collision. Moreover, if we know that a value can never encrypt to itself, we are actually weakening the encryption, because given a ciphertext, we now only have 9,999 possible values for the matching plaintext.

This might seem like a trivial weakening—but is exactly the kind of thing that security proofs deal with. “A few bits here, a few bits there, pretty soon, you’re talking real weakness”, to misquote Everett Dirksen’s famous line.