Defining tokenization

Tokenization and encryption are the two technologies most commonly used to protect sensitive cardholder data, as the PCI DSS requires. Encryption is well defined and well understood, but tokenization isn’t. What exactly is tokenization? Here’s my definition. You can read more about it, including why I believe this definition makes sense and how it compares to the definition of encryption, in "Defining Tokenization and the Security Provides" in this month’s ISSA Journal.

A tokenization scheme comprises two stateful, deterministic algorithms: tokenize and detokenize. These operate on two strings called a plaintext and a token. The tokenize algorithm produces a token from a plaintext. The detokenize algorithm produces a plaintext from a token that has already been created by the tokenize algorithm.
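As a rough illustration of that definition, here is a minimal Python sketch of a stateful, deterministic scheme. It is only a toy under stated assumptions: the class name `CounterTokenizer`, the 16-digit token width, and the use of a simple counter are all hypothetical choices made for the example, not a description of any product. The state is held in memory in the clear, which a real system would need to protect.

```python
class CounterTokenizer:
    """Toy stateful tokenization scheme (hypothetical, for illustration only)."""

    def __init__(self, width=16):
        self._width = width            # assumed token length in digits
        self._next = 0                 # counter that supplies the next token
        self._token_for = {}           # plaintext -> token
        self._plaintext_for = {}       # token -> plaintext

    def tokenize(self, plaintext):
        """Produce a token from a plaintext, reusing the token if one exists."""
        if plaintext in self._token_for:
            return self._token_for[plaintext]
        # The token comes from the counter alone, never from the plaintext.
        token = str(self._next).zfill(self._width)
        self._next += 1
        self._token_for[plaintext] = token
        self._plaintext_for[token] = plaintext
        return token

    def detokenize(self, token):
        """Recover the plaintext for a token that tokenize already created."""
        if token not in self._plaintext_for:
            raise ValueError("token was not created by tokenize")
        return self._plaintext_for[token]


scheme = CounterTokenizer()
token = scheme.tokenize("4111111111111111")
assert scheme.detokenize(token) == "4111111111111111"
```

Both algorithms read and update the same state, and detokenize only succeeds for tokens that tokenize has already issued, which matches the two conditions in the definition.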

A secure tokenization scheme is one in which the mutual information between a plaintext and the token that the tokenize algorithm creates from it is zero.
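For reference, the standard definition of mutual information can be written out as follows. The symbols P (the plaintext, as a random variable) and T (the token it is mapped to) are notation assumed here for illustration; they are not taken from the article.

```latex
I(P;T) \;=\; \sum_{p,\,t} \Pr[P=p,\,T=t]\,
        \log \frac{\Pr[P=p,\,T=t]}{\Pr[P=p]\,\Pr[T=t]}
\qquad\text{and}\qquad
I(P;T) = 0 \iff \Pr[P=p,\,T=t] = \Pr[P=p]\,\Pr[T=t] \ \text{for all } p,\,t.
```

In other words, the condition says that the token and the plaintext are statistically independent, so seeing a token by itself tells an observer nothing about the plaintext behind it.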

  • Greg

    Please give an example of a tokenization function that is (believed to be) secure by your definition. Can you also give examples of what functions people are using?

  • Luther Martin

    By this definition, a one-up counter would be secure, because the mutual information between the token and the plaintext is zero. A random value would also be secure.
    The tokenization product that Voltage sells uses a FIPS-validated PRNG to create tokens. As to what other people are using, that’s a tough one. Tokenization vendors typically keep all the workings of their systems proprietary, so it’s not at all clear how they create tokens.

  • Greg

    Thanks Luther. So in the example when tokenize outputs a random value as the token for an input, how does detokenize work?

  • Greg

    Also, is a detailed description of the Voltage product available?

  • Luther Martin

    Detokenization is typically done by a database lookup. When a token is created, an encrypted copy of the plaintext is archived along with the token. Then, to detokenize, you look up the ciphertext that corresponds to the token, decrypt the ciphertext, and provide the decrypted plaintext to the requesting application.

  • Luther Martin

    I’m sure that our sales guys have lots of information about our tokenization product. You can reach them at sales@voltage.com.

  • Greg

    Ah. So the state is shared between tokenize and detokenize, and kept secret from the adversary. Thanks, I get it now.
