# CIS 556, Fall 2015 Homework 4: Computing discrete logs

The Pretty Bad Privacy encryption tool can be used to insecurely encrypt files to a 512-bit ElGamal public key using 128-bit AES.

The pdf file for your next homework assignment has been encrypted using PBP to the following ElGamal public key:

```-----BEGIN PRETTY BAD PUBLIC KEY BLOCK-----
E3bdvhgJAlNSZtODAQAAAAJBAAAAAwP9JKSj4vzGV029eZrGb/o+Xo1zRV1dnDpmRhXEKvV38cO7
R5T7Z3lKO59zPwKWzQIhigS4wrsGy5+l8318I9k=
-----END PRETTY BAD PUBLIC KEY BLOCK-----
```

The encrypted file is available here. Your task is to use the attack described in this paper by van Oorschot and Wiener to compute the discrete log of the public key and use it to decrypt the homework so you can do the rest of the problems. This is a real vulnerability when implementations don't generate primes carefully: your instructors just published a paper this year where we used this exact attack to compromise several hundred hosts on the Internet. (See Section 3.5, "Attacks on composite-order subgroups".)

The following steps may be helpful:

1. Implement a brute force discrete log algorithm. Your function should take a generator g, a prime p, a target t, and a subgroup order q, and brute force all possible exponents until it finds an x such that g^x = t mod p, when g generates a subgroup of order q mod p.
2. Implement Pollard rho or baby step-giant step. Your function should take a generator g, a prime p, a target t, and a subgroup order q, and in expected time sqrt(q) output x such that g^x = t mod p.
3. Implement the Chinese remainder theorem. Your function should take a list of pairs (xi,pi) of residues xi and primes pi and output a z such that z = xi mod pi for each pi.
4. Use your functions above to implement the Pohlig-Hellman algorithm when the prime factorization of the group order is known.
5. Factor p-1 for the public key (you can use Sage's factor() function or any other computer algebra package or implementation you like) and identify a subgroup with the properties that (1) you can efficiently compute discrete logs in this subgroup (2) the order of your subgroup is large enough that you can uniquely recover a private key as short as PBP uses.
6. Use your Pohlig-Hellman implementation to recover the private key, and then use the private key to decrypt the ciphertext. You can use our code to generate your own keys with known private keys and plaintext to test your implementation.

You may use any programming language you like. Please implement your own discrete log algorithms. You do not need to implement your own factoring algorithm. The code used to encrypt the homework is here. It uses Sage for mathematical calculations and PyCrypto for symmmetric encryption. I recommend you use a library with fast arithmetic; Sage uses GMP for its integer types.

Please submit your code and a short description of how you solved the problem along with a PDF of your LaTeXed solutions to the other problems to Canvas before class on October 20. You may discuss this assignment in small groups with classmates, but please code and write up your solutions yourself. Please credit any collaborators you discussed with and any references you used.

For reference, we give some excerpts from the OpenPBP RFC, inspired by the OpenPGP RFC.

```3.2.  Multiprecision Integers

Multiprecision integers (also called MPIs) are unsigned integers used
to hold large integers such as the ones used in cryptographic
calculations.

An MPI consists of two pieces: a four-octet little-endian scalar
that is the length of the MPI followed by a string of octets that
contain the actual integer.

5.5.2.  Public-Key Formats

A public key contains:
- MPI of Elgamal prime p;

- MPI of Elgamal group generator g;

- MPI of Elgamal public key value y (= g**x mod p where x
is secret).

5.1.  Public-Key Encrypted Messages

The body of the message consists of a string of octets that is the
encrypted session key, followed by the symmetrically encrypted data.
The symmetric session key is derived from m by interpreting m as an
appropriate length string of octets.

- MPI of Elgamal (Diffie-Hellman) value g**k mod p.

- MPI of Elgamal (Diffie-Hellman) value m * y**k mod p.

- Encrypted data, the output of the AES symmetric-key cipher
operating in CBC mode, with PKCS 7 padding.

6.2.  Forming ASCII Armor

When PBP encodes data into ASCII Armor, it puts specific headers
around the Radix-64 encoded data, so PBP can reconstruct the data
later.  A PBP implementation MAY use ASCII armor to protect raw
binary data.  PBP informs the user what kind of data is encoded
in the ASCII armor through the use of the headers.

Concatenating the following data creates ASCII Armor:

- An Armor Header Line, appropriate for the type of data

- The ASCII-Armored data

- The Armor Tail, which depends on the Armor Header Line

surrounded by five (5) dashes ('-', 0x2D) on either side of the
header line text.  The header line text is chosen based upon the type
of data that is being encoded in Armor, and how it is being encoded.
Header line texts include the following strings: