Skip to main content

0. Getting Started

Section 0 of this assignment is equivalent to section 0.A-0.C of hw06 - we repeat it here as a refresher.


A. Goals


The purpose of this assignment, a continuation of hw06, is to gain practice with object-oriented programming, bitwise operators, number systems, and cryptography. The specific goals are to:


B. Background


Encryption is the process of encoding messages or information that can only be decoded by people with authorization (usually some sort of key). Good encryption algorithms produce output that looks random to a bystander but is easily decipherable with the correct key.

In Hail, Caesar!, the key was a single character which shifted all characters in the message up the alphabet. This algorithm was far from perfect, and you demonstrated this by writing a function to crack the cipher without having authorization. This exploited the fact that the outputted cipher wasn’t entirely random – some letters appeared an abnormal amount, and this allowed you to find the offset key.

Over the two millennia since Julius Caesar, cryptography has gone a long way. Computers have played an enormous role in developing information communications as well as keeping those communications safe. In this assignment, you will implement and use a linear feedback shift register (LFSR) to create a stream of pseudo-random bits. This will help you create a significantly more random cipher.

Once you have written a crypto library, with the necessary functions, you will use it for a practical application: steganography — the practice of hiding secret messages in images in plain sight.


C. Understand the Problem


In hw06, you wrote an LFSR class. Once given a “seed”, it produces a boundless stream of seemingly random bits (1s and 0s). The key feature is that given the same seed, the LFSR will produce the same stream of bits. This means that if Alice and Bob both know the same seed, they can produce identical random bit streams.

To encrypt the message, you will perform an exclusive or (^) operation between each bit in the message and the LFSR’s sequence of pseudo-random bits. The resulting cipher, after you perform the XOR operations will appear to be nonsense, like Kao y{u(x kp }1rkz~|g2rjf@r), but when it is XORed again with the same sequence of pseudo-random bits as the first time, the encrypted message can be decrypted into the original message.

Since an LFSR’s bit stream is completely determined by its given parameters, Alice can produce the same bit stream that Bob used to encrypt a cipher if she knows the seed, and so can decrypt the message through the same process. For this reason, an LFSR encryption scheme is a symmetric key encryption technique (the alternative is an asymmetric scheme, in which different passwords are used for encrypting and decrypting).

You will need to make use of the LFSR class you wrote in hw06. Copy this into your Codio, but note that you do not need to submit this file to Gradecope.


D. Designing the Requirements and Interface


All of the extra files (image data and readme) you need will be in codio when you start the assignment, and you can download them here as well. You will be writing the following three classes from scratch. Their APIs are listed below, but each will be elaborated on in later sections of the homework. You must include all of the functions listed below in the class under which they are listed.


public class Codec
------------------
public static int[] encode(String str);
public static String decode(int[] bits);
public static void encrypt(int[] message, String seed, int tapPosition);
public static void decrypt(int[] cipher, String seed, int tapPosition);

public class RetrieveMessage
----------------------------
public static void main(String[] args);

public class HideMessage
----------------------------
public static void main(String[] args);

In any class, you may add additional functions and/or instance variables you like, but they must be private. A public main for testing is fine, but no additional public functions or instance variables may be added.


1. Codec Library


A. ASCII Conversion


Before we can begin encrypting and decrypting using our LFSR, we need to implement two functions to encode and decode our messages into ASCII. For instance, the string SENDMONEY maps to the following sequence of ASCII codes:

{ 83, 69, 78, 68, 77, 79, 78, 69, 89 }

Remember that you can convert characters in a String to their ASCII codes quite easily: use the charAt() method to extract a single character, and cast the result to an int. (Technically, you’re getting the character’s Unicode code, but the first 128 characters in the Unicode character set match the ASCII character set exactly.)

Since you will encrypt messages bit-by-bit, not character-by-character, you will need to expand each ASCII code into 7 1s and 0s that correspond to the code’s binary representation. (By the time ASCII became the universal standard for encoding characters on computers, most computers used 8-bit bytes. However the communication lines to keyboards, printers, screens, and other computers tended to be very unreliable. To help detect transmission errors, the ASCII standard only uses 7 bits to encode a character and leaves the 8th bit free for error-checking purposes.)


B. Helper Functions


To help implement your Codec library, begin by creating two helper functions. Remember, helper functions should be private!

private static int[] charToIntArray(char ch)
private static char intArrayToChar(int[] bitString) 

private static int[] charToIntArray(char ch) takes in a character and converts it to a binary representation of its ASCII value. Specifically, it will return an int[] where each element is a single bit (i.e. 1 or 0) in the ASCII encoding of the character ch using 7 bits.

For example, charToIntArray('C') should output the array { 1, 0, 0, 0, 0, 1, 1 } because ‘C’ has an ASCII value of 67 and 67 in binary is 1000011.

Hint: first focus on getting the ASCII representation of the char as an int (in decimal) and then work out how to find the 7 bits that represent that int. You can do this using standard arithmetic operations or bit-level operators. It’s exactly the same process we covered in lecture and recitation for converting decimal numbers to binary or hex.

private static char intArrayToChar(int[] bitString) does the opposite of charToIntArray(char ch). It takes in an int[] of length 7 where each element is a single bit (i.e. 1 or 0) in the ASCII encoding of a character ch and outputs the character that is encoded.

For example, intArrayToChar({ 1, 0, 0, 0, 0, 1, 1 }) should output the character 'C'.

Make sure you extensively test both helper functions before moving on. You should test each function individually as well as together (i.e. make sure that they perform the inverse operation of each other).


C. Encode and Decode


For your Codec.java library, begin by writing two functions, making use of the helper functions you created in the previous section.

public static int[] encode(String str) takes in a String and will return an int[] where each element is a single bit in the ASCII encoding of the String. Each character in str will encode into 7 bits, so the output array should be 7 times larger than str.length().

For example, encode("C") should output the array { 1, 0, 0, 0, 0, 1, 1 }

Hint: Use the helper functions written above to help with this.

You should test encode() and be confident before you move on to the next function.

public static String decode(int[] bits) will accept an array of ints representing a message in ASCII, and will return the decoded String. This is simply a reversal of the encode() process. Remember, every 7 bits corresponds to one char.

Once you have written encode() and decode(), you should be able to encode any String into binary and decode it back into the original string.


D. Encrypt and Decrypt


Your next task is to write functions to encrypt and decrypt messages by xor-ing them with a sequence of pseudo-random bits generated by the LFSR. The “password” is the set of parameters that define the LFSR (the seed and tap position).

For instance, consider the letter “C” with binary representation 1000011. Say we construct an LFSR with 01101000010 as the seed and 8 as the tap. This will produce a bit stream of 1100100100.

Note: The bit stream is not the changed register - it is the compiled string of pseudo-random bits that come from continually calling nextBit() many times on our register. You should use the output of nextBit() as your bit stream.

Encrypting the letter “C” (67 in ASCII, which is represented in binary as 1000011) with this LFSR would be done by:

message bits   1000011
random  bits   1100100
----------------------
encrypted bits 0100111

Note that the ith bit in the cipher is the XOR of the ith bit in the message and the ith bit in the random bit stream.

Write public static void encrypt(int[] message, String seed, int tapPosition) to perform this encryption. message is the message in binary to encrypt, and seed and tapPosition are the parameters for the LFSR to be used when encrypting. You should not be returning anything. Instead, you should be changing the message array in place. Implement the following error checks, in order:

  1. If the seed is null, throw an IllegalArgumentException with an appropriate error message.
  2. If the tap position is impossible, throw an IllegalArgumentException with an appropriate error message.
  3. If the seed contains any characters other than '0' and '1', throw an IllegalArgumentException with an appropriate error message.
  4. If the message is null, do nothing.
  5. If the message length is not a multiple of 7, throw an IllegalArgumentException with an appropriate error message.
  6. If any entry in the message array contains a value other than 0 or 1, throw an IllegalArgumentException with an appropriate error message.

Once you have tested and are confident in encrypt() you should write public static void decrypt(int[] cipher, String seed, int tapPosition). This function should be one line long.

Hint: Exclusive or is symmetric. In practice, this means that if c == r ^ m then m == r ^ c.


2. Image Steganography


A. Understanding the Problem


Now that you have written a more modern encryption library, you will use it for a practical application: steganography. Image steganography is the science of hiding secret messages inside of images. Think of it as 21st century disappearing ink. The casual observer simply sees an ordinary image; only someone who knows how to look for it will notice or find the message.

Image steganography has many practical uses, particularly for digital watermarking, where a message is hidden so that the image source can be tracked or verified. The FBI has even alleged that Russian intelligence services have used steganography to communicate with agents abroad.

You will implement a simple, but very effective form of image steganography. The idea is to fiddle with each pixel’s color in a way that isn’t perceivable to the human eye, but that the computer can interpret. Since the human eye is least sensitive to blue wavelengths, we’ll slightly adjust the amount of blue in each pixel in a way that encodes a single bit of information.

As far as the computer is concerned, an image is just a 2-D array of pixels. The color of each pixel is an integer between 0 and 16.8 million (224 - 1 to be precise), with 8 bits dedicated to each of the red, green, and blue primaries.

image colors

Flipping the ones bit of this number (i.e. subtracting 1 from an odd number or adding 1 to an even number) changes the amount of blue in the color by a minute amount that is indistinguishable to the human eye. These two blues are 001001011000010110101110 and 001001011000010110101111.

blues

Using your Codec.java library, you will be decoding a message hidden inside of an image.


B. ImageData.java


We have provided you with the ImageData.java library for reading and writing images as 2D integer arrays. ImageData.java provides three functions:

public static int[][] load(String filename)              // load image filename and return it as a 2-D array of (base 10) integers
public static void show(int[][] img)                     // display img in a window
public static void save(int[][] img, String filename)    // save img to filename

Each element in the 2D integer array corresponds to one pixel in the image. Refer to section 2A to see how each pixel is represented as an int.


C. Retrieve Message


RetrieveMessage.java should load the specified image using ImageData.load(), extract the embedded cipher, and print out the decrypted message. There are two ways that this class is called from the command line.

Note that for this class, we leave the implementation almost entirely up to you beyond the requirements for running the class above.

RetrieveMessage.java should accept three command-line arguments:

Read the bits from left to right, top to bottom, meaning start with the upper-left pixel, and work across rows. You should be extracting one bit per pixel (the least significant bit). In the 2D array returned by ImageData.load(), the first index corresponds to rows and the second index corresponds to columns. If the number of pixels in the image is not a multiple of 7, ignore the extra pixels at the end (i.e. for a 5x10 pixel image, extract from the first 49 pixels and ignore the last).

Once you have extracted, decrypted, and decoded the message, if there is a character whose numeric code is 0, only print out the characters up to, but not including that character. The 0 character is called the NULL character or ASCII NULL, and can be written in Java as ’\0’. Do not confuse the NULL character with the character ’0’, which corresponds to the ASCII code 48 and represents the digit zero. Also note that the ASCII NULL character is different from the null value in Java. You may decrypt characters one by one until you reach your NULL character, or you may decode and decrypt all possible characters, then search for a NULL in the result. If no NULL character is found, simply print the entire retrieved message. Using String.indexOf() and String.substring(beginIndex, endIndex) will be useful if you choose the second route.

Now add functionality so RetrieveMessage.java can take in only an image file and no seed or tap position. You should still be extracting and printing the embedded message, but not decrypting it.

In order to test your RetrieveMessage.java, we have provided stegosaur_embedded.png which has a hidden message which is not encrypted. We also have stegosaur_encrypted.png which contains the same hidden message, but this time encrypted with the seed 01101000010 and tap 8.


D. Hide Message


For the final part of the assignment, you will be encoding a message into binary, encrypting the message using an LFSR, and hiding the message in the least significant bits of an image.

Write HideMessage.java that should accept four command-line arguments:

HideMessage should load the specified image using ImageData.load(), encrypt the message using your Codec.java library, embed the encrypted cipher into the image and save it using ImageData.save(), and display the image result using ImageData.show(). When you embed the encrypted cipher into the image, you should set the least significant bit to be the bit being encoded, but make sure to not change any of the other bits.

Before the message is encrypted, it should be NULL terminated. This simply means adding the ASCII NULL character to the end, which can be done by adding "\0" to the end of the message String.

Notes:

Once you are displaying embedded messages, you should be able to save the image and retrieve the message using RetrieveMessage.java.


3. README and Submission


A. README


Complete readme_steg_pt2.txt in the same way you have done for previous assignments.


B. Submission


Submit Codec.java, RetrieveMessage.java, HideMessage.java and readme_steg_pt2.txt on Gradescope.

Before submission, comment out any print statements that were used for debugging or testing your functions and not any print statements that we asked you to insert.

Be sure that every method has an appropriate header comment, and that your code is well-documented.