The purpose of this assignment is to gain practice with object-oriented programming, bitwise operators, number systems, and cryptography. The specific goals are to:
At the end of the assignment, you will be able to encrypt and decrypt text using a pseudo-random number stream, and reveal messages hidden into images.
Encryption is the process of encoding messages or information that can only be decoded by people with authorization (usually a passkey or the like). Good encryption algorithms produce output that looks random to a bystander but is easily decipherable with the correct passkey.
In Hail, Caesar!, the key was a single character which shifted all characters in the message up the alphabet. This algorithm was far from perfect, and you demonstrated this by writing a function to crack the cipher without having authorization. This exploited the fact that the outputted cipher wasn’t entirely random – some letters appeared an abnormal amount, and this allowed you to find the offset passkey.
Over the two millennia since Julius Caesar, cryptography has gone a long way. Computers have played an enormous role in developing information communications as well as keeping those communications safe. In this assignment, you will implement and use a linear feedback shift register (LFSR) to create a stream of pseudo-random bits. This will lead to a significantly more random cipher.
Once you have written a crypto library, you will use it for a practical application: steganography — the practice of hiding secret messages in images in plain view.
First you will write an LFSR object. Once seeded, it produces a boundless stream of seemingly random bits (1s and 0s). The key feature is that given the same seed, the LFSR will produce the same stream of bits. This means that if Alice and Bob both know the same seed, they can produce identical random bit streams.
To encrypt the message, each bit in the message will
be exclusive ored with the LFSR’s sequence
of pseudo-random bits. The resulting cipher will appear to
be nonsense, like Kao y{u(x kp
}`1rkz~|g2rj`f@r)
, but when it is XORed
by the same sequence of bits, the encrypted
message can be decrypted into the original. Since an LFSR’s
bit stream is completely determined by its given parameters,
Alice can produce the same bit stream that Bob used to
encrypt a cipher if she knows the seed, and so can decrypt
the message through the same process. For this reason, an
LFSR encryption scheme is symmetric key encryption technique
(the alternative is an asymmetric scheme, in which different
passwords are used for encrypting and decrypting).
You will be writing four classes. Their APIs are listed below, but each will be elaborated on in later sections of the homework.
public class LFSR ----------------- public LFSR(String seed, int tapPosition); public LFSR(int seedLength, int tapPosition); public String toString() public int getTapPosition() public int nextBit()
public class Codec ------------------ public static int[] encode(String str); public static String decode(int[] bits); public static void encrypt(int[] message, String seed, int tapPosition); public static void decrypt(int[] cipher, String seed, int tapPosition)
public class RetrieveMessage ---------------------------- public static void main(String[] args)
public class HideMessage ------------------------ public static void main(String[] args)
In any class, you may add additional any functions and/or instance variables you like, but they must be private. No additional public functions or instance variables may be added.
Before we move onto writing code, it may be helpful to review number systems. You have been dealing with numbers your entire life - 10, 12, 100, 1, 50001. Something that is often missed in this numerical notation is the base of this number system which is obviously 10. For example 1210 = 1 * 101 + 2 * 100. Likewise 5030110 = 5 * 104 + 0 * 103 + 3 * 102 + 0 * 101 + 1 * 100. This is inherent to us, we have ten toes and ten fingers. However, computers only understand 1s and 0s (it's got to do with electric signals). Hence, we must be able to communicate all data to a computer in 1s and 0s.
The binary number system represents numbers to the base 2 and deals with 2's powers. For example 310 = 1 * 21 + 1 * 20 = 112. Likewise 5010 = 1 * 25 + 1 * 24 + 0 * 23 + 0 * 22 + 1 * 21 + 0 * 20 = 1100102. In this way we can translate numbers from the binary system to the decimal and vice-versa.
The diagram above illustrates an easy way to convert from decimal to binary. Here we use long division as a manner of conversion. In the diagram shown above, take a look at the two examples on the right. Either of these examples can be divided into four diagonals of number. Consider the rightmost example. The uppermost diagonal is the divisor which will always be 2. The second diagonal (43, 21, 10.....) is the diagonal of dividends/quotients and the bottom most diagonal (1, 1, 0.....) is the diagonal of remainders. The way we proceed is - we divide 43 by 2 which gives 21 as the quotient and 1 as the remainder. 21 then becomes the new dividend and we divide it by 2 to get 10 as the quotient as 1 as the remainder. Progressing in this manner, we end up with 1 as the dividend which we divide by 2 to get 0 as the quotient and 1 as the remainder. We now take the remainders in the opposite direction to get our binary number which is 1010112 for the decimal number 4310.
In the first part of this assignment, you will be creating a linear feedback shift register (LFSR). An LFSR is a structure that can produce a stream of pseudo-random bits, which has many practical uses, particularly in cryptography.
The LFSR consists of a register of bits and a tap position. The register is simply a list of bits that has a fixed size (which should suggest to you a good data structure to implement it). The tap position is simply an index in the register that will be used to create the pseudo-random bits. When we create an LFSR, we must seed it by providing the initial values in the register.
There are two steps in producing a pseudo-random bit with an LFSR:
The new least significant bit will be the pseudo-random bit produced by the LFSR.
Consider an example. The following figure shows an LFSR
seeded with the initial seed 01101000010
and tap
position 8
during the process of producing one
pseudo-random bit. Keep in mind that the tap position
is counted from the rightmost (least significant)
bit! This is opposite from the way positions are
counted in arrays and strings, but is consistent with the way
bit/digit positions are counted in numbers.
NOTE THAT THE ARRAY INDICES ARE FROM 10 (left) to 0 (right). This is opposite from how most arrays are diagramed
Before you begin writing your LFSR.java
class, you will want to decide how you want to represent the
shift register within your class. We are leaving this decision
up to you, and any working implementation will be
accepted. Here are our suggestions:
int[]
containing only 1
s
and 0
s;char[]
containing only '1'
s
and '0'
s;boolean[]
containing
only true
and false
;int
, where each bit represents a
position in the shift register. This is a more challenging
implementation, but will give you a chance to play around
with and understand bits and binary better. In this case,
your shift register will be limited to 32 bits, and you will
need to use the bitwise operators &
(bitwise AND), |
(bitwise
OR), <<
(shift left),
and >>
(shift right).Your LFSR class will implement the API described below. This section will provide the details for the constructor and methods:
public class LFSR ----------------- public LFSR(String seed, int tapPosition) // constructor public LFSR(int seedLength, int tapPosition) // constructor for a random seed public String toString() // string representation of the LFSR public int getTapPosition() // return the tap position public int nextBit() // return a random bit and update the LFSR
public LFSR(String seed, int tapPosition)
takes a String
parameter seed
whose
characters are a sequence of 0
s
and 1
s, and an int
parameter tapPosition
specifying which position
in the register to use as the tap.
The constructor should throw a RuntimeException
with a useful error message if seed
contains
any characters other than 0
or 1
,
if tapPosition
refers to an impossible position
in the register (for example, -1
), or
if seed
is null
.
Note: remember
the String.charAt()
method, which will be helpful
when parsing through seed
.
public LFSR(int seedLength, int tapPosition)
takes an int
parameter seedLength
,
and generates a random seed (i.e. a random string
of 0
s and 1
s of
length seedLength
.
The constructor should throw
a RuntimeException
with a useful error message
if seedLength
is not positive or
if tapPosition
refers to an impossible position
in the register.
public String toString()
returns the current
bit sequence in the shift register as a String
of 1
s and 0
s. For example:
Should printLFSR lfsr = new LFSR("101011", 3); System.out.println(lfsr.toString());
101011
.
This method will help you when debugging.
Notes:
toString()
method works! If
it doesn't, all of our automated tests will fail and you
will lose lots and lots of points.Integer.toBinaryString()
method within toString()
to quickly get a
working version. You will not receive credit for
the toString()
method if your final version
calls Integer.toBinaryString()
because we
want you to implement this yourself. However you will
still be able to receive full credit on the rest of the
assignment. Therefore it is much better to
use Integer.toBinaryString()
in
your toString()
method than it is to submit
your code with a non-working toString()
(which will affect your grade on all the other methods as
well).public int getTapPosition()
returns the tap
position, as given by the constructor.
public int nextBit()
performs one step of the
LFSR, as described in section 1A and returns the least
significant bit (the rightmost bit) in the shift
register after the step has been performed as
an int
with the value 0
or1
.
For testing, you should ensure you can run
LFSR lfsr = new LFSR("01101000010", 8); for (int i = 0; i < 10; i++) { int bit = lfsr.nextBit(); System.out.println(lfsr.toString() + " " + bit); }
should print
11010000101 1 10100001011 1 01000010110 0 10000101100 0 00001011001 1 00010110010 0 00101100100 0 01011001001 1 10110010010 0 01100100100 0
Before moving on, you should have
tested all methods in LFSR.java
and be
confident that it is correctly implemented.
Before we can begin encrypting and decrypting using our
LFSR, we need to implement two functions to encode and
decode our messages into ASCII. For instance, the
string SENDMONEY
maps to the following sequence
of ASCII codes:
Remember that you can convert characters in a{ 83, 69, 78, 68, 77, 79, 78, 69, 89 }
String
to their ASCII codes quite easily: use
the charAt()
method to extract a single
character, and cast the result to an int
.
(Technically, you're getting the character's Unicode code,
but the first 128 characters in the Unicode character set
match the ASCII character set exactly.)
Since you will encrypt messages bit-by-bit, not character-by-character, you will need to expand each ASCII code into 7 1s and 0s that correspond to the code's binary representation. (By the time ASCII became the universal standard for encoding characters on computers, most computers used 8-bit bytes. However the communication lines to keyboards, printers, screens, and other computers tended to be very unreliable. To help detect transmission errors, the ASCII standard only uses 7 bits to encode a character and leaves the 8th bit free for error-checking purposes.)
For your Codec.java
library, begin by writing two
functions:
public static int[] encode(String str)
takes in
a String
and will return an int[]
where each
element is a single bit in the ASCII encoding of
the String
. Each character in str
will
encode into 7 bits, so the output array should be 7 times larger
than str.length()
.
For example, encode("C")
should output the
array { 1, 0, 0, 0, 0, 1, 1 }
str
is null
, encode()
should
return null
.str
has a unicode
value >127 (i.e. the character is not part of the ASCII
subset of Unicode), your program should throw
a RuntimeException
with an appropriate error
message. Your RingBuffer.java
file from the
Harper's Bazaar assignment contains examples you can refer
to.Hint: once you have the int
ASCII
representation of a char
, you will need to find
the 7 bits representing that number. You can do this using
standard arithmetic operations or bit-level operators. It's
exactly the same process we covered in lecture for converted
decimal numbers to binary or hex. You’ll need to think
about how to do this exactly, but here are some hints:
0000011
has
a 1
as the least significant bit. Consider what kind
of numbers have 1
and what kind of numbers
have 0
.0011000
and 0001100
(begin by figuring out what these are in
decimal!).You should test encode()
and be confident
before you move on to the next function.
public static String decode(int[] bits)
will accept
an array of int
representing a message in ASCII, and
will return the decoded String
. This is simply a
reversal of the encode()
process. Remember, every 7
bits corresponds to one char
.
bits
is null
, return null
.bits.length
is not a multiple of 7,
throw a RuntimeException
with an appropriate
error message.bits
contains any values other than 0
and 1, throw a RuntimeException
with an
appropriate error message.Once you have written encode()
and decode()
, you should be able to encode
any String
into binary and decode it back into
the original string.
Your next task is to write functions to encrypt and decrypt messages by xor-ing them with a sequence of pseudo-random bits generated by the LFSR. The "password" is the set of parameters that define the LFS (the seed and tap position).
For instance, consider the letter “C" with binary
representation 1000011
. Say we construct an
LFSR with 01101000010
as the seed and 8
as the tap. This will
produce a bit stream of 1100100100
. Encrypting
the letter “C" (67 in ASCII, which is represented in binary
as 1000011) with this LFSR would be done by:
message bits 1000011 random bits 1100100 ---------------------- encrypted bits 0100111
Note that the ith bit in the cipher is the XOR of the ith bit in the message and the ith bit in the random bit stream.
Write public static void encrypt(int[] message,
String seed, int tapPosition)
to perform this
encryption. message
is the message in binary
to encrypt, and seed
and tapPosition
are the parameters for the
LFSR to be used when encrypting. You should not be
returning anything. Instead, you should be changing
the message
array in place. Implement
the following error checks, in order:
null
, throw
a RuntimeException
with an appropriate error
message;RuntimeException
with an appropriate error
message;'0'
and '1'
, throw
a RuntimeException
with an appropriate error
message;null
, do nothing;
RuntimeException
with an appropriate error
message;0
or 1
, throw
a RuntimeException
with an appropriate error
message;Once you have tested and are confident
in encrypt()
you should
write public static void decrypt(int[] cipher, String seed, int tapPosition)
. This
function should be one line long.
Hint: Exclusive or is symmetric. In practice, this
means that if c == r ^ m
then m == r ^ c
.
Now that you have written a more modern encryption library, you will use it for a practical application: steganography. Image steganography is the science of hiding secret messages inside of images. Think of it as 21st century disappearing ink. The casual observer simply sees an ordinary image; only someone who knows how to look for it will notice or find the message.
Image steganography has many practical uses, particularly for digital watermarking, where a message is hidden so that the image source can be tracked or verified. The FBI has even alleged that Russian intelligence services have used steganography to communicate with agents abroad.
You will implement a simple, but very effective form of image steganography. The idea is to fiddle with each pixel's color in a way that isn't perceivable to the human eye, but that the computer can interpret. Since the human eye is least sensitive to blue wavelengths, we'll slightly adjust the amount of blue in each pixel in a way that encodes a single bit of information.
As far as the computer is concerned, an image is just a 2-D array of pixels. The color of each pixel is an integer between 0 and 16.8 million (224 - 1 to be precise), with 8 bits dedicated to each of the red, green, and blue primaries.
Flipping the ones bit of this number (i.e. subtracting 1
from an odd number or adding 1 to an even number) changes
the amount of blue in the color by a minute amount that is
indistinguishable to the human eye. These two blues
are 001001011000010110101110
and 001001011000010110101111
:
Using your Codec.java
library, you will be
decoding a message hidden inside of an image.
ImageData.java
We have provided you with
the ImageData.java
library for reading and writing images as 2D integer
arrays. ImageData.java
provides three
functions:
public static int[][] load(String filename) // load image filename and return it as a 2-D array of integers public static void show(int[][] img) // display img in a window public static void save(int[][] img, String filename) // save img to filename
Each element in the 2D integer array corresponds to one
pixel in the image. Refer to section 3A to see how each
pixel is represented as an int
.
First, write RetrieveMessage.java
that should
accept three command-line arguments:
0
s
and 1
sRetrieveMessage.java
should load the specified
image using ImageData.load()
, extract the
embedded cipher, and print out the decrypted message.
Read the bits from left to right, top to bottom, meaning start with the upper-left pixel, and work across rows. You should be extracting one bit per pixel (the least significant bit). If the number of pixels in the image is not a multiple of 7, ignore the extra pixels at the end (i.e. for a 5x10 pixel image, extract from the first 49 pixels and ignore the last).
Once you have extracted and decrypted the message, if there
is a character whose numeric code is 0
, only
print out the characters up to, but not including
that character. The 0
character is called
the NULL character or ASCII NULL, and can be
written in Java as ’\0’
. Do not confuse the
NULL character with the character ’0’
, which
corresponds to the ASCII code 48
and represents
the digit zero. You may decrypt characters one by one until
you reach your NULL character, or you may decode and
decrypt all possible characters, then search for a
NULL in the result. Using String.indexOf()
and String.substring()
will be useful if you
choose the second route.
Notes:
seed
or tapPosition
command line arguments are given (there is only one
arguments), retrieve the message and print without
decrypting.In order to test your RetrieveMessage.java
, we
have
provided stegosaur_embedded.png
which has a hidden message which is not encrypted. We
also
have stegosaur_encrypted.png
which contains the same hidden message, but this time
encrypted with the seed 01101000010
and
tap 8
.
You will be able to test more after completing the next part.
Write HideMessage.java
that
should accept four command-line arguments:
0
s
and 1
sHideMessage
should load the specified image
using ImageData.load()
, encrypt the message
using your Codec.java
library, embed the
encrypted cipher into the image, and display the image
result using ImageData.show()
.
Before the message is encrypted, it should be NULL
terminated. This simply means adding the ASCII
NULL character to the end, which can be done by
adding "\0"
to the end of the
message String
.
Notes:
seed
or tapPosition
command line arguments are
given (there are only two arguments), simply embed the
message without encrypting.RuntimeException
with an appropriate error
message.In
class to read from a file, just
like in the N-Body assignment. There is a
convenient readAll()
method you can use to
return the entire contents of the file as one
big String
.Once you are displaying embedded messages, you should be
able to save the image and retrieve the message
using RetrieveMessage.java
.
Complete readme_steg.txt
in the same way you have done for previous assignments.
Submit LFSR.java
, Codec.java
, RetrieveMessage.java
, HideMessage.java
,
and readme_steg.txt
on the course website.
Before submission, comment out any print statements that were used for debugging or testing your functions and not any print statements that we asked you to insert.
Be sure that every method has an appropriate header comment, and that your code is well-documented.