CIS 110Hamming Codes in TOY |
Programming Assignment |

Write a TOY program to encode data using Hamming codes. Then write a TOY program to correct encoded data that has been corrupted.

**Perspective.**
*Error-correcting codes* enable data to be sent through a
noisy communication channel without corruption.
To accomplish this, the sender appends redundant
information to the message, so that even if some of the original
data is corrupted during transmission, the receiver can still recover
the original message intact.
Transmission errors are common and can arise from scratches on a CD,
static on a cell phone, or atmospheric interference.
In a noisy environment, error-correcting codes can increase the throughput of
a communication link since there is no need to retransmit the message
if it becomes partially corrupted during transmission.
For this reason, error-correcting codes are used in many common systems
including:
storage devices (CD, DVD, DRAM), mobile communication (cell phones,
wireless, microwave links), digital television,
and high-speed modems (ADSL, xDSL).

**Hamming Codes. **
A *Hamming code* is a specific type of
error-correcting code that allows the detection and correction
of *single-bit* transmission errors.
Hamming codes are used in many applications where such errors are
common, including
DRAM memory chips and satellite communication hardware.
Hamming codes work by repeatedly reading four *message bits*,
which we denote by *m1*, *m2*, *m3*, *m4*, and then
inserting three *parity bits*, which we denote by
*p1*, *p2*, and *p3*.
If any one of these seven bits is corrupted during transmission, the receiver
can detect the error and recover the original four message bits intact.
This is called *single-bit error correction* because at most one
bit can be corrected per unit of data sent.
The overhead for using this method is a 75% increase in bandwidth
because it requires three extra parity bits for every four message bits.
This compares favorably with the naive approach of sending three copies of
each bit (and using the one that appears most frequently), which results
in a 200% increase in bandwidth.

Before we describe the algebra of Hamming codes,
we first visualize the calculation of the parity bits using Venn diagrams.
As an example, suppose we wish to send the message `1101`.
We associate each of the four message bits with a specific intersection region of three
pairwise overlapping circles, as illustrated below:

The Hamming code adds three parity bits so that each of the three circles has

That is, the sum of the four bits contained in each of the three circles is even:

In this case, to send the message

Now imagine this picture is transmitted over a noisy communication channel, and that one bit is corrupted so that the following picture arrives at the receiving station (corresponding to 1001100):

The receiver discovers that an error occurred by checking the parity of the three circles. Moreover, the receiver can identifySince the parity check for the top circle and right circle failed, but the left circle was ok, there is only one bit that could be responsible, and that's

Of course, in practice, only the 7 bits are transmitted, rather than the circle diagrams.

**Part 1.**
Write a TOY program `encode.toy` to encode a binary message
using the scheme described above. Your program should *repeatedly* read four
bits
*m1*,
*m2*,
*m3*, and
*m4*
from TOY standard input until reading in `FFFF` and write the seven bits
*m1*,
*m2*,
*m3*,
*m4*,
*p1*,
*p2*,
*p3*
to TOY standard output, where

**Part 2.**
Write a TOY program
`decode.toy` to correct a Hamming encoded message. Your program should
*repeatedly* read seven bits
*m1*,
*m2*,
*m3*,
*m4*,
*p1*,
*p2*,
*p3*
from TOY standard input until reading in `FFFF`
and write four bits to TOY standard output.
Recall, to determine which one, if any, of the message bits is
corrupted, perform the following three parity checks:

*p1*=*m1*^*m2*^*m4**p2*=*m1*^*m3*^*m4**p3*=*m2*^*m3*^*m4*

**Input format.** For convenience, we'll transmit each bit
individually (as `0000` or `0001`), instead of packing
16 bits per TOY word as would be done in a real application. Your
programs should repeatedly read in four or seven bits from standard
input until `FFFF` appears on TOY standard input. The number
of transmitted bits (not including the terminating `FFFF`) will
be a multiple of four when encoding and a multiple of seven when
decoding. You do not need to do any error checking on the data files.
The data.zip contains a few sample input files
for `encode.toy` and `decode.toy`:

%more encode3.txt%more decode3.txt0001 0001 0000 0001 0001 0000 0000 0001 0001 0000 0000 0001 0001 0001 0000 0000 0001 0001 0000 0000 0000 0000 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0000 FFFF FFFF

**Output format.** The output format should be the same as the
input format, i.e. a series of bits encoded as `0000`
or `0001` followed by `FFFF`.

**TOY simulator. **
To execute your TOY program, use either the command-line
simulator TOY.java
or the graphical
Visual X-TOY simulator.

%java TOY encode.toy < encode3.txt%java TOY decode.toy < decode3.txt... ... Terminal Terminal --------------------------------------- --------------------------------------- 0001 0001 0001 0001 0000 0000 0001 0001 0001 0001 0000 0001 0000 0001 0001 0000 0001 0001 0001 0001 0000 0001 0000 0001 0000 FFFF 0000 ... 0001 0001 0001 0001 0001 0001 0001 FFFF ...

**Checklist. ** Don't forget to read
the checklist. Many of the
Q&As are repeated from the TOY Encryption assignment since they
apply to this one too, but there are also Hamming-specific hints and
suggestions.

**Submission. ** Submit `encode.toy`,
`decode.toy`,
and readme_hamming.txt. Be sure to
comment your programs to make it clear to the grader what you are
doing and what each variable represents.

**Extra credit.** Rewrite `decode.toy` using the fewest
number of (nonzero) words of TOY main memory. We will award extra
credit for any solution that uses 40 (in decimal) or fewer word of
memory. Be sure to save your working version before attempting this!