CIT 593: Computer Systems I

Homework #7: LC-3 Assembler, Part I

Due December 5, 2011, 3:00pm


Over the course of the next two homework assignments, you will write a C implementation of a simple LC-3 assembler that supports almost all of the ISA. Users of the program will provide a text file containing the LC-3 assembly language instructions, and your assembler will read the file, create a symbol table, and then convert the instructions to machine language.

This may sound like a fairly daunting task. And, well, it kind of is. But it's not so bad if you do it step-by-step, tackling the different parts of the program before finally putting it all together at the end. Which is why you'll do some of the core parts now, and then finish it in the next homework assignment.

To simplify things a bit, we've provided "skeleton code" which contains headers for the different functions that you need to implement, and test code that you can run to make sure everything's working correctly. Your job is just to implement the different functions.


Part I (15 points)
Your LC-3 assembler will read instructions such as "ADD R0 R1 R2" or "LD R0 DATA" from the input file. In order to figure out which operation to encode, and what operands should be used, your program will need to separate those strings into their constituent substrings.

In the first part of this assignment, you will write a function "parse" that takes two parameters: a string s, and an array of strings called strings. The string s will contain anywhere from one to five substrings, separated by whitespace. The function should parse s and put the substrings into strings.

For instance, if s is the string "this is a test!", then after the function executes, the array strings should contain elements "this", "is", "a", and "test!". Note that strings is not the return value of the function; rather, it is a parameter that gets modified by the function.

You can assume that each substring in s will need no more than 10 characters of storage, including the terminating null character; that is, each substring has at most 9 characters of actual text (this limits what we can use for labels, but it's okay for now). You can also assume that there will never be more than five substrings in s.

The return value of the function "parse" should be the number of substrings that were put into strings. However, if s is null or an empty string, the function should return 0.

To help get you started, we have provided skeleton code that contains the function header. Please do not change the function name, parameters, etc.

To check that your code is working correctly, you may write your own tests, of course, but your code must pass these tests on the SEAS UNIX machines in order to get full credit. Note that the tests we provide you include a case in which the input string s ends with a newline character (ASCII value #10). Your "parse" function must be able to ignore the newline and not treat it as a separate word.
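One possible shape for the splitting logic described above is sketched here. It is only an illustration, not the required solution: the signature (a const char pointer and a 5-by-10 array of characters) and the names MAX_WORDS and MAX_LEN are assumptions, and the header in the provided skeleton code takes precedence. The sketch scans s by hand with isspace rather than modifying it, so a trailing newline is skipped like any other whitespace.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_WORDS 5   /* assumed name: at most five substrings in s */
#define MAX_LEN   10  /* assumed name: ten characters, including the null */

/* Assumed signature; the real header is the one in the skeleton code. */
int parse(const char *s, char strings[MAX_WORDS][MAX_LEN])
{
    int count = 0;

    /* A null pointer yields 0; an empty string falls out of the loop with 0. */
    if (s == NULL) {
        return 0;
    }

    while (*s != '\0' && count < MAX_WORDS) {
        /* Skip whitespace between substrings; '\n' counts as whitespace. */
        while (*s != '\0' && isspace((unsigned char)*s)) {
            s++;
        }
        if (*s == '\0') {
            break;
        }

        /* Copy one substring, leaving room for the terminating null. */
        int len = 0;
        while (*s != '\0' && !isspace((unsigned char)*s)) {
            if (len < MAX_LEN - 1) {
                strings[count][len++] = *s;
            }
            s++;
        }
        strings[count][len] = '\0';
        count++;
    }

    return count;
}
```

With this sketch, calling parse on "ADD R0 R1 R2\n" fills the first four rows of the array and returns 4; the newline never becomes a word of its own.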

Helpful hints:


Part II (15 points)
Your LC-3 assembler will represent the program's symbol table by using a linked list. Each node in the list will represent one entry in the symbol table, and will contain the label and the offset from the beginning of the program (note that we don't need to know the actual address, just the offset).
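To make the idea of a symbol-table node concrete, here is a minimal sketch of a singly linked list that stores a label and its offset. Everything named here (SymbolNode, LABEL_LEN, symbol_add, symbol_find, symbol_free) is hypothetical and chosen only for illustration; the Node struct and function headers in the provided skeleton code are the ones you must actually use.

```c
#include <stdlib.h>
#include <string.h>

#define LABEL_LEN 10  /* assumed: matches the 10-character substring limit */

/* Hypothetical node layout; the skeleton's Node struct is authoritative. */
typedef struct SymbolNode {
    char label[LABEL_LEN];     /* label text, null-terminated */
    int offset;                /* offset from the beginning of the program */
    struct SymbolNode *next;   /* next entry, or NULL at the end of the list */
} SymbolNode;

/* Add an entry at the front of the list. Returns 1 on success, 0 on failure. */
int symbol_add(SymbolNode **head, const char *label, int offset)
{
    SymbolNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    /* Copy at most LABEL_LEN - 1 characters and always terminate. */
    strncpy(node->label, label, LABEL_LEN - 1);
    node->label[LABEL_LEN - 1] = '\0';
    node->offset = offset;
    node->next = *head;
    *head = node;
    return 1;
}

/* Return the offset stored for label, or -1 if the label is not present. */
int symbol_find(const SymbolNode *head, const char *label)
{
    for (; head != NULL; head = head->next) {
        if (strcmp(head->label, label) == 0) {
            return head->offset;
        }
    }
    return -1;
}

/* Release every node in the list. */
void symbol_free(SymbolNode *head)
{
    while (head != NULL) {
        SymbolNode *next = head->next;
        free(head);
        head = next;
    }
}
```

Inserting at the front keeps symbol_add constant-time; the order of entries does not matter for a lookup by label, which walks the whole list anyway.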

In this part of the assignment, you will implement the following linked list functions:

To help get you started, we have provided skeleton code that contains the definition of the Node struct, all function headers, and some other variables that you'll need to use. Please do not change the definition of the Node struct, any function names, parameters, etc.

To check that your code is working correctly, you may write your own tests, of course, but your code must pass these tests on the SEAS UNIX machines in order to get full credit.

Helpful hints:


Part III (20 points)
Last, you'll write the code to create the symbol table, which requires reading the program file one line at a time.

Start with this skeleton code, which has a function "create_symbol_table". The argument to the function is a string representing a filename; the function opens the file, reads one line, and prints it to the screen. Note that it assumes that lines are at most 100 characters long (which is okay for us), and that the character array line will include the newline character at the end.
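Reading the whole file one line at a time, rather than just the first line, might look roughly like the sketch below. The helper name print_lines and the buffer size LINE_LEN are assumptions made for this illustration and do not come from the skeleton code; the sketch only prints and counts lines, and does not build a symbol table.

```c
#include <stdio.h>

#define LINE_LEN 102  /* assumed: 100 characters + newline + terminating null */

/* Hypothetical helper: print every line of the file.
   Returns the number of lines read, or -1 if the file cannot be opened. */
int print_lines(const char *filename)
{
    char line[LINE_LEN];
    int count = 0;

    FILE *fp = fopen(filename, "r");
    if (fp == NULL) {
        return -1;
    }

    /* fgets keeps the trailing newline in line, so none is added here. */
    while (fgets(line, sizeof line, fp) != NULL) {
        fputs(line, stdout);
        count++;
    }

    fclose(fp);
    return count;
}
```

Because fgets stops at the end of each line, the loop body is a natural place to hand line to a splitting function and inspect the first substring. Note that the count is only a true line count while every line fits in the buffer.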

Modify the sample code so that:

For simplicity, you can assume that:

To test your code, create a file called symboltabletest.c with a "main" function (i.e., do not put "main" in symboltable.c) that reads in the name of the LC-3 assembly language file as a command-line argument, and passes that to your create_symbol_table function. After create_symbol_table returns, call the "iterate" function in linkedlist.c to iterate over the linked list and print out all the symbols/labels and their offsets.


Submitting the homework
You will submit this homework through Blackboard, as usual.

You should submit parse.c, linkedlist.c, symboltable.c, and symboltabletest.c. Do not change the function headers of any of the functions that have been provided to you in the skeleton code. You may add "helper functions" if necessary, but do not include any of your own test code or "main" functions.

Also, you must submit a Makefile that the TAs will use to compile your program. Your Makefile should have options for compiling the parsetest, linkedlisttest, and symboltabletest programs.

Last, please submit a plain-text readme file that describes any known issues in the programs, such as tests that the code does not pass.

Please zip (or tar) all files together and then submit.

This homework may be submitted late, subject to the standard penalty of 10% per day. However, you may not get an extension on homework #8 if you submit this assignment late.