cis3990-25fa (Fall 2025)

HW 01: Linked List and Hash Table

This assignment aims to be a rigorous refresher on pointers, dynamic memory allocation, and C-style programming.

Goals

Collaboration

You will complete each assignment in CIS 3990 on your own. You may discuss high-level ideas with other students, but any viewing, sharing, copying, or dictating of code is forbidden. If you are worried about whether something violates academic integrity, please post on Ed or contact the instructor(s).

Setup

You can download the starter files into your environment by running the following command. You should run it from inside your git repository since that is how you will submit your code.

curl -o llht_starter.zip https://www.seas.upenn.edu/~tqmcgaha/cis3990/current/projects/code/llht_starter.zip

You can also download the files manually here if you would like: llht_starter.zip

From here, you need to extract the files by running

unzip llht_starter.zip

You should now have a folder called llht that has the starter files. We expect this to be a folder inside the base of your github repository, similar to cowsay and check-setup. If you accidentally put the folder somewhere else, you can move it with the mv command in the terminal.

Starter Files

In the zip you will find the following files:

Part 1 Files:

Part 2 Files:

We recommend reading all of the files for a part before you start that part, especially the .hpp files listed. All of the code you will submit for this assignment should be contained in LinkedList.cpp and HashTable.cpp. This means that your code should work with the other files as they are when initially supplied. Note that you can modify the other files if you like, but we will be testing your LinkedList.cpp and HashTable.cpp against unmodified versions of the files.

Overview

In this assignment we want you to program a data structure in a mostly C-style way. This course focuses a lot on C++, but it is important to get the fundamentals of C down since C is the foundation C++ is built on, and some of you may have to work in environments that only use C or write C++ in a very “C” style.

If you’ve programmed in higher-level programming languages like Java, Python, or others, then you probably made use of a large standard library, filled with data structure implementations ready for you to use. In C, we have to make data structures ourselves or use an external library. In this assignment, we will be implementing two fairly common data structures, a linked list and a hash table, along with an iterator for each.

However, C is missing many common features of high-level programming languages, which will make this difficult. In particular, we want to make a generic data structure, with a private implementation, that doesn’t have any memory leaks.

This assignment is broken into two parts. First you will implement the LinkedList and its corresponding iterator, then you will move on to the HashTable and its iterator.

Part 1: LinkedList

The linked list will be “doubly linked”, which means each node has both a next pointer and a prev pointer to access both the node before it and after it in the list.

At a high level, doubly-linked lists usually look like:

ll_abstract

Note that each node contains a pointer to the next node, a pointer to the previous node, and the payload (data) that is stored in that node. We also use NULL as the value of a node's prev or next pointer to indicate the absence of a previous or next node.

However, because our goal is to make a full data structure, we don’t want to just give the users of our code LinkedListNodes and tell them to manage the nodes themselves.

As a result, we are going to have a struct called LinkedList that is dedicated to organizing the nodes into a list. LinkedList contains a pointer to the front or “head” of the list, a pointer to the end or “tail” of the list, and the current number of elements in the list. We can update our drawing to look like this:

LinkedList
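To make the layout above concrete, here is a minimal sketch of a node struct, a managing list struct, and a push onto the front of the list. All names here (SketchNode, SketchList, SketchList_Push, and so on) are hypothetical and chosen only for illustration; the real struct definitions and function contracts are the ones in the starter header files, and they may differ from this sketch.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical names for illustration only; see the starter headers
// for the real definitions.
typedef void* SketchPayload;

struct SketchNode {
  SketchPayload payload;
  SketchNode* next;  // nullptr when this is the last node
  SketchNode* prev;  // nullptr when this is the first node
};

struct SketchList {
  int num_elements;
  SketchNode* head;
  SketchNode* tail;
};

// Allocates an empty list. The caller releases it with SketchList_Free.
SketchList* SketchList_Allocate() {
  SketchList* list = static_cast<SketchList*>(malloc(sizeof(SketchList)));
  assert(list != nullptr);
  list->num_elements = 0;
  list->head = nullptr;
  list->tail = nullptr;
  return list;
}

// Adds a new node holding `payload` at the front of the list.
void SketchList_Push(SketchList* list, SketchPayload payload) {
  assert(list != nullptr);
  SketchNode* node = static_cast<SketchNode*>(malloc(sizeof(SketchNode)));
  assert(node != nullptr);
  node->payload = payload;
  node->prev = nullptr;
  node->next = list->head;
  if (list->head == nullptr) {
    list->tail = node;  // first element is both head and tail
  } else {
    list->head->prev = node;
  }
  list->head = node;
  list->num_elements += 1;
}

// Frees every node and the list itself. Payloads are not freed here.
void SketchList_Free(SketchList* list) {
  assert(list != nullptr);
  SketchNode* cur = list->head;
  while (cur != nullptr) {
    SketchNode* next = cur->next;
    free(cur);
    cur = next;
  }
  free(list);
}
```

Note how the push has to handle the empty-list case separately: when there is no head yet, the new node becomes the tail as well.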

We also want to provide an iterator so that clients can read all the elements in the list, so we will need a third struct called LLIterator. LLIterator contains a pointer to the node it currently refers to and a pointer to the LinkedList struct that manages the nodes we are iterating over. The iterator keeps a pointer to the list itself because removing a node with LLIterator_Remove may require updating the LinkedList: its num_elements must change, and possibly other data as well.

If we add our iterator to the drawing, we get:

LLIterator
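The sketch below illustrates why an iterator needs a pointer back to its list: removing the current node can change the list's head, tail, and element count. The names (SketchIter, SketchIter_Remove, SketchList_Append) are hypothetical, and the behavior after a removal (move to the successor, or to the predecessor if the tail was removed) is an assumption made for this example. The actual contract of LLIterator_Remove is whatever the starter header documents.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical types for illustration; not the starter code's definitions.
struct SketchNode {
  void* payload;
  SketchNode* next;
  SketchNode* prev;
};

struct SketchList {
  int num_elements;
  SketchNode* head;
  SketchNode* tail;
};

struct SketchIter {
  SketchList* list;  // the list that owns the nodes
  SketchNode* node;  // current position, nullptr when invalid
};

// Helper so the example can build a list: adds a node at the tail.
void SketchList_Append(SketchList* list, void* payload) {
  SketchNode* node = static_cast<SketchNode*>(malloc(sizeof(SketchNode)));
  assert(node != nullptr);
  node->payload = payload;
  node->next = nullptr;
  node->prev = list->tail;
  if (list->tail == nullptr) {
    list->head = node;
  } else {
    list->tail->next = node;
  }
  list->tail = node;
  list->num_elements += 1;
}

// Unlinks and frees the current node, then moves the iterator to the
// successor (or to the predecessor if the tail was removed).
// Returns false if the list became empty. (Assumed semantics.)
bool SketchIter_Remove(SketchIter* iter) {
  assert(iter != nullptr && iter->node != nullptr);
  SketchList* list = iter->list;
  SketchNode* gone = iter->node;
  if (gone->prev == nullptr) {
    list->head = gone->next;  // removed the head
  } else {
    gone->prev->next = gone->next;
  }
  if (gone->next == nullptr) {
    list->tail = gone->prev;  // removed the tail
  } else {
    gone->next->prev = gone->prev;
  }
  list->num_elements -= 1;
  iter->node = (gone->next != nullptr) ? gone->next : gone->prev;
  free(gone);
  return iter->node != nullptr;
}
```

Without the list pointer, the iterator could relink neighboring nodes but would have no way to fix head, tail, or num_elements.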

In summary, there are three struct types that we will be using in this part:

We will also be using the type LLPayload_t, which is a typedef (i.e., an alias) for a void *. As mentioned before, this gives users some flexibility in what they store in the linked list.

Here we have given an overview of the LinkedList structure. Your job will be to implement the functions that support the implementation of our data structure.
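As a rough illustration of how a void * payload keeps a container generic, the sketch below stores heap-allocated ints behind a void * alias and releases them through a client-supplied callback. SketchPayload, SketchFreeFn, and the helper functions are hypothetical names, and the free-callback pattern is an assumption about how such a design commonly avoids leaks, not a description of the starter code's API.

```cpp
#include <cassert>
#include <cstdlib>

typedef void* SketchPayload;                       // hypothetical alias
typedef void (*SketchFreeFn)(SketchPayload item);  // hypothetical callback

// Wraps an int in a heap allocation so it can be stored as a payload.
SketchPayload MakeIntPayload(int value) {
  int* slot = static_cast<int*>(malloc(sizeof(int)));
  assert(slot != nullptr);
  *slot = value;
  return slot;
}

// Only the client knows the real type, so it casts back before reading.
int ReadIntPayload(SketchPayload item) {
  return *static_cast<int*>(item);
}

// Client-provided cleanup for payloads created by MakeIntPayload.
void FreeIntPayload(SketchPayload item) {
  free(item);
}

// Generic code cannot know how to free a payload, so it calls the
// client's callback on each one. Returns how many were released.
int ReleaseAll(SketchPayload* items, int count, SketchFreeFn free_fn) {
  int released = 0;
  for (int i = 0; i < count; i++) {
    free_fn(items[i]);
    items[i] = nullptr;
    released++;
  }
  return released;
}
```

The container never dereferences the payload itself; it only stores and hands back the pointer, which is what lets one implementation hold any type.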

Part 2: HashTable

For part 2, we are implementing a chained hash table. A HashTable allows the user to insert a key/value pair, look up already existing key/value pairs, and remove them. Note that the HashTable is a more complex data structure, and probably something you are not as familiar with. This is OK; this is not a data structures course. Please let us know if you have questions on the structure of the HashTable and iterator.

The HashTable structure follows a “chaining” design. This means that our table maintains several LinkedLists, which we call “chains” or “buckets”. When a user inserts a key/value pair, the pair will also come with some hash value of type HTHash_t. The hash table will use a deterministic function to associate that HTHash_t with one of the buckets. We can then insert the key/value pair into our HashTable by inserting it into that bucket. If we later want to look up or remove that key/value pair, we can simply look through the bucket associated with that key and avoid looking through the rest of the buckets in our hash table.
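One common way to map a hash value to a bucket is to take the hash modulo the number of buckets. The sketch below assumes that approach; SketchHash is a hypothetical stand-in for HTHash_t, and BucketIndex is a made-up name. The function the starter code actually uses may be different.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t SketchHash;  // hypothetical stand-in for HTHash_t

// Deterministic mapping: the same hash and bucket count always give
// the same bucket index, in the range [0, num_buckets).
int BucketIndex(SketchHash hash, int num_buckets) {
  assert(num_buckets > 0);
  return static_cast<int>(hash % static_cast<SketchHash>(num_buckets));
}
```

Because the mapping is deterministic, a lookup or removal can recompute the index from the key's hash and search only that one bucket.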

One important case to note, though, is that if we insert a key/value pair whose key already exists in the HashTable, we return a copy of the existing key/value pair to the caller and replace it in the HashTable with the new key/value pair.
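The replace-on-insert behavior can be sketched on a single bucket. In this example the bucket is a simple singly linked chain with int keys, and the old pair is handed back through an output parameter. All of these choices, along with the names SketchKeyValue, SketchEntry, and SketchBucket_Insert, are assumptions for illustration; the real HashTable stores its pairs in the LinkedLists from part 1 and its signature is given in HashTable.hpp.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical types for illustration only.
struct SketchKeyValue {
  int key;
  void* value;
};

struct SketchEntry {
  SketchKeyValue kv;
  SketchEntry* next;
};

// Inserts `kv` into the chain starting at *bucket.
// If the key is already present, copies the old pair into *old_kv,
// overwrites it with the new pair, and returns true.
// Otherwise prepends a new entry and returns false.
bool SketchBucket_Insert(SketchEntry** bucket, SketchKeyValue kv,
                         SketchKeyValue* old_kv) {
  for (SketchEntry* cur = *bucket; cur != nullptr; cur = cur->next) {
    if (cur->kv.key == kv.key) {
      *old_kv = cur->kv;  // hand the old pair back to the caller
      cur->kv = kv;       // replace in place
      return true;
    }
  }
  SketchEntry* entry = static_cast<SketchEntry*>(malloc(sizeof(SketchEntry)));
  assert(entry != nullptr);
  entry->kv = kv;
  entry->next = *bucket;
  *bucket = entry;
  return false;
}

// Frees every entry in the chain. Values are not freed here.
void SketchBucket_Free(SketchEntry* bucket) {
  while (bucket != nullptr) {
    SketchEntry* next = bucket->next;
    free(bucket);
    bucket = next;
  }
}
```

Returning the old pair matters for memory management: if the old value was heap-allocated, the caller needs that copy in order to free it.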

To keep the HashTable quick for insertions, lookups, and removals, we try to keep the buckets short. So as we add more elements to the HashTable, the HashTable will resize to have more buckets and possibly redistribute which keys are associated with which bucket. This resizing is already implemented for you; we just describe it here to give you an idea of what the code is doing.

To get a better understanding of how the HashTable works, we have a diagram below:

HashTable

We will also be implementing a HashTable Iterator HTIterator (not pictured) that will iterate through all of the buckets of the HashTable, and each element in each bucket.
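A minimal sketch of the two-level traversal described above: the iterator remembers which bucket it is in and which entry it is on, and skips over empty buckets when it runs off the end of a chain. The types and function names are hypothetical, and the buckets here are simple singly linked chains rather than the LinkedLists the real HashTable uses. The actual layout of HTIterator is described in the starter headers.

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct SketchEntry {
  int key;
  SketchEntry* next;
};

struct SketchTable {
  int num_buckets;
  SketchEntry** buckets;  // array of chain heads, nullptr if empty
};

struct SketchTableIter {
  const SketchTable* table;
  int bucket_idx;      // bucket currently being walked
  SketchEntry* entry;  // current entry, nullptr when past the end
};

// If the current entry is nullptr, advances through later buckets
// until a non-empty one is found or the buckets run out.
static void SkipEmptyBuckets(SketchTableIter* iter) {
  while (iter->entry == nullptr &&
         iter->bucket_idx + 1 < iter->table->num_buckets) {
    iter->bucket_idx += 1;
    iter->entry = iter->table->buckets[iter->bucket_idx];
  }
}

// Positions a new iterator on the first entry of the table, if any.
SketchTableIter SketchTableIter_Begin(const SketchTable* table) {
  assert(table != nullptr && table->num_buckets > 0);
  SketchTableIter iter = {table, 0, table->buckets[0]};
  SkipEmptyBuckets(&iter);
  return iter;
}

bool SketchTableIter_IsValid(const SketchTableIter* iter) {
  return iter->entry != nullptr;
}

// Moves to the next entry in the chain, or to the next non-empty bucket.
void SketchTableIter_Next(SketchTableIter* iter) {
  assert(SketchTableIter_IsValid(iter));
  iter->entry = iter->entry->next;
  SkipEmptyBuckets(iter);
}
```

A client would loop with Begin, IsValid, and Next, visiting every entry exactly once regardless of how the entries are spread across buckets.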

More details on the HashTable and HTIterator can be found in HashTable.hpp and HashTable_priv.hpp respectively. We HIGHLY recommend you read these header files before beginning part 2.

Short summary of the structs used in part 2:

Grading & Testing

Compilation

We have supplied you with a Makefile that can be used for compiling your code into an executable. To do this, open the terminal and then type in make.

You may need to resolve any compiler warnings and compiler errors that show up. Once all compiler errors have been resolved, if you ls in the terminal, you should be able to see an executable called test_suite. You can then run this by typing in ./test_suite to see the evaluation of your code.

Note that your submission will be partially evaluated on the number of compiler warnings. You should eliminate ALL compiler warnings in your code.

Valgrind

We will also test your submission on whether there are any memory errors or memory leaks. We will be using valgrind to do this. To do this, you should try running: valgrind --leak-check=full ./test_suite

If everything is correct, you should see the following towards the bottom of the output:

 ==1620== All heap blocks were freed -- no leaks are possible
 ==1620==
 ==1620== For counts of detected and suppressed errors, rerun with: -v
 ==1620== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If you do not see something similar to the above in your output, valgrind will have printed out details about where the errors and memory leaks occurred.

catch2

As mentioned above, you can compile your implementation by using the make command. This will result in several output files, including an executable called test_suite.

After compiling your solution with make, you can run all of the tests for the homework by invoking:

$ ./test_suite

You can also run only specific tests by passing command-line arguments to test_suite.

For example, to only run the LinkedList tests, you can type in:

$ ./test_suite [Test_LinkedList]

If you only want to test Push and Pop from LinkedList, you can type in:

$ ./test_suite [Test_LinkedList] PushPop

You can specify which tests are run for any of the tests in the assignment. You just need to know the names of the tests, and you can list them by running:

$ ./test_suite --list-tests

These settings can be helpful for debugging specific parts of the assignment, especially since test_suite can be run with these settings through valgrind and gdb!

Code Quality

For this assignment, we will be checking the quality of your code in two ways:

Clang Format and Clang Tidy

The makefile we provided with this assignment is configured to help make sure your code is properly formatted and follows good (modern) C++ coding conventions.

To do this, we make use of two tools: clang-format and clang-tidy

To make sure your code indentation is nice, we have clang-format. All you need to do to use this utility is run make format, which will run the tool and indent your code properly. Code that is turned in is expected to follow the specified style; code that is ill-formatted must be fixed and re-submitted.

clang-tidy is a more complicated tool. Part of it is checking for style and readability issues but that is not all it does. It also checks for and can detect possible bugs in your program.

Because of all this, we are enforcing that your code does not produce any clang-tidy errors. You can run clang-tidy on your code by running: make tidy-check. Note that you will have to fix any compiler errors before clang-tidy will run (and be useful).

If you have any questions about understanding an error, please ask on Ed discussion and we will happily assist you.

Manual Checks

For this assignment we will also be manually checking your code for code quality issues.

We have a code quality document that outlines what we are looking for, and the automated checks described above will be able to find most of these. However some of these will be manually checked.

Since this is the first time we are grading on code quality manually and since this is C-styled code, we are not requiring you to follow all parts of the style document. We will only be looking for issues that fall under the following categories:

Reminder that you can re-open assignments, so if you do not get the score you want on code quality, you can fix it and re-submit the assignment at a later point. More details on this are described in the syllabus.

Another reminder that your style grade will be “bucketed” into one of four score categories. Again, see the syllabus for more details.

Submission

Please submit your completed LinkedList.cpp and HashTable.cpp to Gradescope via your github repo. We will be checking for compilation warnings, valgrind errors, making sure the test_suite runs correctly, the quality of your code, and that you did not violate any rules established in the specification.