Towards Prioritizing Documentation Effort
This page is the online appendix to our paper on documentation prioritization. If you have any questions,
please contact Paul W. McBurney at email@example.com.
Static Attributes and Textual Comparison Data
The following links contain zip files for our open-source and closed-source projects. Each archive includes
both .csv and .arff files. The files are further divided into numeric (raw) data and data with a discrete
target class. For example, in the Top25 data, which we used in our correctness studies, the data are split
into "TOP" for the highest-scoring 25% and "MID" for the bottom-scoring 75%. The average-score columns hold
the scores from our user studies; all other columns are data columns.
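As a rough sketch of the Top25 discretization described above (the exact ranking and tie-breaking rules are not stated here, so this is an illustration, not the archive-generation code; the function name and input shape are assumptions):

```python
# Hypothetical sketch: rank entries by average user-study score, label
# the top 25% "TOP" and the remaining 75% "MID", as in the Top25 data.

def discretize_top25(scores):
    """Map (name, avg_score) pairs to TOP/MID target-class labels."""
    ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)
    cutoff = max(1, len(ranked) // 4)  # size of the top 25%
    labels = {}
    for i, (name, _score) in enumerate(ranked):
        labels[name] = "TOP" if i < cutoff else "MID"
    return labels

labels = discretize_top25([("a", 0.9), ("b", 0.5), ("c", 0.4), ("d", 0.1)])
# "a" falls in the top 25%; "b", "c", and "d" fall in the bottom 75%
```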
Survey Response Data
We have included our survey response data. Because the two surveys differ, please view the readme files in
both subfolders. In both cases, a larger number corresponds to higher importance: in the open-source survey,
4 was the highest score; in the closed-source survey, 5 was the highest score.
Open-Source Survey Data
Open-Source Class Level Survey Data
Closed-Source Survey Data
Project-B Class Questions Data (Updated)
Project-D Class Questions Data (Updated)
The script requires the NLTK package. It is a "slow" implementation that does not use caching, so it will
take several hours to run on programs as large as jGraphT and jxl.
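The script itself is not reproduced here. As an illustration of why an uncached implementation is slow, the sketch below runs a naive pairwise text comparison, re-tokenizing every document on every comparison; plain `str.split()` stands in for an NLTK tokenizer so the example is self-contained, and the similarity metric and function names are assumptions, not the script's actual design:

```python
# Illustrative only: uncached pairwise text similarity. Tokenizing
# inside the inner loop, instead of caching token sets per document,
# is the kind of repeated work that makes a run take hours on large
# programs.

def jaccard(a_tokens, b_tokens):
    """Jaccard similarity of two token collections."""
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0

def pairwise_similarity(docs):
    """Score every pair of named documents, with no token caching."""
    scores = {}
    names = sorted(docs)
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            # Both documents are re-tokenized on every comparison --
            # the "slow", cache-free behavior noted above.
            scores[(x, y)] = jaccard(docs[x].split(), docs[y].split())
    return scores

sims = pairwise_similarity({
    "Stack.push": "push an item onto the stack",
    "Stack.pop": "pop an item off the stack",
})
```

Caching the tokenized form of each document (or the pairwise scores themselves) would remove the redundant work at the cost of extra memory.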