Part 2: File synchronizer implementation improvements
1. General ideas
2. Implementation improvements
Unfortunately, time constraints and the deficiencies of the original design did not
allow me to fully implement the design described above, although I tried to get as close
as possible...
The NetFile interface was eliminated along with the class that implements it. Some of its
methods were re-implemented in the FileSystemImpl class with the necessary modifications.
Now all files are referred to by the pair (FileSystem, filename).
The main() method of the Snc class uses the (new) FileSystemImpl.getDirtyTable() method to
get images of the local and remote file systems. An image is just a hash table in which
each file is inserted with its path as the key and the following information as the value:
    boolean   dirty;
    boolean   hasDirtyDescendant;
    String    Path;
    boolean   Exists;
    boolean   IsFile;
    boolean   IsDirectory;
    String[]  List;    // directories only
    Signature Sign;    // files only
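As an illustration of what such an image might look like in code, here is a minimal Java sketch. It is not the actual FileSystemImpl.getDirtyTable() implementation: the class names FileInfo and ImageSketch, the lower-cased field names, and the use of "size:modification time" as a stand-in for the Signature type are all assumptions made for the example. The dirty flags are left false because how they are computed is not shown in this section.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;

class ImageSketch {
    /** Records the entry at relPath (relative to base) and, for directories, everything below it. */
    static void scan(File base, String relPath, Map<String, FileInfo> image) {
        File f = relPath.isEmpty() ? base : new File(base, relPath);
        FileInfo info = new FileInfo();
        info.path = relPath;
        info.exists = f.exists();
        info.isFile = f.isFile();
        info.isDirectory = f.isDirectory();
        if (info.isDirectory) {
            String[] names = f.list();
            info.list = (names == null) ? new String[0] : names;
            for (String name : info.list) {
                scan(base, relPath.isEmpty() ? name : relPath + "/" + name, image);
            }
        } else if (info.isFile) {
            // Assumed placeholder signature: file size and modification time.
            info.sign = f.length() + ":" + f.lastModified();
        }
        image.put(relPath, info); // the path is the hash table key
    }

    /** Builds the whole image for one directory tree. */
    static Map<String, FileInfo> buildImage(File root) {
        Map<String, FileInfo> image = new HashMap<>();
        scan(root, "", image);
        return image;
    }

    /** Creates a tiny temporary tree (docs/a.txt) and returns its image. */
    static Map<String, FileInfo> demoImage() {
        try {
            File root = Files.createTempDirectory("image-sketch").toFile();
            File docs = new File(root, "docs");
            docs.mkdir();
            Files.write(new File(docs, "a.txt").toPath(), "hello".getBytes());
            return buildImage(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (Map.Entry<String, FileInfo> e : demoImage().entrySet()) {
            System.out.println("'" + e.getKey() + "' directory=" + e.getValue().isDirectory);
        }
    }
}

/** Hypothetical entry type; the fields mirror the list above. */
class FileInfo {
    boolean dirty;               // left false here: dirtiness detection is not shown
    boolean hasDirtyDescendant;  // likewise left false
    String path;
    boolean exists;
    boolean isFile;
    boolean isDirectory;
    String[] list;               // directories only
    String sign;                 // files only; placeholder for the Signature type
}
```

Keying the table by path means that looking up the counterpart of a file in the other file system's image is a single hash lookup.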
The Snc.snc() method uses these images to compare the file systems. Instead of calling a
remote method for each file (either to do something with that file or to mark it
synchronized), it inserts the processed files (sometimes with their infos) into task
lists: the list of files to mark synchronized, and the list of files to copy from the
other file system, the latter kept both as an array of file infos for these files and as
a list of the files themselves.
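The idea of collecting work into task lists can be sketched as follows. This is only an illustration under stated assumptions: the comparison rules in buildTasks() are invented for the example (the real rules used by Snc.snc() are not given in this section), and the names Info, Tasks, and TaskListSketch are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TaskListSketch {
    /**
     * Walks the other side's image and fills the task lists for this side.
     * Assumed rules (illustration only, not taken from the text):
     *  - equal signatures: nothing to transfer, just mark synchronized;
     *  - missing here, or clean here and dirty there: copy from the other side;
     *  - anything else is treated as a conflict and left out of both lists.
     */
    static Tasks buildTasks(Map<String, Info> mine, Map<String, Info> other) {
        Tasks tasks = new Tasks();
        for (Info theirs : other.values()) {
            Info ours = mine.get(theirs.path);
            if (ours != null && ours.sign.equals(theirs.sign)) {
                tasks.markSynchronized.add(theirs.path);
            } else if (ours == null || (!ours.dirty && theirs.dirty)) {
                tasks.copyFromOther.add(theirs);
            }
        }
        return tasks;
    }

    /** Helper that builds a small image; sorted so the demo output is stable. */
    static Map<String, Info> image(Info... infos) {
        Map<String, Info> m = new TreeMap<>();
        for (Info i : infos) {
            m.put(i.path, i);
        }
        return m;
    }

    public static void main(String[] args) {
        Map<String, Info> local = image(
            new Info("a", false, "s1"), new Info("b", false, "s2"), new Info("c", true, "s3"));
        Map<String, Info> remote = image(
            new Info("a", false, "s1"), new Info("b", true, "s9"),
            new Info("c", true, "s4"), new Info("d", true, "s5"));
        Tasks t = buildTasks(local, remote);
        System.out.println("mark synchronized: " + t.markSynchronized); // [a]
        System.out.println("copy from other:   " + t.copyPaths());      // [b, d]
    }
}

/** Reduced, hypothetical file info: only the fields this sketch needs. */
class Info {
    final String path;
    final boolean dirty;
    final String sign; // placeholder for the Signature mentioned above

    Info(String path, boolean dirty, String sign) {
        this.path = path;
        this.dirty = dirty;
        this.sign = sign;
    }
}

/** The two task lists that are handed over once the comparison is done. */
class Tasks {
    final List<String> markSynchronized = new ArrayList<>();
    final List<Info> copyFromOther = new ArrayList<>();

    List<String> copyPaths() {
        List<String> paths = new ArrayList<>();
        for (Info i : copyFromOther) {
            paths.add(i.path);
        }
        return paths;
    }
}
```

The point of the design is that the comparison loop touches only in-memory tables; nothing crosses the network until the lists are complete.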
When Snc.snc() finishes, the main() method calls FileSystemImpl.sncFinalize(), passing it
the task lists and the source file system reference.
The FileSystemImpl.sncFinalize() method performs the tasks determined by its arguments.
Note that all the markSynchronized() calls are now local.
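A rough Java sketch of this batching follows. The parameter types of sncFinalize(), the fetch() method, and the call counter are assumptions made for illustration; the real signature is not given in the text. The sketch only shows the shape of the idea: one incoming call carries the whole batch, and marking files as synchronized happens through a private, local method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for the file system object on one side. */
class FinalizeSketch {
    final Set<String> dirty = new TreeSet<>();         // paths currently marked dirty
    final List<String> received = new ArrayList<>();   // paths copied in from the source
    int callsFromOutside = 0;                          // calls that would cross the network

    FinalizeSketch(String... dirtyPaths) {
        dirty.addAll(Arrays.asList(dirtyPaths));
    }

    /** Local only: clears the dirty flag without any remote call. */
    private void markSynchronized(String path) {
        dirty.remove(path);
    }

    /** Stand-in for reading a file's contents from this file system. */
    String fetch(String path) {
        callsFromOutside++;
        return "contents of " + path;
    }

    /** One call performs all the tasks described by the two lists. */
    void sncFinalize(List<String> toMark, List<String> toCopy, FinalizeSketch source) {
        callsFromOutside++;
        for (String path : toCopy) {
            source.fetch(path); // a real version would write the contents to disk
            received.add(path);
        }
        for (String path : toMark) {
            markSynchronized(path);
        }
    }

    public static void main(String[] args) {
        FinalizeSketch local = new FinalizeSketch();
        FinalizeSketch remote = new FinalizeSketch("a", "b", "c");
        remote.sncFinalize(Arrays.asList("a", "b"), Arrays.asList("x"), local);
        System.out.println("calls into remote: " + remote.callsFromOutside); // 1
        System.out.println("still dirty:       " + remote.dirty);            // [c]
        System.out.println("received:          " + remote.received);         // [x]
    }
}
```

With per-file remote calls, marking two files synchronized would have cost two round trips; here both are handled inside the single sncFinalize() call.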
The Snc.snc() method still uses recursion to descend the directory structure. There are no
extra threads in the version that I'm turning in.
3. Timing results
The following programs have been compared:
The first set of measurements presented here was taken on school.cs.indiana.edu
at the busiest time (4 - 6 pm). As far as I could tell, there were no CPU-intensive
processes running there during my experiments, but the machine was a bit slow. This is
the graph showing each program's time as a percentage of the original implementation:
The programs ran in batch mode, but later I had to repeat some of the runs for the
following reasons: during one of the runs the original program somehow decided that there
was an inconsistency in the file systems and asked for a user decision (I noticed that
only half an hour later); Prof. Pierce's improved version once "hung", and I had to kill
it; each program once or twice had a time that was obviously a "glitch" (I re-ran only
the cases whose time was around twice the average). The programs didn't run in the order
presented in the tables. I was "mixing" them in the following way: after 4 runs of one
program, another one was run 4 times, and so on. Time is given in seconds; the leftmost
column is the run number, and the last two rows give the average and the standard deviation.
Original
         everything out of date      everything up to date
run     total     user      sys    total     user      sys
----------------------------------------------------------
1      141.04    22.55     6.93    55.96     4.65     2.42
2      134.38    23.35     7.53    68.77     5.05     2.64
3      143.74    24.06     7.34    67.61     4.53     2.69
4      124.95    22.14     7.42    67.49     4.72     2.57
5      120.29    21.99     6.95    61.69     4.70     2.59
6      122.65    23.15     6.86    55.33     4.36     2.43
7      123.19    22.37     7.19    50.88     4.53     2.44
8      137.88    21.79     7.18    51.07     4.22     2.36
9      140.73    22.18     7.48    55.94     4.73     2.86
10     138.46    22.63     7.60    58.41     4.48     2.31
11     132.68    22.86     7.34    67.37     4.85     2.23
12     141.30    23.35     7.36    69.29     4.69     2.27
----------------------------------------------------------
avg    133.44    22.70     7.26    60.82     4.63     2.48
sdev     8.12     0.64     0.23     6.75     0.21     0.18
Pierce
         everything out of date      everything up to date
run     total     user      sys    total     user      sys
----------------------------------------------------------
1       83.12    14.38     6.28    40.36     7.70     5.08
2       62.52    15.49     6.60    37.77     6.91     4.41
3       68.27    15.64     6.58    42.35     7.13     4.93
4       70.81    14.65     6.49    36.28     6.99     4.83
5       85.02    15.05     6.35    37.93     6.83     4.56
6       62.26    14.70     6.48    33.69     6.40     4.05
7       54.01    13.49     5.93    31.96     6.62     4.48
8       63.99    15.01     6.24    35.43     6.60     4.45
9       76.37    14.79     6.80    68.33     8.34     4.91
10      78.15    15.28     6.86    43.39     7.51     5.47
11      66.83    13.98     6.63    56.14     8.21     5.48
12      98.89    16.05     7.62    52.62     7.69     5.56
----------------------------------------------------------
avg     72.52    14.88     6.57    43.02     7.24     4.85
sdev    11.85     0.68     0.40    10.32     0.61     0.46
Mod13
         everything out of date      everything up to date
run     total     user      sys    total     user      sys
----------------------------------------------------------
1       50.80     6.06     2.35    48.07     6.56     2.49
2       45.60     5.64     1.97    49.85     6.40     2.66
3       42.24     5.52     2.02    48.38     6.24     2.37
4       45.08     5.77     1.81    45.88     6.58     2.21
5       67.58     5.68     2.35    84.00     7.01     2.64
6       62.59     6.15     2.07    83.82     6.85     2.54
7       52.26     6.01     2.11    59.83     6.33     2.70
8       55.94     5.91     2.25    56.06     6.68     2.70
9       65.93     6.08     2.25    83.40     6.60     2.70
10      63.77     5.68     2.21    83.12     6.79     2.93
11      62.11     5.96     2.50    74.89     7.00     2.73
12      56.71     5.76     2.29    85.32     6.90     2.65
----------------------------------------------------------
avg     55.88     5.85     2.18    66.89     6.66     2.61
sdev     8.33     0.19     0.18    16.11     0.25     0.18
Extra
         everything out of date      everything up to date
run     total     user      sys    total     user      sys
----------------------------------------------------------
1       46.69     3.85     1.62    49.87     4.50     1.58
2       55.85     3.87     1.48    44.30     4.38     1.54
3       44.07     3.80     1.43    46.88     4.40     1.56
4       50.90     3.65     1.74    50.58     4.45     1.73
5       56.94     3.86     1.85    59.15     4.33     2.13
6       59.64     3.70     1.76    72.63     4.46     2.27
7       56.78     3.93     1.78    54.45     4.25     1.99
8       61.35     4.07     1.74    71.50     4.83     2.02
9       56.80     3.71     1.65    56.55     4.45     2.01
10      52.89     4.31     1.68    60.08     4.57     1.86
11      56.06     3.81     1.51    81.62     4.79     1.84
12      61.25     4.28     2.08    57.28     4.35     1.87
----------------------------------------------------------
avg     54.93     3.90     1.69    58.74     4.48     1.87
sdev     5.19     0.20     0.17    10.80     0.17     0.22
The second set of measurements presented here was taken on guitar.cs.indiana.edu
when the machine was idle (4 - 6 am). Nobody was logged in, and except for the processes
owned by root and me, there were only a couple of "abandoned" java processes in the sleep
state. Therefore I tend to think that this set reflects the relative times better than the
previous one. This is the percentage graph:
These runs went without large time variations, except for #1 in Pierce and #4
in Mod13, which apparently were due to some system daemons. None of the experiments
was re-run.
Original
         everything out of date      everything up to date
run     total     user      sys    total     user      sys
----------------------------------------------------------
1      129.85    30.84     8.43    45.06     6.41     3.43
2      109.88    29.13     9.03    45.39     6.33     3.55
3      115.28    29.88     8.74    45.24     6.48     3.46
4      107.67    29.41     8.29    44.31     6.03     3.55
5      108.93    28.89     9.28    45.08     6.83     3.33
6      107.06    29.44     8.60    44.99     5.87     3.64
7      110.03    29.21     8.89    48.00     6.42     3.78
8      111.24    30.61     8.80    45.87     6.33     3.57
9      112.71    29.57     8.64    47.19     6.71     3.45
10     110.13    29.24     8.89    44.94     6.39     3.63
11     129.22    30.52     9.29    44.51     6.23     3.41
12     108.10    29.93     8.57    47.04     6.33     3.83
----------------------------------------------------------
avg    113.34    29.72     8.79    45.63     6.36     3.55
sdev     7.56     0.61     0.30     1.11     0.25     0.14
Pierce
         everything out of date      everything up to date
run     total     user      sys    total     user      sys
----------------------------------------------------------
1      110.33    30.12    19.14    48.72     9.47     6.55
2       82.28    20.91    11.00    45.59     9.33     6.97
3       96.81    20.93    11.74    48.37    10.57     8.43
4       77.49    19.65     9.94    45.60     9.08     6.85
5       88.47    22.85    10.59    48.77     9.75     7.63
6       79.21    19.45    10.13    58.57     9.22     6.46
7       87.43    21.04    10.75    50.50    10.32     7.08
8       79.41    17.42     9.84    73.46    12.36     7.31
9       87.75    22.19    11.96    50.11     9.41     7.65
10      93.65    20.14    10.28    46.24    10.07     7.05
11      83.44    20.68    11.02    75.97    11.96     9.40
12      89.44    22.87    11.18    47.65     9.30     7.24
----------------------------------------------------------
avg     87.98    21.52    11.46    53.30    10.07     7.39
sdev     8.78     2.97     2.40    10.13     1.04     0.79
Mod13
         everything out of date      everything up to date
run     total     user      sys    total     user      sys
----------------------------------------------------------
1       67.26     8.15     3.54    69.51     9.54     4.03
2       62.88     8.31     3.40    77.07     9.49     4.02
3       70.35     8.14     3.08    86.49     9.16     4.23
4       91.94     8.80     3.34   123.81     9.21     4.34
5       60.43     8.43     3.52    68.25     9.54     3.86
6       57.43     8.03     3.36    63.25     9.11     4.05
7       55.19     7.84     3.30    60.27     9.28     4.18
8       52.44     7.71     3.33    66.78     9.45     3.85
9       54.61     7.76     3.30    65.17     9.23     4.56
10      65.87     7.45     3.37    64.88     9.29     3.98
11      54.19     8.05     3.08    65.72     9.54     4.16
12      57.69     7.90     3.33    65.87     9.13     4.01
----------------------------------------------------------
avg     62.52     8.05     3.33    73.09     9.33     4.11
sdev    10.41     0.34     0.13    16.69     0.16     0.19
Extra
         everything out of date      everything up to date
run     total     user      sys    total     user      sys
----------------------------------------------------------
1       54.33     5.00     2.86    56.11     5.78     2.88
2       54.48     5.42     2.59    59.92     5.92     3.22
3       54.97     5.11     2.80    67.18     5.87     3.24
4       55.46     4.94     2.71    56.44     5.78     3.12
5       55.18     5.34     2.59    55.40     5.86     3.03
6       53.98     5.39     2.61    53.92     5.60     3.29
7       56.00     5.18     2.92    54.17     5.94     3.12
8       51.07     5.05     2.93    53.76     5.75     3.19
9       53.22     5.42     2.60    54.07     6.04     3.12
10      51.33     5.31     2.91    57.47     6.01     2.96
11      50.91     5.08     2.78    53.90     5.90     3.26
12      52.37     5.17     2.62    54.10     5.85     3.28
----------------------------------------------------------
avg     53.61     5.20     2.74    56.37     5.86     3.14
sdev     1.72     0.16     0.13     3.72     0.12     0.13
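
The summary rows and the percentages behind the graphs can be recomputed from the run times. The sketch below does this for one column, the total time of Extra with everything out of date in the second set. Recomputing that column suggests that the standard deviation rows divide by the number of runs (the population form) rather than by one less; this is an inference from the numbers, not something the text states. The class and method names are hypothetical.

```java
import java.util.Locale;

class RunStats {
    /** Arithmetic mean of the run times. */
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Population standard deviation: divides by n, not n - 1 (assumed form). */
    static double populationStdDev(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return Math.sqrt(sumSq / xs.length);
    }

    /** A program's time expressed as a percentage of the baseline time. */
    static double percentOf(double value, double baseline) {
        return 100.0 * value / baseline;
    }

    public static void main(String[] args) {
        // Extra, second set, everything out of date, total time in seconds.
        double[] extra = {54.33, 54.48, 54.97, 55.46, 55.18, 53.98,
                          56.00, 51.07, 53.22, 51.33, 50.91, 52.37};
        double originalAvg = 113.34; // average of the same column for Original
        System.out.println(String.format(Locale.ROOT, "avg  %.2f", mean(extra)));
        System.out.println(String.format(Locale.ROOT, "sdev %.2f", populationStdDev(extra)));
        System.out.println(String.format(Locale.ROOT, "%.1f%% of Original",
                percentOf(mean(extra), originalAvg)));
    }
}
```

For this column the sketch reproduces the tabulated 53.61 and 1.72, and puts Extra at roughly 47% of the Original time.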