SPECIFICATION OF A FILESYSTEM SYNCHRONIZER
                         (Draft: Version 10)

DEFINITION [Basic sets]:
   x,y : XX are FILE NAMES
   p,q,r : PP are PATHS
         where PP is the set of sequences of file names 
         (empty path is written <>)
   F,G : FF are FILE CONTENTS

Write q<=p for "q is a prefix of p," i.e., p = q.r for some r.

DEFINITION: A function S@ : PP -> ({NIL,DIR} union FF) represents a
FILESYSTEM CONTENTS if
   1) S@(p.x) != NIL  ==>  S@(p) = DIR 
   2) exists n, for all q, |q| > n ==> S@(q) = NIL 
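
EXAMPLE [Python sketch, not part of the specification]: One way to make
the definition concrete is to store S@ as a finite dictionary from paths
to values. The names below (Path, DIR, NIL, lookup,
is_filesystem_contents) are assumptions made for illustration: paths are
tuples of strings, file contents are bytes, and any path missing from
the dictionary is read as NIL. Under that encoding condition (2) holds
automatically, so only condition (1) needs checking.

```python
from typing import Dict, Tuple, Union

Path = Tuple[str, ...]          # a path is a sequence of file names; () is <>
DIR = "DIR"                     # marker for a directory
NIL = None                      # marker for "nothing at this path"
Contents = Dict[Path, Union[str, bytes]]   # absent keys are read as NIL


def lookup(contents: Contents, p: Path):
    """S@(p): DIR, file contents (bytes), or NIL when p is absent."""
    return contents.get(p, NIL)


def is_filesystem_contents(contents: Contents) -> bool:
    """Check condition 1; condition 2 holds for any finite dict."""
    for p, value in contents.items():
        if value is NIL:
            continue                    # an explicit NIL entry constrains nothing
        if p and lookup(contents, p[:-1]) != DIR:
            return False                # non-NIL entry whose parent is not DIR
    return True
```

For instance, a dictionary holding a file at ("a","f") but no DIR entry
at ("a",) is rejected, because condition (1) fails at p = ("a",).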

DEFINITION: A FILESYSTEM S is a triple of functions (S@,S~,S#) satisfying the
following conditions:
  1) S@ is a filesystem contents    
  2) S~, S# : PP -> Boolean    
  3) S~(p)  ==>  S@(p) != NIL
  4) S#(p)  ==>  S@(p) = DIR
  5) S#(p)  ==>  S~(q)  for some q <= p   

Write FS for the set of all filesystems.
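
EXAMPLE [Python sketch, not part of the specification]: The five
conditions can be checked mechanically if S~ and S# are stored as finite
sets of paths (the paths at which they are true). The class FS and the
field names contents, touched, and id_changed are assumptions chosen for
this illustration; paths are tuples of strings and absent paths are NIL.

```python
from typing import Dict, NamedTuple, Set, Tuple

Path = Tuple[str, ...]
DIR = "DIR"


class FS(NamedTuple):
    contents: Dict[Path, object]   # S@ ; absent paths are NIL
    touched: Set[Path]             # S~ ; paths where S~(p) is true
    id_changed: Set[Path]          # S# ; paths where S#(p) is true


def prefixes(p: Path):
    """All q with q <= p, from <> up to p itself."""
    return [p[:i] for i in range(len(p) + 1)]


def is_filesystem(s: FS) -> bool:
    at = s.contents.get
    # Condition 1: S@ is a filesystem contents (finiteness is implied).
    if any(v is not None and p and at(p[:-1]) != DIR
           for p, v in s.contents.items()):
        return False
    # Condition 2 holds by construction: sets encode Boolean functions.
    # Condition 3: touched paths exist.
    if any(at(p) is None for p in s.touched):
        return False
    # Condition 4: id-changed paths are directories.
    if any(at(p) != DIR for p in s.id_changed):
        return False
    # Condition 5: every id-changed path has a touched prefix.
    return all(any(q in s.touched for q in prefixes(p))
               for p in s.id_changed)
```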

DEFINITION: Write |S| for the length of the longest path p such that
S@(p) != NIL. 

DEFINITION: The PARENT of a path p in a filesystem S, written
parent(S,p), is defined as follows:
  parent(S,p) = q    if p = q.x for some x and S@(q) = DIR
  parent(S,p) undefined otherwise.

DEFINITION: The set of CHILDREN of a path p in a filesystem S,
written children(S,p), is defined as follows:
  children(S,p) = { q | q = p.x for some x } if S@(p) = DIR
                = {} otherwise.
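
EXAMPLE [Python sketch, not part of the specification]: The two
definitions might be rendered as follows, assuming S@ is a dictionary
from path tuples to values with absent paths read as NIL. Because
children(S,p) ranges over every name x, it is generally an infinite set,
so this sketch offers a membership test (is_child) plus a helper that
lists only the children with non-NIL contents. All function names here
are illustrative, and None stands for "undefined".

```python
from typing import Dict, Optional, Set, Tuple

Path = Tuple[str, ...]
DIR = "DIR"
Contents = Dict[Path, object]      # S@ ; absent paths are NIL


def parent(contents: Contents, p: Path) -> Optional[Path]:
    """parent(S,p), or None where the definition says 'undefined'."""
    if not p:
        return None                 # <> is not of the form q.x
    q = p[:-1]
    return q if contents.get(q) == DIR else None


def is_child(contents: Contents, p: Path, q: Path) -> bool:
    """Membership test for children(S,p), which may be an infinite set."""
    return (contents.get(p) == DIR
            and len(q) == len(p) + 1
            and q[:len(p)] == p)


def existing_children(contents: Contents, p: Path) -> Set[Path]:
    """The members of children(S,p) that have non-NIL contents."""
    return {q for q, v in contents.items()
            if v is not None and is_child(contents, p, q)}
```

Callers should compare the result of parent against None with "is",
since the empty path () is itself a legitimate parent.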

DEFINITION: S is said to be (POSSIBLY) CHANGED at p, written changed(S,p), if
       S@(p) = DIR   /\  (S~(p) \/ S#(p))
   \/  S@(p) != DIR  /\  (S~(p) \/ 
                         (parent(S,p) defined /\
                          (S#(parent(S,p)) \/ S~(parent(S,p)))))

DEFINITION: A filesystem S is (POSSIBLY) CHANGED BELOW p, written
changed*(S,p), if, for some q (possibly empty), changed(S,p.q).

DEFINITION: S is said to be OLD at p, written old(S,p), if !changed*(S,p)

FACT [OLD*]: If old(S,p) then old(S,p.q) for any q.
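
EXAMPLE [Python sketch, not part of the specification]: The predicates
changed, changed*, and old can be evaluated directly when S~ and S# are
finite sets of paths. The class FS and its field names are assumptions
for illustration. The existential quantifier in changed* ranges over
infinitely many paths, so the sketch relies on an observation that
follows from the definition of changed: a path p.q (q nonempty) that is
changed without being flagged itself must have a flagged directory
parent, and that parent also extends p.

```python
from typing import Dict, NamedTuple, Set, Tuple

Path = Tuple[str, ...]
DIR = "DIR"


class FS(NamedTuple):
    contents: Dict[Path, object]   # S@ ; absent paths are NIL
    touched: Set[Path]             # S~
    id_changed: Set[Path]          # S#


def changed(s: FS, p: Path) -> bool:
    """changed(S,p), following the two disjuncts of the definition."""
    flagged = lambda q: q in s.touched or q in s.id_changed
    if s.contents.get(p) == DIR:
        return flagged(p)
    if p in s.touched:
        return True
    # parent(S,p) is defined only when p = q.x and S@(q) = DIR.
    return bool(p) and s.contents.get(p[:-1]) == DIR and flagged(p[:-1])


def changed_below(s: FS, p: Path) -> bool:
    """changed*(S,p): changed at p.q for some q, possibly empty.

    Only p itself and the flagged paths extending p need checking.
    """
    flagged = s.touched | s.id_changed
    return changed(s, p) or any(t[:len(p)] == p and changed(s, t)
                                for t in flagged)


def old(s: FS, p: Path) -> bool:
    """old(S,p) is the negation of changed*(S,p)."""
    return not changed_below(s, p)
```

Note that a NIL path directly under a touched directory counts as
changed, which is how a deletion or a creation inside that directory is
accounted for.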

DEFINITION: When f is a function on paths, write f/p for "f after p",
defined as follows: (f/p)(q) = f(p.q).
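
EXAMPLE [Python sketch, not part of the specification]: With paths as
tuples of strings, path concatenation p.q is tuple concatenation, and
f/p is a one-line higher-order function. The name "after" is an
assumption for illustration.

```python
from typing import Callable, Tuple, TypeVar

Path = Tuple[str, ...]
T = TypeVar("T")


def after(f: Callable[[Path], T], p: Path) -> Callable[[Path], T]:
    """Return f/p, the function that maps q to f(p.q)."""
    return lambda q: f(p + q)
```

So (f/p)(<>) = f(p), and comparing A@/p with B@/p amounts to comparing
the two subtrees rooted at p.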

DEFINITION [Declarative presentation]: The pair of new filesystem
contents (RA@,RB@) is said to be a SYNCHRONIZATION of original
filesystems (A,B) if, for each path p:

   1) A@(p) = B@(p)                   ==>  UNCHANGED-AT(p)
   2) old(A,p)                        ==>  CHOOSE-B-AFTER(p)
   3) old(B,p)                        ==>  CHOOSE-A-AFTER(p)
   4) changed(A,p)  /\ changed(B,p)   ==>  UNCHANGED-AFTER(p)
   5) changed(A,p)  /\ changed*(B,p)  ==>  UNCHANGED-AFTER(p)
   6) changed*(A,p) /\ changed(B,p)   ==>  UNCHANGED-AFTER(p)

where
  UNCHANGED-AT(p)     ==  RA@(p) = A@(p) /\ RB@(p) = B@(p)
  UNCHANGED-AFTER(p)  ==  RA@/p = A@/p /\ RB@/p = B@/p
  CHOOSE-A-AFTER(p)   ==  RA@/p = RB@/p = A@/p
  CHOOSE-B-AFTER(p)   ==  RA@/p = RB@/p = B@/p.

LEMMA: If !changed(S,p) /\ changed*(S,p), then S@(p) = DIR.

PROOF: Suppose that S@(p) != DIR and that !changed(S,p).  Then we must
show !changed*(S,p), i.e., that !changed(S,p.q) for any nonempty q.
Since S@(p.q)=NIL (by condition (1) in the definition of filesystem
contents), we must show 
  (a) !S~(p.q) 
  (b) !(parent(S,p.q) defined /\ (S#(parent(S,p.q)) \/ S~(parent(S,p.q)))).
But (a) holds by condition (3) in the definition of filesystem, while
(b) holds because parent(S,p.q) is always undefined: writing q = q'.x,
the path p.q' is either p itself or has NIL contents, so S@(p.q') != DIR.

DEFINITION: A filesystem S WAS SYNCHRONIZED AT O if !old(S,p) whenever
O@/p != S@/p.

FACT [OLD]: If A and B were both synchronized at O and old(A,p) and
old(B,p), then A@/p = B@/p.

DEFINITION: Let S and T be filesystems and p a path.  We write
S|p <-- T for the filesystem formed by COPYING S FROM T AFTER p, 
defined formally as follows:

  S|p <-- T  =  ((\q. if p <= q then T@(q) else S@(q)), 
                 S~, 
                 S#)

(That is, the "touched" and "id-changed" maps of S are unchanged,
while the contents part of S is overwritten with the contents of T for
all paths extending p.  The symbol \ stands for lambda: i.e., "\q. if
p <= q then T@(q) else S@(q)" is the function that, for each path q,
returns T@(q) if p<=q and S@(q) otherwise.)
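
EXAMPLE [Python sketch, not part of the specification]: When S@ is a
finite dictionary (absent paths read as NIL), the copy operation can be
sketched by dropping every entry of S at or below p and grafting in the
corresponding entries of T. The class FS, its field names, and the
function name copy_after are assumptions for illustration.

```python
from typing import Dict, NamedTuple, Set, Tuple

Path = Tuple[str, ...]
DIR = "DIR"


class FS(NamedTuple):
    contents: Dict[Path, object]   # S@ ; absent paths are NIL
    touched: Set[Path]             # S~
    id_changed: Set[Path]          # S#


def copy_after(s: FS, p: Path, t: FS) -> FS:
    """S|p <-- T: contents of T at paths extending p, of S elsewhere."""
    extends_p = lambda q: q[:len(p)] == p
    contents = {q: v for q, v in s.contents.items() if not extends_p(q)}
    contents.update({q: v for q, v in t.contents.items() if extends_p(q)})
    # The touched and id-changed maps of S are carried over unchanged.
    return FS(contents, s.touched, s.id_changed)
```

The function builds a new dictionary and leaves both arguments
unmodified, matching the purely functional reading of the definition.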

DEFINITION [Algorithm Snc]: The SYNCHRONIZATION ALGORITHM
  Snc : FS * FS * PP -> FS * FS 
is defined as follows:
  
 Snc(A,B,p) = 
    1) if old(A,p) /\ old(B,p) 
         then (A,B) 
    2) else if A@(p) = B@(p) = DIR
         then let x_1, x_2, ..., x_n be some enumeration of the set
                  { x | A@(p.x) != NIL or B@(p.x) != NIL }
              let (A_0, B_0) = (A,B)
              let (A_{i+1}, B_{i+1}) = Snc(A_i, B_i, p.x_{i+1})  for 0 <= i < n
              in (A_n, B_n)
    3) else if old(A,p)           
         then (A|p <-- B, B)
    4) else if old(B,p)           
         then (A, B|p <-- A)
    5) else 
         (A,B)
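
EXAMPLE [Python sketch, not part of the specification]: The five clauses
translate almost line for line into a recursive function. This sketch
assumes the finite encoding used for illustration only: S@ is a
dictionary from path tuples to values (absent paths are NIL), and S~ and
S# are sets of paths. The names FS, changed, old, copy_after, and snc
are hypothetical, and the enumeration in clause 2 is taken in sorted
order purely for determinism.

```python
from typing import Dict, NamedTuple, Set, Tuple

Path = Tuple[str, ...]
DIR = "DIR"


class FS(NamedTuple):
    contents: Dict[Path, object]   # S@ ; absent paths are NIL
    touched: Set[Path]             # S~
    id_changed: Set[Path]          # S#


def changed(s: FS, p: Path) -> bool:
    flagged = lambda q: q in s.touched or q in s.id_changed
    if s.contents.get(p) == DIR:
        return flagged(p)
    if p in s.touched:
        return True
    return bool(p) and s.contents.get(p[:-1]) == DIR and flagged(p[:-1])


def old(s: FS, p: Path) -> bool:
    # changed* only needs p itself and the flagged paths extending p.
    below = any(t[:len(p)] == p and changed(s, t)
                for t in s.touched | s.id_changed)
    return not (changed(s, p) or below)


def copy_after(s: FS, p: Path, t: FS) -> FS:
    ext = lambda q: q[:len(p)] == p
    contents = {q: v for q, v in s.contents.items() if not ext(q)}
    contents.update({q: v for q, v in t.contents.items() if ext(q)})
    return FS(contents, s.touched, s.id_changed)


def snc(a: FS, b: FS, p: Path) -> Tuple[FS, FS]:
    if old(a, p) and old(b, p):                                   # clause 1
        return a, b
    if a.contents.get(p) == DIR and b.contents.get(p) == DIR:     # clause 2
        names = sorted({q[-1] for fs in (a, b)
                        for q, v in fs.contents.items()
                        if v is not None and len(q) == len(p) + 1
                        and q[:len(p)] == p})
        for x in names:
            a, b = snc(a, b, p + (x,))
        return a, b
    if old(a, p):                                                 # clause 3
        return copy_after(a, p, b), b
    if old(b, p):                                                 # clause 4
        return a, copy_after(b, p, a)
    return a, b                                                   # clause 5
```

When one replica has touched a file and the other has touched a
different file, snc propagates each change to the other side; when both
have touched the same file, clause 5 leaves both replicas as they were.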

FACT [Snc is a total function]: 
  1) For each A, B, and p, Snc(A,B,p) terminates.
  2) The result returned by Snc(A,B,p) is insensitive to the enumeration
     chosen in clause 2 of the definition.

PROOF: By induction on max(|A|,|B|) - |p|.

CONJECTURE [Equivalence of the declarative and algorithmic versions]: 
Suppose that A and B were both synchronized at O and that 
Snc(A,B,p) = (C,D). Then: 
  1) (C,D) is a synchronization of (A,B).
  2) If (C',D') is a synchronization of (A,B) after p, then 
     (C,D) = (C',D').