480 Lecture Notes, January 16, 1996

Introduction to Distributed Systems

Motivations: Why do we develop distributed systems?
- availability of powerful yet cheap microprocessors (PCs, workstations)
- continuing advances in communication technology

\bf{What is a distributed system?}

A distributed system is a collection of independent computers that appear to the users of the system as a single computer.

Examples:
1. Network of workstations
2. Distributed manufacturing system (e.g., automated assembly line)
3. Network of branch office computers

\bf{Goals}

\bf{Advantages of Distributed Systems over Centralized Systems}
- Economics: a collection of microprocessors offers better price/performance than a mainframe; a cost-effective way to increase computing power.
- Speed: a distributed system may have more total computing power than a mainframe. Ex.: 10,000 CPU chips, each running at 50 MIPS, give 500,000 MIPS in total; a single 500,000 MIPS processor is not possible, since it would require a 0.002 nsec instruction cycle. Resource sharing; enhanced performance through load distribution.
- Inherent distribution: some applications are inherently distributed. Ex.: a supermarket chain.
- Reliability: if one machine crashes, the system as a whole can still survive. Higher availability and improved reliability.
- Incremental growth: computing power can be added in small increments. Modular expandability.

Long-term driving force: the existence of a large number of personal computers, and the need for people to work together and share information.

\bf{Advantages of Distributed Systems over Independent PCs}
- Data sharing: allow many users to access a common database
- Resource sharing: expensive peripherals like color printers
- Communication: enhance human-to-human communication
- Flexibility: spread the workload over the available machines

\bf{Disadvantages of Distributed Systems}
- Software: little software exists for distributed systems (OS, PL, degree of transparency, etc.)
- Network: saturation, lost transmissions
- Security: easy access also applies to secret data

\bf{Hardware Concepts}

Taxonomy (Fig. 1-4): MIMD machines divide into
- Multiprocessors (shared memory): bus-based or switched
- Multicomputers (private memory): bus-based or switched

Bus versus Switched
- Bus: a single network, backplane, bus, cable, or other medium that connects all the machines. E.g., cable TV.
- Switched: individual wires from machine to machine, with many different wiring patterns in use.

Tightly Coupled versus Loosely Coupled
- Tightly coupled systems: intermachine delay short, data rate high (multiprocessors)
- Loosely coupled systems: intermachine delay long, data rate low (distributed systems)

Bus-Based Multiprocessors (Fig. 1-5)
- cache memory
- hit rate
- cache coherence
- write-through cache
- snoopy cache

Switched Multiprocessors (Fig. 1-6)
- for connecting large numbers (say, over 64) of processors
- crossbar switch: n**2 switch points
- omega network: built from 2x2 switches; for n CPUs and n memories, log n switching stages, each with n/2 switches, for a total of (n log n)/2 switches
- delay problem: e.g., for n = 1024 there are 10 switching stages from CPU to memory, so a request and its reply cross 20 switching stages in total; at 100 MIPS (10 nsec instruction execution time), the switching time must be 10/20 = 0.5 nsec
- NUMA (NonUniform Memory Access): placement of program and data matters
- building a large, tightly-coupled, shared-memory multiprocessor is possible, but difficult and expensive

Bus-Based Multicomputers (Fig. 1-7)
- easy to build
- relatively slow-speed LAN (10-100 Mbps, compared to 300 Mbps and up for a backplane bus)

Switched Multicomputers (Fig. 1-8)
- interconnection networks: e.g., grid, hypercube
- hypercube: n-dimensional cube

\bf{Software Concepts}

Network Operating Systems
- loosely-coupled software on loosely-coupled hardware
- each machine has a high degree of autonomy
- rlogin machine
- rcp machine1:file1 machine2:file2
- file servers for shared file systems
- few system-wide requirements: only the format and meaning of the messages exchanged

Distributed Operating Systems
- tightly-coupled software on loosely-coupled hardware
- provide a single-system image, i.e., a virtual uniprocessor
- a single, global interprocess communication mechanism, process management, and file system; the same system call interface everywhere

Multiprocessor Operating Systems (Fig. 1-11)
- shared memory
- single run queue
- traditional file system as on a single-processor system: central block cache
Fig. 1-12

\bf{Design Issues of DOS}

+ Transparency: how to achieve the single-system image, i.e., how to make a collection of computers appear as a single computer, hiding all the distribution from the users as well as from the application programs. Can be achieved at two levels: 1) hide the distribution from users; 2) at a lower level, make the system look transparent to programs. Both levels require uniform interfaces, e.g., for file access and communication.
- Location Transparency: users cannot tell where hardware and software resources such as CPUs, printers, files, and databases are located.
- Migration Transparency: resources must be free to move from one location to another without their names changing. E.g., /usr/lee vs. /central/usr/lee.
- Replication Transparency: the OS can make additional copies of files and other resources without users noticing.
- Concurrency Transparency: users are not aware of the existence of other users, yet multiple users must be allowed to concurrently access the same resource; lock and unlock for mutual exclusion.
- Parallelism Transparency: automatic use of parallelism without having to program it explicitly.
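The lock and unlock operations mentioned under concurrency transparency can be sketched in Python. This is a minimal local sketch: in a distributed system the lock would be granted by a lock server or a distributed mutual exclusion algorithm, and the shared counter, thread count, and names here are illustrative.

```python
import threading

# Shared resource: a counter that several concurrent users update.
# (Illustrative stand-in for a shared file or database record.)
balance = 0
balance_lock = threading.Lock()  # locally, a thread lock plays the lock-server role

def deposit(amount, times):
    """Repeatedly update the shared resource under mutual exclusion."""
    global balance
    for _ in range(times):
        balance_lock.acquire()   # lock: enter the critical section
        balance += amount        # only one user touches the resource at a time
        balance_lock.release()   # unlock: let the next user in

threads = [threading.Thread(target=deposit, args=(1, 10000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # 40000 -- no updates are lost under the lock discipline
```

Without the acquire/release pair, concurrent read-modify-write cycles could interleave and lose updates; the lock serializes them, which is what the system must guarantee behind the scenes while keeping users unaware of each other.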
The holy grail for distributed and parallel system designers. Users do not always want complete transparency: e.g., a fancy printer 1,000 miles away.

+ Flexibility: make the system easier to change.
- Monolithic kernel: system calls are trapped and executed by the kernel; all system calls are served by the kernel. E.g., UNIX.
- Microkernel: provides minimal services: 1) IPC; 2) some memory management; 3) some low-level process management and scheduling; 4) low-level I/O. E.g., Mach can support multiple file systems and multiple system interfaces.

+ Reliability: a distributed system should be more reliable than a single system. E.g., with 3 machines, each up with probability 0.95, the probability that at least one is up is 1 - 0.05**3 = 0.999875.
- Availability: the fraction of time the system is usable; redundancy improves it.
- Need to maintain consistency
- Need to be secure
- Fault tolerance: need to mask failures and recover from errors

+ Performance: without a gain here, why bother with distributed systems?
Performance loss due to communication delays:
- fine-grain parallelism: high degree of interaction
- coarse-grain parallelism
Performance loss due to making the system fault tolerant.

+ Scalability: systems grow with time or become obsolete. Techniques that require resources linear in the size of the system are not scalable; e.g., broadcast-based queries won't work for large distributed systems.

------------

System Architecture Types
- minicomputer model -- more users than processors
- workstation model -- equal numbers of users and processors (e.g., Athena, Andrew)
- processor pool model -- more processors than users (e.g., Amoeba)

Distributed Operating Systems
- manage resources efficiently
- provide a friendly interface to users
- transparency -- make machine boundaries invisible; manifests as extra delay

D1. Global Knowledge: up-to-date state information
- no global memory, no global clock, unpredictable message delays
- temporal ordering (clock synchronization)
- achieving consensus in the absence of global knowledge (e.g., token loss detection and regeneration -- all sites agree that the token is lost)
- theoretical limitations on the maintenance of global knowledge

D2. Naming: names to refer to and identify objects such as computers, email addresses, files, etc.
- name service: maps logical names to physical addresses
- centralized name server vs. distributed name server -- a centralized server is a single point of failure
- for a distributed name server:
  replicated directory -- needs more storage; updates must be synchronized
  partitioned directory -- difficult to know where to look

D3. Scalability: systems grow with time or become obsolete. Techniques that require resources linear in the size of the system are not scalable; e.g., broadcast-based queries won't work for large distributed systems.

D4. Compatibility: interoperability among the components of a system. Three levels of compatibility:
- binary level
- execution level -- source code level compatibility
- protocol level -- allows different operating systems

D5. Process Synchronization: concurrency control -- access a shared resource one process at a time
- the mutual exclusion problem
- deadlock detection and resolution algorithms
- deadlock prevention algorithms

D6. Resource Management: make local and remote resources available to users
- data migration -- bring data to where it is needed:
  distributed file system for files (network transparency);
  distributed shared memory for (shared) data in memory
- computation migration -- execute remotely; e.g., for a query on a directory, rather than copying the directory to the local site, execute the query where the directory is
- distributed scheduling -- processes (jobs) are moved for load sharing and better utilization of the computers

D7. Security:
- authentication -- determine that an entity is what it claims to be
- authorization -- decide what privileges an entity has, and provide only those privileges

D8. Structuring: how to organize the various parts of the operating system
1. Monolithic Kernel vs. Collective Kernel
- monolithic kernel -- the traditional approach: one big kernel. E.g., UNIX.
- collective kernel -- a collection of largely independent processes, each providing a different service (e.g., memory management, scheduling, name service, RPC, time management, etc.)
- microkernel -- the nucleus of the operating system: supports the interaction between server processes and provides services to them (messages, task scheduling, processor management, virtual memory); can support the policy/mechanism separation principle. E.g., Mach, V-Kernel, Chorus, Galaxy.
2. Process Model vs. Object-Oriented Model
- process model: OS services are provided by a set of processes
- object-oriented operating system: implements system services as objects. E.g., Eden, Choices, x-kernel, Medusa, Clouds, Amoeba, Muse.
3. Client-Server Computation Model
- processes are either servers or clients
- server processes provide services
- a client process sends a request message to a server process and waits for a reply message; a server waits for request messages and responds when a request arrives

Communication Networks

Computers are connected through a communication network.

C1. Wide Area Networks (WAN)
- connect computers spread over a wide geographic area
- point-to-point or store-and-forward -- data is transferred between computers through a series of switches
- switch -- a special-purpose computer responsible for routing data (to avoid network congestion)
- data can be lost due to switch crashes, communication link failures, limited buffers at switches, transmission errors, etc.

Packet Switching versus Circuit Switching
- circuit switching -- a dedicated path between a source and a destination, e.g., a telephone connection; wastes bandwidth (bandwidth = the amount of data transmitted in a given time period)
- packet switching -- a message is broken into packets, and the packets are routed independently; better network utilization, but disassembly and reassembly overheads

The ISO OSI Reference Model

C2. Local Area Networks (LAN)

Communication Primitives
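The request/reply exchange of the client-server model above can be sketched with stream sockets as the communication primitive. This is a minimal single-request sketch; the loopback address, OS-chosen port, and message contents are illustrative.

```python
import socket
import threading

# The server socket is created and listening before the client runs,
# so the connection attempt cannot arrive too early.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

def serve_one():
    # A server waits for a request message and responds when it arrives.
    conn, _ = srv.accept()
    with conn:
        request = conn.recv(1024)             # the request message
        conn.sendall(b"reply to " + request)  # the reply message
    srv.close()

threading.Thread(target=serve_one).start()

# A client sends a request message to the server, then waits for the reply.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect(("127.0.0.1", port))
    cli.sendall(b"read file1")
    reply = cli.recv(1024)

print(reply.decode())
```

The client blocks in recv until the server's reply arrives, giving the synchronous request/reply behavior described above; a real system would add message framing, timeouts, and retransmission for lost messages.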