Operating Systems

by Tim Singleton

  Introduction

With the recent release of Windows 2000 and the anti-trust lawsuits, Microsoft Corporation has been in the news quite a lot. The term operating system has been thrown around a lot lately. However, many people have no idea what an operating system is or does, nor do they know that there are alternatives to Microsoft. The purpose of this paper is to inform the reader of some of the basic principles behind operating systems and the similarities and differences between various OS's.

Many computers today are shipped with Microsoft Windows already loaded. This gives the impression to some that an operating system is a static thing that only Microsoft makes, that cannot be changed even if they wanted to change it; if Windows were not there the computer would not and could not do anything. There are other choices than Windows. DOS, Linux, UNIX, SunOS, OS/2, BeOS, V2OS, Minix, and DECOS are just a few of the many. Some of these, such as UNIX and Linux are very similar to each other[1]. Others, such as Windows 2000 and V2OS are quite opposite from each other; one is designed for big business computer networks, the other for low level hardware development. There is a broad practical and theoretical spectrum that these operating systems represent. Some are meant to only be used on a network, some are meant to be used only on certain hardware. Some are free while others are very expensive. Generally, operating systems are used to allocate the various resources of the computer. In the more specific realm of multi-purpose PC operating systems, though, most do have some very important things in common. They all have a file system that controls and maintains files, directory structures, file permissions, and other things of the sort. Many have a virtual memory system, software that expands the computers random access memory (RAM) by using extra space on a hard drive. . Some operating systems choose to let certain resources, such as registers (small components of high-speed memory), memory, and files, be shared by multiple programs and applications at the same time. This creates many security issues that need to be dealt with.

This paper addresses three major areas of operating systems: the file system, virtual memory, and shared resources. It is by no means a complete discussion of the field, but provides an overview of some basic concepts and is a starting block for those interested in better understanding how their computer works underneath the glitz and glitter of Microsoft Windows.

File System

The first major portion of an operating system, the one the user interacts with the most, is the file system. The file system controls how files and data are stored, how they are retrieved, and how they are maintained. Data can be stored on many different media types: local hard drive, floppy disk, CD-ROM, DVD, tape and remote server are some examples. Generally, these storage devices can be divided into two types: sequential and random access. Sequential access stores data in chronological order. Two files that are stored one after another on a sequential device are guaranteed to be physically next to each other as well. Since files are stored in sequence, the only way to access them is sequentially. Microfilm is an example of sequential storage. When someone goes to the library to find a newspaper article from August of 1963, they must go through January to July sequentially before getting to August. Tape backup drives are an example of computers using sequential storage. Files are stored according to the date, which makes lookup very easy. If a user is looking at files from August 18 and wants to see August 19th's file, there is no question about where it is. On the other hand, sequential storage implies sequential lookup; there is no good way to jump right to November from January. One must go through all other months to get there. This can be quite inefficient if data from all months is accessed often.

The other main type of storage is random access media. These types often come in the form of main memory or rotating disks. A hard drive is an example of this. A hard drive consists of a number of circular platters and some read/write heads. Each of these platters is divided up into sectors, which go out radially from the center. Also dividing the patters are logical structures called cylinders. Cylinders corresponding tracks on each surface. In a four-platter disk, picture the first track from the center on each platter making up one cylinder (see Figure 1). Both the top and bottom of any platter contain separate data, using the total surface area more efficiently. Each side of every platter has a read/write head that scans out from the center in a radius across the side of the platter. When the platter spins, the head moves an appropriate amount from the center to the correct cylinder and the spinning disk will send all data on that cylinder under the head for reading. It reads the appropriate sector each time it comes by, and data are sent back to the CPU. In general, this is how rotating media works. It need not be broken up into sectors and cylinders (CD's are not), but some reading device sits scanning a spinning disk below it to get data.

Random access media are usually circular in shape, and the data stored on them are not in any particular order. Two files stored in chronological order are not necessarily sorted together physically on the disk, hence the name random access. Generally, finding a file in this file system is simply a lookup in a table or following short chain of pointers to the file. The time it takes to get to any given file, called access time, is where rotating media rises above sequential in terms of performance time. With rotating devices, there is some sort of read head that scans in a line across the disk along a radius out from the center. The disk spins and the head scans to the appropriate radius from the center to reach any bit encoded on the disk. Since files are not necessarily stored sequentially, the file system does not need to go through all previously written files to find the file it is looking for. This makes lookup for any given file very fast and efficient compared with sequential searching for files.

Rotating devices not only differ from sequential in data access, but also in storage mechanism. Sequential devices are confined to putting new files physically next to the last file saved. This is not the case with a random access disk. As long as there is a record of where the files are stored, they do not need to physically next to each other, even if that is the order they were stored in. In fact, a single file does not need to be stored in a single place; it can be broken up and stored in multiple locations. With some sort of record of file storage, half of a file could be stored on one part of the disk and the other half on another part.

Sequential storage and random access (usually in the form of rotating disks) are opposite ends of the spectrum. Sequential storage means files stored together chronologically will necessarily be stored together physically. If one file is deleted from a sequential storage device, that space is gone forever because each subsequent save will be at the end of the device. To get around this, there is random access media, where data do not need to be stored sequentially. This creates a different problem, however, of lookup: if a file's location is not known by its date and time, how is it known? The answer to this is different for different file system implementations. Microsoft uses a lookup table where filenames are mapped to disk addresses. UNIX and its clones use a series of pointers to get to the location of a file.

Microsoft's DOS (Disk Operating System) uses a two objects to store where its files reside on the disk: a Directory and File Allocation Table (FAT). The directory and FAT are where file names are stored and mapped to respective locations. On any disk, whether it is a floppy or a hard disk, the total space on that disk is logically broken up into allocation units, or clusters (Microsoft uses the term cluster, another common term is block). The size of a cluster depends on the size of the disk in question. A 1.44 MB floppy disk usually uses one or two sectors per cluster, whereas a hard drive may use four or more (Norton, 1994). Each entry in the File Allocation Table, or FAT, corresponds to a single cluster on the disk. If a cluster is unused on the disk, it is marked with a 0 in the FAT. Other entries in the table signify different parts of a file. If a file uses only one cluster, its corresponding entry in the FAT is marked with EOF, signifying the end of that file, as shown in cluster 2 in Figure 2. The first two entries in the FAT are always blank. They correspond to the portions of the disk that contain master boot records and partition tables- not part of the file system (Norton, 1994). If a cluster of the disk is found to be bad, that will be marked in the allocation table as well. If a file takes up more than a single cluster on the disk, that will be represented in the FAT by a pointer to the next cluster that file is contained in. Consider a file that starts in cluster 3 as represented in the example FAT of Figure 2. The clusters are ordered starting a 0, not 1, so the fourth entry in the table corresponds to cluster 3. The next cluster this file occupies is cluster 4, as shown by the pointer to cluster 4 in the FAT. This cluster ends and more of the file lies in cluster 9. The entry in the FAT representing cluster 9 points to 11, which is where the remainder of the file lives, signified by the EOF marker in the FAT. Clusters linked in this manner are sometimes called file chains.

The file names are mapped to each of these file chains using a structure called a directory. This is simply a list of file names and their corresponding first cluster locations. An application would look in the directory for the file name it wanted to access, get its first block from the directory, then get the rest of the file information (whether that is the end of file signifier or the next cluster address) from the FAT. The root directory, along with the FAT, is always stored in a known location, so no lookup is needed to find the lookup tables. Directories in DOS are analogous to folders in Windows. Along with filenames, they also contain subdirectories, which is like a folder within a folder. To get to the file C:\DOS\msdos.sys, the file system would first to the C:\ directory. It sees the DOS subdirectory there and the cluster address of it. It goes to that address and looks in the C:\DOS directory for msdos.sys. It gets the address of the first cluster of that file and retrieves it. It also checks in the FAT to see if there are more blocks in the file, and follows the file chain to get those.

Using this method, files do not need to be stored in contiguous clusters. Small files can be deleted with no worry. Large files can be partially stored in the spaces left open by these small files. This has advantages and disadvantages. The obvious advantages are ease of storage and not needing large amounts of contiguous free space. This is an efficient use of disk space, but it also makes for slow file retrieval. If all files are fragmented in this way, the read/write head on the hard disk may have to move much further than necessary, changing heads and cylinders multiple times. The efficiency gained in disk usage is paid for in speed.

File allocation tables are not the only way to handle files. The UNIX operating system and its derivatives use inodes to keep track of file placement. Inodes are like the FAT's file chains, in that they contain pointers to each block of the file. However, the FAT keeps all pointers in a central table, where each file has its own inode, they are not all stored together. An inode generally contains pointers to the first twelve blocks of a file, and then three levels of indirection for larger files. A large number of files in the UNIX system are very small, less than 12 blocks in size, making inodes very efficient for small files. For any files larger than this, UNIX uses the idea of indirects. An indirect points to a particular block on the disk. This block is laid constructed like a typical inode in that it contains pointers to other blocks that contain actual data, however an indirect block is much larger than a normal inode. For example, a single indirect may contain 1000 pointers, each pointing to another block on the disk that contains part of a file. If this does not provide enough storage space for a single file, a double indirect will be used. Similar to a single indirect, this points to a block, set up like an inode, which contains a large number of pointers to a single indirect, each of which contains pointers to data blocks. Thus the potential file size has increased exponentially. Clearly, a triple indirect is a block that contains pointers to double indirects, which contain pointers to single indirects, which contain pointers to file blocks. It is easy to see that this method can handle very large files -- on the order of a terabyte, or 1000 gigabytes. The disadvantage to this method is that for large files, multiple levels of indirection must be traced before the data can be reached, possibly many times, slowing the process down, whereas in the case of DOS, a pointer always points at a block of data, there is no indirection. But since most files in UNIX are relatively small, most of them are dealt with by the first twelve block pointers, and the second and third levels of indirection are rarely needed.

Both UNIX and MS-DOS use a hierarchical naming scheme to organize their file system. This is likened to a tree structure with directories or folders being branches in the tree and files being the leaves. In a hierarchical structure, there is a root directory (/ in UNIX, \ in DOS) and potentially other directories and files under that. Under each of these secondary directories are potentially more directories and files. In the DOS scheme, the root directory is kept in a known location in main memory, which gets loaded from a known location on the disk. Every file and directory in the file system live under this root. With a hierarchical file path like C:\DOS\msdos.sys, the msdos.sys file can be found by starting at the root directory and looking for the DOS subdirectory. This is a pointer in the root directory to the address of the DOS directory. The file system then goes into the DOS directory and looks for the msdos.sys file.

The hierarchical naming scheme works similarly with inodes. The root inode is stored in a known location on the disk, which gets loaded into a known location in memory. Consider looking for the /home/x0tsing/.forward file. The file .forward can be found by following the root's inode to the home inode, which would point to the x0tsing inode, which points to the .forward inode, which points to the actual block(s) that make up that file.

Depending on the particular OS, files may be found by looking up their names in a directory and following the corresponding file chain in the file allocation table. They may be found by going to the root inode and tracing down through the hierarchical directory structure until finding the inode for the given file. These are only two examples of how OS's handles files and directories, and there are others. No matter what method is used, though, the file must somehow get from bits on the media it is stored on to a form recognizable to an application in main memory. This is also a job of the operating system.

To do this as seamlessly as possible, a number of layers of abstraction must be passed through. These layers include the application, the device driver that communicates with the application, and the device controller, which communicates with the driver and the hardware, and the hardware itself. This multi-layered method is used for many reasons. The program does not know where the data it is opening is stored, nor does it care. It would be nearly impossible to require every piece of software to be able to control every piece of hardware a system may contain. If the operating system takes care of handling the hardware, and just providing a common interface through which the software communicates, programs could send the operating system a command like "open", and not have to worry about what happens next, as long as it gets the data it expects. This makes applications smaller, easier to write, more efficient, and more stable. Another reason for the multi-layered system is protection. If the programmer could access whatever hardware they desired, hacking, viruses, and accidental catastrophic system failure would be rampant because of the lack of security, not protecting those components that are not meant to be accessed.

The first of these layers above the hardware is the device controller. This is a combination of software and hardware that performs very low level actions such as spinning the hard drive platters. This device controller, aptly named, physically controls the device, it tells it what movements to go through and what to do once it gets there. In the case of a hard drive, this would be in charge of moving the read/write head, spinning the disk, and reading and writing bits on to the disk. [2]

The next level of abstraction away from the hardware is the device driver. Implemented in software, this abstraction gives commands like "write a block of data" to the controller, which translates it into "move read/write head to position x", and "change the magnetic polarity of these bits to this". This device driver is the interface for the application. When the user clicks "Save", the application sends a request to the device driver saying "Save this text to this file". The driver gets the request, translates it into "write this block of data to this location on the disk", and passes it on to the controller. The device controller gets this request and in turn translates it into "move the read/write head to cylinder/sector/head x, and change the polarity of these bits to this."

The device driver and controller provide levels of abstraction between the hardware and the software. This enables each end component to work independently of each other, making it more modular and thus easier to maintain and upgrade. If one hard drive is replaced with another, the software does not need to know about it. It does not matter to the application how many systems its data passed through before it got to the hardware. It does not matter that it is getting its data from a hard drive or a CD or a network- all very different retrieval methods.

Abstraction is a very important and often used concept in computers. Abstraction is the practice of one component in a system working independently of all others. Also involved in abstraction is components not needing to know what other components exist or how they work. This is a key feature in the file system, file retrieval in particular. There are very distinct levels of abstraction between the application hardware. The concept of a file is abstracted from the 0's and 1's that are actually stored on a drive. The FAT/inode is abstracted from the application so it does not need to know how the file is retrieved. However, this is not limited to the file system. The whole concept of virtual memory is based on the fact that the CPU does not need to know how much main memory there is in a given system, as long as it gets memory space when it needs it.

Virtual Memory

The purpose of virtual memory is to give the CPU the impression that it has much more primary memory (i.e. RAM) than there actually is. Primary memory can be up to 7 orders of magnitude faster than secondary storage (hard disk)[3]. Therefore programs loaded in primary memory run much faster than if they were run from hard disk. Likewise files that are read and written to main memory are accessed much faster than from the disk. The ideal situation would be to have everything stored and running in main memory. This is not a viable option now, though, in part because of the physical size and cost of memory: at the time of this writing, main memory is between 10 and 20 times more expensive than secondary storage (http://www.pricewatch.com). Another reason this is not possible is because main memory is volatile: every time the power is turned off, all data is lost. Virtual memory is used to make the CPU believe there are gigabytes of primary memory available to it, knowing that only small portions of that will ever be used at any given time.

The principle behind virtual memory is to swap programs and data out of primary memory when they are not being used, and swap them back in as they are needed. For instance, if two programs are open, say Netscape and Word, only one is likely to be used at one time[4]. If a user switches from Microsoft Word to Netscape, Word and the buffer being used by it are not needed. They can be swapped out of main memory and into secondary storage (the hard disk), leaving more space for Netscape to do what it needs to do. Word sits on the relatively slow hard disk because it is not being used and does not need to be accessed with any speed. When the user decides he has wasted enough time online, he brings Word to the foreground. The memory manager looks for Word in main memory. If it is not there, it must be swapped back in from the hard disk; this is called a memory fault. Word is swapped back into the fast primary memory, and Netscape swapped into secondary for its dormancy. Sections of programs can be swapped as well; it need not be the entire program. For instance, if this user was in Netscape and wanted to check his mail using Netscape's Messenger mail client, the memory manager could swap out the browser portion of the Netscape program, and swap in the mail client.

This form of virtual memory is called segmentation. The other main form commonly implemented is paging. Both segmentation and paging follow the basic idea of swapping in sections of programs and data when they are needed and swapping them out when they are not, but they go about it in different ways. In segmentation, predefined portions of the program are swapped, whereas in paging the memory manager breaks up programs and data into equal sized sections called pages and swaps those as it sees fit. Regardless of which mechanism is used, the necessary code and/or data are always in main memory when being accessed (as opposed to on the disk). This speeds up the whole process greatly and the CPU does not waste valuable cycles waiting for the program to execute from the hard drive. Even though only parts of the program are in main memory at any given time, it may as well all be there it as far as the CPU is concerned.

Segmentation is an older form of virtual memory. People trying to figure out how to efficiently swap in and out programs noticed that most programs could be broken up into distinct logical segments, which could be swapped in and out separately. The decision of where these segmentation lines were drawn was originally left up to the programmer. If he wanted to write a program with 15 different segments in it, he could take the time to figure out where would be the best place to put one, where a likely place to stop or swap would be, and code the segment breaks himself. One good place to define a segment is the portion of code that defines error codes. They are not used very often, and therefore can be kept in secondary storage until they are needed.

As time went on, compilers got smarter and began doing the segmentation for the programmer. The C compiler began segmenting the program automatically into three types of segments: code, temporary variables, and static variables (Nutt, 1992). During compilation, the compiler generated code segments that contained the executable code. The second type of segment is for temporary variables. The third is a data segment for variables declared and assigned statically within the program. Using this method, the program would load one or more code segments into main memory. It would execute until it needed to do work with variables. It would then load a variable segment into memory, potentially swapping out a code segment. When that was finished the code segment may be swapped back in if needed. Thus the virtual memory system had done its job: it maintained the illusion of having large amounts of main memory while running all programs in main memory regardless of the size.

Over time, it was noticed that certain sections of code tended to be used over and over again. Loops repeat many times, certain function may get called every second or third command. If one memory location is needed, chances are good whatever is physically nearby will be needed as well. This is called the locality of reference principal, and is the basis for the paging implementation of virtual memory. Paging is like segmentation in that parts of the program and/or data are swapped in and out of memory when needed. Unlike segmentation, however, the size of this section to be swapped is fixed, not dependant on the compiler or programmer but on the operating system. The memory manager declares a certain number of bytes to be the size of one page. Thus all of secondary storage can be divided up logically into pages. Likewise main memory can be divided up into equal sized sections called page frames. The size of these pages and frames are operating system and CPU dependent. For instance, DOS uses a 4 KB page on x86 processors (McFedries, 1998) as does Linux (Russling, 1996), but Novell's NetWare 4 OS can have up to a 16 KB page depending on the CPU (Lee, 1995).

When a program is loaded, the first page on disk gets swapped into main memory and executed. If that page gets used often (a loop, for example) it will stay in memory where it is needed. If more pages are needed they will be swapped in. Pages currently in memory that are not needed will be purged to make room. This applies to data as well as code. If a database file keeps getting accessed, it will stay in memory the same way code would; there is no differentiation between the two. The page that was swapped out gets put back on the hard drive in a special location reserved for the virtual memory system, so accessing it does not go through the file system and is faster. UNIX-like OS's assign a partition of the hard drive, called the swap partition to the virtual memory system for swapping. Microsoft operating systems use a page file for their swapping.

The main advantage of paging over segmentation is the small page size. The operating system decides what is the best size for a page, and uses that. This allows for more efficient use of swapping and the limited space in main memory.

Paging works very well, but when all the page frames in main memory are full, the decision on which page gets swapped out creates difficulties. The question of why one page should stay in memory and another should be swapped out is the subject of much research and debate. There are a number of replacement policies, each with its own advantages and disadvantages. Random replacement, least recently used, least frequently used, and first-in first-out are four standard examples. There are algorithms that find optimal or near-optimal page replacement, but these techniques often are more costly than they are worth. Often with these, the time spent deciding what page to replace take up more time than they would save in optimal paging.

Random replacement swaps out any random page when a page frame is needed. This was used in early paging schemes, but it was found to actually decrease performance, so that was quickly discarded. The least frequently used (LFU) policy swaps out the page that has been used the least. If there are 10 page frames, and 9 of them are loops that are used often, the 10th page, which is not used as often as the rest, would be swapped out. This policy assumes that if a page is not being used at all, or relatively infrequently compared to the others, it will continue not being used and is a good choice to swap out. The first in first (FIFO) out policy treats main memory like a queue. Every page stays in for the same number of swaps, based on the assumption that if a single page is used often enough, more pages will be not need to be swapped in as often, keeping it there. Once that popular page gets swapped out it will quickly go back in again and stay for another long while. Otherwise, one page is as good as another to replace, so remove the one that has been taking up space the longest. The policy used most often in today's implementations is the least recently used (LRU) method. When a page frame is needed in main memory, it swaps out the one that has been unused for the longest amount of time. So if a page containing a loop is used over and over that would stay in memory until it was no longer used. The difference between LRU and LFU is that if a page is accessed a lot of times in the beginning of the program and not again at all, it was still used very frequently compared to the other pages, so LFU would keep it in memory. On the other hand, since it is not being used anymore, there are better ways to use the space it is taking up, so LRU would kick it out. The least recently used replacement policy is one of the more efficient policies currently implemented, and it is very easy to code into the memory manager, so it is used more than most.

All of these methods enable a memory manager to make efficient use of memory by replacing sections of data or code that is not being used with that which will be. The effect of this is to make a limited amount of high speed memory seem essentially unlimited. Segmentation does this by swapping entire predefined code or data segments of variable size into memory and exchanging it for one or more segments that are not needed. Paging goes through a similar procedure by swapping in needed data or code and swapping out that which is unneeded. However paging swaps a particular size of memory in pages that are a particular size decided by the operating system, not the programmer or compiler.

Not all of main memory is used for the Virtual Memory system though; memory is divided up into other sections as well. There is a section of memory set aside for the file system's root directory or root inode, the mouse location is stored somewhere special, the current window coordinates have a pre-defined space. The virtual memory system has its own memory space that is shared among all processes that use it; all applications, all editors, all user-driven programs share virtual memory space. How the memory manager deals with all these different processes trying to share the same space is an example of shared resources. Many of the resources in a computer, such as the CPU and temporary registers must be shared to give efficiency to the system as a whole. However the principle of abstraction says that processes do not and should not know what other process are doing. This creates potential difficulties with coordinating all processes trying to access the same resource.

Shared Resources

Sharing resources creates conflicts among the processes that need access to that resource. On any system where multitasking and multiprocessing are possible, there are shared resource issues. If two different processes want to read in a shared variable, update it for their own purposes and then set it back again, there will be problems when attempting this simultaneously. Consider process A and B sharing a register. Both want to get data from memory and store it into that register, both want to change that data, and both want to store the new data back, all at the same time. The simple lines of assembly code below represent this. Process A is the assembly for I := I + 10 and Process B is for the instruction J := J - 5.



         Process A                  Process B
     Load    R1,   I             Load    R1,   J
     Add     R1,   10            Sub     R1,   5
     Store   R1,   I             Store   R1,   J
where R1 is a register and I and J are memory addresses.

Process A loads data from memory location I into the register and adds 10 to it. Then process B loads into the same register what is in memory location J, process A will store the value in the register (whatever came from address J) into address I- not at all what it wants or expects. Process B will subtract 5 from register 1 and store that value back into J exactly as it had planned. Process A entered into a critical region of time during which the data was vulnerable to being changed and it did not have exclusive access to the register. This critical section was between the time it loaded the register and the time it stored it. These three steps are obviously not one atomic action that cannot be broken down any further. Just as an atom is the smallest part of a molecule, it is also the smallest possible instruction that cannot be separated into pieces. Process A's command sequence of load/add/store was not atomic, therefore could be interrupted by Process B, since it was able to use the register as well. This creates a race condition between the two processes. Whichever can store its value before the other changes it will win the "race" and maintain its data. If Process B had loaded its data into R1 between Process A's load and add instructions, both would have lost this race.

This is a dangerous side effect of shared resources, whether they are registers, shared program variables, or common files. One way to protect these shared resources during the critical section of data manipulation is to lock that resource. This can be done, among other ways, by using semaphores. A semaphore is a variable that is available to every process using the shared resource that it protects. A semaphore can be seen as a switch: when the switch is off a process can turn it on, write to the shared resource, then turn it back off. If the switch is already on, the process must wait until it is off to write its data. Dijkstra, who came up with the implementation of semaphores, defines them this way (Nutt, 1992):

     A semaphore s is a non-negative integer variable that 
     can only be changed by two operations: P and V.

     P(s) = while (s=0) wait;   s <- s-1
     V(s) = s <- s+1 
If s= 0 then the switch is on. If s> 0 the switch is off.

It is very important that both P and V are atomic operations. Nothing must be able to interrupt a process while it is executing P or V or the same problem as described above will occur. The semaphore sis initially set to 1 for any process to be able to access it. This is like the switch being off. When a process (call it Process A) wants to have exclusive write access to a shared resource, it must execute P(s) for that resource's semaphore s. Process A sees that sis set to 1, so it skips over the wait and decrements sto 0. The resource is now locked (the switch is on). If Process b needs to access it, it sees sis set to 0, and it waits until A is finished and increments sto 1. B sees that sis set to 1, skips the wait, sets it to 0, and does its business.

It is essential that P is atomic. This is tricky to achieve because P has two parts and is difficult to execute as one operation. If P is not atomic, however, process A might execute the wait and see that s= 1. Between this point and when A decrements s, Process B may also see that s= 1; this is A's critical region. Now process A decrements s, setting it to 0 and locking the resource. B already tested sthough and it saw it was 1, so it also decrements s, setting it to -1. The definition of a semaphore is a non-negative integer variable, so this last change crashes the semaphore and the resource is no longer protected.

If P can be made atomic, semaphores are a good locking mechanism. While the resource is locked (s = 0), any process wanting to write must wait. When it is not locked, whichever process executes P first locks it by setting testing sand setting it to 0 at the same time. It writes to the shared resource, then executes V and sets sback to 1. Thus the resource has been locked while writing occurs, ensuring data integrity.

There are some situations, however, where semaphores are not the best protection scheme. If P cannot be made atomic, the semaphore may as well not be there at all. Another problem with semaphores is they sometimes lead to a phenomenon called starvation. This is when a process A is always waiting because s= 0 and every time it gets set back to 1, another process sees it first and sets sback to 0 locking process A out again. There is no precedence based on which process has been waiting longest or needs the resource the most. If there is some time frame in which process A must execute its command, it can literally die of starvation if it is not able to access this resource.

In cases such as these, monitors are often chosen to protect resources. A monitor is somewhat like a common feeder line for a teller at the bank. If any process wants to access that resource, it must wait in line. The key to using a monitor for protection is that it must have exclusive access to the resource. It alone has the ability to grant access rights to the resource it is protecting. These are usually implemented as queues, granting access to each process serially, based on the order they arrived. Since the monitor is what accesses the shared resource, it prevents race conditions because two processes cannot interrupt each other. It does not matter which process gets to the monitor first, they will all be served regardless of timing. Critical regions are no longer a problem either. Since the monitor has exclusive access to a resource, it dictates which processes are able to use it, and prevents all others from doing so, thus monitoring the critical region itself.

The use of monitors is an improvement over semaphores, but they have problems as well. Monitors create potential scheduling issues for the resource it is guarding. The most shared resource of any in the computer is the central processing unit (CPU), this may be guarded by a monitor like any other shared resource. If one process requires 1000 CPU cycles, and it is first in line for the monitor, and another process only needs 10 cycles is behind it, it makes no sense to proceed in the established queue order. On the other hand, if there are many processes, all waiting behind the first one, they should not all go in front or starvation will occur. There are algorithms for scheduling processes so each gets a fair time slice, some according to how many cycles is needed, some not. The most obvious one and easiest to implement is a simple queue: First-In First-Out. This undoubtedly works, but there are better ways. Just using a FIFO implementation is like having no express checkout line at the grocery store. Everybody will eventually get to a register, but some may wait unfair amounts of time for the few items they have. Another option is Shortest Job Next. This takes the process that is requesting the fewest cycles and gives that job precedence. Then it serves the next shortest job. This method brings about the problem of starvation for big jobs, just as semaphores could do for any job. If there are always short jobs coming into the queue, a process requesting a large number of CPU cycles is likely to keep getting postponed, perhaps indefinitely. This process literally starves for cycles, and may die, thinking there is no CPU. Each of these algorithms takes a process and runs it to completion before taking another.

To prevent problems of this type, preemptive algorithms have been developed to give each process a fair share of cycles. This may happen by granting a small job all of its cycles, and then a large job a portion of its requested cycles and then putting it back into the queue. This Round Robin method keeps feeding CPU time to processes. If any process needs more CPU time, it is sent back to the queue, partially unfinished but guaranteed another turn.

Preemptive algorithms make for efficient division of CPU resources, but the more complex and efficient the algorithm is, the more time it takes to execute. The best scheduling algorithms may be a good choice in terms of multiprocessing efficiency, but it may defeat the purpose if the algorithm takes up more time deciding which process to pick next than it saves over uniprocessing.

Monitors can be to protect hardware, such as the CPU and registers, but they protect data just as well. If there is a particular file that gets opened often and written to by various processes, a monitor would be a good choice to protect this. These processes may be doing this as a means of sharing data common to them all. If multiple programs needed to access the same data repeatedly, they may allocate a central spot in main memory to share between them, and use as needed. With shared memory, as with all other shared resources, there needs to be some protection. To ensure shared data is not corrupted or monopolized, semaphores or monitors could be put in place. Semaphores would guard the data by locking the file every time a process attempts to write, but leaving itself open to a crash because of its critical region. Monitors would enqueue processes trying to access the file and give each its own turn. This takes up resources, though, and if a similar system is needed for many processes, this could be quite inefficient. Another method of sharing data between processes is to allow the processes to communicate with each other. This interprocess communication (IPC) is much safer than shared memory because every process gets the data exclusively. There are no critical regions and no chance for unexpected results simply because the data is shared. IPC is fundamentally different than resource protection because the data is being passed from process to process, instead of the processes going from resource to resource.

The most logical form of IPC is message passing. When process A wants to communicate with process B, A simply sends B a message. Sending messages can happen one of two ways: synchronously, and asynchronously. Synchronous sending implies that sending process sends the messages and then does nothing until the message is received. Asynchronous sending occurs when one process sends out its message and then goes on with normal operation. This is a more efficient method, but it cannot guarantee or confirm receipt like synchronous sending can. Each process may have a section of its assigned memory opened up as a "mailbox" of sorts for interprocess messages. Asynchronously sending a message would consist of a process giving its message up to the operating system to copy into another process' open memory space. This takes care of the shared resource problem because the OS is the only process allowed to write to shared memory. These messages could be enqueued with a monitor if there were more than expected. This achieves communication between processes without the dangers of all needing access to a single piece of shared memory or other resource.

Similar to synchronous and asynchronous sending, receiving can be either blocking or nonblocking. A blocking receive means that a process blocks all other input and halts execution until it receives a message. A nonblocking receive comes from a process that performs its operation regardless of messages. It will check its mailbox and if there is no message when a receipt has been requested it will continue.

UNIX's "pipes" is example of interprocess communication. This is a utility that uses asynchronous sending and blocking receive. Programs often produce output that is not in a desirable format to the user. A user may want to print out a directory listing looking for a certain file. If this is a huge directory, he may not want to look through each entry, but put that directory listing through grep and search for the desired file[5]. Pipes make this possible and easy. Pipes can be likened to a pipe connecting the standard output of one process or program to the standard input of another. In this example, the output of the directory listing can be "piped" to the grep utility to search for the filename. The output of grep, the desired filename, is then the only thing printed to the screen. In terms of IPC, the directory listing process and grep are both executed. Grep knows it is waiting for input, so it blocks its execution until it receives input. The directory list asynchronously sends its output to grep continuing on with its execution, not worrying about if the messages are being received.

Example

As a final example, consider a DOS user editing the "myfile.doc" file in the directory C:\PAPERS[6]. This user is using a word processor such as Word Perfect. He just opened the application and wants to open the file. The "file open" command always results in a disk fetch. The file being opened implies that it was previously closed and would not be paged in main memory. The application's request goes to the device driver for the hard disk with a command like "open the file 'C:\PAPERS\myfile.doc'". The driver looks up the subdirectory name "PAPERS" in the root directory and gets the cluster location where that directory is stored. It tells the device controller to fetch this cluster. When it gets that directory back, it looks for the file name "myfile.doc" and sees the first cluster location. The driver tells the device controller to fetch this cluster. The controller tells the platters on the disk to start spinning, tells the read arm to move to a certain distance away from the center, and return the bits it senses when the appropriate sector comes around. The controller then passes the entire block back up to the driver. The driver gives this data to the memory manager for temporary storage until it can be given to the application. If this is not the entire file, the driver then goes into the File Allocation Table and looks up the next cluster location. In this location in the FAT there would be either a pointer to the next cluster to fetch or an End Of File marker. If there are more clusters to fetch, the driver tells the controller to do so and continues following the chain of pointers through the FAT. When the EOF marker has been reached, the process stops and control is turned back to the memory manager.

The memory manager gets the "myfile.doc" file and must swap it into main memory. Assume the paging replacement policy is Least Recently Used. The memory manager finds the page frame that has been least recently accessed, swaps it out and swaps in the first page frame of "myfile.doc" in its place. The application can now get the file from main memory and print it to the screen. More page frames will either be swapped in right away or they will be stored in the page file to be swapped in later when the application requests the rest of the file.

It is likely that this file is larger than 4 KB, the size of a single page. When the user scrolls down to the end of the document, the application may not have that portion of the file in its memory space because it was never swapped in. It would look in its memory space, and not finding it throw a page fault, which signals the memory manager to swap in the requested page. The memory manager then finds the page that has been least recently used (most likely not the other page of the open file), and puts it into its own virtual memory space. Then the requested page frame of "myfile.doc" is put into the newly freed virtual memory location, and the swapped out page is moved from the memory manager's space onto the hard drive.

Conclusion

Operating systems are incredibly complex things, both on the theoretical and physical level. Individual concepts are relatively simple. The idea of virtual memory being a system to swap in parts of programs or data in order to keep necessary information in main memory makes sense, it is an intelligent idea. Sharing resources is a logical step from processes wanting to run at the same time. The issues that arise with shared resources are dealt with in a number of ways- such as semaphores, monitors, and inter-process communication- each having strengths and weaknesses. The file system simply manages files, a directory structure, and how the applications communicate with the hardware to access these files. There are many more aspects of operating systems that have not been touched upon, such as an operating system handling multiple CPU's and file protection models.

However it is not any one component that makes an operating system so amazing. It is the simultaneous practice of abstraction, modularity, and integration that is truly impressive. Each component has been specifically designed so that it does not need to know what any other component is doing, or that it even exists. The application would perform equally as well with or without virtual memory, as would the file system. This level of abstraction and modularity enables components to be taken out and entirely new ones dropped back in with no need for informing the other parts. A totally new file system could be implemented in an operating system neither the virtual memory system, nor the applications, nor the shared resources would have any idea of what had happened. Nor would it matter. Knowing how complex each component can be, it is truly amazing that the whole thing works, and works as well as it does.











Bibliography and Works Cited
Lee, R. "The NetWare 4 Memory Architecture." (1995, April 6). 
     Provo, UT: Novell Corporation. Retrieved April 10, 2000 
     from the World Wide Web: 
     http://developer.novell.com/research/appnotes/1995/april/06

McFedries, P. (1998) Complete Idiot's Guide to Windows 98 
     Indianapolis: MacMillan Publishing.  Retrieved April 10, 2000 
     from the World Wide Web: 
     http://www.geocities.com/Pipeline/5748/opmem13.htm

Norton, P. (1994)  Peter Norton's Complete Guide to DOS 6.22 
     (6th ed.).  Indianapolis: Sams Publishing.

Nutt, G. (1992).  Centralized and Distributed Operating Systems. 
     Englewood Cliffs: Prentice Hall.

Rusling, David A. The Linux Kernel.  
     Copywrite 1996 http://www.linuxdoc.org/LDP/tlk/

Tanenbaum, A. (1990).  Structured Computer Organization (3rd ed.). 
     Englewood Cliffs: Prentice Hall.