File Management
UNIX has three types of files: directories, ordinary files, and special files. Each file enjoys certain privileges.
Directories are files used by the system to maintain the hierarchical structure of the file system. Users are allowed to read information in directory files, but only the system is allowed to modify directory files.
Ordinary files are those in which users store information. Their protection is based on a user’s requests and relates to the read, write, execute, and delete functions that can be performed on a file.
Special files are the device drivers that provide the interface to I/O hardware. Special files appear as entries in directories. They’re part of the file system, and most of them reside in the /dev directory. The name of each special file indicates the type of device with which it’s associated. Most users don’t need to know much about special files, but system programmers should know where they are and how to use them.
UNIX stores files as sequences of bytes and doesn’t impose any structure on them. Text files (those written using an editor) are therefore strings of characters with lines delimited by the line feed, or new line, character. Binary files (those containing executable code generated by a compiler or assembler), on the other hand, are sequences of binary digits grouped into words as they will appear in memory during execution of the program. In other words, the structure of files is controlled by the programs that use them, not by the system.
The UNIX file management system organizes the disk into blocks of 512 bytes each and divides the disk into four basic regions:
• The first region (starting at address 0) is reserved for booting.
• The second region, called a superblock, contains information about the disk as a whole, such as its size and the boundaries of the other regions.
• The third region includes a list of file definitions, called the i-list, which is a list of file descriptors, one for each file. The descriptors are called i-nodes. The position of an i-node on the list is called an i-number, and it is this i-number that uniquely identifies a file.
• The fourth region holds the free blocks available for file storage. The free blocks are kept in a linked list where each block points to the next available empty block. Then, as files grow, noncontiguous blocks are linked to the already existing chain.
Whenever possible, files are stored in contiguous empty blocks. And since all disk allocation is based on fixed-size blocks, allocation is very simple and there’s no need to compact the files.
Each i-node (also spelled inode) contains 13 disk addresses. The first 10 addresses point to the first 10 blocks of a file. If a file is larger than 10 blocks, the eleventh address points to a block that contains the addresses of the next 128 blocks of the file. For larger files, the twelfth address points to a block containing 128 addresses, each of which points to a block holding the addresses of 128 more file blocks. For files larger than 8 MB, there is a thirteenth address, which adds one more level of indirection and allows for a maximum file size of just over 1 GB.
Each i-node also contains information on a specific file, such as the owner’s identification, protection bits, physical address, file size, time of creation, last use and last update, number of links, and whether the file is a directory, an ordinary file, or a special file.
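The addressing arithmetic described above can be checked with a short sketch. The block size (512 bytes), the 10 direct addresses, and the 128 addresses per indirect block come from the text; the function names (capacity_in_blocks, locate, max_file_size_bytes) are hypothetical and chosen only for illustration, and the sketch models the arithmetic rather than any actual UNIX kernel code.

```python
BLOCK_SIZE = 512        # bytes per disk block (from the text)
ADDRS_PER_BLOCK = 128   # addresses held by one indirect block (from the text)
DIRECT_SLOTS = 10       # i-node addresses 1-10 point directly at file blocks


def capacity_in_blocks():
    """Number of file blocks reachable through each addressing level."""
    return {
        "direct": DIRECT_SLOTS,            # addresses 1-10
        "single": ADDRS_PER_BLOCK,         # address 11
        "double": ADDRS_PER_BLOCK ** 2,    # address 12
        "triple": ADDRS_PER_BLOCK ** 3,    # address 13
    }


def locate(block_index):
    """Map a file's logical block number (0-based) to an addressing level
    and the offsets followed inside each indirect block at that level."""
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    caps = capacity_in_blocks()
    if block_index < caps["direct"]:
        return ("direct", (block_index,))
    block_index -= caps["direct"]
    if block_index < caps["single"]:
        return ("single", (block_index,))
    block_index -= caps["single"]
    if block_index < caps["double"]:
        return ("double", divmod(block_index, ADDRS_PER_BLOCK))
    block_index -= caps["double"]
    if block_index < caps["triple"]:
        first, rest = divmod(block_index, ADDRS_PER_BLOCK ** 2)
        second, third = divmod(rest, ADDRS_PER_BLOCK)
        return ("triple", (first, second, third))
    raise ValueError("block index exceeds the maximum file size")


def max_file_size_bytes():
    """Largest file addressable through all 13 i-node addresses."""
    return sum(capacity_in_blocks().values()) * BLOCK_SIZE


if __name__ == "__main__":
    print(locate(9))               # last block reached by a direct address
    print(locate(10))              # first block reached through address 11
    print(max_file_size_bytes())   # total bytes across all four levels
```

Under these assumptions, the first three levels together cover 10 + 128 + 16,384 blocks, or roughly 8 MB, which is why the thirteenth address is needed only beyond that size; adding the third level brings the total to a little over 1 GB.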
File Naming Conventions
Filenames are case sensitive, so uppercase and lowercase letters are treated as different characters. For example, these are legitimate names for four different files in a single directory: FIREWALL, firewall, FireWall, and fireWALL. Most versions of UNIX allow filenames to be up to 255 characters in length.
Although the operating system doesn’t impose any naming conventions on files, some system programs, such as compilers, expect files to have specific suffixes (which are the same as the extensions described in Chapter 8). For example, prog1.bas would indicate the file to be a BASIC program because of its suffix .bas, while the suffix in backup.sh would indicate the file to be a shell program.
UNIX supports a hierarchical tree directory structure. The root directory is identified by a slash (/); the names of other directories are preceded by the slash (/) symbol, which is used as a delimiter. A file is accessed by starting at a given point in the hierarchy and descending through the branches of the tree (subdirectories) until reaching the leaf (file). This path can become very long, and it’s sometimes advantageous to change directories before accessing a file. If the file needed is one level up from the working directory in the hierarchy, this can be done quickly with the cd command followed by two periods (“..”), which refer to the parent directory. Typing cd ../.. will move you up two branches toward the root in the tree structure.
To access the file checks in the system illustrated in Figure 13.9, the user can type