Filesystems

Overview

The basic definition of a filesystem was discussed in the filesystems section of Understanding UNIX Concepts. In this section, we will go into more detail about the different types of filesystems encountered on a UNIX system, and the utilities used to create, mount, and maintain filesystems.

Physical Filesystems

Storage space on a computer usually resides on several devices, which may use different types of media, including hard drives, CD-ROM drives, and floppy drives. Each of these devices has a distinct physical filesystem associated with it. There are numerous types of physical filesystems found under UNIX, including:

ufs: The standard UNIX File System
bffs: The Berkeley Fat Fast File System, an improvement over the original UNIX filesystem
nfs: The Network File System, an abstraction for accessing filesystems on remote machines over a network
msdosfs: The filesystem used by MS-DOS; usually available on UNIX variants running on IBM-PC compatible computers
cd9660: The ISO-9660 filesystem for CD-ROM drives

The types of filesystems available vary from operating system to operating system, and cover a wide variety of devices and media.

The UNIX File System (Logical Filesystems)

The UNIX kernel provides a standard interface to each of these filesystems. As far as the user is concerned, each physical filesystem is accessed using the same set of UNIX system calls. The aim is to provide as consistent an interface as possible. It is this consistency that allows the set of physical filesystems to be represented as a single directory hierarchy.

UNIX designates one filesystem as the root filesystem. This filesystem is mounted at boot time, and has its base at the top of the hierarchy. All paths beginning with '/' are resolved starting from the root filesystem; '/' itself is the base of the root filesystem. Other filesystems are attached to the existing hierarchy using the mount command. mount takes a filesystem and maps it to an existing directory in the file tree, called the mount point. Once a filesystem is mounted at a given mount point, the file tree of that filesystem is accessed as if it were contained in the directory serving as the mount point.

Mount points are typically empty directories, though not necessarily so; if a directory serving as a mount point contains anything, the contents will be inaccessible while the new filesystem is mounted there.

A filesystem may be mounted anywhere in the directory tree; it does not necessarily have to be mounted on the root filesystem. For example, it is possible (and very common) to have filesystem A mounted at a mount point on the root filesystem, and filesystem B mounted at a mount point contained in filesystem A.
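One way to see this hierarchy on a running system is to ask df which filesystem holds a given directory. The following is a minimal sketch, assuming a df that supports the POSIX -P option (one line per filesystem, mount point in the last column). The helper name mount_point_of is made up for this example, and the directories queried are only illustrations.

```shell
# Report the mount point of the filesystem that holds a given path.
# With -P, df prints a header line and then one line per filesystem;
# the last field of that line is the mount point. (A mount point
# containing spaces would need more careful parsing than this.)
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

root_mount=$(mount_point_of /)
echo "/ is on the filesystem mounted at: $root_mount"

# The current directory may be on the root filesystem or on a
# filesystem mounted somewhere below it.
here_mount=$(mount_point_of .)
echo "$(pwd) is on the filesystem mounted at: $here_mount"
```

If the two mount points differ, the current directory lives on a filesystem that has been attached below the root.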

The UNIX File System Internal Structure

The Boot Block

The boot block is usually a part of the disk label, a special set of blocks containing information on the disk layout. The boot block holds the loader to boot the operating system.

The Superblock

Each UNIX partition usually contains a special block called the superblock. The superblock contains the basic information about the entire file system. This includes the size of the file system, the list of free and allocated blocks, the name of the partition, and the modification time of the filesystem.

Inodes

Information about each file in the filesystem is kept in a special kernel structure called an inode. The inode contains pointers to the disk blocks containing the data in the file, as well as other information such as the type of file, the permission bits, the owner and group, the file size, the file modification time, and so on. The inode does not, however, contain the name of the file. The name of each file is listed in the directory the file is associated with. A directory is really just a special type of file containing a list of filenames and their associated inode numbers; when a user attempts to access a given file by name, the name is looked up in the directory, where the corresponding inode number is found.
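The separation between a file's name and its inode can be observed with ls. The sketch below works in a scratch directory created with mktemp (assumed to be available); the file names are arbitrary.

```shell
# Create a scratch file in a temporary directory.
workdir=$(mktemp -d)
echo "hello" > "$workdir/example.txt"

# ls -i prints the inode number in front of the name found in the directory.
inode=$(ls -i "$workdir/example.txt" | awk '{ print $1 }')
echo "inode number: $inode"

# ls -l shows metadata kept in the inode: file type, permission bits,
# link count, owner, group, size, and modification time.
ls -l "$workdir/example.txt"

# Renaming changes only the directory entry; the inode stays the same.
mv "$workdir/example.txt" "$workdir/renamed.txt"
inode_after=$(ls -i "$workdir/renamed.txt" | awk '{ print $1 }')
echo "inode after rename: $inode_after"

# Clean up the scratch directory.
rm -r "$workdir"
```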

The inode structure helps explain the difference between a hard link and a symbolic link. A hard link is just another directory entry corresponding to the inode of the file. Neither link is considered to be the "real" file; both of them are. By adding a hard link to a file, the file has multiple names associated with it. Because every name refers to the same inode, any change made to the file's contents through one name is visible through the others. A file is considered deleted when all of the hard links to it (including the original link to the file) are removed. Renaming one of the links or even the original file will not affect the validity of the other links.
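This behavior can be checked with the ln command. The following is a small sketch in a scratch directory (mktemp is assumed to be available, and the file names a and b are arbitrary):

```shell
workdir=$(mktemp -d)
echo "original data" > "$workdir/a"

# Create a second directory entry (hard link) for the same inode.
ln "$workdir/a" "$workdir/b"

# Both names report the same inode number.
inode_a=$(ls -i "$workdir/a" | awk '{ print $1 }')
inode_b=$(ls -i "$workdir/b" | awk '{ print $1 }')
echo "inode of a: $inode_a, inode of b: $inode_b"

# The second field of ls -l is the link count stored in the inode.
links=$(ls -l "$workdir/a" | awk '{ print $2 }')
echo "link count: $links"

# Data written through one name is visible through the other.
echo "appended via b" >> "$workdir/b"
cat "$workdir/a"

# Removing one name leaves the data reachable through the remaining link.
rm "$workdir/a"
survivor=$(cat "$workdir/b")

# Removing the last link (here, with the directory) deletes the file.
rm -r "$workdir"
```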

By contrast, a symbolic link is actually a special type of file that contains the name of the file it points to. This file has its own inode and is distinct from the original; it contains only a reference to the pathname of the file being linked to. When the kernel accesses a symbolic link, it recognizes that it is a pointer to another file, and attempts to find that file. This is why symbolic links become broken (dangling) if the original file is moved or deleted; they reference only the name of the original file, not the actual data of the file.
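The difference from a hard link shows up as soon as the target is moved. A minimal sketch, again in a scratch directory created with mktemp (assumed available) and with arbitrary file names:

```shell
workdir=$(mktemp -d)
echo "target data" > "$workdir/target"

# ln -s creates a new file whose content is the pathname of the target.
ln -s "$workdir/target" "$workdir/link"

# The link has its own inode, distinct from the target's.
inode_target=$(ls -i "$workdir/target" | awk '{ print $1 }')
inode_link=$(ls -i "$workdir/link" | awk '{ print $1 }')
echo "target inode: $inode_target, link inode: $inode_link"

# ls -l shows the stored pathname after "->".
ls -l "$workdir/link"

# Move the target: the link still holds the old name, which no longer exists.
mv "$workdir/target" "$workdir/moved"

# -L is true for a symbolic link; -e follows the link and fails if the
# file it names cannot be found.
if [ -L "$workdir/link" ] && [ ! -e "$workdir/link" ]; then
    dangling=yes
else
    dangling=no
fi
echo "link is dangling: $dangling"

rm -r "$workdir"
```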

File System Operations

Creating a New Filesystem

A new filesystem is usually created using either the mkfs command or the newfs command; newfs is a more user-friendly front end to mkfs. In most cases, the only argument required is the device node corresponding to the partition on which the filesystem is being created. For example:

	mkfs /dev/rdsk/c0t5d0s0

Mounting and Unmounting a Filesystem

A filesystem is mounted using the mount command, and unmounted using the umount command. The basic usage of the mount command is:

	mount device_node mount_point

The device node should be a block special file.

For example, to mount the disk partition accessed through /dev/sd1a on top of the directory /mnt, the syntax is:

	mount /dev/sd1a /mnt

To unmount a filesystem, issue umount mount_point.

Filesystem Diagnostics and Repair

The df command is used to show the amount of free space available on a filesystem. It can be invoked with a specific mount point or device node as an argument. If called with no arguments, it will show the free space available on all mounted partitions.
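As an illustration, the following sketch reports the free space on the filesystem holding the current directory. It assumes a df that accepts the POSIX options -P (one line per filesystem) and -k (sizes in 1024-byte units); column layouts can differ without -P.

```shell
# Show usage for the filesystem that contains the current directory.
df -Pk .

# With -P the fourth field of the data line is the available space.
avail_kb=$(df -Pk . | awk 'NR == 2 { print $4 }')
echo "free space here: ${avail_kb} KB"
```

Passing a mount point or device node instead of "." reports on that filesystem; omitting the argument lists every mounted filesystem.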

The fsck command is used to check filesystems for integrity and repair them if necessary. The basic usage is:

	fsck device_node

In general, the device node should be a raw disk (character special file), not a block special file. fsck will not work on a block device whose filesystem is currently mounted.

Terms used: filesystem, mount, unmount, boot block, superblock, inode, symbolic link, hard link, device node