AIX Version 4.3 General Programming Concepts: Writing and Debugging Programs

File Space Allocation

File space allocation is the method by which data is apportioned physical storage space in the operating system. The kernel allocates disk space to a file or directory in the form of logical blocks. A logical block refers to the division of a file or directory's contents into 4096-byte units. Logical blocks are not tangible entities; however, the data in a logical block consumes physical storage space on the disk. Each file or directory consists of 0 or more logical blocks. Fragments, instead of logical blocks, are the basic units for allocated disk space in the journaled file system (JFS).

Full and Partial Logical Blocks

A file or directory may contain full or partial logical blocks. A full logical block contains 4096 bytes of data. Partial logical blocks occur when the last logical block of a file or directory contains less than 4096 bytes of data.

For example, a file of 8192 bytes is two logical blocks. The first 4096 bytes reside in the first logical block and the following 4096 bytes reside in the second logical block. Likewise, a file of 4608 bytes consists of two logical blocks. However, the last logical block is a partial logical block containing the last 512 bytes of the file's data. Only the last logical block of a file can be a partial logical block.
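
The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the full/partial block rule; the function name logical_block_layout is hypothetical and is not part of any AIX interface.

```python
LOGICAL_BLOCK_SIZE = 4096  # bytes per logical block, as stated in the text


def logical_block_layout(file_size):
    """Return (number_of_logical_blocks, bytes_in_last_block) for a file size."""
    if file_size < 0:
        raise ValueError("file size must be non-negative")
    if file_size == 0:
        return 0, 0  # an empty file has no logical blocks
    full_blocks, remainder = divmod(file_size, LOGICAL_BLOCK_SIZE)
    if remainder == 0:
        # Every block, including the last one, is a full logical block.
        return full_blocks, LOGICAL_BLOCK_SIZE
    # The trailing bytes form one partial logical block.
    return full_blocks + 1, remainder


if __name__ == "__main__":
    for size in (0, 8192, 4608):
        print(size, logical_block_layout(size))
```

For the two sizes used in the text, the sketch reports two full blocks for 8192 bytes, and two blocks with a 512-byte partial last block for 4608 bytes.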

The default fragment size is 4096 bytes. You can specify smaller fragment sizes with the mkfs command during a file system's creation. Allowable fragment sizes are: 512, 1024, 2048, and 4096 bytes. You can use only one fragment size in a file system. See "File System Layout" for more information on the file system structure.

Allocation in Fragmented File Systems

To maintain efficiency in file system operations, the JFS allocates 4096 bytes of fragment space to files and directories that are 32KB or larger. A fragment that covers 4096 bytes of disk space is allocated to a full logical block. When data is added to a file or directory, the kernel allocates disk fragments to store the logical blocks. Thus, if the file system's fragment size is 512 bytes, a full logical block is the allocation of 8 fragments.

The kernel allocates disk space so that only the last bytes of data receive a partial block allocation. As the partial block grows beyond the limits of its current allocation, additional fragments are allocated. If the partial block increases to 4096 bytes, the data stored in its fragments is reallocated into 4096-byte allocations. A partial logical block that contains less than 4096 bytes of data is allocated the number of fragments that best matches its storage requirements.
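
The allocation rules described above can be modeled roughly as follows. This is a simplified sketch, not the kernel's algorithm: it assumes that "best matches" means the smallest whole number of fragments that holds the partial block, and the function name fragments_per_block is hypothetical.

```python
LOGICAL_BLOCK_SIZE = 4096
FULL_ALLOCATION_THRESHOLD = 32 * 1024  # files of 32KB or more get 4096-byte allocations
ALLOWED_FRAGMENT_SIZES = (512, 1024, 2048, 4096)


def fragments_per_block(file_size, frag_size):
    """Return a list giving the number of fragments allocated to each logical block."""
    if frag_size not in ALLOWED_FRAGMENT_SIZES:
        raise ValueError("unsupported fragment size")
    frags_in_full_block = LOGICAL_BLOCK_SIZE // frag_size
    full_blocks, remainder = divmod(file_size, LOGICAL_BLOCK_SIZE)
    allocation = [frags_in_full_block] * full_blocks
    if remainder:
        if file_size >= FULL_ALLOCATION_THRESHOLD:
            # Large enough that even the last block receives a full allocation.
            allocation.append(frags_in_full_block)
        else:
            # Assumption: round the partial block up to whole fragments.
            allocation.append(-(-remainder // frag_size))
    return allocation


if __name__ == "__main__":
    print(fragments_per_block(4608, 512))   # one full block plus a small partial block
    print(fragments_per_block(4608, 4096))  # same file with the default fragment size
```

With 512-byte fragments, a full logical block maps to 8 fragments, which matches the figure given in the text.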

Fragment reallocation also occurs if data is added to logical blocks that represent file holes. A file hole is an "empty" logical block located prior to the last logical block that stores data. (File holes do not occur within directories.) These empty logical blocks are not allocated fragments. However, as data is added to file holes, allocation occurs. Each logical block that was not previously allocated disk space is allocated 4096 bytes of fragment space.
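
A toy model can make the notion of a file hole concrete. The class below is a hypothetical illustration that only tracks which logical blocks have been written; it does not model fragments or touch a real file system.

```python
LOGICAL_BLOCK_SIZE = 4096


class SparseFileModel:
    """Tracks which logical blocks of an imaginary file have been allocated."""

    def __init__(self):
        self.allocated = set()  # indexes of logical blocks that hold data
        self.size = 0           # file size in bytes

    def write(self, offset, length):
        """Record a write of `length` bytes at byte `offset`."""
        if length <= 0:
            return
        first = offset // LOGICAL_BLOCK_SIZE
        last = (offset + length - 1) // LOGICAL_BLOCK_SIZE
        self.allocated.update(range(first, last + 1))
        self.size = max(self.size, offset + length)

    def holes(self):
        """Return the unallocated logical blocks located before the last block."""
        if self.size == 0:
            return []
        last_block = (self.size - 1) // LOGICAL_BLOCK_SIZE
        return [b for b in range(last_block) if b not in self.allocated]


if __name__ == "__main__":
    f = SparseFileModel()
    f.write(0, 100)                       # data in block 0
    f.write(5 * LOGICAL_BLOCK_SIZE, 10)   # data in block 5; blocks 1-4 become holes
    print(f.holes())
    f.write(2 * LOGICAL_BLOCK_SIZE, 4096) # filling block 2 allocates it
    print(f.holes())
```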

Additional fragment allocation is not required if existing data in the middle of a file or directory is overwritten. The logical block containing the existing data has already been allocated fragments.

JFS tries to maintain contiguous allocation of a file or directory's logical blocks on the disk. Maintaining contiguous allocation lessens seek time because the data for a file or directory can be accessed sequentially and found on the same area of the disk. However, disk fragments for one logical block are not always contiguous to the disk fragments for another logical block. The disk space required for contiguous allocation may not be available if it has already been written to by another file or directory. An allocation for a single logical block does, however, always contain contiguous fragments.

The file system uses a bitmap called the fragment allocation map to record the status of every fragment in the file system. When the file system needs to allocate a new fragment, it refers to the fragment allocation map to identify which fragments are available. A fragment can only be allocated to a single file or directory at a time.
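
The idea of a fragment allocation map can be sketched as a simple bitmap. The class below is hypothetical and uses a first-fit search for a contiguous run of free fragments; the text does not say which search strategy JFS uses, so treat that choice as an assumption.

```python
class FragmentAllocationMap:
    """Toy bitmap with one bit per fragment: 1 = allocated, 0 = free."""

    def __init__(self, total_fragments):
        self.total = total_fragments
        self.bits = 0  # a Python int serves as an arbitrary-length bitmap

    def is_free(self, frag):
        return ((self.bits >> frag) & 1) == 0

    def allocate(self, count):
        """Allocate `count` contiguous fragments (first fit).

        Returns the starting fragment number, or None if no run is available.
        """
        run = 0
        for frag in range(self.total):
            run = run + 1 if self.is_free(frag) else 0
            if run == count:
                start = frag - count + 1
                for f in range(start, frag + 1):
                    self.bits |= 1 << f  # mark each fragment as allocated
                return start
        return None

    def free(self, start, count):
        """Mark `count` fragments beginning at `start` as free again."""
        for f in range(start, start + count):
            self.bits &= ~(1 << f)


if __name__ == "__main__":
    fmap = FragmentAllocationMap(16)
    print(fmap.allocate(8))  # a full logical block with 512-byte fragments
    print(fmap.allocate(3))  # a partial logical block
```

Because each bit has exactly one state, a fragment marked as allocated cannot be handed to a second file or directory until it is freed.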

Allocation in Compressed File Systems

In a file system that supports data compression, directories are allocated disk space. Data compression also applies to regular files and symbolic links whose size is larger than that of their i-nodes.

The allocation of disk space for compressed file systems is the same as that of fragments in fragmented file systems. A logical block is allocated 4096 bytes when it is modified. This allocation guarantees that there will be a place to store the logical block if the data does not compress. The system requires that a write operation, or a store operation into a memory-mapped file, report an out-of-disk-space condition at a logical block's initial modification. After modification is complete, the logical block is compressed before it is written to disk. The compressed logical block is then allocated only the number of fragments required for its storage.

A logical block is no longer considered modified after it is written to a disk. Each time a logical block is modified, a full disk block is allocated again, according to the system requirements. Reallocation of the initial full block occurs when the logical block of compressed data is successfully written to a disk.

Allocation in File Systems Enabled for Large Files

Beginning in Version 4.2, in a file system enabled for large files, the JFS allocates two sizes of fragments for regular files. A "large" fragment (32 X 4096) is allocated for logical blocks after the 4 MB boundary, and a 4096-byte fragment is allocated for logical blocks before the 4 MB boundary. All nonregular files allocate 4096-byte fragments. This geometry allows a maximum file size of slightly less than 64 gigabytes (68589453312).

A "large" fragment is made up of 32 contiguous 4096-byte fragments. Because of this requirement, it is recommended that file systems enabled for large files have predominantly large files in them. Storing many small files (files smaller than 4 MB) can cause free-space fragmentation problems. This can cause large allocations to fail with ENOSPC because the file system does not contain 32 contiguous disk addresses.
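
The two allocation sizes, and the way free-space fragmentation can defeat a large allocation, can be illustrated as follows. Both function names are hypothetical, and the sketch assumes that "after the 4 MB boundary" means byte offsets of 4 MB and above.

```python
SMALL_FRAGMENT = 4096
LARGE_FRAGMENT = 32 * 4096           # 131072 bytes
LARGE_BOUNDARY = 4 * 1024 * 1024     # 4 MB
FRAGMENTS_PER_LARGE = 32


def allocation_size_for_offset(offset, regular_file=True):
    """Return the allocation unit used for data at the given byte offset."""
    if regular_file and offset >= LARGE_BOUNDARY:
        return LARGE_FRAGMENT
    return SMALL_FRAGMENT  # nonregular files always use 4096-byte fragments


def has_room_for_large_fragment(free_map):
    """free_map is a sequence of booleans, True meaning a free 4096-byte fragment.

    Returns True only if 32 contiguous free fragments exist.
    """
    run = 0
    for is_free in free_map:
        run = run + 1 if is_free else 0
        if run == FRAGMENTS_PER_LARGE:
            return True
    return False


if __name__ == "__main__":
    # 62 free fragments in total, but split by one allocated fragment.
    fragmented = [True] * 31 + [False] + [True] * 31
    print(has_room_for_large_fragment(fragmented))
```

In the fragmented map above there is plenty of free space overall, yet no run of 32 contiguous fragments, which is the situation in which a large allocation would fail with ENOSPC.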

Disk Address Format

Fragmented and compressed file systems use the same method for representing disk addresses. In a fragmented file system, only the last logical block of a file (not larger than 32KB) can be allocated less than 4096 bytes. The logical block becomes a partial logical block. In a compressed file system, every logical block can be allocated less than a full block.

JFS fragment support requires fragment-level addressability. As a result, disk addresses have a special format for mapping where the fragments of a logical block reside on the disk. Disk addresses are contained in the i_rdaddr field of the i-nodes or in the indirect blocks. All fragments referenced in a single address must be contiguous on the disk.

The disk address format consists of two fields, the nfrags and addr fields. These fields describe the area of disk covered by the address. The addr field indicates which fragment on the disk is the starting fragment. The nfrags field indicates the number of contiguous fragments of the logical block that are not used by the address. For example, if the fragment size for the file system is 512 bytes, the logical block is divided into eight fragments; an nfrags value of 3 then indicates that five fragments are included in the address.

The following examples illustrate possible values for the addr and nfrags fields for different disk addresses. These values assume a fragment size of 512 bytes, indicating that the logical block is divided into eight fragments.

Address for a single fragment:

addr:   143
nfrags: 7

This address indicates that the starting location of the data is fragment 143 on the disk. The nfrags value indicates that the total number of fragments included in the address is one. The nfrags value changes in a file system that has a fragment size other than 512 bytes. To correctly read the nfrags value, the system, or any user examining the address, must know the fragment size of the file system.

Address for five fragments:

addr:   1117
nfrags: 3

In this case, the address starts at fragment number 1117 on the disk and continues for five fragments (including the starting fragment). There are three fragments remaining, as illustrated by the nfrags value.

The disk addresses are 32 bits in size. The bits are numbered from 0 to 31. The 0 bit is always reserved. Bits 1 through 3 contain the nfrags field. Bits 4 through 31 contain the addr field.
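
The bit layout can be sketched with a pair of pack and unpack helpers. The sketch assumes that bit 0 is the most significant bit of the 32-bit word, so the addr field occupies the low-order 28 bits; the text does not state the bit-numbering direction, and the function names are hypothetical.

```python
ADDR_BITS = 28
ADDR_MASK = (1 << ADDR_BITS) - 1   # bits 4 through 31
NFRAGS_MASK = 0x7                  # bits 1 through 3
LOGICAL_BLOCK_SIZE = 4096


def pack_disk_address(addr, nfrags):
    """Build a 32-bit disk address; bit 0 (assumed most significant) stays 0."""
    if not 0 <= addr <= ADDR_MASK:
        raise ValueError("addr does not fit in 28 bits")
    if not 0 <= nfrags <= NFRAGS_MASK:
        raise ValueError("nfrags does not fit in 3 bits")
    return (nfrags << ADDR_BITS) | addr


def unpack_disk_address(word):
    """Return (addr, nfrags) from a 32-bit disk address."""
    return word & ADDR_MASK, (word >> ADDR_BITS) & NFRAGS_MASK


def fragments_covered(nfrags, frag_size=512):
    """Number of fragments included in the address for a given fragment size."""
    return LOGICAL_BLOCK_SIZE // frag_size - nfrags


if __name__ == "__main__":
    word = pack_disk_address(1117, 3)
    addr, nfrags = unpack_disk_address(word)
    print(hex(word), addr, nfrags, fragments_covered(nfrags))
```

Applied to the two example addresses, an nfrags value of 7 yields one covered fragment and an nfrags value of 3 yields five, given 512-byte fragments.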

The Disk Address Format figure illustrates the mapping of a file's data to the fragments contained on the disk.

Indirect Blocks

The JFS uses the indirect blocks to address the disk space allocated to larger files. Indirect blocks allow the greatest flexibility for file sizes and the fastest retrieval time. The indirect block is assigned using the i_rindirect field of the disk i-node. This field allows for three geometries or methods for addressing the disk space:

Direct
Single indirect
Double indirect

Each of these methods uses the same disk address format as compressed and fragmented file systems. Because files larger than 32KB are allocated fragments of 4096 bytes, the nfrags field for addresses using the single indirect or double indirect method has a value of 0.

When the direct method of disk addressing is used, each of the eight addresses listed in the i_rdaddr field of the disk i-node points directly to a single allocation of disk fragments. The maximum size of a file using direct geometry is 32,768 bytes (32KB), or 8 x 4096 bytes. When the file requires more than 32KB, an indirect block is used to address the file's disk space. The Geometry One figure illustrates the direct disk address method.

The i_rindirect field contains an address that points to either a single indirect block or a double indirect block. When the single indirect disk addressing method is used, the i_rindirect field contains the address of an indirect block containing 1024 addresses. These addresses point to the disk fragments for each allocation. Using the single indirect block geometry, the file can be up to 4,194,304 bytes (4MB), or 1024 x 4096 bytes. The Geometry Two figure illustrates the indirect disk address method.

The double indirect addressing method uses the i_rindirect field to point to a double indirect block. The double indirect block contains 512 addresses that point to indirect blocks, which contain pointers to the fragment allocations. The largest file size that can be used with the double indirect geometry in a file system not enabled for large files is 2,147,483,648 bytes (2GB), or 512 x (1024 x 4096) bytes. The Geometry Three figure illustrates the double indirect disk address method.

Note: The maximum file size that the read and write system calls allow is 2GB minus 1 (2^31 - 1) bytes. When the memory map interface is used, 2GB can be addressed.
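
The capacity of each geometry follows directly from the address counts given above. The sketch below recomputes those limits and picks the smallest geometry that can hold a file of a given size; the function name geometry_for_size is hypothetical, and the sketch applies only to file systems not enabled for large files.

```python
ALLOCATION_SIZE = 4096        # bytes addressed by one full disk address
DIRECT_ADDRESSES = 8          # addresses in the i_rdaddr field
ADDRESSES_PER_INDIRECT = 1024 # addresses in one indirect block
INDIRECTS_PER_DOUBLE = 512    # addresses in one double indirect block

DIRECT_MAX = DIRECT_ADDRESSES * ALLOCATION_SIZE        # 32KB
SINGLE_MAX = ADDRESSES_PER_INDIRECT * ALLOCATION_SIZE  # 4MB
DOUBLE_MAX = INDIRECTS_PER_DOUBLE * SINGLE_MAX         # 2GB


def geometry_for_size(file_size):
    """Return the name of the smallest geometry able to address file_size bytes."""
    if file_size < 0:
        raise ValueError("file size must be non-negative")
    if file_size <= DIRECT_MAX:
        return "direct"
    if file_size <= SINGLE_MAX:
        return "single indirect"
    if file_size <= DOUBLE_MAX:
        return "double indirect"
    raise ValueError("file size exceeds the double indirect geometry")


if __name__ == "__main__":
    for size in (1000, 32769, 5 * 1024 * 1024):
        print(size, geometry_for_size(size))
```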

Beginning in Version 4.2, file systems enabled for large files allow a maximum file size of slightly less than 64 gigabytes (68589453312). The first single indirect block contains 4096-byte fragments, and all subsequent single indirect blocks contain (32 x 4096)-byte fragments. The following produces the maximum file size for file systems enabled for large files:

(1 * (1024 * 4096)) + (511 * (1024 * 131072))
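
The expression above can be evaluated term by term to confirm the stated maximum. The function name in this short sketch is hypothetical.

```python
ADDRESSES_PER_INDIRECT = 1024
SMALL_FRAGMENT = 4096
LARGE_FRAGMENT = 32 * 4096    # 131072 bytes
INDIRECT_BLOCKS = 512         # single indirect blocks reachable from the double indirect block


def max_large_file_size():
    """Evaluate the maximum file size for a file system enabled for large files."""
    # The first single indirect block addresses 4096-byte fragments.
    first = 1 * (ADDRESSES_PER_INDIRECT * SMALL_FRAGMENT)
    # The remaining 511 single indirect blocks address large fragments.
    rest = (INDIRECT_BLOCKS - 1) * (ADDRESSES_PER_INDIRECT * LARGE_FRAGMENT)
    return first + rest


if __name__ == "__main__":
    print(max_large_file_size())
```

The result is 68589453312 bytes, which falls short of a full 64 gigabytes (68719476736 bytes) because the first single indirect block covers only 4 MB.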

The fragment allocation assigned to a directory is divided into records of 512 bytes each and grows in accordance with the allocation of these records.

Quotas


Disk quotas restrict the amount of file system space any single user or group can monopolize. The quotactl subroutine sets limits on both the number of files and the number of disk blocks allocated to each user or group on a file system. Quotas enforce two kinds of limits:

hard Maximum limit allowed. When a process hits its hard limit, requests for more space fail.
soft Practical limit. If a process hits the soft limit, a warning is printed to the user's terminal. The warning is often displayed at login. If the user fails to correct the problem after several login sessions, the soft limit can become a hard limit.

System warnings are designed to encourage users to heed the soft limit. However, the quota system allows processes access to the higher hard limit when more resources are temporarily required.
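
The difference between the two limits can be modeled with a small decision function. This is a hypothetical sketch of the policy only: it does not call the quotactl subroutine, and it leaves out the case in which a soft limit that is ignored for too long becomes a hard limit.

```python
def check_quota(current_usage, requested, soft_limit, hard_limit):
    """Classify an allocation request against soft and hard limits.

    Returns "deny" if the request would exceed the hard limit,
    "warn" if it would exceed only the soft limit, and "ok" otherwise.
    Usage and limits share one unit, for example disk blocks or file counts.
    """
    if soft_limit > hard_limit:
        raise ValueError("soft limit must not exceed hard limit")
    new_usage = current_usage + requested
    if new_usage > hard_limit:
        return "deny"   # requests for more space fail
    if new_usage > soft_limit:
        return "warn"   # allowed, but the user is warned
    return "ok"


if __name__ == "__main__":
    print(check_quota(900, 50, soft_limit=1000, hard_limit=1200))
    print(check_quota(900, 150, soft_limit=1000, hard_limit=1200))
    print(check_quota(900, 400, soft_limit=1000, hard_limit=1200))
```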

Related Information

Working with JFS i-nodes describes the internal representation of files and lists the contents of disk i-nodes and in-core (main memory) i-nodes.

File Creation and Removal introduces the internal mechanisms of file creation and removal.

Using File Descriptors explains the creation and use of file descriptors.

File System Layout

The quotactl subroutine.
