
System Management Concepts: Operating System and Devices


Understanding Sparse Files

A file is a sequence of indexed blocks of arbitrary size. The indexing is accomplished through direct mapping or indirect index blocks from the file's inode. Not every index within a file's address range is required to map to an actual data block.

A file that has one or more indexes that are not mapped to a data block is referred to as being sparsely-allocated, or a sparse file. A sparse file has a size associated with it, but it does not have all of the data blocks allocated to fulfill the size requirements. To determine whether a file is sparsely-allocated, use the fileplace command, which indicates all blocks in the file that are not currently allocated.

NOTE: In most circumstances, du can also be used to determine whether the number of data blocks allocated to a file does not match the number required to hold a file of its size. A compressed filesystem might show the same behavior for files that are not sparsely-allocated.
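
The comparison that du makes possible can also be sketched programmatically. The following Python example is an illustration only, not part of the fileplace or du commands. The function names are hypothetical, and the code assumes a POSIX system where st_blocks counts 512-byte units, which is common but not guaranteed on every platform.

```python
import os

# Assumption: st_blocks is reported in 512-byte units on this system.
ASSUMED_BLOCK_UNIT = 512


def allocation_summary(path):
    """Return (logical_size, allocated_bytes) for the file at path."""
    st = os.stat(path)
    allocated = st.st_blocks * ASSUMED_BLOCK_UNIT
    return st.st_size, allocated


def looks_sparse(path):
    """Return True if fewer bytes are allocated than the file size implies.

    As with du, a compressed filesystem can also report fewer allocated
    blocks for a file that is not sparsely-allocated.
    """
    size, allocated = allocation_summary(path)
    return allocated < size
```

A result of True from looks_sparse is only a hint. The block-level detail reported by fileplace remains the more reliable check.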

A sparse file is created when an application extends a file by seeking to a location outside the currently allocated indexes, but the data that is written does not occupy all of the newly assigned indexes. The new file size reflects the farthest write into the file.
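
As a minimal sketch of the seek-then-write pattern described above, the following Python example creates a file whose size reflects the farthest write. The function name and its parameters are hypothetical. Whether the skipped range is actually left unallocated depends on the filesystem.

```python
import os


def create_sparse_file(path, hole_bytes, payload=b"end"):
    """Seek past the start of a new file, write payload, and return the size.

    The range [0, hole_bytes) is never written, so a filesystem that
    supports sparse allocation can leave it without data blocks.
    """
    with open(path, "wb") as f:
        f.seek(hole_bytes)   # move beyond any allocated index
        f.write(payload)     # only this range needs data blocks
    return os.path.getsize(path)
```

For example, create_sparse_file("demo.dat", 1024 * 1024) returns 1048579: one megabyte of unwritten range plus the 3-byte payload.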

A read to a section of a file that has unallocated data blocks results in a default value of null bytes being returned. A write to a section of a file that has unallocated data blocks causes the necessary data blocks to be allocated and the data to be written.
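
The read and write behavior can be observed from an application. The following Python sketch uses hypothetical helper names. It shows only the values returned to the program and does not inspect block allocation itself.

```python
def write_past_end(path, offset, payload):
    """Create path and write payload at offset, leaving a gap before it."""
    with open(path, "wb") as f:
        f.seek(offset)
        f.write(payload)


def read_range(path, offset, length):
    """Read length bytes starting at offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def write_range(path, offset, payload):
    """Write payload at offset in an existing file without truncating it."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(payload)
```

After write_past_end("demo.dat", 65536, b"tail"), a call to read_range("demo.dat", 0, 16) returns 16 null bytes. A later write_range("demo.dat", 8, b"data") fills part of the gap, and the file size is unchanged.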

This behavior can affect file manipulation or archival commands. For example, the following commands do not preserve the sparse allocation of a file:

    cp     mv     tar     cpio

NOTE:  In the case of mv, this only applies to moving a file to another filesystem. If the file is moved within the same filesystem, it will remain sparse.

A file that is copied or restored by the preceding commands has each data block allocated and thus has no sparse characteristics. However, the following archival commands either preserve sparse characteristics or actively make a file sparse:
  backup
  restore
  pax
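
To illustrate what making a file sparse during a copy can involve, the following Python sketch skips blocks that contain only null bytes instead of writing them. It is an assumption-laden illustration, not a description of how backup, restore, or pax are implemented. The function name and the 4096-byte block size are hypothetical choices.

```python
import os


def sparse_copy(src, dst, block_size=4096):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(block_size)
            if not block:
                break
            if any(block):
                fout.write(block)
            else:
                # Block holds only null bytes: advance the offset and
                # leave the range unwritten.
                fout.seek(len(block), os.SEEK_CUR)
        # If the copy ended on a skipped block, set the final size explicitly.
        fout.truncate(fout.tell())
```

The copy reads back identically to the source, because unwritten ranges return null bytes. Whether those ranges stay unallocated depends on the target filesystem.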

Because it is possible to overcommit the resources of a filesystem with sparse files, care should be taken in the use and maintenance of files of this type.

