
General Programming Concepts: Writing and Debugging Programs


Using File Descriptors

A file descriptor is a non-negative integer used by a process to identify an open file. By default, 2000 file descriptors are available to each process. The open, pipe, creat, and fcntl subroutines all generate file descriptors. File descriptors are generally unique to each process, but they can be shared by child processes created with the fork subroutine or copied by the fcntl, dup, and dup2 subroutines.

File descriptors are indexes to the file descriptor table in the u_block area maintained by the kernel for each process. The most common ways for processes to obtain file descriptors are through open or creat operations or through inheritance from a parent process. When a fork operation occurs, the descriptor table is copied for the child process, which allows the child process equal access to the files used by the parent process.

System File and File Descriptor Tables

The system file and file descriptor data structures track each process's access to a file and ensure data integrity.

Structure Activity and Contents
file descriptor table Translates an index number (file descriptor) in the table to an open file. File descriptor tables are created for each process and are located in the u_block area set aside for that process. Each of the entries in a file descriptor table has two fields: the flags area and the file pointer. The structure of the file descriptor table is:

struct ufd
{
        struct file *fp;
        int flags;
} u_ufd[OPEN_MAX];

The close-on-exec (FD_CLOEXEC bit) flag can be set in the file descriptor table using the fcntl subroutine. The dup subroutine copies one file descriptor entry into another position in the same table. The fork subroutine creates an identical copy of the entire file descriptor table for a child process.
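The following is a minimal sketch of reading and setting the close-on-exec flag with the F_GETFD and F_SETFD commands of the fcntl subroutine. The helper names set_cloexec and has_cloexec are hypothetical and exist only for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set the close-on-exec flag on fd.
 * Returns 0 on success, -1 on failure. */
int set_cloexec(int fd)
{
        int flags = fcntl(fd, F_GETFD);         /* read descriptor flags */
        if (flags < 0)
                return -1;
        /* keep any existing flags and add FD_CLOEXEC */
        return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}

/* Hypothetical helper: report whether close-on-exec is set on fd.
 * Returns 1 if set, 0 if clear, -1 on failure. */
int has_cloexec(int fd)
{
        int flags = fcntl(fd, F_GETFD);
        if (flags < 0)
                return -1;
        return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Because the flag is stored in the file descriptor table entry and not in the open file table entry, a copy made with the dup subroutine starts with the flag clear.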

system open file table Contains entries for each open file. Two of the most important pieces of information tracked in a file table entry are the current offset referenced by all read or write operations to the file and the open mode (O_RDONLY, O_WRONLY, or O_RDWR) of the file.

The open file data structure contains the current I/O offset for the file. The system treats each read/write operation as an implied seek to the current offset. Thus if x bytes are read or written, the pointer advances x bytes. The lseek subroutine can be used to reassign the current offset to a specified location in files that are randomly accessible. Stream-type files (such as pipes and sockets) do not use the offset because the data in the file is not randomly accessible.
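The behavior of the offset can be sketched as follows. The helper names and the path argument are assumptions made for this example. The first function writes to a regular file and then queries the offset with lseek. The second shows that a pipe rejects lseek.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to a new file at path and
 * return the current offset afterward, or -1 on failure. */
off_t offset_after_write(const char *path, const char *buf, size_t len)
{
        off_t pos;
        int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
        if (fd < 0)
                return -1;
        if (write(fd, buf, len) != (ssize_t)len) {
                close(fd);
                return -1;
        }
        pos = lseek(fd, 0, SEEK_CUR);   /* query the offset without moving it */
        close(fd);
        return pos;
}

/* Hypothetical helper: return 1 if a pipe accepts lseek, 0 if it
 * does not, -1 if the pipe could not be created. */
int pipe_is_seekable(void)
{
        int p[2];
        int seekable;
        if (pipe(p) < 0)
                return -1;
        seekable = (lseek(p[0], 0, SEEK_CUR) != (off_t)-1);
        close(p[0]);
        close(p[1]);
        return seekable;
}
```

After writing x bytes to a new file, offset_after_write returns x. On a pipe, lseek fails (with the ESPIPE error on POSIX systems), so pipe_is_seekable returns 0.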

Managing File Descriptors

Because files can be shared by many users, it is necessary to allow related processes to share a common offset pointer and have a separate current offset pointer for independent processes that access the same file. The open file table entry maintains a reference count to track the number of file descriptors assigned to the file.

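Multiple references to a single file can be caused by any of the following:

  • A separate process opening the file.
  • Child processes retaining the file descriptors assigned to the parent process.
  • The fcntl, dup, or dup2 subroutine creating copies of the file descriptors.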

Sharing Open Files

Each open operation creates a system table entry. Individual table entries give each process its own current I/O offset. Independent offsets protect the integrity of the data.

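When a file descriptor is duplicated, the copies share the same offset, so two processes that use them can interleave their I/O. Interleaving means that the bytes one process reads or writes are not sequential, because the other process has moved the shared offset in between.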
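The difference between independent and shared offsets can be sketched in a single process. The helper name second_offset_after_write and its parameters are hypothetical. It writes 3 bytes through one descriptor and reports the offset seen through a second descriptor, obtained either by a second open or by dup.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write 3 bytes through a first descriptor and
 * return the offset seen through a second one.
 * use_dup != 0: the second descriptor is a dup of the first.
 * use_dup == 0: the second descriptor comes from a separate open.
 * Returns -1 on failure. */
off_t second_offset_after_write(const char *path, int use_dup)
{
        int fd1, fd2;
        off_t pos = -1;

        fd1 = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
        if (fd1 < 0)
                return -1;
        fd2 = use_dup ? dup(fd1) : open(path, O_WRONLY);
        if (fd2 >= 0) {
                if (write(fd1, "abc", 3) == 3)
                        pos = lseek(fd2, 0, SEEK_CUR);
                close(fd2);
        }
        close(fd1);
        return pos;
}
```

With a duplicated descriptor the function returns 3, because both descriptors refer to one open file table entry. With a separate open it returns 0, because the second descriptor has its own entry and its own offset.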

Duplicating File Descriptors

There are three ways file descriptors can be duplicated between processes: the dup or dup2 subroutine, the fork subroutine, and the fcntl (file descriptor control) subroutine.

The dup and dup2 Subroutines


dup Creates a copy of a file descriptor

The duplicate is created in an empty slot of the user file descriptor table that contains the original descriptor. A dup operation increments the reference count in the file table entry by 1 and returns the index number of the file descriptor where the copy was placed.

dup2 Scans for the requested descriptor assignment and closes the requested file descriptor if it is open

The dup2 subroutine allows the process to designate which descriptor entry the copy will occupy, if a specific descriptor-table entry is required.
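As an illustration, the following sketch opens a file and then uses dup2 to place it at a descriptor number chosen by the caller. The helper name open_at_descriptor, the path, and the target number are assumptions for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open the file at path for writing and make it
 * available as descriptor number target.
 * Returns target on success, -1 on failure. */
int open_at_descriptor(const char *path, int target)
{
        int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
        if (fd < 0)
                return -1;
        if (fd == target)               /* already in the requested slot */
                return fd;
        if (dup2(fd, target) < 0) {     /* closes target first if it is open */
                close(fd);
                return -1;
        }
        close(fd);                      /* the original slot is no longer needed */
        return target;
}
```

A call such as open_at_descriptor("logfile", 20) leaves the file open on descriptor 20, whichever descriptor the open subroutine initially returned.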

The fork Subroutine


fork Creates a child process that inherits the file descriptors assigned to the parent process. The child process typically then execs a new program. At that point, inherited descriptors that had the close-on-exec flag set by the fcntl subroutine are closed.
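Inheritance across fork can be sketched as follows. The helper name offset_after_child_write and the path argument are hypothetical. The parent opens a file and forks, the child writes 5 bytes through the inherited descriptor, and the parent then queries the offset through its own copy of the descriptor.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the parent's file offset after a child
 * process has written 5 bytes through the inherited descriptor.
 * Returns -1 on failure. */
off_t offset_after_child_write(const char *path)
{
        pid_t pid;
        int status;
        off_t pos;
        int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);

        if (fd < 0)
                return -1;
        pid = fork();
        if (pid < 0) {
                close(fd);
                return -1;
        }
        if (pid == 0)                   /* child: fd was inherited */
                _exit(write(fd, "child", 5) == 5 ? 0 : 1);

        /* parent: wait for the child, then look at the offset */
        if (waitpid(pid, &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
                close(fd);
                return -1;
        }
        pos = lseek(fd, 0, SEEK_CUR);
        close(fd);
        return pos;
}
```

The function returns 5 even though the parent wrote nothing, because the parent and child descriptors refer to the same open file table entry.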

The fcntl (File Descriptor Control) Subroutine


fcntl Manipulates file structure and controls open file descriptors.

The fcntl subroutine can be used to make the following changes to a descriptor:

  • Duplicate a file descriptor (identical to the dup subroutine).
  • Get or set the close-on-exec flag.
  • Set nonblocking mode for the descriptor.
  • Append future writes to the end of the file (O_APPEND).
  • Enable the generation of a signal to the process when it is possible to do I/O.
  • Set or get the process ID or the group process ID for SIGIO handling.
  • Close all file descriptors.
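Two of the operations listed above can be sketched with the standard F_GETFL, F_SETFL, and F_DUPFD commands. The helper names add_status_flags and dup_at_least are hypothetical. This is an illustration under those assumptions, not a complete wrapper for fcntl.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add file status flags such as O_NONBLOCK or
 * O_APPEND to fd. Returns 0 on success, -1 on failure. */
int add_status_flags(int fd, int extra)
{
        int flags = fcntl(fd, F_GETFL);         /* current status flags */
        if (flags < 0)
                return -1;
        return fcntl(fd, F_SETFL, flags | extra) < 0 ? -1 : 0;
}

/* Hypothetical helper: duplicate fd onto the lowest free descriptor
 * number that is greater than or equal to min_fd.
 * Returns the new descriptor, or -1 on failure. */
int dup_at_least(int fd, int min_fd)
{
        return fcntl(fd, F_DUPFD, min_fd);
}
```

Calling dup_at_least(fd, 0) behaves like the dup subroutine. Calling add_status_flags(fd, O_NONBLOCK) switches the descriptor to nonblocking mode without disturbing its other status flags.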

Preset File Descriptor Values

When the shell runs a program, it opens three files with file descriptors 0, 1, and 2. The default assignments for these descriptors are:

0 Represents standard input.
1 Represents standard output.
2 Represents standard error.

These default file descriptors are connected to the terminal, so that if a program reads file descriptor 0 and writes file descriptors 1 and 2, the program collects input from the terminal and sends output to the terminal. As the program uses other files, file descriptors are assigned in ascending order.

If I/O is redirected using the < (less than) or > (greater than) symbols, the shell's default file descriptor assignments are changed. For instance:

prog < FileX > FileY

changes the default assignments for file descriptors 0 and 1 from the terminal to the appropriate files. In this example, file descriptor 0 now refers to FileX and file descriptor 1 refers to FileY. File descriptor 2 has not been changed. The program does not need to know where its input comes from or where its output is sent, as long as file descriptor 0 represents the input file and 1 and 2 represent output files.

The following sample program illustrates the redirection of standard output:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>                    /* exit */
#include <unistd.h>                    /* close, dup */
 
void redirect_stdout(char *);
 
int
main(void)
{
       printf("Hello world\n");       /* this printf goes to
                                       * standard output */
       fflush(stdout);
       redirect_stdout("foo");        /* redirect standard output */
       printf("Hello to you too, foo\n");
                                      /* this printf goes to file foo */
       fflush(stdout);
       return 0;
}

void
redirect_stdout(char *filename)
{
        int fd;
        if ((fd = open(filename,O_CREAT|O_WRONLY,0666)) < 0)
                                        /* open a new file */
        {
                perror(filename);
                exit(1);
        }
        close(1);                       /* close old
                                         * standard output */
        if (dup(fd) != 1)               /* dup new fd to
                                         * standard output */
        {
                fprintf(stderr,"Unexpected dup failure\n");
                exit(1);
        }
        close(fd);                      /* close original new fd,
                                         * no longer needed */
}

The value for file descriptor 2 can also be reassigned, but this is rarely done.

Within the file descriptor table, each request for a descriptor is assigned the lowest descriptor number available at the time of the request. However, any value can be assigned within the file descriptor table by using the dup2 subroutine.
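The lowest-available rule can be observed with a short sketch. The helper name lowest_number_is_reused is hypothetical, and the example assumes a single-threaded process in which /dev/null can be opened.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open two descriptors, close the first, and open
 * again. Returns 1 if the new descriptor reused the freed number,
 * 0 if it did not, -1 on failure. */
int lowest_number_is_reused(void)
{
        int a, b, c, reused;

        a = open("/dev/null", O_RDONLY);
        if (a < 0)
                return -1;
        b = open("/dev/null", O_RDONLY);        /* b receives a higher number than a */
        if (b < 0) {
                close(a);
                return -1;
        }
        close(a);                               /* a becomes the lowest free number */
        c = open("/dev/null", O_RDONLY);
        if (c < 0) {
                close(b);
                return -1;
        }
        reused = (c == a);
        close(b);
        close(c);
        return reused;
}
```

The same rule is what makes the close(1) followed by dup(fd) sequence in the redirection sample place the copy on descriptor 1.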

File Descriptor Resource Limit

The number of file descriptors that can be allocated to a process is governed by a resource limit. The default value is set in the /etc/security/limits file and is typically 2000 (for compatibility with earlier releases). The limit can be changed by the ulimit command or the setrlimit subroutine. The maximum size is defined by the constant OPEN_MAX.
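A process can inspect and adjust its own limit with the getrlimit and setrlimit subroutines and the RLIMIT_NOFILE resource. The following is a minimal sketch. The helper names get_descriptor_limit and set_descriptor_soft_limit are hypothetical.

```c
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft and hard limits on the number
 * of open file descriptors. Returns 0 on success, -1 on failure. */
int get_descriptor_limit(rlim_t *soft, rlim_t *hard)
{
        struct rlimit rl;
        if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
                return -1;
        *soft = rl.rlim_cur;
        *hard = rl.rlim_max;
        return 0;
}

/* Hypothetical helper: change the soft limit, leaving the hard limit
 * as it is. Returns 0 on success, -1 on failure. */
int set_descriptor_soft_limit(rlim_t new_soft)
{
        struct rlimit rl;
        if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
                return -1;
        if (new_soft > rl.rlim_max)     /* soft limit cannot exceed hard limit */
                return -1;
        rl.rlim_cur = new_soft;
        return setrlimit(RLIMIT_NOFILE, &rl) != 0 ? -1 : 0;
}
```

The soft limit is the value enforced when a descriptor is allocated. An unprivileged process can raise it only as far as the hard limit.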

Related Information

Chapter 5, File Systems and Directories

Working with JFS i-nodes

File Creation and Removal

Special Files Overview

fcntl, dup, or dup2 subroutine, lseek subroutine, open, openx, or creat subroutine

