General Programming Concepts:
Writing and Debugging Programs

Thread-Specific Data

Many applications require that certain data be maintained on a per-thread basis across function calls. For example, a multi-threaded grep command using one thread for each file needs to have thread-specific file handlers and list of found strings. The thread-specific data interface is provided by the threads library to meet these needs.

Thread-specific data may be viewed as a two-dimensional array of values, with keys serving as the row index and thread IDs as the column index. A thread-specific data key is an opaque object, of type pthread_key_t. The same key can be used by all threads in a process. Although all threads use the same key, they set and access different thread-specific data values associated with that key. Thread-specific data are void pointers. This allows referencing any kind of data, such as dynamically allocated strings or structures.

In the following figure, thread T2 has a thread-specific data value of 12 associated with the key K3. Another thread T4 has the value 2 associated with the same key.

Table 1. Thread-Specific Data Array
		Threads
		T1	T2	T3	T4
Keys	K1	6	56	4	1
	K2	87	21	0	9
	K3	23	12	61	2
	K4	11	76	47	88

Creating and Destroying Keys

Thread-specific data keys must be created before being used. Their values can be automatically destroyed when the corresponding threads terminate. A key can also be destroyed upon request to reclaim its storage.

Key Creation

A thread-specific data key is created by calling the pthread_key_create subroutine. This subroutine returns a key. The thread-specific data is set to a value of NULL for all threads, including threads not yet created.

For example, consider two threads A and B. Thread A performs the following operations in chronological order:

Create a thread-specific data key K.
Threads A and B can use the key K. The value for both threads is NULL.
Create a thread C.
Thread C can also use the key K. The value for thread C is NULL.

The number of thread-specific data keys is limited to 450 per process. This number can be retrieved by the PTHREAD_KEYS_MAX symbolic constant.

The pthread_key_create subroutine must be called only once. Otherwise, two different keys are created. For example, consider the following code fragment:

/* a global variable */
static pthread_key_t theKey;
 
/* thread A */
...
pthread_key_create(&theKey, NULL);   /* call 1 */
...
 
/* thread B */
...
pthread_key_create(&theKey, NULL);   /* call 2 */
...

Threads A and B run concurrently, but call 1 happens before call 2. Call 1 will create a key K1 and store it in the theKey variable. Call 2 will create another key K2, and store it also in the theKey variable, thus overriding K1. As a result, thread A will use K2, assuming it is K1. This situation should be avoided for two reasons:

Key K1 is lost, thus its storage will never be reclaimed until the process terminates. Because the number of keys is limited, you may run out of keys.
If thread A stores a thread-specific data using the theKey variable before call 2, the data will be bound to key K1. After call 2, the theKey variable contains K2; if thread A then tries to fetch its thread-specific data, it would always get NULL.

Ensuring the uniqueness of key creation can be done in two ways:

Using the one-time initialization facility. See One-Time Initializations.
Creating the key before the threads that will use it. This is often possible, for example, when using a pool of threads with thread-specific data to perform similar operations. This pool of threads is usually created by one thread, the initial (or another "driver") thread.

It is the programmer's responsibility to ensure the uniqueness of key creation. The threads library provides no way to check if a key has been created more than once.

Destructor Routine

A destructor routine may be associated with each thread-specific data key. Whenever a thread is terminated, if there is non-NULL, thread-specific data for this thread bound to any key, the destructor routine associated with that key is called. This allows dynamically allocated thread-specific data to be automatically freed when the thread is terminated. The destructor routine has one parameter, the value of the thread-specific data.

For example, a thread-specific data key may be used for dynamically allocated buffers. A destructor routine should be provided to ensure that the buffer is freed when the thread terminates, the free subroutine can be used:

pthread_key_create(&key, free);

More complex destructors may be used. If a multi-threaded grep command, using a thread per file to scan, has thread-specific data to store a structure containing a work buffer and the thread's file descriptor, the destructor routine may be:

typedef struct {
        FILE *stream;
        char *buffer;
} data_t;
...

void destructor(void *data)
{
        fclose(((data_t *)data)->stream);
        free(((data_t *)data)->buffer);
        free(data);
        *data = NULL;
}

Destructor calls can be repeated up to 4 times.

Key Destruction

A thread-specific data key can be destroyed by calling the pthread_key_delete subroutine. This subroutine frees the key only if no thread-specific data is bound to it. Data is said to be bound to the key when at least one value is not NULL. The pthread_key_delete subroutine does not actually call the destructor routine for each thread having data. To destroy a thread-specific data key, the programmer must ensure that no thread-specific data is bound to the key.

Once a data key is destroyed, it can be reused by another call to the pthread_key_create subroutine. Thus, the pthread_key_delete is useful especially when using many data keys. For example, in the following code fragment the loop would never end:

/* bad example - do not write such code! */
pthread_key_t key;
 
while (pthread_key_create(&key, NULL))
        pthread_key_delete(key);

Using Thread-Specific Data

Thread-specific data is accessed using the pthread_getspecific and pthread_setspecific subroutines. The pthread_getspecific subroutine reads the value bound to the specified key and is specific to the calling thread; the pthread_setspecific subroutine sets the value.

Setting Successive Values

The value should be a pointer. The pointer may point to any kind of data. Thread-specific data is typically used for dynamically allocated storage, as in the following code fragment:

private_data = malloc(...);
pthread_setspecific(key, private_data);

When setting a value, the previous value is lost. For example, in the following code fragment, the value of the old pointer is lost, and the storage it pointed to may not be recoverable:

pthread_setspecific(key, old);
...
pthread_setspecific(key, new);

It is the programmer's responsibility to retrieve the old thread-specific data value to reclaim storage before setting the new value. For example, it is possible to implement a swap_specific routine in the following manner:

int swap_specific(pthread_key_t key, void **old_pt, void *new)
{
        *old_pt = pthread_getspecific(key);
        if (*old_pt == NULL)
                return -1;
        else
                return pthread_setspecific(key, new);
}

Such a routine does not exist in the threads library because it is not always necessary to retrieve the previous value of thread-specific data. Such a case occurs, for example, when thread-specific data are pointers to specific locations in a memory pool allocated by the initial thread.

Taking Care about Destructor Routines

When using dynamically allocated thread-specific data, the programmer must provide a destructor routine when calling the pthread_key_create subroutine. The programmer must also ensure that, when freeing the storage allocated for thread-specific data, the pointer is set to NULL. Otherwise, the destructor routine might be called with an illegal parameter. For example:

pthread_key_create(&key, free);
...

...
private_data = malloc(...);
pthread_setspecific(key, private_data);
...

/* bad example! */
...
pthread_getspecific(key, &data);
free(data);
...

When the thread terminates, the destructor routine is called for its thread-specific data. Because the value is a pointer to already freed memory, an error can occur. To correct this, the following code fragment should be substituted:

/* better example! */
...
pthread_getspecific(key, &data);
free(data);
pthread_setspecific(key, NULL);
...

When the thread terminates, the destructor routine is not called, because there is no thread-specific data.

Using Non-Pointer Values

It is possible to store values that are not pointers, such as integers. It is not recommended to do this for at least two reasons:

Casting a pointer into a scalar type may not be portable
The NULL pointer value is implementation-dependent; several systems assign the NULL pointer a non-zero value.

If you are sure that your program will never be ported to another system, you may use integer values for thread-specific data.