Thread-safe vs. reentrant

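Recently, I asked a question titled "Is malloc thread-safe?", and inside it I asked: "Is malloc re-entrant?"

I was under the impression that all re-entrant functions are thread-safe.

Is this assumption wrong?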


Re-entrant functions do not rely on global or hidden static state in the library. Compare strtok() with strtok_r() in C: strtok() keeps its position in internal static storage, whereas strtok_r() takes a caller-supplied saveptr argument.

Some functions need a place to store work in progress. Re-entrant functions let you pass a pointer to that storage, so it can live in the calling thread's own storage rather than in a global. Because this storage belongs exclusively to the caller, the function can be interrupted and re-entered, and because in most cases no mutual exclusion beyond what the function itself implements is needed for this to work, such functions are often considered thread-safe. That, however, is not guaranteed by definition.
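To make the caller-owned storage idea concrete, here is a minimal sketch using the POSIX strtok_r(). The function name count_fields and the record/field format are made up for illustration; only strtok_r() itself is a real API. Each loop keeps its own saveptr, so the inner tokenizer cannot disturb the outer one, which is exactly what plain strtok() cannot offer.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* asks the headers to declare strtok_r() */
#endif
#include <string.h>

/* Hypothetical helper: counts comma-separated fields across
   semicolon-separated records. Like strtok_r() itself, it modifies
   `text` in place, so `text` must be writable. */
int count_fields(char *text)
{
    int fields = 0;
    char *outer_state = NULL;   /* work in progress for the record loop */
    char *inner_state = NULL;   /* work in progress for the field loop */

    for (char *record = strtok_r(text, ";", &outer_state);
         record != NULL;
         record = strtok_r(NULL, ";", &outer_state)) {
        for (char *field = strtok_r(record, ",", &inner_state);
             field != NULL;
             field = strtok_r(NULL, ",", &inner_state)) {
            fields++;
        }
    }
    return fields;
}
```

Called on a writable copy of "a,b;c,d,e;f" this returns 6. Writing the same nested loops with strtok() would end the outer loop early, because the inner calls overwrite the single hidden position.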

errno, however, is a slightly different case on POSIX systems, where it looks like a global but is required to be per-thread (it tends to be the oddball in any explanation of how this all works) :)

In short, reentrant often means thread-safe (as in "use the reentrant version of that function if you're using threads"), but thread-safe does not always mean reentrant, nor the reverse. When you're looking at thread safety, concurrency is what you need to think about. If you have to provide your own locking and mutual exclusion to use a function, then the function isn't inherently thread-safe.

But not every function needs to be examined for both properties. In ordinary use malloc() has no need to be reentrant: POSIX requires it to be thread-safe, and normal code never interrupts one malloc() call with another. It is not async-signal-safe, though, so it should not be called from a signal handler.

Functions that return pointers to statically allocated storage are not thread-safe without a mutex, futex, or other locking mechanism. Yet they don't need to be reentrant if they are never going to be interrupted.

For example:

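/* FOO_BAR and BAR_FOO are flag constants assumed to be defined elsewhere. */
static char *foo(unsigned int flags)
{
    /* One buffer shared by every caller: this is what makes it unsafe. */
    static char ret[2] = { 0 };

    if (flags & FOO_BAR)
        ret[0] = 'c';
    else if (flags & BAR_FOO)
        ret[0] = 'd';
    else
        ret[0] = 'e';

    ret[1] = 'A';

    return ret;
}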

So, as you can see, having multiple threads call that without some kind of locking would be a disaster, yet there is no reason for it to be re-entrant. You'll run into this pattern when dynamically allocated memory is taboo, as on some embedded platforms.
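For comparison, here is a rough sketch of how the same function might look with a caller-supplied buffer, which needs neither locking nor dynamic allocation. Everything here is an assumption for illustration: the name foo_r, the flag values, and the choice to NUL-terminate the result (the original stores 'A' in the second byte).

```c
#include <stddef.h>

/* Hypothetical flag values; the original snippet does not define them. */
#define FOO_BAR 0x1u
#define BAR_FOO 0x2u

/* Writes the result into `buf`, which the caller owns.
   Returns `buf`, or NULL if the buffer is missing or too small.
   Unlike the original, the second byte is a NUL terminator so the
   result is a valid C string (an assumption of this sketch). */
char *foo_r(unsigned int flags, char *buf, size_t len)
{
    if (buf == NULL || len < 2)
        return NULL;

    if (flags & FOO_BAR)
        buf[0] = 'c';
    else if (flags & BAR_FOO)
        buf[0] = 'd';
    else
        buf[0] = 'e';

    buf[1] = '\0';
    return buf;
}
```

A caller would typically pass a small array on its own stack, for example `char out[2]; foo_r(flags, out, sizeof out);`. Two threads using separate buffers never touch the same memory, so no mutex is needed.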

In purely functional programming, reentrant often doesn't imply thread-safe; it depends on the behavior of the named or anonymous functions passed to the function's entry point, on recursion, and so on.

A better way to put 'thread safe' is 'safe for concurrent access', which better illustrates the need.

It depends on the definition. For example, Qt uses the following:

  • A thread-safe function can be called simultaneously from multiple threads, even when the invocations use shared data, because all references to the shared data are serialized.

  • A reentrant function can also be called simultaneously from multiple threads, but only if each invocation uses its own data.

Hence, a thread-safe function is always reentrant, but a reentrant function is not always thread-safe.

By extension, a class is said to be reentrant if its member functions can be called safely from multiple threads, as long as each thread uses a different instance of the class. The class is thread-safe if its member functions can be called safely from multiple threads, even if all the threads use the same instance of the class.

But they also caution:

Note: Terminology in the multithreading domain isn't entirely standardized. POSIX uses definitions of reentrant and thread-safe that are somewhat different for its C APIs. When using other object-oriented C++ class libraries with Qt, be sure the definitions are understood.
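As a rough illustration of Qt's two definitions, here is a sketch in C with POSIX mutexes. The names shared_increment and own_increment are invented for this example. The first function touches shared data but serializes every access; the second is safe only because each caller brings its own data.

```c
#include <pthread.h>

/* Shared data, with every reference serialized by one mutex:
   "thread-safe" in the sense of the first bullet. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

long shared_increment(void)
{
    pthread_mutex_lock(&counter_lock);
    long value = ++shared_counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}

/* No shared data at all: "reentrant" in the sense of the second bullet.
   Safe from many threads only if each passes its own counter. */
long own_increment(long *counter)
{
    return ++*counter;
}
```

This also shows why the caution about terminology matters: under the interrupt-oriented definition of reentrancy, shared_increment() would not qualify, since a signal handler that calls it while the interrupted code holds the lock would deadlock.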

TL;DR: A function can be reentrant, thread-safe, both or neither.

The Wikipedia articles on thread safety and reentrancy are well worth reading. Here are a few excerpts:

A function is thread-safe if:

it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time.

A function is reentrant if:

it can be interrupted at any point during its execution and then safely called again ("re-entered") before its previous invocations complete execution.

As an example of possible reentrance, Wikipedia describes a function designed to be called by system interrupts: suppose it is already running when another interrupt occurs. But don't think you're safe just because you don't write interrupt handlers: you can have reentrance problems in a single-threaded program if you use callbacks or recursive functions.
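The single-threaded callback case can be sketched as follows. All names here (sum_global, sum_local, the callbacks and the demo functions) are hypothetical. One summing function keeps its running total in a global, the other in a local, and the callback re-enters the function on the same thread.

```c
#include <stddef.h>

static int total;   /* scratch state shared by every invocation */

/* Not reentrant: a nested call resets and overwrites `total`. */
int sum_global(const int *v, size_t n, void (*visit)(int))
{
    total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
        if (visit)
            visit(v[i]);
    }
    return total;
}

/* Reentrant: the running sum lives in this invocation's stack frame. */
int sum_local(const int *v, size_t n, void (*visit)(int))
{
    int acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += v[i];
        if (visit)
            visit(v[i]);
    }
    return acc;
}

static const int inner[] = { 100 };

/* Callbacks that re-enter the summing function; no threads involved. */
void reenter_global(int x) { (void)x; (void)sum_global(inner, 1, NULL); }
void reenter_local(int x)  { (void)x; (void)sum_local(inner, 1, NULL); }

int demo_global(void)
{
    static const int v[] = { 1, 2, 3 };
    return sum_global(v, 3, reenter_global);   /* 100, not 6 */
}

int demo_local(void)
{
    static const int v[] = { 1, 2, 3 };
    return sum_local(v, 3, reenter_local);     /* 6 */
}
```

demo_global() returns 100 because the last nested call leaves its own result in the shared variable, while demo_local() returns the expected 6.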

The key to avoiding confusion is that reentrancy concerns a single thread of execution. The concept dates from a time before multitasking operating systems existed.

Examples

(Slightly modified from the Wikipedia articles)

Example 1: not thread-safe, not reentrant

/* As this function uses a non-const global variable without
   any precaution, it is neither reentrant nor thread-safe. */

int t;

void swap(int *x, int *y)
{
    t = *x;
    *x = *y;
    *y = t;
}

Example 2: thread-safe, not reentrant

/* We use a thread-local variable: the function is now
   thread-safe but still not reentrant (within the
   same thread). __thread is a GCC/Clang extension;
   C11 spells it _Thread_local. */

__thread int t;

void swap(int *x, int *y)
{
    t = *x;
    *x = *y;
    *y = t;
}

Example 3: not thread-safe, reentrant

/* We save the global state in a local variable and we restore
   it at the end of the function.  The function is now reentrant
   but it is not thread-safe. */

int t;

void swap(int *x, int *y)
{
    int s;
    s = t;
    t = *x;
    *x = *y;
    *y = t;
    t = s;
}
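To see why saving and restoring the global makes a difference, here is a small simulation, assuming a hypothetical one-shot hook (isr_hook) that stands in for an interrupt arriving in the middle of the swap. The hook, the helper fire() and the checker outer_ok() are inventions of this sketch; the two swap bodies follow Examples 1 and 3.

```c
#include <stddef.h>

static int t;                    /* the shared global from Examples 1 and 3 */
static void (*isr_hook)(void);   /* hypothetical stand-in for an interrupt */

/* Runs the hook at most once, so the nested swap is not interrupted again. */
static void fire(void)
{
    void (*hook)(void) = isr_hook;
    if (hook) {
        isr_hook = NULL;
        hook();
    }
}

/* Example 1 with an "interrupt" between the second and third statements. */
void swap_plain(int *x, int *y)
{
    t = *x;
    *x = *y;
    fire();
    *y = t;
}

/* Example 3 with the same "interrupt" point. */
void swap_saved(int *x, int *y)
{
    int s = t;
    t = *x;
    *x = *y;
    fire();
    *y = t;
    t = s;
}

static int p = 10, q = 20;
void isr_plain(void) { swap_plain(&p, &q); }
void isr_saved(void) { swap_saved(&p, &q); }

/* Returns 1 if swapping (1, 2) still yields (2, 1) despite the interrupt. */
int outer_ok(void (*swap)(int *, int *), void (*isr)(void))
{
    int a = 1, b = 2;
    isr_hook = isr;
    swap(&a, &b);
    isr_hook = NULL;
    return a == 2 && b == 1;
}
```

outer_ok(swap_plain, isr_plain) returns 0, because the nested call leaves its own value in t. outer_ok(swap_saved, isr_saved) returns 1, because the nested call restores t before returning. Neither version is protected against a second thread modifying t at the same time.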

Example 4: thread-safe, reentrant

/* We use a local variable: the function is now
   thread-safe and reentrant, we have ascended to
   a higher plane of existence.  */

void swap(int *x, int *y)
{
    int t;
    t = *x;
    *x = *y;
    *y = t;
}