Are locks unnecessary in multi-threaded Python code because of the GIL?

If you are relying on an implementation of Python that has a Global Interpreter Lock (i.e. CPython) and writing multithreaded code, do you really need locks at all?

If the GIL doesn't allow multiple instructions to be executed in parallel, wouldn't shared data be unnecessary to protect?

sorry if this is a dumb question, but it is something I have always wondered about Python on multi-processor/core machines.

same thing would apply to any other language implementation that has a GIL.

10659 次浏览

You still need to use locks (your code could be interrupted at any time to execute another thread and this can cause data inconsistencies). The problem with GIL is that it prevents Python code from using more cores at the same time (or multiple processors if they are available).

No - the GIL just protects python internals from multiple threads altering their state. This is a very low-level of locking, sufficient only to keep python's own structures in a consistent state. It doesn't cover the application level locking you'll need to do to cover thread safety in your own code.

The essence of locking is to ensure that a particular block of code is only executed by one thread. The GIL enforces this for blocks the size of a single bytecode, but usually you want the lock to span a larger block of code than this.

This post describes the GIL at a fairly high-level:

Of particular interest are these quotes:

Every ten instructions (this default can be changed), the core releases the GIL for the current thread. At that point, the OS chooses a thread from all the threads competing for the lock (possibly choosing the same thread that just released the GIL – you don't have any control over which thread gets chosen); that thread acquires the GIL and then runs for another ten bytecodes.

and

Note carefully that the GIL only restricts pure Python code. Extensions (external Python libraries usually written in C) can be written that release the lock, which then allows the Python interpreter to run separately from the extension until the extension reacquires the lock.

It sounds like the GIL just provides fewer possible instances for a context switch, and makes multi-core/processor systems behave as a single core, with respect to each python interpreter instance, so yes, you still need to use synchronization mechanisms.

The Global Interpreter Lock prevents threads from accessing the interpreter simultaneously (thus CPython only ever uses one core). However, as I understand it, the threads are still interrupted and scheduled preemptively, which means you still need locks on shared data structures, lest your threads stomp on each other's toes.

The answer I've encountered time and time again is that multithreading in Python is rarely worth the overhead, because of this. I've heard good things about the PyProcessing project, which makes running multiple processes as "simple" as multithreading, with shared data structures, queues, etc. (PyProcessing will be introduced into the standard library of the upcoming Python 2.6 as the multiprocessing module.) This gets you around the GIL, as each process has its own interpreter.

You will still need locks if you share state between threads. The GIL only protects the interpreter internally. You can still have inconsistent updates in your own code.

For example:

#!/usr/bin/env python
import threading


shared_balance = 0


class Deposit(threading.Thread):
def run(self):
for _ in xrange(1000000):
global shared_balance
balance = shared_balance
balance += 100
shared_balance = balance


class Withdraw(threading.Thread):
def run(self):
for _ in xrange(1000000):
global shared_balance
balance = shared_balance
balance -= 100
shared_balance = balance


threads = [Deposit(), Withdraw()]


for thread in threads:
thread.start()


for thread in threads:
thread.join()


print shared_balance

Here, your code can be interrupted between reading the shared state (balance = shared_balance) and writing the changed result back (shared_balance = balance), causing a lost update. The result is a random value for the shared state.

To make the updates consistent, run methods would need to lock the shared state around the read-modify-write sections (inside the loops) or have some way to detect when the shared state had changed since it was read.

Adding to the discussion:

Because the GIL exists, some operations are atomic in Python and do not need a lock.

http://www.python.org/doc/faq/library/#what-kinds-of-global-value-mutation-are-thread-safe

As stated by the other answers, however, you still need to use locks whenever the application logic requires them (such as in a Producer/Consumer problem).

A little bit of update from Will Harris's example:

class Withdraw(threading.Thread):
def run(self):
for _ in xrange(1000000):
global shared_balance
if shared_balance >= 100:
balance = shared_balance
balance -= 100
shared_balance = balance

Put a value check statement in the withdraw and I don't see negative anymore and updates seems consistent. My question is:

If GIL prevents only one thread can be executed at any atomic time, then where would be the stale value? If no stale value, why we need lock? (Assuming we only talk about pure python code)

If I understand correctly, the above condition check wouldn't work in a real threading environment. When more than one threads are executing concurrently, stale value can be created hence the inconsistency of the share state, then you really need a lock. But if python really only allows just one thread at any time (time slicing threading), then there shouldn't be possible for stale value to exist, right?

Think of it this way:

On a single processor computer, multithreading happens by suspending one thread and starting another fast enough to make it appear to be running at the same time. This is like Python with the GIL: only one thread is ever actually running.

The problem is that the thread can be suspended anywhere, for example, if I want to compute b = (a + b) * 3, this might produce instructions something like this:

1    a += b
2    a *= 3
3    b = a

Now, lets say that is running in a thread and that thread is suspended after either line 1 or 2 and then another thread kicks in and runs:

b = 5

Then when the other thread resumes, b is overwritten by the old computed values, which is probably not what was expected.

So you can see that even though they're not ACTUALLY running at the same time, you still need locking.

Locks are still needed. I will try explaining why they are needed.

Any operation/instruction is executed in the interpreter. GIL ensures that interpreter is held by a single thread at a particular instant of time. And your program with multiple threads works in a single interpreter. At any particular instant of time, this interpreter is held by a single thread. It means that only thread which is holding the interpreter is running at any instant of time.

Suppose there are two threads,say t1 and t2, and both want to execute two instructions which is reading the value of a global variable and incrementing it.

#increment value
global var
read_var = var
var = read_var + 1

As put above, GIL only ensures that two threads can't execute an instruction simultaneously, which means both threads can't execute read_var = var at any particular instant of time. But they can execute instruction one after another and you can still have problem. Consider this situation:

  • Suppose read_var is 0.
  • GIL is held by thread t1.
  • t1 executes read_var = var. So, read_var in t1 is 0. GIL will only ensure that this read operation will not be executed for any other thread at this instant.
  • GIL is given to thread t2.
  • t2 executes read_var = var. But read_var is still 0. So, read_var in t2 is 0.
  • GIL is given to t1.
  • t1 executes var = read_var+1 and var becomes 1.
  • GIL is given to t2.
  • t2 thinks read_var=0, because that's what it read.
  • t2 executes var = read_var+1 and var becomes 1.
  • Our expectation was that var should become 2.
  • So, a lock must be used to keep both reading and incrementing as an atomic operation.
  • Will Harris' answer explains it through a code example.