Bug report
Bug description:
Reported by @pablogsal / @godlygeek from memray
Stack trace:
https://gist.github.com/pablogsal/513fa8b0c29cda852ce11c86ce3b1345
We have two threads, the main thread (M) and a daemon thread (D). The main thread starts _Py_Finalize() and performs a global stop-the-world. The daemon thread is disabling profiling and so tries to perform a stop-the-world specific to its interpreter:
cpython/Python/pystate.c, lines 2256 to 2267 in 9745976 (excerpt):

    static void
    stop_the_world(struct _stoptheworld_state *stw)
    {
        _PyRuntimeState *runtime = &_PyRuntime;

        PyMutex_Lock(&stw->mutex);
        if (stw->is_global) {
            _PyRWMutex_Lock(&runtime->stoptheworld_mutex);
        }
        else {
            _PyRWMutex_RLock(&runtime->stoptheworld_mutex);
        }

M: _PyEval_StopTheWorldAll():
M: acquires runtime->stoptheworld->mutex
M: acquires RW lock runtime->stoptheworld_mutex in W (exclusive) mode
M: ... waits on threads
D: _PyEval_StopTheWorld(interp):
D: acquires interp->stoptheworld->mutex
D: ... blocks trying to acquire runtime->stoptheworld_mutex in R mode. Later, the daemon thread will hang in _PyThreadState_HangThread() when trying to re-attach its thread state.
M: _PyEval_StopTheWorldAll() finishes, marks the interpreter as finalizing
M: ...
M: calls _PyGC_CollectNoFail() which tries to run _PyEval_StopTheWorld(interp)
M: ... blocks trying to acquire interp->stoptheworld->mutex, which is still held by the daemon thread!
Deadlock!
Summary:
The daemon thread holds interp->stoptheworld->mutex and is hanging because the interpreter is shutting down.
The main thread is trying to perform the shutdown procedure, including calling the GC a few times, which requires interp->stoptheworld->mutex.
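
The interleaving above can be replayed with a small single-threaded model. This is only a sketch for reasoning about lock ownership, not CPython code: `enum owner`, `struct locks`, `try_acquire()` and `replay_report()` are hypothetical names, the locks are plain ownership tags rather than real mutexes, and runtime->stoptheworld->mutex is left out because it plays no part in the final wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tags for the two threads in the report. */
enum owner { NOBODY, MAIN_THREAD, DAEMON_THREAD };

/* Toy stand-ins for the locks named in the report (not CPython's types). */
struct locks {
    enum owner interp_stw_mutex;   /* interp->stoptheworld->mutex */
    enum owner runtime_rw_writer;  /* runtime->stoptheworld_mutex held in W mode */
};

/* Take a lock if it is free; return false where a real thread would block. */
static bool try_acquire(enum owner *lock, enum owner who)
{
    if (*lock != NOBODY) {
        return false;
    }
    *lock = who;
    return true;
}

/* Replay the reported interleaving. Returns the thread that still holds
   interp->stoptheworld->mutex when the main thread asks for it, or NOBODY
   if the main thread could proceed. */
static enum owner replay_report(void)
{
    struct locks s = { NOBODY, NOBODY };

    /* M: _PyEval_StopTheWorldAll() takes the RW lock in W mode. */
    bool ok = try_acquire(&s.runtime_rw_writer, MAIN_THREAD);
    assert(ok);

    /* D: _PyEval_StopTheWorld(interp) takes the per-interpreter mutex... */
    ok = try_acquire(&s.interp_stw_mutex, DAEMON_THREAD);
    assert(ok);
    /* ...and then cannot get R mode while a writer holds the RW lock. */
    bool daemon_blocked = (s.runtime_rw_writer != NOBODY);
    assert(daemon_blocked);

    /* M: marks the interpreter as finalizing. From here on D hangs and
       never reaches the code that would release interp_stw_mutex. */

    /* M: _PyGC_CollectNoFail() calls _PyEval_StopTheWorld(interp). */
    if (try_acquire(&s.interp_stw_mutex, MAIN_THREAD)) {
        return NOBODY;
    }
    return s.interp_stw_mutex;
}
```

In this model `replay_report()` returns `DAEMON_THREAD`: the main thread is waiting on a mutex whose owner will never run again.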
Fix???
- Release the previously acquired interp->stoptheworld->mutex when hanging the thread if necessary? This crosses a bunch of abstraction barriers, which is messy and tricky.
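
A rough sketch of what that option might look like, under the assumption that each thread records which stop-the-world mutex it currently holds so that the hang path can drop it. All names here (`toy_mutex`, `toy_thread`, `held_stw_mutex`, `toy_hang_thread()`) are hypothetical and single-threaded stand-ins; nothing below is an existing CPython API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy mutex: a flag is enough for a single-threaded walk-through. */
struct toy_mutex {
    bool locked;
};

/* Hypothetical per-thread record of the stop-the-world mutex it holds. */
struct toy_thread {
    struct toy_mutex *held_stw_mutex;
    bool hung;
};

/* Lock the mutex and remember it on the thread; false means "would block". */
static bool toy_try_lock(struct toy_thread *t, struct toy_mutex *m)
{
    if (m->locked) {
        return false;
    }
    m->locked = true;
    t->held_stw_mutex = m;
    return true;
}

/* Stand-in for the hang path: drop the mutex first, then park forever. */
static void toy_hang_thread(struct toy_thread *t)
{
    if (t->held_stw_mutex != NULL) {
        t->held_stw_mutex->locked = false;
        t->held_stw_mutex = NULL;
    }
    t->hung = true;  /* the real thread would never return from here */
}

/* Returns true if the main thread can take the mutex after the daemon hangs. */
static bool main_can_collect_after_hang(void)
{
    struct toy_mutex interp_stw_mutex = { false };
    struct toy_thread daemon = { NULL, false };
    struct toy_thread main_thread = { NULL, false };

    bool ok = toy_try_lock(&daemon, &interp_stw_mutex);
    assert(ok);
    toy_hang_thread(&daemon);

    return toy_try_lock(&main_thread, &interp_stw_mutex);
}
```

The sketch also shows where the messiness comes from: the hang routine has to know about a lock that belongs to the stop-the-world machinery, which is the abstraction crossing mentioned above.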
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response
Linked PRs