What causes a Python segmentation fault?

I am implementing Kosaraju's strongly connected components (SCC) graph search algorithm in Python.

The program works fine on small data sets, but when I run it on a really large graph (more than 800,000 nodes), it dies with a "segmentation fault".

What could be causing this? Thanks!


Additional info: first, when I ran it on the very large data set, I got this error:

"RuntimeError: maximum recursion depth exceeded in cmp"

Then I used

sys.setrecursionlimit(50000)

but then got a "segmentation fault".

Believe me, it is not an infinite loop; it runs correctly on relatively smaller data. Is it possible that the program has exhausted some resource?


This happens when a Python extension (written in C) tries to access memory beyond its reach.

You can trace it in the following ways.

  • Add a sys.settrace hook at the very first line of the code (a minimal sketch of such a hook follows the gdb example below).
  • Use gdb as described by Mark in this answer. At the command prompt:

    gdb python
    (gdb) run /path/to/script.py
    ## wait for segfault ##
    (gdb) backtrace
    ## stack trace of the c code
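
For the first suggestion, here is a minimal sketch of what such a trace hook could look like (the name trace_lines is just illustrative); the last line printed before the crash points at the code that triggered the segfault:

    import sys

    def trace_lines(frame, event, arg):
        # Print every executed line so the crash site is the last line shown.
        if event == "line":
            print(f"{frame.f_code.co_filename}:{frame.f_lineno}", flush=True)
        return trace_lines

    sys.settrace(trace_lines)

    # ... rest of the script runs with tracing enabled ...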
    

I understand you've solved your issue, but for others reading this thread, here is the answer: you have to increase the stack size that your operating system allocates for the Python process.

The way to do it is operating-system dependent. On Linux, you can check your current value with the command ulimit -s and increase it with ulimit -s <new_value>.

Try doubling the previous value, and keep doubling until it either works or you run out of memory.

A segmentation fault is a generic error; there are many possible reasons for it:

  • Low memory
  • Faulty RAM
  • Fetching a huge data set from the DB with a query (if the size of the fetched data exceeds swap memory)
  • Wrong query / buggy code
  • Deep recursion or very long loops (a small reproduction sketch follows this list)
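
On the last point, a small, hedged sketch of how this bites: raising the recursion limit far beyond what the C stack can hold turns the usual RecursionError into a segfault (the call is commented out on purpose):

    import sys

    def recurse(n):
        # Each frame consumes C stack; once the OS stack limit is exceeded,
        # the interpreter dies with a segfault instead of a RecursionError.
        return 1 + recurse(n + 1)

    sys.setrecursionlimit(1_000_000)
    # recurse(0)  # uncomment to reproduce the crash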

Updating the ulimit fixed the segfault for my Kosaraju's SCC implementation, for both the Python (a Python segfault... who knew!) and the C++ versions.

On my Mac, I found the possible maximum via:

$ ulimit -s -H
65532

A Google search brought me to this thread, and I did not see the following "personal solution" discussed.


My recent annoyance with Python 3.7 on Windows Subsystem for Linux was this: on two machines with the same Pandas library, one gave me a segmentation fault and the other only reported a warning. It was not clear which one was newer, but "re-installing" pandas solved the problem.

The command I ran on the buggy machine:

conda install pandas

More details: I was running identical scripts (synced through Git) on two Windows 10 machines with WSL + Anaconda. Also, on the machine where command-line Python complained about Segmentation fault (core dumped), JupyterLab simply restarted the kernel every single time. Worse still, no warning was given at all.



Update a few months later: I quit hosting Jupyter servers on the Windows machine. I now use WSL on Windows to forward remote ports opened on a Linux server and run all my jobs on the remote Linux machine. I have not hit a single execution error in a good number of months :)

I was experiencing this segmentation fault after upgrading dlib on a Raspberry Pi. I traced back the stack as suggested by Shiplu Mokaddim above, and it settled on the OpenBLAS library.

Since OpenBLAS is itself multi-threaded, using it in a multi-threaded application quickly multiplies the number of threads until you hit a segmentation fault. For multi-threaded applications, set OpenBLAS to single-threaded mode.

In a Python virtual environment, tell OpenBLAS to use only a single thread by editing:

    $ workon <myenv>
    $ nano .virtualenv/<myenv>/bin/postactivate

and add:

    export OPENBLAS_NUM_THREADS=1
    export OPENBLAS_MAIN_FREE=1

After a reboot I was able to run all my image-recognition apps on the RPi 3B that were previously crashing it.
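
If you would rather not edit postactivate, the same environment variable can be set from the script itself, as long as that happens before numpy (or whatever else pulls in OpenBLAS) is imported; a rough sketch:

    import os

    # Must be set before importing numpy/scipy/dlib or anything else that
    # loads OpenBLAS, otherwise its thread pool has already been created.
    os.environ["OPENBLAS_NUM_THREADS"] = "1"

    import numpy as np  # BLAS calls now stay single-threaded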

reference: https://github.com/ageitgey/face_recognition/issues/294

Looks like you are running out of stack memory. You may want to increase it, as Davide stated. To do it in Python code, you need to run your main() in a thread with a larger stack size:

import sys
import threading

def main():
    pass  # write your code here

sys.setrecursionlimit(2097152)    # adjust numbers
threading.stack_size(134217728)   # for your needs

main_thread = threading.Thread(target=main)
main_thread.start()
main_thread.join()

Source: c1729's post on Codeforces. Running it with PyPy is a bit trickier.

I'd run into the same error. I learned from another SO answer that you need to raise the limits through both the sys and resource modules.
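
Concretely, the idea is to lift the OS stack limit (the cause of the segfault) together with Python's recursion limit (the cause of the RecursionError). A hedged sketch, Unix-only, with the recursion limit as a placeholder to tune for your graph:

    import resource
    import sys

    # Lift the soft stack limit as far as the hard limit allows, then raise
    # Python's own recursion ceiling; both limits must move for deep recursion.
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    sys.setrecursionlimit(10**6)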