How to track down a "double free or corruption" error

When I run my (C++) program, it crashes with this error:

*** glibc detected *** ./load: double free or corruption (!prev): 0x000000000c6ed50 ***

How can I track down the error?

I tried using print (std::cout) statements, without success. Could gdb make this easier?


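You can use gdb, but I would first try Valgrind.

In short, Valgrind instruments your program so it can detect several kinds of errors in the use of dynamically allocated memory, such as double frees and writes past the end of allocated blocks (which can corrupt the heap). It detects and reports the errors as soon as they occur, which points you directly at the cause of the problem.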

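There are at least two possible situations:

  1. you are deleting the same entity twice
  2. you are deleting something that wasn't allocated

For the first one, I strongly suggest setting every deleted pointer to NULL.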

You have three options:

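  1. Overload new and delete and track the allocations.
  2. Yes, use gdb; you will then get a backtrace from your crash, which will probably be very helpful.
  3. As suggested, use Valgrind. It isn't easy to get into, but it will save you time a thousandfold in the future.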

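Three basic rules:

  1. Set the pointer to NULL after freeing it
  2. Check for NULL before freeing
  3. Initialise the pointer to NULL at the start

The combination of these three works quite well.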

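If you're using glibc, you can set the MALLOC_CHECK_ environment variable to 2. This makes glibc use an error-tolerant version of malloc, which causes your program to abort at the point where the double free happens.

You can set this from gdb with the set environment MALLOC_CHECK_ 2 command before running your program; the program should abort, with the free() call visible in the backtrace.

See the man page for malloc() for more information.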

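Are you using smart pointers such as Boost shared_ptr? If so, check whether you are using the raw pointer directly anywhere by calling get(). I've found this to be quite a common problem.

For example, imagine a scenario where a raw pointer is passed (perhaps as a callback handler) to your code. You might decide to assign it to a smart pointer in order to deal with reference counting and so on. Big mistake: your code doesn't own this pointer unless you take a deep copy. When your code is done with the smart pointer, the smart pointer is destroyed and tries to destroy the memory it points to, since it thinks no one else needs it. The calling code will then try to delete it as well, and you get a double free.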

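Of course, that might not be your problem here. At its simplest, here's an example that shows how it can happen. The first delete is fine, but on the second delete the runtime (glibc's allocator, not the compiler) detects that the memory has already been freed and aborts. That's why assigning 0 to a pointer immediately after deleting it is a good idea.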

int main(int argc, char* argv[])
{
    char* ptr = new char[20];

    delete[] ptr;
    ptr = 0;  // Comment me out and watch me crash and burn.
    delete[] ptr;
}

Edit: changed delete to delete[], since ptr is an array of char.

You can use valgrind to debug it.

#include<stdio.h>
#include<stdlib.h>

int main()
{
    char *x = malloc(100);
    free(x);
    free(x);
    return 0;
}


[sand@PS-CNTOS-64-S11 testbox]$ vim t1.c
[sand@PS-CNTOS-64-S11 testbox]$ cc -g t1.c -o t1
[sand@PS-CNTOS-64-S11 testbox]$ ./t1
*** glibc detected *** ./t1: double free or corruption (top): 0x00000000058f7010 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3a3127245f]
/lib64/libc.so.6(cfree+0x4b)[0x3a312728bb]
./t1[0x400500]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3a3121d994]
./t1[0x400429]
======= Memory map: ========
00400000-00401000 r-xp 00000000 68:02 30246184                           /home/sand/testbox/t1
00600000-00601000 rw-p 00000000 68:02 30246184                           /home/sand/testbox/t1
058f7000-05918000 rw-p 058f7000 00:00 0                                  [heap]
3a30e00000-3a30e1c000 r-xp 00000000 68:03 5308733                        /lib64/ld-2.5.so
3a3101b000-3a3101c000 r--p 0001b000 68:03 5308733                        /lib64/ld-2.5.so
3a3101c000-3a3101d000 rw-p 0001c000 68:03 5308733                        /lib64/ld-2.5.so
3a31200000-3a3134e000 r-xp 00000000 68:03 5310248                        /lib64/libc-2.5.so
3a3134e000-3a3154e000 ---p 0014e000 68:03 5310248                        /lib64/libc-2.5.so
3a3154e000-3a31552000 r--p 0014e000 68:03 5310248                        /lib64/libc-2.5.so
3a31552000-3a31553000 rw-p 00152000 68:03 5310248                        /lib64/libc-2.5.so
3a31553000-3a31558000 rw-p 3a31553000 00:00 0
3a41c00000-3a41c0d000 r-xp 00000000 68:03 5310264                        /lib64/libgcc_s-4.1.2-20080825.so.1
3a41c0d000-3a41e0d000 ---p 0000d000 68:03 5310264                        /lib64/libgcc_s-4.1.2-20080825.so.1
3a41e0d000-3a41e0e000 rw-p 0000d000 68:03 5310264                        /lib64/libgcc_s-4.1.2-20080825.so.1
2b1912300000-2b1912302000 rw-p 2b1912300000 00:00 0
2b191231c000-2b191231d000 rw-p 2b191231c000 00:00 0
7ffffe214000-7ffffe229000 rw-p 7ffffffe9000 00:00 0                      [stack]
7ffffe2b0000-7ffffe2b4000 r-xp 7ffffe2b0000 00:00 0                      [vdso]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0                  [vsyscall]
Aborted
[sand@PS-CNTOS-64-S11 testbox]$




[sand@PS-CNTOS-64-S11 testbox]$ vim t1.c
[sand@PS-CNTOS-64-S11 testbox]$ cc -g t1.c -o t1
[sand@PS-CNTOS-64-S11 testbox]$ valgrind --tool=memcheck ./t1
==20859== Memcheck, a memory error detector
==20859== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==20859== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==20859== Command: ./t1
==20859==
==20859== Invalid free() / delete / delete[]
==20859==    at 0x4A05A31: free (vg_replace_malloc.c:325)
==20859==    by 0x4004FF: main (t1.c:8)
==20859==  Address 0x4c26040 is 0 bytes inside a block of size 100 free'd
==20859==    at 0x4A05A31: free (vg_replace_malloc.c:325)
==20859==    by 0x4004F6: main (t1.c:7)
==20859==
==20859==
==20859== HEAP SUMMARY:
==20859==     in use at exit: 0 bytes in 0 blocks
==20859==   total heap usage: 1 allocs, 2 frees, 100 bytes allocated
==20859==
==20859== All heap blocks were freed -- no leaks are possible
==20859==
==20859== For counts of detected and suppressed errors, rerun with: -v
==20859== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
[sand@PS-CNTOS-64-S11 testbox]$




[sand@PS-CNTOS-64-S11 testbox]$ valgrind --tool=memcheck --leak-check=full ./t1
==20899== Memcheck, a memory error detector
==20899== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==20899== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==20899== Command: ./t1
==20899==
==20899== Invalid free() / delete / delete[]
==20899==    at 0x4A05A31: free (vg_replace_malloc.c:325)
==20899==    by 0x4004FF: main (t1.c:8)
==20899==  Address 0x4c26040 is 0 bytes inside a block of size 100 free'd
==20899==    at 0x4A05A31: free (vg_replace_malloc.c:325)
==20899==    by 0x4004F6: main (t1.c:7)
==20899==
==20899==
==20899== HEAP SUMMARY:
==20899==     in use at exit: 0 bytes in 0 blocks
==20899==   total heap usage: 1 allocs, 2 frees, 100 bytes allocated
==20899==
==20899== All heap blocks were freed -- no leaks are possible
==20899==
==20899== For counts of detected and suppressed errors, rerun with: -v
==20899== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
[sand@PS-CNTOS-64-S11 testbox]$

One possible fix:

#include<stdio.h>
#include<stdlib.h>

int main()
{
    char *x = malloc(100);
    free(x);
    x=NULL;
    free(x);
    return 0;
}


[sand@PS-CNTOS-64-S11 testbox]$ vim t1.c
[sand@PS-CNTOS-64-S11 testbox]$ cc -g t1.c -o t1
[sand@PS-CNTOS-64-S11 testbox]$ ./t1
[sand@PS-CNTOS-64-S11 testbox]$


[sand@PS-CNTOS-64-S11 testbox]$ valgrind --tool=memcheck --leak-check=full ./t1
==20958== Memcheck, a memory error detector
==20958== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==20958== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==20958== Command: ./t1
==20958==
==20958==
==20958== HEAP SUMMARY:
==20958==     in use at exit: 0 bytes in 0 blocks
==20958==   total heap usage: 1 allocs, 1 frees, 100 bytes allocated
==20958==
==20958== All heap blocks were freed -- no leaks are possible
==20958==
==20958== For counts of detected and suppressed errors, rerun with: -v
==20958== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)
[sand@PS-CNTOS-64-S11 testbox]$

See also this blog post on using Valgrind: Link

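I know this is a very old thread, but it is the top Google result for this error, and none of the responses mention one common cause of it.

Namely, closing a file you have already closed.

If you aren't paying attention and have two different functions close the same file, the second one will generate this error.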

With a modern C++ compiler you can use sanitizers to track this down.

Example:

My program:

$ cat d_free.cxx
#include<iostream>

using namespace std;

int main()

{
    int * i = new int();
    delete i;
    //i = NULL;
    delete i;
}

Compile with the address sanitizer:

# g++-7.1 d_free.cxx -Wall -Werror -fsanitize=address -g

Execute:

# ./a.out
=================================================================
==4836==ERROR: AddressSanitizer: attempting double-free on 0x602000000010 in thread T0:
#0 0x7f35b2d7b3c8 in operator delete(void*, unsigned long) /media/sf_shared/gcc-7.1.0/libsanitizer/asan/asan_new_delete.cc:140
#1 0x400b2c in main /media/sf_shared/jkr/cpp/d_free/d_free.cxx:11
#2 0x7f35b2050c04 in __libc_start_main (/lib64/libc.so.6+0x21c04)
#3 0x400a08  (/media/sf_shared/jkr/cpp/d_free/a.out+0x400a08)


0x602000000010 is located 0 bytes inside of 4-byte region [0x602000000010,0x602000000014)
freed by thread T0 here:
#0 0x7f35b2d7b3c8 in operator delete(void*, unsigned long) /media/sf_shared/gcc-7.1.0/libsanitizer/asan/asan_new_delete.cc:140
#1 0x400b1b in main /media/sf_shared/jkr/cpp/d_free/d_free.cxx:9
#2 0x7f35b2050c04 in __libc_start_main (/lib64/libc.so.6+0x21c04)


previously allocated by thread T0 here:
#0 0x7f35b2d7a040 in operator new(unsigned long) /media/sf_shared/gcc-7.1.0/libsanitizer/asan/asan_new_delete.cc:80
#1 0x400ac9 in main /media/sf_shared/jkr/cpp/d_free/d_free.cxx:8
#2 0x7f35b2050c04 in __libc_start_main (/lib64/libc.so.6+0x21c04)


SUMMARY: AddressSanitizer: double-free /media/sf_shared/gcc-7.1.0/libsanitizer/asan/asan_new_delete.cc:140 in operator delete(void*, unsigned long)
==4836==ABORTING

To learn more about sanitizers, check the documentation of any modern C++ compiler (e.g. gcc, clang).

In my case, I was linking my program against CUDA 10.0 while a dependency of my program was linked with CUDA 10.2 (cudart.10.2.so). The inconsistency caused "double free or corruption" for me.

You can use ldd <your program> to check whether your dependencies pull in more than one version of the CUDA libraries.