为什么我会得到一个 C malloc 断言失败?

我正在实现一个分而治之的多项式算法,这样我就可以将它与 OpenCL 实现进行比较,但是我无法让 malloc工作。当我运行这个程序时,它会分配一堆东西,检查一些东西,然后将 size/2发送给算法。然后当我再次按下 malloc线时,它会吐出这样的信息:

malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
Aborted

问题在于:

int *mult(int size, int *a, int *b) {
int *out,i, j, *tmp1, *tmp2, *tmp3, *tmpa1, *tmpa2, *tmpb1, *tmpb2,d, *res1, *res2;
fprintf(stdout, "size: %d\n", size);


out = (int *)malloc(sizeof(int) * size * 2);
}

我用 fprintf检查了 size,它是一个正整数(通常是50)。我也试着用一个普通的号码打电话给 malloc,但我仍然得到了错误。我只是不知道发生了什么,到目前为止,我在谷歌上找到的东西都没有帮助。

知道是怎么回事吗?我正在想办法编译一个新的 GCC 以防出现编译器错误,但我真的很怀疑。

182262 次浏览

99.9% likely that you have corrupted memory (over- or under-flowed a buffer, wrote to a pointer after it was freed, called free twice on the same pointer, etc.)

Run your code under Valgrind to see where your program did something incorrect.

You are probably overrunning beyond the allocated mem somewhere. then the underlying sw doesn't pick up on it until you call malloc

There may be a guard value clobbered that is being caught by malloc.

edit...added this for bounds checking help

http://www.lrde.epita.fr/~akim/ccmp/doc/bounds-checking.html

We got this error because we forgot to multiply by sizeof(int). Note the argument to malloc(..) is a number of bytes, not number of machine words or whatever.

I was porting one application from Visual C to gcc over Linux and I had the same problem with

malloc.c:3096: sYSMALLOc: Assertion using gcc on UBUNTU 11.

I moved the same code to a Suse distribution (on other computer ) and I don't have any problem.

I suspect that the problems are not in our programs but in the own libc.

To give you a better understanding of why this happens, I'd like to expand upon @r-samuel-klatchko's answer a bit.

When you call malloc, what is really happening is a bit more complicated than just giving you a chunk of memory to play with. Under the hood, malloc also keeps some housekeeping information about the memory it has given you (most importantly, its size), so that when you call free, it knows things like how much memory to free. This information is commonly kept right before the memory location returned to you by malloc. More exhaustive information can be found on the internet™, but the (very) basic idea is something like this:

+------+-------------------------------------------------+
+ size |                  malloc'd memory                +
+------+-------------------------------------------------+
^-- location in pointer returned by malloc

Building on this (and simplifying things greatly), when you call malloc, it needs to get a pointer to the next part of memory that is available. One very simple way of doing this is to look at the previous bit of memory it gave away, and move size bytes further down (or up) in memory. With this implementation, you end up with your memory looking something like this after allocating p1, p2 and p3:

+------+----------------+------+--------------------+------+----------+
+ size |                | size |                    | size |          +
+------+----------------+------+--------------------+------+----------+
^- p1                   ^- p2                       ^- p3

So, what is causing your error?

Well, imagine that your code erroneously writes past the amount of memory you've allocated (either because you allocated less than you needed as was your problem or because you're using the wrong boundary conditions somewhere in your code). Say your code writes so much data to p2 that it starts overwriting what is in p3's size field. When you now next call malloc, it will look at the last memory location it returned, look at its size field, move to p3 + size and then start allocating memory from there. Since your code has overwritten size, however, this memory location is no longer after the previously allocated memory.

Needless to say, this can wreck havoc! The implementors of malloc have therefore put in a number of "assertions", or checks, that try to do a bunch of sanity checking to catch this (and other issues) if they are about to happen. In your particular case, these assertions are violated, and thus malloc aborts, telling you that your code was about to do something it really shouldn't be doing.

As previously stated, this is a gross oversimplification, but it is sufficient to illustrate the point. The glibc implementation of malloc is more than 5k lines, and there have been substantial amounts of research into how to build good dynamic memory allocation mechanisms, so covering it all in a SO answer is not possible. Hopefully this has given you a bit of a view of what is really causing the problem though!

I got the following message, similar to your one:



program: malloc.c:2372: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long) old_end & pagemask) == 0)' failed.


Made a mistake some method call before, when using malloc. Erroneously overwrote the multiplication sign '*' with a '+', when updating the factor after sizeof()-operator on adding a field to unsigned char array.

Here is the code responsible for the error in my case:



UCHAR* b=(UCHAR*)malloc(sizeof(UCHAR)+5);
b[INTBITS]=(some calculation);
b[BUFSPC]=(some calculation);
b[BUFOVR]=(some calculation);
b[BUFMEM]=(some calculation);
b[MATCHBITS]=(some calculation);


In another method later, I used malloc again and it produced the error message shown above. The call was (simple enough):



UCHAR* b=(UCHAR*)malloc(sizeof(UCHAR)*50);


Think using the '+'-sign on the 1st call, which lead to mis-calculus in combination with immediate initialization of the array after (overwriting memory that was not allocated to the array), brought some confusion to malloc's memory map. Therefore the 2nd call went wrong.

i got the same problem, i used malloc over n over again in a loop for adding new char *string data. i faced the same problem, but after releasing the allocated memory void free() problem were sorted

My alternative solution to using Valgrind:

I'm very happy because I just helped my friend debug a program. His program had this exact problem (malloc() causing abort), with the same error message from GDB.

I compiled his program using Address Sanitizer with

gcc -Wall -g3 -fsanitize=address -o new new.c
^^^^^^^^^^^^^^^^^^

And then ran gdb new. When the program gets terminated by SIGABRT caused in a subsequent malloc(), a whole lot of useful information is printed:

=================================================================
==407==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6060000000b4 at pc 0x7ffffe49ed1a bp 0x7ffffffedc20 sp 0x7ffffffed3c8
WRITE of size 104 at 0x6060000000b4 thread T0
#0 0x7ffffe49ed19  (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5ed19)
#1 0x8001dab in CreatHT2 /home/wsl/Desktop/hash/new.c:59
#2 0x80031cf in main /home/wsl/Desktop/hash/new.c:209
#3 0x7ffffe061b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
#4 0x8001679 in _start (/mnt/d/Desktop/hash/new+0x1679)


0x6060000000b4 is located 0 bytes to the right of 52-byte region [0x606000000080,0x6060000000b4)
allocated by thread T0 here:
#0 0x7ffffe51eb50 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb50)
#1 0x8001d56 in CreatHT2 /home/wsl/Desktop/hash/new.c:55
#2 0x80031cf in main /home/wsl/Desktop/hash/new.c:209
#3 0x7ffffe061b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

Let's take a look at the output, especially the stack trace:

The first part says there's a invalid write operation at new.c:59. That line reads

memset(len,0,sizeof(int*)*p);
^^^^^^^^^^^^

The second part says the memory that the bad write happened on is created at new.c:55. That line reads

if(!(len=(int*)malloc(sizeof(int)*p))){
^^^^^^^^^^^

That's it. It only took me less than half a minute to locate the bug that confused my friend for a few hours. He managed to locate the failure, but it's a subsequent malloc() call that failed, without being able to spot this error in previous code.

Sum up: Try the -fsanitize=address of GCC or Clang. It can be very helpful when debugging memory issues.