malloc vs. new: different padding

小开

I think you are right. Padding is done by the compiler, not by new or malloc. Padding considerations would apply even if you declared an array or struct without using new or malloc at all. In any case, while I can see how different implementations of new and malloc could cause problems when porting code between platforms, I completely fail to see how they could cause problems when transferring data between platforms.

小开

Your colleague may have had new[]/delete[]'s magic cookie in mind (this is the information the implementation uses when deleting an array). However, this would not have been a problem if the data were read starting at the address returned by new[] (as opposed to the address returned by the underlying allocator).

Packing seems more probable. Variations in ABIs could (for example) result in a different number of trailing bytes being added at the end of a structure (this is influenced by alignment; also consider arrays). With malloc, the position of a structure could be specified explicitly and thus be more easily portable to a foreign ABI. These variations are normally prevented by specifying the alignment and packing of transfer structures.

小开

The layout of an object can't depend on whether it was allocated using malloc or new. They both return the same kind of pointer, and when you pass this pointer to other functions they won't know how the object was allocated. sizeof *ptr is just dependent on the declaration of ptr, not how it was assigned.
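The point can be checked directly. The following is a minimal sketch, assuming a hypothetical struct Foo (not taken from the question) and helper names chosen only for illustration. It compares the size and member offset of one object placed in malloc'ed memory with one created by new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical example type, used only for illustration.
struct Foo {
    int a;
    char b;
};

// Offset of member b inside the object at p, measured in bytes.
std::size_t offset_of_b(const Foo* p) {
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(&p->b) - reinterpret_cast<const char*>(p));
}

// Returns true when a Foo living in malloc'ed memory and a Foo created by new
// have the same size and the same member offset.
bool same_layout() {
    void* raw = std::malloc(sizeof(Foo));
    assert(raw != nullptr);
    Foo* from_malloc = new (raw) Foo{1, 'x'};  // placement new starts the object's lifetime
    Foo* from_new = new Foo{1, 'x'};

    bool same = offset_of_b(from_malloc) == offset_of_b(from_new)
             && sizeof(*from_malloc) == sizeof(*from_new);

    delete from_new;
    std::free(raw);  // Foo is trivially destructible, so no destructor call is needed
    return same;
}
```

Both sizeof expressions are evaluated at compile time from the type Foo alone, which is why the allocation function cannot influence them.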

小开

Best answer

IIRC there's one picky point. malloc is guaranteed to return an address aligned for any standard type. ::operator new(n) is only guaranteed to return an address aligned for any standard type no larger than n, and if T isn't a character type then new T[n] is only required to return an address aligned for T.

But this is only relevant when you're playing implementation-specific tricks like using the bottom few bits of a pointer to store flags, or otherwise relying on the address to have more alignment than it strictly needs.
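As a rough illustration of what "relying on alignment" means here, the sketch below checks the alignment of a malloc result and counts how many low pointer bits an alignment leaves at zero. The helper names are made up for this example, and the request size is chosen as sizeof(std::max_align_t) so that the strictest fundamental alignment applies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// True when address p is a multiple of alignment (assumed to be a power of two).
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Number of low pointer bits that are always zero for the given alignment,
// e.g. 3 bits for 8-byte alignment. These are the bits a tagging trick borrows.
unsigned spare_low_bits(std::size_t alignment) {
    unsigned bits = 0;
    while (alignment > 1) {
        alignment >>= 1;
        ++bits;
    }
    return bits;
}

// Checks that a malloc block large enough for std::max_align_t is aligned for it.
bool malloc_is_max_aligned() {
    void* p = std::malloc(sizeof(std::max_align_t));
    assert(p != nullptr);
    bool ok = is_aligned(p, alignof(std::max_align_t));
    std::free(p);
    return ok;
}
```

Code that stores flags in those low bits is the kind that could notice a difference between allocation functions; ordinary member access cannot.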

It doesn't affect padding within the object, which necessarily has exactly the same layout regardless of how you allocated the memory it occupies. So it's hard to see how the difference could result in errors transferring data.

Is there any sign what the author of that comment thinks about objects on the stack or in globals, whether in his opinion they're "padded like malloc" or "padded like new"? That might give clues to where the idea came from.

Maybe he's confused, but maybe the code he's talking about is more than a straight difference between malloc(sizeof(Foo) * n) vs new Foo[n]. Maybe it's more like:

malloc((sizeof(int) + sizeof(char)) * n);

vs.

struct Foo { int a; char b; };
new Foo[n];

That is, maybe he's saying "I use malloc", but means "I manually pack the data into unaligned locations instead of using a struct". Actually malloc is not needed in order to manually pack the struct, but failing to realize that is a lesser degree of confusion. It is necessary to define the data layout sent over the wire. Different implementations will pad the data differently when the struct is used.
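A minimal sketch of that "manual packing" reading, assuming the same Foo as above and hypothetical helpers store, load and round_trip: fields are copied back to back into a malloc'ed byte buffer, so each record takes sizeof(int) + sizeof(char) bytes rather than sizeof(Foo).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct Foo { int a; char b; };

// Bytes per record when the fields are stored back to back, with no padding.
constexpr std::size_t kRecord = sizeof(int) + sizeof(char);

// Copies the fields of record i into the packed buffer.
void store(unsigned char* buf, std::size_t i, int a, char b) {
    unsigned char* p = buf + i * kRecord;
    std::memcpy(p, &a, sizeof a);  // memcpy because p may be misaligned for int
    std::memcpy(p + sizeof a, &b, sizeof b);
}

// Rebuilds record i from the packed buffer into an ordinary (padded) Foo.
Foo load(const unsigned char* buf, std::size_t i) {
    Foo f;
    const unsigned char* p = buf + i * kRecord;
    std::memcpy(&f.a, p, sizeof f.a);
    std::memcpy(&f.b, p + sizeof f.a, sizeof f.b);
    return f;
}

// Writes n records (n > 0) into a malloc'ed packed buffer and reads them back.
bool round_trip(std::size_t n) {
    unsigned char* buf = static_cast<unsigned char*>(std::malloc(kRecord * n));
    assert(buf != nullptr);
    for (std::size_t i = 0; i < n; ++i)
        store(buf, i, static_cast<int>(i), static_cast<char>('a' + i % 26));
    bool ok = true;
    for (std::size_t i = 0; i < n; ++i) {
        Foo f = load(buf, i);
        ok = ok && f.a == static_cast<int>(i) && f.b == static_cast<char>('a' + i % 26);
    }
    std::free(buf);
    return ok;
}
```

Nothing here depends on malloc specifically; a std::vector<unsigned char> would serve equally well. The sketch also leaves byte order and sizeof(int) unspecified, which a real wire format would have to pin down.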

小开

This is my wild guess at where this is coming from. As you mentioned, the problem is with data transmission over MPI.

Personally, for the complicated data structures that I want to send/receive over MPI, I always implement serialization/deserialization methods that pack/unpack the whole thing into/from an array of chars. Because of padding, the size of a structure can be larger than the sum of the sizes of its members, so one also needs to calculate the unpadded size of the data structure in order to know how many bytes are being sent/received.

For instance, if you want to send/receive std::vector<Foo> A over MPI with this technique, it is wrong in general to assume that the size of the resulting array of chars is A.size()*sizeof(Foo). In other words, each class that implements serialize/deserialize methods should also implement a method that reports the size of the array (or, better yet, store the array in a container). Getting this size wrong could well be the source of a bug. One way or another, however, that has nothing to do with new vs malloc, as pointed out in this thread.
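A sketch of that size bookkeeping, under assumptions: Foo and its serialized_size, serialize and deserialize members are hypothetical, as are the free functions serialize_all and deserialize_all. The MPI call itself is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical element type standing in for the Foo of the text.
struct Foo {
    int a;
    char b;

    // Bytes one serialized object occupies: members only, no padding.
    static constexpr std::size_t serialized_size() { return sizeof(int) + sizeof(char); }

    void serialize(char* out) const {
        std::memcpy(out, &a, sizeof a);
        std::memcpy(out + sizeof a, &b, sizeof b);
    }

    static Foo deserialize(const char* in) {
        Foo f;
        std::memcpy(&f.a, in, sizeof f.a);
        std::memcpy(&f.b, in + sizeof f.a, sizeof f.b);
        return f;
    }
};

// Packs every element; the container carries the byte count along with the data.
std::vector<char> serialize_all(const std::vector<Foo>& A) {
    std::vector<char> buf(A.size() * Foo::serialized_size());
    for (std::size_t i = 0; i < A.size(); ++i)
        A[i].serialize(buf.data() + i * Foo::serialized_size());
    return buf;
}

// Inverse of serialize_all; the element count is derived from the byte count.
std::vector<Foo> deserialize_all(const std::vector<char>& buf) {
    assert(buf.size() % Foo::serialized_size() == 0);
    std::vector<Foo> A(buf.size() / Foo::serialized_size());
    for (std::size_t i = 0; i < A.size(); ++i)
        A[i] = Foo::deserialize(buf.data() + i * Foo::serialized_size());
    return A;
}
```

The number of bytes to transmit is then buf.size(), not A.size()*sizeof(Foo). On common platforms the two differ because sizeof(Foo) includes trailing padding.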

小开

When I want to control the layout of my plain old data structures with the MS Visual C++ compilers, I use #pragma pack(1). I suppose such a preprocessor directive is supported by most compilers, gcc for example.

This has the consequence of aligning all fields of the structures one behind the other, without empty spaces.

If the platform on the other end does the same (i.e. compiles its data exchange structures with a packing of 1), then the data retrieved on both sides fits together. Thus I have never had to play with malloc in C++.
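A small sketch of the effect, assuming a compiler that accepts the push/pop form of the pragma (MSVC, GCC and Clang do). The struct names are made up for this example.

```cpp
#include <cstddef>

// Default layout: the compiler is free to insert padding after tag.
struct Padded {
    char tag;
    int value;
};

// Packing of 1: members are laid out one directly after the other.
#pragma pack(push, 1)
struct Packed {
    char tag;
    int value;
};
#pragma pack(pop)  // restore the previous packing for the rest of the file

static_assert(sizeof(Packed) == sizeof(char) + sizeof(int),
              "packed struct has no padding bytes");
static_assert(offsetof(Packed, value) == sizeof(char),
              "value starts right after tag");
static_assert(sizeof(Padded) >= sizeof(Packed),
              "default layout is never smaller than the packed one");
```

Packing removes padding only. Both ends still have to agree on byte order and on the sizes of the member types, and members of a packed struct may be misaligned, so they are best accessed through the struct itself rather than through a pointer to the member.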

At worst, I would have considered overloading operator new so that it performs whatever tricks are needed, rather than using malloc directly in C++.

小开

In C++, the new keyword allocates memory for an object of a particular data type. For example, suppose you have defined a class or structure and you want to allocate memory for an object of it.

myclass *my = new myclass();

or

int *i = new int(2);

But in all cases you need a defined data type (class, struct, union, int, char, etc.), and only as many bytes as that object/variable requires will be allocated (i.e. a multiple of the size of that data type).

With the malloc() function, by contrast, you can allocate any number of bytes, and you do not always need to specify a data type. A few of the possible uses of malloc():

void *v = malloc(23);

or

void *x = malloc(sizeof(int) * 23);

or

char *c = (char*)malloc(sizeof(char)*35);

小开

malloc is a function and new is an operator in C++. If we use malloc, we must cast the returned void* to the target pointer type, otherwise the compiler gives an error; if we use new to allocate memory, no cast is needed.
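The difference in the pointer types can be sketched as follows. The function names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdlib>

// malloc returns void*, which C++ will not convert to int* implicitly,
// so an explicit cast is required.
int sum_with_malloc() {
    int* a = static_cast<int*>(std::malloc(sizeof(int) * 3));
    assert(a != nullptr);
    a[0] = 1;
    a[1] = 2;
    a[2] = 3;
    int sum = a[0] + a[1] + a[2];
    std::free(a);  // memory from malloc is released with free
    return sum;
}

// new[] already yields an int*, so no cast is needed.
int sum_with_new() {
    int* a = new int[3]{1, 2, 3};
    int sum = a[0] + a[1] + a[2];
    delete[] a;  // memory from new[] is released with delete[]
    return sum;
}
```

In both cases the ints themselves are laid out identically; only the way the pointer is obtained and released differs.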