Why is the size of this struct 3 instead of 2?

I have defined this struct:

typedef struct
{
char A:3;
char B:3;
char C:3;
char D:3;
char E:3;
} col;

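sizeof(col) gives 3, but shouldn't it be 2? If I comment out just one element, sizeof is 2. I don't understand why: five 3-bit elements add up to 15 bits, and that is less than 2 bytes.

Is there some "internal size" involved when defining a struct like this? I just need a clarification, because from my understanding of the language so far I expected a size of 2 bytes, not 3.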


Because a bit-field cannot span the minimum alignment boundary (which is 1 byte here), the fields will probably be packed like this:

byte 1
A : 3
B : 3
padding : 2
byte 2
C : 3
D : 3
padding : 2
byte 3
E : 3
padding : 5

(The order of fields and padding within each byte is not meant literally; it is just to give you the idea, since the compiler may lay them out however it prefers.)

The first two bit fields fit into a single char. The third cannot fit into that char and needs a new one. 3 + 3 + 3 = 9 which doesn't fit into an 8 bit char.

So the first pair takes a char, the second pair takes a char, and the last bit-field gets a third char.

Because you are using char as the underlying type for your fields, the compiler tries to group bits by bytes, and since it cannot put more than eight bits in each byte, it can only store two fields per byte.

The total sum of bits your struct uses is 15, so the ideal size to fit that much data would be a short.

#include <stdio.h>

typedef struct
{
    char A:3;
    char B:3;
    char C:3;
    char D:3;
    char E:3;
} col;

typedef struct {
    short A:3;
    short B:3;
    short C:3;
    short D:3;
    short E:3;
} col2;

int main(void)
{
    /* sizeof yields a size_t, so the matching format specifier is %zu */
    printf("size of col: %zu\n", sizeof(col));
    printf("size of col2: %zu\n", sizeof(col2));
    return 0;
}

The above code (for a 64-bit platform like mine) will indeed yield 2 for the second struct. For anything larger than a short, the struct will occupy no more than one element of the declared type, so, on that same platform, the struct ends up with size four for int, eight for long, etc.

Even though the ANSI C standard specifies too little about how bitfields are packed to offer any significant advantage over "compilers are allowed to pack bitfields however they see fit", it nonetheless in many cases forbids compilers from packing things in the most efficient fashion.

In particular, if a structure contains bitfields, a compiler is required to store it as a structure which contains one or more anonymous fields of some "normal" storage type and then logically subdivide each such field into its constituent bitfield parts. Thus, given:

unsigned char foo1: 3;
unsigned char foo2: 3;
unsigned char foo3: 3;
unsigned char foo4: 3;
unsigned char foo5: 3;
unsigned char foo6: 3;
unsigned char foo7: 3;

If unsigned char is 8 bits, the compiler would be required to allocate four fields of that type, and assign two bitfields to all but one (which would be in a char field of its own). If all char declarations had been replaced with short, then there would be two fields of type short, one of which would hold five bitfields and the other of which would hold the remaining two.

On a processor without alignment restrictions, the data could be laid out more efficiently by using unsigned short for the first five fields and unsigned char for the last two, storing seven three-bit fields in three bytes. While it should be possible to store eight three-bit fields in three bytes, a compiler could only allow that if there existed a three-byte numeric type which could be used as the "outer field" type.

Personally, I consider bitfields as defined to be basically useless. If code needs to work with binary-packed data, it should explicitly define storage locations of actual types, and then use macros or some other such means to access the bits thereof. It would be helpful if C supported a syntax like:

unsigned short f1;
unsigned char f2;
union foo1 = f1:0.3;
union foo2 = f1:3.3;
union foo3 = f1:6.3;
union foo4 = f1:9.3;
union foo5 = f1:12.3;
union foo6 = f2:0.3;
union foo7 = f2:3.3;

Such a syntax, if allowed, would make it possible for code to use bitfields in a portable fashion, without regard for word sizes or byte orderings (foo1 would be in the three least-significant bits of f1, but those could be stored at the lower or higher address). Absent such a feature, however, macros are probably the only portable way to operate with such things.

Most compilers allow you to control the padding, e.g. using #pragmas. Here's an example with GCC 4.8.1:

#include <stdio.h>

typedef struct
{
    char A:3;
    char B:3;
    char C:3;
    char D:3;
    char E:3;
} col;

#pragma pack(push, 1)
typedef struct {
    char A:3;
    char B:3;
    char C:3;
    char D:3;
    char E:3;
} col2;
#pragma pack(pop)

int main(void)
{
    /* sizeof yields a size_t, so the matching format specifier is %zu */
    printf("size of col: %zu\n", sizeof(col));    // 3
    printf("size of col2: %zu\n", sizeof(col2));  // 2
    return 0;
}

Note that the default behaviour of the compiler is there for a reason and will probably give you better performance.