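Are there machines where sizeof(char) != 1, or at least CHAR_BIT > 8?

Are there machines (or compilers) where sizeof(char) != 1?

Does the C99 standard say that sizeof(char) on a standard-compliant implementation MUST be exactly 1? If so, please give me the section number and a citation.

Update: If I have a machine (CPU) that can't work with bytes (the minimal read is 4 bytes, aligned), but only with 4-byte units (uint32_t), can the compiler for this machine define sizeof(char) to be 4? sizeof(char) will be 1, but char will have 32 bits (the CHAR_BIT macro).

Update 2: But the sizeof result is NOT in bytes! It is the size of char. And char can be 2 bytes, or (maybe) 7 bits?

Update 3: OK. All machines have sizeof(char) == 1. But which machines have CHAR_BIT > 8?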


It is always one in C99, section 6.5.3.4:

When applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1.
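As a small illustration of the quoted rule, here is a sketch that checks it for the three character types and one qualified version. The function names are made up for this example; only sizeof and CHAR_BIT come from the language and <limits.h>.

```c
#include <limits.h>

/* Returns 1 if every character type has size 1, as C99 6.5.3.4 requires.
   On a conforming implementation this can never return 0. */
int char_sizes_are_one(void)
{
    return sizeof(char) == 1
        && sizeof(signed char) == 1
        && sizeof(unsigned char) == 1
        && sizeof(const char) == 1;   /* a qualified version */
}

/* Number of bits in one char; this is the part that may vary by machine. */
int bits_per_char(void)
{
    return CHAR_BIT;
}
```

On a typical desktop bits_per_char() gives 8, but the first function returns 1 no matter what CHAR_BIT is.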

Edit: not part of your question, but for interest, from Harbison and Steele's C: A Reference Manual, Third Edition, Prentice Hall, 1991 (pre-C99), p. 148:

A storage unit is taken to be the amount of storage occupied by one character; the size of an object of type char is therefore 1.

Edit: In answer to your updated question, the following question and answer from Harbison and Steele is relevant (ibid, Ex. 4 of Ch. 6):

Is it allowable to have a C implementation in which type char can represent values ranging from -2,147,483,648 through 2,147,483,647? If so, what would be sizeof(char) under that implementation? What would be the smallest and largest ranges of type int?

Answer (ibid, p. 382):

It is permitted (if wasteful) for an implementation to use 32 bits to represent type char. Regardless of the implementation, the value of sizeof(char) is always 1.

While this does not specifically address a case where, say, bytes are 8 bits and a char is 4 of those bytes (actually impossible under the C99 definition, see below), it is clear from both the C99 standard and Harbison and Steele that sizeof(char) is always 1.

Edit: In fact (this is in response to your Update 2), as far as C99 is concerned, sizeof(char) is measured in bytes. From section 6.5.3.4 again:

The sizeof operator yields the size (in bytes) of its operand

so combined with the quotation above, bytes of 8 bits and char as 4 of those bytes is impossible: for c99 a byte is the same as a char.
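Since sizeof counts in units of char, the number of bits an object occupies is sizeof times CHAR_BIT. A minimal sketch of that arithmetic follows; the macro and function names are hypothetical, and the result is storage bits, which may include padding bits.

```c
#include <limits.h>
#include <stddef.h>

/* Storage size of a type in bits: sizeof counts chars (C bytes),
   and each of them holds CHAR_BIT bits. */
#define STORAGE_BITS(T) (sizeof(T) * (size_t)CHAR_BIT)

size_t char_storage_bits(void)
{
    return STORAGE_BITS(char);   /* always exactly CHAR_BIT */
}

size_t int_storage_bits(void)
{
    return STORAGE_BITS(int);    /* at least 16 on any implementation */
}
```

On a machine with 32-bit chars, char_storage_bits() would be 32 while sizeof(char) is still 1.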

In answer to your mention of the possibility of a 7 bit char: this is not possible in c99. According to section 5.2.4.2.1 of the standard the minimum is 8:

Their implementation-defined values shall be equal or greater [my emphasis] in magnitude to those shown, with the same sign.

— number of bits for smallest object that is not a bit-field (byte)

CHAR_BIT 8

— minimum value for an object of type signed char

SCHAR_MIN -127

— maximum value for an object of type signed char

SCHAR_MAX +127

— maximum value for an object of type unsigned char

UCHAR_MAX 255

— minimum value for an object of type char

CHAR_MIN see below

— maximum value for an object of type char

CHAR_MAX see below

[...]

If the value of an object of type char is treated as a signed integer when used in an expression, the value of CHAR_MIN shall be the same as that of SCHAR_MIN and the value of CHAR_MAX shall be the same as that of SCHAR_MAX. Otherwise, the value of CHAR_MIN shall be 0 and the value of CHAR_MAX shall be the same as that of UCHAR_MAX. The value UCHAR_MAX shall equal 2^CHAR_BIT − 1.
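The relationships in that passage can be checked mechanically. The following is only a sketch with made-up function names; it counts the bits of UCHAR_MAX with a loop rather than computing a power of two, to avoid shifting by a full type width when CHAR_BIT is large.

```c
#include <limits.h>

/* Returns 1 if UCHAR_MAX consists of exactly CHAR_BIT one-bits,
   i.e. UCHAR_MAX == 2^CHAR_BIT - 1. */
int uchar_max_matches_char_bit(void)
{
    unsigned char v = UCHAR_MAX;
    int bits = 0;

    while (v != 0) {
        if ((v & 1u) == 0) {
            return 0;            /* a zero bit below the top: not all ones */
        }
        v >>= 1;
        bits++;
    }
    return bits == CHAR_BIT;
}

/* Returns 1 if CHAR_MIN and CHAR_MAX follow the signed or unsigned limits,
   depending on whether plain char is signed on this implementation. */
int char_limits_follow_signedness(void)
{
#if CHAR_MIN < 0
    return CHAR_MIN == SCHAR_MIN && CHAR_MAX == SCHAR_MAX;
#else
    return CHAR_MIN == 0 && CHAR_MAX == UCHAR_MAX;
#endif
}
```

Both functions should return 1 on any conforming implementation, whatever the value of CHAR_BIT.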

There are no machines where sizeof(char) is 4. It's always 1 byte. That byte might contain 32 bits, but as far as the C compiler is concerned, it's one byte. For more details, I'm actually going to point you at the C++ FAQ 26.6. That link covers it pretty well and I'm fairly certain C++ got all of those rules from C. You can also look at comp.lang.c FAQ 8.10 for characters larger than 8 bits.

Upd2: But the sizeof result is NOT in bytes! It is the size of char. And char can be 2 bytes, or (maybe) 7 bits?

Yes, it is bytes. Let me say it again. sizeof(char) is 1 byte according to the C compiler. What people colloquially call a byte (8 bits) is not necessarily the same as what the C compiler calls a byte. The number of bits in a C byte varies depending on your machine architecture. It's also guaranteed to be at least 8.
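One practical consequence: code that exchanges data in 8-bit octets should not assume an unsigned char holds exactly 8 bits. Here is a sketch of one way to handle that, with hypothetical helper names, storing one octet per unsigned char and masking explicitly so it also works when CHAR_BIT is 16 or 32.

```c
/* Write the low 32 bits of value into out[0..3] as octets,
   most significant first. Each element holds one octet even if
   unsigned char is wider than 8 bits. */
void put_u32_be(unsigned char out[4], unsigned long value)
{
    int i;
    for (i = 0; i < 4; i++) {
        out[i] = (unsigned char)((value >> (8 * (3 - i))) & 0xFFul);
    }
}

/* Rebuild the 32-bit value, ignoring any bits above the low 8
   in each element. */
unsigned long get_u32_be(const unsigned char in[4])
{
    unsigned long value = 0;
    int i;
    for (i = 0; i < 4; i++) {
        value = (value << 8) | (in[i] & 0xFFul);
    }
    return value;
}
```

The masks are redundant when CHAR_BIT is 8, but they make the octet assumption explicit instead of relying on the width of a C byte.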

The PDP-10 and PDP-11 were such machines.

Update: there appear to be no C99 compilers for the PDP-10.

Some models of the Analog Devices 32-bit SHARC DSP have CHAR_BIT=32, and Texas Instruments DSPs from the TMS320F28xx family reportedly have CHAR_BIT=16.

Update: There is GCC 3.2 for PDP-10 with CHAR_BIT=9 (check include/limits.h in that archive).
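For code that has to build on such targets, buffer sizes are better derived from CHAR_BIT than from an assumed 8. This is a sketch only, and the function name is invented for the example.

```c
#include <limits.h>

/* Number of char-sized storage units needed to hold n_bits bits,
   rounding up. With CHAR_BIT == 9, 36 bits fit in 4 chars; with
   CHAR_BIT == 16, 32 bits fit in 2. */
unsigned long chars_needed_for_bits(unsigned long n_bits)
{
    return (n_bits + CHAR_BIT - 1) / CHAR_BIT;
}
```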