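Does Java read integers in little endian or big endian?

I ask because I am sending a byte stream from a C process to Java. On the C side, the LSB of a 32-bit integer is the first byte and the MSB is the fourth byte.

So my question is: on the Java side, when we read the bytes sent from the C process, what is the endianness?

A follow-up question: if the endianness on the Java side differs from that of the sender, how do I convert between them?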


Java has no unsigned integer types (other than the 16-bit char). byte, short, int and long are all signed, and Java's I/O APIs such as DataInputStream treat them as big endian.

On the C side, each byte has the LSB at the start (on the left) and the MSB at the end.

It sounds like you are using LSB as Least significant bit, are you? LSB usually stands for least significant byte. Endianness is not bit based but byte based.

To convert from unsigned byte to a Java integer:

int i = (int) b & 0xFF;
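As a small illustration of the masking above, here is a sketch that compares it with Byte.toUnsignedInt, which has been in the standard library since Java 8. The class and method names (UnsignedByteDemo, toUnsigned) are made up for this example.

```java
public class UnsignedByteDemo {
    // Interpret a signed Java byte as an unsigned value in the range 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF; // the mask drops the sign-extended upper bits
    }

    public static void main(String[] args) {
        byte b = (byte) 0xF0;                      // stored as -16
        System.out.println(b);                     // -16
        System.out.println(toUnsigned(b));         // 240
        System.out.println(Byte.toUnsignedInt(b)); // 240
    }
}
```

Both forms give the same result; the library method just makes the intent more obvious.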

To convert from unsigned 32-bit little-endian in byte[] to Java long (off the top of my head, not tested):

long l = (long)b[0] & 0xFF;
l += ((long)b[1] & 0xFF) << 8;
l += ((long)b[2] & 0xFF) << 16;
l += ((long)b[3] & 0xFF) << 24;
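The same conversion can probably be done with java.nio.ByteBuffer instead of manual shifting. This is only a sketch: the class and method names (LittleEndianDemo, readUInt32LE) are invented here, and it assumes the array holds exactly the four bytes of one value.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianDemo {
    // Decode 4 bytes, least significant byte first, as an unsigned 32-bit value.
    static long readUInt32LE(byte[] b) {
        int signed = ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN).getInt();
        return signed & 0xFFFFFFFFL; // widen to long without sign extension
    }

    public static void main(String[] args) {
        byte[] one = {0x01, 0x00, 0x00, 0x00};
        byte[] max = {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        System.out.println(readUInt32LE(one)); // 1
        System.out.println(readUInt32LE(max)); // 4294967295
    }
}
```

A ByteBuffer is big endian by default, so the order(...) call is what makes this match the C side described in the question.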

I would read the bytes one by one, and combine them into a long value. That way you control the endianness, and the communication process is transparent.

Use the network byte order (big endian), which is the same as Java uses anyway. See man htons for the different translators in C.

If it fits the protocol you use, consider using a DataInputStream, where the behavior is very well defined.
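If the sender cannot be changed and keeps writing little endian, one possible approach is to read with DataInputStream.readInt() (which is big endian) and swap the bytes with Integer.reverseBytes. A minimal sketch, where ReverseBytesDemo, readIntLE and decode are hypothetical names and a ByteArrayInputStream stands in for the real socket or pipe:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ReverseBytesDemo {
    // Read one little-endian 32-bit integer from a stream.
    static int readIntLE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt()); // readInt() itself is big endian
    }

    // Convenience wrapper for in-memory data (stands in for the real stream).
    static int decode(byte[] wire) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            return readIntLE(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = {0x78, 0x56, 0x34, 0x12}; // 0x12345678 sent LSB first
        System.out.println(Integer.toHexString(decode(wire))); // 12345678
    }
}
```

Short.reverseBytes and Long.reverseBytes do the same job for readShort() and readLong().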

There's no way this could influence anything in Java, since there's no (direct non-API) way to map some bytes directly into an int in Java.

Every API that does this or something similar defines the behaviour pretty precisely, so you should look up the documentation of that API.

I stumbled here via Google and got my answer that Java is big endian.

Reading through the responses, I'd like to point out that bytes do indeed have an endian order of their own (the order of the bits). Mercifully, if you've only dealt with “mainstream” microprocessors you are unlikely to have ever encountered it: Intel, Motorola, and Zilog all agreed on the shift direction of their UART chips, and that the MSB of a byte would be 2**7 and the LSB would be 2**0 in their CPUs. (I used the FORTRAN power notation to emphasize how old this stuff is :) )

I ran into this issue with some Space Shuttle bit-serial downlink data 20+ years ago, when we replaced $10K of interface hardware with a Mac computer. There is a NASA Tech Brief published about it long ago. I simply used a 256-element lookup table with the bits reversed (table[0x01]=0x80 etc.) after each byte was shifted in from the bit stream.
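For the curious, a table like that could be built in Java roughly as follows. This is a sketch and not the original code from that project; BitReverseTable, TABLE and reverseBits are names invented for the example.

```java
public class BitReverseTable {
    // 256-entry table: TABLE[i] is i with its 8 bits in reverse order.
    static final int[] TABLE = new int[256];

    static {
        for (int i = 0; i < 256; i++) {
            // Integer.reverse flips all 32 bits, so the reversed byte lands
            // in the top 8 bits; the unsigned shift brings it back down.
            TABLE[i] = Integer.reverse(i) >>> 24;
        }
    }

    // Reverse the bit order of a single byte via the lookup table.
    static byte reverseBits(byte b) {
        return (byte) TABLE[b & 0xFF];
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(TABLE[0x01])); // 80
        System.out.println(Integer.toHexString(TABLE[0x0F])); // f0
    }
}
```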

Java is 'Big-endian' as noted above. That means that the MSB of an int is on the left if you examine memory (on an Intel CPU at least). The sign bit is also in the MSB for all Java integer types.
Reading a 4 byte unsigned integer from a binary file stored by a 'Little-endian' system takes a bit of adaptation in Java. DataInputStream's readInt() expects Big-endian format.
Here's an example that reads a four byte unsigned value (as displayed by HexEdit as 01 00 00 00) into an integer with a value of 1:

// Declare an array of 4 shorts to hold the four unsigned bytes
short[] tempShort = new short[4];
for (int b = 0; b < 4; b++) {
    tempShort[b] = (short) dIStream.readUnsignedByte();
}
int curVal = convToInt(tempShort);


// Pass an array of four shorts which convert from LSB first
public int convToInt(short[] sb)
{
    int answer = sb[0];
    answer += sb[1] << 8;
    answer += sb[2] << 16;
    answer += sb[3] << 24;
    return answer;
}

IMHO there is no endianness defined for Java. The endianness is that of the hardware, but Java is high-level and hides the hardware, so you don't have to worry about it.

The only endianness-related feature is how the Java library maps int and long to byte[] (and vice versa). It does so in big endian, which is the most readable and natural:

int i = 0xAABBCCDD;

maps to

byte[] b = {(byte) 0xAA, (byte) 0xBB, (byte) 0xCC, (byte) 0xDD};
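This mapping can be checked with java.nio.ByteBuffer, which is big endian unless told otherwise. A rough sketch, with IntToBytesDemo, toBytes and hex being names made up for the demonstration:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class IntToBytesDemo {
    // Serialize an int into 4 bytes using the given byte order.
    static byte[] toBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Format bytes as space-separated two-digit hex, e.g. "AA BB".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int i = 0xAABBCCDD;
        System.out.println(hex(toBytes(i, ByteOrder.BIG_ENDIAN)));    // AA BB CC DD
        System.out.println(hex(toBytes(i, ByteOrder.LITTLE_ENDIAN))); // DD CC BB AA
    }
}
```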