将 char []转换为 byte []

我想在 Java 中将字符数组转换为字节数组。存在哪些进行此转换的方法?

159338 次浏览
char[] ch = ?
new String(ch).getBytes();

Or, to get non-default charset:

new String(ch).getBytes("UTF-8");

Update: Since Java 7:

new String(ch).getBytes(StandardCharsets.UTF_8);

Convert without creating String object:

import java.nio.CharBuffer;
import java.nio.ByteBuffer;
import java.util.Arrays;


byte[] toBytes(char[] chars) {
CharBuffer charBuffer = CharBuffer.wrap(chars);
ByteBuffer byteBuffer = Charset.forName("UTF-8").encode(charBuffer);
byte[] bytes = Arrays.copyOfRange(byteBuffer.array(),
byteBuffer.position(), byteBuffer.limit());
Arrays.fill(byteBuffer.array(), (byte) 0); // clear sensitive data
return bytes;
}

Usage:

char[] chars = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};
byte[] bytes = toBytes(chars);
/* do something with chars/bytes */
Arrays.fill(chars, '\u0000'); // clear sensitive data
Arrays.fill(bytes, (byte) 0); // clear sensitive data

Solution is inspired from Swing recommendation to store passwords in char[]. (See Why is char[] preferred over String for passwords?)

Remember not to write sensitive data to logs and ensure that JVM won't hold any references to it.


The code above is correct but not effective. If you don't need performance but want security you can use it. If security also not a goal then do simply String.getBytes. Code above is not effective if you look down of implementation of encode in JDK. Besides you need to copy arrays and create buffers. Another way to convert is inline all code behind encode (example for UTF-8):

val xs: Array[Char] = "A ß € 嗨 𝄞 🙂".toArray
val len = xs.length
val ys: Array[Byte] = new Array(3 * len) // worst case
var i = 0; var j = 0 // i for chars; j for bytes
while (i < len) { // fill ys with bytes
val c = xs(i)
if (c < 0x80) {
ys(j) = c.toByte
i = i + 1
j = j + 1
} else if (c < 0x800) {
ys(j) = (0xc0 | (c >> 6)).toByte
ys(j + 1) = (0x80 | (c & 0x3f)).toByte
i = i + 1
j = j + 2
} else if (Character.isHighSurrogate(c)) {
if (len - i < 2) throw new Exception("overflow")
val d = xs(i + 1)
val uc: Int =
if (Character.isLowSurrogate(d)) {
Character.toCodePoint(c, d)
} else {
throw new Exception("malformed")
}
ys(j) = (0xf0 | ((uc >> 18))).toByte
ys(j + 1) = (0x80 | ((uc >> 12) & 0x3f)).toByte
ys(j + 2) = (0x80 | ((uc >>  6) & 0x3f)).toByte
ys(j + 3) = (0x80 | (uc & 0x3f)).toByte
i = i + 2 // 2 chars
j = j + 4
} else if (Character.isLowSurrogate(c)) {
throw new Exception("malformed")
} else {
ys(j) = (0xe0 | (c >> 12)).toByte
ys(j + 1) = (0x80 | ((c >> 6) & 0x3f)).toByte
ys(j + 2) = (0x80 | (c & 0x3f)).toByte
i = i + 1
j = j + 3
}
}
// check
println(new String(ys, 0, j, "UTF-8"))

Excuse me for using Scala language. If you have problems with converting this code to Java I can rewrite it. What about performance always check on real data (with JMH for example). This code looks very similar to what you can see in JDK[2] and Protobuf[3].

Edit: Andrey's answer has been updated so the following no longer applies.

Andrey's answer (the highest voted at the time of writing) is slightly incorrect. I would have added this as comment but I am not reputable enough.

In Andrey's answer:

char[] chars = {'c', 'h', 'a', 'r', 's'}
byte[] bytes = Charset.forName("UTF-8").encode(CharBuffer.wrap(chars)).array();

the call to array() may not return the desired value, for example:

char[] c = "aaaaaaaaaa".toCharArray();
System.out.println(Arrays.toString(Charset.forName("UTF-8").encode(CharBuffer.wrap(c)).array()));

output:

[97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 0]

As can be seen a zero byte has been added. To avoid this use the following:

char[] c = "aaaaaaaaaa".toCharArray();
ByteBuffer bb = Charset.forName("UTF-8").encode(CharBuffer.wrap(c));
byte[] b = new byte[bb.remaining()];
bb.get(b);
System.out.println(Arrays.toString(b));

output:

[97, 97, 97, 97, 97, 97, 97, 97, 97, 97]

As the answer also alluded to using passwords it might be worth blanking out the array that backs the ByteBuffer (accessed via the array() function):

ByteBuffer bb = Charset.forName("UTF-8").encode(CharBuffer.wrap(c));
byte[] b = new byte[bb.remaining()];
bb.get(b);
blankOutByteArray(bb.array());
System.out.println(Arrays.toString(b));

You could make a method:

public byte[] toBytes(char[] data) {
byte[] toRet = new byte[data.length];
for(int i = 0; i < toRet.length; i++) {
toRet[i] = (byte) data[i];
}
return toRet;
}

Hope this helps

private static byte[] charArrayToByteArray(char[] c_array) {
byte[] b_array = new byte[c_array.length];
for(int i= 0; i < c_array.length; i++) {
b_array[i] = (byte)(0xFF & (int)c_array[i]);
}
return b_array;
}

If you just want to convert the data container (the array) type itself, only regarding the data size and being agnostic to any encoding:

// original byte[]
byte[] pattern = null;
char[] arr = new char[pattern.length * 2];
ByteBuffer wrapper = ByteBuffer.wrap(pattern);
wrapper.position(0);
int i = 0;
while(wrapper.hasRemaining()) {
char character = wrapper.remaining() < 2 ? ((char) (((int) wrapper.get()) << 8)) : wrapper.getChar();
arr[i++] = character;
}