Use the getBytes method, passing it the appropriate Charset (or charset name).
Example:
String s = "Hello, there.";
byte[] b = s.getBytes(StandardCharsets.US_ASCII);
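For reference, here is a minimal self-contained sketch of the call above. The class name AsciiBytesDemo and the helper toAscii are made up for illustration. Note that getBytes(Charset) does not fail on characters outside the charset: it silently substitutes the charset's replacement byte, which for US-ASCII is '?'.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; only String.getBytes and StandardCharsets are JDK APIs.
class AsciiBytesDemo {

    // Encodes s as US-ASCII; characters that cannot be mapped become '?' (0x3F).
    static byte[] toAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toAscii("Hi!")));    // [72, 105, 33]
        System.out.println(Arrays.toString(toAscii("\u00e9"))); // [63], i.e. '?'
    }
}
```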
If more control is required (such as throwing an exception when a character outside 7-bit US-ASCII is encountered), a CharsetEncoder can be used:
private static byte[] strictStringToBytes(String s, Charset charset) throws CharacterCodingException {
    // REPORT makes the encoder throw instead of substituting a replacement byte.
    ByteBuffer x = charset.newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT)
            .encode(CharBuffer.wrap(s));
    byte[] b = new byte[x.remaining()];
    x.get(b);
    return b;
}
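A short usage sketch may help show what the strict variant buys you. The class StrictEncodeDemo and the helper isPureAscii are hypothetical names added for illustration; the encoder calls are standard java.nio.charset API. An unmappable character surfaces as a CharacterCodingException instead of a silent '?'.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictEncodeDemo {

    // Same idea as the method above: report, rather than replace, anything that cannot be encoded.
    static byte[] strictStringToBytes(String s, Charset charset) throws CharacterCodingException {
        ByteBuffer x = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(s));
        byte[] b = new byte[x.remaining()];
        x.get(b);
        return b;
    }

    // Hypothetical helper: true if s can be encoded as US-ASCII without loss.
    static boolean isPureAscii(String s) {
        try {
            strictStringToBytes(s, StandardCharsets.US_ASCII);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isPureAscii("Hello, there."));    // true
        System.out.println(isPureAscii("Gr\u00fc\u00dfe"));  // false: u-umlaut and sharp s are not ASCII
    }
}
```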
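Before Java 7 you can use byte[] b = s.getBytes("US-ASCII"); instead, which declares the checked UnsupportedEncodingException. The StandardCharsets class (a final class, not an enum) was introduced in Java 7; the getBytes(Charset) overload has been available since Java 6 and CharsetEncoder since Java 1.4.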
Notice the upper-case "String". Writing String.getBytes(...) tries to invoke a static method on the String class, and no such static method exists. Instead, invoke the method on your string instance, for example s.getBytes("US-ASCII").
The problem with other proposed solutions is that they will either drop characters that cannot be directly mapped to ASCII, or replace them with a marker character like ?.
You might want, for example, accented characters converted to the same character without the accent. There are a couple of tricks for this (including building a static mapping table yourself or leveraging the existing normalization forms defined for Unicode), but those methods are far from complete.
Your best bet is the junidecode library, which cannot be complete either but incorporates a lot of experience about the sanest way of transliterating Unicode to ASCII.
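To make the Unicode normalization trick mentioned above concrete, here is a rough sketch using java.text.Normalizer from the JDK. The class and method names are made up for illustration. It also shows why the approach is incomplete: characters without a canonical decomposition, such as the German sharp s, pass through unchanged and would still not be ASCII.

```java
import java.text.Normalizer;

class StripAccentsDemo {

    // Decompose accented characters (NFD), then drop the combining marks.
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "Creme brulee": each accented letter splits into base letter + mark.
        System.out.println(stripAccents("Cr\u00e8me br\u00fbl\u00e9e"));
        // Sharp s (U+00DF) has no decomposition, so it is returned unchanged.
        System.out.println(stripAccents("Stra\u00dfe").equals("Stra\u00dfe")); // true
    }
}
```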
My string contains Thai characters (TIS-620 encoded) and German umlauts. The answer from agiles put me on the right path. Instead of .getBytes() I now use:
int len = mString.length(); // Length of the string
byte[] dataset = new byte[len];
for (int i = 0; i < len; ++i) {
    char c = mString.charAt(i);
    // The cast keeps only the low 8 bits of each char.
    dataset[i] = (byte) c;
}
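A small sketch of what the cast in this loop does, under the assumption that the string holds one byte value per char. The class and method names are hypothetical. For characters up to U+00FF the cast gives the same bytes as encoding with ISO-8859-1; for anything above that the high bits are simply discarded, so real Thai code points would not survive.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CastToByteDemo {

    // The loop from the answer above: keep only the low 8 bits of each char.
    static byte[] castChars(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        return out;
    }

    public static void main(String[] args) {
        String latin = "\u00e4\u00f6\u00fc"; // three umlauts, all below U+0100
        byte[] viaCharset = latin.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(castChars(latin), viaCharset)); // true

        String thai = "\u0e01"; // THAI CHARACTER KO KAI, above U+00FF
        System.out.println(castChars(thai)[0]); // 1: only the low byte 0x01 is left
    }
}
```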