Write a file in UTF-8 using FileWriter (Java)?

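I have the following code, but I want it to write the output as a UTF-8 file so that it can handle foreign characters. Is there a way of doing this? Is there a parameter that I need to pass?

I would really appreciate your help with this. Thanks.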

try {
    BufferedReader reader = new BufferedReader(new FileReader("C:/Users/Jess/My Documents/actresses.list"));
    writer = new BufferedWriter(new FileWriter("C:/Users/Jess/My Documents/actressesFormatted.csv"));
    while( (line = reader.readLine()) != null) {
        //If the line starts with a tab then we just want to add a movie
        //using the current actor's name.
        if(line.length() == 0)
            continue;
        else if(line.charAt(0) == '\t') {
            readMovieLine2(0, line, surname.toString(), forename.toString());
        } //Else we've reached a new actor
        else {
            readActorName(line);
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}

You need to pass an OutputStreamWriter as the writer argument to your BufferedWriter. Its constructors accept an encoding; see its Javadoc for details.

Somewhat like this:

BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
    new FileOutputStream("jedis.txt"), "UTF-8"
));

Or you can set the current system encoding with the system property file.encoding to UTF-8.

java -Dfile.encoding=UTF-8 com.jediacademy.Runner arg1 arg2 ...

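You may also try setting it as a system property at runtime with System.setProperty(...) if you only need it for this specific file, but the JVM usually reads file.encoding once at startup and caches the default charset, so changing it later may have no effect. In a case like this I would prefer the OutputStreamWriter.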

By setting the system property on the command line you can use FileWriter and expect that it will use UTF-8 as the default encoding, in this case for all the files that you read and write.

EDIT

  • Starting from Java 7 (Android API 19), you can replace the String "UTF-8" with StandardCharsets.UTF_8

  • As suggested in the comments below by tchrist, if you intend to detect encoding errors in your file you would be forced to use the OutputStreamWriter approach and use the constructor that receives a charset encoder.

    Somewhat like

    CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
    encoder.onMalformedInput(CodingErrorAction.REPORT);
    encoder.onUnmappableCharacter(CodingErrorAction.REPORT);
    BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("jedis.txt"),encoder));
    

    You may choose between actions IGNORE | REPLACE | REPORT
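To make the difference between REPLACE and REPORT concrete, here is a minimal sketch. The class and method names (StrictEncoderDemo, encodesCleanly, encodeLeniently) are made up for illustration, and the output goes to an in-memory stream rather than a file so that it runs anywhere. It assumes that an unpaired surrogate is what you want to catch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original answer.
class StrictEncoderDemo {

    // Returns true if the text encodes to UTF-8 without a reported error.
    static boolean encodesCleanly(String text) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try (Writer out = new OutputStreamWriter(new ByteArrayOutputStream(), encoder)) {
            out.write(text);
            out.flush();
            return true;
        } catch (CharacterCodingException e) {
            // Thrown by write() or close() when the encoder reports a problem.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encodes with the charset-only constructor, which replaces bad input silently.
    static String encodeLeniently(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String broken = "a\uD800b"; // unpaired high surrogate in the middle
        System.out.println(encodesCleanly("caf\u00E9"));
        System.out.println(encodesCleanly(broken));
        System.out.println(encodeLeniently(broken));
    }
}
```

With the charset-only constructor the bad character is quietly swapped for a substitute, so the file is written but no longer matches the string you passed in.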

Also, this question was already answered here.

Ditch FileWriter and FileReader, which are useless exactly because (before Java 11) they do not allow you to specify the encoding. Instead, use

new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)

and

new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8);
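Put together, the two constructors above might be used like this. This is only a sketch: the class name, the helper names and the temporary file are assumptions for the demo, not something from the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original answer.
class Utf8FileRoundTrip {

    // Writes text to the file as UTF-8, regardless of the platform default.
    static void writeUtf8(File file, String text) throws IOException {
        try (Writer out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            out.write(text);
        }
    }

    // Reads the whole file back, decoding it as UTF-8.
    static String readUtf8(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Writes to a temporary file and reads the content back.
    static String roundTrip(String text) {
        try {
            File tmp = File.createTempFile("utf8-demo", ".txt");
            tmp.deleteOnExit();
            writeUtf8(tmp, text);
            return readUtf8(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source independent of the compiler's -encoding.
        String name = "Zo\u00EB Salda\u00F1a";
        System.out.println(name.equals(roundTrip(name)));
    }
}
```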

Safe Encoding Constructors

Getting Java to properly notify you of encoding errors is tricky. You must use the most verbose and, alas, the least used of the four alternate constructors for each of InputStreamReader and OutputStreamWriter to receive a proper exception on an encoding glitch.

For file I/O, always pass the fancy coder argument as the second argument: a CharsetEncoder for OutputStreamWriter, and the matching CharsetDecoder (from newDecoder()) for InputStreamReader. For the writer, that is:

  Charset.forName("UTF-8").newEncoder()

There are other even fancier possibilities, but none of the three simpler possibilities work for exception handling. These do:

OutputStreamWriter char_output = new OutputStreamWriter(
    new FileOutputStream("some_output.utf8"),
    Charset.forName("UTF-8").newEncoder()
);

InputStreamReader char_input = new InputStreamReader(
    new FileInputStream("some_input.utf8"),
    Charset.forName("UTF-8").newDecoder()
);

As for running with

 $ java -Dfile.encoding=utf8 SomeTrulyRemarkablyLongcLassNameGoeShere

The problem is that that will not use the full encoder argument form for the character streams, and so you will again miss encoding problems.
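The same effect can be seen on the reading side. The sketch below feeds three bytes, one of which (0xFF) can never appear in UTF-8, through both constructor forms. The class and method names are hypothetical, and an in-memory stream stands in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original answer.
class StrictDecoderDemo {

    private static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Charset-only constructor: malformed bytes are replaced, no exception.
    static String decodeLeniently(byte[] bytes) {
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            return readAll(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decoder constructor: a fresh decoder reports malformed input by default.
    static boolean decodesCleanly(byte[] bytes) {
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8.newDecoder())) {
            readAll(in);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bad = {'f', (byte) 0xFF, 'o'};
        System.out.println(decodesCleanly(bad));
        // The lenient reader substitutes U+FFFD for the bad byte.
        System.out.println(decodeLeniently(bad).equals("f\uFFFDo"));
    }
}
```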

Longer Example

Here’s a longer example, this one managing a process instead of a file, where we promote two different input byte streams and one output byte stream all to UTF-8 character streams with full exception handling:

// this runs a perl script with UTF-8 STD{IN,OUT,ERR} streams
Process
    slave_process = Runtime.getRuntime().exec("perl -CS script args");

// fetch his stdin byte stream...
OutputStream
    __bytes_into_his_stdin  = slave_process.getOutputStream();

// and make a character stream with exceptions on encoding errors
OutputStreamWriter
    chars_into_his_stdin  = new OutputStreamWriter(
        __bytes_into_his_stdin,
        /* DO NOT OMIT! */  Charset.forName("UTF-8").newEncoder()
    );

// fetch his stdout byte stream...
InputStream
    __bytes_from_his_stdout = slave_process.getInputStream();

// and make a character stream with exceptions on encoding errors
InputStreamReader
    chars_from_his_stdout = new InputStreamReader(
        __bytes_from_his_stdout,
        /* DO NOT OMIT! */  Charset.forName("UTF-8").newDecoder()
    );

// fetch his stderr byte stream...
InputStream
    __bytes_from_his_stderr = slave_process.getErrorStream();

// and make a character stream with exceptions on encoding errors
InputStreamReader
    chars_from_his_stderr = new InputStreamReader(
        __bytes_from_his_stderr,
        /* DO NOT OMIT! */  Charset.forName("UTF-8").newDecoder()
    );

Now you have three character streams that all raise exceptions on encoding errors, respectively called chars_into_his_stdin, chars_from_his_stdout, and chars_from_his_stderr.

This is only slightly more complicated than what you need for your problem, whose solution I gave in the first half of this answer. The key point is that this is the only way to detect encoding errors.

Just don’t get me started about PrintStreams eating exceptions.

With Chinese text, I tried using the UTF-16 charset and luckily it worked.

Hope this could help!

PrintWriter out = new PrintWriter( file, "UTF-16" );
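For comparison, the same PrintWriter constructor also accepts "UTF-8", which is what the question asks for, and UTF-8 can represent Chinese text as well. A small sketch, assuming a temporary file and a made-up class name; since PrintWriter does not throw on write failures, it checks checkError() explicitly.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; not part of the original answer.
class PrintWriterUtf8Demo {

    // Writes text with PrintWriter as UTF-8, then decodes the raw file bytes.
    static String writeAndReadBack(String text) {
        try {
            File file = File.createTempFile("printwriter-utf8", ".txt");
            file.deleteOnExit();
            try (PrintWriter out = new PrintWriter(file, "UTF-8")) {
                out.print(text);
                // PrintWriter swallows IOExceptions; checkError() flushes and reports them.
                if (out.checkError()) {
                    throw new IOException("write failed");
                }
            }
            byte[] bytes = Files.readAllBytes(file.toPath());
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String chinese = "\u4E2D\u6587\u5B57"; // three Chinese characters
        System.out.println(chinese.equals(writeAndReadBack(chinese)));
    }
}
```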

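In my opinion, if you want to write UTF-8 output, you can encode the text into a byte array yourself. Pass the charset to getBytes explicitly, because a bare getBytes() uses the platform default encoding: byte[] by = ("<?xml version=\"1.0\" encoding=\"utf-8\"?>" + "Your string").getBytes(StandardCharsets.UTF_8);

Then you can write the bytes into the file you created. Example:

// Encode explicitly as UTF-8 instead of relying on the platform default.
byte[] by = ("<?xml version=\"1.0\" encoding=\"utf-8\"?>" + "Your string").getBytes(StandardCharsets.UTF_8);
// try-with-resources closes the stream even if the write fails.
try (OutputStream f = new FileOutputStream(xmlfile)) {
    f.write(by); // writes the whole array in one call
}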

Since Java 7 there is an easy way to handle the character encoding of a BufferedWriter or BufferedReader. Instead of wrapping several Writer instances by hand, you can create a BufferedWriter that uses a given character encoding directly through the Files class:

Files.newBufferedWriter(file.toPath(), StandardCharsets.UTF_8);
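A slightly fuller sketch of that approach, paired with Files.newBufferedReader for reading the lines back. The class name, method name and temporary file are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of the original answer.
class NioUtf8Demo {

    // Writes the lines as UTF-8 to a temporary file and reads them back.
    static List<String> writeThenRead(List<String> lines) {
        try {
            Path tmp = Files.createTempFile("nio-utf8", ".csv");
            try (BufferedWriter writer = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                for (String line : lines) {
                    writer.write(line);
                    writer.newLine();
                }
            }
            List<String> back = new ArrayList<>();
            try (BufferedReader reader = Files.newBufferedReader(tmp, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    back.add(line);
                }
            }
            Files.delete(tmp);
            return back;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("Zo\u00EB,Avatar", "\u65E5\u672C\u8A9E,Test");
        System.out.println(rows.equals(writeThenRead(rows)));
    }
}
```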

You can find more about it in the Javadoc for java.nio.file.Files.

Since Java 11 you can do:

FileWriter fw = new FileWriter("filename.txt", Charset.forName("utf-8"));

OK it's 2019 now, and from Java 11 you have a constructor with Charset:

FileWriter(String fileName, Charset charset)

Unfortunately, we still cannot modify the byte buffer size, and it's set to 8192. (https://www.baeldung.com/java-filewriter)
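A short sketch of the Java 11 constructor in use, assuming Java 11 or later, a temporary file, and a made-up class name. The BufferedWriter size used here is an arbitrary value for the character buffer in front of the FileWriter; it does not change FileWriter's own internal byte buffer.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; requires Java 11 or later.
class FileWriterCharsetDemo {

    // Writes text through FileWriter(File, Charset) and returns the raw file bytes.
    static byte[] writeAndGetBytes(String text) {
        try {
            File file = File.createTempFile("filewriter-utf8", ".txt");
            file.deleteOnExit();
            // 64 * 1024 is an arbitrary char-buffer size for the BufferedWriter layer only.
            try (BufferedWriter out = new BufferedWriter(
                    new FileWriter(file, StandardCharsets.UTF_8), 64 * 1024)) {
                out.write(text);
            }
            return Files.readAllBytes(file.toPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+00E9 takes two bytes in UTF-8, so four characters become five bytes.
        System.out.println(writeAndGetBytes("caf\u00E9").length);
    }
}
```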

Use an OutputStreamWriter wrapped around a FileOutputStream instead of FileWriter to set the encoding:

// file is the File object you want to write your data to
OutputStream outputStream = new FileOutputStream(file);
OutputStreamWriter outputStreamWriter = new OutputStreamWriter(outputStream, "UTF-8");
outputStreamWriter.write(json); // json is your data
outputStreamWriter.flush();
outputStreamWriter.close();