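What is the difference between Reader and InputStream?

What is the difference between Reader and InputStream, and when should I use each? If I can use a Reader to read characters, why would I use an InputStream? I guess to read objects?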


One accepts bytes and the other accepts characters.

InputStreams are used to read bytes from a stream. So they are useful for binary data such as images, video and serialized objects.

Readers on the other hand are character streams so they are best used to read character data.

An InputStream is the raw method of getting information from a resource. It grabs the data byte by byte without performing any kind of translation. If you are reading image data, or any binary file, this is the stream to use.

A Reader is designed for character streams. If the information you are reading is all text, then the Reader will take care of the character decoding for you and give you Unicode characters from the raw input stream. If you are reading any type of text, this is the stream to use.

You can wrap an InputStream and turn it into a Reader by using the InputStreamReader class.

Reader reader = new InputStreamReader(inputStream, StandardCharsets.UTF_8);
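As a minimal sketch of that wrapping in practice: the class name WrapExample and the helper readAllText are made up for illustration, and a ByteArrayInputStream stands in for whatever real stream (file, socket) you would have. The BufferedReader layer is optional and only there to reduce the number of underlying reads.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class WrapExample {
    // Hypothetical helper: decodes the whole stream as UTF-8 and returns the text.
    static String readAllText(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        // InputStreamReader does the byte-to-char decoding; try-with-resources closes both layers.
        try (Reader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) { // -1 signals end of stream
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "h\u00e9llo" is "hello" with an e-acute, which takes two bytes in UTF-8.
        byte[] raw = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        String text = readAllText(new ByteArrayInputStream(raw));
        System.out.println(raw.length + " bytes decoded into " + text.length() + " chars");
    }
}
```

Passing the charset explicitly matters: without it, InputStreamReader falls back to the platform default charset, which may not match how the bytes were written.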

I guess the source of confusion is that InputStream.read() returns an int and Reader.read() also returns an int.

The difference is that InputStream.read() returns a byte value between 0 and 255, corresponding to the raw contents of the byte stream, while Reader.read() returns a character value between 0 and 65535 (a Java char is a 16-bit UTF-16 code unit, so there are 65536 possible values). Both return -1 at the end of the stream, which is why the return type is int.

An InputStream lets you read the contents byte by byte. For example, the contents "a‡a" have 3 characters but are encoded as 5 bytes in UTF-8. So with an InputStream you can read it as a stream of 5 bytes (each one represented as an int between 0 and 255), resulting in 97, 226, 128, 161 and 97, where

a -> U+0061 -> 0x61 (hex) -> 97 (dec)
‡ -> U+2021 -> 0xE280A1 (utf-8 encoding of 0x2021) -> 226 128 161 (1 int per byte)
a -> U+0061 -> 0x61 (hex) -> 97 (dec)
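The byte-level view above can be reproduced with a short sketch. The class name ByteReadExample and the helper readBytes are illustrative only, and the string is written with a \u2021 escape so the source file's own encoding does not matter.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ByteReadExample {
    // Hypothetical helper: collects every value returned by InputStream.read() until -1.
    static List<Integer> readBytes(InputStream in) throws IOException {
        List<Integer> values = new ArrayList<>();
        int b;
        while ((b = in.read()) != -1) { // each value is in the range 0..255
            values.add(b);
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "a\u2021a".getBytes(StandardCharsets.UTF_8);
        System.out.println(readBytes(new ByteArrayInputStream(utf8)));
        // prints [97, 226, 128, 161, 97]
    }
}
```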

A Reader lets you read the contents character by character so the contents "a‡a" are read as 3 characters 97, 8225 and 97 where

a -> U+0061 -> 0x61 -> 97
‡ -> U+2021 -> 0x2021 -> 8225 (single int, not 3)
a -> U+0061 -> 0x61 -> 97

The character ‡ is referred to as U+2021 in Unicode.
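For comparison, here is a sketch of the same five UTF-8 bytes read through a Reader. CharReadExample and readChars are made-up names, and the InputStreamReader is given UTF-8 explicitly so the decoding does not depend on the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class CharReadExample {
    // Hypothetical helper: collects every value returned by Reader.read() until -1.
    static List<Integer> readChars(Reader reader) throws IOException {
        List<Integer> values = new ArrayList<>();
        int c;
        while ((c = reader.read()) != -1) { // each value is a char in the range 0..65535
            values.add(c);
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "a\u2021a".getBytes(StandardCharsets.UTF_8);
        Reader reader = new InputStreamReader(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        System.out.println(readChars(reader));
        // prints [97, 8225, 97]
    }
}
```

One caveat: a Java char is a UTF-16 code unit, so a character outside the Basic Multilingual Plane (above U+FFFF) comes back from Reader.read() as two values, a surrogate pair, rather than one.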

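An InputStream reads bytes and a Reader reads characters. In Java, a char occupies two bytes in memory (one 16-bit UTF-16 code unit), although the number of bytes a character takes on disk depends on the encoding. Buffering is not what separates the two: neither is buffered on its own, and each has a buffered wrapper (BufferedInputStream and BufferedReader). Every file stored on disk or transferred over a network is made of bytes, including images and video, whereas characters only exist in memory after decoding, so InputStream is used frequently.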