How do I get a java.io.InputStream from a java.lang.String?

I have a String that I want to use as an InputStream. In Java 1.0, you could use java.io.StringBufferInputStream, but that has been @Deprecated (with good reason: you cannot specify the character set encoding):

This class does not properly convert characters into bytes. As of JDK 1.1, the preferred way to create a stream from a string is via the StringReader class.

You can create a java.io.Reader with java.io.StringReader, but there are no adapters to take a Reader and create an InputStream.

I found an ancient bug report asking for a suitable replacement, but no such thing exists, as far as I can tell.

The oft-suggested workaround is to use java.lang.String.getBytes() as input to java.io.ByteArrayInputStream:

public InputStream createInputStream(String s, String charset)
        throws java.io.UnsupportedEncodingException {
    return new ByteArrayInputStream(s.getBytes(charset));
}

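But that means materializing the entire String in memory as an array of bytes, which defeats the purpose of a stream. In most cases this is not a big deal, but I am looking for something that preserves the intent of a stream: as little of the data as possible should be (re)materialized in memory.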


Well, one possible way is to:

  • Create a PipedOutputStream
  • Pipe it to a PipedInputStream
  • Wrap an OutputStreamWriter around the PipedOutputStream (you can specify the encoding in the constructor)
  • Et voilà, anything you write to the OutputStreamWriter can be read from the PipedInputStream!

Of course, this seems like a rather hackish way to do it, but at least it is a way.
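
The steps above could look roughly like the following sketch. It assumes the writing side runs on its own thread, because a piped stream has a small fixed buffer and a single thread that both writes and reads would block once that buffer fills. The class name PipedStringStream and the method open are made up for illustration, and readAllBytes needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class PipedStringStream {

    /** Returns a stream over the encoded bytes of text; a background thread feeds the pipe. */
    static InputStream open(String text, Charset charset) throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        Thread producer = new Thread(() -> {
            // Closing the writer closes the pipe, which signals end-of-stream to the reader
            try (Writer writer = new OutputStreamWriter(out, charset)) {
                writer.write(text);
            } catch (IOException e) {
                // The reader went away early; there is nothing left to deliver
            }
        });
        producer.setDaemon(true);
        producer.start();
        return in;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = open("sample text", StandardCharsets.UTF_8)) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

The cost of this approach is one extra thread per stream, which is part of why it feels hackish.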

A solution is to roll your own, creating an InputStream implementation that likely would use java.nio.charset.CharsetEncoder to encode each char or chunk of chars to an array of bytes for the InputStream as necessary.
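
As a rough illustration of that idea, here is a minimal sketch that encodes the text lazily into a small fixed byte buffer. The class name EncodingInputStream, the 1024-byte buffer size, and the choice to replace malformed or unmappable characters are all assumptions made for the example, not anything prescribed by the JDK. It uses Objects.checkFromIndexSize and readAllBytes, so it needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Objects;

class EncodingInputStream extends InputStream {
    private final CharBuffer chars;          // read-only view of the text, not a copy
    private final CharsetEncoder encoder;
    private final ByteBuffer bytes = ByteBuffer.allocate(1024); // assumed buffer size
    private boolean endOfInput;              // all characters have been encoded
    private boolean flushed;                 // encoder.flush() has completed

    EncodingInputStream(CharSequence text, Charset charset) {
        chars = CharBuffer.wrap(text);
        encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        bytes.flip(); // start with an empty (fully consumed) buffer
    }

    /** Refills the byte buffer; returns false once every byte has been handed out. */
    private boolean fill() {
        bytes.clear();
        while (bytes.position() == 0 && !flushed) {
            if (!endOfInput) {
                CoderResult result = encoder.encode(chars, bytes, true);
                if (result.isUnderflow()) endOfInput = true;
            } else {
                CoderResult result = encoder.flush(bytes);
                if (result.isUnderflow()) flushed = true;
            }
        }
        bytes.flip();
        return bytes.hasRemaining();
    }

    @Override
    public int read() {
        if (!bytes.hasRemaining() && !fill()) return -1;
        return bytes.get() & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) {
        Objects.checkFromIndexSize(off, len, b.length);
        if (len == 0) return 0;
        if (!bytes.hasRemaining() && !fill()) return -1;
        int n = Math.min(len, bytes.remaining());
        bytes.get(b, off, n);
        return n;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new EncodingInputStream("sample text", StandardCharsets.UTF_16)) {
            System.out.println(in.readAllBytes().length + " bytes");
        }
    }
}
```

Only the 1024-byte buffer is held in addition to the original String, regardless of how long the text is.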

To my mind, the easiest way to do this is by pushing the data through a Writer:

import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class StringEmitter {
    public static void main(String[] args) throws IOException {
        // Sink that only reports how many bytes arrive in each write call
        class DataHandler extends OutputStream {
            @Override
            public void write(final int b) throws IOException {
                write(new byte[] { (byte) b });
            }
            @Override
            public void write(byte[] b) throws IOException {
                write(b, 0, b.length);
            }
            @Override
            public void write(byte[] b, int off, int len)
                    throws IOException {
                System.out.println("bytecount=" + len);
            }
        }

        // Build a sample string of at least 100,000 characters
        StringBuilder sample = new StringBuilder();
        while (sample.length() < 100 * 1000) {
            sample.append("sample");
        }

        // The writer encodes the characters and pushes the bytes to the handler
        Writer writer = new OutputStreamWriter(
                new DataHandler(), "UTF-16");
        writer.write(sample.toString());
        writer.close();
    }
}

The JVM implementation I'm using pushed data through in 8K chunks, but you could have some effect on the buffer size by reducing the number of characters written at one time and calling flush.
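
To make the point about smaller writes and flush concrete, here is a hedged sketch that writes the text in slices and flushes after each one. The names ChunkedEmitter, emit, and RecordingSink are invented for the example, and the exact size of each write the sink sees depends on the JVM's OutputStreamWriter implementation.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ChunkedEmitter {

    /** Sink that records the size of every write call it receives. */
    static final class RecordingSink extends OutputStream {
        final List<Integer> sizes = new ArrayList<>();

        @Override
        public void write(int b) {
            sizes.add(1);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            sizes.add(len);
        }
    }

    /** Encodes text in slices of chunkChars characters, flushing after each slice. */
    static void emit(String text, Charset charset, int chunkChars, OutputStream sink)
            throws IOException {
        try (Writer writer = new OutputStreamWriter(sink, charset)) {
            for (int start = 0; start < text.length(); start += chunkChars) {
                int end = Math.min(text.length(), start + chunkChars);
                writer.write(text, start, end - start);
                writer.flush(); // push the encoded bytes to the sink now
            }
        }
    }

    public static void main(String[] args) throws IOException {
        RecordingSink sink = new RecordingSink();
        emit("sample".repeat(1000), StandardCharsets.UTF_16, 500, sink);
        System.out.println("write sizes: " + sink.sizes);
    }
}
```

Smaller slices mean more write calls, so the slice size trades memory for call overhead.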


An alternative to writing your own CharsetEncoder wrapper is to use a Writer to encode the data, though it is something of a pain to do right. This should be a reliable (if inefficient) implementation:

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.util.LinkedList;
import java.util.Queue;

/** Inefficient string stream implementation */
public class StringInputStream extends InputStream {

    /* # of characters to buffer - must be >=2 to handle surrogate pairs */
    private static final int CHAR_CAP = 8;

    // Encoded bytes waiting to be handed out by read()
    private final Queue<Byte> buffer = new LinkedList<Byte>();
    private final Writer encoder;
    private final String data;
    private int index;

    public StringInputStream(String sequence, Charset charset) {
        data = sequence;
        encoder = new OutputStreamWriter(
                new OutputStreamBuffer(), charset);
    }

    // Encodes the next CHAR_CAP characters into the byte queue
    private int buffer() throws IOException {
        if (index >= data.length()) {
            return -1;
        }
        int rlen = index + CHAR_CAP;
        if (rlen > data.length()) {
            rlen = data.length();
        }
        for (; index < rlen; index++) {
            char ch = data.charAt(index);
            encoder.append(ch);
            // ensure data enters buffer
            encoder.flush();
        }
        if (index >= data.length()) {
            encoder.close();
        }
        return buffer.size();
    }

    @Override
    public int read() throws IOException {
        if (buffer.size() == 0) {
            int r = buffer();
            if (r == -1) {
                return -1;
            }
        }
        return 0xFF & buffer.remove();
    }

    // Receives the bytes produced by the encoder
    private class OutputStreamBuffer extends OutputStream {

        @Override
        public void write(int i) throws IOException {
            byte b = (byte) i;
            buffer.add(b);
        }

    }

}

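If you don't mind a dependency on the commons-io package, then you could use the IOUtils.toInputStream(String text) method. That single-argument version uses the platform default encoding, so the overload that also takes a Charset is the safer choice.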

Update: This answer is precisely what the OP doesn't want. Please read the other answers.

For those cases when we don't care about the data being re-materialized in memory, please use:

new ByteArrayInputStream(str.getBytes("UTF-8"))
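
On Java 7 or later, a small variation avoids the checked UnsupportedEncodingException by passing a StandardCharsets constant instead of a charset name. The sketch below is only an illustration; the class name InMemoryStream and the method of are invented here, and readAllBytes needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class InMemoryStream {

    /** Wraps the UTF-8 bytes of text; the whole byte array is held in memory. */
    static InputStream of(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = of("sample text")) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```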

I know this is an old question but I had the same problem myself today, and this was my solution:

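public static InputStream getStream(final CharSequence charSequence) {
    return new InputStream() {
        int index = 0;
        int length = charSequence.length();
        // Caveat: no charset encoding is applied. read() is supposed to return
        // a value in 0-255, so this is only correct for characters in the
        // ISO-8859-1 range.
        @Override public int read() throws IOException {
            return index >= length ? -1 : charSequence.charAt(index++);
        }
    };
}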

There is an adapter from Apache Commons-IO which adapts from Reader to InputStream, which is named ReaderInputStream.

Example code:

@Test
public void testReaderInputStream() throws IOException {
    InputStream inputStream = new ReaderInputStream(new StringReader("largeString"), StandardCharsets.UTF_8);
    Assert.assertEquals("largeString", IOUtils.toString(inputStream, StandardCharsets.UTF_8));
}

Reference: https://stackoverflow.com/a/27909221/5658642

You can use the StringInputStream class from the org.hsqldb.lib package. Its constructor looks like this:

public StringInputStream(String paramString)
{
    this.str = paramString;
    this.available = (paramString.length() * 2);
}