The most efficient way to create an InputStream from an OutputStream

Translation: 奇芳

http://blog.ostermiller.org/convert-java-outputstream-inputstream describes how to create an InputStream from an OutputStream:

new ByteArrayInputStream(out.toByteArray())

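The other option is to use PipedStreams and a new thread, which is cumbersome.

I don't like the idea of copying many megabytes into a new in-memory byte array. Is there a library that does this more efficiently?

EDIT:

Following Laurence Gonsalves's advice, I tried PipedStreams, and it turned out they are not that hard to deal with. Here is the sample code in Clojure: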

(defn #^PipedInputStream create-pdf-stream [pdf-info]
  (let [in-stream (new PipedInputStream)
        out-stream (PipedOutputStream. in-stream)]
    (.start (Thread. (fn []
                       ;; Here you write into out-stream
                       )))
    in-stream))

If you don't want to copy all of the data into an in-memory buffer at once, then the code that uses the OutputStream (the producer) and the code that uses the InputStream (the consumer) must either alternate in the same thread or run concurrently in two separate threads. Running them in the same thread is probably much more complicated than using two threads and is much more error-prone (you need to make sure that the consumer never blocks waiting for input, or you'll effectively deadlock). It would also require the producer and consumer to run in the same loop, which seems far too tightly coupled.

So use a second thread. It really isn't that complicated. The page you linked to had a reasonable example. Here's a somewhat modernized version, which also closes the streams:

try (PipedInputStream in = new PipedInputStream()) {
    new Thread(() -> {
        try (PipedOutputStream out = new PipedOutputStream(in)) {
            writeDataToOutputStream(out);
        } catch (IOException iox) {
            // handle IOExceptions
        }
    }).start();
    processDataFromInputStream(in);
}
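For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The class name PipeRoundTrip and the bodies of the two helper methods are placeholders of mine, not part of the snippet above. It also connects the pipe before starting the thread, so the first read does not depend on the producer thread having already connected it. It assumes Java 9 or later (readAllBytes and try-with-resources on an existing variable).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

class PipeRoundTrip {

    // Placeholder producer: stands in for any code that writes to an OutputStream.
    static void writeDataToOutputStream(OutputStream out, String text) throws IOException {
        out.write(text.getBytes(StandardCharsets.UTF_8));
    }

    // Placeholder consumer: stands in for any code that reads from an InputStream.
    static String processDataFromInputStream(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    static String roundTrip(String text) throws IOException {
        try (PipedInputStream in = new PipedInputStream()) {
            // Connect both ends up front, before the producer thread starts.
            PipedOutputStream out = new PipedOutputStream(in);
            Thread producer = new Thread(() -> {
                // Closing 'out' is what signals end-of-stream to the reader.
                try (out) {
                    writeDataToOutputStream(out, text);
                } catch (IOException iox) {
                    // handle IOExceptions (here the reader simply sees a short stream)
                }
            });
            producer.start();
            return processDataFromInputStream(in);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello, pipe"));
    }
}
```

Only the pipe's internal buffer sits between the two sides, so the full payload is never held in a second byte array.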

There is another open source library called EasyStream that deals with pipes and threads in a transparent way. Using pipes isn't really complicated if everything goes well. Problems arise when (looking at Laurence Gonsalves's example)

class1.putDataOnOutputStream(out);

throws an exception. In that example the thread simply completes and the exception is lost, while the outer InputStream might be truncated.
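To make the lost-exception problem concrete, here is a rough sketch of one way to carry a producer failure over to the consumer side using only java.io. This is not how EasyStream does it (I'm not claiming anything about its internals); the class PropagatingPipe and the two callback interfaces are hypothetical names invented for this illustration. It assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.atomic.AtomicReference;

class PropagatingPipe {

    // Hypothetical callback types for the two ends of the pipe.
    interface PipeProducer { void writeTo(OutputStream out) throws IOException; }
    interface PipeConsumer<T> { T readFrom(InputStream in) throws IOException; }

    static <T> T pipe(PipeProducer producer, PipeConsumer<T> consumer)
            throws IOException, InterruptedException {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);

        Thread worker = new Thread(() -> {
            try (out) {
                producer.writeTo(out);
            } catch (Throwable t) {
                // Remember the failure instead of letting it die with the thread.
                failure.set(t);
            }
        });
        worker.start();

        T result;
        // Closing the read end also unblocks a producer that is still writing.
        try (in) {
            result = consumer.readFrom(in);
        }
        worker.join();

        Throwable t = failure.get();
        if (t != null) {
            // The consumer may have seen truncated data, so discard its result.
            throw new IOException("producer failed; data may be truncated", t);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = pipe(out -> out.write(new byte[] {1, 2, 3}), InputStream::readAllBytes);
        System.out.println(data.length);
    }
}
```

One consequence of this design: if the consumer stops reading early, the producer fails with a pipe-closed IOException and the call as a whole is reported as failed, which may or may not be what you want.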

EasyStream deals with exception propagation and other nasty problems that I've been debugging for about one year. (I'm the maintainer of the library: obviously my solution is the best one ;) ) Here is an example of how to use it:

final InputStreamFromOutputStream<String> isos = new InputStreamFromOutputStream<String>() {
    @Override
    public String produce(final OutputStream dataSink) throws Exception {
        /*
         * Call your application function that produces the data here.
         * WARNING: we're in another thread here, so this method shouldn't
         * write any class field or make assumptions on the state of the outer class.
         */
        return produceMydata(dataSink);
    }
};

There is also a nice introduction that explains all the other ways to convert an OutputStream into an InputStream. It is worth a look.

I think the best way to connect an InputStream to an OutputStream is through piped streams, available in the java.io package, as follows:

// 1 - Define the stream buffer size
private static final int PIPE_BUFFER = 2048;

// 2 - Create a PipedInputStream with that buffer size
public PipedInputStream inPipe = new PipedInputStream(PIPE_BUFFER);

// 3 - Create a PipedOutputStream and bind it to the PipedInputStream object
public PipedOutputStream outPipe = new PipedOutputStream(inPipe);

// 4 - PipedOutputStream is an OutputStream, so you can write data to it
// in any way suitable for your data. For example:
while (Condition) {
    outPipe.write(mByte);
}

/* Congratulations :D. Step 4 writes data to the PipedOutputStream,
   which is bound to the PipedInputStream, so the written data becomes
   available in the inPipe object. Start reading from it to clear the
   buffer so that the PipedOutputStream can fill it again. */

In my opinion, this code has two main advantages:

1 - There is no additional memory consumption except for the buffer.

2 - You don't need to handle data queuing manually.
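As a rough illustration of the first point, the sketch below pushes an arbitrary number of bytes through a 2048-byte pipe and counts them on the reading side. The class name BoundedPipeDemo, the chunk sizes, and the stream method are my own assumptions for the demo. Note that the write loop runs on its own thread: a write blocks once the pipe buffer is full, and the Javadoc for the piped streams advises against using both ends from a single thread because it may deadlock. It assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

class BoundedPipeDemo {

    static final int PIPE_BUFFER = 2048;

    // Streams totalBytes zero bytes through the pipe and returns how many arrived.
    static long stream(int totalBytes) throws IOException, InterruptedException {
        PipedInputStream inPipe = new PipedInputStream(PIPE_BUFFER);
        PipedOutputStream outPipe = new PipedOutputStream(inPipe);

        Thread writer = new Thread(() -> {
            try (outPipe) {
                byte[] chunk = new byte[512];
                int remaining = totalBytes;
                while (remaining > 0) {
                    int n = Math.min(chunk.length, remaining);
                    outPipe.write(chunk, 0, n); // blocks while the pipe buffer is full
                    remaining -= n;
                }
            } catch (IOException e) {
                // A real application should report this to the reading side.
            }
        });
        writer.start();

        long count = 0;
        try (inPipe) {
            byte[] buf = new byte[1024];
            int n;
            // Each read drains the pipe buffer, which lets the writer continue.
            while ((n = inPipe.read(buf)) != -1) {
                count += n;
            }
        }
        writer.join();
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(stream(1 << 20));
    }
}
```

Apart from the two small working arrays, the only storage between writer and reader is the PIPE_BUFFER-sized pipe, regardless of how much data flows through.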

A simple solution that avoids copying the buffer is to create a special-purpose ByteArrayOutputStream:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

public class CopyStream extends ByteArrayOutputStream {
    public CopyStream(int size) { super(size); }

    /**
     * Get an input stream based on the contents of this output stream.
     * Do not use the output stream after calling this method.
     * @return an {@link InputStream}
     */
    public InputStream toInputStream() {
        // Wraps the internal buffer directly, so no bytes are copied.
        return new ByteArrayInputStream(this.buf, 0, this.count);
    }
}

Write to the above output stream as needed, then call toInputStream to obtain an input stream over the underlying buffer. Consider the output stream as closed after that point.
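A short usage sketch, in case it helps. The wrapper class CopyStreamDemo and the roundTrip helper are names I made up for the example, and CopyStream is repeated as a nested class only so the snippet compiles on its own. It assumes Java 9 or later for readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class CopyStreamDemo {

    // Same idea as the CopyStream class above, nested here for self-containment.
    static class CopyStream extends ByteArrayOutputStream {
        CopyStream(int size) { super(size); }

        InputStream toInputStream() {
            // buf and count are protected fields of ByteArrayOutputStream.
            return new ByteArrayInputStream(this.buf, 0, this.count);
        }
    }

    static String roundTrip(String text) throws IOException {
        CopyStream out = new CopyStream(16);
        out.write(text.getBytes(StandardCharsets.UTF_8));
        // From here on, treat 'out' as closed and only read.
        try (InputStream in = out.toInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("no extra copy of the buffer"));
    }
}
```

The whole payload still lives in memory once; what this avoids is the second array that toByteArray() would allocate.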

I usually try to avoid creating a separate thread because of the increased chance of deadlock, the increased difficulty of understanding the code, and the problems of dealing with exceptions.

Here's my proposed solution: a ProducerInputStream that creates content in chunks by repeated calls to produceChunk():

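import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public abstract class ProducerInputStream extends InputStream {

    // Holds the chunk currently being read; starts out empty.
    private ByteArrayInputStream bin = new ByteArrayInputStream(new byte[0]);
    // Reused buffer that each call to produceChunk() writes into.
    private ByteArrayOutputStream bout = new ByteArrayOutputStream();

    @Override
    public int read() throws IOException {
        int result = bin.read();
        while ((result == -1) && newChunk()) {
            result = bin.read();
        }
        return result;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int result = bin.read(b, off, len);
        while ((result == -1) && newChunk()) {
            result = bin.read(b, off, len);
        }
        return result;
    }

    // Asks the producer for the next chunk; returns false once the
    // producer writes nothing, which is treated as end of stream.
    private boolean newChunk() {
        bout.reset();
        produceChunk(bout);
        bin = new ByteArrayInputStream(bout.toByteArray());
        return (bout.size() > 0);
    }

    public abstract void produceChunk(OutputStream out);
}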
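Here is a sketch of how a subclass might look. The wrapper class ProducerDemo, the readLines helper, and the "line N" content are invented for the example, and ProducerInputStream is repeated as a nested class only so the snippet compiles on its own. Since produceChunk does not declare IOException, the subclass wraps it in an UncheckedIOException. It assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ProducerDemo {

    // Same logic as the ProducerInputStream above, nested for self-containment.
    abstract static class ProducerInputStream extends InputStream {
        private ByteArrayInputStream bin = new ByteArrayInputStream(new byte[0]);
        private final ByteArrayOutputStream bout = new ByteArrayOutputStream();

        @Override
        public int read() throws IOException {
            int result = bin.read();
            while ((result == -1) && newChunk()) {
                result = bin.read();
            }
            return result;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            int result = bin.read(b, off, len);
            while ((result == -1) && newChunk()) {
                result = bin.read(b, off, len);
            }
            return result;
        }

        private boolean newChunk() {
            bout.reset();
            produceChunk(bout);
            bin = new ByteArrayInputStream(bout.toByteArray());
            return bout.size() > 0;
        }

        public abstract void produceChunk(OutputStream out);
    }

    // Reads a stream whose content is generated one line per chunk.
    static String readLines(int lineCount) throws IOException {
        InputStream in = new ProducerInputStream() {
            private int next = 1;

            @Override
            public void produceChunk(OutputStream out) {
                if (next > lineCount) {
                    return; // writing nothing signals end of stream
                }
                try {
                    out.write(("line " + next++ + "\n").getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };
        try (in) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(readLines(3));
    }
}
```

Everything runs on the caller's thread, and only one chunk is buffered at a time, so memory use is bounded by the largest chunk the producer writes.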