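If you have a java.io.InputStream object, how should you process that object and produce a String?
Suppose I have an InputStream that contains text data and I want to convert it to a String so that, for example, I can write it to a log file.
What is the easiest way to take the InputStream and convert it to a String?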
public String convertStreamToString(InputStream is) {
// ???
}
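Current answer
Quick and easy, provided the stream contains a Java-serialized String (for example one written with ObjectOutputStream.writeObject); it will not work on a stream of plain text: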
String result = (String)new ObjectInputStream( inputStream ).readObject();
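A minimal sketch of the round trip this call depends on, and of what happens when the stream holds plain text instead. The class name SerializedStringDemo and its helper methods are hypothetical, chosen for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the answer above.
class SerializedStringDemo {

    // Serializes a String in the format ObjectInputStream expects to find.
    static byte[] serialize(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
        }
        return bytes.toByteArray();
    }

    // Same call as in the answer, wrapped so the stream is closed afterwards.
    static String deserialize(InputStream inputStream)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(inputStream)) {
            return (String) in.readObject();
        }
    }

    // Returns true if the text is rejected as not being serialized data.
    // The text should be at least 4 bytes long so the stream header can be read.
    static boolean rejectsPlainText(String text)
            throws IOException, ClassNotFoundException {
        byte[] plain = text.getBytes(StandardCharsets.UTF_8);
        try {
            deserialize(new ByteArrayInputStream(plain));
            return false;
        } catch (StreamCorruptedException e) {
            return true; // the bytes do not start with the serialization header
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] serialized = serialize("hello");
        System.out.println(deserialize(new ByteArrayInputStream(serialized)));
        System.out.println(rejectsPlainText("hello"));
    }
}
```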
Other answers
The code below worked for me.
URL url = MyClass.class.getResource("/" + configFileName);
BufferedInputStream bi = (BufferedInputStream) url.getContent();
byte[] buffer = new byte[bi.available() ];
int bytesRead = bi.read(buffer);
String out = new String(buffer);
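Note that, according to the Java documentation, available() only returns an estimate of the number of bytes that can be read without blocking, so it is not a reliable way to size the buffer, even on a BufferedInputStream. If you would rather not use available(), the following code is an alternative for resources backed by a regular file: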
URL url = MyClass.class.getResource("/" + configFileName);
BufferedInputStream bi = (BufferedInputStream) url.getContent();
File f = new File(url.getPath());
byte[] buffer = new byte[ (int) f.length()];
int bytesRead = bi.read(buffer);
String out = new String(buffer);
I'm not sure whether this has any encoding problems. Please comment if there is anything wrong with the code.
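On the encoding question: new String(buffer) decodes with the platform default charset, and a single read(buffer) call is allowed to return fewer bytes than the buffer holds, so both variants above can mis-decode or truncate the text. A hedged sketch that avoids both issues by looping until end of stream and naming the charset explicitly. The class and method names are made up for illustration, and which charset is correct depends on the resource.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper; not part of the answer above.
class StreamToText {

    // Reads the stream to its end and decodes the bytes with the given charset.
    static String read(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        // read() may return fewer bytes than requested, so loop until -1.
        while ((n = in.read(buffer)) != -1) {
            collected.write(buffer, 0, n);
        }
        return new String(collected.toByteArray(), charset);
    }
}
```

With this helper the resource could be read as StreamToText.read(url.openStream(), StandardCharsets.UTF_8), assuming the file is stored as UTF-8.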
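Kotlin users can simply do:
println(InputStreamReader(inputStream).readText())
where readText() is a built-in extension method of the Kotlin standard library. (The variable is named inputStream here because is is a reserved keyword in Kotlin.)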
String inputStreamToString(InputStream inputStream, Charset charset) throws IOException {
try (
final StringWriter writer = new StringWriter();
final InputStreamReader reader = new InputStreamReader(inputStream, charset)
) {
reader.transferTo(writer);
return writer.toString();
}
}
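This is a pure Java standard-library solution with no external libraries. It relies on Reader#transferTo(java.io.Writer), available since Java 10, needs no explicit loop, and does no special handling of newline characters.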
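A short usage sketch for this method, fed from an in-memory stream. The wrapper class name TransferToDemo is made up, and the method body is repeated only so that the example compiles on its own (Java 10 or later).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper class for the method shown above.
class TransferToDemo {

    static String inputStreamToString(InputStream inputStream, Charset charset)
            throws IOException {
        try (
            final StringWriter writer = new StringWriter();
            final InputStreamReader reader = new InputStreamReader(inputStream, charset)
        ) {
            reader.transferTo(writer); // copies every char until end of stream
            return writer.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "line 1\r\nline 2\n".getBytes(StandardCharsets.UTF_8);
        String text = inputStreamToString(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        // Line terminators pass through unchanged, so the lengths match.
        System.out.println(text.length() == data.length);
    }
}
```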
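I have written a class that does just this, so I figured I would share it. Sometimes you don't want to add Apache Commons just for one thing, and you want something dumber than Scanner, which examines the content.
Usage is as follows: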
// Read from InputStream
String data = new ReaderSink(inputStream, Charset.forName("UTF-8")).drain();
// Read from File
data = new ReaderSink(file, Charset.forName("UTF-8")).drain();
// Drain input stream to console
new ReaderSink(inputStream, Charset.forName("UTF-8")).drainTo(System.out);
Here is the code for ReaderSink:
import java.io.*;
import java.nio.charset.Charset;
/**
* A simple sink class that drains a {@link Reader} to a {@link String} or
* to a {@link Writer}.
*
* @author Ben Barkay
* @version 2/20/2014
*/
public class ReaderSink {
/**
* The default buffer size to use if no buffer size was specified.
*/
public static final int DEFAULT_BUFFER_SIZE = 1024;
/**
* The {@link Reader} that will be drained.
*/
private final Reader in;
/**
* Constructs a new {@code ReaderSink} for the specified file and charset.
* @param file The file to read from.
* @param charset The charset to use.
* @throws FileNotFoundException If the file was not found on the filesystem.
*/
public ReaderSink(File file, Charset charset) throws FileNotFoundException {
this(new FileInputStream(file), charset);
}
/**
* Constructs a new {@code ReaderSink} for the specified {@link InputStream}.
* @param in The {@link InputStream} to drain.
* @param charset The charset to use.
*/
public ReaderSink(InputStream in, Charset charset) {
this(new InputStreamReader(in, charset));
}
/**
* Constructs a new {@code ReaderSink} for the specified {@link Reader}.
* @param in The reader to drain.
*/
public ReaderSink(Reader in) {
this.in = in;
}
/**
* Drains the data from the underlying {@link Reader}, returning a {@link String} containing
* all of the read information. This method will use {@link #DEFAULT_BUFFER_SIZE} for
* its buffer size.
* @return A {@link String} containing all of the information that was read.
*/
public String drain() throws IOException {
return drain(DEFAULT_BUFFER_SIZE);
}
/**
* Drains the data from the underlying {@link Reader}, returning a {@link String} containing
* all of the read information.
* @param bufferSize The size of the buffer to use when reading.
* @return A {@link String} containing all of the information that was read.
*/
public String drain(int bufferSize) throws IOException {
StringWriter stringWriter = new StringWriter();
drainTo(stringWriter, bufferSize);
return stringWriter.toString();
}
/**
* Drains the data from the underlying {@link Reader}, writing it to the
* specified {@link Writer}. This method will use {@link #DEFAULT_BUFFER_SIZE} for
* its buffer size.
* @param out The {@link Writer} to write to.
*/
public void drainTo(Writer out) throws IOException {
drainTo(out, DEFAULT_BUFFER_SIZE);
}
/**
* Drains the data from the underlying {@link Reader}, writing it to the
* specified {@link Writer}.
* @param out The {@link Writer} to write to.
* @param bufferSize The size of the buffer to use when reading.
*/
public void drainTo(Writer out, int bufferSize) throws IOException {
char[] buffer = new char[bufferSize];
int read;
while ((read = in.read(buffer)) > -1) {
out.write(buffer, 0, read);
}
}
}
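ISO-8859-1
If you know that your input stream is encoded as ISO-8859-1 or ASCII, here is a very performant way to do this. It (1) avoids the unnecessary synchronization present in StringWriter's internal StringBuffer, (2) avoids the overhead of InputStreamReader, and (3) minimizes the number of times StringBuilder's internal char array must be copied.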
public static String iso_8859_1(InputStream is) throws IOException {
StringBuilder chars = new StringBuilder(Math.max(is.available(), 4096));
byte[] buffer = new byte[4096];
int n;
while ((n = is.read(buffer)) != -1) {
for (int i = 0; i < n; i++) {
chars.append((char)(buffer[i] & 0xFF));
}
}
return chars.toString();
}
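The cast in the inner loop works because ISO-8859-1 maps each byte value 0 to 255 to the Unicode code point with the same number. A small sketch that checks this against the JDK's own decoder; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical check class; not part of the answer above.
class Latin1Check {

    // Decodes bytes the same way the loop above does: one byte, one char.
    static String decodeByCast(byte[] bytes) {
        StringBuilder chars = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            chars.append((char) (b & 0xFF)); // the mask undoes sign extension
        }
        return chars.toString();
    }

    // Builds an array that holds every possible byte value exactly once.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        String viaJdk = new String(all, StandardCharsets.ISO_8859_1);
        System.out.println(decodeByCast(all).equals(viaJdk));
    }
}
```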
UTF-8
The same general strategy can be used for a stream that is encoded as UTF-8:
public static String utf8(InputStream is) throws IOException {
StringBuilder chars = new StringBuilder(Math.max(is.available(), 4096));
byte[] buffer = new byte[4096];
int n;
int state = 0;
while ((n = is.read(buffer)) != -1) {
for (int i = 0; i < n; i++) {
if ((state = nextStateUtf8(state, buffer[i])) >= 0) {
chars.appendCodePoint(state);
} else if (state == -1) { //error
state = 0;
chars.append('\uFFFD'); //replacement char
}
}
}
return chars.toString();
}
where the nextStateUtf8() function is defined as follows:
/**
* Returns the next UTF-8 state given the next byte of input and the current state.
* If the input byte is the last byte in a valid UTF-8 byte sequence,
* the returned state will be the corresponding unicode character (in the range of 0 through 0x10FFFF).
* Otherwise, a negative integer is returned. A state of -1 is returned whenever an
* invalid UTF-8 byte sequence is detected.
*/
static int nextStateUtf8(int currentState, byte nextByte) {
switch (currentState & 0xF0000000) {
case 0:
if ((nextByte & 0x80) == 0) { //0 trailing bytes (ASCII)
return nextByte;
} else if ((nextByte & 0xE0) == 0xC0) { //1 trailing byte
if (nextByte == (byte) 0xC0 || nextByte == (byte) 0xC1) { //0xC0 & 0xC1 are overlong
return -1;
} else {
return nextByte & 0xC000001F;
}
} else if ((nextByte & 0xF0) == 0xE0) { //2 trailing bytes
if (nextByte == (byte) 0xE0) { //possibly overlong
return nextByte & 0xA000000F;
} else if (nextByte == (byte) 0xED) { //possibly surrogate
return nextByte & 0xB000000F;
} else {
return nextByte & 0x9000000F;
}
} else if ((nextByte & 0xFC) == 0xF0) { //3 trailing bytes
if (nextByte == (byte) 0xF0) { //possibly overlong
return nextByte & 0x80000007;
} else {
return nextByte & 0xE0000007;
}
} else if (nextByte == (byte) 0xF4) { //3 trailing bytes, possibly undefined
return nextByte & 0xD0000007;
} else {
return -1;
}
case 0xE0000000: //3rd-to-last continuation byte
return (nextByte & 0xC0) == 0x80 ? currentState << 6 | nextByte & 0x9000003F : -1;
case 0x80000000: //3rd-to-last continuation byte, check overlong
return (nextByte & 0xE0) == 0xA0 || (nextByte & 0xF0) == 0x90 ? currentState << 6 | nextByte & 0x9000003F : -1;
case 0xD0000000: //3rd-to-last continuation byte, check undefined
return (nextByte & 0xF0) == 0x80 ? currentState << 6 | nextByte & 0x9000003F : -1;
case 0x90000000: //2nd-to-last continuation byte
return (nextByte & 0xC0) == 0x80 ? currentState << 6 | nextByte & 0xC000003F : -1;
case 0xA0000000: //2nd-to-last continuation byte, check overlong
return (nextByte & 0xE0) == 0xA0 ? currentState << 6 | nextByte & 0xC000003F : -1;
case 0xB0000000: //2nd-to-last continuation byte, check surrogate
return (nextByte & 0xE0) == 0x80 ? currentState << 6 | nextByte & 0xC000003F : -1;
case 0xC0000000: //last continuation byte
return (nextByte & 0xC0) == 0x80 ? currentState << 6 | nextByte & 0x3F : -1;
default:
return -1;
}
}
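For comparison, the JDK's built-in UTF-8 decoder can apply the same replace-on-error policy, or be configured to fail instead. This is a sketch of that standard-library alternative, not of the state machine above, and it does not share its performance characteristics. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the answer above.
class Utf8DecoderDemo {

    // Decodes the stream as UTF-8, handling invalid bytes as requested:
    // CodingErrorAction.REPLACE substitutes U+FFFD,
    // CodingErrorAction.REPORT throws a CharacterCodingException.
    static String decode(InputStream is, CodingErrorAction onError) throws IOException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(onError)
                .onUnmappableCharacter(onError);
        StringWriter out = new StringWriter();
        try (Reader reader = new InputStreamReader(is, decoder)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toString();
    }
}
```

Passing CodingErrorAction.REPORT is useful when silently replacing bad bytes would hide a problem, for example when the stream turns out not to be UTF-8 at all.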
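Auto-detecting the encoding
If your input stream is encoded as ASCII, ISO-8859-1, or UTF-8 but you are not sure which, a similar approach works, with an extra encoding-detection component that determines the encoding before the string is returned.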
public static String autoDetect(InputStream is) throws IOException {
StringBuilder chars = new StringBuilder(Math.max(is.available(), 4096));
byte[] buffer = new byte[4096];
int n;
int state = 0;
boolean ascii = true;
while ((n = is.read(buffer)) != -1) {
for (int i = 0; i < n; i++) {
if ((state = nextStateUtf8(state, buffer[i])) > 0x7F)
ascii = false;
chars.append((char)(buffer[i] & 0xFF));
}
}
if (ascii || state < 0) { //probably not UTF-8
return chars.toString();
}
//probably UTF-8
int pos = 0;
char[] charBuf = new char[2];
for (int i = 0, len = chars.length(); i < len; i++) {
if ((state = nextStateUtf8(state, (byte)chars.charAt(i))) >= 0) {
boolean hi = Character.toChars(state, charBuf, 0) == 2;
chars.setCharAt(pos++, charBuf[0]);
if (hi) {
chars.setCharAt(pos++, charBuf[1]);
}
}
}
return chars.substring(0, pos);
}
If your input stream has an encoding that is neither ISO-8859-1 nor ASCII nor UTF-8, then I defer to the other answers already present.
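The same detection idea can also be sketched with standard-library pieces only: try a strict UTF-8 decode first, and fall back to ISO-8859-1 if the bytes are not valid UTF-8. This is a simpler and likely slower substitute for the autoDetect method above, not a reimplementation of it. The class and method names are hypothetical, and readAllBytes() assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the answer above.
class StrictUtf8Fallback {

    // Decodes as UTF-8 if the bytes are valid UTF-8, otherwise as ISO-8859-1.
    // Pure ASCII input gives the same result on either path.
    static String decode(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: every byte sequence is valid ISO-8859-1.
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    // Reads the whole stream into memory, then decodes it.
    static String read(InputStream is) throws IOException {
        return decode(is.readAllBytes());
    }
}
```

Like the original, this is a heuristic: ISO-8859-1 text whose bytes happen to form valid UTF-8 sequences will be decoded as UTF-8.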