Java: given an InputStream, replace characters and produce an OutputStream

azpvetkf, posted 2023-10-14 in Java

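I have many very large files that I need to convert to CSV by replacing certain characters.
I am looking for a reliable way, given an InputStream, to produce an OutputStream in which every occurrence of character c1 is replaced with c2.
The catch is that reading and writing have to happen together, because I cannot fit a whole file in memory.
If I want to read and write at the same time, do I need to run them in separate threads?
Any advice is much appreciated.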

66bbxpm5 #1

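To copy data from an input stream to an output stream, write the data as you read it, either one byte (or character) at a time or one line at a time.
The following example reads a file and converts every 'x' character to 'y'.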

// Needs: import java.io.*;
// try-with-resources closes both streams even if an exception is thrown.
try (InputStream in = new BufferedInputStream(new FileInputStream("input.dat"));
     OutputStream out = new BufferedOutputStream(new FileOutputStream("output.dat"))) {
    int ch;
    while ((ch = in.read()) != -1) {
        if (ch == 'x') ch = 'y';   // replace the byte value of 'x' with 'y'
        out.write(ch);
    }
}

Alternatively, if you can use a Reader and process one line at a time, you can use this approach:

// Needs: import java.io.*;
// FileReader and PrintWriter use the platform default charset here.
try (BufferedReader reader = new BufferedReader(new FileReader("input.dat"));
     PrintWriter writer = new PrintWriter(
         new BufferedOutputStream(new FileOutputStream("output.dat")))) {
    String str;
    while ((str = reader.readLine()) != null) {
        str = str.replace('x', 'y');     // replace one character at a time
        str = str.replace("abc", "ABC"); // replace a string sequence
        writer.println(str);             // readLine() strips the line ending; println() writes the platform one
    }
}
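Note: BufferedInputStream and BufferedReader read ahead and keep an 8K buffer by default (8192 bytes and 8192 chars respectively) to improve performance. This lets you process very large files while holding only about 8K of data in memory at a time (plus the current line when using readLine()).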
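
On the threading question: reading and writing simply alternate inside one loop, so a single thread is enough. As a rough sketch of that idea with the stream types the question mentions, the helper below (the class and method names `ByteReplacer` and `replaceBytes` are made up for illustration) copies in 8 KB chunks instead of byte by byte. It assumes c1 and c2 are each a single byte in the file's encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class ByteReplacer {

    // Hypothetical helper: copies `in` to `out`, replacing every byte equal to `from` with `to`.
    // Only one 8 KB chunk is held in memory at a time, and no extra thread is involved.
    public static void replaceBytes(InputStream in, OutputStream out, byte from, byte to)
            throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buffer[i] == from) {
                    buffer[i] = to;
                }
            }
            out.write(buffer, 0, n); // write only the bytes actually read
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the real files.
        InputStream in = new ByteArrayInputStream("a;b;c".getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        replaceBytes(in, out, (byte) ';', (byte) ',');
        System.out.println(out.toString("US-ASCII")); // prints a,b,c
    }
}
```

For real files, the same method can be called with a FileInputStream and a FileOutputStream. If either character takes more than one byte in the file's encoding, a Reader/Writer based loop is the safer choice.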

ftf50wuq #2

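// Needs: import java.io.*; import java.nio.charset.StandardCharsets;
// YOURSOURCE is a placeholder for your InputStream; c1 and c2 are char variables
// holding the character to find and its replacement.
try (BufferedReader reader = new BufferedReader(
         new InputStreamReader(YOURSOURCE, StandardCharsets.UTF_8));
     FileWriter writer = new FileWriter("Report.csv")) {
    String line;
    while ((line = reader.readLine()) != null) {
        // Strings are immutable: replace() returns a new String, so keep the result.
        line = line.replace(c1, c2);
        writer.append(line);
        writer.append('\n');
    }
} // try-with-resources flushes and closes the writer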
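
A line-based loop drops the original line terminators and keeps a whole line in memory, which may matter if a file has extremely long lines. As an alternative sketch (the names `CharReplacer` and `replaceChars` are hypothetical), the same character replacement can run over fixed-size char chunks, which leaves line endings untouched:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

public class CharReplacer {

    // Hypothetical helper: copies decoded characters from `in` to `out`,
    // replacing every `from` with `to`. Line terminators pass through unchanged.
    public static void replaceChars(Reader in, Writer out, char from, char to)
            throws IOException {
        char[] buffer = new char[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buffer[i] == from) {
                    buffer[i] = to;
                }
            }
            out.write(buffer, 0, n); // write only the chars actually read
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // In-memory reader and writer stand in for the real source and Report.csv.
        StringWriter out = new StringWriter();
        replaceChars(new StringReader("a|b\r\nc|d"), out, '|', ',');
        System.out.println(out.toString().equals("a,b\r\nc,d")); // prints true
    }
}
```

To use it with byte streams, wrap them in an InputStreamReader and an OutputStreamWriter with the file's charset, so that multi-byte characters are decoded before the comparison.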

j2cgzkjk #3

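You can find a related answer here: Filter (search and replace) array of bytes in an InputStream
I took @aioobe's answer from that thread and built a replacing input stream module in Java. You can find it in my GitHub gist: https://gist.github.com/lhr0909/e6ac2d6dd6752871eb57c4b083799947
The source code is also included here: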

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Queue;

/**
 * Created by simon on 8/29/17.
 */
public class ReplacingInputStream extends FilterInputStream {

    private Queue<Integer> inQueue, outQueue;
    private final byte[] search, replacement;

    public ReplacingInputStream(InputStream in, String search, String replacement) {
        super(in);

        this.inQueue = new LinkedList<>();
        this.outQueue = new LinkedList<>();

        this.search = search.getBytes();
        this.replacement = replacement.getBytes();
    }

    private boolean isMatchFound() {
        Iterator<Integer> iterator = inQueue.iterator();

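        for (byte b : search) {
            // The queue holds unsigned values (0-255) or -1 for end of stream,
            // so mask the signed byte before comparing.
            if (!iterator.hasNext() || (b & 0xFF) != iterator.next()) {
                return false;
            }
        }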

        return true;
    }

    private void readAhead() throws IOException {
        // Work up some look-ahead.
        while (inQueue.size() < search.length) {
            int next = super.read();
            inQueue.offer(next);

            if (next == -1) {
                break;
            }
        }
    }

    @Override
    public int read() throws IOException {
        // Refill outQueue until it holds the next value to return
        // (a data byte, or -1 at end of stream).

        while (outQueue.isEmpty()) {
            readAhead();

            if (isMatchFound()) {
                for (byte a : search) {
                    inQueue.remove();
                }

                for (byte b : replacement) {
                    // Mask to 0-255 so a replacement byte of 0xFF is not mistaken for EOF (-1).
                    outQueue.offer(b & 0xFF);
                }
            } else {
                outQueue.add(inQueue.remove());
            }
        }

        return outQueue.remove();
    }

    @Override
    public int read(byte b[]) throws IOException {
        return read(b, 0, b.length);
    }

    // Copied straight from the InputStream implementation; it just needs to use `read()` from this class
    @Override
    public int read(byte b[], int off, int len) throws IOException {
        if (b == null) {
            throw new NullPointerException();
        } else if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException();
        } else if (len == 0) {
            return 0;
        }

        int c = read();
        if (c == -1) {
            return -1;
        }
        b[off] = (byte)c;

        int i = 1;
        try {
            for (; i < len ; i++) {
                c = read();
                if (c == -1) {
                    break;
                }
                b[off + i] = (byte)c;
            }
        } catch (IOException ee) {
            // Deliberately ignored, as in the copied InputStream code: return the bytes read so far.
        }
        return i;
    }
}
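
Since the question asks for an OutputStream and only needs a single character swapped, the filtering can also sit on the output side. The following is a minimal sketch, not part of the gist above: `ByteReplacingOutputStream` is a hypothetical name, and it assumes both characters are single bytes in the file's encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper: every byte equal to `from` is written as `to`.
public class ByteReplacingOutputStream extends FilterOutputStream {

    private final int from;
    private final int to;

    public ByteReplacingOutputStream(OutputStream out, byte from, byte to) {
        super(out);
        this.from = from & 0xFF; // store as unsigned values, 0-255
        this.to = to & 0xFF;
    }

    @Override
    public void write(int b) throws IOException {
        int value = b & 0xFF;
        out.write(value == from ? to : value);
    }
    // FilterOutputStream's array write methods call write(int) for each byte,
    // so they pass through the replacement above without being overridden.

    public static void main(String[] args) throws IOException {
        // An in-memory sink stands in for the real destination file.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new ByteReplacingOutputStream(sink, (byte) ';', (byte) ',')) {
            out.write("x;y;z".getBytes(StandardCharsets.US_ASCII));
        }
        System.out.println(sink.toString("US-ASCII")); // prints x,y,z
    }
}
```

Because each byte goes through write(int), wrapping a BufferedOutputStream as the destination keeps the number of underlying writes low. On Java 9 or later, `in.transferTo(out)` can then copy any InputStream into the wrapper.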
