C++: Alternative to using std::getline for boost::iostreams::filtering_istream

rdrgkggo posted on 2023-01-10 in iOS

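I have a gz-compressed binary file that I want to stream using boost::iostreams. After searching the web for the past few hours, I found a nice code snippet that does what I want, except for its use of std::getline: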

try 
{
    std::ifstream file("../data.txt.gz", std::ios_base::in | std::ios_base::binary);
    boost::iostreams::filtering_istream in;
    in.push(boost::iostreams::gzip_decompressor());
    in.push(file);
    std::vector<std::byte> buffer;
    for(std::string str; std::getline(in, str); )
    {
        std::cout << "str length: " << str.length() << '\n';
        for(auto c : str){
            buffer.push_back(std::byte(c));
        }
        std::cout << "buffer size: " << buffer.size() << '\n';
        // process buffer 
        // ...
        // ...
    }
}
catch (const boost::iostreams::gzip_error& e)
{
    std::cout << e.what() << '\n';
}

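I want to read the file and store it in an intermediate buffer that fills up as the file is streamed. However, std::getline splits the input on the \n delimiter and does not include that delimiter in the output string, so bytes of the binary data are lost.
Is there a way I can read 2048 bytes of data at a time?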

c++

Source: https://stackoverflow.com/questions/75014142/alternative-to-using-stdgetline-for-boostiostreamfiltering-istream

1 Answer

x8diyxa71#

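Decompressing a gzip stream the way you want is not entirely straightforward. One option is to use boost::iostreams::copy to decompress the gzip stream into a vector, but since you want to decompress the stream in chunks (2k was mentioned in your post), that may not be an option.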
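With an input stream, you would normally just call read(), passing a buffer and the number of bytes to read, and then call gcount() to find out how many bytes were actually read. Unfortunately, there seems to be either a bug in filtering_istream or gzip_decompressor, or gcount is not supported (it should be), because in my tests it always seemed to return the number of bytes requested rather than the number actually read. As you might imagine, this can cause problems when reading the last few bytes of the file, unless you know in advance how many bytes there are to read.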
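Fortunately, the size of the uncompressed data is stored in the last four bytes of the gzip file, which means we can account for this. We just have to work a bit harder in the decompression loop.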
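Below is the code I came up with to handle the decompressed stream the way you would like. It creates two vectors: one that receives each decompressed 2k chunk and one for the final buffer. It is quite basic, and I have not done anything to really optimize the memory usage of the vectors. If that is an issue, I would suggest switching to a single vector, resizing it to the length of the uncompressed data, and calling read with a pointer offset into the vector's data for the 2k chunk being read.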

#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <vector>

int main()
{
    namespace io = boost::iostreams;

    std::ifstream file("../data.txt.gz", std::ios_base::in | std::ios_base::binary);

    // Get the uncompressed size: the last 4 bytes of a gzip file (ISIZE) are
    // stored little-endian, so assemble the value byte by byte.
    unsigned char raw[4] = {};
    file.seekg(-4, std::ios_base::end);
    file.read(reinterpret_cast<char*>(raw), sizeof(raw));
    file.seekg(0);
    std::size_t dataLeft = std::uint32_t(raw[0]) | std::uint32_t(raw[1]) << 8
                         | std::uint32_t(raw[2]) << 16 | std::uint32_t(raw[3]) << 24;

    // Set up the gzip stream
    io::filtering_istream in;
    in.push(io::gzip_decompressor());
    in.push(file);

    // tmp receives each 2k chunk; buffer accumulates the whole stream
    std::vector<std::byte> buffer, tmp(2048);
    for (auto toRead(std::min(tmp.size(), dataLeft));
        dataLeft && in.read(reinterpret_cast<char*>(tmp.data()), toRead);
        dataLeft -= toRead, toRead = std::min(tmp.size(), dataLeft))
    {
        // Append only the bytes read in this pass (the last chunk may be short)
        buffer.insert(buffer.end(), tmp.begin(), tmp.begin() + toRead);
        std::cout << "buffer size: " << buffer.size() << '\n';
    }
}

Answered 2023-01-10
