How can I download a zip file from a GitHub repo with curl and read its contents in memory?

wz1wpwve  posted on 2022-11-13 in Git

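I uploaded a zip file (created with 7-Zip's "Add to .zip" option) that contains only a single file named text.txt to this GitHub repo. How can I read the contents of text.txt *without* writing anything to disk?
I'm using curl to download the zip file into memory: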

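#include <curl/curl.h>
#include <tchar.h>   // _tmain, _TCHAR (Windows only)

#include <iostream>  // std::cout, std::cerr
#include <string>    // std::string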

    static size_t WriteMemoryCallback(void* contents, size_t size, size_t nmemb,
                                      void* userp) {
        size_t realsize = size * nmemb;
        auto& mem = *static_cast<std::string*>(userp);
        mem.append(static_cast<char*>(contents), realsize);
        return realsize;
    }
    
    std::string Download(const std::string& url) 
    {
        CURL* curl_handle;
        CURLcode res;
    
        std::string chunk;
    
        curl_global_init(CURL_GLOBAL_ALL);
    
        curl_handle = curl_easy_init();
        curl_easy_setopt(curl_handle, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION, WriteMemoryCallback);
        curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, &chunk);
        curl_easy_setopt(curl_handle, CURLOPT_USERAGENT, "libcurl-agent/1.0");
    
        // added options that may be required
        curl_easy_setopt(curl_handle, CURLOPT_FOLLOWLOCATION, 1L);  // redirects
        curl_easy_setopt(curl_handle, CURLOPT_HTTPPROXYTUNNEL, 1L); // corp. proxies etc.
        curl_easy_setopt(curl_handle, CURLOPT_VERBOSE, 1L); // we want it all
        // curl_easy_setopt(curl_handle, CURLOPT_REDIR_PROTOCOLS, CURLPROTO_HTTP | CURLPROTO_HTTPS);
    
        res = curl_easy_perform(curl_handle);
    
        if(res != CURLE_OK) {
            std::cerr << "curl_easy_perform() failed: " << curl_easy_strerror(res) << '\n';
        } else {
            std::cout << chunk.size() << " bytes retrieved\n";
        }
    
        curl_easy_cleanup(curl_handle);
        curl_global_cleanup();
    
        return chunk;
    }

int _tmain(int argc, _TCHAR* argv[])
{
    std::string link = "https://github.com/R3uan3/test/raw/main/text.zip";
    auto data = Download(link);
}

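While searching for a library that can unzip an archive held in memory, I found libzip (any library is welcome).
While searching for examples, I found this answer, but it loads a zip file from disk into memory and then reads its contents.
How can I read the zip that was downloaded with curl and is now held in the string data?
When I inspect the contents of data in the debugger it shows PK. I tried passing it to zip_open, but z comes back null: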

//Open the ZIP archive
        int err = 0;
        zip *z = zip_open(data.c_str(), 0, &err);
    
        //Search for the file of given name
        const char *name = "text.txt";
        struct zip_stat st;
        zip_stat_init(&st);
        zip_stat(z, name, 0, &st);
    
        //Alloc memory for its uncompressed contents
        char *contents = new char[st.size];
    
        //Read the compressed file
        zip_file *f = zip_fopen(z, name, 0);
        zip_fread(f, contents, st.size);
        zip_fclose(f);

o3imoua4  #1

I'm ignoring everything about curl in the question, since we've verified that you have the zip file stored correctly in memory.
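As a cheap sanity check before handing the buffer to a zip library, one could test for the ZIP signature that shows up as PK in the debugger. This is only a minimal sketch, assuming the whole download sits in a std::string; LooksLikeZip is a hypothetical helper name and not part of curl or libzip:

```cpp
#include <string>

// Hypothetical helper (not from the question or the answer): returns true if
// the buffer starts with a ZIP signature. "PK\x03\x04" is the local file
// header that begins a normal archive; "PK\x05\x06" is the end-of-central-
// directory record, which is the first thing in an empty archive.
bool LooksLikeZip(const std::string& buf) {
    if (buf.size() < 4) {
        return false;
    }
    return buf.compare(0, 4, "PK\x03\x04", 4) == 0 ||
           buf.compare(0, 4, "PK\x05\x06", 4) == 0;
}
```

This only catches obvious mistakes, such as an HTML error page being downloaded instead of the archive. It does not validate the archive itself.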
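"How can I read the zip that is in the string data?"
Since you have the whole zip file stored in memory, you need to create a zip_source from chunk.data(), open the archive using that zip_source, and then open the individual files in the archive.
Here's how that can be done (without error checking, which you need to add):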

{
    // ...
    zip_error_t ze; // for errors

    // create a zip_source from the data you've stored in memory
    zip_source_t* zs = zip_source_buffer_create(chunk.data(), chunk.size(), 0, &ze);

    // open the archive from the zip_source
    zip_t* zip = zip_open_from_source(zs, ZIP_CHECKCONS | ZIP_RDONLY, &ze);

    // read how many files you've got in there
    zip_int64_t entries = zip_get_num_entries(zip, 0);

    std::cout << entries << '\n';

    // loop over the entries in the archive
    for(zip_int64_t idx = 0; idx < entries; ++idx) {
        std::cout << zip_get_name(zip, idx, ZIP_FL_ENC_STRICT) << '\n';

        // open the file at this index
        zip_file_t* fp = zip_fopen_index(zip, idx, 0);

        // process the file
        zip_int64_t len;
        char buf[1024];
        while((len = zip_fread(fp, buf, sizeof buf)) > 0) {
            std::cout << "read " << len << " bytes\n";
            // do something with the `len` bytes you have in `buf`
        }
        zip_fclose(fp); // close this file
    }
    zip_close(zip); // close the whole archive
}
