Node.js: How do I download a PDF file that Puppeteer opens in a new tab?

hgqdbh6s  posted on 2023-03-12 in Node.js

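I'm trying to download an invoice from a website using Puppeteer, which I've only just started learning. I'm using Node to write and run the code. I've managed to log in and navigate to the invoices page, but the invoice opens in a new tab, so my code doesn't detect it because it isn't the active tab. This is the code I'm using: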

const puppeteer = require('puppeteer')

const SECRET_EMAIL = 'emailid'
const SECRET_PASSWORD = 'password'

const main = async () => {
  const browser = await puppeteer.launch({
    headless: false,
  })
  const page = await browser.newPage()
  await page.goto('https://my.apify.com/sign-in', { waitUntil: 'networkidle2' })
  await page.waitForSelector('div.sign_shared__SignForm-sc-1jf30gt-2.kFKpB')
  await page.type('input#email', SECRET_EMAIL)
  await page.type('input#password', SECRET_PASSWORD)
  await page.click('input[type="submit"]')
  await page.waitForSelector('#logged-user')
  await page.goto('https://my.apify.com/billing#/invoices', { waitUntil: 'networkidle2' })
  await page.waitForSelector('#reactive-table-1')
  await page.click('#reactive-table-1 > tbody > tr:nth-child(1) > td.number > a')
  const newPagePromise = new Promise(x => browser.once('targetcreated', target => x(target.page())))
  const page2 = await newPagePromise
  await page2.bringToFront()
  await page2.screenshot({ path: 'apify1.png' })
  //await browser.close()
}

main()

In the code above I'm only trying to take a screenshot. Can anyone help?

fcg9iug3


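Here is an example of a workaround for the Chrome issue mentioned in the comments above. Adapt it to your specific needs and use case. Basically, you need to capture the new page (target) and then do whatever is needed to download the file. If no other method works for you (including requesting the download location directly with fetch or, ideally, with a request library on the back end), you can pass the file to Node as a buffer, as in the example below.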

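// Start waiting for the new tab before clicking so the target is not missed.
// ATT_page is the page that shows the invoice table (`page` in the question).
const [PDF_page] = await Promise.all([
    browser
        .waitForTarget(target => target.url().includes('my.apify.com/account/invoices/'))
        .then(target => target.page()),
    ATT_page.click('#reactive-table-1 > tbody > tr:nth-child(1) > td.number > a'),
]);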

const asyncRes = PDF_page.waitForResponse(response =>
    response
        .request()
        .url()
        .includes('my.apify.com/account/invoices'));

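// Reload the new tab so the invoice request is issued again and can be observed.
await PDF_page.reload();
const res = await asyncRes;
const url = res.url();
const headers = res.headers();

// Give up if the response is not a PDF (the header may be missing entirely).
if (!(headers['content-type'] || '').includes('application/pdf')) {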
    await PDF_page.close();
    return null;
}

const options = {
    // target request options
};

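// Fetch the PDF from inside the page and return it as a base64 string,
// because only serializable values can be passed from the browser back to Node.
const pdfAb = await PDF_page.evaluate(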
    async (url, options) => {
        function bufferToBase64(buffer) {
            return btoa(
                new Uint8Array(buffer).reduce((data, byte) => {
                    return data + String.fromCharCode(byte);
                }, ''),
            );
        }

        return await fetch(url, options)
            .then(response => response.arrayBuffer())
            .then(arrayBuffer => bufferToBase64(arrayBuffer));
    },
    url,
    options,
);

const pdf = Buffer.from(pdfAb, 'base64');
await PDF_page.close();
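
The example above ends with the PDF held in memory as a Buffer. A minimal sketch of the remaining step, writing it to disk, could look like the following. The helper name `savePdfFromBase64` and the output path are assumptions for illustration and are not part of the original answer; the signature check relies only on the fact that PDF files begin with the bytes `%PDF-`.

```javascript
const fs = require('fs');

// Hypothetical helper (not from the original answer): decode the base64
// string returned by page.evaluate() and write it to disk, refusing data
// that does not look like a PDF.
function savePdfFromBase64(base64, filePath) {
    const pdf = Buffer.from(base64, 'base64');

    // PDF files start with the ASCII signature "%PDF-".
    if (pdf.subarray(0, 5).toString('latin1') !== '%PDF-') {
        throw new Error('Decoded data does not start with the %PDF- signature');
    }

    fs.writeFileSync(filePath, pdf);
    return pdf.length; // number of bytes written
}
```

With the variables from the answer, this would be called as `savePdfFromBase64(pdfAb, 'invoice.pdf')` before closing the page. The signature check is a cheap guard against saving an HTML login or error page under a `.pdf` name.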
