GET request works in Postman but not in NodeJS using the request package

Posted by uqxowvwt on 2022-11-23 in iOS


I have been trying to migrate a simple web scraping script to NodeJS using the request-promise package, but I always get the following error as output:

403 This IP has been automatically blocked

However, if I trigger the request from my browser or from Postman, it works perfectly (and the IP is not blocked at all).
Here is the code I am using in NodeJS:

const request = require('request-promise');
const main = async () => {
    const options = {
        url: 'https://sfbay.craigslist.org/d/software-qa-dba-etc/search/sof',
        headers: {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
            'Accept-Language':'en-US,en;q=0.5',
            'Cache-Control': 'no-cache'
        }
    }
    try {
        const htmlResult = await request.get(options);
        console.log(htmlResult);
    } catch (e) {
        console.log(e);
    }

}

main();

I have also tried Axios, but the output is the same. Any ideas?

axios

Source: https://stackoverflow.com/questions/66743883/get-request-works-in-postman-but-not-in-nodejs-using-request-package

1 answer


mpgws1up1#

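As an engineer at WebScrapingAPI, I would recommend using a third-party scraping provider. I tried running a basic Puppeteer script against your target on my end, but it was blocked right away. This means you would need to implement at least a proxy system and some basic evasions (see puppeteer-extra-plugin-stealth).
Since developing such a scraper takes extra cost and time, you could opt for a full-fledged web scraper, such as the one we offer at WebScrapingAPI, which provides IP rotation, residential proxies, and more.
Here is an implementation for your site: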

import axios from 'axios'

const payload = {
    api_key: '<YOUR_API_KEY>',
    url: 'https://sfbay.craigslist.org/d/software-qa-dba-etc/search/sof',
    render_js: 1,
    proxy_type: 'residential',
    country: 'us',
    device: 'mobile'
}

const url = `https://api.webscrapingapi.com/v1?${new URLSearchParams(payload).toString()}`

axios(url).then(response=>{
    const html=response.data
    console.log(html)
}).catch(err=>console.log("This is an error"+err))
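
To check the query string before spending an API call, the URL construction above can be run on its own. This is a minimal sketch: `buildApiUrl` is a hypothetical helper name, not part of any library, and the payload is a shortened copy of the one above.

```javascript
// Hypothetical helper (not part of axios or WebScrapingAPI):
// joins a base URL and a payload object into a full request URL.
function buildApiUrl(baseUrl, payload) {
    // URLSearchParams converts every value to a string and percent-encodes it,
    // so the nested target URL is safe to embed as a parameter.
    const query = new URLSearchParams(payload).toString();
    return `${baseUrl}?${query}`;
}

const payload = {
    api_key: '<YOUR_API_KEY>',
    url: 'https://sfbay.craigslist.org/d/software-qa-dba-etc/search/sof',
    render_js: 1,
    proxy_type: 'residential',
};

const apiUrl = buildApiUrl('https://api.webscrapingapi.com/v1', payload);
console.log(apiUrl);
```

Parsing the result back with `new URL(apiUrl).searchParams.get('url')` should return the original target URL, which is a quick way to confirm nothing was mangled by the encoding.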

For reference, here is the Puppeteer script you can test with:

import puppeteer from 'puppeteer-extra';
import { executablePath } from 'puppeteer'
import StealthPlugin from 'puppeteer-extra-plugin-stealth' 

(async () => {
    puppeteer.use(StealthPlugin())
    
    const browser = await puppeteer.launch({
        headless: false,
        executablePath: executablePath(),
    })
    const page = await browser.newPage()
    await page.goto('https://sfbay.craigslist.org/d/software-qa-dba-etc/search/sof')

    const html = await page.content()
    console.log(html)

    await browser.close()
})()
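
If you prefer to build the proxy system mentioned above yourself, a round-robin rotation could look roughly like the sketch below. The proxy hosts are placeholders, and `createProxyRotator` and `buildRequestConfig` are hypothetical names. The returned object follows the shape of the axios request config (`headers` and `proxy` with `protocol`, `host`, `port`). Whether this gets past the block depends on the reputation of the proxies you use.

```javascript
// Placeholder proxy pool (assumption): replace with proxies you actually have access to.
const PROXIES = [
    { protocol: 'http', host: 'proxy1.example.com', port: 8080 },
    { protocol: 'http', host: 'proxy2.example.com', port: 8080 },
    { protocol: 'http', host: 'proxy3.example.com', port: 8080 },
];

// Returns a function that hands out proxies from the pool in round-robin order.
function createProxyRotator(proxies) {
    if (proxies.length === 0) {
        throw new Error('proxy pool is empty');
    }
    let index = 0;
    return () => {
        const proxy = proxies[index];
        index = (index + 1) % proxies.length;
        return proxy;
    };
}

const nextProxy = createProxyRotator(PROXIES);

// Builds a per-request config; each call picks the next proxy in the pool.
function buildRequestConfig() {
    return {
        headers: {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
            'Accept-Language': 'en-US,en;q=0.5',
        },
        proxy: nextProxy(),
    };
}
```

With axios installed, each request would then be sent as `axios.get(targetUrl, buildRequestConfig())`, so consecutive requests leave through different proxies.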

Answered 2022-11-23
