How do I upload an image to Google Lens using Python?

2sbarzqh  posted on 2023-03-28 in Python

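I am trying to scrape **google-lens** with python requests, but I cannot find the request that uploads the image, or work out how to decode it.
The request whose response is the image analysis looks like this: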

import requests

cookies = {
    'CONSENT': 'PENDING+XXX',
    'SOCS': 'XXXXXXXXXXXXXXXXXXXXXXXXXXX',
    'HSID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    'SSID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    'APISID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    'SAPISID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    '__Secure-1PAPISID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    'SID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx.',
    '__Secure-1PSID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    'SIDCC': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    '__Secure-1PSIDCC': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    'AEC': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    'NID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    'OTZ': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    '__Secure-ENID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
}

headers = {
    'authority': 'lens.google.com',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'accept-language': 'de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7',
    'referer': 'https://lens.google.com/upload?hl=de-CH&re=df&st=1675340672651&plm=ChAIARIMCIDX7p4GEMDxtbYC&ep=gisbubb',
    'sec-ch-ua': '"Not_A Brand";v="99", "Google Chrome";v="109", "Chromium";v="109"',
    'sec-ch-ua-arch': '"x86"',
    'sec-ch-ua-bitness': '"64"',
    'sec-ch-ua-full-version': '"109.0.5414.120"',
    'sec-ch-ua-full-version-list': '"Not_A Brand";v="99.0.0.0", "Google Chrome";v="109.0.5414.120", "Chromium";v="109.0.5414.120"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-model': '""',
    'sec-ch-ua-platform': '"Windows"',
    'sec-ch-ua-platform-version': '"10.0.0"',
    'sec-ch-ua-wow64': '?0',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'same-origin',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36',
    'x-client-data': 'CJe2yQEIorbJAQjBtskBCKmdygEIte3KAQiTocsBCPKCzQEIv4TNAQiAjM0BCIiMzQEI14zNAQiGjc0BCMeNzQEI1Y7NAQj2js0BCNLhrAII8fStAg==',
}

p = "AfVzNa_TGIdeDaL6ZaPXF7Wx8FCDSF8grbjYLUPXuk5_7Ia3vUCoQ5BUa8slWojngiUp-88dvc59Ohx3_22wAH3GXJHgaT-bLnpAm0r-5YjYIErXRCYJJ0ndUQUxxdF1JptYTdjqaEXXRR87igdc_xBCpxGpdXkXrf7Nf226SST0MdF3vF7mmtvJyklqA8494byV6bj_I92D3vihWglO3OV6phVD1zsqVyfSU_qZvtuEPEA59LETwQ4SKlztDy0fMWmBGgCsXiCuz2bWH2bOIRqUFo0stSVAvscHpY0iIVcEyRYQhXBxRkibV6UvnSIK2w_JQZV7TP4AkRRBPCwy2iKu-KJS6R28OZ3ABqIth7IPDLGymZKQ20vl_HPjXBHAgHzZgFLTs-AfR7zkmsnyWQ9FB77YVA"

response = requests.get(
    'https://lens.google.com/search?p='+p+"%3D%3D&ep=gisbubb&hl=en-US&re=df&st=1675340672651&plm=ChAIARIMCIDX7p4GEMDxtbYCCg8IFRILCIDX7p4GENCgvHUKDwgWEgsIgNfungYQkM3CdQoPCBMSCwiA1%2B6eBhCA/MJ1ChAIFBIMCIDX7p4GEOjKj7MC",
    cookies=cookies,
    headers=headers,
)

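The p parameter in the URL looks like data to me, but:

  • It may be too short to be an image?
  • I cannot Base64-decode the string into an image. Any ideas?

In my example, p is: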

AfVzNa_TGIdeDaL6ZaPXF7Wx8FCDSF8grbjYLUPXuk5_7Ia3vUCoQ5BUa8slWojngiUp-88dvc59Ohx3_22wAH3GXJHgaT-bLnpAm0r-5YjYIErXRCYJJ0ndUQUxxdF1JptYTdjqaEXXRR87igdc_xBCpxGpdXkXrf7Nf226SST0MdF3vF7mmtvJyklqA8494byV6bj_I92D3vihWglO3OV6phVD1zsqVyfSU_qZvtuEPEA59LETwQ4SKlztDy0fMWmBGgCsXiCuz2bWH2bOIRqUFo0stSVAvscHpY0iIVcEyRYQhXBxRkibV6UvnSIK2w_JQZV7TP4AkRRBPCwy2iKu-KJS6R28OZ3ABqIth7IPDLGymZKQ20vl_HPjXBHAgHzZgFLTs-AfR7zkmsnyWQ9FB77YVA==

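I cannot find any other request carrying data in the Network tab while the image is uploading.
Also, how can I encode an image into a string like this with python?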

l3zydbqr  #1

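As you guessed, the data looks too short to be an image (whether in Base64 or any other encoding). We cannot say for certain what happens inside Google's image search, but the following comes to mind (search systems of this kind usually work this way):
The user first uploads the image to Google Lens, and Google then assigns the uploaded image an ID in its internal database. You see that ID as the p parameter in the search URL and in your code. The image search then uses that ID to refer to the uploaded image in its internal database.
To convince yourself that a string as short as p cannot hold an entire image, run base64.b64encode(open('path/to/image.png', 'rb').read()); the result is a very long string.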
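The size argument above can be checked with a little arithmetic. The following is a minimal sketch, not a decoder for Google's format: `max_decoded_bytes` is a hypothetical helper name, and 50 kB of random bytes stand in for a real image file so the snippet runs without any file on disk.

```python
import base64
import os


def max_decoded_bytes(b64_text: str) -> int:
    """Number of raw bytes a Base64 string of this length can carry."""
    unpadded = b64_text.rstrip("=")
    # Every 4 Base64 characters encode 3 bytes.
    return len(unpadded) * 3 // 4


# The p value from the question (URL-safe alphabet, padding removed).
P = "AfVzNa_TGIdeDaL6ZaPXF7Wx8FCDSF8grbjYLUPXuk5_7Ia3vUCoQ5BUa8slWojngiUp-88dvc59Ohx3_22wAH3GXJHgaT-bLnpAm0r-5YjYIErXRCYJJ0ndUQUxxdF1JptYTdjqaEXXRR87igdc_xBCpxGpdXkXrf7Nf226SST0MdF3vF7mmtvJyklqA8494byV6bj_I92D3vihWglO3OV6phVD1zsqVyfSU_qZvtuEPEA59LETwQ4SKlztDy0fMWmBGgCsXiCuz2bWH2bOIRqUFo0stSVAvscHpY0iIVcEyRYQhXBxRkibV6UvnSIK2w_JQZV7TP4AkRRBPCwy2iKu-KJS6R28OZ3ABqIth7IPDLGymZKQ20vl_HPjXBHAgHzZgFLTs-AfR7zkmsnyWQ9FB77YVA"

# Assumption: random bytes stand in for the contents of a real image file.
fake_image = os.urandom(50_000)
encoded_image = base64.urlsafe_b64encode(fake_image).decode("ascii")

print("bytes p could hold:        ", max_decoded_bytes(P))
print("Base64 length of the image:", len(encoded_image))
```

Even a modest 50 kB file needs tens of thousands of Base64 characters, while p can carry only a few hundred bytes, which fits the idea that p is a reference to the upload and not the upload itself.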
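If you watch the Network tab in Google Chrome more closely, you will notice that the user is first redirected to an address like https://lens.google.com/upload?re=df&st=some_number_here&plm=internal_database_identifier, and is then redirected to the main search page with the p parameter in the address bar.
So the best way to use Google image search is through an official API and client library. If you insist on an unofficial route, something like selenium can behave like a browser and obtain the parameter you are looking for.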
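Once a browser-driven session has landed on the search page, reading p back out of the address is plain URL parsing. Here is a small sketch using only the standard library; `extract_p` is a hypothetical helper, and the example URL is made up to mirror the shape of the one in the question (it is not a real token).

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_p(search_url: str) -> Optional[str]:
    """Return the percent-decoded p query parameter, or None if it is absent."""
    query = parse_qs(urlsplit(search_url).query)
    values = query.get("p")
    return values[0] if values else None


# Made-up URL with the same structure as the search URL in the question.
example_url = (
    "https://lens.google.com/search"
    "?p=EXAMPLE_TOKEN%3D%3D&ep=gisbubb&hl=en-US&re=df"
)
print(extract_p(example_url))  # EXAMPLE_TOKEN==
```

With selenium, the string to pass in would be the browser's current URL after the redirect has finished.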

46qrfjad  #2

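As an alternative, there is the Google Lens API from SerpApi. It is a paid API with a free plan, and it handles blocks and parsing on its backend.
To upload an image, you specify its URL: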

params = {
  "url": "https://i.imgur.com/HBrB8p0.png", # URL of an image to perform the Google Lens search
  # other parameters
}

A simple example (also available in the online IDE):

from serpapi import GoogleSearch
import json

params = {
  "engine": "google_lens",                  # search engine. Google, Bing, Yahoo, Naver, Baidu...
  "url": "https://i.imgur.com/HBrB8p0.png", # URL of an image to perform the Google Lens search
  "api_key": "..."                          # serpapi key, https://serpapi.com/manage-api-key
}

search = GoogleSearch(params)               # where data extraction happens on the backend
results = search.get_dict()                 # JSON -> Python dict
visual_matches = results["visual_matches"]

print(json.dumps(visual_matches, indent=2, ensure_ascii=False))

Example output:

  {
    "position": 1,
    "title": "File:Danny DeVito by Gage Skidmore.jpg - Wikipedia",
    "link": "https://en.wikipedia.org/wiki/File:Danny_DeVito_by_Gage_Skidmore.jpg",
    "source": "wikipedia.org",
    "source_icon": "https://encrypted-tbn2.gstatic.com/favicon-tbn?q=tbn:ANd9GcRLIcgPlYxOaoBg0MnSobLYyflrgF_RLdwAY09AXHWGy2jqWQnuIBNCY5I1BuzY7jeAJga0y0b9htBHe94i3Pg4B0NhHMNDVsmS-FVRKL014d-Xf6sX",
    "thumbnail": "https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcTROFuWxRRMPIYZy_jI-rJ2Z6Ovi5R5ud7hipf-q6KXc-xmbObf"
  },
  {
    "position": 2,
    "title": "Danny DeVito - Wikipèdia",
    "link": "https://oc.wikipedia.org/wiki/Danny_DeVito",
    "source": "wikipedia.org",
    "source_icon": "https://encrypted-tbn0.gstatic.com/favicon-tbn?q=tbn:ANd9GcS4JBZumVMngTbiohbN1v6btphEzriH0ywY2I43F4DsPvX2g0xPTtw7__HJ8V2-eWBrdSIZseWxYn2gcf4EVcXzwui5ASKpxKMJ1cc12u-2O4Y4-jwj",
    "thumbnail": "https://encrypted-tbn3.gstatic.com/images?q=tbn:ANd9GcRngxWi4JZCYtYGvkwtbjm_z0v5Jt5uWTeR7QjBS0D10LiLd7sU"
  },
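The entries above are plain dictionaries, so picking out a few fields is straightforward. The sketch below assumes only the keys visible in the sample output (`title`, `link`) and the `visual_matches` key used in the code; `titles_and_links` is a hypothetical helper, and the sample data is copied from the output above so no API call is made.

```python
from typing import Any, Dict, List, Tuple


def titles_and_links(results: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Collect (title, link) pairs, skipping entries that lack either field."""
    pairs = []
    # .get() avoids a KeyError when the response has no visual matches.
    for match in results.get("visual_matches", []):
        title = match.get("title")
        link = match.get("link")
        if title and link:
            pairs.append((title, link))
    return pairs


# Trimmed copy of the sample output shown above.
sample_results = {
    "visual_matches": [
        {
            "position": 1,
            "title": "File:Danny DeVito by Gage Skidmore.jpg - Wikipedia",
            "link": "https://en.wikipedia.org/wiki/File:Danny_DeVito_by_Gage_Skidmore.jpg",
            "source": "wikipedia.org",
        },
        {
            "position": 2,
            "title": "Danny DeVito - Wikipèdia",
            "link": "https://oc.wikipedia.org/wiki/Danny_DeVito",
            "source": "wikipedia.org",
        },
    ]
}

for title, link in titles_and_links(sample_results):
    print(f"{title} -> {link}")
```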

You can go to the Google Lens playground to try different image upload options.
Disclaimer: I work for SerpApi.
