Python: scraping atptour.com API returns what looks like encrypted data

Posted by luaexgnf on 2023-06-20 in Python
I am trying to scrape match statistics from the page below:

https://www.atptour.com/en/scores/stats-centre/archive/2022/407/MS002

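Rather than building a complex Selenium scraper to read the JavaScript-rendered parts, I thought I would try to find an API that I could call with requests.
Looking at the Network tab, I think this is the place to start: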

requests.get("https://itp-atp-sls.infosys-platforms.com/prod/api/match-beats/status/year/2022/eventId/407/matchId/MS002")

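This does return a result, but it is gibberish (at least to me).
I am guessing it is some kind of encrypted response. Is there a way to decrypt it the way the browser does?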

Edit:

Here is the response:

{"lastModified":1663265556422,"response":"hlXzkPyyhwUYql2Nwl/3AAcRSsZHKf5LyqsAHqSWjP+ZHzfdmQ7bG2cOrf3YxwcZFIlsJNLJOSL/dSj/fFtjWHkeQd21inSUPOkbu2hSD2xMxEkyss8rOIVJAx6NmY9sap852VtmTc2CT4TdXXRduEK4fXASReIX3Eb9V+TMs24t5ow6w8aau+GWZLP9b32ALs4IZeea+dE3YcKtYrZOu/bV7ZLSawlontkgGN9s4QSjUhv43ifxkS6oDHGFkh+4pjjqfLDa2c0fA28otRZUF4uz+UvYAW2b9hZxBVJQU0E45Bf/myuQjZ14KtQr0NdxAMq53PZlki2hRVtnCDErA2e26cK9/bkC6Pz/J0N7rosTYw6TtDRGPYeqM3z645Uew3f3vEcSQLkWWxi1txQPxTbn1MT4HzRtnAbGJOF+GeaAKbwtSt2B86iHjkyEJ+ssmIMsARRjUmhdFmsMF6vuqA5pSgxvYTacg/yzZvy6HVhZBqTpPcaRJGt41efib3zQg8u++yKXdz8MnHicuz32w/osWzcMsC3Cwm5/a1tJZ48xFJdu8YgUsFS6ioNaO9V6vWz8imQZiPEZxd1FLfRynjS8LpvY3+83M2h+A0oExmcd4UaEMCqkklM1A7ssOXeDTqKS8UiZVM3zH6lzNI42QOZE+WYcPvwNzVLanJpZcKqlLupGfOiHuUclEwKrBL8h3wHtU6UmU+VoPJQM82b4pv5vJY/qlUgjLnaWk18A5UV9MF2b81iI3T8i4U8KGeovMhVLdq7YRZFdBG9djQgPRzwfofB/LRz5+aTwKwiTTsmvy4DMP/2iCB7Eiqr7OaKtuaj1n6vt2MdIstqTz/nDEkjLcdrspajdqHnTfUYLEVJvns6KPIKQaQ61I71G7vkEG4MtZ3PRgGy7/zR/B2qAzhaJmHYMZtOfE2OPcPXi3wi9tTYObYaGzpQIqkFGUtpa862bq8qMSXVUpfb8dvDTOyuvURD9FmSHeDHiO6DYhqxqQrfw1aRHK0vu6QcSsGF31vYnrRGR48nZgouqyzUv90Nc9hvyXBcEaYZpCG2qbAArBseD+RRtXeWV1yvV+C7oy68JOxgLJaL1AsLPX81WV9maPy2Ns3IJ64iNvKMebWFtETNtDPIs5amm+wFjERiQ85DK70wucEd3lWWQr7UddSO8U72whJXGbtsC2onskI75uLF3n7XX4goaHrj0IVB3kVqc4O1zMXWvCzype2EerR2E9K/qoBWh5PQRc4bPhrNdoYGSAh18AKtzVOqPgNgzXnW591r4pWMrWW8Tww89sayPZUnxOwDIaf6kFP74+34K+ZWKGVJA9YBPpKfGAfMgOYalnB7YMA4Tn4Hmt4OQtPeArwgR4DBW+HiQ+aFNK04="}
Answer 1 (pbwdgjma)

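Better late than never!

In short: the data is encrypted with AES-128 in CBC mode. The 16-character key is derived from the lastModified field of the response, and the IV is the same string in upper case.

This is the snippet in their front end that decodes the data: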

var Ou = function(t) {
        var e = function(t) {
            var e = (new Date).getTimezoneOffset()
              , n = new Date(t.getTime() + 60 * e * 1e3).getDate()
              , r = parseInt((n < 10 ? "0" + n : n).toString().split("").reverse().join(""))
              , i = t.getFullYear()
              , a = parseInt(i.toString().split("").reverse().join(""))
              , o = parseInt(t.getTime().toString(), 16).toString(36) + ((i + a) * (n + r)).toString(24)
              , s = o.length;
            if (s < 14)
                for (var c = 0; c < 14 - s; c++)
                    o += "0";
            else
                s > 14 && (o = o.substr(0, 14));
            return "#" + o + "$"
        }(new Date(t.lastModified))
          , n = Jo.a.enc.Utf8.parse(e)
          , r = Jo.a.enc.Utf8.parse(e.toUpperCase())
          , i = Jo.a.AES.decrypt(t.response, n, {
            iv: r,
            mode: Jo.a.mode.CBC,
            padding: Jo.a.pad.Pkcs7
        });
        return JSON.parse(i.toString(Jo.a.enc.Utf8))
    };
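
Since the question is about Python, the key derivation in the snippet above can be ported roughly as follows. This is a sketch rather than the site's own code: `to_base` and `derive_key` are names chosen here, and it assumes UTC for both the day and the year, whereas the JavaScript reads the year in the browser's local time zone, so the two could disagree for timestamps close to a year boundary.

```python
from datetime import datetime, timezone

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base(value: int, base: int) -> str:
    """Render a non-negative integer in the given base (2..36),
    mirroring JavaScript's Number.prototype.toString(base)."""
    if value == 0:
        return "0"
    out = []
    while value:
        value, rem = divmod(value, base)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def derive_key(last_modified_ms: int) -> str:
    """Build the 16-character key string from a lastModified timestamp (ms)."""
    # Assumption: UTC is used for both day and year (see the note above).
    moment = datetime.fromtimestamp(last_modified_ms // 1000, tz=timezone.utc)
    day = moment.day
    day_reversed = int(f"{day:02d}"[::-1])      # 15 -> "15" -> "51" -> 51
    year = moment.year
    year_reversed = int(str(year)[::-1])        # 2022 -> 2202
    # The decimal digits of the timestamp are read as hex, then written in base 36.
    head = to_base(int(str(last_modified_ms), 16), 36)
    tail = to_base((year + year_reversed) * (day + day_reversed), 24)
    # Pad with "0" or truncate to 14 characters, then wrap in "#" and "$".
    body = (head + tail).ljust(14, "0")[:14]
    return f"#{body}$"


if __name__ == "__main__":
    print(derive_key(1663265556422))
```

The result is 16 characters long, which is what makes it usable as a 128-bit AES key once it is encoded as UTF-8.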

I refactored it a little. If you are using Node.js, install the dependency with npm i crypto-js first, and you will get the decoded data:

// const CryptoJS = require("crypto-js"); // If Nodejs

const data = {"lastModified":1663265556422,"response":"hlXzkPyyhwUYql2Nwl/3AAcRSsZHKf5LyqsAHqSWjP+ZHzfdmQ7bG2cOrf3YxwcZFIlsJNLJOSL/dSj/fFtjWHkeQd21inSUPOkbu2hSD2xMxEkyss8rOIVJAx6NmY9sap852VtmTc2CT4TdXXRduEK4fXASReIX3Eb9V+TMs24t5ow6w8aau+GWZLP9b32ALs4IZeea+dE3YcKtYrZOu/bV7ZLSawlontkgGN9s4QSjUhv43ifxkS6oDHGFkh+4pjjqfLDa2c0fA28otRZUF4uz+UvYAW2b9hZxBVJQU0E45Bf/myuQjZ14KtQr0NdxAMq53PZlki2hRVtnCDErA2e26cK9/bkC6Pz/J0N7rosTYw6TtDRGPYeqM3z645Uew3f3vEcSQLkWWxi1txQPxTbn1MT4HzRtnAbGJOF+GeaAKbwtSt2B86iHjkyEJ+ssmIMsARRjUmhdFmsMF6vuqA5pSgxvYTacg/yzZvy6HVhZBqTpPcaRJGt41efib3zQg8u++yKXdz8MnHicuz32w/osWzcMsC3Cwm5/a1tJZ48xFJdu8YgUsFS6ioNaO9V6vWz8imQZiPEZxd1FLfRynjS8LpvY3+83M2h+A0oExmcd4UaEMCqkklM1A7ssOXeDTqKS8UiZVM3zH6lzNI42QOZE+WYcPvwNzVLanJpZcKqlLupGfOiHuUclEwKrBL8h3wHtU6UmU+VoPJQM82b4pv5vJY/qlUgjLnaWk18A5UV9MF2b81iI3T8i4U8KGeovMhVLdq7YRZFdBG9djQgPRzwfofB/LRz5+aTwKwiTTsmvy4DMP/2iCB7Eiqr7OaKtuaj1n6vt2MdIstqTz/nDEkjLcdrspajdqHnTfUYLEVJvns6KPIKQaQ61I71G7vkEG4MtZ3PRgGy7/zR/B2qAzhaJmHYMZtOfE2OPcPXi3wi9tTYObYaGzpQIqkFGUtpa862bq8qMSXVUpfb8dvDTOyuvURD9FmSHeDHiO6DYhqxqQrfw1aRHK0vu6QcSsGF31vYnrRGR48nZgouqyzUv90Nc9hvyXBcEaYZpCG2qbAArBseD+RRtXeWV1yvV+C7oy68JOxgLJaL1AsLPX81WV9maPy2Ns3IJ64iNvKMebWFtETNtDPIs5amm+wFjERiQ85DK70wucEd3lWWQr7UddSO8U72whJXGbtsC2onskI75uLF3n7XX4goaHrj0IVB3kVqc4O1zMXWvCzype2EerR2E9K/qoBWh5PQRc4bPhrNdoYGSAh18AKtzVOqPgNgzXnW591r4pWMrWW8Tww89sayPZUnxOwDIaf6kFP74+34K+ZWKGVJA9YBPpKfGAfMgOYalnB7YMA4Tn4Hmt4OQtPeArwgR4DBW+HiQ+aFNK04="};

function decode(data) {
  var e = formatDate(new Date(data.lastModified))
    , n = CryptoJS.enc.Utf8.parse(e)
    , r = CryptoJS.enc.Utf8.parse(e.toUpperCase())
    , i = CryptoJS.AES.decrypt(data.response, n, {
      iv: r,
      mode: CryptoJS.mode.CBC,
      padding: CryptoJS.pad.Pkcs7
    });
  return JSON.parse(i.toString(CryptoJS.enc.Utf8))
};

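function formatDate(t) {
  // Shift by the local timezone offset so getDate() yields the UTC day of month.
  var offsetMinutes = (new Date).getTimezoneOffset();
  var day = new Date(t.getTime() + 60 * offsetMinutes * 1e3).getDate();
  // Day (zero-padded to two digits) and year, each with its digits reversed.
  var dayReversed = parseInt((day < 10 ? "0" + day : day).toString().split("").reverse().join(""), 10);
  var year = t.getFullYear();
  var yearReversed = parseInt(year.toString().split("").reverse().join(""), 10);
  // Timestamp digits read as hex and written in base 36, followed by a base-24 suffix.
  var o = parseInt(t.getTime().toString(), 16).toString(36) + ((year + yearReversed) * (day + dayReversed)).toString(24);
  // Pad with "0" or truncate to 14 characters; "#" and "$" bring the key to 16 characters.
  o = o.padEnd(14, "0").slice(0, 14);
  return "#" + o + "$";
}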

console.log(decode(data));
<script src="https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js"></script>
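
For completeness, the decryption step could look like this in Python. It is only a sketch: it assumes the third-party pycryptodome package (pip install pycryptodome), the names `key_and_iv` and `decrypt_response` are made up here, and the 16-character key string (built the way formatDate builds it) is passed in as a parameter.

```python
import base64
import json


def key_and_iv(key_text: str) -> tuple[bytes, bytes]:
    """Key is the UTF-8 key string; IV is the same string upper-cased."""
    return key_text.encode("utf-8"), key_text.upper().encode("utf-8")


def decrypt_response(response_b64: str, key_text: str) -> dict:
    """Decrypt the Base64 'response' field with AES-128-CBC and PKCS#7 padding."""
    # Imported here so key_and_iv stays usable without pycryptodome installed.
    from Crypto.Cipher import AES
    from Crypto.Util.Padding import unpad

    key, iv = key_and_iv(key_text)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    padded = cipher.decrypt(base64.b64decode(response_b64))
    return json.loads(unpad(padded, AES.block_size).decode("utf-8"))
```

With the JSON from the question loaded into a dict named payload, the call would be decrypt_response(payload["response"], key), where key is the string derived from payload["lastModified"].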

Hope that helps :)
