Scrapy: how can I get useful data out of the XHR responses on a Facebook page?

Posted by tzdcorbm 12 months ago in Other

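I am trying to get all of my friends' birthdays by scraping my Facebook page. Because Facebook uses XHR calls to load the friends' names on the "Birthdays" events page, I looked at the network activity in Chrome DevTools to work out where and how it makes those XHR calls and what the response data looks like.
The responses to these calls make no sense to me. They look as if they have been obfuscated in some way.
Here is the response data: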

for (;;); {
    "__ar": 1,
    "payload": null,
    "domops": [
        ["replace", "#birthdays_pager", false, {
            "__html": "\u003Cdiv class=\"_4-u2 _tzh _fbBirthdays__monthCard _4-u8\">\u003Cdiv class=\"_4-u3 _5dwa _5dw9\" id=\"birthdays_monthly_card_1522566000\">\u003Cspan class=\"_38my\">April\u003Cspan class=\"_c1c\">\u003C\/span>\u003C\/span>\u003Cspan class=\"_5dw8\">\u003Cdiv class=\"_tzj\">\u003Ca href=\"https:\/\/www.facebook.com\/kajal.chaudhary.5492\">Kajal Chaudhary\u003C\/a>, \u003Ca href=\"https:\/\/www.facebook.com\/shreesha.bhat.963\">Shreesha Bhat Galimane\u003C\/a> and 19 others\u003C\/div>\u003C\/span>\u003Cdiv class=\"_3s3-\">\u003C\/div>\u003C\/div>\u003Cdiv class=\"_4-u3\">\u003Cdiv class=\"_43qm _tzu _43q9\">\u003Cul class=\"uiList _4cg3 _509- _4ki\">\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/satish.ven.58\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Satish Ven (4\/2)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/27540391_2084438825122967_6451048031944951645_n.jpg?oh=77383450a07722e1a44bf39c6d2c12f7&oe=5B19517E\" alt=\"Satish Ven\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/sheshufirefox\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Sheshadri Sharma (4\/6)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/16998890_758118407695665_4675113951836594565_n.jpg?oh=946ce323c5b3824fbf8dbbe59fd9160f&oe=5B02616B\" alt=\"Sheshadri Sharma\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/aayush.sinha.146\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Aayush Sinha (4\/8)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/10968514_981734045189455_1830626709337028270_n.jpg?oh=428a495a9379b6b2202408aa5284923b&oe=5B12711E\" alt=\"Aayush 
Sinha\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/pranav.ys.5\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Pranav YS (4\/11)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/12541116_1676590859264773_7240167064125691378_n.jpg?oh=3d4d0b034a06ecf460b8668fcdd0fad2&oe=5AD7EEAA\" alt=\"Pranav YS\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/profile.php?id=100012822522252\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Pankaj Thakur (4\/11)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/27752205_468390053598408_3401567454276318428_n.jpg?oh=1f2fb7ee2da724506757029fdb8a46b2&oe=5B1F151B\" alt=\"Pankaj Thakur\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/prajwal.bhadravathiravi\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Prajwal Bhadravathi Ravi (4\/11)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c10.0.57.57\/p57x57\/26000914_361577560980364_1712446738221265545_n.jpg?oh=370dc4419b0767b7e79bc27e854bc06b&oe=5B03D96B\" alt=\"Prajwal Bhadravathi Ravi\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/sachinr.doddaguni\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Sachin R Doddaguni (4\/12)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c17.0.57.57\/p57x57\/10354686_10150004552801856_220367501106153455_n.jpg?oh=21091066fea75337ac98a3cf1f341740&oe=5B16DBF3\" alt=\"Sachin R Doddaguni\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca 
href=\"https:\/\/www.facebook.com\/kajal.chaudhary.5492\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Kajal Chaudhary (4\/14)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/23316350_1440015076106508_6579302328578807067_n.jpg?oh=4e3fc491c9a32f9581286452933b1e50&oe=5B227D7E\" alt=\"Kajal Chaudhary\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/usha.shastri.54\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Usha Shastri (4\/14)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c0.0.57.57\/p57x57\/10152373_10203424125546031_1766227792_n.jpg?oh=8b0e95a8a60e09c79005a84f3c6a8b98&oe=5B225FD5\" alt=\"Usha Shastri\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/ashish.dwivedi.39566\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Ashish Dwivedi (4\/14)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/26239697_2005584583047675_396510917460842524_n.jpg?oh=eab2bd118623e449e2dcefa3fb64899e&oe=5B021392\" alt=\"Ashish Dwivedi\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/shreesha.bhat.963\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Shreesha Bhat Galimane (4\/15)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/27541088_2089364294633310_7146912677909552069_n.jpg?oh=e16a6a514982d8f15ae0a1c81a719752&oe=5B1B0577\" alt=\"Shreesha Bhat Galimane\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/chethanhr.chazz\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" 
data-tooltip-content=\"Chethan Vilas (4\/16)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/13620256_595294770631304_6000009159215075898_n.jpg?oh=b5b3ea3db6040e8a79233a7e90c916a9&oe=5B1C56BD\" alt=\"Chethan Vilas\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/kshitija.kallesh\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Kshitija Vidya Kallesh (4\/18)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/26733802_1408840822577029_120794789359415364_n.jpg?oh=19b7eada0711726990750fb6cf4add09&oe=5B03C8E7\" alt=\"Kshitija Vidya Kallesh\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/vishesh.ug\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Vishesh Umesh Gujjar (4\/18)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/21192739_1939744576303216_5388844998198614270_n.jpg?oh=24d78a736265c7c8c0adeb54324f5894&oe=5B08520E\" alt=\"Vishesh Umesh Gujjar\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/santosh.bhat.7359\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Santosh Bhat (4\/18)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/19149245_1892258634388875_8828676164364322774_n.jpg?oh=741b3bf6f9080726d54251044ba34355&oe=5B09B6F5\" alt=\"Santosh Bhat\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/profile.php?id=100007305601325\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Rahul Kumar (4\/20)\">\u003Cimg class=\"_s0 _ry img\" 
src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c90.210.540.540\/s57x57\/21685964_1884540188466150_8711746607997503911_n.jpg?oh=09c4c1f9f707950987c0eb70e7f3ad58&oe=5B1329B8\" alt=\"Rahul Kumar\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/sumantha.murali\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Sumanth Sharma (4\/22)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c1.0.57.57\/p57x57\/20663860_764480790380420_5549902384541679375_n.jpg?oh=132d2d9ec2b0b1f77f83620fc1efeb2a&oe=5B044E32\" alt=\"Sumanth Sharma\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/archana.kashyap.90226\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Sweekruthi Kashyap (4\/22)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/26814579_2066880066876357_3732840647672074955_n.jpg?oh=e27797ae7d2fcfa8ca23cf06bc36dbb9&oe=5B081963\" alt=\"Sweekruthi Kashyap\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/vinayaka.cbg\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Vinayaka Bhat Galimane (4\/23)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/25994873_1536622213073481_4403814656121225467_n.jpg?oh=d26a01066699d858d20bfa367fba02a4&oe=5B0EC392\" alt=\"Vinayaka Bhat Galimane\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/profile.php?id=100004456147835\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Pruthvi Kalyan Reddy (4\/28)\">\u003Cimg class=\"_s0 _ry img\" 
src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/14563438_693932530765279_8227735103682834751_n.jpg?oh=17a9ef5cfa963fe9902bc16e94e6b51d&oe=5B0F7604\" alt=\"Pruthvi Kalyan Reddy\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/kushal.kushu.31\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Kushal Kushu (4\/29)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/19366450_1032761963526766_3567943503656629473_n.jpg?oh=cdf7c11db05db93d8fd0e966d816ea98&oe=5B1B0A36\" alt=\"Kushal Kushu\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003C\/ul>\u003C\/div>\u003C\/div>\u003C\/div>\u003Cdiv class=\"clearfix uiMorePager stat_elem _52jv\" id=\"birthdays_pager\">\u003Cdiv>\u003Ca rel=\"ajaxify\" href=\"\/async\/birthdays\/?date=1525158000\" class=\"pam uiBoxLightblue uiMorePagerPrimary\">May\u003Ci class=\"mhs mts arrow img sp_m7lN5cdLBIi sx_fa6ba6\">\u003C\/i>\u003C\/a>\u003Cspan class=\"uiMorePagerLoader pam uiBoxLightblue\">\u003Cimg class=\"img\" src=\"https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yb\/r\/GsNJNwuI-UM.gif\" alt=\"\" width=\"16\" height=\"11\" \/>\u003C\/span>\u003C\/div>\u003C\/div>"
        }]
    ],
    "jsmods": {
        "instances": [
            ["__inst_1c03405d_i_0", ["MorePagerFetchOnScroll", "__elem_1c03405d_i_0"],
                [{
                    "__m": "__elem_1c03405d_i_0"
                }, 0, true], 1
            ]
        ],
        "elements": [
            ["__elem_1c03405d_i_0", "birthdays_pager", 1]
        ],
        "require": [
            ["__inst_1c03405d_i_0"],
            ["Tooltip"]
        ]
    },
    "js": ["lbOvC", "I1Wyg", "iaXyh", "RIWAf"],
    "css": ["trv4T", "eyM74", "0wVzo", "YGsVX", "rwXTv", "hAqW4", "bTiWO"],
    "bootloadable": {
        "TimeSliceInteractionsLiteTypedLogger": {
            "resources": ["ZN6iu", "lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "WebSpeedInteractionsTypedLogger": {
            "resources": ["lbOvC", "lTQVw", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "AsyncDOM": {
            "resources": ["lbOvC", "trv4T", "d25Q1"],
            "needsAsync": 1,
            "module": 1
        },
        "Dialog": {
            "resources": ["lbOvC", "YGsVX", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "ErrorSignal": {
            "resources": ["lbOvC", "trv4T", "eVg16", "CHoRV"],
            "needsAsync": 1,
            "module": 1
        },
        "ExceptionDialog": {
            "resources": ["vdrq6", "lbOvC", "JeUwF", "YGsVX", "trv4T", "mzeym", "eVg16", "taIOX", "iaXyh"],
            "needsAsync": 1,
            "module": 1
        },
        "PageTransitions": {
            "resources": ["lbOvC", "np5Vl", "trv4T", "eVg16", "I1Wyg", "iaXyh"],
            "needsAsync": 1,
            "module": 1
        },
        "ReactDOM": {
            "resources": ["lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "QuickSandSolver": {
            "resources": ["lbOvC", "Klc20", "trv4T", "+ClWy", "6Q\/Yd"],
            "needsAsync": 1,
            "module": 1
        },
        "ConfirmationDialog": {
            "resources": ["oE4Do", "lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "Banzai": {
            "resources": ["lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "BanzaiODS": {
            "resources": ["lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "ResourceTimingBootloaderHelper": {
            "resources": ["lbOvC", "CHoRV"],
            "needsAsync": 1,
            "module": 1
        },
        "TimeSliceHelper": {
            "resources": ["WmPot", "lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "ContextualLayerInlineTabOrder": {
            "resources": ["lbOvC", "b2zWq", "Nv4jJ", "YGsVX", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "BanzaiStream": {
            "resources": ["lbOvC", "ZU1ro", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "SnappyCompressUtil": {
            "resources": ["lbOvC"],
            "needsAsync": 1,
            "module": 1
        },
        "KeyEventTypedLogger": {
            "resources": ["lbOvC", "trv4T", "VMKqM"],
            "needsAsync": 1,
            "module": 1
        }
    },
    "resource_map": {
        "ZN6iu": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yJ\/r\/r98JDkrPdB7.js",
            "crossOrigin": 1
        },
        "lbOvC": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3iQRw4\/y-\/l\/en_US\/5WZyEzO-yKR.js",
            "crossOrigin": 1
        },
        "trv4T": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/y5\/l\/0,cross\/Hams2CQ6T8x.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "lTQVw": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yk\/r\/8v3L65OKN6U.js",
            "crossOrigin": 1
        },
        "d25Q1": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yW\/r\/2Hfsrn8zSCU.js",
            "crossOrigin": 1
        },
        "YGsVX": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yN\/l\/0,cross\/tw4_CoryHby.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "eVg16": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3iPWO4\/y7\/l\/en_US\/zUpriHPHyi0.js",
            "crossOrigin": 1
        },
        "CHoRV": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3i3pY4\/yB\/l\/en_US\/QJ9nYHU0qO9.js",
            "crossOrigin": 1
        },
        "vdrq6": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3itvn4\/yT\/l\/en_US\/6_7pVZCnDMo.js",
            "crossOrigin": 1
        },
        "JeUwF": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/y_\/r\/ash8xOAZVK-.js",
            "crossOrigin": 1
        },
        "mzeym": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3i2nZ4\/y4\/l\/en_US\/SE27RbSq37K.js",
            "crossOrigin": 1
        },
        "taIOX": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3if8X4\/yf\/l\/en_US\/I3G_M2Fe60k.js",
            "crossOrigin": 1
        },
        "iaXyh": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3idkl4\/y-\/l\/en_US\/Wcgyvl_N-Xj.js",
            "crossOrigin": 1
        },
        "np5Vl": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yF\/r\/arfpg0J9xVr.js",
            "crossOrigin": 1
        },
        "I1Wyg": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3i4KP4\/yO\/l\/en_US\/SZb_o9LvjeN.js",
            "crossOrigin": 1
        },
        "Klc20": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yS\/r\/fPmoZFDHfot.js",
            "crossOrigin": 1
        },
        "+ClWy": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yF\/r\/rhy6VMHHsHB.js",
            "crossOrigin": 1
        },
        "6Q\/Yd": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3iGqd4\/yn\/l\/en_US\/zcxRQpdn3KC.js",
            "crossOrigin": 1
        },
        "oE4Do": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yW\/r\/STvuQMoVsgo.js",
            "crossOrigin": 1
        },
        "WmPot": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yA\/r\/KOciABKx4w7.js",
            "crossOrigin": 1
        },
        "b2zWq": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yj\/r\/1Q-q4laVvzx.js",
            "crossOrigin": 1
        },
        "Nv4jJ": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3ivjx4\/y7\/l\/en_US\/-wVIYTKb-J1.js",
            "crossOrigin": 1
        },
        "ZU1ro": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/ym\/r\/tnX8h1hMAqX.js",
            "crossOrigin": 1
        },
        "VMKqM": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yq\/r\/VX_g1H0zcZv.js",
            "crossOrigin": 1
        },
        "eyM74": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/y3\/l\/0,cross\/0uxWhoQ2bKZ.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "0wVzo": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/y4\/l\/0,cross\/acUhycgW0b0.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "rwXTv": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yq\/l\/0,cross\/x7EQi00Ge7H.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "hAqW4": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/ya\/l\/0,cross\/01llQAe-xml.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "bTiWO": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/ya\/l\/0,cross\/Jxgn8lU3xE2.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "RIWAf": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3ikeI4\/yW\/l\/en_US\/jLnWPpCtWMp.js",
            "crossOrigin": 1
        }
    },
    "ixData": {},
    "gkxData": {
        "AT4kYIk7PhRqUACJJM8qs58t-WNCoM2ZYe35b1xv03xf3OtmC7RfXVIT9hWB6yTOgfA": {
            "result": false,
            "hash": "AT5oUVeShxEj-wBy"
        },
        "AT6ospK-Tdqu5qRhy-TcAU0nIA_ctyO-ghWqmAEjf7bDt3FzGNFL8C4Kn6qbsJrp6oPJYeq6bUEntlvCEgoH4eYQlTJ0DsJar1ZABa0GLxyieQ": {
            "result": false,
            "hash": "AT5lUwvU9ACQ1puA"
        },
        "AT6Afdq0Tt2jEesGOMGnSRKoZIl2eQfQBS7ISXiYFG3RHN4ykkPiZeyWuKALtD0ObEVGeeZuAFKdYpfxlBzUUPkd": {
            "result": false,
            "hash": "AT616ipsS9Q6IRps"
        },
        "AT7IsskI4XB9V3_ZpKFnRxAvs6BVPIgSDbDcq24b8ToUAOY2pCaSzuagN7f_cNx9vGp7vgNftn1_SRfogFUNGS0K": {
            "result": true,
            "hash": "AT5Na-Nz7G8XKMru"
        },
        "AT52sTP_5lkBPKbNz2mUZWsbcEDkBzQg0lQckIsVf32rCwFPbCAUTv2-qAeYwt3QMKM": {
            "result": false,
            "hash": "AT7Pq-Rl8e-_XQMy"
        },
        "AT68bJwSI-83elN-7JSMMH9zt32KbiF6pW-XMlf6NViAJ3CbAk_16Vq8cK1tl1029_ApvFwINR8hmoci3nMKFTDhDCBp1wrvYQbOKq0pCjZpqA": {
            "result": false,
            "hash": "AT7iq4cEmcKTjkfp"
        },
        "AT6DanO60hgFT7juQEF_b5acv5amdrLzodvaFbz5tWF8DGQCmmf0_a7wsRZnn4yNp9kI3S6KXc87dzKSPpUSy11k": {
            "result": false,
            "hash": "AT6MaFQR8z-lSlRA"
        }
    },
    "lid": "6523134272703508330"
}

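It looks like the Facebook site somehow interprets this response data and renders the friends' names on the page. The data I want is in the page as it is finally rendered after scrolling all the way down. But when I use Python's "requests" module or view the page source, most of that HTML content is not there.
What should I do?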

zsbz8rwp #1

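This looks like a JSON response preceded by a "for (;;);" prefix, with the actual HTML contained in the __html field.
Because the data comes back in this form, you have to go through a few steps:
1. Load the JSON data
2. Create a Selector
3. Extract the data you need from the selector
For example, one way to get the names might be: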

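>>> import json
>>> from scrapy.selector import Selector
>>> # response_text is the raw XHR body; parsing starts at the first '{' to skip "for (;;);"
>>> data = json.loads(response_text[response_text.index('{'):])
>>> sel = Selector(text=data['domops'][0][3]['__html'])
>>> sel.xpath('//a/img/@alt').getall()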
['Satish Ven', 'Sheshadri Sharma', 'Aayush Sinha', 'Pranav YS', 'Pankaj Thakur', 'Prajwal Bhadravathi Ravi', 'Sachin R Doddaguni', 'Kajal Chaudhary', 'Usha Shastri', 'Ashish Dwivedi', 'Shreesha Bhat Galimane', 'Chethan Vilas', 'Kshitija Vidya Kallesh', 'Vishesh Umesh Gujjar', 'Santosh Bhat', 'Rahul Kumar', 'Sumanth Sharma', 'Sweekruthi Kashyap', 'Vinayaka Bhat Galimane', 'Pruthvi Kalyan Reddy', 'Kushal Kushu']
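The three steps above can also be sketched end to end using only the standard library. This is a minimal sketch and not the answerer's method: it swaps Scrapy's Selector for html.parser so that it runs without Scrapy installed, and it reads the data-tooltip-content attribute (which in the sample response holds text such as "Satish Ven (4/2)") to recover the birthday as well as the name. The names TooltipCollector and parse_birthdays, the tooltip pattern, and the shortened sample are assumptions made for illustration; Facebook's markup may differ or change.

```python
import json
import re
from html.parser import HTMLParser

# Assumed tooltip format, e.g. "Satish Ven (4/2)" -> name, month, day.
TOOLTIP_RE = re.compile(r"^(?P<name>.+?) \((?P<month>\d{1,2})/(?P<day>\d{1,2})\)$")


class TooltipCollector(HTMLParser):
    """Collects every data-tooltip-content attribute found on <a> tags."""

    def __init__(self):
        super().__init__()
        self.tooltips = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            value = dict(attrs).get("data-tooltip-content")
            if value:
                self.tooltips.append(value)


def parse_birthdays(response_text):
    """Return (name, month, day) tuples from one async birthdays response."""
    # Step 1: load the JSON, skipping the "for (;;);" prefix before the first "{".
    data = json.loads(response_text[response_text.index("{"):])
    results = []
    for op in data.get("domops", []):
        # Each op looks like ["replace", "#birthdays_pager", false, {"__html": ...}].
        if len(op) <= 3 or not isinstance(op[3], dict):
            continue
        # Step 2: parse the embedded HTML fragment.
        collector = TooltipCollector()
        collector.feed(op[3].get("__html", ""))
        # Step 3: extract the name and date from each tooltip.
        for tooltip in collector.tooltips:
            match = TOOLTIP_RE.match(tooltip)
            if match:
                results.append(
                    (match["name"], int(match["month"]), int(match["day"]))
                )
    return results


# Shortened, hypothetical sample modelled on the response in the question.
html_fragment = (
    '<ul><li><a href="https://www.facebook.com/satish.ven.58" '
    'data-tooltip-content="Satish Ven (4/2)"></a></li>'
    '<li><a href="https://www.facebook.com/sheshufirefox" '
    'data-tooltip-content="Sheshadri Sharma (4/6)"></a></li></ul>'
)
sample = "for (;;); " + json.dumps(
    {"__ar": 1, "domops": [["replace", "#birthdays_pager", False, {"__html": html_fragment}]]}
)

print(parse_birthdays(sample))
```

Iterating over every domops entry, rather than indexing [0][3] directly, is a design choice here so that a response with no matching entry yields an empty list instead of raising.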

Note that scraping Facebook is not a good idea; you are better off using their API.
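The sample response in the question also ends with a pager link (rel="ajaxify", href="/async/birthdays/?date=1525158000") that appears to point at the next month. Below is a hedged sketch of pulling that link out of a response, again with the standard library only. The host in BASE_URL, the names PagerLinkFinder and next_page, and the reading of the date parameter as a Unix timestamp are assumptions; nothing here sends a request, and whatever cookies or parameters the real endpoint expects cannot be known from the question.

```python
import json
from datetime import datetime, timezone
from html.parser import HTMLParser
from urllib.parse import parse_qs, urljoin, urlparse

BASE_URL = "https://www.facebook.com"  # assumed host for resolving relative links


class PagerLinkFinder(HTMLParser):
    """Remembers the href of the first <a rel="ajaxify"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("rel") == "ajaxify" and self.href is None:
            self.href = attrs.get("href")


def next_page(response_text):
    """Return (absolute_url, month_start_utc_or_None), or None if no pager link."""
    data = json.loads(response_text[response_text.index("{"):])
    finder = PagerLinkFinder()
    for op in data.get("domops", []):
        if len(op) > 3 and isinstance(op[3], dict):
            finder.feed(op[3].get("__html", ""))
    if finder.href is None:
        return None
    url = urljoin(BASE_URL, finder.href)
    # Assumption: the "date" query parameter is a Unix timestamp in seconds.
    values = parse_qs(urlparse(url).query).get("date")
    when = datetime.fromtimestamp(int(values[0]), tz=timezone.utc) if values else None
    return url, when


# Hypothetical sample containing only the pager part of the original response.
pager_html = (
    '<div id="birthdays_pager"><div>'
    '<a rel="ajaxify" href="/async/birthdays/?date=1525158000">May</a>'
    "</div></div>"
)
sample_response = "for (;;); " + json.dumps(
    {"domops": [["replace", "#birthdays_pager", False, {"__html": pager_html}]]}
)

print(next_page(sample_response))
```

A caller could request the returned URL and feed each new response through the same function until it returns None, which would roughly mirror what scrolling to the bottom of the page does.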

kgsdhlau #2

A legal solution might be to use stalkscan.com, or another similar Facebook scraping site built on the official API.

knpiaxh1 #3

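I am currently developing a Chrome extension that removes all suggested/sponsored content from the site, so this seems like a suitable solution. Keep in mind that Facebook changes its structure constantly, and because of some rather odd tricks, finding your way around the DOM tree is not easy.
For now, you can go to Events -> Birthdays and select all the dates with appropriate DOM selectors and logic. Some of them have to be loaded dynamically after scrolling; this can be handled with a MutationObserver. With this approach you scroll the birthdays page to the bottom once and then generate an array of all the name/birthday pairs. So the method is semi-automatic.
There is too much dynamically loaded content, too many dynamically changing ids/classes, and the site keeps changing over time, so going into any detail would be futile. If you can make sense of their DOM tree and have a basic grasp of DOM manipulation, CSS selectors and Chrome extension development (which is surprisingly easy), you can get there :)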
