Scraping multiple pages with Python, but the other pages show the same results as the first page

n7taea2i  posted on 2022-11-26  in Python

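I'm trying to scrape data with bs4. For each page I want to collect all the product IDs. This works when I fetch the first page, but from the second page onward it always shows the first page's product IDs. Here is my code (with the URL changed to the fifth page):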

from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen('https://tiki.vn/lam-sach-da-mat/c11232?sort=top_seller%3Fpage%3D5&page=5')
bs = BeautifulSoup(html, 'html.parser') 

result =bs.find_all(lambda tag: tag.get('class') == ['product-item'])

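The result my code returns for the 5th page, and the page-5 product IDs I expect to get, were linked from the original post.
I want the product IDs from page 5, but I don't understand why my code still shows the results of the first page.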

8aqjt8rx #1

It looks like there are 107 products, including ads. Here is a way to query the API endpoint directly and get all of the products:

import requests
import pandas as pd

pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

headers = {'accept': 'application/json, text/plain, */*',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.79 Safari/537.36'
}

url = 'https://tiki.vn/api/personalish/v1/blocks/listings?limit=300&include=advertisement&aggregations=2&trackity_id=527749d7-0a68-f53e-54b5-fe2da48136f2&category=11232&page=1&sort=top_seller%3Fpage%3D5&urlKey=lam-sach-da-mat'
r = requests.get(url, headers=headers)
df = pd.json_normalize(r.json()['data'])
print(df)

Result:

Selected columns of the printed DataFrame (rows 0 to 106). Remaining columns, whose long badge, impression and advertisement values are not reproduced: url_key, url_path, type, author_name, book_cover, short_description, list_price, badges, badges_new, discount, discount_rate, rating_average, review_count, order_count, favourite_count, thumbnail_url, thumbnail_width, thumbnail_height, freegift_items, has_ebook, inventory_status, is_visible, productset_id, productset_group_name, seller, is_flower, is_gift_card, inventory, url_attendant_input_form, option_color, stock_item, salable_type, seller_product_id, installment_info, url_review, bundle_deal, video_url, tiki_live, original_price, shippable, impression_info, availability, quantity_sold.text, quantity_sold.value, advertisement.ad, advertisement, quantity_sold.

row  id        sku            price   brand_name      name
0    33606848  9815250596996  849000  Paula's Choice  Kem tẩy da chết làm trắng sáng và đều màu da Paula’s Choice RESIST Daily Smoothing Treatment With 5% AHA 50 ml - 7660
1    33606786  5722576750824  663000  Paula's Choice  Lotion tẩy tế bào chết làm sáng da Paula’s Choice Skin Perfecting 8% AHA Lotion 100ml 2060
2    11239286  9792297299199  61900   Arrahan         Gel tẩy da chết Arrahan Lemon White Peeling Gel (180ml)
3    33606848  8573828662870  529000  Paula's Choice  Kem tẩy da chết làm trắng sáng và đều màu da Paula’s Choice RESIST Daily Smoothing Treatment With 5% AHA 50 ml - 7660
4    20525156  3751926198377  119000  3W Clinic       Tẩy Tế Bào Chết 3W Clinic Collagen Crystal Peeling Gel 180ml
...  ...       ...            ...     ...             ...
102  4701145   6917512701766  119000  Byphasse        Kem Tẩy tế bào chết cho mặt Byphasse Exfoliant Face Scrub Dành cho mọi loại da
103  38465349  8446201287222  421200  Keana           Gel Tẩy Tế Bào Chết Keana Baking Soda Moist Peeling (120G) - HÀNG CHÍNH HÃNG
104  15213464  4407455438680  630000  Belif           Tẩy bào chết Belif Mild And Effective Facial Scrub 100ml
105  51088975  9244203860400  589000  The Inkey List  Dấm táo The Inkey List Apple Cider Vinegar Acid Peel 30ml
106  24408456  5696255831404  441000  IASO            Gel Giúp Loại Bỏ Tế Bào Chết IASO
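Since the question only asks for product IDs, the payload can probably be reduced to those directly. The following is a minimal sketch, not part of the original answer: `sample_payload` is a hand-built stand-in for `r.json()` that reuses two rows from the output above, and the helper names `product_ids` and `spid_from_url_path` are hypothetical.

```python
# Hand-built stand-in for r.json(); only the fields used below are included.
sample_payload = {
    'data': [
        {'id': 11239286, 'seller_product_id': 20116852,
         'url_path': 'gel-tay-da-chet-arrahan-lemon-white-peeling-gel-180ml-p11239286.html?spid=20116852'},
        {'id': 20525156, 'seller_product_id': 20525157,
         'url_path': 'tay-te-bao-chet-3w-clinic-collagen-crystal-peeling-gel-180ml-p20525156.html?spid=20525157'},
    ]
}

def product_ids(payload):
    """Return (id, seller_product_id) pairs from a listings payload."""
    return [(item['id'], item.get('seller_product_id'))
            for item in payload.get('data', [])]

def spid_from_url_path(url_path):
    """Return the spid value embedded in url_path, or None if it is absent."""
    _, sep, spid = url_path.partition('.html?spid=')
    return spid if sep else None

print(product_ids(sample_payload))
print([spid_from_url_path(item['url_path']) for item in sample_payload['data']])
```

In the two sample rows, the `spid` in `url_path` is the same number as `seller_product_id`; whether that holds for every product is not confirmed here.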

guicsvcw #2

By the way, you can check the actual page number of the HTML with something like soup.select_one('li > a.current[data-view-id="product_list_pagination_item"][data-view-label]').get('data-view-label')
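A minimal sketch of that check, run against a hand-written fragment. The markup in `sample_html` is an assumption shaped only to match the selector, not HTML copied from tiki.vn, and `current_page_label` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical pagination markup; the real page may differ.
sample_html = """
<ul>
  <li><a class="current" data-view-id="product_list_pagination_item"
         data-view-label="1" href="/lam-sach-da-mat/c11232?page=1">1</a></li>
  <li><a data-view-id="product_list_pagination_item"
         data-view-label="2" href="/lam-sach-da-mat/c11232?page=2">2</a></li>
</ul>
"""

PAGE_SELECTOR = 'li > a.current[data-view-id="product_list_pagination_item"][data-view-label]'

def current_page_label(soup):
    """Return the label of the pagination link marked as current, or None."""
    link = soup.select_one(PAGE_SELECTOR)
    return link.get('data-view-label') if link else None

soup = BeautifulSoup(sample_html, 'html.parser')
print(current_page_label(soup))  # prints 1 for this fragment
```

Returning None instead of calling `.get` on a missing element avoids an AttributeError when the pagination block has not been rendered.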

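Explanation: whichever page the link points to, the first page is always loaded first and is then updated dynamically (with JavaScript and the API). You can go to the network tab in devtools [you may need to refresh the page after opening it, and make sure the "preserve log" option is checked], then click the name of the first request in the log [it should end the same way as the link in the address bar]; the HTML under "Response" is what requests.get receives, and you may notice that this HTML is of the first page.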

If you scroll through the other requests in the log, you should find a request to https://tiki.vn/api/personalish/v1/blocks/listings?limit=40&include=advertisement&aggregations=2&trackity_id=3dddf2b8-1eb2-e891-0cdf-c23b37663c28&category=11232&page=5&sort=top_seller%3Fpage%3D5&urlKey=lam-sach-da-mat
The products are probably loaded from this. All the parameters appear to be fixed or can be found in the page URL, except for **trackity_id**. If you look at the request initiator chain, you can see which JavaScript file made the request and try to work out how trackity_id is generated; personally, though, I find it easier to use selenium.
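To illustrate how the API parameters map onto the page URL, here is a rough sketch using only the standard library. It assumes the page path always has the form `/<urlKey>/c<category>`, takes the fixed values (`include`, `aggregations`) from the request URL above, and leaves out `trackity_id`; whether the endpoint accepts requests without it is not guaranteed. The function name `api_url_from_page_url` is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

API_BASE = 'https://tiki.vn/api/personalish/v1/blocks/listings'

def api_url_from_page_url(page_url, limit=40):
    """Build a listings API URL from a category page URL (assumed path: /<urlKey>/c<category>)."""
    parts = urlsplit(page_url)
    url_key, cat_segment = parts.path.strip('/').split('/')
    category = cat_segment[1:] if cat_segment.startswith('c') else cat_segment
    query = parse_qs(parts.query)  # percent-decodes values, e.g. sort -> 'top_seller?page=5'
    params = {
        'limit': limit,
        'include': 'advertisement',
        'aggregations': 2,
        'category': category,
        'page': query.get('page', ['1'])[0],
        'sort': query.get('sort', ['top_seller'])[0],
        'urlKey': url_key,
    }
    return f'{API_BASE}?{urlencode(params)}'

page_url = 'https://tiki.vn/lam-sach-da-mat/c11232?sort=top_seller%3Fpage%3D5&page=5'
print(api_url_from_page_url(page_url))
```

Decoding the query also shows that the `sort` value in the question's URL is literally `top_seller?page=5`, which suggests a `?page=5` fragment was pasted into the sort parameter by accident.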

**Suggested solution 1:** It looks like you can actually use the API with just the known parameters (category, urlKey, sort):

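import cloudscraper  # third-party package: pip install cloudscraper

# only the known parameters (category, sort, urlKey) are sent; trackity_id is left out
r = cloudscraper.create_scraper().get('https://tiki.vn/api/personalish/v1/blocks/listings?limit=300&category=11232&sort=top_seller%3Fpage%3D5&urlKey=lam-sach-da-mat')
productList = r.json()['data']
# the first argument is a plain string (not an f-string), so its placeholders print literally as a legend
print('### [{id}_{sku}: {name}] for first 10 products of', f'{len(productList)} ###\n')
for p in productList[:10]:
    print(f"{p['id']}_{p['sku']}: {p['name']}")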

(I used cloudscraper because I'm not very familiar with urlopen, and I'm not good at setting the right headers with requests to avoid 403 errors...) This prints

### [{id}_{sku}: {name}] for first 10 products of 100 ###

33606786_5722576750824: Lotion tẩy tế bào chết làm sáng da Paula’s Choice Skin Perfecting 8% AHA Lotion 100ml 2060
11239286_9792297299199: Gel tẩy da chết Arrahan Lemon White Peeling Gel (180ml)
33606848_8573828662870: Kem tẩy da chết làm trắng sáng và đều màu da Paula’s Choice RESIST Daily Smoothing Treatment With 5% AHA 50 ml - 7660
20525156_3751926198377: Tẩy Tế Bào Chết 3W Clinic Collagen Crystal Peeling Gel 180ml
67089667_9204550497315: Combo 2 chai tiện lợi - Natureine AQUA PEEL Moisture Peeling Gel - Gel tẩy tế bào da chết, cấp ẩm Nhật Bản - Chính Hãng
21481823_9335684703529: Gel tẩy tế bào chết sáng da hồng sâm Hàn Quốc My Gold Korea Red Ginseng Peeling Gel (130ml) – Hàng Chính Hãng
46203526_8584500833846: Bông Tẩy Da Chết Cosrx One-Step Original Clear Pad 70 Sheets (New 2019)
1941543_2999847759227: Kem tẩy tế bào chết mặt Organic Shop Organic Coffee & Powder 75ml
57783000_9733773668061: Natureine AQUA PEEL Moisture Peeling Gel - Gel tẩy tế bào da chết, cấp ẩm Nhật Bản - Chính Hãng
7319657_7325473003642: Trial Tinh chất dành cho da mụn cao cấp Resist BHA 9 0.83 ml

However, I feel there should be more than just 100 products: paginating with selenium (below) suggests that there should be 177 products.
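One way to check whether the API caps a single response is to request it page by page and merge the results. The sketch below is only an illustration under two assumptions that are not confirmed by the source: that the endpoint honours `limit` and `page` together, and that it returns an empty `data` list (or only repeats) past the last page. `collect_products` and `fake_fetch` are hypothetical names; the fetcher is injected so the loop can be tried without network access.

```python
def collect_products(fetch_json, base_params, limit=40, max_pages=50):
    """Request page 1, 2, ... and merge the products, de-duplicated by (id, seller_product_id).

    Stops when a page is empty or contributes nothing new, which also guards
    against a server that keeps returning the first page.
    """
    seen = set()
    products = []
    for page in range(1, max_pages + 1):
        params = dict(base_params, limit=limit, page=page)
        items = fetch_json(params).get('data', [])
        new_items = 0
        for item in items:
            key = (item.get('id'), item.get('seller_product_id'))
            if key not in seen:
                seen.add(key)
                products.append(item)
                new_items += 1
        if new_items == 0:
            break
    return products

# Stand-in fetcher with made-up data, used only to exercise the loop.
def fake_fetch(params):
    pages = {
        1: [{'id': 1, 'seller_product_id': 10}, {'id': 2, 'seller_product_id': 20}],
        2: [{'id': 3, 'seller_product_id': 30}],
    }
    return {'data': pages.get(params['page'], [])}

found = collect_products(fake_fetch, {'category': 11232, 'urlKey': 'lam-sach-da-mat'})
print(len(found))
```

With cloudscraper, a real fetcher could plausibly be `lambda params: scraper.get(API_URL, params=params).json()`, where `scraper` comes from `cloudscraper.create_scraper()` and `API_URL` is the listings endpoint without its query string.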

**Suggested solution 2:** You can loop through the pages with this function I wrote (linkToSoup_selenium), which fetches and parses the HTML (using selenium + bs4)

maxPages = 10  # or as you prefer
nextUrl = 'https://tiki.vn/lam-sach-da-mat/c11232?sort=top_seller'
pgi_sel = 'data-view-id="product_list_pagination_item"'
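for pn in range(1, maxPages+1):
    # XPath of the pagination link that marks page pn as the current page
    curPage_xpath = f'//li/a[@class="current"][@{pgi_sel}][@data-view-label="{pn}"]'
    # linkToSoup_selenium is the helper linked above (not shown here); judging by the check below, it returns None or a string on failure
    soup = linkToSoup_selenium(nextUrl, ecx=curPage_xpath)
    if soup is None or isinstance(soup, str): break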

    ###################### EXTRACT DATA ######################
    # this is just printing the page# and 1st five IDs, but you can extract whatever you need from soup at this point
    curPg = soup.select_one(f'li > a.current[{pgi_sel}][data-view-label]')
    curPg = f'page {curPg.get("data-view-label")}' if curPg else '!! page ERROR !!'

    pageProds = soup.select('a.product-item[href*=".html?spid="]')
    curPg += f" [{len(pageProds)} products]:"
    first5ids = [a.get('href').split('.html?spid=')[-1] for a in pageProds][:5]
    print(f'{curPg:>22} ', " ".join([f'{i:>10}' for i in first5ids]), '...')
    ##########################################################

    nxtPg = soup.select_one(f'li > a[{pgi_sel}][href]:has(img[alt="arrow-right"])')
    if nxtPg is None or 'disabled' in nxtPg.get('class', ''): break
    nextUrl = nxtPg.get('href')

which printed

page 1 [40 products]:    66640131   20116852   66638723   20525157   67089668 ...
 page 2 [40 products]:    63465592   20911921   54388844   58555745   13385021 ...
 page 3 [40 products]:     1515345   57703788    1060978   54929902    2076819 ...
 page 4 [40 products]:    35737314   26299382    7029351   14970693   32139853 ...
 page 5 [11 products]:    52274203   51988147   50422842   36828505   45439018 ...

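(If you don't want to be limited to maxPages, you can use something like while True instead of for pn in range(1, maxPages+1), but you will still need a counter or something similar to get pn for ecx, because that is what tells the function to wait until that part of the HTML has loaded.)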
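A rough sketch of that unbounded variant, using `itertools.count` as the counter. `walk_pages`, `load_page` and `find_next_url` are hypothetical names: in the real loop, `load_page` would wrap linkToSoup_selenium with the XPath built from `pn`, and `find_next_url` would wrap the arrow-right lookup. A dictionary stands in for the site here so that the control flow can be run on its own.

```python
from itertools import count

def walk_pages(first_url, load_page, find_next_url):
    """Yield (page_number, page) until loading fails or there is no next link."""
    url = first_url
    for pn in count(1):  # unbounded counter in place of range(1, maxPages+1)
        page = load_page(url, pn)
        if page is None or isinstance(page, str):  # same failure check as the loop above
            break
        yield pn, page
        url = find_next_url(page)
        if url is None:
            break

# Made-up three-page site used only to exercise the loop.
fake_site = {
    '/p1': {'next': '/p2'},
    '/p2': {'next': '/p3'},
    '/p3': {'next': None},
}

visited = [pn for pn, _ in walk_pages('/p1',
                                      lambda url, pn: fake_site.get(url),
                                      lambda page: page['next'])]
print(visited)  # [1, 2, 3]
```

Because the loop has no upper bound, it relies entirely on the two stop conditions; keeping a generous page cap as a safety net may still be worthwhile on a live site.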
