Working with variables scraped from HTML with BeautifulSoup

lhcgjxsq  posted on 2021-07-13  in Java

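I have written code that fetches hotel names and prices from a specific booking.com URL. I want the tool to output the name and price of only the one hotel I am looking for. I can print the names and prices of all hotels on the page, but when I add an if statement to print a single hotel, it stops working. I tried wrapping the expressions that select the hotel name and price in str(), but then nothing was printed. The current code only prints "Wrong Hotel". Is it impossible to manipulate variables after scraping them? I also want to compare hotel prices.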

from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36'}

url = 'https://www.booking.com/searchresults.en-gb.html?aid=355028&sid=d2a902f346650dc0b748848763652bdc&sb=1&src=searchresults&src_elem=sb&error_url=https%3A%2F%2Fwww.booking.com%2Fsearchresults.en-gb.html%3Faid%3D355028%3Bsid%3Dd2a902f346650dc0b748848763652bdc%3Btmpl%3Dsearchresults%3Bcheckin_month%3D5%3Bcheckin_monthday%3D8%3Bcheckin_year%3D2021%3Bcheckout_month%3D5%3Bcheckout_monthday%3D13%3Bcheckout_year%3D2021%3Bcity%3D-2601889%3Bclass_interval%3D1%3Bdest_id%3D-2601889%3Bdest_type%3Dcity%3Bdtdisc%3D0%3Bfrom_sf%3D1%3Bgroup_adults%3D1%3Bgroup_children%3D0%3Binac%3D0%3Bindex_postcard%3D0%3Blabel_click%3Dundef%3Bno_rooms%3D1%3Boffset%3D0%3Bpostcard%3D0%3Broom1%3DA%3Bsb_price_type%3Dtotal%3Bshw_aparth%3D1%3Bslp_r_match%3D0%3Bsrc%3Dsearchresults%3Bsrc_elem%3Dsb%3Bsrpvid%3D6eda76c5afe000a5%3Bss%3DLondon%3Bss_all%3D0%3Bssb%3Dempty%3Bsshis%3D0%3Bssne%3DLondon%3Bssne_untouched%3DLondon%3Btop_ufis%3D1%3Bsig%3Dv1yWyN9mHA%3B&ss=London+Marriott+Hotel+County+Hall%2C+London%2C+Greater+London%2C+United+Kingdom&is_ski_area=&ssne=London&ssne_untouched=London&city=-2601889&checkin_year=2021&checkin_month=5&checkin_monthday=8&checkout_year=2021&checkout_month=5&checkout_monthday=13&group_adults=1&group_children=0&no_rooms=1&from_sf=1&ss_raw=Marriott+London&ac_position=1&ac_langcode=en&ac_click_type=b&dest_id=36867&dest_type=hotel&place_id_lat=51.5010959924622&place_id_lon=-0.119165182113647&search_pageview_id=6eda76c5afe000a5&search_selected=true&search_pageview_id=6eda76c5afe000a5&ac_suggestion_list_length=5&ac_suggestion_theme_list_length=0'

response=requests.get(url, headers=headers)

soup=BeautifulSoup(response.content, "lxml")

for item in soup.select('.sr_property_block'):
    try:
        hotelname = item.select('.sr-hotel__name')[0].get_text()
        hotelprice = item.select('.bui-price-display__value')[0].get_text()

        if hotelname == 'London Marriott Hotel County Hall':
            print(hotelname)
            print(hotelprice)
        else:
            print('Wrong Hotel')        

        #print('---------------')

    except Exception as e:
        print('')

ctzwtxfj1#

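After scraping, hotelname contains two extra whitespace characters: one leading and one trailing. Because of them, the comparison with 'London Marriott Hotel County Hall' is never true. Use strip() to remove the leading and trailing whitespace from hotelname: hotelname = hotelname.strip()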
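
The fix above, together with the price comparison mentioned in the question, could look roughly like the sketch below. It runs on hard-coded strings instead of a live page. The sample values and the helper names clean_name and parse_price are assumptions made for illustration and are not part of the original code. The price helper also assumes that a comma is a thousands separator.

```python
import re


def clean_name(raw_name: str) -> str:
    """Remove leading and trailing whitespace (spaces, tabs, newlines)."""
    return raw_name.strip()


def parse_price(raw_price: str) -> float:
    """Convert scraped price text into a number so prices can be compared.

    Assumption: commas are thousands separators, so they are dropped
    along with the currency symbol and surrounding whitespace.
    """
    digits = re.sub(r"[^\d.]", "", raw_price)
    return float(digits)


# Hypothetical values, shaped like what get_text() might return
# for the name and price selectors in the question.
raw_name = "\nLondon Marriott Hotel County Hall\n"
raw_price = "\n£1,234\n"

# repr() makes the hidden whitespace visible while debugging.
print(repr(raw_name))

hotelname = clean_name(raw_name)
hotelprice = parse_price(raw_price)

if hotelname == "London Marriott Hotel County Hall":
    print(hotelname)
    print(hotelprice)
else:
    print("Wrong Hotel")
```

In the scraping loop itself, get_text(strip=True) is an alternative to calling strip() afterwards. Once the price is a float, two hotels can be compared with the usual operators such as < and >.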
