python BeautifulSoup - How can I do a find inside another find...?

6uxekuva  posted on 2023-05-27 in Python

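I have been trying to write some code for myself. The idea is to parse the website where I watch my anime, specifically its schedule, so that I know when the shows I am watching are released (which day and at what time).
Here is the link, if you need it: https://anime-sama.fr/planning. Below is my code: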

import urllib3
from bs4 import BeautifulSoup

http = urllib3.PoolManager()
r = http.request('GET','https://anime-sama.fr/planning')
html = r.data
soup = BeautifulSoup(html, 'html.parser')

days = ["\nLundi\n", "\nMardi\n", "\nMercredi\n", "\nJeudi\n", "\nVendredi\n", "\nSamedi\n", "\nDimanche\n"]
animes = ["Insomniacs After School", "Demon Slayer", "Eden's Zero", "Cheat Skill Level Up", "Tonikaku Kawaii", "The Girl Downstairs", "Hell's Paradise", "Noble New World Adventures"]

jour = soup.find_all('h2', class_="titreJours text-white text-center text-2xl font-bold uppercase mt-6 mb-4 border-b-4 border-slate-500")

for anime in range(len(animes)):
        for day in range(len(days)):
            for h in range(len(jour)):
                if jour[day].string == days[day]:
                    an = jour[day].find_all('h1', class_="text-gray-200 font-semibold text-xs text-center uppercase line-clamp-2 md:line-clamp-3 hover:text-clip")
                    for title in range(len(an)):
                        if an[title].string == animes[anime]: 
                            time = an.parent.find('button', class_="rounded rounded-lg bg-opacity-50 text-blue-100 text-md font-medium mx-0.5 mt-1 px-1 py-0.5 Heure").string
                            print(f"{days[day]} {time}: {anime}")

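The idea is to put the anime I am watching in the "animes" variable. For each anime, the code checks whether it can be found in the "Lundi" column (Lundi = Monday), then in "Mardi" (Mardi = Tuesday), and so on. (The "jour[day]" variable stands for the column, and the "an" variable holds the actual anime titles.)
If it finds the anime, it tries to find the hour and store it in the "time" variable. The problem is that on lines 21 and 26 I am trying to call a find function on the result of another find function, which raises an error (because of the try/except, it sometimes prints "E3").
Edit: I had to change some of the code so that it no longer shows an error, but it still does not work: it never prints the information I am looking for at the end ;-;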

px9o7tmv #1

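You don't need all those loops to find matching text. You can pass text=something to a find() or find_all() call to search for elements with the specified content. (Since Beautiful Soup 4.4.0 this argument is named string=; text= is the older name for the same argument.)
Instead of nested loops over animes and days, you can use itertools.product() to generate all the combinations.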
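To illustrate how the text match behaves, here is a minimal sketch on a small inline document. The markup and the short class names are made up for the example and are not taken from the real page. It uses the newer string= spelling and shows that a plain string must match exactly, surrounding newlines included, which is why the days list contains "\nLundi\n" rather than "Lundi".

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only, not the real page.
html = """
<div>
  <h2 class="titreJours">
Lundi
</h2>
  <h1 class="titre">Demon Slayer</h1>
  <h2 class="titreJours">
Mardi
</h2>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A plain string is compared to the element's .string exactly,
# so the surrounding newlines have to be part of the pattern.
exact = soup.find("h2", class_="titreJours", string="\nLundi\n")
missing = soup.find("h2", class_="titreJours", string="Lundi")

# A callable receives the element's string (possibly None) and can
# strip whitespace before comparing.
stripped = soup.find("h2", string=lambda s: s is not None and s.strip() == "Mardi")

print(exact is not None, missing is None, stripped.get_text(strip=True))
```

If the exact whitespace on the page is uncertain, the callable form is probably the more forgiving choice.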
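As a quick check of that equivalence, the sketch below compares product() with the two nested loops it replaces. The lists are shortened sample values chosen for the example.

```python
from itertools import product

# Shortened sample lists, for illustration only.
animes = ["Demon Slayer", "Eden's Zero"]
days = ["Lundi", "Mardi", "Mercredi"]

# The nested-loop version: the outer loop is animes, the inner loop is days.
nested = [(anime, day) for anime in animes for day in days]

# product() yields the same pairs in the same order,
# with the rightmost iterable varying fastest.
pairs = list(product(animes, days))

print(pairs == nested)
```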

import urllib3
from bs4 import BeautifulSoup
from itertools import product

http = urllib3.PoolManager()
r = http.request('GET','https://anime-sama.fr/planning')
html = r.data
soup = BeautifulSoup(html, 'html.parser')

days = ["\nLundi\n", "\nMardi\n", "\nMercredi\n", "\nJeudi\n", "\nVendredi\n", "\nSamedi\n", "\nDimanche\n"]
animes = ["Insomniacs After School", "Demon Slayer", "Eden's Zero", "Cheat Skill Level Up", "Tonikaku Kawaii", "The Girl Downstairs", "Hell's Paradise", "Noble New World Adventures"]

try:
    for anime, day in product(animes, days):
        try:
            # Day heading whose text equals `day` exactly (newlines included).
            jour = soup.find('h2', class_="titreJours text-white text-center text-2xl font-bold uppercase mt-6 mb-4 border-b-4 border-slate-500", text=day)
            if jour:
                try:
                    # Tag.find() only searches the descendants of `jour`.
                    an = jour.find('h1', class_="text-gray-200 font-semibold text-xs text-center uppercase line-clamp-2 md:line-clamp-3 hover:text-clip", text=anime)
                    if an:
                        try:
                            time = an.parent.find('button', class_="rounded rounded-lg bg-opacity-50 text-blue-100 text-md font-medium mx-0.5 mt-1 px-1 py-0.5 Heure").string
                            print(f"{day.strip()} {time}: {anime}")
                        except Exception:
                            # Raised when no time button is found next to the title.
                            print("E5")
                except Exception:
                    print("E2")
        except Exception:
            print("E1")
except Exception:
    print("E0")
