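I'm new to Python and am trying to scrape remote data analyst jobs from Indeed and write them to a CSV file. I deliberately added code to work around an SSL certificate problem I kept running into. The output says the jobs were written to my file, but nothing shows up in it except the header row.
Can you help me figure out what I'm doing wrong? Thank you very much.
Here is my code: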
import requests
import csv
from bs4 import BeautifulSoup

# Define the dataanalyst variable.
dataanalyst = "data analyst"

def get_job_postings(dataanalyst):
    """Gets the job postings from Indeed for the given keyword."""
    # Get the Indeed search URL for the given keyword.
    search_url = "https://www.indeed.com/jobs?q=data+analyst&l=remote&vjk=30f58c7471301c42".format(keyword)
    # Make a request to the Indeed search URL.
    response = requests.get(search_url, verify=False)
    # Parse the response and get the job postings.
    soup = BeautifulSoup(response.content, "html.parser")
    job_postings = soup.find_all("div", class_="jobsearch-result")
    return job_postings

def write_job_postings_to_csv(job_postings, filename):
    """Writes the job postings to a CSV file."""
    # Create a CSV file to store the job postings.
    with open(filename, "w", newline="") as csvfile:
        # Create a CSV writer object.
        writer = csv.writer(csvfile)
        # Write the header row to the CSV file.
        writer.writerow(["Title", "Company", "Location", "Description"])
        # Write the job postings to the CSV file.
        for job_posting in job_postings:
            title = job_posting.find("h2", class_="jobtitle").text
            company = job_posting.find("span", class_="company").text
            location = job_posting.find("span", class_="location").text
            description = job_posting.find("div", class_="job-snippet").text
            writer.writerow([title, company, location, description])

if __name__ == "__main__":
    # Define the dataanalyst variable.
    dataanalyst = "data+analyst"
    # Get the keyword from the user.
    keyword = "data analyst"
    # Get the job postings from Indeed.
    job_postings = get_job_postings(dataanalyst)
    # Write the job postings to a CSV file.
    write_job_postings_to_csv(job_postings, "remote_data_analyst_positions.csv")
    print("The job postings have been successfully scraped and written to a CSV file.")
Here is my final output:
PS C:\Users\chlor\OneDrive\Documents\Python> & C:/Users/chlor/AppData/Local/Programs/Python/Python311/python.exe c:/Users/chlor/OneDrive/Documents/Python/Indeed_DataAnalyst_Remote/DataAnalyst_Remote.py
C:\Users\chlor\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(
The job postings have been successfully scraped and written to a CSV file.
PS C:\Users\chlor\OneDrive\Documents\Python>
I expected this to write the job postings to my CSV file.
1 Answer
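Assuming you are able to get the full HTML content with requests, just change
to
I have tested this and it works, but I used Selenium to get the HTML content, because requests does not retrieve the full page.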
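A header-only CSV with no traceback suggests that `find_all` returned an empty list, so the loop in `write_job_postings_to_csv` never ran. The following is a minimal sketch of the Selenium route, not the exact code used in this answer. The CSS selectors in `SELECTORS` are placeholders assumed for illustration and must be checked against the live page in the browser's developer tools. The helper names `fetch_html_with_selenium`, `text_or_empty`, and `parse_job_postings` are hypothetical, and the sketch assumes Chrome is installed and usable by Selenium.

```python
import csv

from bs4 import BeautifulSoup

# ASSUMPTION: placeholder selectors, not confirmed against Indeed's current
# markup. Inspect the rendered page and adjust them before relying on this.
SELECTORS = {
    "card": "div.job_seen_beacon",
    "title": "h2.jobTitle",
    "company": "span.companyName",
    "location": "div.companyLocation",
    "description": "div.job-snippet",
}
FIELDS = ["title", "company", "location", "description"]


def fetch_html_with_selenium(url, timeout=15):
    """Load the page in a real browser and return the rendered HTML."""
    # Imported lazily so the parsing helpers work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until at least one job card is present in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, SELECTORS["card"]))
        )
        return driver.page_source
    finally:
        driver.quit()


def text_or_empty(parent, selector):
    """Return the stripped text of the first match, or "" if nothing matches."""
    node = parent.select_one(selector)
    return node.get_text(" ", strip=True) if node else ""


def parse_job_postings(html):
    """Turn rendered HTML into a list of dicts, one per job card."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {field: text_or_empty(card, SELECTORS[field]) for field in FIELDS}
        for card in soup.select(SELECTORS["card"])
    ]


def write_job_postings_to_csv(rows, filename):
    """Write the header plus one line per posting; return the row count."""
    with open(filename, "w", newline="", encoding="utf-8") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

To use it, pass the search URL to `fetch_html_with_selenium`, feed the result to `parse_job_postings`, and check that the returned list is non-empty before calling `write_job_postings_to_csv`. An empty list means the card selector does not match the page that was actually served. Because `text_or_empty` returns an empty string for a missing element, one absent field leaves a blank cell instead of raising `AttributeError` on `.text`.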