pandas: 'DataFrame' object has no attribute 'append'

aydmsdu9, posted on 2023-08-01 in Other

I get the following error:

AttributeError: 'DataFrame' object has no attribute 'append'

when using the concat_append function in the code below. The output is a blank dataset (see the image below).
How can I fix this?

import pandas as pd
import requests
from bs4 import BeautifulSoup

final=pd.DataFrame()
for j in range(1,800):

headers={'User-Agent':'Mozilla/5.0 (Windows NT 6.3; Win 64 ; x64) Apple WeKit /537.36(KHTML , like Gecko) Chrome/80.0.3987.162 Safari/537.36'}
response=requests.get('https://www.ambitionbox.com/list-of-companies?campaign=desktop_nav&page={}',headers=headers).text
soup=BeautifulSoup(response,'html.parser')
company=soup.find_all('div',class_='company-content-wrapper')
name=[]
rating=[]
reviews=[]
ctype=[]
hq=[]
how_old=[]
no_of_employee=[]

for i in company:
    name.append(i.find('h2').text.strip())
    rating.append(i.find('p',class_='rating').text.strip())
    reviews.append(i.find('a',class_='review-count').text.strip())
    info_entities=i.find_all('p',class_='infoEntity')
    if len(info_entities)>0:
        ctype.append(info_entities[0].text.strip())
    else:
        ctype.append("N/A")

    if len(info_entities)>1:
        hq.append(info_entities[1].text.strip())
    else:
        hq.append("N/A")

    if len(info_entities)>2:
        how_old.append(info_entities[2].text.strip())
    else:
        how_old.append("N/A")

    if len(info_entities)>3:
        no_of_employee.append(info_entities[3].text.strip())
    else:
        no_of_employee.append("N/A")

dataset=pd.DataFrame({
    'Name':name,
    'Rating':rating,
    'reviews':reviews,
    'Company Type':ctype,
    'Headquaters':hq,
    'Company Age':how_old,
    'No. of Employee':no_of_employee
})

final=pd.concat([final,dataset],ignore_index=True)
print(final)
final.to_csv('Company_Dataset.csv')


[Image: screenshot of the output]

envsm3lx #1

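This Python script runs fine on my laptop (once the body of the outer for loop is indented so that it works), and the output .csv is not empty. As you mentioned, DataFrame.append has been removed: it was dropped in pandas 2.0 (April 2023), and pd.concat is the replacement. This is covered elsewhere on Stack Overflow. Your code already uses concat, so I don't think that is the problem.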
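For reference, a minimal sketch of the append-to-concat change. The two small frames are hypothetical stand-ins for scraped pages, not data from the site:

```python
import pandas as pd

# Hypothetical sample frames standing in for two scraped pages.
page_1 = pd.DataFrame({"Name": ["A", "B"], "Rating": ["4.1", "3.9"]})
page_2 = pd.DataFrame({"Name": ["C"], "Rating": ["4.5"]})

# pandas < 2.0 allowed: final = page_1.append(page_2, ignore_index=True)
# pandas >= 2.0: pass a list of frames to pd.concat instead.
final = pd.concat([page_1, page_2], ignore_index=True)

print(final)
```

With ignore_index=True the combined frame gets a fresh 0..n-1 index rather than repeating each page's own index.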
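Could you try re-running with a smaller list of pages, edit your question so the code is properly formatted, or give any other information? This is the code I ran (the only change is a short list of page numbers, so I could check the output quickly):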

import pandas as pd
import requests
from bs4 import BeautifulSoup

final = pd.DataFrame()

for j in [1, 2, 600, 601]:
    
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win 64 ; x64) AppleWeKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36'
    }
    response = requests.get(f'https://www.ambitionbox.com/list-of-companies?campaign=desktop_nav&page={j}', headers=headers).text
    soup = BeautifulSoup(response, 'html.parser')
    company = soup.find_all('div', class_='company-content-wrapper')
    name = []
    rating = []
    reviews = []
    ctype = []
    hq = []
    how_old = []
    no_of_employee = []

    for i in company:
        name.append(i.find('h2').text.strip())
        rating.append(i.find('p', class_='rating').text.strip())
        reviews.append(i.find('a', class_='review-count').text.strip())
        info_entities = i.find_all('p', class_='infoEntity')
        if len(info_entities) > 0:
            ctype.append(info_entities[0].text.strip())
        else:
            ctype.append("N/A")

        if len(info_entities) > 1:
            hq.append(info_entities[1].text.strip())
        else:
            hq.append("N/A")

        if len(info_entities) > 2:
            how_old.append(info_entities[2].text.strip())
        else:
            how_old.append("N/A")

        if len(info_entities) > 3:
            no_of_employee.append(info_entities[3].text.strip())
        else:
            no_of_employee.append("N/A")

    dataset = pd.DataFrame({
        'Name': name,
        'Rating': rating,
        'Reviews': reviews,
        'Company Type': ctype,
        'Headquarters': hq,
        'Company Age': how_old,
        'No. of Employees': no_of_employee
    })

    final = pd.concat([final, dataset], ignore_index=True)

final.to_csv('Company_Dataset_2.csv', index=True)
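As a side note, calling pd.concat inside the loop copies the accumulated frame on every iteration. A possible variation, sketched here with hypothetical sample rows instead of real requests, is to collect the per-page frames in a list and concatenate once at the end. The helper name combine_pages is made up for illustration:

```python
import pandas as pd

def combine_pages(pages):
    """Build one DataFrame per page, then concatenate them in a single call."""
    frames = []
    for rows in pages:
        # rows is a list of dicts, one per company (hypothetical sample data)
        frames.append(pd.DataFrame(rows))
    if not frames:
        # pd.concat raises ValueError on an empty list, so return an empty frame
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)

sample_pages = [
    [{"Name": "A", "Rating": "4.1"}, {"Name": "B", "Rating": "3.9"}],
    [{"Name": "C", "Rating": "4.5"}],
]
final = combine_pages(sample_pages)
print(final.shape)
```

The result is the same as concatenating inside the loop; only the number of copies changes.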

gjmwrych #2

Milan, the URL you pass when building response is formatted incorrectly. You wrote:

response=requests.get('https://www.ambitionbox.com/list-of-companies?campaign=desktop_nav&page={}',headers=headers).text

when you should have this: an f-string, with j placed inside the braces.

response=requests.get(f'https://www.ambitionbox.com/list-of-companies?campaign=desktop_nav&page={j}',headers=headers).text


SophieTP fixed this mistake in the code in their answer, which is why they did not see the problem. With this corrected, the script works fine.
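The difference can be checked without making any request. A small sketch, where j = 3 is just an example page number:

```python
j = 3  # example page number

# Plain string: the braces are kept literally, so every request hits the same URL.
plain = 'https://www.ambitionbox.com/list-of-companies?campaign=desktop_nav&page={}'

# f-string: {j} is replaced by the current value of j.
formatted = f'https://www.ambitionbox.com/list-of-companies?campaign=desktop_nav&page={j}'

# str.format on the plain template gives the same result as the f-string.
also_formatted = plain.format(j)

print(plain.endswith('page={}'))      # True
print(formatted.endswith('page=3'))   # True
print(formatted == also_formatted)    # True
```

Printing the URL once inside the loop is a quick way to confirm the page number is actually being substituted.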
