pandas: how do I fix the way my data is stored?

yb3bgrhw posted on 2022-12-28 in Other

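I'm a beginner. I'm building a pagination loop over the YouTube Data API Search: list endpoint that should return 100 YouTube search results, but when the results are converted to a pandas DataFrame, only the last part of the returned data is used.
For example, if my max results is 40 (instead of 50), I get back only 30 rows.
How can I fix the data that ends up stored in my variable?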

#import 
from google.colab import auth
auth.authenticate_user()

import gspread
from google.auth import default
creds, _ = default()

gc = gspread.authorize(creds)

!pip install google-api-python-client
from googleapiclient.discovery import build
import pandas as pd
import seaborn as sb
import csv
import re
import requests
import numpy as np 
from google.colab import data_table
data_table.enable_dataframe_formatter()

from google.colab import drive
api_key = "***"

from googleapiclient.discovery import build
from pprint import PrettyPrinter
from google.colab import files

youtube = build('youtube','v3',developerKey = api_key)

#print(type(youtube))
pp = PrettyPrinter()
nextPageToken = ''

for x in range(2):
 
    request = youtube.search().list(
        q = query,
        part='id',
        maxResults=50,
        order="date",
#        publishedAfter='2022-05-09T00:00:00.000Z',
#        publishedBefore='2022-07-09T00:00:00.000Z',
        pageToken=nextPageToken,
        type='video')
    

    print(type(request))
    res = request.execute()
    pp.pprint(res) 

    if 'nextPageToken' in res:
          nextPageToken = res['nextPageToken']
ids = [item['id']['videoId'] for item in res['items']]
results = youtube.videos().list(id=ids, part='snippet').execute()
for result in results.get('items', []):
    print(result ['id'])
    print(result ['snippet']['channelTitle'])
    print(result ['snippet']['title'])
    print(result ['snippet']['description'])

vngu2lb8 #1

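Your problem doesn't seem to be related to pandas.
In Python, a variable assigned inside a for loop is still available after the loop ends and keeps the value from its last assignment. Here res holds only the second page of results once the loop has finished, which is why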

print(result['id'])
print(result['snippet']['channelTitle'])
print(result['snippet']['title'])
print(result['snippet']['description'])

runs only 50 times (the maxResults value passed to Search: list) rather than 100 times, even though Search: list is called twice.
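The effect can be reproduced without the API. A minimal sketch, assuming two made-up pages of video ids that stand in for the two Search: list responses (the ids and variable names are illustrative only):

```python
# Two made-up "pages" standing in for the two Search: list responses.
pages = [["a1", "a2", "a3"], ["b1", "b2", "b3"]]

for page in pages:
    res = page  # reassigned on every iteration

# After the loop, res still exists and holds only the last page.
ids_after_loop = list(res)

# Collecting inside the loop keeps the ids from every page.
all_ids = []
for page in pages:
    all_ids.extend(page)

print(ids_after_loop)
print(all_ids)
```

Anything that reads res after the loop sees one page only; anything that must see every page has to run, or accumulate, inside the loop.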
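If you want to call Videos: list with the ids you have just retrieved from Search: list, you only need to indent the last block of code so that it runs inside the for loop.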

    # indented one level so that it runs once per page, inside "for x in range(2):"
    ids = [item['id']['videoId'] for item in res['items']]
    results = youtube.videos().list(id=ids, part='snippet').execute()
    for result in results.get('items', []):
        print(result['id'])
        print(result['snippet']['channelTitle'])
        print(result['snippet']['title'])
        print(result['snippet']['description'])
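If the goal is a single DataFrame covering both pages, the rows also have to be accumulated across iterations rather than only printed. A minimal sketch, assuming two hypothetical callables, search_page and fetch_videos, that would wrap the search().list(...).execute() and videos().list(...).execute() calls from the question. The fake versions below are not part of any library; they only mimic the response shape so that the loop can run offline:

```python
def collect_rows(search_page, fetch_videos, max_pages=2):
    """Walk up to max_pages of search results and return one dict per video."""
    rows = []
    page_token = None
    for _ in range(max_pages):
        res = search_page(page_token)
        ids = [item["id"]["videoId"] for item in res.get("items", [])]
        if ids:
            # Look up details for this page's ids while still inside the loop.
            details = fetch_videos(ids)
            for video in details.get("items", []):
                snippet = video["snippet"]
                rows.append({
                    "id": video["id"],
                    "channelTitle": snippet["channelTitle"],
                    "title": snippet["title"],
                    "description": snippet["description"],
                })
        page_token = res.get("nextPageToken")
        if not page_token:
            break  # no further pages
    return rows


# Hypothetical stand-ins for the real API calls (illustration only).
def fake_search_page(page_token):
    if page_token is None:
        return {
            "items": [{"id": {"videoId": "v1"}}, {"id": {"videoId": "v2"}}],
            "nextPageToken": "P2",
        }
    return {"items": [{"id": {"videoId": "v3"}}]}


def fake_fetch_videos(ids):
    return {
        "items": [
            {"id": i, "snippet": {"channelTitle": "ch", "title": "t-" + i, "description": ""}}
            for i in ids
        ]
    }


rows = collect_rows(fake_search_page, fake_fetch_videos)
print(len(rows))
```

Because rows is a list of dicts with the same keys, passing it to pd.DataFrame(rows) afterwards should give one row per video across all pages.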
