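I'm a beginner. I'm building a pagination loop around the YouTube Data API Search: list endpoint to fetch 100 YouTube search results, but when the results are converted to a Pandas DataFrame, only the last part of the returned data is used.
For example, if my max results were 40 (instead of 50), it would only give me 30 rows.
How can I fix the data stored in my variable?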
#import
from google.colab import auth
auth.authenticate_user()
import gspread
from google.auth import default
creds, _ = default()
gc = gspread.authorize(creds)
!pip install google-api-python-client
from googleapiclient.discovery import build
import pandas as pd
import seaborn as sb
import csv
import re
import requests
import numpy as np
from google.colab import data_table
data_table.enable_dataframe_formatter()
from google.colab import drive
api_key = "***"
from googleapiclient.discovery import build
from pprint import PrettyPrinter
from google.colab import files
youtube = build('youtube','v3',developerKey = api_key)
#print(type(youtube))
pp = PrettyPrinter()
nextPageToken = ''
for x in range(2):
    request = youtube.search().list(
        q = query,
        part='id',
        maxResults=50,
        order="date",
        # publishedAfter='2022-05-09T00:00:00.000Z',
        # publishedBefore='2022-07-09T00:00:00.000Z',
        pageToken=nextPageToken,
        type='video')
    print(type(request))
    res = request.execute()
    pp.pprint(res)
    if 'nextPageToken' in res:
        nextPageToken = res['nextPageToken']
ids = [item['id']['videoId'] for item in res['items']]
results = youtube.videos().list(id=ids, part='snippet').execute()
for result in results.get('items', []):
    print(result['id'])
    print(result['snippet']['channelTitle'])
    print(result['snippet']['title'])
    print(result['snippet']['description'])
1 answer
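Your problem doesn't seem to be related to pandas. In Python, a variable assigned inside a for loop keeps only its last assigned value once the loop ends, so after the loop res holds just the final page. That is why only 50 results are processed (the maxResults value passed to Search: list) and not 100, even though Search: list is called twice. If you want to call Videos: list with the ids you just got from Search: list, you only need to indent the last block of code so that it runs inside the loop.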
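To end up with every page in one DataFrame rather than only printing each page, the rows can also be accumulated in a list that lives outside the loop. The following is a minimal sketch of that idea, not the asker's exact code: collect_rows, fake_search_page and fake_video_details are hypothetical names, and the two fake functions are offline stand-ins that only imitate the shape of the Search: list and Videos: list responses so the example runs without an API key.

```python
from typing import Callable, Dict, List


def collect_rows(
    search_page: Callable[[str], Dict],
    video_details: Callable[[List[str]], Dict],
    max_pages: int = 2,
) -> List[Dict]:
    """Accumulate one row per video across all pages (hypothetical helper)."""
    rows: List[Dict] = []  # lives outside the loop, so nothing is overwritten
    page_token = ""        # empty token for the first page, as in the question
    for _ in range(max_pages):
        res = search_page(page_token)
        ids = [item["id"]["videoId"] for item in res.get("items", [])]
        if ids:
            # Look up details for THIS page before moving on to the next one.
            details = video_details(ids)
            for video in details.get("items", []):
                snippet = video["snippet"]
                rows.append({
                    "id": video["id"],
                    "channelTitle": snippet["channelTitle"],
                    "title": snippet["title"],
                    "description": snippet["description"],
                })
        page_token = res.get("nextPageToken", "")
        if not page_token:
            break  # no further pages available
    return rows


# Offline stand-ins (assumptions): they mimic the response shape only.
def fake_search_page(page_token: str) -> Dict:
    start = int(page_token) if page_token else 0
    res: Dict = {"items": [{"id": {"videoId": f"vid{i}"}}
                           for i in range(start, start + 50)]}
    if start + 50 < 100:
        res["nextPageToken"] = str(start + 50)
    return res


def fake_video_details(ids: List[str]) -> Dict:
    return {"items": [{"id": v,
                       "snippet": {"channelTitle": "channel",
                                   "title": f"title {v}",
                                   "description": ""}}
                      for v in ids]}


rows = collect_rows(fake_search_page, fake_video_details, max_pages=2)
print(len(rows))  # 100
```

With the real client, search_page would presumably wrap youtube.search().list(..., pageToken=token).execute() and video_details would wrap youtube.videos().list(id=','.join(ids), part='snippet').execute(). The resulting list of dicts can then be passed to pd.DataFrame(rows) to get one row per video.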