Scrapy: how do I access nested keys and their values to modify or delete all items at once?

Posted by tpgth1q7 on 2022-11-09 in Other

I'm learning Python and a few other things because I need to collect a large amount of data for my project, and I'm stuck here.

{
  "status": "ok",
  "status_message": "Query was successful",
  "data": {
    "product_count": 40993,
    "limit": 20,
    "page_number": 1,
    "products": [
      {
        "id": 41789,
        "url": "https://anything1.com",
        "product_name": "product1",
        "manufacturing_date": "19.12.2014",
        "rating": 5.3,
        "material": "something",
        "description": "",
        "cover_image": "anycover1.com",
        "state": "ok",
        "variants": [
          {
            "url": "https://anyvariant1.com",
            "product_code": "55BEF7",
            "material": "something",
            "size": "small",
            "dimensions": ""
          },
          {
            "url": "https://anyvariant2.com",
            "product_code": "55BEF8",
            "material": "something",
            "size": "medium",
            "dimensions": ""
          },
          {
            "url": "https://anyvariant3.com",
            "product_code": "55BEF9",
            "material": "something",
            "size": "large",
            "dimensions": ""
          }
        ]
      },
      {
        "id": 41790,
        "url": "https://anything2.com",
        "product_name": "product2",
        "manufacturing_date": "02.10.2014",
        "rating": 7.2,
        "material": "something",
        "description": "",
        "cover_image": "anycover2.com",
        "state": "ok",
        "variants": [
          {
            "url": "https://anyvariant4.com",
            "product_code": "55BEG7",
            "material": "something",
            "size": "small",
            "dimensions": ""
          },
          {
            "url": "https://anyvariant5.com",
            "product_code": "55BEG8",
            "material": "something",
            "size": "medium",
            "dimensions": ""
          },
          {
            "url": "https://anyvariant6.com",
            "product_code": "55BEG9",
            "material": "something",
            "size": "large",
            "dimensions": ""
          }
        ]
      },
      {
        _______
      },
      {
        _______
      }      
    ]
  },
  "@meta": {
    "server_time": 1651288705,
    "execution_time": "0.01 ms"
  }
}

This is what my scraper code looks like:

import json

# Assumed to be the spider's callback; the original shows only the body.
def parse(self, response):
    data = json.loads(response.body)
    data_main = data['data']['products']
    product_list = []
    for item in data_main:
        # Pull out the top-level fields of interest
        id = item['id']
        url = item['url']
        product_name = item['product_name']
        rating = item['rating']
        cover_image = item['cover_image']
        description = item['description']
        # Rebuild a smaller dict, renaming some keys
        product = {
            'id': id,
            'url': url,
            'name': product_name,
            'image': cover_image,
            'rating': rating,
            'description': description
        }
        product_list.append(product)
    return product_list

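With this, the keys and values for id, url, name, image, rating, and description are accessible. But I can't access and modify the nested keys and their values all at once (while ignoring some of the keys and values). How can I do that? If there is any better code to achieve what I need, please suggest it. Thanks a lot.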

exdqitrt1#

Assuming the nested keys and values you mean are the ones under variants, you access them in much the same way you iterate over the items:

# Inside the `for item in data_main:` loop from the question
variant_list = []
for variant in item['variants']:
    url = variant['url']
    # ...and so on for whatever other keys you're interested in
    new_variant = {'url': url}  # plus whatever other keys you want
    variant_list.append(new_variant)
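
To show how the inner loop fits together with the outer one, here is a minimal sketch that builds each product with a trimmed list of its variants. The sample data is a shortened, hypothetical copy of the response in the question, and the names `KEEP_VARIANT_KEYS` and `extract_products` are made up for illustration; which keys to keep is an assumption.

```python
import json

# Trimmed, hypothetical sample shaped like the response in the question.
SAMPLE = """
{
  "data": {
    "products": [
      {
        "id": 41789,
        "url": "https://anything1.com",
        "product_name": "product1",
        "variants": [
          {"url": "https://anyvariant1.com", "product_code": "55BEF7",
           "material": "something", "size": "small", "dimensions": ""},
          {"url": "https://anyvariant2.com", "product_code": "55BEF8",
           "material": "something", "size": "medium", "dimensions": ""}
        ]
      }
    ]
  }
}
"""

# Assumed selection: the variant keys to keep; every other key is ignored.
KEEP_VARIANT_KEYS = ("url", "product_code", "size")


def extract_products(payload):
    """Build one dict per product, each carrying a trimmed list of variants."""
    products = []
    for item in payload["data"]["products"]:
        # Copy only the wanted keys from every variant of this product.
        variants = [
            {key: variant[key] for key in KEEP_VARIANT_KEYS if key in variant}
            for variant in item.get("variants", [])
        ]
        products.append({
            "id": item["id"],
            "url": item["url"],
            "name": item["product_name"],
            "variants": variants,
        })
    return products


products = extract_products(json.loads(SAMPLE))
```

In a spider, the same function could be called on `json.loads(response.body)` in place of the sample string.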

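However, I have to wonder why you are rebuilding dictionaries that closely resemble the ones the JSON already gives you. For many purposes, you might as well keep using the dictionaries that json.loads returns.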
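
If you do keep the parsed dictionaries, one possible way to change or delete nested keys for every item in a single pass is to edit them in place. This is only a sketch: the sample is a shortened, hypothetical copy of the question's data, `prune_variants` and `DROP_KEYS` are invented names, and upper-casing `size` is just a stand-in for whatever modification you need.

```python
import json

# Trimmed, hypothetical sample shaped like the response in the question.
SAMPLE = """
{
  "data": {
    "products": [
      {
        "id": 41789,
        "product_name": "product1",
        "variants": [
          {"url": "https://anyvariant1.com", "product_code": "55BEF7",
           "material": "something", "size": "small", "dimensions": ""}
        ]
      }
    ]
  }
}
"""

# Assumed choice of nested keys to delete from every variant.
DROP_KEYS = ("material", "dimensions")


def prune_variants(payload, drop_keys=DROP_KEYS):
    """Edit every variant of every product in place and return the payload."""
    for item in payload["data"]["products"]:
        for variant in item.get("variants", []):
            for key in drop_keys:
                # pop with a default does nothing if the key is absent
                variant.pop(key, None)
            # Example modification applied to all variants at once
            if "size" in variant:
                variant["size"] = variant["size"].upper()
    return payload


payload = prune_variants(json.loads(SAMPLE))
```

Because the dictionaries are mutated in place, the untouched keys (`id`, `product_name`, and so on) stay exactly as the JSON provided them.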
