Reading a JSON file that contains nested lists in R

v8wbuo2f  posted on 2023-02-01 in Other

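I have a large JSON dataset that I want to convert into a data frame in R.
(Apologies if this is a duplicate question, but the other answers did not help me.) My JSON file looks like this: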

[{"src": "http://www.europarl.eu", "peid": "PE529.899v01-00", "reference": "2014/2021(INI)", "date": "2014-03-05T00:00:00", "committee": ["AFET"], "seq": 1, "id": "PE529.899-1", "orig_lang": "en", "new": ["- having regard to its resolution of 13", "December 20071 on Justice for the", "'Comfort Women' (sex slaves in Asia", "before and during World War II) as well", "as the statements by Japanese Chief", "Cabinet Secretary Yohei Kono in 1993", "and by the then Prime Minister Tomiichi", "Murayama in 1995, the resolutions of the", "Japanese parliament (the Diet) of 1995", "and 2005 expressing apologies for", "wartime victims, including victims of the", "'comfort women' system,", "_______________________", "1", "OJ C 323E, 18.12.2008, p.531"], "authors": "Reinhard Bütikofer on behalf of the Verts/ALE Group", "meps": [96739], "location": [["Motion for a resolution", "Citation 6 a (new)"]], "meta": {"created": "2019-07-03T05:06:17"}, "changes": {}}
,{"src": "http://www.europarl.eu", "peid": "PE529.863v01-00", "reference": "2014/2016(INI)", "date": "2014-02-27T00:00:00", "committee": ["AFET"], "seq": 1, "id": "PE529.863-1", "orig_lang": "en", "new": ["- having regard to the Statement by the", "Vice-President of the Commission/ High", "Representative of the Union for Foreign", "affairs and Security Policy (VP/HR)", "Catherine Ashton of 20 March 2013 on", "the Magnitsky case in the Russian", "Federation,"], "authors": "Jacek Protasiewicz", "meps": [23782], "location": [["Motion for a resolution", "Citation 4 a (new)"]], "meta": {"created": "2019-07-03T05:06:17"}, "changes": {}}
,{"src": "http://www.europarl.eu", "peid": "PE529.713v01-00", "reference": "2013/2149(INI)", "date": "2014-02-12T00:00:00", "committee": ["AFET"], "seq": 238, "id": "PE529.713-238", "orig_lang": "en", "old": ["A. whereas the European Neighbourhood", "Policy (ENP), in particular the Eastern", "Partnership (EaP), aims to extend the", "values and ideas of the founders of the EU;"], "new": ["A. whereas the European Neighbourhood", "Policy (ENP) embraces the values and", "ideas of the founders of the EU, notably", "the principles of Peace, Solidarity and", "Prosperity;"], "authors": "Mário David", "meps": [96973], "location": [["Motion for a resolution", "Recital A"]], "meta": {"created": "2019-07-03T05:06:18"}, "changes": {}}
,{"src": "http://www.europarl.eu", "peid": "PE529.899v01-00", "reference": "2014/2021(INI)", "date": "2014-03-05T00:00:00", "committee": ["AFET"], "seq": 2, "id": "PE529.899-2", "orig_lang": "en", "new": ["- having regard to the catastrophic", "earthquake and subsequent tsunami", "which devastated important parts of", "Japan's coast on 11 March 2011 and led", "to the destruction of the Fukushima", "nuclear power plant, causing possibly the", "greatest radiation disaster in human", "history,"], "authors": "Reinhard Bütikofer on behalf of the Verts/ALE Group", "meps": [96739], "location": [["Motion for a resolution", "Citation 11 a (new)"]], "meta": {"created": "2019-07-03T05:06:18"}, "changes": {}}

I would like to get a data frame like this:

src               peid          reference                date           committee        seq        id        orig_lang             new                  ...  
http://www.europarl.eu PE529.899v01-00  2014/2021(INI)    2014-03-05T00:00:00       AFET           1      PE529.899-1       en      ["- having ... p.531"]          ...
http://www.europarl.eu PE529.863v01-00  2014/2016(INI)    2014-02-27T00:00:00       AFET          128     PE529.899-1       en      ["- having ..."Federation,"]  ...
http://www.europarl.eu PE529.713v01-00  2013/2149(INI)    2014-02-12T00:00:00       AFET          238     PE529.899-1       en      ["- having ..."Federation,"]    ...
http://www.europarl.eu PE529.899v01-00  2014/2021(INI)    2014-03-05T00:00:00       AFET           1      PE529.899-1       en      ["- having ..."Federation,"]    ...

(The table above is not filled in completely.)
I have tried the following code:

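library(rjson)
library(jsonlite)
# Both packages export fromJSON(); the file= argument belongs to rjson,
# so the call is namespace-qualified to avoid masking by jsonlite.
Data <- rjson::fromJSON(file = "data.json")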

But each row comes out like this:

[[1]]
[[1]]$src
[1] "http://www.europarl.eu/sides/getDoc.do?pubRef=-//EP//NONSGML+COMPARL+PE-529.899+01+DOC+PDF+V0//EN&language=EN"

[[1]]$peid
[1] "PE529.899v01-00"

[[1]]$reference
[1] "2014/2021(INI)"

[[1]]$date
[1] "2014-03-05T00:00:00"

[[1]]$committee
[1] "AFET"

[[1]]$seq
[1] 1

[[1]]$id
[1] "PE529.899-1"

[[1]]$orig_lang
[1] "en"

[[1]]$new
[1] "- having regard to its resolution of 13"   "December 20071 on Justice for the"        
[3] "'Comfort Women' (sex slaves in Asia"       "before and during World War II) as well"  
[5] "as the statements by Japanese Chief"       "Cabinet Secretary Yohei Kono in 1993"     
[7] "and by the then Prime Minister Tomiichi"   "Murayama in 1995, the resolutions of the" 
[9] "Japanese parliament (the Diet) of 1995"    "and 2005 expressing apologies for"        
[11] "wartime victims, including victims of the" "'comfort women' system,"                  
[13] "_______________________"                   "1"                                        
[15] "OJ C 323E, 18.12.2008, p.531"             

[[1]]$authors
[1] "Reinhard Bütikofer on behalf of the Verts/ALE Group"

[[1]]$meps
[1] 96739

[[1]]$location
[[1]]$location[[1]]
[1] "Motion for a resolution" "Citation 6 a (new)"     

[[1]]$meta
[[1]]$meta$created
[1] "2019-07-03T05:06:17"

[[1]]$changes
list()

The dput version is as follows:

list(list(src = "http://www.europarl.eu", 
    peid = "PE529.899v01-00", reference = "2014/2021(INI)", date = "2014-03-05T00:00:00", 
    committee = "AFET", seq = 1, id = "PE529.899-1", orig_lang = "en", 
    new = c("- having regard to its resolution of 13", "December 20071 on Justice for the", 
    "'Comfort Women' (sex slaves in Asia", "before and during World War II) as well", 
    "as the statements by Japanese Chief", "Cabinet Secretary Yohei Kono in 1993", 
    "and by the then Prime Minister Tomiichi", "Murayama in 1995, the resolutions of the", 
    "Japanese parliament (the Diet) of 1995", "and 2005 expressing apologies for", 
    "wartime victims, including victims of the", "'comfort women' system,", 
    "_______________________", "1", "OJ C 323E, 18.12.2008, p.531"
    ), authors = "Reinhard Bütikofer on behalf of the Verts/ALE Group", 
    meps = 96739, location = list(c("Motion for a resolution", 
    "Citation 6 a (new)")), meta = list(created = "2019-07-03T05:06:17"), 
    changes = list()))

One problem I ran into is column 9 (new): as you can see, I want all 15 components to go into a single cell of the data frame.

[[1]]$new
 [1] "- having regard to its resolution of 13"   "December 20071 on Justice for the"        
 [3] "'Comfort Women' (sex slaves in Asia"       "before and during World War II) as well"  
 [5] "as the statements by Japanese Chief"       "Cabinet Secretary Yohei Kono in 1993"     
 [7] "and by the then Prime Minister Tomiichi"   "Murayama in 1995, the resolutions of the" 
 [9] "Japanese parliament (the Diet) of 1995"    "and 2005 expressing apologies for"        
[11] "wartime victims, including victims of the" "'comfort women' system,"                  
[13] "_______________________"                   "1"                                        
[15] "OJ C 323E, 18.12.2008, p.531"

How can I get the table I described above?


wkyowqbh  #1

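We can convert the nested list elements whose length is greater than 1 into a single string by pasting them together (str_c), and then bind each named list into a one-row tibble with map_dfr; the outer map_dfr stacks those rows.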

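library(purrr)
library(dplyr)
library(stringr)
# Data is the nested list read from data.json in the question
map_dfr(Data, ~ map(.x, unlist) %>%   # flatten every field of one record
     # collapse multi-element fields into one ";"-separated string
     map_dfr(~ if(length(.x) > 1) str_c(.x, collapse = ";") else .x))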
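
For readers without the tidyverse installed, the same collapse-then-bind idea can be sketched in base R. This is only an illustration under stated assumptions, not the answer's own code: the two records are abridged from the question's data, and the names records, flatten_record, all_cols, and result are hypothetical.

```r
# Two records abridged from the question; only the first has an "old" field.
records <- list(
  list(id = "PE529.713-238", seq = 238,
       old = c("A. whereas the European Neighbourhood",
               "Policy (ENP), in particular the Eastern"),
       new = c("A. whereas the European Neighbourhood",
               "Policy (ENP) embraces the values and"),
       location = list(c("Motion for a resolution", "Recital A")),
       meta = list(created = "2019-07-03T05:06:18"),
       changes = list()),
  list(id = "PE529.863-1", seq = 1,
       new = c("- having regard to the Statement by the",
               "Vice-President of the Commission/ High"),
       location = list(c("Motion for a resolution", "Citation 4 a (new)")),
       meta = list(created = "2019-07-03T05:06:17"),
       changes = list())
)

# Turn one record into a one-row data frame (hypothetical helper).
flatten_record <- function(rec) {
  cells <- lapply(rec, function(x) {
    x <- unlist(x, use.names = FALSE)           # flatten nested lists
    if (length(x) == 0) NA                      # empty list, e.g. changes
    else if (length(x) > 1) paste(x, collapse = ";")
    else x                                      # scalars keep their type
  })
  as.data.frame(cells, stringsAsFactors = FALSE)
}

rows <- lapply(records, flatten_record)

# Records may have different fields, so pad missing columns with NA
# before stacking the rows with rbind().
all_cols <- unique(unlist(lapply(rows, names)))
rows <- lapply(rows, function(df) {
  for (col in setdiff(all_cols, names(df))) df[[col]] <- NA
  df[all_cols]
})
result <- do.call(rbind, rows)
```

Here result has one row per record; the new and location cells hold ";"-separated strings, and old is NA for the record that lacks that field.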

Alternatively, use the recursive function rrapply with how = "bind", which keeps elements with length greater than 1 as list entries.

library(rrapply)
map_dfr(Data, ~ rrapply(.x, how = "bind"))
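
If the goal is to keep all components of new in a single cell without gluing them into one string, a list column is another option. The following is a minimal base R sketch of that idea, not the rrapply call above; the two-record sample and the names records and df are made up for illustration.

```r
# Two abridged records in the shape returned by rjson::fromJSON().
records <- list(
  list(id = "PE529.899-1", seq = 1,
       new = c("- having regard to its resolution of 13",
               "December 20071 on Justice for the")),
  list(id = "PE529.863-1", seq = 1,
       new = "- having regard to the Statement by the")
)

# Scalar fields become ordinary columns.
df <- data.frame(
  id  = vapply(records, function(r) r$id, character(1)),
  seq = vapply(records, function(r) r$seq, numeric(1)),
  stringsAsFactors = FALSE
)

# Assigning a list with one element per row creates a list column,
# so each cell holds the full character vector for that record.
df$new <- lapply(records, function(r) r$new)
```

Each cell of df$new is then a character vector, for example df$new[[1]] returns both lines of the first record.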
