How do I convert the information in a JSON file into a data frame in R?

5lwkijsr  posted on 2023-02-20  in  Other

I have some information in a JSON file named data.json, as shown below:

{
  "header" : {
    "apiVersion" : "v1",
    "code" : "200",
    "service" : "catalogwebservice",
    "developerMessage" : "",
    "userMessage" : "OK",
    "errorCode" : "1",
    "docLink" : "https://ega-archive.org",
    "errorStack" : ""
  },
  "response" : {
    "numTotalResults" : 12,
    "resultType" : "SampleData",
    "result" : [ {
      "alias" : "JKDFG093.T2",
      "egaStableId" : "EGAN00003456789",
      "centerName" : "Novartis",
      "creationTime" : "2016-05-13Y17:08.001Z",
      "title" : "JKDFG093.T2",
      "bioSampleId" : "MADFG110656789",
      "subjectId" : "JKDFG093",
      "gender" : "male",
      "phenotype" : "Cancer",
      "attributes" : null
    }, {
      "alias" : "JKDFG093.T1",
      "egaStableId" : "EGAN00003456780",
      "centerName" : "Novartis",
      "creationTime" : "2016-05-13Y17:08.001Z",
      "title" : "JKDFG093.T1",
      "bioSampleId" : "MADFG110656790",
      "subjectId" : "JKDFG093",
      "gender" : "female",
      "phenotype" : "Cancer",
      "attributes" : null
    }, {
      "alias" : "JKDFG087.T1",
      "egaStableId" : "EGAN00003456781",
      "centerName" : "Novartis",
      "creationTime" : "2016-05-13Y17:08.001Z",
      "title" : "JKDFG087.T1",
      "bioSampleId" : "MADFG110656791",
      "subjectId" : "JKDFG087",
      "gender" : "male",
      "phenotype" : "Cancer",
      "attributes" : null
    } ]
  }
}

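I want to convert the information in the JSON file into a data frame. I need alias, egaStableId, centerName, creationTime, title, bioSampleId, subjectId, gender, phenotype, and attributes from the JSON file above as column names, with their respective values shown in the data frame.
I loaded the JSON file in R and tried to convert it to a data frame, but ended up with an error.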

library(rjson)
data <- rjson::fromJSON(file = "data.json")
json_data_frame <- as.data.frame(data)

The error I got:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 1, 0

Any help is appreciated. Thanks!

6uxekuva1#

You need to drill down into the parsed data to get the result, namely $response$result.
Replace json here with your file name:

jsonlite::fromJSON(json)$response$result
#         alias     egaStableId centerName          creationTime       title    bioSampleId subjectId gender phenotype attributes
# 1 JKDFG093.T2 EGAN00003456789   Novartis 2016-05-13Y17:08.001Z JKDFG093.T2 MADFG110656789  JKDFG093   male    Cancer         NA
# 2 JKDFG093.T1 EGAN00003456780   Novartis 2016-05-13Y17:08.001Z JKDFG093.T1 MADFG110656790  JKDFG093 female    Cancer         NA
# 3 JKDFG087.T1 EGAN00003456781   Novartis 2016-05-13Y17:08.001Z JKDFG087.T1 MADFG110656791  JKDFG087   male    Cancer         NA
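
For a version that can be run without the original file, here is a minimal sketch. It assumes the jsonlite package is installed; `json_text`, `json_path`, and `samples` are names made up for this illustration, and the JSON is a trimmed, hypothetical stand-in for data.json with only three of the ten fields.

```r
library(jsonlite)

# Hypothetical, trimmed stand-in for data.json (same nesting as in the question).
json_text <- '{
  "header": {"code": "200", "userMessage": "OK"},
  "response": {
    "numTotalResults": 2,
    "resultType": "SampleData",
    "result": [
      {"alias": "JKDFG093.T2", "gender": "male", "attributes": null},
      {"alias": "JKDFG093.T1", "gender": "female", "attributes": null}
    ]
  }
}'

# Write it to a temporary file so the call mirrors reading a real file from disk.
json_path <- tempfile(fileext = ".json")
writeLines(json_text, json_path)

# fromJSON() accepts a file path; the sample records sit under response -> result.
samples <- jsonlite::fromJSON(json_path)$response$result
print(samples)
```

With the real file, `json_path` would simply be "data.json".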

That is, if you look at the output of fromJSON, you will see a data frame nested (somewhat deeply) inside the list:

str(jsonlite::fromJSON(json))
# List of 2
#  $ header  :List of 8
#   ..$ apiVersion      : chr "v1"
#   ..$ code            : chr "200"
#   ..$ service         : chr "catalogwebservice"
#   ..$ developerMessage: chr ""
#   ..$ userMessage     : chr "OK"
#   ..$ errorCode       : chr "1"
#   ..$ docLink         : chr "https://ega-archive.org"
#   ..$ errorStack      : chr ""
#  $ response:List of 3
#   ..$ numTotalResults: int 12
#   ..$ resultType     : chr "SampleData"
#   ..$ result         :'data.frame':   3 obs. of  10 variables:
#   .. ..$ alias       : chr [1:3] "JKDFG093.T2" "JKDFG093.T1" "JKDFG087.T1"
#   .. ..$ egaStableId : chr [1:3] "EGAN00003456789" "EGAN00003456780" "EGAN00003456781"
#   .. ..$ centerName  : chr [1:3] "Novartis" "Novartis" "Novartis"
#   .. ..$ creationTime: chr [1:3] "2016-05-13Y17:08.001Z" "2016-05-13Y17:08.001Z" "2016-05-13Y17:08.001Z"
#   .. ..$ title       : chr [1:3] "JKDFG093.T2" "JKDFG093.T1" "JKDFG087.T1"
#   .. ..$ bioSampleId : chr [1:3] "MADFG110656789" "MADFG110656790" "MADFG110656791"
#   .. ..$ subjectId   : chr [1:3] "JKDFG093" "JKDFG093" "JKDFG087"
#   .. ..$ gender      : chr [1:3] "male" "female" "male"
#   .. ..$ phenotype   : chr [1:3] "Cancer" "Cancer" "Cancer"
#   .. ..$ attributes  : logi [1:3] NA NA NA
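
The nested data frame in that output comes from jsonlite simplifying arrays of similar records by default. The sketch below contrasts the default with `simplifyVector = FALSE`, which leaves the records as a plain list of lists. It assumes jsonlite is installed; `json_text`, `simplified`, and `unsimplified` are illustrative names, and the JSON is a hypothetical two-record excerpt.

```r
library(jsonlite)

# Hypothetical two-record excerpt with the same nesting as the file in the question.
json_text <- '{"response": {"result": [
  {"alias": "JKDFG093.T2", "attributes": null},
  {"alias": "JKDFG093.T1", "attributes": null}
]}}'

# Default behaviour: the array of records is simplified into a data frame.
simplified <- jsonlite::fromJSON(json_text)$response$result

# With simplification switched off, each record stays a separate list.
unsimplified <- jsonlite::fromJSON(json_text, simplifyVector = FALSE)$response$result

print(is.data.frame(simplified))
print(is.data.frame(unsimplified))
str(unsimplified)
```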

I'm using jsonlite; I believe it is very close to rjson and should be able to do the same thing. If yours is not listed as

..$ result         :'data.frame': 3 obs. of  10 variables:

in the str output, then wrap it in as.data.frame, like this:

as.data.frame(jsonlite::fromJSON(json)$response$result)
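
If the parser hands back the records as a plain list of lists rather than a data frame (which I would expect from rjson, though that is an assumption here), a direct as.data.frame call may not be enough, because a NULL field has length zero and cannot form a column. The following base R sketch converts each NULL to NA and then row-binds the records. The `records` list is built by hand as a hypothetical stand-in for the parsed result, and `null_to_na` is a helper invented for this example.

```r
# Hypothetical stand-in for the parsed records: a list of lists with NULL fields.
# With rjson this would presumably be rjson::fromJSON(file = "data.json")$response$result.
records <- list(
  list(alias = "JKDFG093.T2", gender = "male", attributes = NULL),
  list(alias = "JKDFG093.T1", gender = "female", attributes = NULL)
)

# Replace every NULL field with NA so that each field has length one.
null_to_na <- function(record) {
  lapply(record, function(value) if (is.null(value)) NA else value)
}

# Turn each record into a one-row data frame, then stack the rows.
rows <- lapply(records, function(record) {
  as.data.frame(null_to_na(record), stringsAsFactors = FALSE)
})
samples <- do.call(rbind, rows)
print(samples)
```

This assumes every record carries the same field names in the same order, as they do in the file from the question.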
