Create an edge list from a table with edge attributes in R

Posted by gjmwrych on 2023-01-18 in Other

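For a network analysis with igraph, I want to create an edge list from a table, including an edge attribute that is stored in the Origin variable. I imported an Excel file into the data frame ID_Kontakt_import_test.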

After that, I split the second column into multiple columns and trimmed the whitespace.

library(tidyr)  # provides separate()
test <- separate(ID_Kontakt_import_test, 'Contacts 1', paste("Contacts", 1:20, sep = "_"), sep = ",", extra = "drop")
test <- data.frame(lapply(test, trimws), stringsAsFactors = FALSE)
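As an aside, the comma-separated column can also be split straight into rows rather than into 20 columns, which avoids having to guess the maximum number of contacts. The following is only a sketch: the data frame `raw` and its values are made up for illustration and stand in for the imported Excel sheet.

```r
library(tidyr)

# Hypothetical stand-in for the imported sheet (values invented for illustration)
raw <- data.frame(
  ID = c("ID_003", "ID_009"),
  `Contacts 1` = c("ID_001, ID_002 ,ID_004", "ID_398"),
  Origin = c("1", "2"),
  check.names = FALSE,       # keep the space in the column name
  stringsAsFactors = FALSE
)

# One row per contact; ID and Origin are repeated for each piece
long <- separate_rows(raw, `Contacts 1`, sep = ",")

# Remove the whitespace left around the pieces
long$`Contacts 1` <- trimws(long$`Contacts 1`)
```

With this shape, each row is already an ID/contact pair with its Origin value attached.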

This is part of my dataset.

structure(list(ID = c("ID_003", "ID_004", "ID_009", "ID_009"), 
    Contacts_1 = c("ID_001", "ID_001", "ID_001", "ID_398"), Contacts_2 = c("ID_002", 
    "ID_002", "ID_002", NA), Contacts_3 = c("ID_004", "ID_003", 
    "ID_003", NA), Contacts_4 = c("ID_005", "ID_005", "ID_004", 
    NA), Contacts_5 = c("ID_006", "ID_006", "ID_005", NA), Contacts_6 = c("ID_007", 
    "ID_007", "ID_006", NA), Contacts_7 = c("ID_008", "ID_008", 
    "ID_007", NA), Contacts_8 = c("ID_009", "ID_009", "ID_008", 
    NA), Contacts_9 = c(NA, NA, "ID_011", NA), Contacts_10 = c(NA, 
    NA, "ID_012", NA), Contacts_11 = c(NA, NA, "ID_013", NA), 
    Contacts_12 = c(NA, NA, "ID_016", NA), Contacts_13 = c(NA, 
    NA, "ID_017", NA), Contacts_14 = c(NA, NA, "ID_028", NA), 
    Contacts_15 = c(NA, NA, "ID_040", NA), Contacts_16 = c(NA_character_, 
    NA_character_, NA_character_, NA_character_), Contacts_17 = c(NA_character_, 
    NA_character_, NA_character_, NA_character_), Contacts_18 = c(NA_character_, 
    NA_character_, NA_character_, NA_character_), Contacts_19 = c(NA_character_, 
    NA_character_, NA_character_, NA_character_), Contacts_20 = c(NA_character_, 
    NA_character_, NA_character_, NA_character_), Origin = c("1", 
    "1", "1", "2")), class = "data.frame", row.names = c(NA, 
-4L))

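I have already created an edge list without the edge attribute by converting the data frame to a matrix and then building the edge list with cbind, but I don't know how to add the edge attribute as a third column.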

m <- as.matrix(test)
el <- cbind(m[, 1], c(m[, -1])) #create edgelist 

el<-na.omit(el) #drop NA
dups <- duplicated(t(apply(el, 1, sort)))
el2<-el[!dups, ] #drop duplicates
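One thing to watch in the matrix approach: `m[, -1]` drops only the ID column, so the Origin values appear to end up among the contacts. A base R sketch that selects only the Contacts_ columns and carries Origin along might look like this; the small `test` data frame below is a cut-down stand-in for the real one.

```r
# Cut-down stand-in for the `test` data frame from the question
test <- data.frame(
  ID = c("ID_003", "ID_009"),
  Contacts_1 = c("ID_001", "ID_398"),
  Contacts_2 = c("ID_002", NA),
  Origin = c("1", "2"),
  stringsAsFactors = FALSE
)

# Only the contact columns, so Origin is not mistaken for a contact
contact_cols <- grep("^Contacts_", names(test), value = TRUE)

# unlist() stacks the contact columns one after another, so ID and Origin
# are repeated once per contact column to stay aligned with it.
el <- data.frame(
  V1 = rep(test$ID, times = length(contact_cols)),
  V2 = unlist(test[contact_cols], use.names = FALSE),
  Origin = rep(test$Origin, times = length(contact_cols)),
  stringsAsFactors = FALSE
)

# Drop the pairs that have no contact
el <- el[!is.na(el$V2), ]
```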

The result I'm after, for all edges, looks basically like this:

| V1 | V2 | Origin |
| ------ | ------ | ------ |
| ID_003 | ID_001 | 1 |
| ID_003 | ID_009 | 1 |
| ID_009 | ID_040 | 1 |
| ID_009 | ID_389 | 2 |


Answer 1, by 5fjcxozz

Using tidyr/dplyr, where df is the data frame shown in the question:

library(tidyr)
library(dplyr)

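# Stack the Contacts_* columns into a single column V2, one row per ID/contact pair
df2 <- df %>%
    tidyr::pivot_longer(cols = contains("Contacts"), values_to = "V2") %>%
    dplyr::select(V1 = ID, V2, Origin)

# Keep only the rows where a contact is present
df2[complete.cases(df2),]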

# A tibble: 32 × 3
   V1     V2     Origin
   <chr>  <chr>  <chr> 
 1 ID_003 ID_001 1     
 2 ID_003 ID_002 1     
 3 ID_003 ID_004 1     
 4 ID_003 ID_005 1     
 5 ID_003 ID_006 1     
 6 ID_003 ID_007 1     
 7 ID_003 ID_008 1     
 8 ID_003 ID_009 1     
 9 ID_004 ID_001 1     
10 ID_004 ID_002 1     
# … with 22 more rows
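The question also drops duplicate pairs and wants to feed the result to igraph. A possible continuation is sketched below, assuming the edges are undirected and that a pair listed in both directions should count once. The small `df` here is a made-up stand-in, not the real data.

```r
library(tidyr)
library(dplyr)
library(igraph)

# Made-up stand-in with the same column layout as the question's data
df <- data.frame(
  ID = c("ID_003", "ID_004", "ID_009"),
  Contacts_1 = c("ID_004", "ID_003", "ID_398"),
  Contacts_2 = c("ID_001", NA, NA),
  Origin = c("1", "1", "2"),
  stringsAsFactors = FALSE
)

# values_drop_na = TRUE removes the NA contacts during the pivot
edges <- df %>%
  pivot_longer(cols = starts_with("Contacts"), values_to = "V2",
               values_drop_na = TRUE) %>%
  select(V1 = ID, V2, Origin)

# Treat A-B and B-A as the same edge: order each pair alphabetically,
# then keep the first occurrence of every ordered pair.
edges_unique <- edges %>%
  mutate(a = pmin(V1, V2), b = pmax(V1, V2)) %>%
  distinct(a, b, .keep_all = TRUE) %>%
  select(V1, V2, Origin)

# The first two columns are read as edge endpoints; the remaining
# columns (here Origin) become edge attributes.
g <- graph_from_data_frame(edges_unique, directed = FALSE)
E(g)$Origin
```

Note that `distinct()` keeps the first row of each pair, so if the same pair occurs with different Origin values, only the first value survives. Whether that is acceptable depends on the data.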
