pandas: How do I fill an empty matrix with elements using a for loop?

b4wnujal posted on 2023-03-16 in Other

I'm a beginner who has been studying bioinformatics with scanpy lately. I'm trying to improve, so any help is very welcome. Thanks a lot!

import pandas as pd

# These lists contain gene names.
Angio=['ADAM17','AXIN1','AXIN2','CCND2','DKK1','DKK4'] 
Hypoxia=['ADAM17','AXIN1','DLL1','FZD8','FZD1'] 
Infla=['DLL1','FZD8','CCND2','DKK1','ADAM17','JAG2','JAG1'] 
Glycolysis=['MYC','NKD1','PPARD','JAG2','JAG1'] 
Oxophos=['SKP2','TCF7','NUMB']
P53=['NUMB','FZD8','CCND2','AXIN2','KAT2A'] 

df = pd.DataFrame(columns=['Angio', 'Hypoxia', 'Infla', 
                           'Glycolysis', 'Oxophos', 'P53'],
                  index=['Angio', 'Hypoxia', 'Infla', 
                           'Glycolysis', 'Oxophos', 'P53'])

print(df)
            Angio  Hypoxia  Infla  Glycolysis  Oxophos  P53
Angio         NaN      NaN    NaN         NaN      NaN  NaN
Hypoxia       NaN      NaN    NaN         NaN      NaN  NaN
Infla         NaN      NaN    NaN         NaN      NaN  NaN
Glycolysis    NaN      NaN    NaN         NaN      NaN  NaN
Oxophos       NaN      NaN    NaN         NaN      NaN  NaN
P53           NaN      NaN    NaN         NaN      NaN  NaN

# The function below computes the Jaccard similarity score.
# Its inputs are any two of the six lists above.
def jaccard(list1, list2):
    # Convert to sets so duplicate gene names are counted only once.
    set1, set2 = set(list1), set(list2)
    intersection = len(set1 & set2)
    union = len(set1 | set2)
    return intersection / union

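The six lists contain gene names.
The lists have the same names as the row and column labels of 'df'.
Each value should be obtained by passing to the jaccard function the two lists named by a cell's row label and column label (since the six list names above are also the row and column names).
What I want is to use a for loop to replace the NaN values in 'df' with the values returned by jaccard.
I've been trying to solve this, but nothing has worked and I don't know what to do next. I'm a bit lost here, so please help me. Thank you.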

gwbalxhn #1

If you can convert your lists into a dictionary, I suggest the following solution:

import pandas as pd

# This dict maps each gene-set name to its list of gene names.
genes_dict = {
    'Angio':['ADAM17','AXIN1','AXIN2','CCND2','DKK1','DKK4'],
    'Hypoxia':['ADAM17','AXIN1','DLL1','FZD8','FZD1'],
    'Infla':['DLL1','FZD8','CCND2','DKK1','ADAM17','JAG2','JAG1'],
    'Glycolysis':['MYC','NKD1','PPARD','JAG2','JAG1'],
    'Oxophos':['SKP2','TCF7','NUMB'],
    "P53":['NUMB','FZD8','CCND2','AXIN2','KAT2A'],
}

# The function below computes the Jaccard similarity score.
# Its inputs are any two of the six lists above.
def jaccard(list1, list2):
    # Convert to sets so duplicate gene names are counted only once.
    set1, set2 = set(list1), set(list2)
    intersection = len(set1 & set2)
    union = len(set1 | set2)
    return intersection / union

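names_list = list(genes_dict.keys())

# Build a nested dict: res[name_i][name_j] holds the Jaccard score of the two gene lists.
res = {}
for name_i in names_list:
    res[name_i] = {}
    for name_j in names_list:
        res[name_i][name_j] = jaccard(genes_dict[name_i], genes_dict[name_j])

# Outer keys become columns and inner keys become the index.
# The matrix is symmetric, so the orientation does not matter here.
df = pd.DataFrame(res)
print(df)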
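
The question asked specifically about replacing the NaN values of an existing 'df' with a for loop. The following is only a sketch of one way to do that, not the answer author's method: it keeps the dictionary idea from above, pre-allocates the 6x6 frame, and writes each cell with df.loc[row, col]. The names 'names', 'row' and 'col' are made up for illustration, and dtype=float is an assumption so that the cells are stored as numbers rather than objects.

```python
import pandas as pd

# Gene lists from the question, keyed by the labels used for rows and columns.
genes_dict = {
    'Angio': ['ADAM17', 'AXIN1', 'AXIN2', 'CCND2', 'DKK1', 'DKK4'],
    'Hypoxia': ['ADAM17', 'AXIN1', 'DLL1', 'FZD8', 'FZD1'],
    'Infla': ['DLL1', 'FZD8', 'CCND2', 'DKK1', 'ADAM17', 'JAG2', 'JAG1'],
    'Glycolysis': ['MYC', 'NKD1', 'PPARD', 'JAG2', 'JAG1'],
    'Oxophos': ['SKP2', 'TCF7', 'NUMB'],
    'P53': ['NUMB', 'FZD8', 'CCND2', 'AXIN2', 'KAT2A'],
}


def jaccard(list1, list2):
    """Jaccard similarity: size of the intersection divided by size of the union."""
    set1, set2 = set(list1), set(list2)
    return len(set1 & set2) / len(set1 | set2)


# Hypothetical helper name: the labels shared by the rows and the columns.
names = list(genes_dict)

# Empty (all-NaN) square matrix, as in the question; dtype=float is an assumption.
df = pd.DataFrame(index=names, columns=names, dtype=float)

# Fill every cell by looking up the two gene lists through the row and column labels.
for row in names:
    for col in names:
        df.loc[row, col] = jaccard(genes_dict[row], genes_dict[col])

print(df.round(3))
```

As a hand check, 'Angio' and 'Hypoxia' share two genes (ADAM17 and AXIN1) out of nine distinct genes in total, so that cell should be 2/9, and every diagonal cell should be 1.0. For six lists the nested loop is fast enough; with many more gene sets it may be worth computing only one triangle and mirroring it, since the matrix is symmetric.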
