I have a dataframe such as :
Names1 Gene_name Status
SP1 GENE1 0
SP1 GENE1 1
SP1 GENE1 1
SP1 GENE1 2
SP1 GENE1 2
SP1 GENE2 0
SP3 GENE2 0
SP1 GENE2 1
SP2 GENE2 2
SP4 GENE3 1
SP4 GENE3 2
SP5 GENE3 0
SP5 GENE3 0
Then I would like to fill a new dataframe where each Gene_name
is a column, and each Names
is a row :
Names GENE1 GENE2 GENE3
SP1
SP2
SP3
SP4
SP5
and fill cells Values
depending on the Satus
for each Names
groups
- if only 0 > value = 0
- if only 1 > value = 1
- if both 0 & 1 > value = 0-1
- if both 0 & 2 > value = 0-2
- if both 1 & 2 > value = 1-2
- if both 0 & 1 & 2 > value = 0-1-2
So for example GENE1
in SP1
both present a 0,1 and 2 status, so I fill 0-1-2
within the cell:
Names GENE1 GENE2 GENE3
SP1 0-1-2
SP2
SP3
SP4
SP5
then, SP2,SP3,SP4 and SP5
do not have value for the GENE1
, so I put NA
:
Names GENE1 GENE2 GENE3
SP1 0-1-2
SP2 NA
SP3 NA
SP4 NA
SP5 NA
Then for the GENE2:
GENE2
in SP1
both present a 0 and 1 status, so I fill 0-1
within the cell:
Names GENE1 GENE2 GENE3
SP1 0-1-2 0-1
SP2 NA
SP3 NA
SP4 NA
SP5 NA
GENE2
in SP2
present only a value 2 status, so I fill 2
within the cell:
Names GENE1 GENE2 GENE3
SP1 0-1-2 0-1
SP2 NA 2
SP3 NA
SP4 NA
SP5 NA
GENE2
in SP3
present only a value 0 status, so I fill 0
within the cell:
Names GENE1 GENE2 GENE3
SP1 0-1-2 0-1
SP2 NA 2
SP3 NA 0
SP4 NA
SP5 NA
and the other Names have no GENE2
values, so I put NA
:
Names GENE1 GENE2 GENE3
SP1 0-1-2 0-1
SP2 NA 2
SP3 NA 0
SP4 NA NA
SP5 NA NA
and so on... At the end I should get a full dataframe such as :
Names GENE1 GENE2 GENE3
SP1 0-1-2 0-1 NA
SP2 NA 2 NA
SP3 NA 0 NA
SP4 NA NA 0-2
SP5 NA NA 0
Does someone have an idea please ?
Here is the dict format of the dataframe if it can helps :
{'Names1': {0: 'SP1', 1: 'SP1', 2: 'SP1', 3: 'SP1', 4: 'SP1', 5: 'SP1', 6: 'SP3', 7: 'SP1', 8: 'SP2', 9: 'SP4', 10: 'SP4', 11: 'SP5', 12: 'SP5'}, 'Gene_name': {0: 'GENE1', 1: 'GENE1', 2: 'GENE1', 3: 'GENE1', 4: 'GENE1', 5: 'GENE2', 6: 'GENE2', 7: 'GENE2', 8: 'GENE2', 9: 'GENE3', 10: 'GENE3', 11: 'GENE3', 12: 'GENE3'}, 'Status': {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 0, 6: 0, 7: 1, 8: 2, 9: 1, 10: 2, 11: 0, 12: 0}}
3条答案
按热度按时间6ie5vjzr1#
代码
输出功率
生成所需输出
实验结果:
olqngx592#
上面的解决方案会更简洁,但只是想提出一个替代解决方案相同:
输出
hfwmuf9z3#
使用数据透视表,解决方案如下所示: