I have an asm file that was generated by IDA Pro. All of its functions look like this:
; =============== S U B R O U T I N E =======================================
release ; DATA XREF: attribute_manager_create+78↓o
; attribute_manager_create+7C↓o ...
var_30 = -0x30
var_24 = -0x24
arg_0 = 0
arg_4 = 4
PUSH {R4-R9,LR}
MOV R7, R0
LDR R0, [R0,#0x34]
SUB SP, SP, #0x14
MOV R9, R3
LDR R3, [R0]
MOV R5, R1
MOV R8, R2
BLX R3
LDR R0, [R7,#0x30]
ADD R6, SP, #0x30+var_24
LDR R3, [R0,#4]
BLX R3
MOV R4, R0
B loc_7A7C
; ---------------------------------------------------------------------------
loc_7A70 ; CODE XREF: release+5C↓j
LDR R3, [SP,#0x30+var_24]
CMP R3, R5
BEQ loc_7AB4
loc_7A7C ; CODE XREF: release+38↑j
LDR R3, [R4]
MOV R1, R6
MOV R0, R4
BLX R3
CMP R0, #0
BNE loc_7A70
loc_7A94 ; CODE XREF: release+A0↓j
LDR R3, [R4,#8]
MOV R0, R4
BLX R3
LDR R0, [R7,#0x34]
LDR R3, [R0,#0xC]
BLX R3
ADD SP, SP, #0x14
POP {R4-R9,PC}
; ---------------------------------------------------------------------------
loc_7AB4 ; CODE XREF: release+44↑j
LDR R3, [SP,#0x30+arg_4]
STR R3, [SP,#0x30+var_30]
MOV R2, R9
LDR R3, [SP,#0x30+arg_0]
LDR R6, [R5,#4]
MOV R1, R8
MOV R0, R5
BLX R6
B loc_7A94
; End of function release
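I want to parse this file and build a dictionary whose keys are the function names and whose values are strings made up of the function's instructions joined together. Let me explain in more detail.
I have a dictionary in which each ARM instruction maps to a specific letter: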
arm_dict = {"MOV": "a","MVN": "b","ADD": "c","SUB": "d","MUL": "e","LSL": "f","LSR": "g","ASR": "h","ROR": "i","CMP": "j","AND": "k","ORR": "l","EOR": "m","LDR": "n","STR": "o","LDM": "p","STM": "q","PUSH": "r","POP": "s","B": "t","BL": "u","BLX": "v","BEQ": "w","SWI": "x","SVC": "y","NOP": "z"}
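During parsing, each instruction should be turned into its letter. For example, the function above should end up in the dictionary like this: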
{'release': 'randanaavncnvat...'}
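If the code contains an instruction that is not in arm_dict, that instruction should be skipped.
I tried parsing linearly, using the lines that contain "S U B R O U T I N E" and "End of function" as function boundaries, but I can't get rid of the instruction operands. I'd be glad if someone could offer some sample code or advice.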
1 Answer
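Here's a messy version, edited following Peter's suggestion:
There are plenty of potential edge cases where it could fail, but it should do a decent job.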
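For illustration, here is a minimal sketch of the linear parse described in the question. It is not the code from this answer, only one way it might be done. It assumes the listing looks exactly like the sample above: no address or segment prefixes, the function name is the first token on the first non-comment line after the "S U B R O U T I N E" banner, and mnemonics are uppercase. The names `parse_ida_listing` and `sample_listing` are hypothetical. Operands are dropped simply by taking the first whitespace-separated token once the `;` comment has been cut off.

```python
arm_dict = {"MOV": "a", "MVN": "b", "ADD": "c", "SUB": "d", "MUL": "e",
            "LSL": "f", "LSR": "g", "ASR": "h", "ROR": "i", "CMP": "j",
            "AND": "k", "ORR": "l", "EOR": "m", "LDR": "n", "STR": "o",
            "LDM": "p", "STM": "q", "PUSH": "r", "POP": "s", "B": "t",
            "BL": "u", "BLX": "v", "BEQ": "w", "SWI": "x", "SVC": "y",
            "NOP": "z"}


def parse_ida_listing(text, mapping):
    """Return {function_name: letters} for an IDA-style listing (hypothetical helper)."""
    result = {}
    name = None          # function currently being read, if any
    expect_name = False  # True right after a SUBROUTINE banner
    letters = []
    for raw in text.splitlines():
        # The banner and the end marker live inside comments,
        # so check for them before stripping comments.
        if "S U B R O U T I N E" in raw:
            expect_name, name, letters = True, None, []
            continue
        if "End of function" in raw:
            if name is not None:
                result[name] = "".join(letters)
            expect_name, name = False, None
            continue
        code = raw.split(";", 1)[0].strip()  # drop the trailing comment
        if not code:
            continue
        first = code.split()[0]  # mnemonic, label, or variable name
        if expect_name:
            name, expect_name = first, False
            continue
        if name is None:
            continue  # text outside any function
        # Labels (loc_7A70), stack variables (var_30 = ...) and unknown
        # mnemonics (BNE) are not keys of the mapping, so they are skipped.
        letter = mapping.get(first)
        if letter is not None:
            letters.append(letter)
    return result


# A shortened, made-up listing in the same shape as the sample above.
sample_listing = """\
; =============== S U B R O U T I N E ===============
release ; DATA XREF: attribute_manager_create+78
var_24 = -0x24
PUSH {R4-R9,LR}
MOV R7, R0
LDR R0, [R0,#0x34]
B loc_7A7C
loc_7A70 ; CODE XREF: release+5C
CMP R3, R5
BNE loc_7A70
POP {R4-R9,PC}
; End of function release
"""

sample_result = parse_ida_listing(sample_listing, arm_dict)
print(sample_result)
```

Because the lookup is an exact match on the first token, a label or function that happens to be named like a mnemonic (for example `B`) would be miscounted, and suffixed forms such as `MOVS` or `LDRB` are skipped unless they are added to the dictionary.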