[Original] Decompilation Principles (4): Code Generation

2019-6-15 04:49 1940

Retargetable Decompiler

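A retargetable decompiler implements its front end with a DSL (Domain Specific Language) / ADL (Architecture Description Language) that translates the semantics of assembly instructions into middle-end instructions, and then defines the processor-specific and compiler-specific details (ABI, endianness, and so on). The middle end and back end of the decompiler can then decompile that processor's assembly with almost no changes. In other words, if a binary can be disassembled, a retargetable decompiler also makes it decompilable.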

DSL/ADL

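A program that generates code can be called a metaprogram, and writing such programs is called metaprogramming.
The front-end DSL/ADL of a decompiler can be regarded as a metaprogram. It usually has to account for implementing disassemblers, simulators/emulators, and assemblers, especially disassemblers and simulators/emulators.
The following ADLs are currently available: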

  • SLED

    The ADL used by NJMCT (New Jersey Machine Code Toolkit). It is implemented in the functional language SML/NJ and also relies on a rather unusual language, Icon.

  • SSL

    The ADL used by the Boomerang decompiler, implemented with Lex/Yacc.

  • TableGen

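    LLVM's code generation tool, with some C++ template-like features and some functional-language features. The disassembly engine Capstone and the assembly engine Keystone are built on it. Because the tool does not support simulators/emulators, Unicorn is based on QEMU instead.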

  • CGEN

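    The code generation tool for some of the processors in Binutils, implemented in the functional language Scheme. CGEN runs on Guile; it should also be possible to use more popular Scheme implementations such as Racket or Chicken.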

  • SLEIGH

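    The ADL used by Ghidra, derived from and extending SLED. It is implemented with Antlr3 on the Java side and with Lex/Yacc on the C++ side. For assemblers, Ghidra implements its own LR parsing tool. If the SLEIGH grammar implementation were to be improved, Antlr4 could be used to unify the Java and C++ ADL grammars, since Antlr4 separates the grammar from the implementation.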

  • Others

    GDB's IGEN, which generates the semantics of a virtual machine, GCC's code generation tools, and so on.

Processor

  • Architectures

    There are usually three kinds of architecture: CISC, RISC, and VLIW. Handling VLIW in an ADL is rather troublesome.
  • VM Context

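    Translating front-end assembly instructions into middle-end IR is similar to implementing the semantics of a simulator/emulator, so a VM context is required. You need to know the processor's general purpose registers, status registers, and special purpose registers, and you also need to define virtual registers for the VM context state.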
  • ISA

    You need to know the processor's instruction formats, instruction operands, instruction fields, instruction mnemonics, instruction endianness, data endianness, instruction semantics, ABI, and so on.

A Case Study

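This case study implements a SLEIGH processor description based on an ADL description that already exists in CGEN.
Fujitsu FR30/FR80/FR81 is commonly used as the main controller in cameras, and occasionally in automotive ECUs. This post implements FR.sinc/FR.slaspec for SLEIGH based on parts of CGEN's fr30.cpu.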

VM Context(CGEN)

The VM context as defined in CGEN.

;......
;......
;......
(dnh h-pc "program counter" (PC PROFILE) (pc) () () ())

(define-keyword
  (name gr-names)
  (enum-prefix H-GR-)
  (values (r0 0) (r1 1) (r2 2) (r3 3) (r4 4) (r5 5) (r6 6) (r7 7)
      (r8 8) (r9 9) (r10 10) (r11 11) (r12 12) (r13 13) (r14 14) (r15 15)
      (ac 13) (fp 14) (sp 15))
)

(define-hardware
  (name h-gr)
  (comment "general registers")
  (attrs PROFILE CACHE-ADDR)
  (type register WI (16))
  (indices extern-keyword gr-names)
)

(define-keyword
  (name cr-names)
  (enum-prefix H-CR-)
  (values (cr0 0) (cr1 1) (cr2 2) (cr3 3)
      (cr4 4) (cr5 5) (cr6 6) (cr7 7)
      (cr8 8) (cr9 9) (cr10 10) (cr11 11)
      (cr12 12) (cr13 13) (cr14 14) (cr15 15))
)

(define-hardware
  (name h-cr)
  (comment "coprocessor registers")
  (attrs)
  (type register WI (16))
  (indices extern-keyword cr-names)
)

(define-keyword
  (name dr-names)
  (enum-prefix H-DR-)
  (values (tbr 0) (rp 1) (ssp 2) (usp 3) (mdh 4) (mdl 5))
)

(define-hardware
  (name h-dr)
  (comment "dedicated registers")
  (type register WI (6))
  (indices extern-keyword dr-names)
  (get (index) (c-call WI "@cpu@_h_dr_get_handler" index))
  (set (index newval) (c-call VOID "@cpu@_h_dr_set_handler" index newval))
)

(define-hardware
  (name h-ps)
  (comment "processor status")
  (type register UWI)
  (indices keyword "" ((ps 0)))
  (get () (c-call UWI "@cpu@_h_ps_get_handler"))
  (set (newval) (c-call VOID "@cpu@_h_ps_set_handler" newval))
)

(dnh h-r13 "General Register 13 explicitly required"
    ()
    (register WI)
    (keyword "" ((r13 0)))
    () ()
)

(dnh h-r14 "General Register 14 explicitly required"
    ()
    (register WI)
    (keyword "" ((r14 0)))
    () ()
)

(dnh h-r15 "General Register 15 explicitly required"
    ()
    (register WI)
    (keyword "" ((r15 0)))
    () ()
)

(dsh h-nbit  "negative         bit" ()           (register BI))
(dsh h-zbit  "zero             bit" ()           (register BI))
(dsh h-vbit  "overflow         bit" ()           (register BI))
(dsh h-cbit  "carry            bit" ()           (register BI))
(dsh h-ibit  "interrupt enable bit" ()           (register BI))
(define-hardware
  (name h-sbit)
  (comment "stack bit")
  (type register BI)
  (get () (c-call BI "@cpu@_h_sbit_get_handler"))
  (set (newval) (c-call VOID "@cpu@_h_sbit_set_handler" newval))
)
(dsh h-tbit  "trace trap       bit" ()           (register BI))
(dsh h-d0bit "division 0       bit" ()           (register BI))
(dsh h-d1bit "division 1       bit" ()           (register BI))

(define-hardware
  (name h-ccr)
  (comment "condition code bits")
  (type register UQI)
  (get () (c-call UQI "@cpu@_h_ccr_get_handler"))
  (set (newval) (c-call VOID "@cpu@_h_ccr_set_handler" newval))
)
(define-hardware
  (name h-scr)
  (comment "system condition bits")
  (type register UQI)
  (get () (c-call UQI "@cpu@_h_scr_get_handler"))
  (set (newval) (c-call VOID "@cpu@_h_scr_set_handler" newval))
)
(define-hardware
  (name h-ilm)
  (comment "interrupt level mask")
  (type register UQI)
  (get () (c-call UQI "@cpu@_h_ilm_get_handler"))
  (set (newval) (c-call VOID "@cpu@_h_ilm_set_handler" newval))
)
;......
;......
;......

VM Context(SLEIGH)

The VM context as defined in SLEIGH. Each flag of the status register is stored in its own byte rather than as a bit-field. The VM state variables appear to be optional.

#......
#......
#......
define endian=big;

define alignment=2;

define space ram type=ram_space size=4 default;
define space register type=register_space size=4;

# general registers
define register offset=0 size=4 [ 
    R0  R1  R2  R3 R4  R5  R6  R7  R8  R9  R10  R11  R12
    R13
    R14
    R15   
];

# coprocessor register
define register offset=0x50 size=4 [ 
    CR0  CR1  CR2  CR3
    CR4  CR5  CR6  CR7
    CR8  CR9  CR10  CR11
    CR12  CR13  CR14  CR15
];

# dedicated registers
define register offset=0x100 size=4 [ 
    TBR  RP  SSP  USP  MDH  MDL
];

define register offset=0x150 size=4 [PC];

# processor status register
define register offset=0x200 size=1 [
    _    _    _    _    _    _    _    _ 
    _    _    _    I4  I3   I2  I1   I0  #ILM
    _    _    _    _    _    D1 D0 T  #SCR
    _    _    S    I    N   Z   V    C #CCR
];

define register offset=0x250 size=1 [
    _

    #ILM overlaps I4, I3, I2, I1, I0
    ILM

    #SCR overlaps D1, D0, T
    SCR

    #CCR overlaps S, I, N, Z, V, C
    CCR
];

define register offset=0x250 size=4 [PS];

#define register offset=0x200 size=4 [PS];
#@define C    "PS[0,1]"
#@define V    "PS[1,1]"
#@define Z    "PS[2,1]"
#@define N    "PS[3,1]"
#@define  I     "PS[4,1]"
#@define S     "PS[5,1]"
#@define T     "PS[8,1]"
#@define D0   "PS[9,1]"
#@define D1   "PS[10,1]"
#@define I0     "PS[16,1]"
#@define I1     "PS[17,1]"
#@define I2     "PS[18,1]"
#@define I3     "PS[19,1]"
#@define I4     "PS[20,1]"

define register offset=0x300 size=4   contextreg;
define context contextreg

    #R15/SP refer to USP or SSP depending on S flag
    ctx_usp_enabled = (0,0)

#......
#......
#......
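
Because each flag lives in its own byte, reading or writing PS as a whole requires packing and unpacking. The following is a minimal Python sketch of that idea, not part of the SLEIGH specification. The bit positions are taken from the commented-out `PS[...]` definitions in the listing above; the names `PS_BITS`, `pack_ps`, and `unpack_ps` are illustrative helpers chosen for this sketch.

```python
# Bit position of each flag inside the 32-bit PS value, following the
# commented-out PS[...] definitions above (ILM: 20..16, SCR: 10..8, CCR: 5..0).
PS_BITS = {
    "I4": 20, "I3": 19, "I2": 18, "I1": 17, "I0": 16,
    "D1": 10, "D0": 9, "T": 8,
    "S": 5, "I": 4, "N": 3, "Z": 2, "V": 1, "C": 0,
}


def pack_ps(flags):
    """Combine one-value-per-flag storage (each 0 or 1) into a 32-bit PS value."""
    value = 0
    for name, bit in PS_BITS.items():
        value |= (flags.get(name, 0) & 1) << bit
    return value


def unpack_ps(value):
    """Split a 32-bit PS value into individual 0/1 flags."""
    return {name: (value >> bit) & 1 for name, bit in PS_BITS.items()}


if __name__ == "__main__":
    flags = unpack_ps(0x00100425)
    print(flags)
    print(hex(pack_ps(flags)))
```

Bits of PS that do not correspond to a flag are dropped by this round trip, which mirrors the placeholder `_` entries in the register definition.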

ISA Instruction Fields(CGEN)

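Instruction field definitions in CGEN. Judging from the definitions below, CGEN appears to number instruction-field bits starting from the most significant bit (start 0 is the first, leftmost bit of the instruction word). I am not sure whether this is related to instruction endianness.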

;......
;......
;......

(dnf f-op1       "1st 4 bits of opcode"  ()  0  4)
(dnf f-op2       "2nd 4 bits of opcode"  ()  4  4)
(dnf f-op3       "3rd 4 bits of opcode"  ()  8  4)
(dnf f-op4       "4th 4 bits of opcode"  () 12  4)
(dnf f-op5       "5th bit of opcode"     ()  4  1)
(dnf f-cc        "condition code"        ()  4  4)
(dnf f-ccc       "coprocessor calc code" () 16  8)
(dnf f-Rj        "register Rj"           ()  8  4)
(dnf f-Ri        "register Ri"           () 12  4)
(dnf f-Rs1       "register Rs"           ()  8  4)
(dnf f-Rs2       "register Rs"           () 12  4)
(dnf f-Rjc       "register Rj"           () 24  4)
(dnf f-Ric       "register Ri"           () 28  4)
(dnf f-CRj       "coprocessor register"  () 24  4)
(dnf f-CRi       "coprocessor register"  () 28  4)
(dnf f-u4        "4 bit 0 extended"      ()  8  4)
(dnf f-u4c       "4 bit 0 extended"      () 12  4)
(df  f-i4        "4 bit sign extended"   ()  8  4 INT #f #f)
(df  f-m4        "4 bit minus extended"  ()  8  4 UINT
     ((value pc) (and WI value (const #xf)))
     ((value pc) (or  WI value (sll WI (const -1) (const 4))))
)
(dnf f-u8        "8 bit unsigned"        ()  8  8)
(dnf f-i8        "8 bit unsigned"        ()  4  8)

(dnf  f-i20-4     "upper 4 bits of i20"  ()  8  4)
(dnf  f-i20-16    "lower 16 bits of i20" () 16 16)
(dnmf f-i20       "20 bit unsigned"      () UINT
      (f-i20-4 f-i20-16)
      (sequence () ; insert
        (set (ifield f-i20-4)  (srl (ifield f-i20) (const 16)))
        (set (ifield f-i20-16) (and (ifield f-i20) (const #xffff)))
        )
      (sequence () ; extract
        (set (ifield f-i20) (or (sll (ifield f-i20-4) (const 16))
                    (ifield f-i20-16)))
        )
)

(dnf f-i32       "32 bit immediate"      (SIGN-OPT) 16 32)

(df  f-udisp6    "6 bit unsigned offset" ()  8  4 UINT
     ((value pc) (srl UWI value (const 2)))
     ((value pc) (sll UWI value (const 2)))
)
(df  f-disp8     "8 bit signed offset"   ()  4  8 INT #f #f)
(df  f-disp9     "9 bit signed offset"   ()  4  8 INT
    ((value pc) (sra WI value (const 1)))
    ((value pc) (sll WI value (const 1)))
)
(df  f-disp10    "10 bit signed offset"  ()  4  8 INT
     ((value pc) (sra WI value (const 2)))
     ((value pc) (sll WI value (const 2)))
)
(df  f-s10       "10 bit signed offset"  ()  8  8 INT
     ((value pc) (sra WI value (const 2)))
     ((value pc) (sll WI value (const 2)))
)
(df  f-u10       "10 bit unsigned offset" ()  8  8 UINT
     ((value pc) (srl UWI value (const 2)))
     ((value pc) (sll UWI value (const 2)))
)
(df  f-rel9 "9 pc relative signed offset" (PCREL-ADDR) 8 8 INT
     ((value pc) (sra WI (sub WI value (add WI pc (const 2))) (const 1)))
     ((value pc) (add WI (sll WI value (const 1)) (add WI pc (const 2))))
)
(dnf f-dir8      "8  bit direct address"  ()  8  8)
(df  f-dir9      "9  bit direct address"  ()  8  8 UINT
     ((value pc) (srl UWI value (const 1)))
     ((value pc) (sll UWI value (const 1)))
)
(df  f-dir10     "10 bit direct address"  ()  8  8 UINT
     ((value pc) (srl UWI value (const 2)))
     ((value pc) (sll UWI value (const 2)))
)
(df  f-rel12     "12 bit pc relative signed offset" (PCREL-ADDR) 5 11 INT
     ((value pc) (sra WI (sub WI value (add WI pc (const 2))) (const 1)))
     ((value pc) (add WI (sll WI value (const 1)) (add WI pc (const 2))))
)

(dnf f-reglist_hi_st  "8 bit register mask for stm" () 8 8)
(dnf f-reglist_low_st "8 bit register mask for stm" () 8 8)
(dnf f-reglist_hi_ld  "8 bit register mask for ldm" () 8 8)
(dnf f-reglist_low_ld "8 bit register mask for ldm" () 8 8)

;......
;......
;......

ISA Instruction Fields(SLEIGH)

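Instruction field definitions in SLEIGH. SLEIGH appears to number token bits in the opposite direction, with bit 0 as the least significant bit, so every CGEN range has to be mirrored within the 16-bit token. I am not sure whether this is related to instruction endianness.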

#......
#......
#......

define token instr1 (16)

    #f-op1  (0  4)   
    #CGEN start=0,end=0+4-1=3
    #SLEIGH start=15-3=12,end=12+4-1=15
    op1_12_4 = (12,15)

    #f-op2  (4  4)    
    #CGEN start=4,end=4+4-1=7
    #SLEIGH start=15-7=8,end=8+4-1=11
    op2_08_4 = (8,11)

    #f-op3  (8  4)     
    #CGEN start=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    op3_04_4 = (4,7)

    #f-op4  (12  4)     
    #CGEN start=12,end=12+4-1=15
    #SLEIGH start=15-15=0,end=0+4-1=3
    op4_00_4 = (0,3)

    #f-op5  (4  1) 
    #CGEN start=4,end=4+1-1=4
    #SLEIGH start=15-4=11,end=11+1-1=11
    op5_11_1 = (11,11)

    #f-cc  (4  4)  
    #CGEN start=4,end=4+4-1=7
    #SLEIGH start=15-7=8,end=8+4-1=11
    cc_08_4 = (8,11)

    #f-Rj  (8  4)  
    #CGEN start=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    Rj_04_4 = (4,7)

    #f-Ri  (12  4)      
    #CGEN start=12,end=12+4-1=15
    #SLEIGH start=15-15=0,end=0+4-1=3
    Ri_00_4 = (0,3)

    #f-Rs1  (8  4)  
    #CGEN start=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    Rs1_04_4 = (4,7)

    #f-Rs2  (12  4)      
    #CGEN start=12,end=12+4-1=15
    #SLEIGH start=15-15=0,end=0+4-1=3
    Rs2_00_4 = (0,3)

    #f-u4  (8  4)    
    #CGEN start=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    u4_04_4= (4,7)

    #f-u4c  (12  4)    
    #CGEN start=12,end=12+4-1=15
    #SLEIGH start=15-15=0,end=0+4-1=3
    u4c_00_4 = (0,3)

    #f-i4  (8  4)   
    #CGEN start=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    i4_04_4= (4,7) signed

    #f-m4  (8  4) 
    #CGEN start=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    m4_04_4 = (4,7)

    #f-u8  (8  8)
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    u8_00_8 = (0,7)

    #f-i8  (4  8) 
    #CGEN start=4,end=4+8-1=11
    #SLEIGH start=15-11=4,end=4+8-1=11
    i8_04_8 = (4,11)

    #f-i20  (8  4) 
    #CGEN start=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    i20_16_4 = (4,7)

    #f-udisp6  (8  4)   
    #CGEN start=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    udisp6_04_4 = (4,7)

    #f-disp8  (4  8)   
    #CGEN start=4,end=4+8-1=11
    #SLEIGH start=15-11=4,end=4+8-1=11
    disp8_04_8 = (4,11) signed

    #f-disp9  (4  8)  
    #CGEN start=4,end=4+8-1=11
    #SLEIGH start=15-11=4,end=4+8-1=11
    disp9_04_8 = (4,11) signed

    #f-disp10  (4  8)    
    #CGEN start=4,end=4+8-1=11
    #SLEIGH start=15-11=4,end=4+8-1=11
    disp10_04_8 = (4,11) signed

    #f-s10  (8  8)
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    s10_00_8 = (0,7) signed

    #f-u10  (8  8)    
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    u10_00_8 = (0,7)

    #f-rel9  (8  8)   
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    rel9_00_8 = (0,7) signed

    #f-dir8  (8  8)   
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    dir8_00_8 = (0,7)

    #f-dir9  (8  8) 
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    dir9_00_8 = (0,7)

    #f-dir10  (8  8)   
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    dir10_00_8 = (0,7)

    #f-rel12  (5  11)  
    #CGEN start=5,end=5+11-1=15
    #SLEIGH start=15-15=0,end=0+11-1=10
    rel12_00_11 = (0,10) signed

    #f-reglist_hi_st  (8  8)
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    reglist_hi_st_00_8 = (0,7)

    #f-reglist_low_st  (8  8)
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    reglist_low_st_00_8 = (0,7)

    #f-reglist_hi_ld  (8  8)
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    reglist_hi_ld_00_8 = (0,7)

    #f-reglist_low_ld  (8  8)
    #CGEN start=8,end=8+8-1=15
    #SLEIGH start=15-15=0,end=0+8-1=7
    reglist_low_ld_00_8 = (0,7)
;

define token instr2 (16)

    #f-ccc  (16  8)
    #CGEN start=16-16=0,end=0+8-1=7
    #SLEIGH start=15-7=8,end=8+8-1=15 
    ccc_08_8 = (8,15)

    #f-i20-16  (16  16)
    #CGEN start=16-16=0,end=0+16-1=15
    #SLEIGH start=15-15=0,end=0+16-1=15 
    i20_00_16 = (0,15)

    # f-Rjc  (24  4)
    #CGEN start=24-16=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    Rjc_04_4 = (4,7)

    #f-Ric  (28  4)
    #CGEN start=28-16=12,end=12+4-1=15
    #SLEIGH start=15-15=0,end=0+4-1=3
    Ric_00_4 = (0,3)

    #f-CRj  (24  4)
    #CGEN start=24-16=8,end=8+4-1=11
    #SLEIGH start=15-11=4,end=4+4-1=7
    CRj_04_4 = (4,7)

    #f-CRi  (28  4)
    #CGEN start=28-16=12,end=12+4-1=15
    #SLEIGH start=15-15=0,end=0+4-1=3
    CRi_00_4 = (0,3)

    i32_16_16= (0,15)
;

define token instr3 (16)

    i32_00_16= (0,15)
;

#......
#......
#......
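
The conversion repeated in the comments above (SLEIGH start = 15 - CGEN end) can be automated. The sketch below is a small Python helper written for this post, under the assumption that every field fits inside one 16-bit token and that CGEN start positions of 16 or more belong to a later token. The function names `cgen_to_sleigh` and `sleigh_field` are hypothetical; they are not part of CGEN or Ghidra.

```python
def cgen_to_sleigh(start, length, token_bits=16):
    """Map a CGEN (start, length) field, numbered from the MSB, to a
    SLEIGH (low, high) bit range, numbered from the LSB, inside one token."""
    # Fields in later instruction words (e.g. f-Rjc at 24) are taken
    # relative to the start of their own token.
    start %= token_bits
    cgen_end = start + length - 1
    if cgen_end >= token_bits:
        raise ValueError("field does not fit in a single token")
    low = (token_bits - 1) - cgen_end
    return low, low + length - 1


def sleigh_field(name, start, length):
    """Format a token field line using the name_low_length naming of this post."""
    low, high = cgen_to_sleigh(start, length)
    return f"{name}_{low:02d}_{length} = ({low},{high})"


if __name__ == "__main__":
    for name, start, length in [("op1", 0, 4), ("Ri", 12, 4), ("rel12", 5, 11)]:
        print(sleigh_field(name, start, length))
```

A field such as f-i32 (start 16, length 32) is rejected by this helper, which is why the listing splits it across two tokens by hand.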

ISA Instruction Operands(CGEN)

Instruction operand formats as defined in CGEN.

;......
;......
;......
(define-attr
  (for operand)
  (type boolean)
  (name HASH-PREFIX)
  (comment "immediates have an optional '#' prefix")
)

(dnop Ri      "destination register"         ()            h-gr   f-Ri)
(dnop Rj      "source register"              ()            h-gr   f-Rj)
(dnop Ric     "target register coproc insn"  ()            h-gr   f-Ric)
(dnop Rjc     "source register coproc insn"  ()            h-gr   f-Rjc)
(dnop CRi     "coprocessor register"         ()            h-cr   f-CRi)
(dnop CRj     "coprocessor register"         ()            h-cr   f-CRj)
(dnop Rs1     "dedicated register"           ()            h-dr   f-Rs1)
(dnop Rs2     "dedicated register"           ()            h-dr   f-Rs2)
(dnop R13     "General Register 13"          ()            h-r13  f-nil)
(dnop R14     "General Register 14"          ()            h-r14  f-nil)
(dnop R15     "General Register 15"          ()            h-r15  f-nil)
(dnop ps      "Program Status register"      ()            h-ps   f-nil)
(dnop u4      "4  bit unsigned immediate"    (HASH-PREFIX) h-uint f-u4)
(dnop u4c     "4  bit unsigned immediate"    (HASH-PREFIX) h-uint f-u4c)
(dnop u8      "8  bit unsigned immediate"    (HASH-PREFIX) h-uint f-u8)
(dnop i8      "8  bit unsigned immediate"    (HASH-PREFIX) h-uint f-i8)
(dnop udisp6  "6  bit unsigned immediate"    (HASH-PREFIX) h-uint f-udisp6)
(dnop disp8   "8  bit signed   immediate"    (HASH-PREFIX) h-sint f-disp8)
(dnop disp9   "9  bit signed   immediate"    (HASH-PREFIX) h-sint f-disp9)
(dnop disp10  "10 bit signed   immediate"    (HASH-PREFIX) h-sint f-disp10)

(dnop s10     "10 bit signed   immediate"    (HASH-PREFIX) h-sint f-s10)
(dnop u10     "10 bit unsigned immediate"    (HASH-PREFIX) h-uint f-u10)
(dnop i32     "32 bit immediate"             (HASH-PREFIX) h-uint f-i32)

(define-operand
  (name m4)
  (comment "4  bit negative immediate")
  (attrs HASH-PREFIX)
  (type h-sint)
  (index f-m4)
  (handlers (print "m4"))
)

(define-operand
  (name i20)
  (comment "20 bit immediate")
  (attrs HASH-PREFIX)
  (type h-uint)
  (index f-i20)
)

(dnop dir8    "8  bit direct address"        ()  h-uint f-dir8)
(dnop dir9    "9  bit direct address"        ()  h-uint f-dir9)
(dnop dir10   "10 bit direct address"        ()  h-uint f-dir10)

(dnop label9  "9  bit pc relative address"   ()  h-iaddr f-rel9)
(dnop label12 "12 bit pc relative address"   ()  h-iaddr f-rel12)

(define-operand 
  (name    reglist_low_ld)
  (comment "8 bit low register mask for ldm")
  (attrs)
  (type    h-uint)
  (index   f-reglist_low_ld)
  (handlers (parse "low_register_list_ld")
        (print "low_register_list_ld"))
)

(define-operand 
  (name    reglist_hi_ld)
  (comment "8 bit high register mask for ldm")
  (attrs)
  (type    h-uint)
  (index   f-reglist_hi_ld)
  (handlers (parse "hi_register_list_ld")
        (print "hi_register_list_ld"))
)

(define-operand 
  (name    reglist_low_st)
  (comment "8 bit low register mask for stm")
  (attrs)
  (type    h-uint)
  (index   f-reglist_low_st)
  (handlers (parse "low_register_list_st")
        (print "low_register_list_st"))
)

(define-operand 
  (name    reglist_hi_st)
  (comment "8 bit high register mask for stm")
  (attrs)
  (type    h-uint)
  (index   f-reglist_hi_st)
  (handlers (parse "hi_register_list_st")
        (print "hi_register_list_st"))
)
;......
;......
;......

ISA Instruction Operands(SLEIGH)

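Instruction operand formats as defined in SLEIGH. So far only the following three operand formats are partially implemented (figure from the FR81 user manual):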


#......
#......
#......

attach variables [Ri_00_4 Rj_04_4 Ric_00_4 Rjc_04_4 ]
  [
    R0    R1    R2    R3    R4    R5    R6    R7  
    R8    R9    R10  R11 R12  R13  R14  R15   
  ];

attach variables [CRi_00_4 CRj_04_4]
  [ 
    CR0    CR1    CR2    CR3    CR4    CR5    CR6    CR7
    CR8    CR9    CR10  CR11  CR12 CR13  CR14  CR15
  ];

attach variables [Rs2_00_4 Rs1_04_4]
  [
    TBR  RP  SSP  USP  MDH  MDL
  ];

macro pack_ilm(x) {
    x = zext((I4 << 4) | (I3 << 3) | (I2 << 2) | (I1 << 1) | (I0 << 0));
}

macro unpack_ilm(x) {
    I4 = (x & 0x10)!=0;
    I3 = (x & 0x8)!=0;
    I2 = (x & 0x4)!=0;
    I1 = (x & 0x2)!=0;
    I0 = (x & 0x1)!=0;
}

macro pack_scr(x) {
    x = zext((D1 << 2) | (D0 << 1) | (T << 0));
}

macro unpack_scr(x) {
    D1 = (x & 0x4)!=0;
    D0 = (x & 0x2)!=0;
    T = (x & 0x1)!=0;
}

macro pack_ccr(x) {
    x = zext((S << 5) | (I << 4) | (N << 3) | (Z << 2) | (V << 1) | (C << 0));
}

macro unpack_ccr(x) {
    S = (x & 0x20)!=0;
    I = (x & 0x10)!=0;
    N = (x & 0x8)!=0;
    Z = (x & 0x4)!=0;
    V = (x & 0x2)!=0;
    C = (x & 0x1)!=0;
}

macro pack_ps(x) {
    x = zext(
      (I4 << 20) | (I3 << 19) | (I2 << 18) | (I1 << 17) | (I0 << 16) |
      (D1 << 10) | (D0 << 9) | (T << 8) |
      (S << 5) | (I << 4) | (N << 3) | (Z << 2) | (V << 1) | (C << 0)
    );
}

macro unpack_ps(x) {
    #unpack  ILM
    I4 = (x & 0x100000)!=0;
    I3 = (x & 0x80000)!=0;
    I2 = (x & 0x40000)!=0;
    I1 = (x & 0x20000)!=0;
    I0 = (x & 0x10000)!=0;

    #unpack SCR
    D1 = (x & 0x400)!=0;
    D0 = (x & 0x200)!=0;
    T = (x & 0x100)!=0;

    #unpack CCR
    S = (x & 0x20)!=0;
    I = (x & 0x10)!=0;
    N = (x & 0x8)!=0;
    Z = (x & 0x4)!=0;
    V = (x & 0x2)!=0;
    C = (x & 0x1)!=0;  
}

# source register
Rj: Rj_04_4 is Rj_04_4  {
  export  Rj_04_4;
}

# destination register
Ri: Ri_00_4 is Ri_00_4  {
  export  Ri_00_4;
}

u4: u4_04_4 is u4_04_4  {
  export  *[const]:1 u4_04_4;
}

u4c: u4c_00_4 is u4c_00_4  {
  export  *[const]:1 u4c_00_4;
}

i4: i4_04_4 is i4_04_4  {
  export  *[const]:1 i4_04_4;
}

m4: immCalc is m4_04_4 
  [ 
      immCalc = (0xF0 | m4_04_4);
  ]  { 
      export  *[const]:1 immCalc; 
}

label9: rel9 is rel9_00_8 
  [ 
      rel9 = (rel9_00_8 << 1) + inst_start + 2;
  ] {
      export  *[ram]:4 rel9;
}

#......
#......
#......
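
The two computed operands above, m4 and label9, are easy to get wrong, so here is a minimal Python sketch of the values they are meant to produce. It follows the CGEN extract expressions for f-m4 and f-rel9 shown earlier; the function names `decode_m4` and `decode_label9` are illustrative only.

```python
def decode_m4(field):
    """4-bit 'minus extended' immediate: all upper bits are ones,
    so the decoded value always lies in the range -16..-1."""
    return (field & 0xF) - 16


def decode_label9(field, inst_start):
    """9-bit pc-relative target: sign-extend the 8-bit field, scale it by 2,
    then add the address of the instruction plus 2."""
    disp = field & 0xFF
    if disp & 0x80:
        disp -= 0x100
    return (disp * 2 + inst_start + 2) & 0xFFFFFFFF


if __name__ == "__main__":
    print(decode_m4(0xF))                      # smallest magnitude: -1
    print(hex(decode_label9(0xFE, 0x1000)))    # short backward branch
    print(hex(decode_label9(0x01, 0x1000)))    # short forward branch
```

In the SLEIGH version the same effect for m4 comes from OR-ing the field with 0xF0 in a one-byte constant and sign-extending it where it is used.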

ISA Binary Instruction Definitions(CGEN)

Common binary arithmetic and logic instructions as defined in CGEN.

;......
;......
;......


(define-attr
  (for insn)
  (type boolean)
  (name NOT-IN-DELAY-SLOT)
  (comment "insn can't go in delay slot")
)

(define-pmacro (set-z-and-n x)
  (sequence ()
        (set zbit (eq x (const 0)))
        (set nbit (lt x (const 0))))
)


(define-pmacro (binary-int-op name insn comment opc1 opc2 op arg1 arg2)
  (dni name
       (.str insn " " comment)
       ()
       (.str insn " $" arg1 ",$" arg2)
       (+ opc1 opc2 arg1 arg2)
       (sequence ()
         (set vbit ((.sym op -oflag) arg2 arg1 (const 0)))
         (set cbit ((.sym op -cflag) arg2 arg1 (const 0)))
         (set arg2 (op arg2 arg1))
         (set-z-and-n arg2))
       ()
  )
)


(define-pmacro (binary-int-op-n name insn comment opc1 opc2 op arg1 arg2)
  (dni name
       (.str insn " " comment)
       ()
       (.str insn " $" arg1 ",$" arg2)
       (+ opc1 opc2 arg1 arg2)
       (set arg2 (op arg2 arg1))
       ()
  )
)


(define-pmacro (binary-int-op-c name insn comment opc1 opc2 op arg1 arg2)
  (dni name
       (.str insn " " comment)
       ()
       (.str insn " $" arg1 ",$" arg2)
       (+ opc1 opc2 arg1 arg2)
       (sequence ((WI tmp))
         (set tmp  ((.sym op c)      arg2 arg1 cbit))
         (set vbit ((.sym op -oflag) arg2 arg1 cbit))
         (set cbit ((.sym op -cflag) arg2 arg1 cbit))
         (set arg2 tmp)
         (set-z-and-n arg2))
       ()
  )
)

(binary-int-op   add   add   "reg/reg"   OP1_A OP2_6 add Rj Ri)
(binary-int-op   addi  add   "immed/reg" OP1_A OP2_4 add u4 Ri)
(binary-int-op   add2  add2  "immed/reg" OP1_A OP2_5 add m4 Ri)
(binary-int-op-c addc  addc  "reg/reg"   OP1_A OP2_7 add Rj Ri)
(binary-int-op-n addn  addn  "reg/reg"   OP1_A OP2_2 add Rj Ri)
(binary-int-op-n addni addn  "immed/reg" OP1_A OP2_0 add u4 Ri)
(binary-int-op-n addn2 addn2 "immed/reg" OP1_A OP2_1 add m4 Ri)

(binary-int-op   sub   sub   "reg/reg"   OP1_A OP2_C sub Rj Ri)
(binary-int-op-c subc  subc  "reg/reg"   OP1_A OP2_D sub Rj Ri)
(binary-int-op-n subn  subn  "reg/reg"   OP1_A OP2_E sub Rj Ri)

; Integer compare instruction
;
(define-pmacro (int-cmp name insn comment opc1 opc2 arg1 arg2)
  (dni name
       (.str insn " " comment)
       ()
       (.str insn " $" arg1 ",$" arg2)
       (+ opc1 opc2 arg1 arg2)
       (sequence ((WI tmp1))
         (set vbit (sub-oflag arg2 arg1 (const 0)))
         (set cbit (sub-cflag arg2 arg1 (const 0)))
         (set tmp1 (sub       arg2 arg1))
         (set-z-and-n tmp1)
       )
       ()
  )
)

(int-cmp cmp  cmp  "reg/reg"   OP1_A OP2_A Rj Ri)
(int-cmp cmpi cmp  "immed/reg" OP1_A OP2_8 u4 Ri)
(int-cmp cmp2 cmp2 "immed/reg" OP1_A OP2_9 m4 Ri)


(define-pmacro (binary-logical-op name insn comment opc1 opc2 op arg1 arg2)
  (dni name
       (.str insn " " comment)
       ()
       (.str insn " $" arg1 ",$" arg2)
       (+ opc1 opc2 arg1 arg2)
       (sequence ()
         (set arg2 (op arg2 arg1))
         (set-z-and-n arg2))
       ()
  )
)

(binary-logical-op and and "reg/reg" OP1_8 OP2_2 and Rj Ri)
(binary-logical-op or  or  "reg/reg" OP1_9 OP2_2 or  Rj Ri)
(binary-logical-op eor eor "reg/reg" OP1_9 OP2_A xor Rj Ri)

(define-pmacro (les-units model) ; les: load-exec-store
  (model (unit u-exec) (unit u-load) (unit u-store))
)


(define-pmacro (binary-logical-op-m name insn comment opc1 opc2 mode op arg1 arg2)
  (dni name
       (.str insn " " comment)
       (NOT-IN-DELAY-SLOT)
       (.str insn " $" arg1 ",@$" arg2)
       (+ opc1 opc2 arg1 arg2)
       (sequence ((mode tmp))
         (set mode tmp (op mode (mem mode arg2) arg1))
         (set-z-and-n tmp)
         (set mode (mem mode arg2) tmp))
       ((les-units fr30-1))
  )
)

(binary-logical-op-m andm and  "reg/mem" OP1_8 OP2_4 WI and Rj Ri)
(binary-logical-op-m andh andh "reg/mem" OP1_8 OP2_5 HI and Rj Ri)
(binary-logical-op-m andb andb "reg/mem" OP1_8 OP2_6 QI and Rj Ri)
(binary-logical-op-m orm  or   "reg/mem" OP1_9 OP2_4 WI or  Rj Ri)
(binary-logical-op-m orh  orh  "reg/mem" OP1_9 OP2_5 HI or  Rj Ri)
(binary-logical-op-m orb  orb  "reg/mem" OP1_9 OP2_6 QI or  Rj Ri)
(binary-logical-op-m eorm eor  "reg/mem" OP1_9 OP2_C WI xor Rj Ri)
(binary-logical-op-m eorh eorh "reg/mem" OP1_9 OP2_D HI xor Rj Ri)
(binary-logical-op-m eorb eorb "reg/mem" OP1_9 OP2_E QI xor Rj Ri)


(dni bandl
     "bandl #u4,@Ri"
     (NOT-IN-DELAY-SLOT)
     "bandl $u4,@$Ri"
     (+ OP1_8 OP2_0 u4 Ri)
     (set QI (mem QI Ri)
       (and QI
         (or  QI u4 (const #xf0))
         (mem QI Ri)))
     ((les-units fr30-1))
)

(dni borl
     "borl #u4,@Ri"
     (NOT-IN-DELAY-SLOT)
     "borl $u4,@$Ri"
     (+ OP1_9 OP2_0 u4 Ri)
     (set QI (mem QI Ri) (or QI u4 (mem QI Ri)))
     ((les-units fr30-1))
)

(dni beorl
     "beorl #u4,@Ri"
     (NOT-IN-DELAY-SLOT)
     "beorl $u4,@$Ri"
     (+ OP1_9 OP2_8 u4 Ri)
     (set QI (mem QI Ri) (xor QI u4 (mem QI Ri)))
     ((les-units fr30-1))
)

(dni bandh
     "bandh #u4,@Ri"
     (NOT-IN-DELAY-SLOT)
     "bandh $u4,@$Ri"
     (+ OP1_8 OP2_1 u4 Ri)
     (set QI (mem QI Ri)
       (and QI
         (or QI (sll QI u4 (const 4)) (const #x0f))
         (mem QI Ri)))
     ((les-units fr30-1))
)

(define-pmacro (binary-or-op-mh name insn opc1 opc2 op arg1 arg2)
  (dni name
       (.str name " #" arg1 ",@" args)
       (NOT-IN-DELAY-SLOT)
       (.str name " $" arg1 ",@$" arg2)
       (+ opc1 opc2 arg1 arg2)
       (set QI (mem QI arg2)
         (insn QI
           (sll QI arg1 (const 4))
           (mem QI arg2)))
       ((les-units fr30-1))
  )
)

(binary-or-op-mh borh  or  OP1_9 OP2_1 or  u4 Ri)
(binary-or-op-mh beorh xor OP1_9 OP2_9 xor u4 Ri)

(dni btstl
     "btstl #u4,@Ri"
     (NOT-IN-DELAY-SLOT)
     "btstl $u4,@$Ri"
     (+ OP1_8 OP2_8 u4 Ri)
     (sequence ((QI tmp))
           (set tmp (and QI u4 (mem QI Ri)))
           (set zbit (eq tmp (const 0)))
           (set nbit (const 0)))
     ((fr30-1 (unit u-load) (unit u-exec (cycles 2))))
)

(dni btsth
     "btsth #u4,@Ri"
     (NOT-IN-DELAY-SLOT)
     "btsth $u4,@$Ri"
     (+ OP1_8 OP2_9 u4 Ri)
     (sequence ((QI tmp))
           (set tmp (and QI (sll QI u4 (const 4)) (mem QI Ri)))
           (set zbit (eq tmp (const 0)))
           (set nbit (lt tmp (const 0))))
     ((fr30-1 (unit u-load) (unit u-exec (cycles 2))))
)
;......
;......
;......

ISA Binary Instruction Definitions(SLEIGH)

Common binary arithmetic and logic instructions as defined in SLEIGH. The semantic translation may contain errors.

#......
#......
#......
macro set_z_and_n(x)  {
    Z = (x == 0);
    N = (x s< 0);
}

macro addflags(op1, op2) {          
    C = (carry(op1, op2));     
    V = (scarry(op1, op2)); 
}

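# Flags for op1 + op2 + op3, where op3 is the incoming carry (0 or 1).
# A carry can come from either addition step; signed overflow is reported
# when exactly one of the two steps overflows.
macro addflags2(op1, op2, op3) {
    local sum = op1 + op2;
    C = carry(op1, op2) || carry(sum, op3);
    V = scarry(op1, op2) ^^ scarry(sum, op3);
}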

macro subflags(op1, op2) {          
    C = (op1 < op2); 
    V = (sborrow(op1, op2)); 
}

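# Flags for op1 - op2 - op3, where op3 is the incoming borrow (0 or 1).
# Checking each step separately avoids the wrap-around of (op2 + op3).
macro subflags2(op1, op2, op3) {
    local diff = op1 - op2;
    C = (op1 < op2) || (diff < op3);
    V = sborrow(op1, op2) ^^ sborrow(diff, op3);
}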

:ADD  Rj,  Ri    
is  op1_12_4=0xA & op2_08_4=0x6 & Rj & Ri {
    addflags(Ri, Rj);
    Ri = Ri + Rj;
    set_z_and_n(Ri);
}

:ADD "#"^u4, Ri    
is  op1_12_4=0xA & op2_08_4=0x4 & u4 & Ri  {
    local tmp1:4 = zext(u4);
    addflags(Ri, tmp1);
    Ri = Ri + tmp1;
    set_z_and_n(Ri);
}

:ADD2 "#"^m4, Ri    
is  op1_12_4=0xA & op2_08_4=0x5 & m4 & Ri  {
    local tmp1:4 = sext(m4);
    addflags(Ri, tmp1);
    Ri = Ri + tmp1;
    set_z_and_n(Ri);
}

:ADDC Rj, Ri    
is op1_12_4=0xA & op2_08_4=0x7 & Rj & Ri  {
    local tmp1:4 = zext(C);
    addflags2(Ri, Rj, tmp1);
    Ri = Ri + Rj + tmp1;
    set_z_and_n(Ri);
}

:ADDN Rj, Ri     
is op1_12_4=0xA & op2_08_4=0x2 & Rj & Ri  {
    Ri = Ri + Rj;
}

:ADDN "#"^u4, Ri    
is op1_12_4=0xA & op2_08_4=0x0 & u4 & Ri  {
    local tmp1:4 = zext(u4);
    Ri = Ri + tmp1;
}

:ADDN2 "#"^m4, Ri    
is op1_12_4=0xA & op2_08_4=0x1 & m4 & Ri  {
    local tmp1:4 = sext(m4);
    Ri = Ri + tmp1;
}

:SUB Rj, Ri    
is op1_12_4=0xA & op2_08_4=0xC & Rj & Ri  {
    subflags(Ri, Rj);
    Ri = Ri - Rj;
    set_z_and_n(Ri);
}

:SUBC Rj, Ri    
is op1_12_4=0xA & op2_08_4=0xD & Rj & Ri  {
    local tmp1:4 = zext(C);
    subflags2(Ri, Rj, tmp1);
    Ri = Ri - Rj - tmp1;
    set_z_and_n(Ri);
}

:SUBN Rj, Ri    
is op1_12_4=0xA & op2_08_4=0xE & Rj & Ri  {
    Ri = Ri - Rj;
}

:CMP Rj, Ri    
is op1_12_4=0xA & op2_08_4=0xA & Rj & Ri  {
    subflags(Ri, Rj);
    local tmp1:4 = Ri - Rj;
    set_z_and_n(tmp1);
}

:CMP "#"^u4, Ri    
is op1_12_4=0xA & op2_08_4=0x8 & u4 & Ri  {
    local tmp1:4 = zext(u4);
    subflags(Ri, tmp1);
    local tmp2:4 = Ri - tmp1; 
    set_z_and_n(tmp2);
}

:CMP2 "#"^m4, Ri   
is op1_12_4=0xA & op2_08_4=0x9 & m4 & Ri  {
    local tmp1:4 = sext(m4);
    subflags(Ri, tmp1);
    local tmp2:4 = Ri - tmp1;   
    set_z_and_n(tmp2);
}

:AND Rj, Ri    
is op1_12_4=0x8 & op2_08_4=0x2 & Rj & Ri  {
    Ri = Ri & Rj;
    set_z_and_n(Ri);
}

:OR Rj, Ri    
is op1_12_4=0x9 & op2_08_4=0x2 & Rj & Ri  {
    Ri = Ri | Rj;
    set_z_and_n(Ri);
}

:EOR Rj, Ri    
is op1_12_4=0x9 & op2_08_4=0xA & Rj & Ri  {
    Ri = Ri ^ Rj;
    set_z_and_n(Ri);
}

:AND  Rj, "@"^Ri    
is op1_12_4=0x8 & op2_08_4=0x4 & Rj & Ri  {
    local tmp1:4 = *:4 Ri;
    tmp1= tmp1 & Rj;
    set_z_and_n(tmp1);
    *:4 Ri = tmp1;
}

:ANDH Rj, "@"^Ri    
is op1_12_4=0x8 & op2_08_4=0x5 & Rj & Ri  {
    # 16-bit AND of memory and the low half of Rj, as in the CGEN binary-logical-op-m macro
    local tmp1:2 = *:2 Ri;
    local tmp2:2 = Rj:2;
    tmp1 = tmp1 & tmp2;
    set_z_and_n(tmp1);
    *:2 Ri = tmp1;
}

:ANDB Rj, "@"^Ri    
is op1_12_4=0x8 & op2_08_4=0x6 & Rj & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = Rj:1;
    tmp1 = tmp1 & tmp2;
    set_z_and_n(tmp1);
    *:1 Ri = tmp1;
}

:OR Rj, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0x4 & Rj & Ri  {
    local tmp1:4 = *:4 Ri;
    tmp1= tmp1 | Rj;
    set_z_and_n(tmp1);
    *:4 Ri = tmp1;
}

:ORH Rj, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0x5 & Rj & Ri  {
    # 16-bit OR of memory and the low half of Rj, as in the CGEN binary-logical-op-m macro
    local tmp1:2 = *:2 Ri;
    local tmp2:2 = Rj:2;
    tmp1 = tmp1 | tmp2;
    set_z_and_n(tmp1);
    *:2 Ri = tmp1;
}

:ORB Rj, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0x6 & Rj & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = Rj:1;
    tmp1 = tmp1 | tmp2;
    set_z_and_n(tmp1);
    *:1 Ri = tmp1;
}

:EOR Rj, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0xC & Rj & Ri  {
    local tmp1:4 = *:4 Ri;
    tmp1= tmp1 ^ Rj;
    set_z_and_n(tmp1);
    *:4 Ri = tmp1;
}

:EORH Rj, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0xD & Rj & Ri  {
    # 16-bit XOR of memory and the low half of Rj, as in the CGEN binary-logical-op-m macro
    local tmp1:2 = *:2 Ri;
    local tmp2:2 = Rj:2;
    tmp1 = tmp1 ^ tmp2;
    set_z_and_n(tmp1);
    *:2 Ri = tmp1;
}

:EORB Rj, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0xE & Rj & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = Rj:1;
    tmp1 = tmp1 ^ tmp2;
    set_z_and_n(tmp1);
    *:1 Ri = tmp1;
}

:BANDL "#"^u4, "@"^Ri    
is op1_12_4=0x8 & op2_08_4=0x0 & u4 & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = (0xF0 | u4);
    tmp1= tmp1 & tmp2;
    *:1 Ri = tmp1;
}

:BORL "#"^u4, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0x0 & u4 & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = u4;
    tmp1= tmp1 | tmp2;
    *:1 Ri = tmp1;
}

:BEORL "#"^u4, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0x8 & u4 & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = u4;
    tmp1= tmp1 ^ tmp2;
    *:1 Ri = tmp1;
}

:BANDH "#"^u4, "@"^Ri    
is op1_12_4=0x8 & op2_08_4=0x1 & u4 & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = ((u4 << 4) | 0xF);
    tmp1= tmp1 & tmp2;
    *:1 Ri = tmp1;
}

:BORH "#"^u4, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0x1 & u4 & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = (u4 << 4);
    tmp1= tmp1 | tmp2;
    *:1 Ri = tmp1;
}

:BEORH "#"^u4, "@"^Ri    
is op1_12_4=0x9 & op2_08_4=0x9 & u4 & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = (u4 << 4);
    tmp1= tmp1 ^ tmp2;
    *:1 Ri = tmp1;
}

:BTSTL "#"^u4, "@"^Ri    
is op1_12_4=0x8 & op2_08_4=0x8 & u4 & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = u4;
    tmp1= tmp1 & tmp2;
    Z = (tmp1 == 0);
    N = 0;
}

:BTSTH "#"^u4, "@"^Ri    
is op1_12_4=0x8 & op2_08_4=0x9 & u4 & Ri  {
    local tmp1:1 = *:1 Ri;
    local tmp2:1 = (u4 << 4);
    tmp1= tmp1 & tmp2;
    Z = (tmp1 == 0);
    N = (tmp1 s< 0);
}

#......
#......
#......
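
The 4-bit immediate bit operations above differ only in which nibble of the memory byte they touch. As a cross-check of the masks used in the SLEIGH bodies, here is a minimal Python sketch. The names `OPS`, `apply_bitop` and `btst` are hypothetical helpers introduced for illustration; they are not part of Ghidra or CGEN.

```python
# Hypothetical model of the FR30 4-bit immediate bit operations on one memory byte.
# Each entry mirrors the mask expression of the matching SLEIGH constructor.
OPS = {
    "BANDL": lambda m, u4: m & (0xF0 | u4),          # low nibble ANDed, high nibble kept
    "BANDH": lambda m, u4: m & ((u4 << 4) | 0x0F),   # high nibble ANDed, low nibble kept
    "BORL":  lambda m, u4: m | u4,
    "BORH":  lambda m, u4: m | (u4 << 4),
    "BEORL": lambda m, u4: m ^ u4,
    "BEORH": lambda m, u4: m ^ (u4 << 4),
}


def apply_bitop(name, mem_byte, u4):
    """Return the byte written back to memory by the named instruction."""
    if not (0 <= u4 <= 0xF and 0 <= mem_byte <= 0xFF):
        raise ValueError("u4 must fit in 4 bits and mem_byte in 8 bits")
    return OPS[name](mem_byte, u4) & 0xFF


def btst(name, mem_byte, u4):
    """Return (Z, N) for BTSTL/BTSTH; these only test, memory is not written."""
    mask = u4 if name == "BTSTL" else (u4 << 4)
    result = mem_byte & mask & 0xFF
    z = int(result == 0)
    # BTSTL can never set bit 7, so N is always 0 there.
    n = 0 if name == "BTSTL" else int((result & 0x80) != 0)
    return z, n
```

For example, `apply_bitop("BANDL", 0xAB, 0x3)` masks the byte with `0xF3`, so only the low nibble can change.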

ISA Conditional Branches Instruction Definitions(CGEN)

The CGEN definitions of the conditional branch instructions:

;......
;......
;......

(define-pmacro (cond-branch cc condition)
  (begin
    (dni (.sym b cc d)
     (.str (.sym b cc :d) " label9")
     (NOT-IN-DELAY-SLOT)
     (.str (.sym b cc :d) " $label9")
     (+ OP1_F (.sym CC_ cc) label9)
     (delay (const 1)
        (if condition (set pc label9)))
     ((fr30-1 (unit u-cti)))
    )
    (dni (.sym b cc)
     (.str (.sym b cc) " label9")
     (NOT-IN-DELAY-SLOT)
     (.str (.sym b cc) " $label9")
     (+ OP1_E (.sym CC_ cc) label9)
     (if condition (set pc label9))
     ((fr30-1 (unit u-cti)))
    )
  )
)

(cond-branch ra (const BI 1))
(cond-branch no (const BI 0))
(cond-branch eq      zbit)
(cond-branch ne (not zbit))
(cond-branch c       cbit)
(cond-branch nc (not cbit))
(cond-branch n       nbit)
(cond-branch p  (not nbit))
(cond-branch v       vbit)
(cond-branch nv (not vbit))
(cond-branch lt      (xor vbit nbit))
(cond-branch ge (not (xor vbit nbit)))
(cond-branch le      (or (xor vbit nbit) zbit))
(cond-branch gt (not (or (xor vbit nbit) zbit)))
(cond-branch ls      (or cbit zbit))
(cond-branch hi (not (or cbit zbit)))
;......
;......
;......
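
To see why `lt` is `(xor vbit nbit)` and `ls` is `(or cbit zbit)`, the sixteen conditions can be evaluated against flags produced by a 32-bit compare. The following Python sketch assumes that the compare computes `a - b` and that C holds the borrow of that subtraction; `cmp_flags`, `CONDITIONS` and `branch_taken` are hypothetical names used only for this illustration.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def cmp_flags(a, b):
    """Flags after a 32-bit a - b (assumption: C is the borrow)."""
    a &= MASK32
    b &= MASK32
    r = (a - b) & MASK32
    return {
        "n": r >> 31,                          # sign bit of the result
        "z": int(r == 0),
        "v": ((a ^ b) & (a ^ r)) >> 31,        # signed overflow of a - b
        "c": int(a < b),                       # unsigned borrow
    }


# One predicate per (cond-branch cc condition) line of the CGEN listing.
CONDITIONS = {
    "ra": lambda f: 1,
    "no": lambda f: 0,
    "eq": lambda f: f["z"],
    "ne": lambda f: 1 - f["z"],
    "c":  lambda f: f["c"],
    "nc": lambda f: 1 - f["c"],
    "n":  lambda f: f["n"],
    "p":  lambda f: 1 - f["n"],
    "v":  lambda f: f["v"],
    "nv": lambda f: 1 - f["v"],
    "lt": lambda f: f["v"] ^ f["n"],
    "ge": lambda f: 1 - (f["v"] ^ f["n"]),
    "le": lambda f: (f["v"] ^ f["n"]) | f["z"],
    "gt": lambda f: 1 - ((f["v"] ^ f["n"]) | f["z"]),
    "ls": lambda f: f["c"] | f["z"],
    "hi": lambda f: 1 - (f["c"] | f["z"]),
}


def branch_taken(cc, flags):
    """True if the branch with condition code cc would be taken."""
    return bool(CONDITIONS[cc](flags))
```

Under these assumptions `lt`, `ge`, `le` and `gt` track the signed comparison of `a` and `b`, while `ls` and `hi` track the unsigned one.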

ISA Conditional Branches Instruction Definitions(SLEIGH)

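Should the flag computation for the conditional branches use logical (boolean) operators or bitwise operators? Some comparable processor modules shipped with Ghidra use the logical operators. My own inclination is to use the bitwise operators, although I am not sure that this is correct. The listing below uses the bitwise forms and keeps the logical forms as comments.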

#......
#......
#......
CC: "RA"     is cc_08_4=0x0 { 
    local tmp:1 = 1; 
    export tmp; 
}

CC: "NO"    is cc_08_4=0x1 {
    local tmp:1 = 0; 
    export tmp; 
}

CC: "EQ"     is cc_08_4=0x2 {
    local tmp:1 = (Z);
    export tmp; 
}

CC: "NE"     is cc_08_4=0x3 { 
    local tmp:1 = !(Z);
    export tmp;
}

CC: "C"       is cc_08_4=0x4 {
    local tmp:1 = (C);
    export tmp;
}

CC: "NC"    is cc_08_4=0x5 { 
    local tmp:1 = !(C);
    export tmp;
}

CC: "N"       is cc_08_4=0x6 { 
    local tmp:1 = (N);
    export tmp; 
}

CC: "P"       is cc_08_4=0x7 { 
    local tmp:1 = !(N);
    export tmp;
}

CC: "V"       is cc_08_4=0x8 {
    local tmp:1 = (V);
    export tmp; 
}

CC: "NV"    is cc_08_4=0x9 { 
    local tmp:1 = !(V);
    export tmp; 
}

CC: "LT"     is cc_08_4=0xA {
    #local tmp:1 = (N ^^ V);
    local tmp:1 = (N ^ V);
    export tmp; 
}

CC: "GE"    is cc_08_4=0xB { 
    #local tmp:1 = !(N ^^ V);
    local tmp:1 = !(N ^ V);
    export tmp; 
}

CC: "LE"    is cc_08_4=0xC {
    #local tmp:1 = ((N ^^ V) || Z);
    local tmp:1 = ((N ^ V) | Z);
    export tmp; 
}

CC: "GT"   is cc_08_4=0xD { 
    #local tmp:1 = !((N ^^ V) || Z);
    local tmp:1 = !((N ^ V) | Z);
    export tmp; 
}

CC: "LS"    is cc_08_4=0xE { 
    #local tmp:1 = (C || Z); 
    local tmp:1 = (C | Z); 
    export tmp;
}


CC: "HI"    is cc_08_4=0xF { 
    #local tmp:1 = !(C || Z); 
    local tmp:1 = !(C | Z); 
    export tmp;
}

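# Conditional branches without delay slots
:B^CC label9  
is op1_12_4=0xE & CC & label9  {
    # Equivalent form:
    #if (CC) 
    #    goto label9;

    # Fall through when the condition is false, so that PC is only
    # updated on the taken path, as in (if condition (set pc label9)).
    if (!CC) 
        goto inst_next; 
    PC = &label9;
    goto label9;
}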

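# Conditional branches with delay slots
:B^CC^":D" label9  
is op1_12_4=0xF & CC & label9 {
    # Sample the condition first, then run the delay-slot instruction
    # unconditionally, matching the CGEN semantics
    # (delay (const 1) (if condition (set pc label9))).
    local taken:1 = CC;
    delayslot(1);
    if (!taken) 
        goto inst_next; 
    PC = &label9;
    goto label9;
}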
#......
#......
#......
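
Whether `^`/`|` or `^^`/`||` is used only matters if a flag can hold something other than 0 or 1. The Python sketch below compares both variants of the six composite conditions exhaustively. It is a model under stated assumptions: flags are one-byte values, the bitwise operators act on the whole byte, and the boolean operators treat any non-zero byte as true. All function names are hypothetical.

```python
from itertools import product


def int_xor(a, b):
    return (a ^ b) & 0xFF


def int_or(a, b):
    return (a | b) & 0xFF


def bool_xor(a, b):
    return int((a != 0) != (b != 0))


def bool_or(a, b):
    return int((a != 0) or (b != 0))


def bool_negate(a):
    return int(a == 0)


def conditions(n, z, v, c, xor, or_):
    """Composite condition codes built with the given xor/or operators."""
    return {
        "LT": xor(n, v),
        "GE": bool_negate(xor(n, v)),
        "LE": or_(xor(n, v), z),
        "GT": bool_negate(or_(xor(n, v), z)),
        "LS": or_(c, z),
        "HI": bool_negate(or_(c, z)),
    }


def variants_agree(flag_values):
    """True if bitwise and boolean variants match for every flag combination."""
    return all(
        conditions(n, z, v, c, int_xor, int_or)
        == conditions(n, z, v, c, bool_xor, bool_or)
        for n, z, v, c in product(flag_values, repeat=4)
    )
```

In this model the two variants agree whenever every flag is 0 or 1, which is what assignments such as `Z = (tmp1 == 0);` produce. They diverge as soon as a flag may carry another bit pattern, for example `0x80`.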

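Another version of the SLEIGH conditional branch definitions. It computes the flags with the logical operators and tries to make the SSA PHI instruction of the decompiler's middle end explicit (which does not seem to be strictly necessary):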

#......
#......
#......
CC: "RA"     is cc_08_4=0x0 { 
    local tmp:1 = 1; 
    export tmp; 
}

CC: "NO"    is cc_08_4=0x1 {
    local tmp:1 = 0; 
    export tmp; 
}

CC: "EQ"     is cc_08_4=0x2 {
    local tmp:1 = (Z);
    export tmp; 
}

CC: "NE"     is cc_08_4=0x3 { 
    local tmp:1 = !(Z);
    export tmp;
}

CC: "C"       is cc_08_4=0x4 {
    local tmp:1 = (C);
    export tmp;
}

CC: "NC"    is cc_08_4=0x5 { 
    local tmp:1 = !(C);
    export tmp;
}

CC: "N"       is cc_08_4=0x6 { 
    local tmp:1 = (N);
    export tmp; 
}

CC: "P"       is cc_08_4=0x7 { 
    local tmp:1 = !(N);
    export tmp;
}

CC: "V"       is cc_08_4=0x8 {
    local tmp:1 = (V);
    export tmp; 
}

CC: "NV"    is cc_08_4=0x9 { 
    local tmp:1 = !(V);
    export tmp; 
}

CC: "LT"     is cc_08_4=0xA {
    local tmp:1 = (N ^^ V);
    export tmp; 
}

CC: "GE"    is cc_08_4=0xB { 
    local tmp:1 = !(N ^^ V);
    export tmp; 
}

CC: "LE"    is cc_08_4=0xC {
    local tmp:1 = ((N ^^ V) || Z);
    export tmp; 
}

CC: "GT"   is cc_08_4=0xD { 
    local tmp:1 = !((N ^^ V) || Z);
    export tmp; 
}

CC: "LS"    is cc_08_4=0xE { 
    local tmp:1 = (C || Z); 
    export tmp;
}


CC: "HI"    is cc_08_4=0xF { 
    local tmp:1 = !(C || Z); 
    export tmp;
}

COND:CC is  CC {
    if (!CC)
      goto inst_next;
}

# Conditional branches without delay slots
:B^COND label9  
is op1_12_4=0xE & COND & label9  {
    # SLEIGH predefines the symbols inst_start and inst_next; it is unclear
    # whether PC (or an NPC) needs to be modelled at all.
    # build emits COND's p-code (the conditional skip) at this point; the
    # intent is to make the SSA PHI explicit (decompiler MIR, p-code MULTIEQUAL = 60).
    build COND;
    PC = &label9;
    goto label9;
}

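# Conditional branches with delay slots
:B^COND^":D" label9  
is op1_12_4=0xF & COND & label9 {
    # build emits COND's p-code (the conditional skip) at this point; the
    # intent is to make the SSA PHI explicit (decompiler MIR, p-code MULTIEQUAL = 60).
    # Caveat: the not-taken path inside COND jumps to inst_next, which lies
    # past the delay slot, so the delay-slot instruction only runs when the
    # branch is taken. The CGEN semantics run it unconditionally.
    build COND;
    PC = &label9;
    delayslot(1);
    goto label9;
}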
#......
#......
#......
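
The CGEN semantics `(delay (const 1) (if condition (set pc label9)))` mean that the instruction in the delay slot executes whether or not the branch is taken. A toy Python interpreter makes the difference between `Bcc` and `Bcc:D` visible. It is a sketch under assumptions: the condition is sampled at the branch, the delay slot always holds an ordinary instruction, and the tuple encoding and the `run` function are hypothetical.

```python
def run(program, flags, start=0, max_steps=32):
    """Execute a toy program and return the names of the executed ops.

    program entries:
      ("op", name, updates)      ordinary instruction; updates is a dict of flag writes
      ("bcc", cond, target)      conditional branch without delay slot
      ("bcc:d", cond, target)    conditional branch with one delay slot
    cond is a function of the flags dict; target is an index into program.
    """
    flags = dict(flags)          # do not mutate the caller's flags
    trace = []
    pc = start
    for _ in range(max_steps):
        if not 0 <= pc < len(program):
            break
        insn = program[pc]
        kind = insn[0]
        if kind == "op":
            trace.append(insn[1])
            flags.update(insn[2])
            pc += 1
        elif kind == "bcc":
            pc = insn[2] if insn[1](flags) else pc + 1
        elif kind == "bcc:d":
            taken = insn[1](flags)           # sampled before the delay slot runs
            slot = program[pc + 1]           # assumed to be an ordinary "op"
            trace.append(slot[1])
            flags.update(slot[2])
            pc = insn[2] if taken else pc + 2
        else:
            raise ValueError("unknown instruction kind: %r" % (kind,))
    return trace


def eq(flags):
    return flags["z"] == 1


def make_program(branch_kind):
    return [
        (branch_kind, eq, 3),
        ("op", "slot", {"z": 0}),            # the slot instruction clears Z
        ("op", "fallthrough", {}),
        ("op", "target", {}),
    ]
```

With `Z = 1`, `run(make_program("bcc:d"), {"z": 1})` executes `slot` and then `target`, while `run(make_program("bcc"), {"z": 1})` goes straight to `target`. The slot instruction clearing Z does not cancel the branch, because the condition was sampled first.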

Conclusion

Ghidra vs IDA

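The Ghidra binary decompiler is the first open-source Retargetable Decompiler from outside academia with a very high degree of completeness. Compare it with the IDA decompiler, the de facto standard for binary decompilation: perhaps in order to sell more licenses, the publicly released IDA decompiler is not a Retargetable Decompiler. A decompiler that is probably worth the price of two to four licenses ends up costing at least five, because it is sold as one license per processor. Some processors are further split into 32-bit and 64-bit variants, which then require two licenses. The more processor decompilers you need, the higher the price. Being retargetable is the biggest advantage of the Ghidra binary decompiler.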

Future Work

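There appear to be visual tools for writing a processor ADL and generating a disassembly engine from it. Some of the official Ghidra sinc/slaspec files show traces of being tool-generated, so the NSA may also have a similar internal visual tool for writing the decompiler front-end ADL. Developing such a visual tool can be regarded as model-based development. Could an Eclipse plugin built with Eclipse Modeling Framework + Acceleo be used to write the Ghidra decompiler front-end ADL visually?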




Last edited by vasthao on 2019-6-18 00:02. Reason: corrected a description error.