
[Translation] Say hello to x86_64 Assembly [part 7]

2020-1-14 11:02 300


Say hello to x86_64 Assembly [part 7]

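I have been learning x64 assembly recently and found some beginner-level study material on GitHub. Because I wanted to work through it carefully, I translated the author's content along the way. I am not sure whether my translation is accurate, so please point out anything that looks wrong. This is my first translation, so please forgive any shortcomings.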

 

Author's original article


 

This is the seventh part of Say hello to x86_64 Assembly, and here we will look at how we can use C together with assembly.

 

Actually, there are three ways to use them together:

  • Call assembly routines from C code
  • Call C routines from assembly code
  • Use inline assembly in C code

Let's write three simple Hello world programs that show how to use assembly and C together.

Call assembly from C

 

First of all, let's write a simple C program like this:

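#include <string.h>

/* Implemented in casm.asm; the length arrives in the full 64-bit rsi register. */
extern void printHelloWorld(const char *str, size_t len);

int main(void) {
    const char *str = "Hello World\n";
    size_t len = strlen(str);
    printHelloWorld(str, len);
    return 0;
}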

Here we can see C code that defines two variables: the Hello World string that we will write to stdout, and the length of this string. Next we call the printHelloWorld assembly function with these two variables as arguments. Since we are on x86_64 Linux, we must know the x86_64 Linux calling convention (the System V AMD64 ABI), so that we know how to write the printHelloWorld function, how to get the incoming parameters, and so on. When a function is called, the first six integer or pointer arguments are passed in the rdi, rsi, rdx, rcx, r8 and r9 general-purpose registers, and any further arguments are passed on the stack. So we can get the first and second arguments from the rdi and rsi registers, call the write syscall, and then return from the function with the ret instruction:

global printHelloWorld

section .text
printHelloWorld:
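        ;; first argument (rdi): pointer to the string
        mov r10, rdi
        ;; second argument (rsi): length of the string
        mov r11, rsi
        ;; write syscall: rax = 1 (sys_write), rdi = 1 (stdout)
        mov rax, 1
        mov rdi, 1
        mov rsi, r10
        mov rdx, r11
        syscall
        ret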
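
For comparison, the routine above does roughly what the following C function does. This is only a sketch: it assumes the POSIX write() wrapper from unistd.h instead of a raw syscall, and the name print_hello_world_c is hypothetical, not part of the original article.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical C counterpart of printHelloWorld.
 * str plays the role of rdi (first argument),
 * len plays the role of rsi (second argument). */
ssize_t print_hello_world_c(const char *str, size_t len) {
    assert(str != NULL);
    /* write(fd, buf, count) matches the syscall registers rdi, rsi, rdx. */
    return write(STDOUT_FILENO, str, len);
}
```

Calling print_hello_world_c("Hello World\n", 12) writes the string to stdout and returns the number of bytes written.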

Now we can build it with:

build:
    nasm -f elf64 -o casm.o casm.asm
    gcc casm.o casm.c -o casm

Inline assembly

 

The next method is to write assembly code directly in C code. There is a special syntax for this. Its general form is:

asm [volatile] ("assembly code" : output operand : input operand : clobbers);

As we can read in the GCC documentation, the volatile keyword means:

The typical use of Extended asm statements is to manipulate input values to produce output values. However, your asm statements may also produce side effects. If so, you may need to use the volatile qualifier to disable certain optimizations.

Each operand is described by a constraint string followed by a C expression in parentheses. There are a number of constraints:

  • r - The operand is kept in a general-purpose register.
  • g - Any register, memory or immediate integer operand is allowed, except for registers that are not general registers.
  • f - Floating point register
  • m - A memory operand is allowed, with any kind of address that the machine supports in general.
  • and so on…
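
As a hedged illustration of constraints in practice: on x86, GCC also accepts register-specific constraints such as a (rax), D (rdi), S (rsi) and d (rdx), which are not in the list above. The sketch below uses them to issue the Linux write syscall directly. The helper name raw_write is hypothetical, and the code assumes x86_64 Linux with GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: invoke the x86_64 Linux write syscall (number 1).
 * Returns the number of bytes written, or a negated errno on failure. */
long raw_write(long fd, const void *buf, size_t len) {
    long ret;
    assert(buf != NULL);
    /* volatile: the syscall has side effects and must not be optimized away */
    __asm__ volatile("syscall"
        : "=a"(ret)                             /* result comes back in rax */
        : "a"(1L), "D"(fd), "S"(buf), "d"(len)  /* rax, rdi, rsi, rdx */
        : "rcx", "r11", "memory");              /* syscall overwrites rcx, r11 */
    return ret;
}
```

Because the constraints place each value in the right register, no mov instructions are needed in the assembly template. A failed call returns the negated errno, for example -9 (EBADF) for an invalid file descriptor.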

So our hello world will be:

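#include <string.h>

int main(void) {
    const char *str = "Hello World\n";
    long len = strlen(str);
    long ret = 0;

    /* volatile: the syscall has side effects, so it must not be optimized away */
    __asm__ volatile("movq $1, %%rax \n\t"
        "movq $1, %%rdi \n\t"
        "movq %1, %%rsi \n\t"
        "movq %2, %%rdx \n\t"
        "syscall \n\t"
        "movq %%rax, %0"
        : "=r"(ret)
        : "r"(str), "r"(len)
        /* registers overwritten by the code above and by syscall itself */
        : "rax", "rdi", "rsi", "rdx", "rcx", "r11", "memory");

    return 0;
}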

Here we can see the same two variables as in the previous example, plus the inline assembly definition. First of all we put 1 into the rax and rdi registers (the write system call number and stdout), as we did in our plain assembly hello world. Next we do a similar operation with the rsi and rdx registers, but the source operands start with the % symbol instead of $. Operands are numbered from zero, outputs first and then inputs, so %0 is the output operand ret, str is the first input operand, referred to as %1, and len is the second input operand, referred to as %2. We therefore put the values of str and len into rsi and rdx with the %n notation, where n is the index of the operand. Also, register names are prefixed with %%.

    This helps GCC to distinguish between the operands and registers. Operands have a single % as prefix.
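
The operand numbering can be seen in isolation in a smaller sketch. The functions add_asm and check_add_asm below are hypothetical and assume GCC or Clang on x86_64. %0 is the output operand, and %1 and %2 are the two input operands.

```c
#include <assert.h>

/* Hypothetical example: compute a + b with a single lea instruction. */
long add_asm(long a, long b) {
    long result;
    __asm__("leaq (%1, %2), %0"
            : "=r"(result)      /* %0: output operand */
            : "r"(a), "r"(b));  /* %1 and %2: input operands */
    return result;
}

/* Small usage check for the function above. */
void check_add_asm(void) {
    assert(add_asm(2, 3) == 5);
}
```

No explicit register names appear in this template, so no %% prefix is needed. The compiler picks the registers and substitutes them for %0, %1 and %2.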

We can build it with:

build:
    gcc casm.c -o casm

Call C from assembly

 

The last method is to call a C function from assembly code. For example, we have the following simple C code with one function that just prints Hello world:

#include <stdio.h>

extern int print();

int print() {
    printf("Hello World\n");
    return 0;
}

Now we can declare this function as extern in our assembly code and call it with the call instruction, as we did many times in previous posts:

global _start

extern print

section .text

_start:
        call print

        mov rax, 60
        mov rdi, 0
        syscall
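
One caveat with this approach, offered as a hedged note: because _start exits through the raw exit syscall, libc's normal exit path does not run, so buffered stdio output is not flushed automatically. On a terminal, stdout is usually line-buffered and the newline flushes it, but when the output is redirected to a file or a pipe the text may be lost. A minimal sketch of a variant that flushes explicitly, using the hypothetical name print_flushed:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical variant of print(): flush stdout before returning,
 * so the text survives even if the caller exits via a raw syscall. */
int print_flushed(void) {
    int written = printf("Hello World\n");
    assert(written == 12);   /* "Hello World\n" is 12 characters long */
    return fflush(stdout);   /* 0 on success, EOF on error */
}
```

The assembly side would then declare extern print_flushed and call it in place of print.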

Build it with:

build:
    gcc  -c casm.c -o c.o
    nasm -f elf64 casm.asm -o casm.o
    ld   -dynamic-linker /lib64/ld-linux-x86-64.so.2 -lc casm.o c.o -o casm

And now we can run our third hello world.


