Typical Assembly Language
Introduction
Assembly languages are usually unstructured, i.e. there are no control structures such as if or while statements: these have to be manually implemented using labels and jump (goto) instructions. There may exist macros that mimic control structures. The typical look of an assembly program is however still a single column of instructions with arguments, one per line, each representing one machine instruction.
The working of the language reflects the actual hardware architecture – most architectures are based on registers so usually there is a small number (something like 16) of registers which may be called something like R0 to R15, or A, B, C etc. Sometimes registers may even be subdivided (e.g. in x86 there is an eax 32bit register and half of it can be used as the ax 16bit register). These registers are the fastest available memory (faster than the main RAM memory) and are used to perform calculations. Some registers are general purpose and some are special: typically there will be e.g. the FLAGS register which holds various 1bit results of performed operations (e.g. overflow, zero result etc.). Some instructions may only work with some registers (e.g. there may be kind of a “pointer” register used to hold addresses along with instructions that work with this register, which is meant to implement arrays). Values can be moved between registers and the main memory (with instructions called something like move, load or store).
Writing instructions works similarly to how you call a function in high level language: you write its name and then its arguments, but in assembly things are more complicated because an instruction may for example only allow certain kinds of arguments – it may e.g. allow a register and immediate constant (kind of a number literal/constant), but not e.g. two registers. You have to read the documentation for each instruction. While in high level language you may write general expressions as arguments (like myFunc(x + 2 * y,myFunc2())), here you can only pass specific values.
There are also no complex data types, assembly only works with numbers of different size, e.g. 16 bit integer, 32 bit integer etc. Strings are just sequences of numbers representing ASCII values, it is up to you whether you implement null terminated strings or Pascal style strings. Pointers are just numbers representing addresses. It is up to you whether you interpret a number as signed or unsigned (some instructions treat numbers as unsigned, some as signed). Instructions are typically written as three-letter abbreviations and follow some unwritten naming conventions so that different assembly languages at least look similar. Common instructions found in most assembly languages are for example:
- MOV (move): move a number between registers and/or main memory (RAM).
- JMP (jump, also e.g. BRA for branch): unconditional jump to far away instruction.
- JEQ (jump if equal, also BEQ etc.): jump if result of previous comparison was equality.
- ADD (add): add two numbers.
- NOP (no operation): do nothing (used e.g. for delays or as placeholders).
- CMP (compare): compare two numbers and set relevant flags (typically for a subsequent conditional jump).
How to
On Unices the objdump utility from GNU binutils can be used to disassemble compiled programs, i.e view the instructions of the program in assembly (other tools like ndisasm can also be used). Use it e.g. as:
objdump -d my_compiled_program Let’s now write a simple Unix program in x86 assembly (AT&T syntax). Write the following source code into a file named e.g. program.s:
.global _start # include the symbol in object file
str:
.ascii "it works\n" # the string data
.text
_start: # execution starts here
mov $5, %rbx # store loop counter in rbx
.loop:
# make a Linux "write" syscall:
# args to syscall will be passed in regs.
mov $1, %rax # says syscalls type (1 = write)
mov $1, %rdi # says file to write to (1 = stdout)
mov $str, %rsi # says the address of the string to write
mov $9, %rdx # says how many bytes to write
syscall # makes the syscall
sub $1, %rbx # decrement loop counter
cmp $0, %rbx # compare it to 0
jne .loop # if not equal, jump to start of the loop
# make an "exit" syscall to properly terminate:
mov $60, %rax # says syscall type (60 = exit)
mov $0, %rdi # says return value (0 = success)
syscall # makes the syscall
The program just writes out it works five times: it uses a simple loop and a Unix system call for writing a string to standard output (i.e. it won’t work on Windows and similar shit).
Now assembly source code can be manually assembled into executable by running assemblers like as or nasm to obtain the intermediate object file and then linking it with ld, but to assemble the above written code simply we may just use the gcc compiler which does everything for us:
gcc -nostdlib -no-pie -o program program.s
Now we can run the program with
./program And we should see
it works
it works
it works
it works
it works
As an exercise you can objdump the final executable and see that the output basically matches the original source code. Furthermore try to disassemble some primitive C programs and see how a compiler e.g. makes if statements or functions into assembly.
Example Let’s take the following C code:
#include <stdio.h>
char incrementDigit(char d)
{
return // remember this is basically an if statement
d >= '0' && d < '9' ?
d + 1 :
'?';
}
int main(void)
{
char c = getchar();
putchar(incrementDigit(c));
return 0;
}
We will now compile it to different assembly languages (you can do this e.g. with gcc -S my_program.c). This assembly will be pretty long as it will contain boilerplate and implementations of getchar and putchar from standard library, but we’ll only be looking at the assembly corresponding to the above written code. Also note that the generated assembly will probably differ between compilers, their versions, flags such as optimization level etc. The code will be manually commented.
The x86 assembly may look like this:
incrementDigit:
pushq %rbp # save base pointer
movq %rsp, %rbp # move base pointer to stack top
movl %edi, %eax # move argument to eax
movb %al, -4(%rbp) # and move it to local var.
cmpb $47, -4(%rbp) # compare it to '0'
jle .L2 # if <=, jump to .L2
cmpb $56, -4(%rbp) # else compare to '9'
jg .L2 # if >, jump to .L4
movzbl -4(%rbp), %eax # else get the argument
addl $1, %eax # add 1 to it
jmp .L4 # jump to .L4
.L2:
movl $63, %eax # move '?' to eax (return val.)
.L4:
popq %rbp # restore base pointer
ret
main:
pushq %rbp # save base pointer
movq %rsp, %rbp # move base pointer to stack top
subq $16, %rsp # make space on stack
call getchar # push ret. addr. and jump to func.
movb %al, -1(%rbp) # store return val. to local var.
movsbl -1(%rbp), %eax # move with sign extension
movl %eax, %edi # arg. will be passed in edi
call incrementDigit
movsbl %al, %eax # sign extend return val.
movl %eax, %edi # pass arg. in edi again
call putchar
movl $0, %eax # values are returned in eax
leave
ret
The ARM assembly may look like this:
incrementDigit:
sub sp, sp, #16 // make room on stack
strb w0, [sp, 15] // load argument from w0 to local var.
ldrb w0, [sp, 15] // load back to w0
cmp w0, 47 // compare to '0'
bls .L2 // branch to .L2 if <
ldrb w0, [sp, 15] // load argument again to w0
cmp w0, 56 // compare to '9'
bhi .L2 // branch to .L2 if >=
ldrb w0, [sp, 15] // load argument again to w0
add w0, w0, 1 // add 1 to it
and w0, w0, 255 // mask out lowest byte
b .L3 // branch to .L3
.L2:
mov w0, 63 // set w0 (ret. value) to '?'
.L3:
add sp, sp, 16 // shift stack pointer back
ret
main:
stp x29, x30, [sp, -32]! // shift stack and store x regs
mov x29, sp
bl getchar
strb w0, [sp, 31] // store w0 (ret. val.) to local var.
ldrb w0, [sp, 31] // load it back to w0
bl incrementDigit
and w0, w0, 255 // mask out lowest byte
bl putchar
mov w0, 0 // set ret. val. to 0
ldp x29, x30, [sp], 32 // restore x regs
ret
The RISC-V assembly may look like this:
incrementDigit:
addi sp,sp,-32 # shift stack (make room)
sw s0,28(sp) # save frame pointer
addi s0,sp,32 # shift frame pointer
mv a5,a0 # get arg. from a0 to a5
sb a5,-17(s0) # save to to local var.
lbu a4,-17(s0) # get it to a4
li a5,47 # load '0' to a4
bleu a4,a5,.L2 # branch to .L2 if a4 <= a5
lbu a4,-17(s0) # load arg. again
li a5,56 # load '9' to a5
bgtu a4,a5,.L2 # branch to .L2 if a4 > a5
lbu a5,-17(s0) # load arg. again
addi a5,a5,1 # add 1 to it
andi a5,a5,0xff # mask out the lowest byte
j .L3 # jump to .L3
.L2:
li a5,63 # load '?'
.L3:
mv a0,a5 # move result to ret. val.
lw s0,28(sp) # restore frame pointer
addi sp,sp,32 # pop stack
jr ra # jump to addr in ra
main:
addi sp,sp,-32 # shift stack (make room)
sw ra,28(sp) # store ret. addr on stack
sw s0,24(sp) # store stack frame pointer on stack
addi s0,sp,32 # shift frame pointer
call getchar
mv a5,a0 # copy return val. to a5
sb a5,-17(s0) # move a5 to local var
lbu a5,-17(s0) # load it again to a5
mv a0,a5 # move it to a0 (func. arg.)
call incrementDigit
mv a5,a0 # copy return val. to a5
mv a0,a5 # get it back to a0 (func. arg.)
call putchar
li a5,0 # load 0 to a5
mv a0,a5 # move it to a0 (ret. val.)
lw ra,28(sp) # restore return addr.
lw s0,24(sp) # restore frame pointer
addi sp,sp,32 # pop stack
jr ra # jump to addr in ra