Intro to x86-64
Date: 31, May, 2021
Author: Dhilip Sanjay S
These notes follow the TryHackMe room of the same name.
Introduction
Computers carry out tasks by executing machine code, which is encoded as bytes. Since different computers have different processors, machine code is specific to the processor it was built for.
The Intel x86-64 instruction set architecture is commonly used today.
This machine code is usually produced by a compiler, which takes the source code of a file, and after going through some intermediate stages, produces machine code that can be executed by a computer.
16-bit -> 32-bit -> 64-bit (instruction set)
Each generation of the instruction set was designed to be backward compatible, so code compiled for the 32-bit architecture will generally still run on 64-bit machines.
Before an executable file is produced, the source code is first compiled into assembly (.s files). The assembler then converts the assembly into an object program (.o files), and the linker finally combines the object files into an executable.
Radare2
radare2 is a framework for reverse engineering and analysing binaries.
It can be used to disassemble binaries (translate machine code into assembly, which is human readable) and to debug them (by letting the user step through the execution and view the state of the program).
To debug the executable:
r2 -d <executable>
To analyze all:
aa
To set the syntax to AT&T:
e asm.syntax=att
Help command:
?
To find a list of the functions, run:
afl
[0x7fa314fcf090]> afl
0x55824f4fa560 1 42 entry0
0x55824f6fafe0 1 4124 reloc.__libc_start_main
0x55824f4fa590 4 50 -> 40 sym.deregister_tm_clones
0x55824f4fa5d0 4 66 -> 57 sym.register_tm_clones
0x55824f4fa620 5 58 -> 51 entry.fini0
0x55824f4fa550 1 6 sym..plt.got
0x55824f4fa660 1 10 entry.init0
0x55824f4fa730 1 2 sym.__libc_csu_fini
0x55824f4fa734 1 9 sym._fini
0x55824f4fa6c0 4 101 sym.__libc_csu_init
0x55824f4fa66a 1 78 main
0x55824f4fa540 1 6 sym.imp.__printf_chk
0x55824f4fa510 3 23 sym._init
0x55824f4fa000 3 97 -> 123 map.home_tryhackme_introduction_intro.r_x
Print Disassembly Function:
pdf @main
The leftmost column shows the memory addresses of the instructions. The instructions themselves live in the program's code section, not on the stack.
The middle column contains the instructions encoded as bytes (the actual machine code).
The last column contains the human-readable assembly instructions.
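List of registers:
| 64-bit | 32-bit |
| %rax | %eax |
| %rbx | %ebx |
| %rcx | %ecx |
| %rdx | %edx |
| %rsi | %esi |
| %rdi | %edi |
| %rsp | %esp |
| %rbp | %ebp |
| %r8 | %r8d |
| %r9 | %r9d |
| %r10 | %r10d |
| %r11 | %r11d |
| %r12 | %r12d |
| %r13 | %r13d |
| %r14 | %r14d |
| %r15 | %r15d |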
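Each 32-bit name in the list refers to the low 32 bits of the matching 64-bit register. The following is a minimal Python sketch of that relationship, modelling a register as a plain integer; the helper name `low32` is made up for illustration and is not part of radare2 or any library.

```python
# Model a 64-bit register as a Python int and derive its 32-bit view.
MASK32 = 0xFFFFFFFF


def low32(reg64: int) -> int:
    """Return the value seen through the 32-bit name (e.g. %eax for %rax)."""
    return reg64 & MASK32


rax = 0x1122334455667788
eax = low32(rax)
print(hex(eax))  # 0x55667788
```

This is why a value that fits in 32 bits looks the same whether it is read through %rax or %eax.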
The first six registers (%rax, %rbx, %rcx, %rdx, %rsi and %rdi) are known as general purpose registers.
%rsp is the stack pointer: it points to the top of the stack, which holds the most recently pushed value. The stack is a data structure that manages memory for programs.
%rbp is the frame pointer: it points to the frame of the function currently being executed. Every function is executed in a new frame.
Mov instruction
To move data using registers, the following instruction is used:
movq source, destination
It covers three cases:
Transferring constants (which are prefixed with $), e.g. movq $3, %rax moves the constant 3 into the register %rax.
Transferring values from/to a register, e.g. movq %rax, %rbx moves the value in %rax into %rbx.
Transferring values from/to memory, which is shown by putting a register inside parentheses, e.g. movq %rax, (%rbx) moves the value stored in %rax to the memory location whose address is held in %rbx.
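To make the three forms of movq concrete, here is a small Python sketch that models registers and memory as dictionaries. This is only an illustration under simplifying assumptions: the `registers` and `memory` names are hypothetical, and real memory is byte-addressed rather than a dictionary of whole values.

```python
# Toy model: registers by name, memory as address -> stored 8-byte value.
registers = {"rax": 0, "rbx": 0}
memory = {}

# movq $3, %rax      (constant -> register)
registers["rax"] = 3

# movq %rax, %rbx    (register -> register)
registers["rbx"] = registers["rax"]

# movq %rax, (%rbx)  (register -> memory at the address held in %rbx)
memory[registers["rbx"]] = registers["rax"]

print(registers)  # {'rax': 3, 'rbx': 3}
print(memory)     # {3: 3}
```

Note that in the last step %rbx is used as an address, not as a destination register, so its own value is left unchanged.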
Basic Data Types
| Data type | Suffix | Size (bytes) |
| Byte | b | 1 |
| Word | w | 2 |
| Double Word | l | 4 |
| Quad | q | 8 |
| Single Precision | s | 4 |
| Double Precision | l | 8 |
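The sizes listed above can be cross-checked with Python's standard `struct` module. This is a rough sketch only: the format codes (`b`, `h`, `i`, `q`, `f`, `d`) are Python's own and are not the assembly suffixes, and the `<` prefix is used so that `struct` reports its standard, platform-independent sizes.

```python
import struct

# Map each data type (with its assembly suffix) to a struct format code
# of the same width, then ask struct for the size in bytes.
SIZES = {
    "byte (b)": struct.calcsize("<b"),
    "word (w)": struct.calcsize("<h"),
    "double word (l)": struct.calcsize("<i"),
    "quad (q)": struct.calcsize("<q"),
    "single precision (s)": struct.calcsize("<f"),
    "double precision (l)": struct.calcsize("<d"),
}

for name, size in SIZES.items():
    print(f"{name}: {size} bytes")
```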
Memory manipulation using registers
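(Rb, Ri) = MemoryLocation[Rb + Ri]
D(Rb, Ri) = MemoryLocation[Rb + Ri + D]
(Rb, Ri, S) = MemoryLocation[Rb + S * Ri]
D(Rb, Ri, S) = MemoryLocation[Rb + S * Ri + D]
Here Rb is the base register, Ri is the index register, S is a scale factor (1, 2, 4 or 8) and D is a constant displacement.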
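All four forms are special cases of the last one, with the missing parts defaulting to Ri = 0, S = 1 and D = 0. A minimal Python sketch of the address calculation follows; the function name `effective_address` is hypothetical, and register contents are passed in as plain integers.

```python
MASK64 = (1 << 64) - 1  # addresses wrap around at 64 bits


def effective_address(rb: int, ri: int = 0, s: int = 1, d: int = 0) -> int:
    """Compute the address denoted by D(Rb, Ri, S) = Rb + S * Ri + D."""
    if s not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (rb + s * ri + d) & MASK64


# 8(%rbx, %rcx, 4) with %rbx = 0x1000 and %rcx = 3
print(hex(effective_address(0x1000, 3, 4, 8)))  # 0x1014
```

The scaled form is what makes array indexing cheap: with S equal to the element size, Ri can be used directly as the array index.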
Important instructions
leaq source, destination: sets destination to the address denoted by the expression in source (Load Effective Address: it does not move the contents stored at that address)
addq source, destination: destination = destination + source
subq source, destination: destination = destination - source
imulq source, destination: destination = destination * source
salq source, destination: destination = destination << source, where << is the left bit shift operator
sarq source, destination: destination = destination >> source, where >> is the arithmetic (sign-preserving) right bit shift operator
xorq source, destination: destination = destination XOR source
andq source, destination: destination = destination & source
orq source, destination: destination = destination | source
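The semantics listed above can be sketched in Python by treating registers as 64-bit values that wrap around. This is a simplified model, not an emulator: the function names simply mirror the mnemonics, flags are ignored, and shift counts are assumed to be between 0 and 63.

```python
MASK64 = (1 << 64) - 1


def to_signed(value: int) -> int:
    """Interpret a 64-bit pattern as a two's complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def addq(src: int, dst: int) -> int:
    return (dst + src) & MASK64


def subq(src: int, dst: int) -> int:
    return (dst - src) & MASK64


def imulq(src: int, dst: int) -> int:
    return (to_signed(dst) * to_signed(src)) & MASK64


def salq(src: int, dst: int) -> int:
    return (dst << src) & MASK64


def sarq(src: int, dst: int) -> int:
    # Arithmetic shift: the sign bit is copied into the vacated positions.
    return (to_signed(dst) >> src) & MASK64


def xorq(src: int, dst: int) -> int:
    return (dst ^ src) & MASK64


def andq(src: int, dst: int) -> int:
    return dst & src & MASK64


def orq(src: int, dst: int) -> int:
    return (dst | src) & MASK64


print(subq(0x3E7, 1000))  # 1
print(hex(andq(0x64, 99)))  # 0x60
print(to_signed(sarq(1, -8 & MASK64)))  # -4
```

Each function takes its arguments in the same source, destination order as the AT&T syntax and returns the new value of the destination.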
If statements
If statements rely on three kinds of instructions in assembly:
cmpq source2, source1: like computing source1 - source2, except that only the condition flags are set and no destination is written.
Example: cmpl var_4h, %eax (compare the value in eax with var_4h)
testq source2, source1: like computing source1 & source2, again setting only the flags and writing no destination.
Jump instructions, which transfer control depending on those flags:
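Types of Jumps
| Instruction | Jump condition |
| jmp | Unconditional |
| je | Equal/Zero |
| jne | Not Equal/Not Zero |
| js | Negative |
| jns | Nonnegative |
| jg | Greater |
| jge | Greater or Equal |
| jl | Less |
| jle | Less or Equal |
| ja | Above (unsigned) |
| jb | Below (unsigned) |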
The last two entries of the table (ja and jb) apply to unsigned integers.
Unsigned integers cannot be negative, while signed integers represent both positive and negative values.
Since the computer needs to differentiate between them, it uses different methods to interpret these values.
For signed integers it uses the two’s complement representation, and for unsigned integers it uses plain binary.
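The difference between the signed jumps (jg, jl) and the unsigned ones (ja, jb) shows up when the same bit pattern is read both ways. The sketch below uses 32-bit values for brevity; the helper names `as_signed` and `as_unsigned` are made up for illustration, and the comparison is modelled directly rather than through the processor's flags.

```python
BITS = 32
MASK = (1 << BITS) - 1


def as_unsigned(bits: int) -> int:
    """Read the bit pattern as a plain binary number."""
    return bits & MASK


def as_signed(bits: int) -> int:
    """Read the bit pattern as a two's complement number."""
    value = bits & MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value


a = 0xFFFFFFFF
b = 1

print(as_unsigned(a), as_signed(a))  # 4294967295 -1

# After comparing a with b, a signed jump and an unsigned jump disagree:
jg_taken = as_signed(a) > as_signed(b)      # -1 > 1
ja_taken = as_unsigned(a) > as_unsigned(b)  # 4294967295 > 1
print(jg_taken, ja_taken)  # False True
```

The bits are identical in both cases; only the choice of jump instruction decides which interpretation is applied.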
Analyzing Jump
To set a breakpoint:
db 0x<Address>
To run the program until breakpoint:
dc
To view the value of the registers at breakpoint:
dr
To view the value in the variable:
px @location
To seek/move onto the next instruction:
ds
Analysing if2
Tracing the assembly code
Initially,
var_ch will have 0
var_4h will have 1000
var_8h will have 99
First jge will fail (var_ch >= var_8h? -> FALSE)
Second jge will also fail (var_8h >= var_4h? -> FALSE)
Perform and operation on var_8h with 0x64
Perform unconditional jmp
Perform sub operation on var_4h with 0x3e7
Clear eax
Pop rbp and return
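The trace above can be replayed in a few lines of Python to check the final values. This sketch follows only the path described in the trace (both jge branches not taken); it is not a reconstruction of the full control flow of the if2 binary.

```python
# Initial values, as listed in the trace.
var_ch, var_4h, var_8h = 0, 1000, 99

# Both conditional jumps are evaluated and neither is taken.
first_jge_taken = var_ch >= var_8h    # 0 >= 99
second_jge_taken = var_8h >= var_4h   # 99 >= 1000
assert not first_jge_taken and not second_jge_taken

# and operation on var_8h with 0x64
var_8h &= 0x64

# sub operation on var_4h with 0x3e7 (reached after the unconditional jmp)
var_4h -= 0x3E7

print(var_8h, var_ch, var_4h)  # 96 0 1
```

The printed values match the answers to the questions that follow.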
What is the value of var_8h before the popq and ret instructions?
Answer: 96
Steps to Reproduce:
99 & 0x64 = 0x60
Hex 60 -> Decimal 96
What is the value of var_ch before the popq and ret instructions?
Answer: 0
Steps to Reproduce:
What is the value of var_4h before the popq and ret instructions?
Answer: 1
Steps to Reproduce:
What operator is used to change the value of var_8h, input the symbol as your answer(symbols include +, -, *, /, &, |):
Answer: &
Steps to Reproduce:
The and instruction applied to var_8h in the disassembly shows that a bitwise AND operation is performed.
Loops
A quicker way to examine the loop is to add a breakpoint at the cmpl instruction and run dc. Since this is a loop, the program will break at the cmpl instruction on every iteration, because this instruction checks the loop condition before the loop body is executed.
Analyzing loop2
What is the value of var_8h on the second iteration of the loop?
Answer: 5
Steps to Reproduce:
What is the value of var_ch on the second iteration of the loop?
Answer: 0
Steps to Reproduce:
What is the value of var_8h at the end of the program?
Answer: 2
Steps to Reproduce:
What is the value of var_ch at the end of the program?
Answer: 0
Steps to Reproduce:
Crackme 1
Analyzing crackme1
What is the password?
Answer: 127.0.0.1
Steps to Reproduce:
Crackme 2
Analyzing crackme2
What is the correct password?
Answer: dwperuc3sv
Steps to Reproduce:
Reading the secret.txt:
The password is checked in reverse. So, reverse the secret!
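Reversing a string is a one-liner in Python. The sketch below uses a made-up placeholder value; it is not the real contents of secret.txt, and the function name `reverse_secret` is hypothetical.

```python
def reverse_secret(secret: str) -> str:
    """Return the characters of the secret in reverse order."""
    return secret[::-1]


# Hypothetical placeholder, NOT the actual secret from the room.
example = "abc123"
print(reverse_secret(example))  # 321cba
```

When reading the value from a file, strip the trailing newline first so that it does not end up at the front of the reversed string.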