Intro to x86-64

Date: 31 May 2021

Author: Dhilip Sanjay S


These notes accompany the corresponding TryHackMe room.

Introduction

  • Computers carry out tasks by executing machine code, which is encoded as bytes. Since different computers have different processors, machine code is specific to the processor it was built for.

  • The Intel x86-64 instruction set architecture is commonly used today.

  • Machine code is usually produced by a compiler, which takes the source code and, after some intermediate stages, produces machine code that the processor can execute.

  • The x86 instruction set grew from 16-bit to 32-bit to 64-bit.

    • Each generation was designed to be backward compatible with the previous one, so code compiled for the 32-bit architecture can generally still run on 64-bit machines.

  • Before an executable file is produced, the source code is first compiled into assembly (.s files), the assembler then converts the assembly into an object file (.o files), and finally the linker turns the object files into an executable.

Radare2

  • radare2 is a framework for reverse engineering and analysing binaries.

  • It can be used to disassemble binaries (translate machine code into assembly, which is human readable) and to debug them (step through the execution and inspect the state of the program).

  • To debug the executable: r2 -d <executable>

  • To analyze all: aa

  • To set the syntax to AT&T: e asm.syntax=att

  • Help command: ?

  • To list the functions, run: afl

    [0x7fa314fcf090]> afl
    0x55824f4fa560    1 42           entry0
    0x55824f6fafe0    1 4124         reloc.__libc_start_main
    0x55824f4fa590    4 50   -> 40   sym.deregister_tm_clones
    0x55824f4fa5d0    4 66   -> 57   sym.register_tm_clones
    0x55824f4fa620    5 58   -> 51   entry.fini0
    0x55824f4fa550    1 6            sym..plt.got
    0x55824f4fa660    1 10           entry.init0
    0x55824f4fa730    1 2            sym.__libc_csu_fini
    0x55824f4fa734    1 9            sym._fini
    0x55824f4fa6c0    4 101          sym.__libc_csu_init
    0x55824f4fa66a    1 78           main
    0x55824f4fa540    1 6            sym.imp.__printf_chk
    0x55824f4fa510    3 23           sym._init
    0x55824f4fa000    3 97   -> 123  map.home_tryhackme_introduction_intro.r_x

  • To print the disassembly of a function: pdf @main

  • The leftmost column contains the memory addresses of the instructions. The instructions themselves live in the program's code section; the stack holds data such as local variables and return addresses.

  • The middle column contains the instructions encoded as bytes (the machine code).

  • The last column contains the human-readable assembly instructions.

List of registers:

    64 bit    32 bit
    %rax      %eax
    %rbx      %ebx
    %rcx      %ecx
    %rdx      %edx
    %rsi      %esi
    %rdi      %edi
    %rsp      %esp
    %rbp      %ebp
    %r8       %r8d
    %r9       %r9d
    %r10      %r10d
    %r11      %r11d
    %r12      %r12d
    %r13      %r13d
    %r14      %r14d
    %r15      %r15d

  • The first 6 registers in the table (%rax to %rdi) are the ones usually referred to as general purpose registers; %r8 to %r15 can be used as general purpose registers as well.

  • %rsp is the stack pointer: it points to the top of the stack, that is, the most recently pushed value. The stack is a region of memory that a program uses for local variables, saved registers and return addresses.

  • %rbp is the frame pointer and points to the base of the stack frame of the function currently being executed. Every function call gets a new frame.

Mov instruction

  • To move data using registers, the following instruction is used: movq source, destination

    • Transferring constants (which are prefixed with $), e.g. movq $3, %rax moves the constant 3 into the register %rax

    • Transferring values from one register to another, e.g. movq %rax, %rbx copies the value in %rax into %rbx

    • Transferring values from/to memory, which is shown by putting a register inside parentheses, e.g. movq %rax, (%rbx) stores the value in %rax at the memory address held in %rbx
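The three forms of movq can be mimicked with a toy model. The following Python sketch is only an illustration, not part of the room: registers and memory are plain dictionaries, and the address 0x1000 is an arbitrary value chosen for the example.

```python
# Toy model: registers are a dict, memory is a dict keyed by address.
regs = {"rax": 0, "rbx": 0}
mem = {}

# movq $3, %rax       constant into a register
regs["rax"] = 3

# movq %rax, %rbx     register into a register
regs["rbx"] = regs["rax"]

# movq $0x1000, %rbx  give %rbx a made-up address to point at
regs["rbx"] = 0x1000

# movq %rax, (%rbx)   register into the memory cell whose address is in %rbx
mem[regs["rbx"]] = regs["rax"]

print(regs)  # {'rax': 3, 'rbx': 4096}
print(mem)   # {4096: 3}
```

Note that the last step changes memory only: %rbx still holds the address, not the value 3.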

Basic Data Types

    Data Type           Suffix    Size (bytes)
    Byte                b         1
    Word                w         2
    Double Word         l         4
    Quad                q         8
    Single Precision    s         4
    Double Precision    l         8
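As a rough illustration of what the integer suffixes mean in practice, the Python sketch below keeps only as many low bytes of a value as the suffix allows. The names SIZE_BYTES and truncate are made up for this example, and the floating point suffixes are left out.

```python
# Size in bytes for each integer suffix from the table above.
SIZE_BYTES = {"b": 1, "w": 2, "l": 4, "q": 8}

def truncate(value: int, suffix: str) -> int:
    """Keep only the low bytes that an operand of this size can hold."""
    bits = 8 * SIZE_BYTES[suffix]
    return value & ((1 << bits) - 1)

value = 0x1122334455667788
for suffix in "bwlq":
    print(suffix, hex(truncate(value, suffix)))
# b 0x88
# w 0x7788
# l 0x55667788
# q 0x1122334455667788
```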

Memory manipulation using registers

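  • Here Rb is a base register, Ri is an index register, S is a scale factor (1, 2, 4 or 8) and D is a constant displacement.

  • (Rb, Ri) = MemoryLocation[Rb + Ri]

  • D(Rb, Ri) = MemoryLocation[Rb + Ri + D]

  • (Rb, Ri, S) = MemoryLocation[Rb + S * Ri]

  • D(Rb, Ri, S) = MemoryLocation[Rb + S * Ri + D]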
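The four addressing forms above are all special cases of one formula. A minimal Python sketch, assuming made-up register values and a hypothetical helper named effective_address:

```python
def effective_address(rb: int, ri: int = 0, s: int = 1, d: int = 0) -> int:
    """Address named by the operand D(Rb, Ri, S): Rb + S * Ri + D."""
    if s not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return rb + s * ri + d

rb, ri = 0x1000, 3  # example values, not taken from any real program
print(hex(effective_address(rb, ri)))            # (Rb, Ri)     -> 0x1003
print(hex(effective_address(rb, ri, d=8)))       # 8(Rb, Ri)    -> 0x100b
print(hex(effective_address(rb, ri, s=4)))       # (Rb, Ri, 4)  -> 0x100c
print(hex(effective_address(rb, ri, s=4, d=8)))  # 8(Rb, Ri, 4) -> 0x1014
```

The scaled form is what a compiler typically emits for array indexing, where S matches the element size.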

Important instructions

  • leaq source, destination: sets destination to the address denoted by the expression in source (Load Effective Address). It only computes the address; it does not read the memory contents at that address.

  • addq source, destination: destination = destination + source

  • subq source, destination: destination = destination - source

  • imulq source, destination: destination = destination * source

  • salq source, destination: destination = destination << source, where << is the left bit shift operator

  • sarq source, destination: destination = destination >> source, where >> is the arithmetic right bit shift operator (the sign bit is preserved)

  • xorq source, destination: destination = destination XOR source

  • andq source, destination: destination = destination & source

  • orq source, destination: destination = destination | source
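To see how these instructions behave on fixed-width 64-bit values, here is a small Python sketch. It is an approximation for illustration only: the function names mirror the mnemonics, operands are passed in AT&T order (source first), and condition flags are not modelled.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x: int) -> int:
    """Interpret a 64-bit pattern as a two's complement signed value."""
    x &= MASK64
    return x - (1 << 64) if x & (1 << 63) else x

def addq(src: int, dst: int) -> int:
    return (dst + src) & MASK64

def subq(src: int, dst: int) -> int:
    return (dst - src) & MASK64

def imulq(src: int, dst: int) -> int:
    return (dst * src) & MASK64

def salq(src: int, dst: int) -> int:
    return (dst << src) & MASK64

def sarq(src: int, dst: int) -> int:
    # Arithmetic shift: shift the signed interpretation so the sign is kept.
    return (to_signed64(dst) >> src) & MASK64

def leaq(rb: int, ri: int, s: int, d: int) -> int:
    # Only the address is computed; no memory is read.
    return (rb + s * ri + d) & MASK64

print(addq(5, 10))                  # 15
print(subq(5, 10))                  # 5
print(to_signed64(subq(1, 0)))      # -1 (wraps around to all ones)
print(salq(2, 3))                   # 12
print(to_signed64(sarq(1, subq(8, 0))))  # -4
print(hex(leaq(0x1000, 3, 4, 8)))   # 0x1014
```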


If statements

  • If statements use 3 important instructions in assembly:

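    • cmpq source2, source1: computes source1 - source2 and sets the condition flags, without storing the result anywhere

      • Example: cmpl var_4h, %eax (compares the value in %eax with var_4h)

    • testq source2, source1: computes source1 & source2 and sets the condition flags, without storing the result anywhere

    • Jump instructions: transfer control to another address, optionally depending on the flags set by a preceding cmp or test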

Types of Jumps

    Jump Type    Description
    jmp          Unconditional
    je           Equal/Zero
    jne          Not Equal/Not Zero
    js           Negative
    jns          Nonnegative
    jg           Greater
    jge          Greater or Equal
    jl           Less
    jle          Less or Equal
    ja           Above (unsigned)
    jb           Below (unsigned)

  • The last 2 rows of the table (ja and jb) apply to comparisons of unsigned integers; jg, jge, jl and jle are used for signed comparisons.

  • Unsigned integers cannot be negative, while signed integers represent both positive and negative values.

  • The same bit pattern can therefore mean two different numbers, and the processor needs different rules to interpret it.

  • For signed integers it uses the two's complement representation, and for unsigned integers it uses plain binary.
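The difference between the signed and unsigned jumps shows up when the top bit of a value is set. The Python sketch below is a simplified illustration with 32-bit values; as_signed is a hypothetical helper, and the flags that the real jumps inspect are replaced by direct comparisons.

```python
def as_signed(x: int, bits: int = 32) -> int:
    """Interpret a bit pattern as a two's complement signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

a = 0xFFFFFFFF  # all 32 bits set
b = 1

# Unsigned view: 4294967295 vs 1, so a jump like ja (a above b) would be taken.
unsigned_above = a > b

# Signed view: -1 vs 1, so a jump like jg (a greater than b) would not be taken.
signed_greater = as_signed(a) > as_signed(b)

print(as_signed(a))     # -1
print(unsigned_above)   # True
print(signed_greater)   # False
```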

Analyzing Jump

  • To set a breakpoint: db 0x<Address>

  • To run the program until breakpoint: dc

  • To view the value of the registers at breakpoint: dr

  • To view the value in the variable: px @location

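  • To step to the next instruction: ds

  • The popq instruction pops the value at the top of the stack into its destination operand; for example, popq %rbp restores the saved frame pointer.

  • The retq instruction pops the return address off the stack and loads it into the instruction pointer, so execution continues in the caller.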
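The end of a function (popq %rbp followed by retq) can be pictured with a toy stack. In this Python sketch the stack is a list whose last element is the top, and every address is invented for the example.

```python
# Toy stack: the last list element is the top of the stack.
# Layout (bottom to top): return address, saved %rbp. Values are made up.
stack = [0x401136, 0x7FFD0000]
regs = {"rbp": 0x7FFC0000, "rip": 0x401200}

# popq %rbp: pop the top value into %rbp (restores the caller's frame pointer)
regs["rbp"] = stack.pop()

# retq: pop the return address into the instruction pointer
regs["rip"] = stack.pop()

print(hex(regs["rbp"]))  # 0x7ffd0000
print(hex(regs["rip"]))  # 0x401136
```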

Analysing if2

Tracing the assembly code

  • Initially,

    • var_ch will have 0

    • var_4h will have 1000

    • var_8h will have 99

  • The first jge is not taken (var_ch >= var_8h? 0 >= 99 -> FALSE)

  • The second jge is not taken either (var_8h >= var_4h? 99 >= 1000 -> FALSE)

  • Perform an AND operation on var_8h with 0x64 (decimal 100)

  • Perform an unconditional jmp

  • Subtract 0x3e7 (decimal 999) from var_4h

  • Clear eax

  • Pop rbp and return
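The trace above can be replayed in a few lines of Python. This is a sketch of the path described in these notes, not a decompilation of the binary: only the branch that is actually followed is modelled, and the comparisons are written the way the trace states them.

```python
# Initial values from the trace.
var_ch, var_8h, var_4h = 0, 99, 1000

# Both jge checks fail, so execution falls through to the and instruction.
first_jge_taken = var_ch >= var_8h   # 0 >= 99
second_jge_taken = var_8h >= var_4h  # 99 >= 1000
assert not first_jge_taken and not second_jge_taken

var_8h &= 0x64   # 99 & 100
var_4h -= 0x3E7  # 1000 - 999
eax = 0          # eax is cleared before returning

print(var_ch, var_8h, var_4h)  # 0 96 1
```

The printed values match the answers to the questions that follow.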

What is the value of var_8h before the popq and ret instructions?

  • Answer: 96

  • Steps to Reproduce:

    • var_8h = 99 & 0x64 = 0x60; Hex 60 -> Decimal 96

What is the value of var_ch before the popq and ret instructions?

  • Answer: 0

  • Steps to Reproduce:

    • var_ch is initialised to 0 and none of the traced instructions write to it.

What is the value of var_4h before the popq and ret instructions?

  • Answer: 1

  • Steps to Reproduce:

    • var_4h = 1000 - 0x3e7 = 1000 - 999 = 1

What operator is used to change the value of var_8h, input the symbol as your answer(symbols include +, -, *, /, &, |):

  • Answer: &

  • Steps to Reproduce:

    • The and instruction applied to var_8h (with the constant 0x64) in the disassembly shows that an AND operation is performed.


Loops

  • A quicker way to examine the loop is to set a breakpoint on the cmpl instruction and run dc repeatedly. Because the cmpl instruction checks the loop condition before the loop body is executed, the program breaks at it once per iteration.

Analyzing loop2

What is the value of var_8h on the second iteration of the loop?

  • Answer: 5

  • Steps to Reproduce:

What is the value of var_ch on the second iteration of the loop?

  • Answer: 0

  • Steps to Reproduce:

What is the value of var_8h at the end of the program?

  • Answer: 2

  • Steps to Reproduce:

What is the value of var_ch at the end of the program?

  • Answer: 0

  • Steps to Reproduce:


Crackme 1

Analyzing crackme1

What is the password?

  • Answer: 127.0.0.1

  • Steps to Reproduce:


Crackme 2

Analyzing crackme2

What is the correct password?

  • Answer: dwperuc3sv

  • Steps to Reproduce:

  • Read secret.txt to obtain the stored secret.

  • The password is checked in reverse order, so reverse the secret to get the password.
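Reversing a string is a one-liner in Python. The contents of secret.txt are not reproduced in these notes, so the sketch below starts from the recorded answer and derives what the secret would have to be if the check is a plain character reversal (an assumption).

```python
password = "dwperuc3sv"   # the answer recorded above

# Assumption: the stored secret is simply the password reversed.
secret = password[::-1]
print(secret)             # vs3curepwd

# Reversing the secret gives the password back.
assert secret[::-1] == password
```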

