The first program everybody writes is “hello world!”. Here I use a ‘for’ loop to make my computer display hello twice. Looking at the code, it is easy to figure out what it does.
But that is not exactly how hackers look at it. There is some magic happening there.
What do computers really do? They obediently follow a program (a recipe defined step by step), the same way we humans follow a master’s orders.
The difference is that they can do the same thing over and over again (loops) without breaking a sweat, getting bored, or complaining, because they are machines. Dumb without a program, but boy are they fast once you give them one to run!
But how do they do that? What’s under the hood?
#include <stdio.h>

int main(void) {
    int i;

    for (i = 0; i < 2; ++i)
        printf("Hello, world!\n");

    return 0;
}
The source code (like the one above) is a set of instructions that must be translated (compiled) into the machine language that the CPU understands. At that level every instruction is a string of ones and zeros; together they form something called machine code. The CPU decodes these bit patterns as instructions and executes them, which means operating on data.
Here is what my first program does.
A look into Computer Guts
Compile and run the code. See what it does.
pi@tron:~/dh $ gcc -g -o hello hello.c
pi@tron:~/dh $
pi@tron:~/dh $ ./hello
Hello, world!
Hello, world!
pi@tron:~/dh $
Inside the computer the code operates at a different level.
My first GDB lesson comes courtesy of beej.us and a few other places on the net.
My first program up close looks like this:
$ objdump -M intel -D hello | grep -A20 main.:
000000000040052d <main>:
40052d: 55 push rbp
40052e: 48 89 e5 mov rbp,rsp
400531: 48 83 ec 10 sub rsp,0x10
400535: c7 45 fc 00 00 00 00 mov DWORD PTR [rbp-0x4],0x0
40053c: eb 0e jmp 40054c
40053e: bf e4 05 40 00 mov edi,0x4005e4
400543: e8 c8 fe ff ff call 400410
400548: 83 45 fc 01 add DWORD PTR [rbp-0x4],0x1
40054c: 83 7d fc 01 cmp DWORD PTR [rbp-0x4],0x1
400550: 7e ec jle 40053e
400552: b8 00 00 00 00 mov eax,0x0
400557: c9 leave
400558: c3 ret
400559: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
0000000000400560 :
400560: 41 57 push r15
400562: 41 89 ff mov r15d,edi
400565: 41 56 push r14
400567: 49 89 f6 mov r14,rsi
I want gdb to disassemble code using Intel syntax rather than the default AT&T. Here’s the file that takes care of it (gdb accepts “set disassembly” as an abbreviation of “set disassembly-flavor”):
$ cat ~/.gdbinit
set disassembly intel
$ gdb -q hello
Reading symbols from hello...done.
(gdb) break main
Breakpoint 1 at 0x400535: file hello.c, line 6.
(gdb) run
Starting program: /home/jaro/Desktop/Jun-Dec-2016/14.Code/aoh/hello
Breakpoint 1, main () at hello.c:6
6 for(i = 0; i < 2; ++i)
(gdb) info registers
rax 0x40052d 4195629
rbx 0x0 0
rcx 0x0 0
rdx 0x7fffffffe178 140737488347512
rsi 0x7fffffffe168 140737488347496
rdi 0x1 1
rbp 0x7fffffffe080 0x7fffffffe080
rsp 0x7fffffffe070 0x7fffffffe070
r8 0x7ffff7dd4e80 140737351863936
r9 0x7ffff7dea530 140737351951664
r10 0x7fffffffdf10 140737488346896
r11 0x7ffff7a36e50 140737348070992
r12 0x400440 4195392
r13 0x7fffffffe160 140737488347488
r14 0x0 0
r15 0x0 0
rip 0x400535 0x400535
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb)
(gdb) list
1 #include <stdio.h>
2
3 int main(void) {
4
5 int i;
6 for(i = 0; i < 2; ++i)
7 printf("hello, world.\n");
8
9 return 0;
10 }
(gdb)
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
=> 0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
(gdb) info register rip
rip 0x400535 0x400535
(gdb)
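The instruction about to run stores the value 0 in the 4-byte (DWORD) memory slot at address rbp minus 4. That slot holds the variable i, so this is the i = 0 of the for loop.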
=> 0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
(gdb) nexti
0x000000000040053c 6 for(i = 0; i < 2; ++i)
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
=> 0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
=> 0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
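Let’s look at the CPU flags after the comparison has executed. cmp subtracts 1 from i (still 0 here) and keeps only the flags: the result is -1, so SF (sign) is set while OF (overflow) and ZF (zero) are not. jle jumps when ZF is set or when SF differs from OF, so the jump back into the loop body is taken.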
(gdb) nexti
0x0000000000400550 6 for(i = 0; i < 2; ++i)
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
=> 0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb) info register eflags
eflags 0x297 [ CF PF AF SF IF ]
(gdb)
(gdb) nexti
7 printf("hello, world.\n");
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
=> 0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
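Let’s read what is at location 0x4005e4, the address the code loads into edi. The gdb command x/12x means: examine 12 units of memory in hex, starting at the given address (here: 0x4005e4). The units shown here are single bytes (x/12xb asks for that explicitly), and read as ASCII they spell “hello, world”.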
(gdb) x/12x 0x4005e4
0x4005e4: 0x68 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x77
0x4005ec: 0x6f 0x72 0x6c 0x64
(gdb)
(gdb) nexti
0x0000000000400543 7 printf("hello, world.\n");
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
=> 0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
The next nexti steps over the call, which prints the message, and stops at the add that increments the variable i for the next pass of the for loop.
(gdb) nexti
hello, world.
6 for(i = 0; i < 2; ++i)
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
=> 0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
On the last pass i is 2. The cmp now computes 2 - 1 = 1, which sets none of ZF, SF or OF, so jle is not taken and execution falls through to return 0.
=> 0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb) info register eflags
eflags 0x202 [ IF ]
(gdb)
(gdb) nexti
9 return 0;
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
=> 0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
(gdb) nexti
10 }
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
=> 0x0000000000400557 <+42>: leave
0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb) nexti
0x0000000000400558 10 }
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x10
0x0000000000400535 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x000000000040053c <+15>: jmp 0x40054c
0x000000000040053e <+17>: mov edi,0x4005e4
0x0000000000400543 <+22>: call 0x400410
0x0000000000400548 <+27>: add DWORD PTR [rbp-0x4],0x1
0x000000000040054c <+31>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000400550 <+35>: jle 0x40053e
0x0000000000400552 <+37>: mov eax,0x0
0x0000000000400557 <+42>: leave
=> 0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)