Mastering Reverse Engineering

Registers

In programming, processing data requires variables. You can think of registers as the variables of assembly language. However, registers are not all interchangeable plain variables: each register has a designated purpose. Registers fall into one of the following categories:

  • General purpose registers
  • Segment registers
  • Flag registers
  • Instruction pointers

In the original 16-bit x86 architecture, each general purpose register has a designated purpose and is one WORD (16 bits) wide. The general purpose registers are as follows:

  • Accumulator (AX)
  • Counter (CX)
  • Data (DX)
  • Base (BX)
  • Stack pointer (SP)
  • Base pointer (BP)
  • Source index (SI)
  • Destination index (DI)

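For registers AX, BX, CX, and DX, the least and most significant bytes can be accessed through smaller registers. For AX, the lower 8 bits can be read using the AL register, while the upper 8 bits can be read using the AH register. For example, if AX holds 0x1234, then AH holds 0x12 and AL holds 0x34. BX, CX, and DX split the same way into BH/BL, CH/CL, and DH/DL.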
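The relationship between AX, AH, and AL can be sketched with ordinary bit operations. This is only an illustrative model: the register is represented as a plain Python integer, and the helper names (low_byte, high_byte, set_low_byte) are hypothetical, chosen for this example.

```python
# Illustrative model only: a 16-bit register stored as a plain integer.

def low_byte(reg16):
    """Return the lower 8 bits (the part AL exposes for AX)."""
    return reg16 & 0xFF

def high_byte(reg16):
    """Return the upper 8 bits (the part AH exposes for AX)."""
    return (reg16 >> 8) & 0xFF

def set_low_byte(reg16, value):
    """Write the lower 8 bits, leaving the upper 8 bits unchanged."""
    return (reg16 & 0xFF00) | (value & 0xFF)

ax = 0x1234
al = low_byte(ax)             # 0x34
ah = high_byte(ax)            # 0x12
ax = set_low_byte(ax, 0xFF)   # AX becomes 0x12FF; AH is untouched
```

Writing to the low byte changes only part of the full register, which is why AL and AH behave as views into AX rather than as separate storage.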

When running code, the system needs to know where the code is located. The Instruction Pointer (IP) register holds the memory address of the next assembly instruction to be executed.
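To make the role of IP concrete, here is a minimal sketch of a toy machine, not real x86. The addresses, mnemonics, and instruction lengths in the program table are invented for illustration. The only point is that, after each fetch, IP already holds the address of the next instruction.

```python
# Toy model, not real x86: maps an address to (mnemonic, length in bytes).
# All addresses and lengths here are hypothetical.
program = {
    0x100: ("mov", 3),
    0x103: ("add", 2),
    0x105: ("nop", 1),
}

def trace(program, start, steps):
    """Return (address, mnemonic, IP after fetch) for each executed step."""
    ip = start
    history = []
    for _ in range(steps):
        mnemonic, length = program[ip]
        next_ip = ip + length      # IP advances past the fetched instruction
        history.append((ip, mnemonic, next_ip))
        ip = next_ip
    return history

history = trace(program, 0x100, 3)
```

A jump or call instruction would simply overwrite ip with a different address instead of adding the instruction length.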

System states and logical results of executed code are stored in the FLAGS register. Every bit of the FLAGS register has its own purpose, with some of the definitions given in the following table:

All of these flags have a purpose, but the ones most often monitored and used are the carry, sign, zero, overflow, and parity flags.
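As a rough sketch, the five flags named above can be decoded from a FLAGS value using their bit positions (carry at bit 0, parity at bit 2, zero at bit 6, sign at bit 7, overflow at bit 11). The decode_flags helper and the sample value 0x0246 are assumptions made for this example.

```python
# Bit masks for commonly monitored status flags in the x86 FLAGS register.
CF = 1 << 0    # carry
PF = 1 << 2    # parity
ZF = 1 << 6    # zero
SF = 1 << 7    # sign
OF = 1 << 11   # overflow

FLAG_NAMES = {CF: "CF", PF: "PF", ZF: "ZF", SF: "SF", OF: "OF"}

def decode_flags(flags):
    """Return the names of the monitored flags that are set in a FLAGS value."""
    return [name for mask, name in FLAG_NAMES.items() if flags & mask]

# 0x0246 is a hypothetical FLAGS value chosen for illustration.
set_flags = decode_flags(0x0246)   # ["PF", "ZF"]
```

Bits outside these five masks are ignored by this helper, so the other bits set in the sample value do not appear in the result.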

All of these registers have an "extended" 32-bit form, which is accessed with an "E" prefix (EAX, EBX, ECX, EDX, ESP, EIP, and EFLAGS). The same goes for 64-bit mode, where the registers are accessed with an "R" prefix (RAX, RBX, RCX, RDX, RSP, and RIP).
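The nesting of the 64-bit, 32-bit, 16-bit, and 8-bit names can be pictured as progressively narrower views of the same value. The sketch below models RAX as a Python integer. The register_views helper is hypothetical and only covers reading, not the rules for writing to partial registers.

```python
def register_views(rax):
    """Return the narrower views of a 64-bit RAX value (read-only model)."""
    return {
        "RAX": rax & 0xFFFFFFFFFFFFFFFF,  # full 64 bits
        "EAX": rax & 0xFFFFFFFF,          # lower 32 bits
        "AX":  rax & 0xFFFF,              # lower 16 bits
        "AH":  (rax >> 8) & 0xFF,         # bits 8 to 15
        "AL":  rax & 0xFF,                # lower 8 bits
    }

views = register_views(0x1122334455667788)
# EAX = 0x55667788, AX = 0x7788, AH = 0x77, AL = 0x88
```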

The memory is divided into sections such as the code segment, stack segment, data segment, and other sections. The segment registers are used to identify the starting location of these sections, as follows:

  • Stack segment (SS)
  • Code segment (CS)
  • Data segment (DS)
  • Extra segment (ES)
  • F segment (FS)
  • G segment (GS)

When a program loads, the operating system maps the executable file into memory. The executable file contains information about which data maps to which segment. The code segment contains the executable code. The data segment contains data bytes such as constants, strings, and global variables. The stack segment is allocated to hold runtime function variables and other processed data. The extra segment is similar to the data segment, but this space is commonly used to move data between variables. Some 16-bit operating systems, such as DOS, make use of SS, CS, DS, and ES because each segment can address only 64 kilobytes. In modern operating systems (32-bit systems and higher), however, these four segments are set to the same memory space, while FS and GS are used to locate process and thread information.
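The 64-kilobyte figure follows from the 16-bit offsets used with each segment register. As a sketch, assuming the real-mode addressing scheme that DOS runs under (the segment value shifted left by 4 bits, plus a 16-bit offset), the address calculation looks like this. The function name and the sample segment value are chosen for illustration.

```python
SEGMENT_SIZE = 1 << 16   # a 16-bit offset can reach 65,536 bytes (64 KB)

def real_mode_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a linear address."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

# Hypothetical values: a data segment register holding 0x1234.
first_byte = real_mode_address(0x1234, 0x0000)   # 0x12340
last_byte = real_mode_address(0x1234, 0xFFFF)    # 0x2233F
span = last_byte - first_byte + 1                # 65536 bytes
```

With a flat memory model, as used by the modern operating systems mentioned above, the segment bases coincide and this calculation is no longer something a program normally performs itself.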