OS-dev 101: Bootloader

The PC’s Physical Address Space:

+------------------+  <- 0xFFFFFFFF (4GB)
|      32-bit      |
|  memory mapped   |
|     devices      |
|                  |
/\/\/\/\/\/\/\/\/\/\
/\/\/\/\/\/\/\/\/\/\
|                  |
|      Unused      |
|                  |
+------------------+  <- depends on amount of RAM
|                  |
|                  |
| Extended Memory  |
|                  |
|                  |
+------------------+  <- 0x00100000 (1MB)
|     BIOS ROM     |
+------------------+  <- 0x000F0000 (960KB)
|  16-bit devices, |
|  expansion ROMs  |
+------------------+  <- 0x000C0000 (768KB)
|   VGA Display    |
+------------------+  <- 0x000A0000 (640KB)
|                  |
|    Low Memory    |
|                  |
+------------------+  <- 0x00000000

The first PCs, which were based on the 16-bit Intel 8088 processor, were only capable of addressing 1MB of physical memory.
The most important part of the reserved area between 0x000A0000 and 0x000FFFFF is the Basic Input/Output System (BIOS), which occupies the 64KB region from 0x000F0000 through 0x000FFFFF.
--------------------------------------------------------------
Intel CPUs start in REAL MODE with a 1MB address space. The CPU is hard-wired to start with CS = 0xF000 and IP = 0xFFF0.
real mode physical address = 16 * segment + offset, so the first instruction is fetched from 16 * 0xF000 + 0xFFF0 = 0xFFFF0.
0xFFFF0 is 16 bytes before the end of the BIOS (0x100000).
Since the top of the 1MB address space is mapped to the BIOS ROM, the CPU starts executing BIOS code. This design ensures that the BIOS always gets control of the machine first after power-up.
When the BIOS runs, it sets up an interrupt descriptor table and initializes various devices such as the VGA display. After initializing the PCI bus and all the important devices the BIOS knows about, it searches for a bootable device such as a floppy, hard drive, or CD-ROM. Eventually, when it finds a bootable disk, the BIOS reads the boot loader from the disk and transfers control to it.
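The real-mode address rule above can be checked with a few lines of Python. This is only a sketch: the function name real_mode_physical is made up for illustration, and the 20-bit mask assumes the 1MB wrap-around of the original 8088.

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a real-mode segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left by 4 is the same as multiplying by 16.
    # The mask models the 20 address lines of the 8088 (wrap at 1MB).
    return ((segment << 4) + offset) & 0xFFFFF


reset_vector = real_mode_physical(0xF000, 0xFFF0)
print(hex(reset_vector))        # 0xffff0
print(0x100000 - reset_vector)  # 16 (bytes left before the 1MB boundary)
```

Note that different segment:offset pairs can name the same byte: 0x07C0:0000 and 0x0000:7C00 both give 0x7C00.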
--------------------------------------------------------------
Hard disks for PCs are divided into 512-byte regions called sectors. A sector is the disk's minimum transfer granularity: each read or write operation must be one or more sectors in size and aligned on a sector boundary. If the disk is bootable, the first sector is called the boot sector, since this is where the boot loader code resides. When the BIOS finds a bootable floppy or hard disk, it loads the 512-byte boot sector into memory at physical addresses 0x7c00 through 0x7dff, and then uses a jmp instruction to set the CS:IP to 0000:7c00, passing control to the boot loader. Like the BIOS load address, these addresses are fairly arbitrary, but they are fixed and standardized for PCs.
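A rough Python sketch of the sector arithmetic described above. The helper name sectors_for_range is hypothetical, and it counts sectors from 0 (an assumption made here for simplicity; BIOS CHS numbering starts at sector 1).

```python
SECTOR_SIZE = 512        # bytes per sector, as stated above
BOOT_LOAD_ADDR = 0x7C00  # where the BIOS places the boot sector


def sectors_for_range(byte_offset: int, length: int) -> tuple:
    """Return (first_sector, sector_count) needed to cover a byte range.

    Sectors are counted from 0 here (an assumption of this sketch).
    """
    if byte_offset < 0 or length <= 0:
        raise ValueError("byte_offset must be >= 0 and length must be > 0")
    first = byte_offset // SECTOR_SIZE
    last = (byte_offset + length - 1) // SECTOR_SIZE
    return first, last - first + 1


# The boot sector occupies exactly one sector in memory.
boot_end = BOOT_LOAD_ADDR + SECTOR_SIZE - 1
print(hex(boot_end))               # 0x7dff
# 24 bytes starting at byte 500 straddle a sector boundary,
# so the disk has to transfer two whole sectors.
print(sectors_for_range(500, 24))  # (0, 2)
```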

summary:
The power supply starts the clock generator and asserts #POWERGOOD signal on the bus.
The CPU #RESET line is asserted (the CPU is now in real 8086 mode).
%ds=%es=%fs=%gs=%ss=0, %cs=0xF000 (on a 386 or later the hidden segment base is 0xFFFF0000), %eip=0x0000FFF0 (ROM BIOS POST code).
All POST checks are performed with interrupts disabled.
IVT (Interrupt Vector Table) initialised at address 0.
The BIOS Bootstrap Loader function is invoked via int 0x19, with %dl containing the boot device 'drive number'. This loads track 0, sector 1 at physical address 0x7C00 (0x07C0:0000).

                             Memory
                             +------------------+ <- 0x0000fa00 
               +-----------  | [     BOOT1    ] | <- 0x00007e00
               |             +------------------+
          +----(--(BIOS)---  | [     BOOT0    ] | <- 0x00007c00
          |    |             +------------------+
          |    |             | [     stack    ] |
          |    |             |                  |
          |    |             +------------------+ <- 0x00000500
          |  /------------\
        +---+---+---\  /---+---\   /---+
  Disk  |MBR|   |   /  \   |   /   \   |
        |   |   |   \  /   |   \   /   |
        +---+---+---/  \---+---/   \---+
Sector    1   2   3      63  64 ...

sources:
http://www.tldp.org/LDP/lki/lki-1.html
flint.cs.yale.edu/cs422/assignments/as1.html

Segment registers

GDT stands for Global Descriptor Table.
The important word here is table, and by table Intel means array.

Being an array, it has elements; each element is called a descriptor. Each element can be indexed, i.e. it has a unique index.

The segment selector registers hold the index of a descriptor. The value in a segment selector register is called a selector.

However, things are a bit more elaborate.
An index is not a selector.

Besides the index, a segment selector register holds two more things:

  1. The privilege level that the programmer wants to use for accessing the descriptor. This is called RPL (Requested Privilege Level) for every register except CS, where it is called CPL (Current Privilege Level). The role RPL and CPL play in using the segment selector register is defined in the Intel manuals and is too long to explain here.
  2. The table to look in. One table is the GDT, the other is the LDT. Again, the differences and uses are described in the Intel manuals.

Whenever a program is loaded, the linking loader loads the “Segment Registers” with the appropriate selectors.
A Segment Register (e.g. CS, DS, SS, etc.) is divided into two parts: Visible and Hidden.
It is the visible part which is loaded by the loader with the appropriate value.
This value is a selector; its index field picks an entry in the GDT or the LDT, depending on the T flag of the selector.
The processor loads the hidden part by itself. The hidden part holds the segment base address in the linear address space, the segment limit, and access information.

So the informal rule is:

selector = index + table_to_use + privilege
table_to_use + index = descriptor = all the information about the segment of memory to be used

where, of course, + does not mean arithmetic plus at all. The actual bit field for a segment selector register is

15                                                 3    2        0
+--------------------------------------------------+----+--------+
| Index                                            | T  |   RPL  |
+--------------------------------------------------+----+--------+

T = Table Indicator:     0 = GDT, 1 = LDT


So for example the value 05h (binary 101) selects the descriptor with index 0 in the LDT, using RPL=1. The values 00h-03h select index 0 in the GDT; this is the null selector, since Intel reserves descriptor 0 of the GDT and any memory access through it faults.
The first usable GDT descriptor is accessible with the selector 08h, which selects the GDT as the table with RPL=0. The values 08h-0bh all select the descriptor with index 1 in the GDT, just with different RPLs.
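The bit field above is easy to take apart in code. A small Python sketch follows; the helper names decode_selector and encode_selector are made up for illustration.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into index, table and RPL."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return {
        "index": selector >> 3,                            # bits 15..3
        "table": "LDT" if (selector >> 2) & 1 else "GDT",  # bit 2 (T)
        "rpl": selector & 0b11,                            # bits 1..0
    }


def encode_selector(index: int, use_ldt: bool, rpl: int) -> int:
    """Build a selector from its three fields."""
    if not (0 <= index < 8192 and 0 <= rpl <= 3):
        raise ValueError("index is 13 bits, rpl is 2 bits")
    return (index << 3) | (int(use_ldt) << 2) | rpl


print(decode_selector(0x05))  # {'index': 0, 'table': 'LDT', 'rpl': 1}
print(decode_selector(0x0B))  # {'index': 1, 'table': 'GDT', 'rpl': 3}
```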

SUMMARY:
linear_address = LDT[index].base + offset   (T = 1)
linear_address = GDT[index].base + offset   (T = 0)

Here’s an ASCII-art summary of the terminology:

<---- Selector ---->           Segment Selector Registers 
+--------------+-+--+          
| Index        |T|PL| = DS  
+--------------+-+--+
  (13-bits)     X XX --(2-bits)-> lower number higher privilege
   |            |
   |        0---+----------------------1
   |        |          GDTR            |          LDTR
   |        |   +-----------------+    |    +-----------------+
   |        +-->|GDT base-address |    +--->|LDT base-address |
   |            +-----------------+         +-----------------+
  index         |    base + 1     |         |    base + 1     |
   |            +-----------------+         +-----------------+
   |            |                 |         |                 |
   |            ...   ...   ... ...         ...   ...   ... ...
   |            |                 |         |                 |
   |            +-----------------+         +-----------------+
   +----------->|  base + index   |         |  base + index   |
                +-----------------+         +.................+
                |                 |         |                 |
                ...  ...  ...   ...         ...   ... ...   ...
Index = entry number within the Descriptor Table (byte offset = index * 8)
T     = table indicator: 0=GDT, 1=LDT
PL    = Requested Privilege Level (RPL) *
* The RPL is compared against the privilege level encoded inside the descriptor (Descriptor Privilege Level - DPL).
If either CPL or RPL is numerically greater than the DPL of a data segment, the request causes a general-protection fault.
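The lookup in the diagram can be sketched in Python as follows. Assumptions of this sketch: code and data segment descriptors are 8 bytes each, the table base 0x1000 is a made-up value, and the privilege rule shown is the one for data segments only (code and stack segments follow different rules, see the Intel manuals).

```python
DESCRIPTOR_SIZE = 8  # bytes per code/data segment descriptor


def descriptor_address(table_base: int, selector: int) -> int:
    """Linear address of the descriptor that a selector points at."""
    index = selector >> 3  # drop the T and RPL bits
    return table_base + index * DESCRIPTOR_SIZE


def data_access_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """Data-segment check: the weaker of CPL and RPL must not exceed DPL.

    Lower numbers mean higher privilege, so 'weaker' is max().
    """
    return max(cpl, rpl) <= dpl


gdt_base = 0x1000  # hypothetical value held in GDTR
print(hex(descriptor_address(gdt_base, 0x08)))    # 0x1008
print(data_access_allowed(cpl=3, rpl=3, dpl=0))   # False
```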

About how it is calculated and why we need it

Short answer: Read the Intel manual as it is the complete reference.

Long answer: We, user-mode programmers, don’t calculate it. All this business with segments boils down to limiting the privileges of a program, and it is the OS that is in command, not us. The GDT and the LDT are set up by the OS, which is not willing to cooperate with us on any aspect of them, so we simply use the values that the OS gives us (basically, by loading our program).

We need segments because Real mode had segments (Google for more info), because they can avoid relocation and provide isolation in the absence of paging (Google for more info), and because segments now carry more information than just a base address and a limit. One example: the Descriptor Privilege Level, which limits the privileges of a user-mode program.

x86-64 Cheat sheet

 

INSTRUCTION FORMAT:
opcode    destination-operand,  source-operand

ADDRESSING MODES: 
opcode    register, register
opcode    register, immediate
opcode    register, memory
opcode    memory,   register
opcode    memory,   immediate

(memory to memory is illegal) 

MOV INSTRUCTION:
MOV   DEST,   SRC;      copy SRC into DEST
MOV   DEST,   [SRC];    copy value at memory address SRC into DEST


LEA INSTRUCTION:
LEA   DEST,   [SRC];    compute address in SRC and copy into DEST

MOV vs LEA:
MOV   eax,   var      == lea eax, [var] ; i.e. mov r32, imm32
LEA   eax,   [var+16] == mov eax, var+16
LEA   eax,   [eax*4]  == shl eax, 2 ; but without setting flags

MOV EDX, [EBX + 8*EAX + 4] ; EDX = dword stored at address EBX + 8*EAX + 4
LEA ESI, [EBX + 8*EAX + 4] ; ESI = the address EBX + 8*EAX + 4 itself
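To make the MOV/LEA difference concrete, here is a toy Python model. It is only a sketch: memory is a plain dict, and the register values and memory contents are invented for the example.

```python
def effective_address(base: int, index: int, scale: int = 1, disp: int = 0) -> int:
    """Compute base + scale*index + disp, truncated to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF


# Hypothetical machine state for the two instructions above.
memory = {0x1014: 0xDEADBEEF}  # one dword stored at address 0x1014
ebx, eax = 0x1000, 2

addr = effective_address(ebx, eax, scale=8, disp=4)  # 0x1000 + 16 + 4

esi = addr          # LEA ESI, [EBX + 8*EAX + 4]  -> the address itself
edx = memory[addr]  # MOV EDX, [EBX + 8*EAX + 4]  -> the value stored there

print(hex(esi))  # 0x1014
print(hex(edx))  # 0xdeadbeef
```

LEA never touches memory, which is why it can double as cheap arithmetic (e.g. lea eax, [eax*4]).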

PUSH INSTRUCTION:
1- Decrement the ESP register by the size of the pushed value.
2- Store the pushed value at the new address in the ESP register.
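A toy Python model of a 32-bit PUSH, to show the order of operations (ESP is decremented first, then the value is stored at the new ESP). The stack here is just a bytearray and push32 is a made-up helper name.

```python
import struct


def push32(stack: bytearray, esp: int, value: int) -> int:
    """Simulate PUSH of a dword; returns the new ESP."""
    esp -= 4                                                 # 1- decrement ESP
    struct.pack_into("<I", stack, esp, value & 0xFFFFFFFF)   # 2- store at [ESP]
    return esp


stack = bytearray(16)  # hypothetical 16-byte stack area
esp = len(stack)       # the stack grows downwards from the top

esp = push32(stack, esp, 0x11223344)
print(esp)                     # 12
print(stack[esp:esp + 4].hex())  # 44332211 (little-endian in memory)
```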

ARRAY ADDRESSING:

Let's assume that ebx is the base register and esi is the index register of the element. 4 is the scaling factor for dword array. 

to read the value from array into eax:  
MOV     eax,         [ebx+4*esi]  
to store the value of eax into array: 
MOV    [ebx+4*esi],  eax 

The value of index register can be optionally scaled with 2, 4 or 8. In this example we can use a scaling factor 8 for a struct array in which each struct consists of 2 dwords. (in 386 legal scaling factors are 1, 2, 4 and 8).

to read the value of 2nd dword from array into eax:
MOV     eax,         [ebx+8*esi+4]
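The same indexing can be tried out in Python with the struct module. This is a sketch only: the byte buffers stand in for memory starting at ebx, and read_dword is a hypothetical helper.

```python
import struct


def read_dword(buf: bytes, base: int, index: int, scale: int, disp: int = 0) -> int:
    """Read the little-endian dword at base + scale*index + disp."""
    return struct.unpack_from("<I", buf, base + scale * index + disp)[0]


# A dword array: element size 4, so the scaling factor is 4.
dwords = struct.pack("<4I", 10, 20, 30, 40)
print(read_dword(dwords, 0, 2, 4))    # 30  (like mov eax, [ebx+4*esi], esi=2)

# An array of structs made of 2 dwords each: element size 8.
pairs = struct.pack("<6I", 1, 2, 3, 4, 5, 6)
print(read_dword(pairs, 0, 1, 8, 4))  # 4   (like mov eax, [ebx+8*esi+4], esi=1)
```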

DATATYPES:

BINARY:
0001 0010 00110100010101100111100010011010101111001101111011110001
==== ==== ========------------------------------------------------
   |    |        |                |                              |
   4    8        16              32                             64 
   N    B        W               DW                             QW

HEX:    0123 4567 89AB CDEF
qword   ==== ==== ==== ====
dword   ==== ====
word    ====
byte    ==
nibble  = 
                                                              bit
                                                nibble   =  4 bits
                                   byte    =  2 nibbles  =  8 bits
                       WORD    = 2 bytes   =  4 nibbles  = 16 bits
           DWORD   = 2 WORDs   = 4 bytes   =  8 nibbles  = 32 bits
QWORD  = 2 DWORDs  = 4 WORDs   = 8 bytes   = 16 nibbles  = 64 bits
The low byte (bits 0 through 7) of each data type occupies the lowest address in memory, and that address is also the address of the operand.
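This byte order (little-endian) can be seen directly in Python. A quick sketch; low_byte_first is just an illustrative name.

```python
def low_byte_first(value: int, size: int) -> bytes:
    """Return the bytes of value in the order they sit in x86 memory."""
    return value.to_bytes(size, "little")


raw = low_byte_first(0x12345678, 4)
print(raw.hex())     # 78563412
print(hex(raw[0]))   # 0x78, the low byte sits at the lowest address

# Reading the bytes back in the same order recovers the value.
print(hex(int.from_bytes(raw, "little")))  # 0x12345678
```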

64-BIT OPERATIONS:

0x1122334455667788
  ================  rax (64 bits)
          ========  eax (32 bits)
              ====  ax  (16 bits)
              ==    ah   (8 bits)
                ==  al  (8 bits)

mov eax, 0x11112222 ; eax = 0x11112222
mov ax,  0x3333     ; eax = 0x11113333 (low 16 bits changed)
mov al,  0x44       ; eax = 0x11113344 (low 8 bits changed)
mov ah,  0x55       ; eax = 0x11115544 (bits 8-15 changed)
xor ah,  ah         ; eax = 0x11110044 (bits 8-15 cleared)
mov eax, 0x11112222 ; eax = 0x11112222
xor al,  al         ; eax = 0x11112200 (low 8 bits cleared)
mov eax, 0x11112222 ; eax = 0x11112222
xor ax, ax          ; eax = 0x11110000 (low 16 bits cleared)

mov rax, 0x1111222233334444 ; rax = 0x1111222233334444
mov eax, 0x55556666         ; rax = 0x0000000055556666
                            ; (not 0x1111222255556666: in 64-bit mode a write
                            ; to a 32-bit register zeroes the upper 32 bits)
mov rax, 0x1111222233334444 ; rax = 0x1111222233334444
mov ax, 0x7777              ; rax = 0x1111222233337777 (upper 48 bits preserved)
mov rax, 0x1111222233334444 ; rax = 0x1111222233334444
xor eax, eax                ; rax = 0x0000000000000000
                            ; (not 0x1111222200000000: upper 32 bits are zeroed)
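The sub-register rules above can be modelled with plain bit masks. A Python sketch; the write_* helpers are invented names, each taking the old rax value and returning the new one.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def write_al(rax: int, value: int) -> int:
    """8-bit write to al: only bits 0-7 change."""
    return (rax & ~0xFF & MASK64) | (value & 0xFF)


def write_ah(rax: int, value: int) -> int:
    """8-bit write to ah: only bits 8-15 change."""
    return (rax & ~0xFF00 & MASK64) | ((value & 0xFF) << 8)


def write_ax(rax: int, value: int) -> int:
    """16-bit write to ax: only bits 0-15 change."""
    return (rax & ~0xFFFF & MASK64) | (value & 0xFFFF)


def write_eax(rax: int, value: int) -> int:
    """32-bit write to eax: the result is zero-extended into all of rax."""
    return value & 0xFFFFFFFF


rax = 0x1111222233334444
print(hex(write_ax(rax, 0x7777)))       # 0x1111222233337777
print(hex(write_eax(rax, 0x55556666)))  # 0x55556666
```

Only the 32-bit write discards the old upper bits; 8-bit and 16-bit writes merge into the existing register value.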