I’ve occasionally wondered how a computer gets from its power-on state to running an operating system. Today I decided to look at a small piece of that puzzle by examining the bootloader of an embedded x86 system I hope to reverse engineer at some point. This particular machine runs the pSOS real-time operating system on a 486. Cursory examination and googling didn’t tell me where the firmware is actually loaded into memory, so I opted to break out IDA Pro and take a look at what goes on.
I don’t have access to the BIOS, unfortunately, so I was not able to examine that piece of the puzzle. I’ll save that for a later date when I’m not trying to finish my dissertation. Fortunately, the BIOS functionality seems to be pretty standard. What follows is a lightly edited version of the notes I took while reverse engineering the bootloader.
- The BIOS loads the master boot record (MBR), sector 1 of the internal CF flash, at 0000:7c00 and then jumps to the bootloader at 0000:7c00.
- The bootloader loads 16 sectors, starting with sector 26, to 0a00:0000 using the interrupt handler for int 13h set up by the BIOS. This is the root directory structure for the FAT file system.
- It looks up the first cluster number for the PSOSBOOT.SYS file (cluster 2).
- The first sector of this cluster is loaded to 0a00:0000, which overwrites the root directory.
- Next, es is set to the first word at 0a00:0000, which is 0x07e0.
- This value is shifted left by 4 and 4 is added to it (to get 0x7e04), which is then written to 0000:7d14. The 4 bits that were shifted off the end of the word are written into the least significant 4 bits of 0000:7d16. Those 4 bits don’t change anything since they’re all 0 and 0000:7d16 was already 0. This is self-modifying code! The result is that the last instruction executed by the bootloader is a jmp large far ptr 0010:7e04 (according to IDA).
- Next, it needs to load the OS (or a tertiary bootloader) into RAM. It does this at 07e0:0000 (linear address 0x7e00), using the segment it just read into es. It already looked up the first cluster number for PSOSBOOT.SYS, so it can load it immediately. It loops through the clusters belonging to this file, loading them sequentially into memory.
- A global descriptor table is constructed at address 0000:5000. The table is 4 entries long. The zeroth is always ignored by the processor. The first and second have a base address of 0x00000000 and a limit of 0xffffffff. The first is a read/write data segment, whereas the second is an execute/read code segment. The third entry is all zeros.
- (Maskable) interrupts are disabled and the processor enables protected mode by setting a bit in the cr0 register. At this point, the Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 3: System Programming Guide says that “[r]andom failures can occur” if a far jump or far call is not issued immediately after setting this bit in cr0. This code does not do that. Instead, it performs a short jump to the following instruction and does not actually load the cs register until the far jump in the final step.
- The ds, es, fs, gs, and ss segment registers are loaded with the selector for the first (data) segment.
- Then comes a sequence of 4 instructions whose purpose eludes me:
    mov eax, esp
    xor bx, bx
    mov sp, bx
    mov esp, eax
As far as I can tell, the only lasting effect is to clear bx (the low 16 bits of ebx), since esp is restored from eax at the end. I tested it with a simple program.
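/* Build as 32-bit x86 code (e.g. gcc -m32): the sequence rewrites esp. */
#include <stdio.h>

int main(void)
{
    unsigned int eax;
    unsigned short bx;
    unsigned short sp;
    unsigned int esp;

    /* The first four instructions are the sequence from the bootloader;
     * the last four copy the resulting register values out. */
    asm volatile("movl %%esp, %%eax\n\t"
                 "xorw %%bx, %%bx\n\t"
                 "movw %%bx, %%sp\n\t"
                 "movl %%eax, %%esp\n\t"
                 "movl %%eax, %0\n\t"
                 "movw %%bx, %1\n\t"
                 "movw %%sp, %2\n\t"
                 "movl %%esp, %3"
                 : "=r"(eax), "=r"(bx), "=r"(sp), "=r"(esp)
                 :
                 /* esp is restored before the asm ends, so it is not listed */
                 : "eax", "bx", "cc");

    printf("eax = %08x\n"
           "bx = %04hx\n"
           "sp = %04hx\n"
           "esp = %08x\n", eax, bx, sp, esp);
    return 0;
}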
The result is exactly what I expected: eax and esp have the same value, bx is zero, and sp is the bottom half of esp. Of course, I was not running this in the strange state between entering protected mode and loading the cs register, without virtual memory, and with a whole host of other environmental differences such as running in ring 0. If anyone has any idea why these instructions are here, I’d love to know.
- Finally, it executes a far jmp to 10:7e04 (this is the instruction that was patched by the self-modifying code described above), which sets the cs register to the selector for the second (code) segment and jumps to address 0x7e04.
Now I need to look at what is loaded at 0x7e00… but that can wait for another day.