I’ve occasionally wondered how a computer gets from its power-on state to running an operating system. Today I decided to look at a small piece of that puzzle by examining the bootloader of an embedded x86 system I hope to reverse engineer at some point. This particular machine runs the pSOS real-time operating system on a 486. Cursory examination and googling didn’t tell me where the firmware is actually loaded into memory, so I broke out IDA Pro to see what goes on.
I don’t have access to the BIOS, unfortunately, so I was not able to examine that piece of the puzzle. I’ll save that for a later date when I’m not trying to finish my dissertation. Fortunately, the BIOS functionality seems to be pretty standard. What follows is a lightly edited version of the notes I took while reverse engineering the bootloader.
- The BIOS loads the master boot record (MBR) (sector 1 of the internal CF flash) at 0000:7c00 and then jumps to the bootloader at
- The bootloader loads 16 sectors, starting with sector 26, to 0a00:0000 using the interrupt handler for int 13h set up by the BIOS. These sectors hold the root directory of the FAT file system.
- It looks up the first cluster number for the PSOSBOOT.SYS file (cluster 2).
- The first sector of this cluster is loaded to 0a00:0000, which overwrites the root directory.
- es is set to the first word of 0a00:0000 which is
- This value is shifted left by 4 and 4 is added to it (to get 0x7e04), which is then written to 0000:7d14. The 4 bits that were shifted off the end of the word are written into the least significant 4 bits of 0000:7d16. The 4 bits don’t change anything since they’re just 0 and 0000:7d16 was already 0. This is self-modifying code! The result is that the last instruction executed by the bootloader is a jmp large far ptr 0010:7e04 (according to IDA).
- Next, it needs to load the OS (or a tertiary bootloader) into RAM. It does this at 07e0:0000 (linear address 0x7e00). It already looked up the first cluster number for PSOSBOOT.SYS, so it can immediately load it. It loops through the clusters corresponding to this file, loading them sequentially into memory.
- A global descriptor table is constructed at address 0000:5000. The table is 4 entries long. The zeroth is always ignored by the processor. The first and second are set to have a base address of 0x00000000 and a limit of 0xffffffff. The first is a data segment that is read/write whereas the second is a code segment that is execute/read. The third entry is all zeros.
- (Maskable) interrupts are disabled and protected mode is enabled by setting the PE bit (bit 0) of the cr0 register. At this point, the Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 3: System Programming Guide says that “[r]andom failures can occur” if a far jump or far call is not issued immediately after setting this bit in cr0. This code does not do that. Instead, it performs a short jump to the following instruction and will not actually set the cs register until step 12.
- ss segment registers are loaded with the selector for the first (data) segment.
- A sequence of 4 instructions whose purpose eludes me:
mov eax, esp
xor bx, bx
mov sp, bx
mov esp, eax
As far as I can tell, the only effect this will have is to clear bx (and leave a copy of esp in eax). I tested it with a simple program.
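/* Needs <stdio.h>. Build as 32-bit code (e.g. gcc -m32): in 64-bit mode, writing esp would truncate rsp. */
unsigned int eax, esp;
unsigned short bx, sp;
asm volatile("movl %%esp, %%eax\n\t"
    "xorw %%bx, %%bx\n\t"
    "movw %%bx, %%sp\n\t"
    "movl %%eax, %%esp\n\t"
    "movl %%eax, %0\n\t"
    "movw %%bx, %1\n\t"
    "movw %%sp, %2\n\t"
    "movl %%esp, %3"
    : "=r"(eax), "=r"(bx), "=r"(sp), "=r"(esp)
    : /* no inputs */
    : "eax", "bx"); /* esp is restored before the asm ends, so it is not listed as a clobber */
printf("eax = %08x\n"
    "bx = %04hx\n"
    "sp = %04hx\n"
    "esp = %08x\n", eax, bx, sp, esp);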
The result is exactly what I expected: eax and esp have the same value, bx is zero, and sp is the bottom half of esp. Of course, I was not running this in the strange state between entering protected mode and loading the cs register, nor without virtual memory, and there is a whole host of other environmental differences such as running in ring
- If anyone has any idea why these instructions are here, I’d love to know.
- Finally, it executes a far jmp to 10:7e04 (this is the instruction that was modified in step 6 above), which sets the cs register to the second segment and jumps to address
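The address arithmetic running through the steps above is easier to follow when written out. What follows is only a rough Python sketch, not code recovered from the bootloader. The helper names (linear, patched_jump_offset, make_descriptor, selector) are hypothetical. The segment word 0x07e0 is an assumption, inferred from the 0x7e04 result. The access bytes (0x92, 0x9a) and flag nibble (0xc) are the conventional flat-model encodings and not necessarily the exact bytes this bootloader writes into its GDT.

```python
def linear(segment, offset):
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


def patched_jump_offset(segment_word):
    """Shift a 16-bit segment word left by 4 and add 4.

    Returns (low 16 bits, the 4 bits shifted off the top of the word).
    """
    value = (segment_word << 4) + 4
    return value & 0xFFFF, (value >> 16) & 0xF


def make_descriptor(base, limit20, access, flags):
    """Pack an 8-byte segment descriptor into a 64-bit integer."""
    return (
        (limit20 & 0xFFFF)                 # limit bits 15..0
        | (base & 0xFFFFFF) << 16          # base bits 23..0
        | (access & 0xFF) << 40            # present, DPL, type
        | ((limit20 >> 16) & 0xF) << 48    # limit bits 19..16
        | (flags & 0xF) << 52              # granularity, default size
        | ((base >> 24) & 0xFF) << 56      # base bits 31..24
    )


def selector(index):
    """Selector for GDT entry `index` (table indicator 0, RPL 0)."""
    return index << 3


# Real-mode addresses that appear in the notes.
print(hex(linear(0x0000, 0x7C00)))  # MBR load address
print(hex(linear(0x0A00, 0x0000)))  # root directory buffer
print(hex(linear(0x0000, 0x5000)))  # GDT

# ASSUMPTION: the word read into es is 0x07e0 (inferred from 0x7e04).
low, high = patched_jump_offset(0x07E0)
print(hex(low), high)

# ASSUMPTION: conventional flat 4 GiB descriptors, 4 KiB granularity.
data_desc = make_descriptor(0, 0xFFFFF, 0x92, 0xC)  # read/write data
code_desc = make_descriptor(0, 0xFFFFF, 0x9A, 0xC)  # execute/read code
print(f"{data_desc:#018x}", f"{code_desc:#018x}")

# With 4 KiB granularity the 20-bit limit expands to 0xffffffff.
print(hex((0xFFFFF << 12) | 0xFFF))

# Entry 1 (data) and entry 2 (code) map to selectors 0x08 and 0x10.
print(hex(selector(1)), hex(selector(2)))
```

Under these assumptions, the selector for the second GDT entry comes out as 0x10, which matches the segment half of the far jump, and the low word of the patched offset is 0x7e04 with nothing left over in the shifted-off bits.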
Now I need to look at what is loaded at 0x7e00… but that can wait for another day.