How does the operating system transfer itself from the hard disk to the memory?

Previous summary

The top of the stack is set to 0x9FF00: the stack segment register ss is 0x9000 and the stack pointer register sp is 0xFF00.

Questions in this section

  • How does the operating system move itself from the hard disk into memory?

Access memory

In the previous section, the operating system set the registers ds and cs to 0x9000 and set the stack top ss:sp to 0x9FF00. That is far enough above the code at 0x90000 that the stack, which grows downward, will not easily overwrite the existing code.

Register                 Name     Value      Function
Data segment register    ds       0x9000     Access data
Code segment register    cs       0x9000     Access code
Stack top pointer        ss:sp    0x9FF00    Access the stack

Put simply, ds determines how data is accessed, cs determines how code is accessed, and ss:sp determines where the top of the stack is. Together these make up a preliminary memory plan.
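The addresses above follow from the real-mode rule that a physical address is the segment value shifted left by 4 bits plus the offset. A minimal Python sketch of that arithmetic, assuming the 512-byte boot sector sits at the start of segment 0x9000 (the helper name physical_address is made up for illustration):

```python
def physical_address(segment, offset):
    """Real-mode translation: segment * 16 + offset."""
    return (segment << 4) + offset

code_start = physical_address(0x9000, 0x0000)  # start of the boot sector code
code_end = code_start + 512                    # the boot sector is one 512-byte sector
stack_top = physical_address(0x9000, 0xFF00)   # ss:sp

print(hex(code_start))        # 0x90000
print(hex(stack_top))         # 0x9ff00
print(stack_top - code_end)   # 64768 bytes between the end of the code and the stack top
```

The stack would have to grow by more than 63 KiB before it reached the boot sector code, which is why the layout is considered safe.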

Summary: So far only the first 512 bytes of the hard disk have been loaded into memory; most of the code is still in other sectors of the hard disk.

Get the rest of the operating system code from the hard disk to the memory

load_setup:
    mov dx,#0x0000 ; drive 0, head 0
    mov cx,#0x0002 ; sector 2, track 0
    mov bx,#0x0200 ; address = 512, in 0x9000
    mov ax,#0x0200 + 4 ; service 2, nr of sectors
    int 0x13 ; read it
    jnc ok_load_setup ; ok - continue
    mov dx,#0x0000
    mov ax,#0x0000 ; reset the diskette
    int 0x13
    jmp load_setup

ok_load_setup:
    ...

Tips: There are two int instructions here. Note that int is not the integer type from high-level languages but an assembly instruction.

The int instruction

The INT instruction is the software interrupt instruction in x86 assembly language. It lets a program call system services or drivers at runtime. Its format is INT imm8, where imm8 is an 8-bit immediate value giving the interrupt number. For example, INT 0x21 invokes the interrupt service routine registered for interrupt number 0x21.

The INT instruction transfers control to the interrupt service routine. When the routine finishes, the program resumes at the instruction following the INT.

The INT instruction is the usual way to make system calls on DOS. Later systems rely on it much less: 32-bit Linux (including Linux 0.11) uses int 0x80 for system calls, while modern 64-bit systems generally use dedicated instructions such as syscall instead.

Register parameter passing

Register parameter passing means using registers to pass parameters to a function or subroutine. No memory has to be set aside for the parameters; they are stored directly in registers.

Register parameter passing uses the CPU's general-purpose registers. Commonly used ones are EAX, EBX, ECX and EDX (AX, BX, CX and DX in 16-bit code such as the boot sector above). Which registers carry which parameters depends on the calling convention of the compiler and operating system.

The advantage of register passing is speed, because the data does not have to be copied back and forth between memory and registers. The disadvantage is that registers are scarce: if there are too many parameters, there may not be enough registers to hold them.

Interrupts

In assembly language, interrupts are generally implemented through the INT instruction. The INT instruction transfers program control to the corresponding Interrupt Service Routine (ISR). After the interrupt service routine is executed, it will return to the original program to continue execution.

Each interrupt has a unique interrupt number, and the operand of the INT instruction selects which interrupt service routine is called. For example, INT 0x21 calls the interrupt service routine for interrupt number 0x21.

In assembly language, interrupt service routines usually use the IRET instruction to end interrupt processing and return to the original program.

The CLI and STI instructions can also be used to control interrupts: CLI disables maskable interrupts, and STI enables them again.

Example: in the code above, int 0x13 raises interrupt number 0x13. The mov instructions before it assign values to dx, cx, bx and ax, and these four registers serve as the parameters of the interrupt handler.

Extension: The counterpart of register parameter passing is stack parameter passing, which is widely used in C.

Pseudo function call

After an interrupt is initiated, the CPU will use the interrupt number 0x13 to find the entry address of the corresponding interrupt handler, and jump to execute it, which is logically equivalent to executing a function.

Tips: The handler for interrupt 0x13 is written for us in advance by the BIOS. It provides disk services, including reading sectors from the disk.
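In real mode, the lookup goes through the interrupt vector table: 256 entries of 4 bytes each, starting at physical address 0, where each entry stores a 16-bit offset followed by a 16-bit segment. The following Python sketch models that lookup on a simulated block of memory. The handler address 0xF000:0x1234 is a made-up placeholder, not the real BIOS entry point, and the function names are hypothetical.

```python
import struct

IVT_ENTRY_SIZE = 4  # each vector: 2-byte offset, then 2-byte segment, little-endian

def vector_address(int_no):
    """Physical address of the vector table entry for interrupt int_no."""
    return int_no * IVT_ENTRY_SIZE

def lookup_handler(memory, int_no):
    """Return (segment, offset) of the handler stored in the vector table."""
    offset, segment = struct.unpack_from("<HH", memory, vector_address(int_no))
    return segment, offset

# Simulated first 1 KiB of memory, with a placeholder handler for int 0x13.
memory = bytearray(1024)
struct.pack_into("<HH", memory, vector_address(0x13), 0x1234, 0xF000)

segment, offset = lookup_handler(memory, 0x13)
print(hex(vector_address(0x13)))       # 0x4c
print(hex((segment << 4) + offset))    # 0xf1234
```

The CPU then continues execution at that segment:offset pair, which is why raising the interrupt behaves like calling a function whose address is stored in a table.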

What is Linux doing here with this interrupt number 0x13?

load_setup:
    mov dx,#0x0000 ; drive 0, head 0
    mov cx,#0x0002 ; sector 2, track 0
    mov bx,#0x0200 ; address = 512, in 0x9000
    mov ax,#0x0200 + 4 ; service 2, nr of sectors
    int 0x13 ; read it
    ...

What this code does: starting at the second sector of the hard disk, read 4 sectors in total and load them into memory at 0x90200.
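To see how the four register values encode that request, the Python sketch below splits each 16-bit register into the fields that the BIOS int 0x13 read service (AH = 0x02) expects. It assumes es holds 0x9000, as the code comment "in 0x9000" implies; the function name decode_int13_read is made up for illustration.

```python
def decode_int13_read(ax, bx, cx, dx, es):
    """Split 16-bit register values into the fields used by int 0x13, AH=0x02."""
    return {
        "service": ax >> 8,                          # AH: 0x02 means read sectors
        "sector_count": ax & 0xFF,                   # AL: number of sectors to read
        "cylinder": (cx >> 8) | ((cx & 0xC0) << 2),  # CH plus the top 2 bits of CL
        "sector": cx & 0x3F,                         # CL bits 0-5, numbered from 1
        "head": dx >> 8,                             # DH
        "drive": dx & 0xFF,                          # DL
        "buffer": (es << 4) + bx,                    # ES:BX as a physical address
    }

request = decode_int13_read(ax=0x0200 + 4, bx=0x0200, cx=0x0002, dx=0x0000, es=0x9000)
for field, value in request.items():
    print(f"{field:13s} {value:#x}")
```

With the values from load_setup this yields service 2, 4 sectors, cylinder 0, sector 2, head 0, drive 0, and a destination buffer of 0x90200.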

load_setup:
    ...
    jnc ok_load_setup ; ok - continue
    ...
    jmp load_setup

ok_load_setup:
    ...

The jnc and jmp instructions decide which label to jump to on success and on failure, much like if/else in a high-level language. The BIOS sets the carry flag when the read fails, so jnc (jump if no carry) goes to ok_load_setup when the read succeeds. If it fails, the code resets the disk and jumps back to load_setup to try again, repeating until the read succeeds.
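The same control flow can be written as a loop in a high-level language. This is only a model of the branching, not of the BIOS: read_sectors and reset_disk are hypothetical callbacks standing in for the two int 0x13 calls, and the simulated disk that fails twice is invented for the example.

```python
def load_setup(read_sectors, reset_disk):
    """Model of the load_setup loop: retry the read until it succeeds."""
    attempts = 0
    while True:
        attempts += 1
        carry_set = not read_sectors()  # the BIOS sets the carry flag on error
        if not carry_set:               # jnc ok_load_setup
            return attempts
        reset_disk()                    # int 0x13 with ah = 0x00: reset the drive
        # jmp load_setup: the loop runs again

# Hypothetical disk that fails twice before a read succeeds.
results = iter([False, False, True])
resets = []
attempts = load_setup(lambda: next(results), lambda: resets.append("reset"))
print(attempts, len(resets))  # 3 2
```

Note that, like the assembly, the loop has no retry limit: a disk that never succeeds would keep it running indefinitely.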

Code after ok_load_setup

ok_load_setup:
    ...
    mov ax,#0x1000
    mov es,ax ; segment of 0x10000
    call read_it
    ...
    jmpi 0,0x9020

What this code does: load the 240 sectors starting at the 6th sector of the hard disk into memory at 0x10000.

At this point, the code of the entire operating system has been loaded from the hard disk into memory. The inter-segment jump instruction jmpi 0,0x9020 then jumps to 0x90200, where the content from the start of the second sector of the hard disk now sits.
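The resulting memory map can be checked with a little arithmetic. The Python sketch below assumes 512-byte sectors and the sector counts given in this section (1, 4 and 240); the region helper is made up for illustration.

```python
SECTOR_SIZE = 512

def region(segment, sectors):
    """Return the [start, end) physical range of a block loaded at segment:0."""
    start = segment << 4
    return start, start + sectors * SECTOR_SIZE

regions = {
    "system": region(0x1000, 240),
    "bootsect": region(0x9000, 1),
    "setup": region(0x9020, 4),
}

for name, (start, end) in regions.items():
    print(f"{name:8s} {start:#x} to {end - 1:#x}")

# jmpi 0,0x9020 sets cs = 0x9020 and ip = 0
jump_target = (0x9020 << 4) + 0
print(hex(jump_target))  # 0x90200
```

The jump target is exactly the first byte of the setup region, and the system image ends at 0x2DFFF, well below the boot sector at 0x90000.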

The compilation process of the operating system

The process of compiling an operating system usually involves the following steps:

  1. Configure: use configuration files or command-line tools to set the system's parameters and options.
  2. Compile: translate the source code into machine code with a compiler. This step can take a long time.
  3. Link: combine the compiled machine code with library files to form an executable file.
  4. Install: copy the executables and other necessary files into place so the system can run.
  5. Test: check that the compiled system is functional and stable.

For the code read so far, the whole build is carried out by the Makefile together with build.c, and the end result is:

  1. Assemble bootsect.s into bootsect and place it in sector 1 of the hard disk;
  2. Assemble setup.s into setup and place it in sectors 2 to 5 of the hard disk;
  3. Compile and link all the remaining code (starting with head.s, together with the various .c and other .s files) into system, and place it in the next 240 sectors of the hard disk.
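The layout in the list above translates directly into byte offsets on the disk. The Python sketch below is an illustration of that layout only, not the actual logic of build.c: it assumes 512-byte sectors numbered from 1, pads each part with zero bytes to fill its sector budget, and uses made-up stand-in contents for the three binaries.

```python
SECTOR_SIZE = 512

# (name, first sector, sector count); sectors are numbered from 1, as in the text
LAYOUT = [
    ("bootsect", 1, 1),
    ("setup", 2, 4),
    ("system", 6, 240),
]

def byte_range(first_sector, count):
    """Return the [start, end) byte offsets of a run of sectors."""
    start = (first_sector - 1) * SECTOR_SIZE
    return start, start + count * SECTOR_SIZE

def build_image(parts):
    """Concatenate the binaries, zero-padding each one to its sector budget."""
    image = bytearray()
    for name, first, count in LAYOUT:
        start, end = byte_range(first, count)
        data = parts[name]
        if len(data) > end - start:
            raise ValueError(f"{name} does not fit in {count} sectors")
        image += data.ljust(end - start, b"\0")
    return bytes(image)

# Stand-in contents, for illustration only.
image = build_image({"bootsect": b"B" * 512, "setup": b"S" * 100, "system": b"K" * 1000})
print(byte_range(2, 4))    # (512, 2560)
print(byte_range(6, 240))  # (2560, 125440)
print(len(image))          # 125440
```

Because the positions are fixed, the boot sector can read setup and system by sector number alone, without any file system.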

The code at 0x90200, the memory address we are about to jump to, was loaded from the second sector of the hard disk. The very beginning of that sector is the start of the setup binary, which is built from the setup.s source file.

Summary

We now know how the operating system loads itself from the hard disk into memory, and we have briefly looked at how Linux 0.11 is compiled and loaded. At this point, the operating system's code has been moved completely from the hard disk into memory.

Last

Notes compiled by: Qianshi
Source of content: Geek Time “Interesting Reading of Linux Source Code” study notes Day 4