Thursday, December 21, 2017

boot process

In the early days, bootstrapping a computer meant feeding a paper tape containing a boot program or manually loading a boot program using the front panel address/data/control switches. Today's computers are equipped with facilities to simplify the boot process, but that doesn't necessarily make it simple.
Let's start with a high-level view of Linux boot so you can see the entire landscape. Then we'll review what's going on at each of the individual steps. Source references along the way will help you navigate the kernel tree and dig in further.

Overview

Figure 1 gives you the 20,000-foot view.
Figure 1. The 20,000-foot view of the Linux boot process
When a system is first booted, or is reset, the processor executes code at a well-known location. In a personal computer (PC), this location is in the basic input/output system (BIOS), which is stored in flash memory on the motherboard. The central processing unit (CPU) in an embedded system invokes the reset vector to start a program at a known address in flash/ROM. In either case, the result is the same. Because PCs offer so much flexibility, the BIOS must determine which devices are candidates for boot. We'll look at this in more detail later.
When a boot device is found, the first-stage boot loader is loaded into RAM and executed. This boot loader is less than 512 bytes in length (it has to fit in a single sector), and its job is to load the second-stage boot loader.
When the second-stage boot loader is in RAM and executing, a splash screen is commonly displayed, and Linux and an optional initial RAM disk (temporary root file system) are loaded into memory. When the images are loaded, the second-stage boot loader passes control to the kernel image, and the kernel is decompressed and initialized. At this stage, the kernel checks the system hardware, enumerates the attached hardware devices, mounts the root device, and then loads the necessary kernel modules. When complete, the first user-space program (init) starts, and high-level system initialization is performed.
That's Linux boot in a nutshell. Now let's dig in a little further and explore some of the details of the Linux boot process.

System startup

The system startup stage depends on the hardware that Linux is being booted on. On an embedded platform, a bootstrap environment is used when the system is powered on or reset. Examples include U-Boot, RedBoot, and MicroMonitor from Lucent. Embedded platforms are commonly shipped with a boot monitor. These programs reside in a special region of flash memory on the target hardware and provide the means to download a Linux kernel image into flash memory and subsequently execute it. In addition to storing and booting a Linux image, these boot monitors perform some level of system test and hardware initialization. In an embedded target, these boot monitors commonly cover both the first- and second-stage boot loaders.


In a PC, booting Linux begins in the BIOS at address 0xFFFF0. The first step of the BIOS is the power-on self test (POST), whose job is to check the hardware. The second step of the BIOS is local device enumeration and initialization.
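The address 0xFFFF0 is easier to picture as a real-mode segment:offset pair. Here is a minimal sketch of that arithmetic, assuming the conventional F000:FFF0 pairing (the pairing is not stated above, and the helper name is made up for illustration):

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address in x86 real mode: segment * 16 + offset."""
    return (segment << 4) + offset


# Assumed pairing for the reset vector: segment 0xF000, offset 0xFFF0.
reset_vector = real_mode_address(0xF000, 0xFFF0)
print(hex(reset_vector))

# Distance from the reset vector to the 1 MB boundary, in bytes.
# There is room for little more than a jump into the rest of the BIOS.
print(0x100000 - reset_vector)
```

The second value shows why the code at the reset vector is normally just a jump: only 16 bytes remain below the 1 MB line.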
Given the different uses of BIOS functions, the BIOS is made up of two parts: the POST code and runtime services. After the POST is complete, its code is flushed from memory (it can be removed or overwritten), but the BIOS runtime services remain and are available to the target operating system.
To boot an operating system, the BIOS runtime searches for devices that are both active and bootable, in the order of preference defined by the complementary metal oxide semiconductor (CMOS) settings. A boot device can be a floppy disk, a CD-ROM, a partition on a hard disk, a device on the network, or even a USB flash memory stick.
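The search can be pictured as a walk down the preference list that stops at the first usable device. The sketch below is only an illustration: the device names and the dictionary of first sectors are made up, and treating the two-byte signature 0x55 0xAA at the end of the first sector (the MBR magic number covered in the next section) as the bootability test is an assumption.

```python
SECTOR_SIZE = 512


def is_bootable(first_sector: bytes) -> bool:
    """Treat a sector as bootable if it is 512 bytes and ends in 0x55 0xAA."""
    return len(first_sector) == SECTOR_SIZE and first_sector[510:512] == b"\x55\xaa"


def pick_boot_device(boot_order, first_sectors):
    """Return the first device, in preference order, with a bootable first sector."""
    for name in boot_order:
        sector = first_sectors.get(name)  # None means the device is absent
        if sector is not None and is_bootable(sector):
            return name
    return None


blank = bytes(SECTOR_SIZE)
bootable = bytes(510) + b"\x55\xaa"

# Hypothetical machine: no floppy, an empty CD-ROM, two bootable disks.
devices = {"cdrom": blank, "hard disk": bootable, "usb": bootable}
print(pick_boot_device(["floppy", "cdrom", "hard disk", "usb"], devices))
```

Changing the order of the list is the code equivalent of changing the boot order in the BIOS setup screen.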
Commonly, Linux is booted from a hard disk, where the Master Boot Record (MBR) contains the primary boot loader. The MBR is a 512-byte sector, located in the first sector on the disk (sector 1 of cylinder 0, head 0). After the MBR is loaded into RAM, the BIOS yields control to it.

Stage 1 boot loader

The primary boot loader that resides in the MBR is a 512-byte image containing both program code and a small partition table (see Figure 2). The first 446 bytes are the primary boot loader, which contains both executable code and error message text. The next 64 bytes are the partition table, which contains a record for each of four partitions (16 bytes each). The MBR ends with two bytes that are defined as the magic number (0xAA55, stored on disk as the bytes 0x55 0xAA). The magic number serves as a validation check of the MBR.
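The 446 + 64 + 2 layout maps directly onto byte slices. A minimal sketch, using a synthetic all-zero sector rather than a real disk (the function name is made up for illustration):

```python
import struct

MBR_SIZE = 512
CODE_SIZE = 446
ENTRY_SIZE = 16
MAGIC = 0xAA55


def split_mbr(sector: bytes):
    """Split a 512-byte MBR into boot code, four partition records, and the magic number."""
    if len(sector) != MBR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    code = sector[:CODE_SIZE]
    table = sector[CODE_SIZE:CODE_SIZE + 4 * ENTRY_SIZE]
    # The magic number is stored little-endian: 0x55 first, then 0xAA.
    (magic,) = struct.unpack("<H", sector[510:512])
    if magic != MAGIC:
        raise ValueError("missing 0xAA55 magic number")
    entries = [table[i:i + ENTRY_SIZE] for i in range(0, len(table), ENTRY_SIZE)]
    return code, entries, magic


# Synthetic sector: empty code area, empty partition table, valid magic number.
sector = bytes(446) + bytes(64) + b"\x55\xaa"
code, entries, magic = split_mbr(sector)
print(len(code), len(entries), hex(magic))
```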

Figure 2. Anatomy of the MBR
The job of the primary boot loader is to find and load the secondary boot loader (stage 2). It does this by looking through the partition table for an active partition. When it finds an active partition, it scans the remaining partitions in the table to ensure that they're all inactive. When this is verified, the active partition's boot record is read from the device into RAM and executed.
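The scan for an active partition can be sketched in a few lines. The record layout used here is the conventional one and is an assumption relative to the text above: byte 0 is the status flag (0x80 for active, 0x00 for inactive) and bytes 8 to 11 hold the starting sector as a little-endian 32-bit value. The helper names and the sample table are made up.

```python
import struct

ACTIVE = 0x80


def make_entry(status: int, start_lba: int, sectors: int, ptype: int = 0x83) -> bytes:
    """Build a 16-byte record: status, CHS start, type, CHS end, LBA start, sector count."""
    return struct.pack("<B3sB3sII", status, b"\x00" * 3, ptype, b"\x00" * 3, start_lba, sectors)


def find_active_partition(entries):
    """Return (index, start_lba) of the only active partition in the table."""
    active = [i for i, entry in enumerate(entries) if entry[0] == ACTIVE]
    if not active:
        raise LookupError("no active partition")
    if len(active) > 1:
        # Mirrors the check that all remaining partitions are inactive.
        raise ValueError("more than one active partition")
    index = active[0]
    (start_lba,) = struct.unpack_from("<I", entries[index], 8)
    return index, start_lba


table = [make_entry(0x00, 63, 1000), make_entry(ACTIVE, 2048, 4096), bytes(16), bytes(16)]
print(find_active_partition(table))
```

The returned starting sector is where the active partition's boot record would be read from.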

Stage 2 boot loader

The secondary, or second-stage, boot loader could be more aptly called the kernel loader. The task at this stage is to load the Linux kernel and the optional initial RAM disk.
The first- and second-stage boot loaders combined are called Linux Loader (LILO) or GRand Unified Bootloader (GRUB) in the x86 PC environment. Because LILO has some disadvantages that were corrected in GRUB, let's look into GRUB.
The great thing about GRUB is that it includes knowledge of Linux file systems. Instead of using raw sectors on the disk, as LILO does, GRUB can load a Linux kernel from an ext2 or ext3 file system. It does this by making the two-stage boot loader into a three-stage boot loader. Stage 1 (MBR) boots a stage 1.5 boot loader that understands the particular file system containing the Linux kernel image. Examples include reiserfs_stage1_5 (to load from a Reiser journaling file system) or e2fs_stage1_5 (to load from an ext2 or ext3 file system). When the stage 1.5 boot loader is loaded and running, the stage 2 boot loader can be loaded.
With stage 2 loaded, GRUB can, upon request, display a list of available kernels (defined in the GRUB configuration file, typically /boot/grub/grub.conf, with soft links from /boot/grub/menu.lst and /etc/grub.conf). You can select a kernel and even amend it with additional kernel parameters. Optionally, you can use a command-line shell for greater manual control over the boot process.
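To make the kernel list concrete, here is a rough sketch that reads a GRUB Legacy style menu and picks the default entry. The sample text follows the usual title/root/kernel/initrd layout, but the file names and titles in it are hypothetical, and the parser is a simplification that is not GRUB's own code.

```python
SAMPLE_MENU = """\
# hypothetical menu, for illustration only
default=0
timeout=5
title Linux (example)
    root (hd0,0)
    kernel /vmlinuz-example ro root=/dev/sda2
    initrd /initrd-example.img
title Linux (single user)
    root (hd0,0)
    kernel /vmlinuz-example ro root=/dev/sda2 single
"""


def parse_menu(text):
    """Return (global settings, list of boot entries) from menu text."""
    settings, entries = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        if key == "title":
            entries.append({"title": value.strip()})  # a title starts a new entry
        elif entries:
            entries[-1][key] = value.strip()  # command inside the current entry
        else:
            key, _, value = line.partition("=")  # global setting such as default=0
            settings[key.strip()] = value.strip()
    return settings, entries


settings, entries = parse_menu(SAMPLE_MENU)
default_entry = entries[int(settings["default"])]
print(default_entry["title"])
print(default_entry["kernel"])
```

Everything after the image path on the kernel line is the parameter string that you can amend at the GRUB prompt before booting.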
With the second-stage boot loader in memory, the file system is consulted, and the default kernel image and initrd image are loaded into memory. With the images ready, the stage 2 boot loader invokes the kernel image.

Kernel

With the kernel image in memory and control given from the stage 2 boot loader, the kernel stage begins. The kernel image is not a directly executable kernel but a compressed kernel image. Typically this is a zImage (compressed image, less than 512KB) or a bzImage (big compressed image, greater than 512KB) that has been previously compressed with zlib. At the head of this kernel image is a routine that does a minimal amount of hardware setup, then decompresses the kernel contained within the kernel image and places it into high memory. If an initial RAM disk image is present, this routine moves it into memory and notes it for later use. The routine then calls the kernel and the kernel boot begins.
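The idea of a small routine sitting in front of a compressed payload can be shown with a toy image. This is only a sketch of the concept: the stub marker and the layout are invented and have nothing to do with the real zImage or bzImage format. Only the use of zlib matches the description above.

```python
import zlib

# Stand-in for the setup and decompression routine at the head of the image.
STUB = b"SETUP+DECOMPRESSOR"


def build_image(kernel: bytes) -> bytes:
    """Toy image: stub followed by the zlib-compressed kernel."""
    return STUB + zlib.compress(kernel, 9)


def boot_image(image: bytes) -> bytes:
    """What the stub does in spirit: locate the payload and decompress it."""
    if not image.startswith(STUB):
        raise ValueError("not a toy kernel image")
    return zlib.decompress(image[len(STUB):])


kernel = b"kernel code and data " * 1000  # placeholder for the real kernel
image = build_image(kernel)
print(len(kernel), len(image))  # the image on disk is much smaller than the kernel
assert boot_image(image) == kernel
```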
When the bzImage (for an i386 image) is invoked, you begin at ./arch/i386/boot/head.S in the start assembly routine (see Figure 3 for the major flow). This routine does some basic hardware setup and invokes the startup_32 routine in ./arch/i386/boot/compressed/head.S. This routine sets up a basic environment (stack, etc.) and clears the Block Started by Symbol (BSS). The kernel is then decompressed through a call to a C function called decompress_kernel (located in ./arch/i386/boot/compressed/misc.c). When the kernel is decompressed into memory, it is called. This is yet another startup_32 function, but this function is in ./arch/i386/kernel/head.S.

In the new startup_32 function (also called the swapper or process 0), the page tables are initialized and memory paging is enabled. The type of CPU is detected along with any optional floating-point unit (FPU) and stored away for later use. The start_kernel function is then invoked (init/main.c), which takes you to the architecture-independent part of the Linux kernel. This is, in essence, the main function for the Linux kernel.
Figure 3. Major functions flow for the Linux kernel i386 boot
With the call to start_kernel, a long list of initialization functions is called to set up interrupts, perform further memory configuration, and load the initial RAM disk. In the end, a call is made to kernel_thread (in arch/i386/kernel/process.c) to start the init function, which is the first user-space process. Finally, the idle task is started and the scheduler can now take control (after the call to cpu_idle). With interrupts enabled, the pre-emptive scheduler periodically takes control to provide multitasking.
During the boot of the kernel, the initial RAM disk (initrd) that was loaded into memory by the stage 2 boot loader is copied into RAM and mounted. This initrd serves as a temporary root file system in RAM and allows the kernel to fully boot without having to mount any physical disks. Since the modules needed to interface with peripherals can be part of the initrd, the kernel can be very small but still support a large number of possible hardware configurations. After the kernel is booted, the root file system is pivoted (via pivot_root): the initrd root file system is unmounted and the real root file system is mounted.
The initrd function allows you to create a small Linux kernel with drivers compiled as loadable modules. These loadable modules give the kernel the means to access disks and the file systems on those disks, as well as drivers for other hardware assets. Because the root file system is a file system on a disk, the initrd function provides a means of bootstrapping to gain access to the disk and mount the real root file system. In an embedded target without a hard disk, the initrd can be the final root file system, or the final root file system can be mounted via the Network File System (NFS).

Init

After the kernel is booted and initialized, the kernel starts the first user-space application. This is the first program invoked that is compiled with the standard C library. Prior to this point in the process, no standard C applications have been executed.
In a desktop Linux system, the first application started is commonly /sbin/init. But it need not be. Rarely do embedded systems require the extensive initialization provided by init (as configured through /etc/inittab). In many cases, you can invoke a simple shell script that starts the necessary embedded applications.
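How the kernel settles on that first program can be sketched as a short fallback list. The candidate paths below are the ones the kernel tries in init/main.c as far as I know, and the fall-through after a failed override reflects older kernels. Both are assumptions relative to the text above, and the override path in the example is hypothetical.

```python
DEFAULT_CANDIDATES = ["/sbin/init", "/etc/init", "/bin/init", "/bin/sh"]


def choose_init(executables, override=None):
    """Pick the first user-space program to run.

    executables: set of paths that exist and can be executed.
    override: path given with the init= kernel parameter, if any.
    """
    if override is not None and override in executables:
        return override
    # Assumed behavior: if the override fails, fall back to the defaults.
    for path in DEFAULT_CANDIDATES:
        if path in executables:
            return path
    raise RuntimeError("No init found")


# Desktop-style root file system.
print(choose_init({"/sbin/init", "/bin/sh"}))

# Embedded-style root file system started with a hypothetical init=/opt/app/start.sh.
print(choose_init({"/opt/app/start.sh", "/bin/sh"}, override="/opt/app/start.sh"))
```

The second call is the embedded case described above: a simple script takes the place of /sbin/init.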

Summary

Much like Linux itself, the Linux boot process is highly flexible, supporting a huge number of processors and hardware platforms. In the beginning, the loadlin boot loader provided a simple way to boot Linux without any frills. The LILO boot loader expanded the boot capabilities, but lacked any file system awareness. The latest generation of boot loaders, such as GRUB, permits Linux to boot from a range of file systems (from Minix to Reiser).
