Do embedded system programs run in FLASH or RAM?


Question 1: How is the code in FLASH run? For example, where is the PC pointer and who sets it?
Take ARM as an example:
ARM Cortex-M3/M4 microcontrollers (such as the STM32): the code is stored in NOR flash, and the Cortex core can execute it in place without first loading it into RAM.
ARM Cortex-A series SoCs (such as the Exynos 4412): these SoCs are more complex and usually include a memory management unit (MMU). The code is stored in NAND flash and must be loaded into RAM before it can run, so the boot process of such an SoC includes a loader stage. This is like a PC: Windows is stored on the hard disk, and when the computer is turned on, the operating system code is loaded into memory (RAM).
PC pointer: every microcontroller or SoC has a PC (program counter) register, which holds the address of the next instruction to be fetched. In normal sequential execution it advances automatically by the size of one instruction (4 bytes for a 32-bit ARM instruction); when a branch is taken, the branch instruction sets its value. So what is a pointer? A pointer is the address of a variable. When the hardware has an MMU and runs an operating system such as Linux or Windows, a pointer holds a virtual address, which the MMU translates into a physical address. On a system without an MMU, a pointer holds a physical address.
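The PC behavior described above can be illustrated with a toy fetch loop. This is only a sketch: it assumes fixed 4-byte instructions, and the opcodes `nop`, `b`, and `halt` as well as the `run` function are made up for illustration, not real ARM encodings.

```python
# Toy model of a program counter. Opcodes are hypothetical, not real ARM encodings.
INSTR_SIZE = 4  # assumption: every instruction is 4 bytes long


def run(program, start_pc=0, max_steps=100):
    """program maps address -> (opcode, operand). Returns the PC values fetched."""
    pc = start_pc
    trace = []
    for _ in range(max_steps):
        trace.append(pc)
        opcode, operand = program[pc]
        if opcode == "halt":
            break
        if opcode == "b":
            pc = operand          # a branch sets the PC explicitly
        else:
            pc += INSTR_SIZE      # sequential flow: advance by one instruction
    return trace


program = {
    0x00: ("nop", None),
    0x04: ("b", 0x10),
    0x08: ("nop", None),   # skipped by the branch
    0x0C: ("nop", None),   # skipped by the branch
    0x10: ("halt", None),
}
print([hex(a) for a in run(program)])  # ['0x0', '0x4', '0x10']
```

The trace shows both cases: the step from 0x0 to 0x4 is the automatic increment, and the step from 0x4 to 0x10 is set by the branch.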
Question 2: Does this code need to be moved to RAM to run? Is there anything wrong with not doing this?
As mentioned above, most microcontroller code runs directly from NOR flash, and only in a few cases must it be loaded into RAM. NOR flash is byte-addressable for reads, so the CPU can fetch the instruction at any specific address and execute directly from it. NAND flash is read in larger units (pages, with erasure by block), so individual instructions in it cannot be addressed directly by the CPU and the code cannot be executed in place. A program stored in NAND flash therefore cannot run until it has been loaded into RAM, just as Windows on your hard disk cannot run until it has been loaded into memory.
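The difference between the two access styles can be sketched with a small model. Everything here is an assumption made for illustration: the class names, the `load_to_ram` helper, and the 2048-byte page size (real parts vary).

```python
PAGE_SIZE = 2048  # hypothetical page size; real NAND parts vary


class NorFlash:
    """Read side of NOR flash: any byte can be fetched by its address."""

    def __init__(self, image: bytes):
        self._data = bytes(image)

    def read_byte(self, addr: int) -> int:
        return self._data[addr]


class NandFlash:
    """Read side of NAND flash: data comes out one whole page at a time."""

    def __init__(self, image: bytes):
        pad = (-len(image)) % PAGE_SIZE          # pad to whole pages
        self._data = bytes(image) + b"\xff" * pad

    @property
    def page_count(self) -> int:
        return len(self._data) // PAGE_SIZE

    def read_page(self, page: int) -> bytes:
        start = page * PAGE_SIZE
        return self._data[start:start + PAGE_SIZE]


def load_to_ram(nand: NandFlash, length: int) -> bytearray:
    """What a loader does: copy pages into RAM so the CPU can fetch from RAM."""
    if length > nand.page_count * PAGE_SIZE:
        raise ValueError("image is larger than the flash contents")
    ram = bytearray()
    page = 0
    while len(ram) < length:
        ram += nand.read_page(page)
        page += 1
    return ram[:length]


image = bytes(range(256)) * 20                   # stand-in for a program image
ram = load_to_ram(NandFlash(image), len(image))
print(NorFlash(image).read_byte(3000) == ram[3000])  # True
```

With the NOR model the "CPU" can read address 3000 directly; with the NAND model the same byte only becomes addressable after the pages have been copied into RAM.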
Question 3: If it needs to be moved to RAM, is there any difference whether it is on-chip or off-chip?
It can be either on-chip or off-chip RAM; which one is used depends on the particular SoC or CPU.
Question 4: If the actual code size of the user's FLASH (such as 1MB) exceeds the available space of RAM (such as 512KB), what is the migration process like?
This situation rarely arises in practice. A system with very little RAM can run the program in segments: load one segment, run it, then load the next. Doing this is strongly discouraged. RAM today is large: by the time your code reaches 1 MB, your memory may well be 1 GB or 2 GB. For example, a compiled Linux kernel is only a few MB, while a real Linux system will have several GB of memory available.
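The segmented approach mentioned above can be sketched as a simple load plan. This is a simplification: `plan_segment_loads` is a hypothetical helper, and it assumes the whole RAM budget passed in is available for code, whereas a real system also needs RAM for data and the stack and some way to handle calls between segments.

```python
def plan_segment_loads(code_size: int, ram_for_code: int):
    """Return (flash_offset, length) for each segment that must be loaded in turn."""
    if ram_for_code <= 0:
        raise ValueError("ram_for_code must be positive")
    loads = []
    for offset in range(0, code_size, ram_for_code):
        loads.append((offset, min(ram_for_code, code_size - offset)))
    return loads


KB = 1024
# The numbers from the question: 1 MB of code, 512 KB of RAM.
print(plan_segment_loads(1024 * KB, 512 * KB))
# [(0, 524288), (524288, 524288)]
```

Even in this best case the program has to be reloaded every time execution crosses from one half to the other, which is why this layout is usually avoided.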
Question 5: Comparing on-chip FLASH and SRAM with off-chip (expansion) FLASH and SRAM, apart from the difference in capacity, how do they differ in performance and speed?
It depends on the bus design of the SoC. Generally speaking, off-chip memory performs worse.
Whether program code can be run directly in Flash depends on the access characteristics of Flash.
Flash memory is organized in blocks, and it is most efficient to access it block by block. Flash resembles ROM but is in fact both readable and writable. Unlike RAM, however, writing requires first erasing the whole block that contains the target location, even if you only want to write a few bytes. To rewrite data in Flash, you therefore copy the containing block into RAM, modify the data there, erase the block, and write the whole block back. No data is lost this way, but the cost is high. Reads typically locate the block first and then read sequentially within it; jumping back and forth between different blocks is very inefficient. Reading and writing by block is thus a major characteristic of Flash: the storage cannot be addressed arbitrarily. NAND Flash is the typical example.
However, one type of Flash supports arbitrary addressing for reads at little cost. Its read behavior is close to that of RAM, while writes still follow the pattern of erasing by block and then writing by block. NOR Flash is the typical example.
Because of these characteristics, Flash is usually used to store data that rarely changes and must not be lost when power is removed.
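The erase-before-write cycle described above can be modelled in a few lines. This is a sketch under stated assumptions: the `Flash` class, the `rewrite` helper, and the 4096-byte block size are hypothetical, and the model relies on the general property that programming flash can only clear bits (1 to 0) while erasing sets a whole block back to 0xFF.

```python
BLOCK_SIZE = 4096  # hypothetical erase-block size


class Flash:
    def __init__(self, blocks: int):
        self.mem = bytearray(b"\xff" * (blocks * BLOCK_SIZE))  # erased state
        self.erase_count = 0

    def erase_block(self, block: int) -> None:
        start = block * BLOCK_SIZE
        self.mem[start:start + BLOCK_SIZE] = b"\xff" * BLOCK_SIZE
        self.erase_count += 1

    def program(self, addr: int, data: bytes) -> None:
        # Programming can only clear bits, so it is modelled as a bitwise AND.
        for i, value in enumerate(data):
            self.mem[addr + i] &= value


def rewrite(flash: Flash, addr: int, data: bytes) -> None:
    """Read-modify-write of a range that lies inside a single block."""
    block = addr // BLOCK_SIZE
    start = block * BLOCK_SIZE
    offset = addr - start
    if offset + len(data) > BLOCK_SIZE:
        raise ValueError("range crosses a block boundary")
    cached = bytearray(flash.mem[start:start + BLOCK_SIZE])  # 1. cache block in RAM
    cached[offset:offset + len(data)] = data                 # 2. modify the copy
    flash.erase_block(block)                                 # 3. erase the block
    flash.program(start, cached)                             # 4. write it all back


flash = Flash(blocks=1)
flash.program(0, b"\x0f")
flash.program(0, b"\xf0")      # overwriting without erase corrupts the byte
print(hex(flash.mem[0]))       # 0x0
rewrite(flash, 0, b"\xf0")     # the full cycle stores the intended value
print(hex(flash.mem[0]))       # 0xf0
```

Changing a single byte costs a full block erase and a full block write, which is the overhead the text refers to.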
After introducing the background knowledge, let’s return to the question:
The first thing to be clear about is that the CPU must read instructions from memory, at the address given by the PC register. After each instruction executes, the PC automatically points to the next one; if instructions differ in length, these addresses are not evenly spaced or uniformly aligned. Secondly, program execution always involves jumps, which makes instruction addresses even more arbitrary. To execute a program directly from a memory, that memory must therefore at least support arbitrary addressing for reads. NOR Flash meets exactly this requirement. Common MCUs on the market have this type of Flash built in, so the stored program runs directly from it without being loaded into RAM. Memories without this access characteristic cannot execute programs in place; the program must first be transferred to a memory that has it, for example by loading it into RAM.
1. How is the code in FLASH run? For example, where and who sets the PC pointer?
An MCU with a Cortex-M core maps the boot memory to address 0x00000000 according to the levels on its external boot configuration pins. When booting from Flash, the exception and interrupt vector table sits at the start of the internal Flash; its first and second entries hold the initial stack address and the reset vector. The location of this table is configurable, and after reset it is at address 0x00000000. After power-on reset, the hardware automatically loads the SP and PC registers from the first two entries of the table, in that order, and execution starts from the address now held in the PC. The PC value is therefore set automatically by hardware during reset.
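The reset sequence described above can be mimicked on a byte image of memory. This is a minimal sketch: the `reset` function and the example addresses are hypothetical, and it assumes little-endian 32-bit table entries. Bit 0 of the reset vector marks Thumb state on Cortex-M and is not part of the fetch address, so the model clears it.

```python
import struct


def reset(memory: bytes, vector_base: int = 0x00000000):
    """Return (sp, pc) as a Cortex-M style reset would load them."""
    # Entry 0: initial stack pointer. Entry 1: reset vector.
    initial_sp, reset_vector = struct.unpack_from("<II", memory, vector_base)
    pc = reset_vector & ~1     # drop the Thumb bit to get the fetch address
    return initial_sp, pc


# Hypothetical image: stack top at 0x20005000, reset handler at 0x08000100.
image = struct.pack("<II", 0x20005000, 0x08000101) + bytes(248)
sp, pc = reset(image)
print(hex(sp), hex(pc))        # 0x20005000 0x8000100
```

No software runs before this point: the two values come straight from the first eight bytes of whatever memory is mapped at the vector table address.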
2. Does this code need to be moved to RAM to run? Is there anything wrong with not doing this?
As stated earlier, this is not necessary. Executing from RAM may give better performance, but it is not required for the NOR Flash inside an MCU. One thing worth mentioning: a program generally consists of the code segment (.text), the read-only data segment (.rodata), the initialized data segment (.data), and the uninitialized data segment (.bss, which holds no data in the image). Like the code segment, the read-only data segment never changes, so it can stay in Flash. The initial values of the .data segment are also stored in Flash, but they must be copied into RAM, and space for .bss must be reserved in RAM and cleared to zero. This is the initialization of the runtime environment: something is moved, but it is data, not code. It happens before the main function is entered.
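The runtime initialization described above amounts to one copy and one fill. The sketch below models Flash and RAM as byte buffers; `init_runtime` and all of its offset parameters are hypothetical stand-ins for the addresses that a linker script would provide in a real project.

```python
def init_runtime(flash: bytes, ram: bytearray,
                 data_load: int, data_start: int, data_size: int,
                 bss_start: int, bss_size: int) -> None:
    """Model of what startup code does before main() is entered."""
    # Copy the initial values of .data from Flash into RAM.
    ram[data_start:data_start + data_size] = flash[data_load:data_load + data_size]
    # Clear .bss to zero. Nothing is stored in Flash for this segment.
    ram[bss_start:bss_start + bss_size] = bytes(bss_size)


flash = b"CODE" + b"\x01\x02\x03\x04"   # 4 bytes of "code", then .data initial values
ram = bytearray(b"\xaa" * 16)           # RAM holds garbage after power-on
init_runtime(flash, ram, data_load=4, data_start=0, data_size=4,
             bss_start=4, bss_size=8)
print(ram.hex())                        # 010203040000000000000000aaaaaaaa
```

The "code" bytes at the start of `flash` are never copied, which matches the point made in the text: data is moved, code is not.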
3. If it needs to be moved to RAM, is there any difference whether it is on-chip or off-chip?
The performance of on-chip RAM will be better, but the capacity generally cannot be made too large.
4. If the actual code size of the user's FLASH (such as 1MB) exceeds the available space of RAM (such as 512KB), what is the migration process like?
It can be loaded and executed in stages, but the organization of the program will become complicated and the operation will become inefficient. If this happens, you should consider changing the hardware configuration or optimizing and tailoring the program.
5. Comparing on-chip FLASH and SRAM with off-chip (expansion) FLASH and SRAM, apart from the difference in capacity, how do they differ in performance and speed?
This depends on the clock rate and access latency of the memory. Integrated on-chip memory generally performs better than off-chip memory, so for higher runtime performance the internal memory should be used first. Because low-end MCUs run at low clock speeds, the difference between internal and external memory is small for them.
This question can be answered from the following three aspects:
1. Principle of computer composition

von Neumann model

Students majoring in computer science will be familiar with this diagram. It is the most classic computer model, and no current computer (embedded devices included, of course) has departed from it. Its five components fall into three groups: (1) the CU and ALU together form the CPU; (2) Memory is the storage device (an idealized one); (3) Input and Output are the various peripherals (keyboard, mouse, monitor, and so on).

Our focus here is on memory. The von Neumann model imagines Memory as an ideal storage device: readable, writable, non-volatile, and randomly accessible for both reads and writes. For a theoretical model, being easy to understand is what matters. Reality is less ideal: constrained by cost, each kind of memory satisfies only some of these properties. These are the mainstream memories discussed next.

2. Characteristics of mainstream memories

Today's memory can be roughly divided into two categories: RAM and ROM. There have been many summaries on the specific definitions and development history of these two types of memories. I will not go into details here. I will only talk about my understanding of these two types of memories from my personal perspective.

(1) RAM, which can be divided into SRAM and DRAM.

Common characteristics (the fundamental characteristics of RAM): readable, writable, random read and write

Difference: SRAM can be used immediately after power-on, DRAM needs to be initialized before it can be used, and the unit cost of SRAM is higher than that of DRAM.

(2) ROM, used loosely here to mean non-volatile storage, which can be divided into hard disks and Flash (NOR Flash and NAND Flash)

Common features: non-volatile

Difference: hard disks and NAND Flash are both read and written in blocks, while NOR Flash can be read randomly but must be written in blocks.

The non-volatile and random-read characteristics of NOR Flash allow it to be used as the system's boot medium.

3. The speed difference between CPU and memory is the main reason that currently restricts computer performance.

4. Specific to the above 5 questions

(1) It depends on the kind of Flash. With NOR Flash, the CPU can access and execute the code directly. With NAND Flash, the code must be loaded into RAM before it runs. The PC register is inside the CPU and is set by hardware when the CPU is powered on (for example, an ARM Cortex-M3 loads its PC from the reset vector stored at address 0x4).

(2) Like the first question, whether the code needs to be moved depends on the Flash type.

(3) If it is the same type of RAM, there is no difference between on-chip and off-chip. If the RAM types are different, specific analysis is required.

(4) The question assumes 1 MB of Flash and 512 KB of RAM, which suggests NOR Flash paired with SRAM (as on an STM32), so the code does not need to be moved. NAND Flash is usually paired with DRAM of much larger capacity, so the situation assumed in the question would not arise.

