[Re-learning C++] 01| How does C++ manage memory resources?

Foreword

Hello everyone. I only share practical technical content and like to play with code. Today is the first lecture of [Re-learning C++], and the topic is C++ memory management.

Unlike Java, Go, and other languages with built-in garbage collection, C++ does not reclaim memory automatically. We have to manage allocation and deallocation on the heap by hand, which often leads to problems such as memory leaks and out-of-memory errors. Moreover, these problems may not show up immediately; they are often exposed only after the program has been running for a while, and they are hard to troubleshoot. It is therefore important to understand the memory management techniques and tools in C++, which help improve performance, reduce errors, and increase safety.

Memory partition

In C++, the memory that the operating system gives a program is divided by purpose into several areas: the code segment, the data segment, the stack, and the heap. Each area has its own memory management mechanism.

Code area

The code area (code segment) stores the program's machine code. It is loaded into memory before the program starts executing, and during execution it is neither modified nor released.

Because the code area is read-only, it can be shared by multiple processes. When several processes run the same program at the same time, the operating system only needs to load the code segment into memory once and let all of them share that memory area.

Data segment

The data segment stores data with static storage duration, such as global variables, static local variables, and static constants. While the program runs, the size of the data segment is fixed, but its contents can be modified. Depending on whether a variable is initialized, the data segment is divided into an initialized data segment and an uninitialized data segment.
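
As a minimal sketch of the paragraph above, the following shows three variables that live in static storage. All names are hypothetical, and the exact section each variable ends up in depends on the compiler and platform; the sketch only illustrates the observable behaviour.

```cpp
#include <cassert>

int g_initialized = 42;  // has an initializer: typically stored in the initialized data segment
int g_uninitialized;     // no initializer: zero-initialized, typically stored in the uninitialized data segment

// A static local variable also lives in static storage, so it keeps its
// value between calls instead of being re-created on the stack each time.
int next_id() {
    static int counter = 0;
    return ++counter;
}

// The asserts document the behaviour this sketch is meant to show.
void check_static_storage() {
    assert(g_initialized == 42);
    assert(g_uninitialized == 0);  // static storage is zero-initialized before use
    int first = next_id();
    int second = next_id();
    assert(second == first + 1);   // counter survived between the two calls
}
```

Calling check_static_storage() from main runs the checks; a local (stack) counter in next_id would instead restart from zero on every call.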

Stack

Function calls in C++ and the local variables used inside functions are implemented on the stack. Stack memory is allocated and released automatically as functions are entered and exited, in "last in, first out" order. The size of each stack is limited, typically only a few MB by default, so a stack overflow can easily occur if stack variables are too large or function calls are nested too deeply.

Let's first look at some sample code to see how C++ uses the stack for function calls.

#include <iostream>

void inner(int a) {
    std::cout << a << std::endl;
}
void outer(int n) {
    int a = n + 1;
    inner(a);
}

int main() {
    outer(4);
}

The stack changes while the above code runs are shown in the following figure:
[Figure: image.png]

Whenever a program calls a function, information such as the function's parameters, local variables, and return address is pushed onto the stack. When the function finishes, this information is popped from the stack, and execution jumps to the return address pushed earlier, so the caller continues with the code it has not yet executed.

Local variables are stored directly on the stack. When the function returns, the memory occupied by these variables is released. The local variables in the previous example are simple types, called POD types in C++. Non-POD variables with constructors and destructors can be allocated on the stack as well; the compiler inserts the constructor and destructor calls at the appropriate points.
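
The following sketch illustrates the compiler-inserted constructor and destructor calls for non-POD local variables. Tracer, use_locals, and check_order are hypothetical names introduced only for this illustration; the class records each call in a log so the order can be inspected afterwards.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records each constructor/destructor call in a log.
class Tracer {
public:
    Tracer(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("ctor " + name_);
    }
    ~Tracer() { log_.push_back("dtor " + name_); }

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Two local objects on the stack: constructed in declaration order,
// destroyed in reverse order when the function returns.
void use_locals(std::vector<std::string>& log) {
    Tracer a("a", log);
    Tracer b("b", log);
    log.push_back("body");
}

// The assert documents the order this sketch is meant to show.
void check_order() {
    std::vector<std::string> log;
    use_locals(log);
    const std::vector<std::string> expected{
        "ctor a", "ctor b", "body", "dtor b", "dtor a"};
    assert(log == expected);
}
```

Note that the destructors run in the reverse order of construction, which matches the "last in, first out" behaviour of the stack.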

This raises a question: when an exception occurs while a function is executing, are the destructors still called?
The answer is yes. The mechanism by which C++ calls destructors when an exception occurs is called "stack unwinding". The following code demonstrates it.

#include <iostream>
#include <string>

class Obj {
public:
    std::string name_;
    Obj(const std::string& name) : name_(name) { std::cout << "Obj() " << name_ << std::endl; }
    ~Obj() { std::cout << "~Obj() " << name_ << std::endl; }
};


void bar() {
    auto o = Obj{"bar"};
    throw "bar exception";
}

int main() {
    try {
        bar();
    } catch (const char* e) {
        std::cout << "catch Exception: " << e << std::endl;
    }
}

The result of executing the code is:

Obj() bar
~Obj() bar
catch Exception: bar exception

As the output shows, the local variable o in the bar function is still destructed normally when the exception occurs.

Stack unwinding goes hand in hand with the search for a matching catch clause when an exception occurs.

  1. The program throws an exception, stops executing the current call chain, and starts looking for a catch clause that matches the exception.
  2. If the exception is thrown inside a try block, the catch clauses of that try block are checked first. If the function in which the exception is thrown has no try block around the throw point, the search goes directly to the next step.
  3. If no matching catch is found in step 2, the search continues in the enclosing try blocks, moving outward through the callers, until a match is found.
  4. If no matching catch is found even at the outermost level, that is, the exception cannot be handled, the program calls the standard library function std::terminate to terminate the program.

Once a matching catch clause is found, the local objects in every scope exited between the throw point and that clause are destructed automatically. If no handler is found at all, whether the stack is unwound before std::terminate is called is implementation-defined.
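
The steps above can be sketched with an exception that crosses two function frames before it reaches a handler. Guard, level1, level2, run_unwind_demo, and check_unwind_order are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper that logs its own destruction.
struct Guard {
    std::string name;
    std::vector<std::string>& log;
    Guard(std::string n, std::vector<std::string>& l)
        : name(std::move(n)), log(l) {}
    ~Guard() { log.push_back("~" + name); }
};

void level2(std::vector<std::string>& log) {
    Guard g("level2", log);
    throw std::runtime_error("boom");  // no try block here: the search moves to the caller
}

void level1(std::vector<std::string>& log) {
    Guard g("level1", log);
    level2(log);                       // no matching catch here either
}

// The handler lives two frames above the throw point.
std::vector<std::string> run_unwind_demo() {
    std::vector<std::string> log;
    try {
        level1(log);
    } catch (const std::runtime_error& e) {
        log.push_back(std::string("caught ") + e.what());
    }
    return log;
}

// The assert documents the order this sketch is meant to show.
void check_unwind_order() {
    const std::vector<std::string> expected{"~level2", "~level1", "caught boom"};
    assert(run_unwind_demo() == expected);
}
```

Both Guard objects are destroyed, innermost frame first, before the body of the catch clause runs.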

Heap

The heap is the memory area used for dynamically allocated memory in C++. Heap memory has to be managed manually: it is allocated and released with the new/delete operators or the malloc/free functions. The size of the heap is usually not fixed, so we use it whenever memory needs to be allocated dynamically.

Because heap memory is allocated and released by the programmer, using it requires attention to problems such as memory leaks and running out of memory. A memory leak occurs when the programmer forgets to free allocated memory. An out-of-memory error occurs when the requested heap memory exceeds the limit the operating system allows the process.

The vast majority of memory leaks in C++ programs are due to forgetting to call delete/free to release resources on the heap.

Let's look at some code again.

#include <iostream>
#include <string>

class Obj {
public:
    std::string name_;
    Obj(const std::string& name) : name_(name) { std::cout << "Obj() " << name_ << std::endl; }
    ~Obj() { std::cout << "~Obj() " << name_ << std::endl; }
};

Obj* makeObj() {
    Obj* obj = nullptr;
    try {
        obj = new Obj{"makeObj"};
        ...
    } catch (...) {
        delete obj;  // release the object before passing the exception on
        throw;
    }
    return obj;
}

Obj* foo() {
    Obj* obj = nullptr;
    try {
        obj = makeObj();
        ...
    } catch (...) {
        delete obj;
        obj = nullptr;  // avoid returning a dangling pointer that main would delete again
    }
    return obj;
}

int main() {
    Obj* obj = foo();
    ...
    delete obj;
}

As you can see, every upper-level caller that obtains the heap object created by makeObj has to take care of releasing it. This greatly increases the mental burden on developers.

RAII

What if we want to create objects on the heap without dealing with such complicated release logic? C++ does not provide a garbage collector like Java, Go, and other languages do. Instead, it adopts its own resource management approach: RAII (Resource Acquisition Is Initialization).

RAII relies on the fact that a stack object's destructor is called automatically when its scope ends, and manages resources through stack objects. The resource is acquired in the stack object's constructor and released in its destructor, which guarantees that every acquisition is paired with a release.

The following is an example of releasing heap memory automatically through RAII:

#include <iostream>

class AutoIntPtr {
public:
    AutoIntPtr(int* p = nullptr) : ptr(p) {}
    ~AutoIntPtr() { delete ptr; }

    // Copying would make two objects delete the same pointer, so forbid it.
    AutoIntPtr(const AutoIntPtr&) = delete;
    AutoIntPtr& operator=(const AutoIntPtr&) = delete;

    int& operator*() const { return *ptr; }
    int* operator->() const { return ptr; }

private:
    int* ptr;
};

void foo() {
    AutoIntPtr p(new int(5));
    std::cout << *p << std::endl; // 5
}

int main() {
    foo();
}

In the above example, the AutoIntPtr class wraps a dynamically allocated pointer to int. Its constructor takes ownership of the resource (ptr = p), and its destructor releases it (delete ptr). When an AutoIntPtr object goes out of scope, its destructor is called automatically and frees the resource it holds.

Based on RAII, C++11 introduced smart pointer classes such as std::unique_ptr and std::shared_ptr, which make memory management more convenient and safer. These classes release memory automatically, avoiding the tedious work of releasing it manually. It is worth mentioning that the AutoIntPtr above is a simplified version of a smart pointer.

In actual development, RAII is widely used, and not only for freeing memory automatically. It can also be used to close files, release database connections, release locks, and so on.
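
As a rough sketch of how the standard smart pointers remove the manual delete calls from the earlier makeObj example, consider the following. The live counter, useAndThrow, and check_smart_pointers are additions made only for this illustration so that a leak would be observable; std::make_unique requires C++14.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Same idea as the Obj class above, with a live-instance counter
// (added for this sketch) so that leaks would be observable.
class Obj {
public:
    static int live;
    explicit Obj(std::string name) : name_(std::move(name)) { ++live; }
    ~Obj() { --live; }
    std::string name_;
};
int Obj::live = 0;

// Ownership is part of the return type; no try/catch or delete is needed.
std::unique_ptr<Obj> makeObj() {
    return std::make_unique<Obj>("makeObj");  // requires C++14
}

void useAndThrow() {
    auto obj = makeObj();
    throw std::runtime_error("failure after allocation");
}  // obj's destructor deletes the Obj during stack unwinding

// The asserts document the behaviour this sketch is meant to show.
void check_smart_pointers() {
    try {
        useAndThrow();
    } catch (const std::runtime_error&) {
    }
    assert(Obj::live == 0);  // nothing leaked despite the exception

    {
        std::shared_ptr<Obj> a = makeObj();  // a unique_ptr converts to a shared_ptr
        std::shared_ptr<Obj> b = a;          // shared ownership: two owners
        assert(a.use_count() == 2);
        assert(Obj::live == 1);
    }  // last owner gone: the Obj is deleted
    assert(Obj::live == 0);
}
```

std::unique_ptr models a single owner and cannot be copied, while std::shared_ptr keeps a reference count and deletes the object when the last owner goes away.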
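
For example, the standard library's std::lock_guard applies RAII to locks. The sketch below assumes a hypothetical shared balance protected by a mutex; g_mutex, g_balance, withdraw, and check_lock_released are names made up for this illustration.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

std::mutex g_mutex;   // hypothetical lock for this sketch
int g_balance = 100;  // hypothetical shared state protected by g_mutex

// std::lock_guard locks in its constructor and unlocks in its destructor,
// so the mutex is released on every exit path, including exceptions.
void withdraw(int amount) {
    std::lock_guard<std::mutex> lock(g_mutex);
    if (amount > g_balance) {
        throw std::invalid_argument("insufficient balance");
    }
    g_balance -= amount;
}

// The asserts document the behaviour this sketch is meant to show.
void check_lock_released() {
    withdraw(30);
    assert(g_balance == 70);
    try {
        withdraw(1000);  // throws while the lock is held
    } catch (const std::invalid_argument&) {
    }
    // A second call can take the lock again, which shows that the guard
    // released it while the stack was unwound.
    withdraw(20);
    assert(g_balance == 50);
}
```

With manual lock() and unlock() calls, the early throw would have needed its own unlock(); with the guard, there is no exit path that can forget it.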

Summary

This article introduced the memory management mechanisms in C++, including the memory layout, the stack, the heap, and RAII. With these in mind, we can manage memory in C++ more confidently and avoid problems such as memory leaks and out-of-memory errors.