Problems with foreach in PHP

The foreach structure was introduced in PHP 4 as a simple way to traverse an array. Compared with the traditional for loop, foreach obtains key-value pairs more conveniently. Before PHP 5, foreach could only be used on arrays; since PHP 5, it can also traverse objects (see: Traversing Objects for details). This article only discusses array traversal.

Although foreach is simple, it may have some unexpected behavior, especially if the code involves references.

Several cases are listed below to help us further understand the nature of foreach.

Question 1
$arr = array(1,2,3);

foreach($arr as $k => &$v) {
    $v = $v * 2;
}
// now $arr is array(2, 4, 6)

foreach($arr as $k => $v) {
    echo "$k", " => ", "$v";
}
Let’s start simple. If we try to run the above code, we will find that the final output is 0=>2 1=>4 2=>4.

Why not 0=>2 1=>4 2=>6?

In fact, we can think of the foreach($arr as $k => $v) structure as implying the following operations: the current key and the current value of the array are assigned to the variables $k and $v respectively. The specific expansion is as follows:

foreach($arr as $k => $v){
//Two assignment operations are implicit before the user code is executed.
$v = currentVal();
$k = currentKey();

//Continue running user code
…

}
Based on the above theory, now we re-analyze the first foreach:

The first loop: because $v is a reference, $v = &$arr[0], so $v = $v * 2 is equivalent to $arr[0] = $arr[0] * 2, and $arr becomes 2,2,3

The second loop: $v = &$arr[1], $arr becomes 2,4,3

The third loop: $v = &$arr[2], $arr becomes 2,4,6

Then the code enters the second foreach:

The first loop: the implicit operation $v = $arr[0] is triggered. Because $v is still a reference to $arr[2] at this time, this is equivalent to $arr[2] = $arr[0], and $arr becomes 2,4,2

The second loop: $v = $arr[1], that is, $arr[2] = $arr[1], and $arr becomes 2,4,4

The third loop: $v = $arr[2], that is, $arr[2] = $arr[2], and $arr stays 2,4,4

OK, the analysis is completed.
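The walkthrough above can be checked with a small sketch. It is not from the original article: the $snapshots array is a hypothetical helper that records the contents of $arr after each implicit assignment in the second loop.

```php
<?php
$arr = array(1, 2, 3);

foreach ($arr as $k => &$v) {
    $v = $v * 2;
}
// $v is still a reference to $arr[2] at this point.

$snapshots = array(); // hypothetical helper, used only for illustration
foreach ($arr as $k => $v) {
    // The implicit "$v = current value" writes through to $arr[2].
    $snapshots[] = implode(',', $arr);
}

print_r($snapshots); // 2,4,2 then 2,4,4 then 2,4,4
```

Each snapshot matches one step of the analysis: the last element is overwritten by the first element, then by the second, and finally by itself.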

How to solve similar problems? There is a reminder in the PHP manual:

Warning: The $value reference of the last element of the array remains after the foreach loop. It is recommended to use unset() to destroy it.
$arr = array(1,2,3);

foreach($arr as $k => &$v) {
    $v = $v * 2;
}
unset($v);

foreach($arr as $k => $v) {
    echo "$k", " => ", "$v";
}
// Output 0=>2 1=>4 2=>6
As we can see from this question, references are likely to be accompanied by side effects. If you don’t want unintentional modifications to change the contents of the array, it is best to unset these references in time.
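If the reference is not really needed, one way to sidestep the problem is to write back through the key instead. This is only a sketch of an alternative, not something the manual prescribes, and the $out array is a hypothetical helper used to build the output line.

```php
<?php
$arr = array(1, 2, 3);

// No reference is created: each new value is written back through its key.
foreach ($arr as $k => $v) {
    $arr[$k] = $v * 2;
}

$out = array(); // hypothetical helper for formatting the output
foreach ($arr as $k => $v) {
    $out[] = "$k=>$v";
}
echo implode(' ', $out), "\n"; // 0=>2 1=>4 2=>6
```

Since $v is never a reference here, nothing is left over to unset after the loop.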

Question 2
$arr = array('a','b','c');

foreach($arr as $k => $v) {
    echo key($arr), "=>", current($arr);
}

// print 1=>b 1=>b 1=>b
This question is even weirder. According to the manual, key() and current() return the key and the value of the current element of the array, that is, the element its internal pointer points to.

Then why is key($arr) always 1, and why is current($arr) always b?

First use vld to view the compiled opcode:

Let’s start with the ASSIGN instruction on line 3, which assigns array('a', 'b', 'c') to $arr.

Since $arr is a CV and array('a', 'b', 'c') is a TMP, the function that actually executes the ASSIGN instruction is ZEND_ASSIGN_SPEC_CV_TMP_HANDLER. It should be pointed out here that CV is a variable cache added as of PHP 5.1. It uses an array to store zval** pointers. When a cached variable is used again, there is no need to search the active symbol table; the pointer is fetched directly from the CV array. Since accessing an array is much faster than looking up a hash table, efficiency improves.

static int ZEND_FASTCALL ZEND_ASSIGN_SPEC_CV_TMP_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    zend_op *opline = EX(opline);
    zend_free_op free_op2;
    zval *value = _get_zval_ptr_tmp(&opline->op2, EX(Ts), &free_op2 TSRMLS_CC);

    // Create the zval** pointer for $arr in the CV array
    zval **variable_ptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_W TSRMLS_CC);

    if (IS_CV == IS_VAR && !variable_ptr_ptr) {
        …
    }
    else {
        // Assign array to $arr
        value = zend_assign_to_variable(variable_ptr_ptr, value, 1 TSRMLS_CC);
        if (!RETURN_VALUE_UNUSED(&opline->result)) {
            AI_SET_PTR(EX_T(opline->result.u.var).var, value);
            PZVAL_LOCK(value);
        }
    }

    ZEND_VM_NEXT_OPCODE();
}
After the ASSIGN instruction is completed, the zval** pointer is added to the CV array, and the pointer points to the actual array, which means that $arr has been cached by CV.

Next, perform the loop operation of the array. Let’s look at the FE_RESET instruction. Its corresponding execution function is ZEND_FE_RESET_SPEC_CV_HANDLER:

static int ZEND_FASTCALL ZEND_FE_RESET_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    if (…) {
        …
    } else {
        // Get the pointer to the array through the CV array
        array_ptr = _get_zval_ptr_cv(&opline->op1, EX(Ts), BP_VAR_R TSRMLS_CC);
    }

    // Save the pointer to the array into zend_execute_data->Ts (Ts stores the temp_variables used during code execution)
    AI_SET_PTR(EX_T(opline->result.u.var).var, array_ptr);
    PZVAL_LOCK(array_ptr);

    if (iter) {
        …
    } else if ((fe_ht = HASH_OF(array_ptr)) != NULL) {
        // Reset the internal pointer of the array
        zend_hash_internal_pointer_reset(fe_ht);
        if (ce) {
            …
        }
        is_empty = zend_hash_has_more_elements(fe_ht) != SUCCESS;

        // Set EX_T(opline->result.u.var).fe.fe_pos to save the internal pointer of the array
        zend_hash_get_pointer(fe_ht, &EX_T(opline->result.u.var).fe.fe_pos);
    } else {
        …
    }
    …
}
Here, two important pointers are mainly stored in zend_execute_data->Ts:

EX_T(opline->result.u.var).var -- pointer to the array
EX_T(opline->result.u.var).fe.fe_pos -- pointer to the internal element of the array
After the FE_RESET instruction is executed, the actual situation in the memory is as follows:

Next we continue to look at FE_FETCH, whose corresponding execution function is ZEND_FE_FETCH_SPEC_VAR_HANDLER:

static int ZEND_FASTCALL ZEND_FE_FETCH_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    zend_op *opline = EX(opline);

    // Note that the pointer is obtained from EX_T(opline->op1.u.var).var.ptr
    zval *array = EX_T(opline->op1.u.var).var.ptr;
    …

    switch (zend_iterator_unwrap(array, &iter TSRMLS_CC)) {
        default:
        case ZEND_ITER_INVALID:
            …

        case ZEND_ITER_PLAIN_OBJECT: {
            …
        }

        case ZEND_ITER_PLAIN_ARRAY:
            fe_ht = HASH_OF(array);

            // Note:
            // the FE_RESET instruction saved the pointer to the internal element of the array in EX_T(opline->op1.u.var).fe.fe_pos.
            // Restore that pointer here
            zend_hash_set_pointer(fe_ht, &EX_T(opline->op1.u.var).fe.fe_pos);

            // Get the value of the element
            if (zend_hash_get_current_data(fe_ht, (void **) &value)==FAILURE) {
                ZEND_VM_JMP(EX(op_array)->opcodes + opline->op2.u.opline_num);
            }
            if (use_key) {
                key_type = zend_hash_get_current_key_ex(fe_ht, &str_key, &str_key_len, &int_key, 1, NULL);
            }

            // Move the internal pointer of the array to the next element
            zend_hash_move_forward(fe_ht);

            // Save the moved pointer back to EX_T(opline->op1.u.var).fe.fe_pos
            zend_hash_get_pointer(fe_ht, &EX_T(opline->op1.u.var).fe.fe_pos);
            break;

        case ZEND_ITER_OBJECT:
            …
    }

    …
}
From the implementation of FE_FETCH, we can roughly understand what foreach($arr as $k => $v) does. It fetches the array element based on the pointer saved in zend_execute_data->Ts. After the fetch succeeds, it moves the pointer to the next position and saves it again.

To put it simply, since FE_FETCH has already moved the internal pointer of the array to the second element in the first loop, when key($arr) and current($arr) are called inside foreach, what is actually obtained is 1 and 'b'.

Then why is 1=>b output three times?

Next, look at the SEND_REF instructions on lines 9 and 13, which push the $arr argument onto the stack. The DO_FCALL instructions that follow then call the key and current functions. PHP is not compiled into native machine code, so it uses opcode instructions like these to simulate how an actual CPU and memory work.

Check out SEND_REF in the PHP source code:

static int ZEND_FASTCALL ZEND_SEND_REF_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    // Get the pointer to the $arr pointer from CV
    varptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_W TSRMLS_CC);

    // Variable separation: a new copy of the array is made here specifically for the key function
    SEPARATE_ZVAL_TO_MAKE_IS_REF(varptr_ptr);
    varptr = *varptr_ptr;
    Z_ADDREF_P(varptr);

    // Push onto the stack
    zend_vm_stack_push(varptr TSRMLS_CC);

    ZEND_VM_NEXT_OPCODE();
}
SEPARATE_ZVAL_TO_MAKE_IS_REF in the above code is a macro:

#define SEPARATE_ZVAL_TO_MAKE_IS_REF(ppzv)  \
    if (!PZVAL_IS_REF(*ppzv)) {             \
        SEPARATE_ZVAL(ppzv);                \
        Z_SET_ISREF_PP((ppzv));             \
    }
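The main job of SEPARATE_ZVAL_TO_MAKE_IS_REF is to make a new copy of the value in memory if the variable is not a reference and its value is still shared (SEPARATE_ZVAL only copies when the refcount is greater than 1). In this example the array is also held by foreach through zend_execute_data->Ts, so array('a', 'b', 'c') is copied. Therefore, the memory after variable separation is: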

Note that after the variable separation is completed, the pointer in the CV array points to the newly copied data, and the old data can still be obtained through the pointer in zend_execute_data->Ts.

I won’t go into details about the following loops one by one. Let’s take a look at the picture above:

The foreach structure uses the blue array below, which will traverse a, b, c in sequence.
Key and current use the yellow array above, and its internal pointer always points to b.
At this point we understand why key and current always return the second element of the array. Since no external code acts on the copied array, its internal pointer will never move.
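The analysis above follows the PHP 5 engine source quoted in this article. As far as we know, from PHP 7.0 on a by-value foreach no longer moves the array's internal pointer, so the same code is expected to report key 0 and value 'a' in every iteration there. The following sketch prints what the running version actually does; the $seen array is a hypothetical helper, and no single output is claimed for all versions.

```php
<?php
$arr = array('a', 'b', 'c');

$seen = array(); // hypothetical helper that records each observation
foreach ($arr as $k => $v) {
    // Record what the internal pointer of $arr reports in this iteration.
    $seen[] = var_export(key($arr), true) . '=>' . var_export(current($arr), true);
}

// The result depends on the PHP version, so it is printed next to it.
echo PHP_VERSION, ': ', implode(' ', $seen), "\n";
```

Running this on the version in question is a quick way to tell whether the internals described in this article still apply.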

Question 3
$arr = array('a','b','c');

foreach($arr as $k => &$v) {
    echo key($arr), '=>', current($arr);
}
// print 1=>b 2=>c =>
There is only one difference between this question and Question 2: the foreach in this question iterates by reference. Checking with VLD shows that the compiled opcodes are the same as in Question 2, so we use the same tracing method and step through the implementation of each opcode.

First foreach will call FE_RESET:

static int ZEND_FASTCALL ZEND_FE_RESET_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    if (opline->extended_value & ZEND_FE_RESET_VARIABLE) {
        // Get variables from CV
        array_ptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_R TSRMLS_CC);
        if (array_ptr_ptr == NULL || array_ptr_ptr == &EG(uninitialized_zval_ptr)) {
            …
        }
        else if (Z_TYPE_PP(array_ptr_ptr) == IS_OBJECT) {
            …
        }
        else {
            // The case of traversing an array
            if (Z_TYPE_PP(array_ptr_ptr) == IS_ARRAY) {
                SEPARATE_ZVAL_IF_NOT_REF(array_ptr_ptr);
                if (opline->extended_value & ZEND_FE_FETCH_BYREF) {
                    // Set the zval that holds the array to is_ref
                    Z_SET_ISREF_PP(array_ptr_ptr);
                }
            }
            array_ptr = *array_ptr_ptr;
            Z_ADDREF_P(array_ptr);
        }
    } else {
        …
    }
}
Part of the implementation of FE_RESET has been analyzed in Question 2. Special attention needs to be paid here. In this example, foreach uses a reference to obtain the value, so during execution, FE_RESET will enter another branch different from the previous question.

Eventually, FE_RESET will set the is_ref of the array to true, and there will only be one copy of the array’s data in the memory.

Next analyze SEND_REF:

static int ZEND_FASTCALL ZEND_SEND_REF_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    // Get the pointer to the $arr pointer from CV
    varptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_W TSRMLS_CC);

    // Variable separation: since the variable in the CV is already a reference at this time, no new array is copied here
    SEPARATE_ZVAL_TO_MAKE_IS_REF(varptr_ptr);
    varptr = *varptr_ptr;
    Z_ADDREF_P(varptr);

    // Push onto the stack
    zend_vm_stack_push(varptr TSRMLS_CC);

    ZEND_VM_NEXT_OPCODE();
}
The macro SEPARATE_ZVAL_TO_MAKE_IS_REF only separates variables with is_ref=false. Since array has been set to is_ref=true before, it will not be copied. In other words, there is still only one copy of array data in the memory at this time.

The above figure explains why the first two loops output 1=>b 2=>c. In the third loop, FE_FETCH moves the pointer forward once more.

ZEND_API int zend_hash_move_forward_ex(HashTable *ht, HashPosition *pos)
{
    HashPosition *current = pos ? pos : &ht->pInternalPointer;

    IS_CONSISTENT(ht);

    if (*current) {
        *current = (*current)->pListNext;
        return SUCCESS;
    } else
        return FAILURE;
}
Since the internal pointer already points to the last element of the array at this time, moving forward makes it point to NULL. When key and current are then called on the array, they return NULL and false respectively, indicating that the call failed. Both are echoed as empty strings, so the third iteration prints only =>.
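The behavior of key and current once the internal pointer has moved past the end can be reproduced without foreach. The sketch below moves the pointer by hand with end() and next(); the variable names $lastKey, $lastVal, $pastKey and $pastVal are only illustrative.

```php
<?php
$arr = array('a', 'b', 'c');

end($arr);                  // move the internal pointer to the last element
$lastKey = key($arr);       // 2
$lastVal = current($arr);   // 'c'

next($arr);                 // move past the end of the array
$pastKey = key($arr);       // NULL
$pastVal = current($arr);   // false

var_dump($lastKey, $lastVal, $pastKey, $pastVal);

// echo turns both NULL and false into empty strings, so only "=>" appears.
echo $pastKey, '=>', $pastVal, "\n";
```

This matches the third iteration in the question, where the output has nothing on either side of the arrow.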

Question 4
$arr = array(1, 2, 3);
$tmp = $arr;
foreach($tmp as $k => &$v){
    $v *= 2;
}
var_dump($arr, $tmp); // What to print?
This question has little to do with foreach, but since it involves foreach, let’s discuss it together 🙂

The code first creates an array $arr and then assigns that array to $tmp. In the foreach loop that follows, modifying $v takes effect on the array $tmp, but does not affect $arr.

Why?

This is because in PHP, the assignment operation copies the value of one variable to another variable, so modifying one of them will not affect the other.

Off topic: this does not apply to objects. Starting from PHP 5, a variable holds only a handle to an object, so assignment copies the handle and both variables end up referring to the same object. For example:

class A{
    public $foo = 1;
}
$a1 = $a2 = new A;
$a1->foo=100;
echo $a2->foo; // Output 100, $a1 and $a2 are actually references to the same object
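For contrast, the following sketch (our own addition, not part of the original example) puts plain assignment next to clone, which is the usual way to get an independent object. The variable $a3 is hypothetical.

```php
<?php
class A {
    public $foo = 1;
}

$a1 = new A;
$a2 = $a1;        // copies the handle: both variables refer to the same object
$a3 = clone $a1;  // creates a separate object with the same property values

$a1->foo = 100;

echo $a2->foo, "\n"; // 100
echo $a3->foo, "\n"; // 1
```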
Returning to the code in the question, we can now tell that $tmp = $arr is actually a value copy: the entire $arr array is copied and given to $tmp. Theoretically, after the assignment statement is executed, there will be two copies of the same array in memory.

Some students may wonder, if the array is large, wouldn’t this operation be very slow?

Fortunately, PHP has a smarter way to deal with it. In fact, after $tmp = $arr is executed, there is still only one array in memory. See the zend_assign_to_variable implementation in the PHP source code (excerpted from PHP 5.3.26):

static inline zval* zend_assign_to_variable(zval **variable_ptr_ptr, zval *value, int is_tmp_var TSRMLS_DC)
{
    zval *variable_ptr = *variable_ptr_ptr;
    zval garbage;

    // The lvalue is of object type
    if (Z_TYPE_P(variable_ptr) == IS_OBJECT && Z_OBJ_HANDLER_P(variable_ptr, set)) {
        …
    }
    // When the lvalue is a reference
    if (PZVAL_IS_REF(variable_ptr)) {
        …
    } else {
        // The case where the lvalue has refcount__gc=1
        if (Z_DELREF_P(variable_ptr)==0) {
            …
        } else {
            GC_ZVAL_CHECK_POSSIBLE_ROOT(*variable_ptr_ptr);
            // Non-temporary variables
            if (!is_tmp_var) {
                if (PZVAL_IS_REF(value) && Z_REFCOUNT_P(value) > 0) {
                    ALLOC_ZVAL(variable_ptr);
                    *variable_ptr_ptr = variable_ptr;
                    *variable_ptr = *value;
                    Z_SET_REFCOUNT_P(variable_ptr, 1);
                    zval_copy_ctor(variable_ptr);
                } else {
                    // $tmp = $arr runs here.
                    // value is the pointer to the actual array data in $arr; variable_ptr_ptr is the pointer to the data pointer in $tmp.
                    // Only the pointer is copied; the actual array is not copied.
                    *variable_ptr_ptr = value;
                    // Increment the refcount__gc of value; in this example it is 1, and after Z_ADDREF_P it is 2
                    Z_ADDREF_P(value);
                }
            } else {
                …
            }
        }
        Z_UNSET_ISREF_PP(variable_ptr_ptr);
    }

    return *variable_ptr_ptr;
}
As we can see, the essence of $tmp = $arr is to copy the pointer to the array and then increase the refcount of the array by 1. Drawn as a diagram, the memory at this point still contains only one copy of the array:

Since there is only one array, why does $arr not change when $tmp is modified in the foreach loop?

Continue to look at the ZEND_FE_RESET_SPEC_CV_HANDLER function in the PHP source code. This is an OPCODE HANDLER, and its corresponding OPCODE is FE_RESET. This function is responsible for setting the array’s internal pointer to its first element before foreach begins.

static int ZEND_FASTCALL ZEND_FE_RESET_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
zend_op *opline = EX(opline);

zval *array_ptr, **array_ptr_ptr;
HashTable *fe_ht;
zend_object_iterator *iter = NULL;
zend_class_entry *ce = NULL;
zend_bool is_empty = 0;

// Perform FE_RESET on variables
if (opline->extended_value & ZEND_FE_RESET_VARIABLE) {
    array_ptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_R TSRMLS_CC);
    if (array_ptr_ptr == NULL || array_ptr_ptr == &EG(uninitialized_zval_ptr)) {
        …
    }
    // foreach an object
    else if (Z_TYPE_PP(array_ptr_ptr) == IS_OBJECT) {
        …
    }
    else {
        // This example will enter the branch
        if (Z_TYPE_PP(array_ptr_ptr) == IS_ARRAY) {
            // Note SEPARATE_ZVAL_IF_NOT_REF here
            // It will copy an array again
            //Really separate $tmp and $arr and turn them into two arrays in memory
            SEPARATE_ZVAL_IF_NOT_REF(array_ptr_ptr);
            if (opline->extended_value & ZEND_FE_FETCH_BYREF) {
                Z_SET_ISREF_PP(array_ptr_ptr);
            }
        }
        array_ptr = *array_ptr_ptr;
        Z_ADDREF_P(array_ptr);
    }
} else {
    …
}

//Reset the internal pointer of the array
…

}
As can be seen from the code, the actual execution of variable separation is not when the assignment statement is executed, but is postponed to when the variable is used. This is also the implementation of the Copy On Write mechanism in PHP.
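The observable effect of this mechanism can be confirmed from user code. The following is a minimal sketch of the question's code with the leftover reference cleaned up; it shows only the result of the separation, not the refcount changes inside the engine.

```php
<?php
$arr = array(1, 2, 3);
$tmp = $arr;                // no array data is copied at this point

// The by-reference foreach separates $tmp from $arr before any element changes.
foreach ($tmp as $k => &$v) {
    $v *= 2;
}
unset($v);                  // drop the leftover reference, as in Question 1

var_dump($arr);             // still 1, 2, 3
var_dump($tmp);             // 2, 4, 6
```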

After FE_RESET, the memory changes are as follows:

The above figure explains why foreach does not affect the original $arr. As for the changes in ref_count and is_ref, interested students can read in detail the specific implementations of ZEND_FE_RESET_SPEC_CV_HANDLER and ZEND_SWITCH_FREE_SPEC_VAR_HANDLER (both located in php-src/zend/zend_vm_execute.h). This article will not provide a detailed analysis:)

Author: 9335 Game Network, please indicate the original link for reprinting: https://www.clw9335.com/