NumPy: shallow copy and deep copy

Article directory

    • Copies and Views
      • No Copy at All
      • View or Shallow Copy
      • Deep Copy
      • Tips for Conserving Memory
    • Supplementary test
      • Slicing an array returns a view of it:
      • Deep Copy

Copies and Views

When operating and manipulating arrays, their data is sometimes copied into a new array and sometimes not. This is often a source of confusion for beginners. There are three cases:

No Copy at All

Simple assignments make no copy of objects or their data.

>>> import numpy as np
>>> a = np.array([[ 0, 1, 2, 3],
...               [ 4, 5, 6, 7],
...               [ 8, 9, 10, 11]])
>>> b = a # no new object is created
>>> b is a # a and b are two names for the same ndarray object
True

Python passes mutable objects as references, so function calls make no copy.

>>> def f(x):
...     print(id(x))
...
>>> id(a) # id is a unique identifier of an object
148293216 # may vary
>>> f(a)
148293216 # may vary
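
Because the function receives the very same object, an in-place change made inside it is visible to the caller. A minimal sketch, assuming a hypothetical helper zero_first_row (an illustrative name, not part of NumPy):

```python
import numpy as np

def zero_first_row(x):
    # Hypothetical helper: modifies the array it receives in place.
    x[0, :] = 0

arr = np.arange(12).reshape(3, 4)
zero_first_row(arr)          # arr is passed by reference, no copy is made
print(arr[0].tolist())       # [0, 0, 0, 0]
print(arr[1].tolist())       # [4, 5, 6, 7]
```

If the caller's array must stay untouched, the function can work on x.copy() instead.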

View or Shallow Copy

Different array objects can share the same data. The view method creates a new array object that looks at the same data.

>>> c = a.view()
>>> c is a
False
>>> c.base is a # c is a view of the data owned by a
True
>>> c.flags.owndata
False
>>>
>>> c = c.reshape((2, 6)) # a's shape doesn't change
>>> a.shape
(3, 4)
>>> c[0, 4] = 1234 # a's data changes
>>> a
array([[ 0, 1, 2, 3],
       [1234, 5, 6, 7],
       [ 8, 9, 10, 11]])

Slicing an array returns a view of it:

>>> s = a[:, 1:3]
>>> s[:] = 10 # s[:] is a view of s. Note the difference between s = 10 and s[:] = 10
>>> a
array([[ 0, 10, 10, 3],
       [1234, 10, 10, 7],
       [ 8, 10, 10, 11]])
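
The difference between s[:] = 10 and s = 10 can be checked directly. The sketch below uses a fresh array rather than the one from the session above, and np.shares_memory to confirm that the slice is a view:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

s = a[:, 1:3]                    # basic slicing: s is a view of a
print(np.shares_memory(a, s))    # True

s[:] = 10                        # writes through the view into a's data
print(a[0].tolist())             # [0, 10, 10, 3]

s = 10                           # only rebinds the name s to the int 10
print(type(s).__name__)          # int
print(a[0].tolist())             # [0, 10, 10, 3] (a is unchanged by the rebinding)
```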

Deep Copy

The copy method makes a complete copy of the array and its data.

>>> d = a.copy() # a new array object with new data is created
>>> d is a
False
>>> d.base is a # d doesn't share anything with a
False
>>> d[0, 0] = 9999
>>> a
array([[ 0, 10, 10, 3],
       [1234, 10, 10, 7],
       [ 8, 10, 10, 11]])

Tips for Conserving Memory

Sometimes copy should be called after slicing if the original array is no longer needed. For example, suppose a is a huge intermediate result and the final result b contains only a small fraction of a. In that case, make a deep copy when constructing b with slicing:

>>> a = np.arange(int(1e8))
>>> b = a[:100].copy()
>>> del a # the memory of a can be released

If b = a[:100] is used instead, b holds a reference to a, so a persists in memory even after del a is executed.
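
The reference that keeps the large array alive is the slice's base attribute. A small sketch of this, with a one-million-element array standing in for the huge intermediate result (the names big, view and small are illustrative):

```python
import numpy as np

big = np.arange(1_000_000)       # stand-in for a huge intermediate result

view = big[:100]                 # shares big's buffer
small = big[:100].copy()         # owns a fresh 100-element buffer

print(view.base is big)          # True: the view references the whole array
print(small.base is None)        # True: the copy does not reference big
print(view.base.nbytes, small.nbytes)   # bytes kept alive by each
```

After del big, the view would still hold the full buffer through view.base, while small only holds its own 100 elements.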

Supplementary test

  • The following experiments were run in a Jupyter notebook
import numpy as np


a = np.array([[ 0, 1, 2, 3],
               [ 4, 5, 6, 7],
               [ 8, 9, 10, 11]])
b=a
print(a==b)
print(a is b)
print(id(a)==id(b))
 [[ True True True True]
     [ True True True True]
     [ True True True True]]
    True
    True
  • Shallow copy: get a view c of the source object a
c=a.view()

c,type(c)
 (array([[ 0, 1, 2, 3],
            [ 4, 5, 6, 7],
            [ 8, 9, 10, 11]]),
     numpy.ndarray)
  • Both a and c are ndarrays and they share the same data, but they are not the same object
print(c is a, id(c), id(a), c.base is a)
False 2296426462928 2296415339600 True

  • Check the type of c and c.base (both ndarray)
type(c.base),type(c)
(numpy.ndarray, numpy.ndarray)
  • Check whether an ndarray is a view of some other object
c.flags.owndata, a.flags.owndata
# c does not own its data (it comes from another array, here through a call to view), whereas a owns its data
 (False, True)
print(id(c))
c = c.reshape(2,6)
print(id(c))
 2296426462928
    2296415340656
  • Modifying the shape of view c does not affect the shape of source object a
c,a
 (array([[ 0, 1, 2, 3, 4, 5],
            [ 6, 7, 8, 9, 10, 11]]),
     array([[ 0, 1, 2, 3],
            [ 4, 5, 6, 7],
            [ 8, 9, 10, 11]]))
  • Try modifying an element of source object a by modifying an element of view c
c[0,4]=123
# Modifying an element of a view c also modifies the array it views, a (because c.base is a)
# The shape and the data of an ndarray are independent: modifying an element of c changes the corresponding element of a, but changing the shape of c leaves the shape of a unchanged
# Elements of c and a correspond by their position in the flattened (row-major) order
c,a
 (array([[ 0, 1, 2, 3, 123, 5],
            [ 6, 7, 8, 9, 10, 11]]),
     array([[ 0, 1, 2, 3],
            [123, 5, 6, 7],
            [ 8, 9, 10, 11]]))
  • After changing the shape of view c, check again whether c.base is still a
c.base is a
 True
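
The element correspondence between the reshaped view and its source can be made explicit with flat indices. A minimal sketch, assuming the default row-major (C) memory order and a fresh array built with np.arange:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
c = a.view().reshape(2, 6)

# Position (0, 4) in c's shape -> flat offset -> position in a's shape.
flat = int(np.ravel_multi_index((0, 4), c.shape))
row, col = (int(i) for i in np.unravel_index(flat, a.shape))
print(flat, row, col)            # 4 1 0

c[0, 4] = 123                    # write through the view
print(int(a[row, col]))          # 123
```

This is why assigning to c[0, 4] shows up at a[1, 0]: both positions refer to flat offset 4 in the shared buffer.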

Slicing an array returns a view of it:

s = a[:, 1:3]
s[:] = 10 # s[:] is a view of s. Note the difference between s = 10 and s[:] = 10
a,s

 (array([[ 0, 10, 10, 3],
            [123, 10, 10, 7],
            [ 8, 10, 10, 11]]),
     array([[10, 10],
            [10, 10],
            [10, 10]]))
t=a[:,1:3]
t
 array([[10, 10],
           [10, 10],
           [10, 10]])
t=100
t

 100

x=np.arange(12).reshape(4,3)
x,id(x)

 (array([[ 0, 1, 2],
            [ 3, 4, 5],
            [ 6, 7, 8],
            [ 9, 10, 11]]),
     2296338927792)

x=99
x,id(x)

 (99, 2296148266288)

# a="str1"
# b=a
# print(b is a)

 True

Deep Copy

d = a.copy() # a new array object with new data is created
a, d

 (array([[ 0, 10, 10, 3],
            [123, 10, 10, 7],
            [ 8, 10, 10, 11]]),
     array([[ 0, 10, 10, 3],
            [123, 10, 10, 7],
            [ 8, 10, 10, 11]]))

d is a,id(d),id(a)

 (False, 2296338931824, 2296415339600)

d.base, a.base
# Both d.base and a.base are None: the deep copy d and the original a each own their data, so apart from holding equal elements they are fully independent in memory
# d doesn't share anything with a

 (None, None)

c.base # c is a shallow copy (view) of a, so it has a non-None base (c.base is a)

 array([[ 0, 10, 10, 3],
           [123, 10, 10, 7],
           [ 8, 10, 10, 11]])

d.base==a, # d.base is None, so the comparison is False for every element of a

 (array([[False, False, False, False],
            [False, False, False, False],
            [False, False, False, False]]),)

d[1,1]=9999
d,a

 (array([[ 0, 10, 10, 3],
            [ 123, 9999, 10, 7],
            [ 8, 10, 10, 11]]),
     array([[ 0, 10, 10, 3],
            [123, 10, 10, 7],
            [ 8, 10, 10, 11]]))
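
The three cases covered in this article (no copy, view, deep copy) can be told apart programmatically. The sketch below uses a hypothetical helper named relation, which is an illustration and not a NumPy function; it relies only on the is operator and np.shares_memory:

```python
import numpy as np

def relation(x, y):
    # Hypothetical helper for illustration; not part of NumPy.
    if x is y:
        return "same object"
    if np.shares_memory(x, y):
        return "shared data (view)"
    return "independent copy"

a = np.arange(12).reshape(3, 4)
b = a
print(relation(a, b))            # same object
print(relation(a, a.view()))     # shared data (view)
print(relation(a, a[:, 1:3]))    # shared data (view)
print(relation(a, a.copy()))     # independent copy
```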