[STL template library] Simulate the implementation of vector class template {detailed iterator failure problem, deep copy problem of two-dimensional dynamic array}

1. Core structure

template<class T>
class Myvector{<!-- -->
    typedef T *iterator; //[1]
    typedef const T *const_iterator;
  private:
    iterator _start; //point to the beginning of storage space //[2]
    iterator _finish; //point to the next position of the actual storage element
    iterator _end_of_storage; //point to the next position at the end of the storage space

  public:
    size_t size()const{<!-- -->
      return _finish-_start;
    }
    size_t capacity()const{<!-- -->
      return _end_of_storage-_start;
    }
};

explain:

  • [1] For a sequential list iterator is a pointer to its internal elements.
  • [2] The following is a schematic diagram of its structure:

2. Traverse access

2.1 Iterator

 iterator begin(){<!-- -->
      return_start;
    }

    iterator end(){<!-- -->
      return_finish;
    }

    const_iterator begin() const{<!-- -->
      return_start;
    }
    const_iterator end() const{<!-- -->
      return_finish;
    }

2.2 operator[ ]

 T & amp; operator[](size_t pos){<!-- -->
      assert(pos < size());
      return _start[pos];
    }

    const T & amp; operator[](size_t pos) const{<!-- -->
      assert(pos < size());
      return _start[pos];
    }

3. Default constructor

3.1 Structure

Myvector()
      :_start(nullptr), //[1]
      _finish(nullptr),
      _end_of_storage(nullptr)
    {<!-- -->}

    Myvector(size_t n, const T & val = T()) //[2]
      :_start(nullptr),
      _finish(nullptr),
      _end_of_storage(nullptr)
    {<!-- -->
      resize(n, val);
    }
 //The overloaded function of the previous function
    Myvector(int n, const T & val = T()) //[3]
      :_start(nullptr),
      _finish(nullptr),
      _end_of_storage(nullptr)
    {<!-- -->
      resize(n, val);
    }

 

explain:

  • [1] The constructor must empty the three iterators first, otherwise there will be a problem of accessing wild pointers when the space is reserved.
  • [2] The second parameter uses an anonymous object of type T as the default value, which is equivalent to calling the default construction. The generated anonymous objects are const and need to be referenced by const.
  • [2] It can be seen from this that the class we define must provide a default structure, otherwise there will be some problems.
  • [2] C++ has an upgrade to built-in types thanks to templates, which also have default construction. Default construction of built-in types will initialize them to 0
  • [3] The purpose of overloading this function is to solve the calling conflict with the function Myvector(input_iterator first, input_iterator last);.
  • [3] For example: Myvector v1(10, 5); If there is no such overloaded function, since 10(int) does not match the type of size_t(unsigned int), type conversion is required. Therefore, the compiler will give priority to matching Myvector(input_iterator first, input_iterator last);, and interpret 10 and 5 as iterators, thus accessing illegal space and causing the program to crash.

3.2 Copy construction

//Iterator interval copy construction:
    template<class input_iterator> //[1]
     Myvector(input_iterator first, input_iterator last)
        :_start(nullptr),
        _finish(nullptr),
        _end_of_storage(nullptr)
      {<!-- -->
        reserve(last-first);
        while(first!=last)
        {<!-- -->
          *_finish = *first;
           + + _finish;
           + + first;
        }
      }

// Traditional way of writing:
    Myvector(const Myvector<T> &v)
        :_start(nullptr),
        _finish(nullptr),
        _end_of_storage(nullptr)
    {<!-- -->
      reserve(v. size());
      for(size_t i = 0; i<v. size(); + + i)
      {<!-- -->
        _start[i] = v. _start[i];
      }
      _finish = _start + v. size();
    }

// Reuse writing method:
    Myvector(const Myvector<T> &v)
        :_start(nullptr), //[3]
        _finish(nullptr),
        _end_of_storage(nullptr)
    {<!-- -->
      Myvector tmp(v.begin(), v.end()); //[2]
      swap(tmp);
    }
    
//Exchange function:
    void swap(Myvector<T> &v){<!-- -->
      std::swap(_start,v._start);
      std::swap(_finish,v._finish);
      std::swap(_end_of_storage, v._end_of_storage);
    }

explain:

  • [1] This function is written as a template to be compatible with iterators of various containers. Enable vector to copy and construct data of various containers.
  • [2] Reuse the iterator interval copy construction, first construct a temporary object tmp, and then exchange data with the construction object (shallow exchange)
  • [2] After that, tmp is destructed, and the constructed object also gets copied data.
  • [3] Note: Before the exchange, the three iterators of the constructed object must be cleared, otherwise it will crash due to the release of wild pointers when destructing tmp.

3.3 Assignment Overloading & Destruction

 Myvector<T> & amp; operator=(const Myvector<T> & amp;v){<!-- -->
      Myvector tmp(v); //[1]
      swap(tmp);
    }

    ~Myvector(){<!-- -->
      delete[]_start;
    }
  • [1] Reuse copy construction to implement assignment overloading.

4. Capacity operation

4.1 reserve

 void reserve(size_t n){<!-- -->
      if(n > capacity())
      {<!-- -->
        iterator tmp = new T[n];
        size_t sz = size();
        if(_start!=nullptr)
        {<!-- -->
          //memcpy(tmp, _start, sz * sizeof(T)); //[1]
          for(size_t i = 0; i< sz; + + i)
          {<!-- -->
            tmp[i] = _start[i];
          }
          delete[]_start;
        }
        _start = tmp;
        _finish = _start + sz; //[2]
        _end_of_storage = _start + n;
      }
    }

explain:

  • [1] Memcpy cannot be used to copy data here. When the template type T is a custom type and involves dynamic memory application (such as string or list), memcpy can only perform shallow copy.
  • [1] The assignment overload should be called element-by-element to achieve multiple layers of deep copying. The following is a detailed explanation of the deep copy problem of a two-dimensional dynamic array.
  • [2] Open up new space after capacity expansion and release old space. _finish and _end_of_storage should also calculate the new address according to the length, otherwise it will become a wild pointer.

4.2 resize

 void resize(size_t n, const T & amp;val = T()){<!-- -->
      if(n > size())
      {<!-- -->
        reserve(n);
        for(size_t i = size(); i<n; + + i)
        {<!-- -->
          _start[i] = val; //[1]
        }
      }
      _finish = _start + n; //[2]
    }

explain:

  • [1] For custom type elements, the assignment overload is called here.
  • [2] When n > size(), _finish moves backward; when n<=size(), _finish moves forward;

5. Add, delete, check and modify

5.1 push_back & pop_back

 void push_back(const T & amp;val){<!-- -->
      if(_finish == _end_of_storage)
      {<!-- -->
        size_t n = capacity() == 0? 5:capacity()*2;
        reserve(n);
      }
      *_finish = val;
       + + _finish;
    }
    
    void pop_back(){<!-- -->
      assert(_finish != _start);
      --_finish;
    }

5.2 insert

 iterator insert(iterator pos, const T & amp;val){<!-- -->
      assert(pos >= _start);
      assert(pos <= _finish); //[1]
      if(_finish == _end_of_storage)
      {<!-- -->
        size_t n = capacity() == 0? 5:capacity()*2;
        size_t len = pos-_start;
        reserve(n);
        pos = _start + len; //[2]
      }
      iterator end = _finish - 1;
      while(end >= pos)
      {<!-- -->
        *(end + 1) = *end;
        --end;
      }
      *pos = val;
       + + _finish;
      return pos; //[3]
    }

explain:

  • [1] When pos == _finish, it means tail insert data.
  • [2] After expanding and reopening the space, pos still points to the address of the old space and becomes a wild pointer (the iterator becomes invalid). So the position needs to be recalculated based on the length.
  • [3] The insert data pos iterator may be invalid, so the recalculated pos iterator (pointing to the newly inserted data) must be returned and the return value received at the function call before the insert operation can continue.

5.3 erase

 iterator erase(iterator pos){<!-- -->
      assert(pos >= _start);
      assert(pos < _finish);
      iterator tmp = pos + 1; //[1]
      while(tmp < _finish){<!-- -->
        *(tmp-1) = *tmp;
         + + tmp;
      }
      --_finish;
\t
//if(size() < capacity()/2) //[2]
//{<!-- -->
//...
//Scale down -- trade time for space
//}

      return pos; //[3]
    }

explain:

  • [1] This process is to move data forward to overwrite the data to be deleted.
  • [2] It is not ruled out that individual versions of erase will perform shrinking operations. Here, the memory space is re-opened, and the pos iterator may also fail, so it also needs to be recalculated.
  • [3] Returns the element after the deleted element.

6. Two questions about vector

6.1 Iterator failure problem

Insert 10 times this number before all even numbers:

//Wrong way of writing:
void Test1(){<!-- -->
  int arr[] = {<!-- -->1,2,3,4,5};
  Myvector<int> v1(arr, arr + sizeof(arr)/sizeof(arr[0]));
  Myvector<int>::iterator it = v1.begin();
  while(it != v1. end())
  {<!-- -->
    if(*it % 2 == 0)
    {<!-- -->
      v1.insert(it, (*it) * 10); //The return value points to the newly inserted data
    }
     + + it;
  }
    
  for(int e : v1)
  {<!-- -->
    cout << e << " ";
  }
  cout << endl;
}

At this time, the following two iterators may fail, causing the program to crash:

//Correct writing:
void Test1(){<!-- -->
  int arr[] = {<!-- -->1,2,3,4,5};
  Myvector<int> v1(arr, arr + sizeof(arr)/sizeof(arr[0]));
  Myvector<int>::iterator it = v1.begin();
  while(it != v1. end())
  {<!-- -->
    if(*it % 2 == 0)
    {<!-- -->
      it = v1.insert(it, (*it) * 10); //Receive the return value and solve case 1
      it + =2; //Re-judgment from the next even data, solve the situation 2
    }
    else
    {<!-- -->
    + + it;
    }
  }
  //...
}

Delete all even numbers in the sequence:

//Wrong way of writing:
void Test2(){<!-- -->
  int arr[] = {<!-- -->1,2,3,4,5};
  Myvector<int> v1(arr, arr + sizeof(arr)/sizeof(arr[0]));
  Myvector<int>::iterator it = v1.begin();
  while(it != v1. end())
  {<!-- -->
    if(*it % 2 == 0)
    {<!-- -->
      v1.erase(it); //The return value points to the element after the deleted element
    }
       + + it;
  }

  for(int e : v1)
  {<!-- -->
    cout << e << " ";
  }
  cout << endl;
}

operation result:

Interpreting the SGI version results:

  1. The first case seems to work normally, but it is actually because of the accidental arrangement of the data.
  2. In the second case the program crashes. It is because the last number is an even number, and + + it missed _finish after deletion, which eventually caused the out-of-bounds access program to crash.
  3. In the third case, the program can run but the result is wrong. is because after removing the first 2 + + it misses the second 2.
//Correct writing:
void Test2(){<!-- -->
  int arr[] = {<!-- -->1,2,2,3,4,4,4};
  Myvector<int> v1(arr, arr + sizeof(arr)/sizeof(arr[0]));
  Myvector<int>::iterator it = v1.begin();
  while(it != v1. end())
  {<!-- -->
    if(*it % 2 == 0)
    {<!-- -->
      it = v1.erase(it); //The return value points to the element after the deleted element
    }
    else{<!-- -->
       + + it;
    }
  }
  //...
}

in conclusion:

  1. After using insert/erase to insert or delete pos position data. Be sure to receive the return value to update the pos position. If you directly access it, you may encounter wild pointer problems due to capacity expansion.
  2. It is necessary to clarify the position pointed to by the return value of insert/erase, and make appropriate adjustments to the returned updated pos to continue the insert or delete operation.

6.2 Deep copy problem of two-dimensional dynamic array

The topic Yanghui triangle mentioned in the previous section ([STL template library] vector introduction and use 3.2) is an example, if we want to copy the calculated results:

 void Test1(){<!-- -->
  Myvector<Myvector<int>> ret = Solution().generate(5); //copy construction
  for(size_t i = 0; i<ret. size(); + + i)
  {<!-- -->
    for(size_t j = 0; j<ret[i]. size(); + + j)
    {<!-- -->
      cout << ret[i][j] << " ";
    }
    cout << endl;
  }
}

ret is a two-dimensional array of Myvector type, so how does multi-layer deep copy work?

operation result: