Data Structures – Introduction

Table of Contents

  • 1. Basic concepts of data structure
    • 1.1 Common terms
  • 2. Basic concepts of algorithms
    • 2.1 Definition
    • 2.2 Features
    • 2.3 Design requirements
    • 2.4 Measurement of Algorithm Efficiency

1. Basic concepts of data structure

1.1 Common terms

  • Data: The carrier of information, a collection of symbols (including numbers, characters, etc.) that can be input to a computer and recognized and processed by a computer program.
  • Data element: The basic unit of data, consisting of several data items (the smallest indivisible unit).
    For example: student information, as a data element, can be composed of data items such as student number, name, gender, etc.
  • Data Object: A collection of data elements with the same nature.
    For example: natural numbers can be regarded as a kind of data object, N = {0,1,2…}.
  • Data type: A general term consisting of a collection of values and a set of operations defined on that collection.
    • Atomic type: The value cannot be subdivided. Such as int type, common operations include: four arithmetic operations, modulo operation, etc.
    • Structural type: A value can be decomposed into several components, which can be non-structural atomic types or structural types.
    • Abstract Data Type (ADT): A mathematical model abstracted from the problem and a set of operations defined on the mathematical model (common operations are construction, search, modification, etc.). The definition of an abstract data type depends only on its logical characteristics, and has nothing to do with how it is represented and implemented inside the computer.
      ADT <abstract data type name> {
          Data objects: <definition of data objects>
          Data relations: <definition of data relations>
          Basic operations: <definitions of basic operations>
      }
      
  • Data structure: A collection of data elements that have one or more specific relationships with each other. It includes three aspects (three elements): logical structure, storage structure (physical structure), and operation rules.
    • Logical structure: Refers to the logical relationship between data elements, such as linear and nonlinear.
    • Storage structure: Refers to the actual representation of data elements in the computer, including the representation of data elements themselves and the representation of logical relationships between elements. Common storage structures: sequential storage, chain storage, index storage, hash storage.
    • Operation rules: Operations for data elements, including definition and implementation. The definition refers to the logical structure, pointing out the required operation function; the implementation refers to the storage structure, pointing out the specific operation steps of the operation function.
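As a concrete illustration of the three elements, here is a minimal sketch (the names and the fixed capacity are illustrative, not from the text) of a sequential stack in C: the logical structure is a linear sequence, the storage structure is an array, and the operations are push and pop defined on top of that storage.

```c
#include <stdbool.h>

/* A sequential stack: the logical structure (a linear sequence with
 * insertion and deletion at one end) is realized with array storage. */
#define STACK_CAP 100

typedef struct {
    int data[STACK_CAP]; /* storage for the data elements */
    int top;             /* data relation: data[0..top-1], bottom to top */
} Stack;

void stack_init(Stack *s) { s->top = 0; }

bool stack_empty(const Stack *s) { return s->top == 0; }

/* Push x onto the stack; fails if the array storage is full. */
bool stack_push(Stack *s, int x) {
    if (s->top == STACK_CAP) return false;
    s->data[s->top++] = x;
    return true;
}

/* Pop the top element into *x; fails if the stack is empty. */
bool stack_pop(Stack *s, int *x) {
    if (stack_empty(s)) return false;
    *x = s->data[--s->top];
    return true;
}
```

The same logical structure could instead use chain storage (a linked list); the operation definitions would not change, only their implementation.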

2. Basic concepts of algorithms

2.1 Definition

  • An algorithm is a description of the steps to solve a particular problem, represented in a computer as a finite sequence of instructions, where each instruction represents one or more operations.

2.2 Features

  • Finiteness: An algorithm must terminate after a finite number of steps, and each step must complete in finite time.
  • Determinacy: Each instruction in the algorithm must have an exact meaning, so the same input always produces the same output.
  • Feasibility: Every operation the algorithm describes can be realized by a finite number of executions of basic operations that have already been implemented.
  • Input and output: An algorithm has zero or more inputs and one or more outputs.

2.3 Design requirements

  • Correctness: The algorithm should correctly reflect the problem and lead to the correct answer.
  • Readability: Algorithms should be well readable for easy understanding and communication.
  • Robustness: Algorithms should be able to react or handle illegal inputs appropriately, rather than producing inexplicable results.
  • Efficiency and low storage: The algorithm should run in as little time and use as little storage space as possible.

2.4 Measurement of Algorithm Efficiency

  • Time complexity: A measure of an algorithm's running time, denoted T(n) = O(f(n)), where T(n) is the total number of statement executions, a function of the problem size n, and f(n) is some function of n such that, as n grows, the algorithm's running time grows at the same rate as f(n).
    • Best-case time complexity: The running time of the algorithm on the most favorable input.
    • Average-case time complexity: The expected running time of the algorithm when all inputs occur with equal probability.
    • Worst-case time complexity: The running time of the algorithm on the least favorable input.
      In general, the worst-case time complexity is used, since it guarantees that the algorithm's running time will never exceed this bound.
      Common method: express the number of executions x of the statement in the deepest loop as a function of the problem size n, then solve for x. For example:
      int function(int n)
      {
          int i = 1;
          /* i doubles each pass: 1, 2, 4, ..., until it exceeds n */
          while (i <= n) {
              i *= 2;
          }

          return i;
      }
      

      First establish the relationship between the number of executions x of the innermost statement i *= 2 and the problem size n:
      2^x = n
      Then solve for x:
      x = log₂n
      so the time complexity of the algorithm is O(log₂n).
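The same counting method applies to nested loops. In this sketch (the function name is illustrative), the deepest statement runs once per (i, j) pair, so x = n × n and the complexity is O(n²):

```c
/* Counts executions of the deepest statement in a doubly nested loop:
 * it runs exactly n * n times, giving time complexity O(n^2). */
long count_pairs(int n)
{
    long count = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            count++;  /* deepest statement: executed n * n times */
        }
    }
    return count;
}
```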

  • Addition and multiplication rules
    • Addition rule:
      T(n) = T₁(n) + T₂(n) = O(f(n)) + O(g(n)) = O(max(f(n), g(n)))

    • Multiplication rule:
      T(n) = T₁(n) × T₂(n) = O(f(n)) × O(g(n)) = O(f(n) × g(n))
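A small sketch (names illustrative) showing both rules at once: two consecutive code fragments add their costs, while nesting multiplies them.

```c
/* First loop: n steps. Second, doubly nested loop: n * n steps by the
 * multiplication rule. Running one after the other costs n + n^2 steps,
 * which the addition rule collapses to O(max(n, n^2)) = O(n^2). */
long combined_cost(int n)
{
    long steps = 0;
    for (int i = 0; i < n; i++)
        steps++;                      /* T1(n) = n */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            steps++;                  /* T2(n) = n * n */
    return steps;                     /* exactly n + n^2 */
}
```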

  • Common time complexity comparison:

    O(1) < O(log₂n) < O(n) < O(n log₂n) < O(n^k) < O(2^n) < O(n!) < O(n^n)

  • Space complexity: A measure of the storage space an algorithm requires, denoted S(n) = O(g(n)), where S(n) is the storage space used by the algorithm, a function of the problem size n. An algorithm whose space complexity is O(1) is said to work in place: the auxiliary space it needs is constant.
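For example, reversing an array can be done in place. This sketch (function name illustrative) uses only a constant number of auxiliary variables regardless of n, so S(n) = O(1):

```c
/* Reverses a[0..n-1] in place: only i, j, and tmp are auxiliary storage,
 * a constant amount independent of n, so the space complexity is O(1). */
void reverse_in_place(int a[], int n)
{
    for (int i = 0, j = n - 1; i < j; i++, j--) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

By contrast, copying the elements into a second array of size n before writing them back would need O(n) auxiliary space.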