Computer algorithm analysis and design (12)—Greedy algorithm (optimal loading problem and Huffman coding problem)

Article directory

  • 1. Optimal loading problem
    • 1.1 Problem statement
    • 1.2 Code writing
  • 2. Huffman coding
    • 2.1 Overview of Huffman coding
    • 2.2 Prefix code
    • 2.3 Problem description
    • 2.4 Code ideas
    • 2.5 Code writing
    • 2.6 Average code length
    • 2.7 Examples
      • 2.7.1 Question 1
      • 2.7.2 Question 2

1. Optimal loading problem

1.1 Problem statement

1. There is a batch of containers to be loaded onto a ship with carrying capacity c. The weight of container i (1 ≤ i ≤ n) is w_i. The optimal loading problem asks to load as many containers as possible onto the ship; there is no limit on loading volume.

2. Greedy selection strategy: load the lightest container first.

3. Algorithm idea: divide the loading process into a sequence of selections. Each step loads 1 container, choosing the lightest of the remaining ones. This continues until all containers are loaded or the ship cannot hold any more.

1.2 Code writing

The time complexity of the algorithm is O(n log n), dominated by the sort.

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

int main() {
    int n; // number of containers
    int c; // maximum carrying capacity of the ship
    cout << "Enter the number of containers and the maximum carrying capacity of the ship" << endl;
    cin >> n >> c;

    cout << "Enter the weight of each container" << endl;
    vector<int> w(n); // weights of the containers
    for (int i = 0; i < n; i++) {
        cin >> w[i];
    }

    sort(w.begin(), w.end()); // sort the container weights in ascending order

    int total = 0; // accumulated weight loaded so far
    int count = 0; // number of containers loaded
    for (int i = 0; i < n; i++) {
        total += w[i];
        if (total <= c) {
            count++;
        } else {
            break;
        }
    }

    cout << "The maximum number of containers that can be loaded is " << count << endl;
    return 0;
}

2. Huffman coding

2.1 Overview of Huffman coding

1. Huffman coding is a classic application in telecommunications and a very effective method for data file compression; its compression rate is usually between 20% and 90%. In telecommunication services, binary codes are usually used to represent letters or other characters, and sequences of such codes represent character sequences.

2. For example, suppose the message to be sent is 'ABACCDA'. It uses only four characters, which can be distinguished by two-bit binary codes. If A, B, C, and D are encoded as 00, 01, 10, and 11 respectively, the message becomes '00010010101100' (14 bits in total), and the decoder can restore the original message by decoding two bits at a time.

2.2 Prefix code

1. Prefix code definition: each character is assigned a 0/1 string as its code, with the requirement that no character's code is a prefix of any other character's code. Such an encoding is called a prefix code.

Character   a  b    c    d    e     f
Encoding 1  0  101  100  111  1101  1100
Encoding 2  0  1    00   01   10    11

It is easy to see that if we use encoding 2, decoding becomes ambiguous: for example, 00110 can be decoded as aabba but also as cfa. Since the decoding result is not unique, encoding 2 is not feasible.
With a prefix code (no character's code is a prefix of another's), the decoding result is unique and the encoding method is feasible.

2.3 Problem description

1. If we want the optimal encoding method, how do we compare different prefix codes? We compare the total length of the encoded binary string: the shorter the total length, the better the encoding. The total length depends on the frequencies of the characters to be encoded. Suppose we are given the characters, their frequencies, and two different prefix codes, as shown in the table below.

Character              a     b     c     d    e     f
Frequency (thousands)  45    13    12    16   9     5
Prefix code 1          0     101   100   111  1101  1100
Prefix code 2          1100  1101  111   100  101   0

Calculation shows that with prefix code 1 the encoded binary string has total length 224000 bits, while with prefix code 2 it has total length 348000 bits. Clearly, prefix code 1 is better than prefix code 2.

2. Greedy strategy: characters that appear more frequently receive shorter codes, and characters that appear less frequently receive longer codes. A binary tree is used to encode and decode.

2.4 Code ideas

1. We are given the encoding character set C and, for each character c in C, its frequency f(c). A prefix-code scheme for C corresponds to a binary tree T. The depth of character c in tree T is denoted d_T(c). The average code length of this encoding is defined as:

B(T) = Σ_{c ∈ C} f(c) · d_T(c)

2. The Huffman algorithm constructs the binary tree T representing an optimal prefix code in a bottom-up manner. It starts from n leaf nodes and performs n−1 merge operations, after which the required tree T is produced. In a Huffman tree, each character c of the encoded character set has frequency f(c). A priority queue Q keyed on f is used for the greedy selection: it efficiently identifies the two trees of minimum frequency that are to be merged next. When two minimum-frequency trees are merged, a new tree is created whose frequency is the sum of the two merged frequencies, and the new tree is inserted back into Q.

2.5 Code writing

With a priority queue the algorithm runs in O(n log n); the straightforward implementation below finds the two minima by linear scan instead, which takes O(n²) overall.

#include <iostream>
using namespace std;

#define MaxNode 200
#define MaxBit 100

// Convention: a left branch is 0, a right branch is 1
typedef struct
{
    double weight; // weight (frequency)
    int parent;    // index of the parent node
    int lchild;    // index of the left child
    int rchild;    // index of the right child
    char value;    // the character represented by the node
} hufnode;

typedef struct
{
    int start;       // start pointer: where this character's code begins in the bit array
    int bit[MaxBit]; // stores the Huffman code
} hufbit;

hufnode hujd[MaxNode];
hufbit hb[MaxBit];

// Initialize nodes: parent, lchild, rchild = -1; weight = 0
void initNode()
{
    for (int i = 0; i < MaxNode; i++)
    {
        hujd[i].lchild = -1;
        hujd[i].parent = -1;
        hujd[i].rchild = -1;
        hujd[i].weight = 0;
    }
}

// Build the Huffman tree
void creathuf(int n)
{
    int i;
    cout << "Please enter the character and weight of each node" << endl;
    for (i = 0; i < n; i++)
    {
        cin >> hujd[i].value >> hujd[i].weight;
    }

    // m1 and m2 hold the two smallest unprocessed weights;
    // x1 and x2 hold the indices of the corresponding nodes.
    double m1, m2;
    int x1, x2;
    for (int i = 0; i < n - 1; i++)
    {
        m1 = m2 = 1e18; // sentinel larger than any possible weight
        x1 = x2 = 0;

        for (int j = 0; j < n + i; j++)
        {
            // Found a new minimum-weight node without a parent
            if (hujd[j].weight < m1 && hujd[j].parent == -1)
            {
                m2 = m1;
                x2 = x1;
                m1 = hujd[j].weight;
                x1 = j;
            }
            // Found a new second-smallest weight node without a parent
            else if (hujd[j].weight < m2 && hujd[j].parent == -1)
            {
                m2 = hujd[j].weight;
                x2 = j;
            }
        }

        // Create a new parent node whose left child is x1 and right child is x2
        hujd[n + i].weight = m1 + m2;
        hujd[n + i].lchild = x1;
        hujd[n + i].rchild = x2;
        hujd[x1].parent = n + i;
        hujd[x2].parent = n + i;
    }
}

// Encoding: walk from each leaf up to the root, recording branch bits
void tshuff(int n)
{
    // p holds the index of the current node's parent; c holds the current node's index
    int p, c;

    for (int i = 0; i < n; i++)
    {
        p = hujd[i].parent;  // parent index of leaf i
        c = i;               // current node index
        hb[i].start = n - 1; // fill the code from the rightmost position (a code has at most n-1 bits)
        while (p != -1)      // loop until the root is reached
        {
            // If the current node is the left child of its parent,
            // record a 0 and move the start position one place to the left
            if (hujd[p].lchild == c)
            {
                hb[i].bit[hb[i].start] = 0;
                hb[i].start--;
            }
            // If the current node is the right child of its parent,
            // record a 1 and move the start position one place to the left
            if (hujd[p].rchild == c)
            {
                hb[i].bit[hb[i].start] = 1;
                hb[i].start--;
            }
            // Move one level up: the parent becomes the current node.
            // The loop ends when p == -1, i.e. the root has been passed.
            c = p;
            p = hujd[p].parent;
        }
    }
}

void output(int n)
{
    for (int i = 0; i < n; i++)
    {
        cout << hujd[i].value << ":";
        for (int j = hb[i].start + 1; j < n; j++)
        {
            cout << hb[i].bit[j];
        }
        cout << endl;
    }
}

int main()
{
    int n;
    cout << "Please enter the number of characters:" << endl;
    cin >> n;
    initNode();
    creathuf(n);
    tshuff(n);
    cout << "The encoding is:" << endl;
    output(n);
    return 0;
}

2.6 Average code length

2.7 Examples

When calculating the average code length in the two questions below, I count depth starting from 1.
If depth is counted starting from 0 instead, the answer to question 1 is 1065 and the answer to question 2 is 119.
Some textbooks define depth starting from 0 and some from 1; check which convention applies in your specific case.

2.7.1 Question 1


2.7.2 Question 2