Subset generation algorithm: given a set, enumerate all possible subsets

Given a set, enumerate all possible subsets.

(For simplicity, there are no duplicate elements in the sets discussed in this article)

1. Method 1: Incremental construction method

The first idea is to select one element at a time and put it into the collection. The procedure is as follows:

void print_subset(int n, int *A, int cur) {<!-- -->

    for (int i = 0; i < cur; i + + ) printf("%d ", A[i]);

    printf("\\
");

    int s = cur ? A[cur - 1] + 1 : 0; //Determine the minimum possible value of the current element

    for (int i = s; i < n; i + + ) {<!-- -->
        A[cur] = i;
        print_subset(n, A, cur + 1); //Recursively construct a subset
    }
}

Since the number of elements in A is uncertain, each recursive call must output the current collection. In addition, the recursion boundary does not need to be determined explicitly – if no more elements can be added, the recursion will naturally stop.

This code uses a sequencing technique: it specifies that the numbers of all elements in set A are arranged from small to large, so that the set {1, 2} will not be output twice as {1, 2} and {2, 1}.

Tip:In the incremental method of enumerating subsets, you need to use sequencing techniques to avoid enumerating the same set twice.

There are 1024 nodes in this solution tree: one for each possible A, and

n

n

The set of n elements has exactly

2

n

2^n

2n subsets, 210 = 1024.

Code

//All subsets of {0~n-1}: incremental construction method
#include<cstdio>
using namespace std;

void print_subset(int n, int* A, int cur) {<!-- -->
  for(int i = 0; i < cur; i + + ) printf("%d ", A[i]); // Print the current collection
  printf("\\
");
  int s = cur ? A[cur-1] + 1 : 0; // Determine the minimum possible value of the current element
  for(int i = s; i < n; i + + ) {<!-- -->
    A[cur] = i;
    print_subset(n, A, cur + 1); // Recursively construct a subset
  }
}

int A[10];
int main() {<!-- -->
  int n;
  scanf("%d", & amp;n);
  print_subset(n, A, 0);
  return 0;
}

2. Method 2: Bit vector method

The second idea is to construct a bit vector B[i] instead of directly constructing the subset A itself, where B[i] = 1, if and only if i is in subset A. The recursive implementation is as follows:

void print_subset(int n, int *B, int cur) {<!-- -->
    if (cur == n) {<!-- -->
        for (int i = 0; i < cur; i + + ) {<!-- -->
            if (B[i]) printf("%d ", i); //Print the current collection
        }
        printf("\\
");
        return ;
    }

    B[cur] = 1; //Select the cur-th element
    print_subset(n, B, cur + 1);

    B[cur] = 0; //Do not select the cur-th element
    print_subset(n, B, cur + 1);
}

It must be a complete subset after all “whether all elements are selected” are determined, so it is output only when if (cur == n) is true.

The current solution tree has 2047 nodes, which is slightly more than the previous method, because all partial solutions (incomplete solutions) also correspond to nodes on the solution tree.

Tip: In the bit-vector method of enumerating subsets, the number of nodes in the solution tree is slightly larger, but it is still fast enough in most cases.

This is a tree

n

+

1

n+1

A binary tree with n + 1 levels (the range of cur is 0 ~ n). There is 1 node in level 0, 2 nodes in level 1, 4 nodes in level 2, and 8 nodes in level 3. ,…,No.

i

i

i layer has

2

i

2^i

2i nodes, the total number is

1

+

2

+

4

+

8

+

.

.

.

+

2

n

=

2

n

+

1

?

1

1 + 2 + 4 + 8 + … + 2^n = 2^{n + 1} – 1

1 + 2 + 4 + 8 + … + 2n=2n + 1?1, consistent with the experimental results.

The picture below shows the answer tree.

Code

//All subsets of {0~n-1}: bit vector method
#include<cstdio>
using namespace std;

void print_subset(int n, int* B, int cur) {<!-- -->
  if(cur == n) {<!-- -->
    for(int i = 0; i < cur; i + + )
      if(B[i]) printf("%d ", i); // Print the current collection
    printf("\\
");
    return;
  }
  B[cur] = 1; //Select the cur-th element
  print_subset(n, B, cur + 1);
  B[cur] = 0; // Do not select the cur-th element
  print_subset(n, B, cur + 1);
}

int B[10];
int main() {<!-- -->
  int n;
  scanf("%d", & amp;n);
  print_subset(5, B, 0);
  return 0;
}

3. Method 3: Binary method

You can also use binary to represent the subset S of {0, 1, 2,…, n – 1}: from right to left

i

i

The i bit (numbered starting from 0) represents the element

i

i

Whether i is in the set S. The figure below shows how the binary 0100011000110111 represents the set {0,1,2,4,5,9,10,14}.

Note: For convenience, the rightmost bit always corresponds to element 0, not element 1.

Tip: You can use binary representation of subsets, where from right to left

i

i

The i bit (numbered starting from 0) represents the element

i

i

Whether i is in the set (1 means “in”, 0 means “not”).

After the collection is represented, operations must be performed on the collection. Common set operations can be easily implemented using bitwise operators. The most common binary operators are AND (& amp;), OR (|), NOT (!), which are very similar to the corresponding logical operations, as shown in the following table.


“Exclusive OR (XOR)” operator “^”, its rule is “If A and B are not the same, in A ^ B = 1, otherwise 0″. The most important property of the XOR operation is “switching” – XORing twice is equivalent to no XOR, that is, A^B^B = A. In addition, AND, OR and XOR all satisfy the commutative law: A & amp;B = B & amp;A, A|B = B|A, A^B = B^A.

Unlike logical operators, bitwise operators operate bitwise – the “bitwise AND” of two 32-bit integers is equivalent to 32 operations between 0/1 pairs. The following table compares the values of bitwise AND, bitwise OR, and bitwise XOR between the binary numbers 10110 (decimal 22) and 01100 (decimal 12), as well as the meaning of the corresponding set operations.

It can be seen that A & amp;B, A|B and A^B correspond to the intersection, union and symmetric difference of sets respectively. In addition, the empty set is 0, and the full set {0, 1, 2, …,

n

?

1

n-1

The binary form of n?1} is

n

n

n 1’s, i.e. decimal

2

n

?

1

2^n-1

2n?1. For convenience, the complete set is often defined in the program as ALL_BITS = (1 << n) - 1, then the complement of A is ALL_BITS^A. Of course, it is also possible to directly use integer subtraction ALL_BITS - A, but the speed is slower than the bit operation "^".

Tip:When subsets are represented in binary, bitwise AND, OR, and XOR in bitwise operations correspond to the intersection, union, and symmetric difference of sets.

In this way, the following program can be used to output each element corresponding to the subset S:

void print_subset(int n, int s) {<!-- --> //Print the subset S of {0, 1, 2, ..., n-1}
    for (int i = 0; i < n; i + + ) {<!-- -->
        if (s & amp; (1 << i)) printf("%d ", i); //Using the C language rule of "all non-zero values are true"
    }
    printf("\\
");
}

And enumerating subsets is as easy as enumerating integers:

for (int i = 0; i < (1 << n); i + + ) {<!-- --> //Enumerate the codes 0, 1, 2, .. corresponding to each subset. ., 2^n - 1
    print_subset(n, i);
}

Code

//All subsets of {0~n-1}: binary method
#include<cstdio>
using namespace std;

void print_subset(int n, int s) {<!-- --> // Print the subset S of {0, 1, 2, ..., n-1}
  for(int i = 0; i < n; i + + )
    if(s & amp;(1<<i)) printf("%d ", i); // This takes advantage of the C language rule of "all non-zero values are true"
  printf("\\
");
}

int main() {<!-- -->
  int n;
  scanf("%d", & amp;n);
  for(int i = 0; i < (1<<n); i + + ) // Enumerate the codes corresponding to each subset 0, 1, 2, ..., 2^n-1
    print_subset(n, i);
  return 0;
}