[Python] collections module

The collections module in Python provides specialized container data types that complement the built-in dict, list, set, and tuple.

import collections

[x for x in dir(collections) if not x.startswith('_')]
# result:
['ChainMap', 'Counter', 'OrderedDict', 'UserDict', 'UserList', 'UserString',
'abc', 'defaultdict', 'deque', 'namedtuple']

1. collections.ChainMap() [Search multiple dictionaries]

Search multiple dictionaries in the order in which they appear.

m = collections.ChainMap(dict1, dict2, dict3, …): groups multiple dictionaries into a single, updatable view. It is useful for managing nested contexts and can be treated as a stack of mappings.

m.maps: the list of underlying mappings (dictionaries). A lookup searches the mappings in list order and returns the value from the first mapping that contains the key. The list can be reordered, which changes the lookup result.

import collections

a = {"a":"A","b":"B"}
b = {"c":"C","b":"D"}

m = collections.ChainMap(a,b)
m # Result: ChainMap({'a': 'A', 'b': 'B'}, {'c': 'C', 'b': 'D'})
# Mapping list
m.maps # Result: [{'a': 'A', 'b': 'B'}, {'c': 'C', 'b': 'D'}]

# Get the value of the key and return the value in the first map found in order
m["a"] # Result: 'A'
m["b"] # Result: 'B'
m["c"] # Result: 'C'

list(m.keys()) # Result: ['c', 'b', 'a']
list(m.values()) # Result: ['C', 'B', 'A']
list(m.items()) # Result: [('c', 'C'), ('b', 'B'), ('a', 'A')]


# Change the order (will also change the search results)
m.maps = list(reversed(m.maps))
m # Result: ChainMap({'c': 'C', 'b': 'D'}, {'a': 'A', 'b': 'B'})
m["b"] # Result: 'D'

# Modify the value (writes always go to the first mapping in m.maps, even if the key also exists in a later one)
m["b"] = "w"
m # Result: ChainMap({'c': 'C', 'b': 'w'}, {'a': 'A', 'b': 'B'})

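The new_child() method returns a new ChainMap with a new mapping at the front of the maps list, followed by all the mappings of the current instance. The new mapping can be passed as an argument to new_child(); if none is given, an empty dictionary is used. Because writes go to the new first mapping, the existing underlying dictionaries are not modified.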

import collections

a = {"a":"A","b":"B"}
b = {"c":"C","b":"D"}

m = collections.ChainMap(a,b)
m # Result: ChainMap({'a': 'A', 'b': 'B'}, {'c': 'C', 'b': 'D'})

mn = m.new_child()
mn["b"] = "w"
mn # Result: ChainMap({'b': 'w'}, {'a': 'A', 'b': 'B'}, {'c': 'C', 'b': 'D'})

# or
c = {"b":"w"}
mq = m.new_child(c)
mq # Result: ChainMap({'b': 'w'}, {'a': 'A', 'b': 'B'}, {'c': 'C', 'b': 'D'})
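
A common use of this lookup order is layering settings, with overrides placed in front of defaults. The following is a minimal sketch; the names defaults, user_settings, and session and their keys are hypothetical and chosen only for illustration. It also uses the parents property, which returns a ChainMap of all mappings except the first.

```python
import collections

# Hypothetical settings, for illustration only
defaults = {"color": "blue", "user": "guest"}
user_settings = {"user": "alice"}

# Earlier mappings take priority during lookup
settings = collections.ChainMap(user_settings, defaults)
print(settings["user"])   # alice (found in user_settings)
print(settings["color"])  # blue (falls back to defaults)

# A child layer receives all writes; the layers below stay untouched
session = settings.new_child()
session["color"] = "red"
print(session["color"])   # red
print(defaults["color"])  # blue

# parents skips the first mapping and exposes the remaining layers
print(session.parents["color"])  # blue
```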

2. collections.Counter() [Counter]

A dictionary subclass for counting hashable objects. The elements are stored as keys and the number of times each one appears is stored as its value.

Reference: collections module Counter
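
As a brief illustration of the counting behaviour described above, the sketch below counts the characters of a sample string (the string itself is arbitrary).

```python
import collections

# Count the characters of a string
c = collections.Counter("abracadabra")
print(c["a"])            # 5
print(c["z"])            # 0 (a missing key counts as zero, no KeyError)
print(c.most_common(2))  # [('a', 5), ('b', 2)]

# update() adds to the existing counts instead of replacing them
c.update("aaa")
print(c["a"])            # 8
```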

3. collections.OrderedDict() [Dictionary remembers the order of addition]

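A dictionary subclass that remembers the order in which keys were added. Since Python 3.7 the built-in dict also preserves insertion order, but the two differ in equality tests: ordinary dictionaries compare equal regardless of order, while two OrderedDict instances are equal only if their items are in the same order.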

# Ordinary dictionary
c ={}
c[1]="a"
c[2]="b"
c[3]="c"
                                           
d = {}
d[3]="c"
d[2]="b"
d[1]="a"
c == d # Result: True
c # Result: {1: 'a', 2: 'b', 3: 'c'}
d # Result: {3: 'c', 2: 'b', 1: 'a'}


# OrderedDict
import collections

a = collections.OrderedDict()
a[1]="a"
a[2]="b"
a[3]="c"

b = collections.OrderedDict()
b[3]="c"
b[2]="b"
b[1]="a"
a==b # Result: False
a # Result: OrderedDict([(1, 'a'), (2, 'b'), (3, 'c')])
b # Result: OrderedDict([(3, 'c'), (2, 'b'), (1, 'a')])

Use the move_to_end() method to move an existing key to the end (last=True, the default) or to the beginning (last=False).

import collections

a = collections.OrderedDict()
a[1]="a"
a[2]="b"
a[3]="c"
a # Result: OrderedDict([(1, 'a'), (2, 'b'), (3, 'c')])

# Move key 2 to the end position
a.move_to_end(2)
a # Result: OrderedDict([(1, 'a'), (3, 'c'), (2, 'b')])
# Move key 3 to the starting position
a.move_to_end(3,last=False)
a # Result: OrderedDict([(3, 'c'), (1, 'a'), (2, 'b')])
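
One place where move_to_end() is handy is a small least-recently-used cache. The sketch below is only an illustration: the LRUCache class and its get/put methods are hypothetical names, and it relies on popitem(last=False), which removes and returns the first (oldest) item of an OrderedDict.

```python
import collections

class LRUCache:
    """Minimal least-recently-used cache (illustrative sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = collections.OrderedDict()

    def get(self, key, default=None):
        if key not in self.data:
            return default
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)  # newest entry goes to the end
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the oldest entry

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" becomes the most recently used key
cache.put("c", 3)   # evicts "b", the least recently used key
print(list(cache.data))  # ['a', 'c']
```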

4. collections.defaultdict() [Missing keys call the factory function to return the default value]

A dictionary subclass that calls a factory function to supply missing values.

The factory function is passed as the first argument to defaultdict. When a missing key is accessed with d[key], the factory is called with no arguments, its return value is stored under that key, and the same value is returned.

import collections

def default_factory():
    return "default value"

a = collections.defaultdict(default_factory)
a["a"] = 9
a # Result: defaultdict(<function default_factory at 0x00000278C2482520>, {'a': 9})
a["a"] # Result: 9
a["b"] # Result: 'default value'
a # Result: defaultdict(<function default_factory at 0x00000278C2482520>, {'a': 9, 'b': 'default value'})

import collections

m = collections.defaultdict(int)
m["a"] = 9
m # Result: defaultdict(<class 'int'>, {'a': 9})
m["a"] # Result: 9
m["b"] # Result: 0
m # Result: defaultdict(<class 'int'>, {'a': 9, 'b': 0})

n = collections.defaultdict(list)
n["a"].append(9)
n # Result: defaultdict(<class 'list'>, {'a': [9]})
n["a"] # Result: [9]
n["b"] # Result: []
n # Result: defaultdict(<class 'list'>, {'a': [9], 'b': []})
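
A typical use of defaultdict(list) is grouping values by key. The sketch below uses made-up sample data (pairs) and also shows that get() and the in operator do not call the factory.

```python
import collections

# Hypothetical sample data: (category, item) pairs
pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "pear")]

groups = collections.defaultdict(list)
for category, item in pairs:
    groups[category].append(item)  # a missing key starts as an empty list

print(dict(groups))  # {'fruit': ['apple', 'pear'], 'veg': ['carrot']}

# get() and the "in" operator do not call the factory or insert keys
print(groups.get("meat"))  # None
print("meat" in groups)    # False
```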

5. collections.deque() [double-ended queue]

A list-like container in which elements can be added and removed efficiently at both ends.

Reference: collections module deque
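
As a quick illustration of working at both ends, here is a short sketch with arbitrary sample values. It also shows rotate() and the optional maxlen argument, which bounds the length of the deque.

```python
import collections

d = collections.deque([1, 2, 3])
d.append(4)        # add on the right
d.appendleft(0)    # add on the left
print(d)           # deque([0, 1, 2, 3, 4])

print(d.pop())      # 4 (removed from the right)
print(d.popleft())  # 0 (removed from the left)

d.rotate(1)        # shift every element one step to the right
print(d)           # deque([3, 1, 2])

# With maxlen set, a full deque drops items from the opposite end
recent = collections.deque(maxlen=3)
for i in range(5):
    recent.append(i)
print(recent)      # deque([2, 3, 4], maxlen=3)
```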

6. collections.namedtuple() [named tuple]

Creates a tuple subclass with named fields: collections.namedtuple(typename, field_names), where typename is the name of the new class and field_names is a sequence of field names such as ['name', 'age'].

Ordinary tuples: Use index to access elements in the tuple. Named tuples: Access elements using named fields.

Instances of a class created by namedtuple have no per-instance dictionary, so they need no more memory than ordinary tuples. The trade-offs are that attribute access can be somewhat slower than on a regular class, and that fields cannot be added once the class has been created.

All operations on ordinary tuples still work on named tuples.

import collections

Mytuple = collections.namedtuple('Mytuple',['name','age'])

people1 = Mytuple('Jack',18)
people1.name # Result: 'Jack'
people1.age # Result: 18
people1 # Result: Mytuple(name='Jack', age=18)

people2 = Mytuple(name='Tom',age=33)
people2.name # Result: 'Tom'
people2.age # Result: 33
people2 # Result: Mytuple(name='Tom', age=33)


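# Roughly comparable to the following regular class
# (unlike the namedtuple, its instances are mutable and are not tuples)
class Mytuple: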
    def __init__(self,name,age):
        self.name = name
        self.age = age

people1 = Mytuple('Jack',18)
people1.name # Result: 'Jack'
people1.age # Result: 18

people2 = Mytuple('Tom',33)
people2.name # Result: 'Tom'
people2.age # Result: 33

Like ordinary tuples, named tuples cannot be modified. The _replace() method instead returns a new instance with the specified fields replaced, leaving the original unchanged.

import collections

Mytuple = collections.namedtuple('Mytuple',['name','age'])

people1 = Mytuple('Jack',18)
people1.name # Result: 'Jack'
people1.age # Result: 18
people1 # Result: Mytuple(name='Jack', age=18)

people2 = people1._replace(age=20)
people2.name # Result: 'Jack'
people2.age # Result: 20
people2 # Result: Mytuple(name='Jack', age=20)

Use the _asdict() method to convert a namedtuple instance into a dictionary that maps field names to values. Since Python 3.8 the result is a regular dict (earlier versions returned an OrderedDict). The keys are in the same order as the namedtuple fields.

import collections

Mytuple = collections.namedtuple('Mytuple',['name','age'])

people1 = Mytuple('Jack',18)
people1 # Result: Mytuple(name='Jack', age=18)
people1._asdict() # Result: {'name': 'Jack', 'age': 18}
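
To round off this section, the sketch below shows ordinary tuple operations working on a named tuple, plus the _fields attribute and the _make() class method. The Point type is a hypothetical example introduced here only for illustration.

```python
import collections

Point = collections.namedtuple("Point", ["x", "y"])  # hypothetical example type
p = Point(3, 4)

# Ordinary tuple operations
print(p[0])         # 3
x, y = p            # unpacking
print(len(p))       # 2
print(p == (3, 4))  # True

# namedtuple helpers
print(Point._fields)        # ('x', 'y')
print(Point._make([5, 6]))  # Point(x=5, y=6)

# Fields are read-only: assignment raises AttributeError
try:
    p.x = 10
except AttributeError:
    print("cannot assign to a namedtuple field")
```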

7. collections.abc [abstract base classes]

A submodule (not a callable) that provides abstract base classes for containers. They define the APIs of Python’s built-in container data structures and of the container data structures defined by the collections module.

import collections.abc

[x for x in dir(collections.abc) if not x.startswith('_')]
# result (the exact list varies by Python version):
['AsyncGenerator', 'AsyncIterable', 'AsyncIterator', 'Awaitable', 'ByteString',
'Callable', 'Collection', 'Container', 'Coroutine', 'Generator', 'Hashable',
'ItemsView', 'Iterable', 'Iterator', 'KeysView', 'Mapping', 'MappingView',
'MutableMapping', 'MutableSequence', 'MutableSet', 'Reversible', 'Sequence',
'Set', 'Sized', 'ValuesView']
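
These classes are typically used in two ways: for isinstance() checks against an interface, and as base classes that supply mixin methods. The sketch below illustrates both; the Squares class is a hypothetical example, not part of the module.

```python
import collections.abc

# isinstance checks against the abstract interfaces
print(isinstance([1, 2], collections.abc.Sequence))   # True
print(isinstance({"a": 1}, collections.abc.Mapping))  # True
print(isinstance({1, 2}, collections.abc.Sequence))   # False (sets are unordered)

# Subclassing Sequence: define __getitem__ and __len__, and mixin methods
# such as __contains__, __iter__, index() and count() are provided
class Squares(collections.abc.Sequence):  # hypothetical example class
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i

s = Squares(4)
print(list(s))     # [0, 1, 4, 9]
print(9 in s)      # True
print(s.index(4))  # 2
```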

8. collections.UserDict()

A wrapper around dictionary objects to facilitate dictionary subclassing.
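
A minimal sketch of why the wrapper helps: UserDict routes its constructor and update() through __setitem__, so overriding that one method is enough here, whereas a direct dict subclass would not call the override from its constructor or update(). The LowerDict class is a hypothetical example.

```python
import collections

class LowerDict(collections.UserDict):  # hypothetical example class
    """Dictionary that stores string keys in lower case."""

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

d = LowerDict({"Name": "Jack"})  # the constructor goes through __setitem__
d["AGE"] = 18
d.update({"City": "Paris"})      # so does update()
print(d)       # {'name': 'Jack', 'age': 18, 'city': 'Paris'}
print(d.data)  # the underlying regular dict, with the same contents
```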

9. collections.UserList()

A wrapper around list objects to facilitate list subclassing.
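
A similar sketch for UserList, assuming a hypothetical IntList class that validates items passed to append(). The wrapped list is available as the data attribute.

```python
import collections

class IntList(collections.UserList):  # hypothetical example class
    """List that accepts only integers through append()."""

    def append(self, item):
        if not isinstance(item, int):
            raise TypeError("only integers are allowed")
        super().append(item)

nums = IntList([1, 2])
nums.append(3)
print(nums)       # [1, 2, 3]
print(nums.data)  # [1, 2, 3] (the wrapped regular list)

# Slicing returns an instance of the subclass, not a plain list
print(type(nums[:2]).__name__)  # IntList
```

Only append() is checked in this sketch; other mutating methods such as extend() or insert() would need the same treatment in a real implementation.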

10. collections.UserString()

A wrapper around string objects to facilitate string subclassing.

Supplement: collections – Container datatypes – Python 3.12.0 documentation