[Python Advanced Features] An In-Depth Look at NamedTuple (Named Tuples)

Introduction

Like tuples, NamedTuples are immutable data types: their content cannot be changed after creation.
As the name suggests, the difference from a plain tuple is that the fields are "named". Besides being read by index like any tuple, a NamedTuple's fields can be read with the . syntax, as with class attributes. Because it is immutable, fields cannot be assigned to.
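
A minimal sketch of both points, reading by name and the rejection of writes. The Point class and its values here are only illustrative:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p = Point(1, 2)

print(p.x)  # read a field by name -> 1

error = None
try:
    p.x = 10  # assignment is rejected because named tuples are immutable
except AttributeError as exc:
    error = exc
    print(type(exc).__name__)  # AttributeError

print(p)  # unchanged: Point(x=1, y=2)
```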

Basic syntax

Signature of the function that creates a NamedTuple class

collections.namedtuple(typename, field_names, *, rename=False, defaults=None, module=None)

Parameter Description:

  • typename: The name of the newly created class.
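  • field_names: The field names, given as a sequence of strings or as a single string separated by spaces and/or commas. Each must be a valid Python identifier, must not be a keyword, and cannot begin with an underscore.
  • rename: Whether to automatically replace invalid field names with positional names such as _0 and _1.
  • defaults: An iterable of default values, applied to the rightmost fields.
  • module: The value of __module__ for the new class.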
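
As a hedged illustration of the less obvious parameters, the sketch below exercises rename and module. The Row and Tagged classes and the module name 'my_pkg.models' are made up for the example:

```python
from collections import namedtuple

# 'class' is a keyword and the second 'id' is a duplicate. With rename=True
# both are replaced by positional names instead of raising ValueError.
Row = namedtuple('Row', ['id', 'class', 'id'], rename=True)
print(Row._fields)  # ('id', '_1', '_2')

# Without rename, the same field list is rejected.
rejected = False
try:
    namedtuple('Row', ['id', 'class', 'id'])
except ValueError as exc:
    rejected = True
    print('ValueError:', exc)

# module sets __module__ on the generated class (the name is hypothetical).
Tagged = namedtuple('Tagged', 'a b', module='my_pkg.models')
print(Tagged.__module__)  # my_pkg.models
```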

Usage tutorial

Create

Let’s first look at how to create named tuples. Take Point (which represents a point in two-dimensional coordinates) as an example:

# Import
from collections import namedtuple

# Create a normal tuple
point = (22, 33)
print(point)  # Output: (22, 33)

# Create a named tuple
Point = namedtuple('Point', 'x y')
point_A = Point(22, 33)
print(point_A)  # Output: Point(x=22, y=33)

The key lines are these two:

Point = namedtuple('Point', 'x y')
point_A = Point(22, 33)

It should be noted that namedtuple() is used to create classes, not object instances!

We first use namedtuple() to create a tuple subclass named Point with two fields, x and y, and then assign that class to the variable Point.
Point(22, 33) is then just ordinary instantiation syntax.

This is roughly similar to the following code:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

point_A = Point(22, 33)
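
The hand-written class is only an approximation: what namedtuple() returns is a genuine tuple subclass. A short check, as a sketch:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')

print(isinstance(Point, type))   # True: namedtuple() returned a class
print(issubclass(Point, tuple))  # True: that class derives from tuple
print(Point.__name__)            # Point (taken from the typename argument)

point_A = Point(22, 33)
print(isinstance(point_A, tuple))  # True
print(point_A == (22, 33))         # True: compares equal to a plain tuple
```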

Keyword arguments can also be used when creating named tuple objects

a = Point(1, 2)
b = Point(y=2, x=1)
a == b # >>> True

The field_names parameter is used to set the named tuple field names. There are three styles to choose from.
The following are all equivalent ways of writing:

Point = namedtuple('Point', 'x y')
Point = namedtuple('Point', 'x,y')
Point = namedtuple('Point', ['x', 'y'])


# All of the following are also valid
# Any whitespace characters are allowed between the names
Point = namedtuple('Point', 'x, \t\t\t\n\n y')
Point = namedtuple('Point', 'x \t\t\t\n\n y')
# Tuples can also be used
Point = namedtuple('Point', ('x', 'y'))
# In fact, any iterable of strings works
def fields():
    yield 'x'
    yield 'y'
Point = namedtuple('Point', fields())

Use

A named tuple is first of all a tuple, so anything you can do with a tuple also works with a named tuple.

print(point_A[0])
print(point_A[1])
print(*point_A)  # tuple unpacking

# output
"""
22
33
22 33
"""

Then there is the access style specific to named tuples:

print(point_A.x)
print(point_A.y)

# output
"""
22
33
"""
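
A few more tuple behaviours that carry over, shown as a sketch. The labels dictionary is just an illustration:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
point_A = Point(22, 33)

x, y = point_A             # unpacking into variables
print(len(point_A))        # 2
print(list(point_A))       # [22, 33]
print(point_A[::-1])       # (33, 22): slicing gives back a plain tuple

labels = {point_A: 'A'}    # hashable, so usable as a dictionary key
print(labels[Point(22, 33)])  # A

print(getattr(point_A, 'x'))  # 22: field access by a name held in a string
```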

Common methods

The class created by namedtuple also comes with some utility methods and attributes:

Point._make(iterable)     # Class method: create a named tuple from an iterable
point._asdict()           # Convert to a dictionary
point._replace(**kwargs)  # Return a new tuple with the specified fields replaced by the given values

point._fields             # Tuple of field names
point._field_defaults     # Dictionary mapping field names to default values
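
The helpers above can be tried out as follows. This is a sketch with illustrative values, and the single default is only there so that _field_defaults has something to show:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y', defaults=(0,))

p = Point._make([22, 33])     # build from any iterable
print(p)                      # Point(x=22, y=33)

print(p._asdict())            # a regular dict on Python 3.8+: {'x': 22, 'y': 33}

moved = p._replace(x=100)     # returns a new object
print(moved)                  # Point(x=100, y=33)
print(p)                      # the original is untouched: Point(x=22, y=33)

print(Point._fields)          # ('x', 'y')
print(Point._field_defaults)  # {'y': 0}
```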

Set default value

You can set default values for the fields of named tuples by passing in the defaults parameter when creating the class.

# Four-dimensional vector
# The default value is Vector4D(0, 0, 0, 0)
Vector4 = namedtuple('Vector4D', 'x y z w', defaults=(0, 0, 0, 0))

v1 = Vector4()
v2 = Vector4(1)
v3 = Vector4(1, 2, w=4)
print(v1)
print(v2)
print(v3)

# output
"""
Vector4D(x=0, y=0, z=0, w=0)
Vector4D(x=1, y=0, z=0, w=0)
Vector4D(x=1, y=2, z=0, w=4)
"""

The number of default values can be less than the number of fields, in which case the n defaults are applied to the rightmost n fields.

Foo = namedtuple('Foo', 'a b c d', defaults=(1, 2))
print(Foo(22, 33))
print(Foo())

# output
"""
Foo(a=22, b=33, c=1, d=2)
Traceback (most recent call last):
  File "D:\TempCodeFiles\named_tuple.py", line 6, in <module>
    print(Foo())
TypeError: Foo.__new__() missing 2 required positional arguments: 'a' and 'b'
"""
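
To see which fields actually received defaults, _field_defaults can be inspected. The sketch below also shows, as an extra check, that passing more defaults than fields is rejected. Bar is a made-up name:

```python
from collections import namedtuple

Foo = namedtuple('Foo', 'a b c d', defaults=(1, 2))

# Only the two rightmost fields received defaults.
print(Foo._field_defaults)   # {'c': 1, 'd': 2}
print(Foo(22, 33, d=99))     # Foo(a=22, b=33, c=1, d=99)

# More defaults than fields is an error.
too_many = False
try:
    namedtuple('Bar', 'a b', defaults=(1, 2, 3))
except TypeError as exc:
    too_many = True
    print('TypeError:', exc)
```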

Better representation

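The namedtuple() call style is neither intuitive nor elegant. The typing module offers a class-based way of writing it: typing.NamedTuple arrived in Python 3.5, and the annotated class syntax shown here needs Python 3.6 or later (3.6.1 or later for default values):

# >= Python 3.6.1
from typing import NamedTuple
class PointA(NamedTuple):
    x: int = 0
    y: int = 0

# >= Python 3.7 (the defaults parameter was added in 3.7)
from collections import namedtuple
PointB = namedtuple('PointB', 'x y', defaults=(0, 0))

print(PointA(2, 3) == PointB(2, 3))  # Output: True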
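
One thing worth knowing about the class-based form is that the annotations are metadata for type checkers and are not enforced at runtime. A small sketch:

```python
from typing import NamedTuple, get_type_hints

class PointA(NamedTuple):
    """A point in two-dimensional coordinates."""
    x: int = 0
    y: int = 0

print(PointA._fields)          # ('x', 'y')
print(PointA._field_defaults)  # {'x': 0, 'y': 0}
print(get_type_hints(PointA))  # the declared field types

# No runtime type check happens: strings are accepted for int fields.
p = PointA('a', 'b')
print(p)  # PointA(x='a', y='b')
```

A static type checker would be the tool that flags the last call; the interpreter itself does not.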

Inherit and extend NamedTuple

namedtuple() returns a normal class, and like any class it can be subclassed.

Create a Point named tuple and add a method to find the distance between two points.

import math
from collections import namedtuple
from typing import NamedTuple

# >= Python 3.6.1
class Point(NamedTuple):
    x: int = 0
    y: int = 0

    def distance(self, p) -> float:
        return math.sqrt((self.x - p.x) ** 2 + (self.y - p.y) ** 2)

# >= Python 3.7 (equivalent definition; it rebinds the name Point)
class Point(namedtuple('Point', 'x y', defaults=(0, 0))):
    def distance(self, p) -> float:
        return math.sqrt((self.x - p.x) ** 2 + (self.y - p.y) ** 2)

a = Point()
b = Point(3, 2)
print(a, b)
print(a.distance(b))
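
When subclassing the result of namedtuple() directly, the subclass gains a per-instance __dict__ unless it declares empty __slots__. The following sketch shows that optional refinement. It uses math.hypot in place of the explicit square root, and the point (3, 4) is chosen only to give a round result:

```python
import math
from collections import namedtuple

class Point(namedtuple('Point', 'x y', defaults=(0, 0))):
    __slots__ = ()  # keep instances free of a per-instance __dict__

    def distance(self, p) -> float:
        # Euclidean distance between self and p
        return math.hypot(self.x - p.x, self.y - p.y)

a = Point()
b = Point(3, 4)
print(a.distance(b))           # 5.0
print(hasattr(b, '__dict__'))  # False
```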

Application

Read csv file

Take reading a csv file that stores English words as an example.

import csv
from collections import namedtuple

# Define the named tuple
# Fields follow the csv column names
Word = namedtuple('Word', 'word, type, chs_def, eng_ch, context, example')

file_path = r'C:\Users\ZhouXiaokang\Desktop\Words Vol 1 Ch 1 Ep 2.csv'
# newline='' is what the csv module documentation recommends for files passed to csv.reader
with open(file_path, 'r', encoding='utf-8', newline='') as f:
    reader = csv.reader(f)
    next(reader)  # Skip the header row
    for word in map(Word._make, reader):
        print(f'{word.word} {word.type}. {word.chs_def} | Example: {word.context}')

Output:

chirp n&v. (birds, insects) chirp, chirp | Example: (*chirp* *chirp* *chirp*)
screech v. (vehicle, car tires) make a screeching sound | Example: (*screech*)
Shiroko term. 白子 | Example:
mug v. To commit murder and robbery | Example: You didn't get mugged, did you?
faint v. faint; faint | Example: What's that? You fainted from hunger?
...
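
For readers without the file, here is a self-contained variant of the same loop. It is only a sketch: the rows are made-up sample data held in an io.StringIO buffer, not the contents of the author's csv. It also shows that _make raises TypeError when a row has the wrong number of columns:

```python
import csv
import io
from collections import namedtuple

Word = namedtuple('Word', 'word, type, chs_def, eng_ch, context, example')

# Made-up sample content standing in for the real csv file.
sample = io.StringIO(
    "word,type,chs_def,eng_ch,context,example\n"
    "chirp,n,bird call,chirp,(chirp chirp),\n"
    "faint,v,pass out,faint,You fainted from hunger?,\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row
words = [Word._make(row) for row in reader]

for word in words:
    print(f'{word.word} {word.type}. {word.chs_def} | Example: {word.context}')

# A row with too few columns cannot be turned into a Word.
short_row_rejected = False
try:
    Word._make(['mug', 'v'])
except TypeError:
    short_row_rejected = True
```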

Representing data as a substitute for dictionaries

Advantages over dictionaries:
1. Fast and small: instances have no per-instance __dict__
2. .field is clearer than ['field']
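
A rough way to see the size difference on your own interpreter is sketched below. sys.getsizeof reports only the shallow size of each container, and the exact numbers depend on the Python version and platform, so none are quoted here:

```python
import sys
from collections import namedtuple

Point = namedtuple('Point', 'x y')

as_tuple = Point(22, 33)
as_dict = {'x': 22, 'y': 33}

# Shallow container sizes in bytes (values vary by interpreter build).
tuple_size = sys.getsizeof(as_tuple)
dict_size = sys.getsizeof(as_dict)
print(tuple_size, dict_size)

# The two access styles side by side.
print(as_tuple.x, as_dict['x'])  # 22 22
```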

The following source code is excerpted from the baidupcs_py library:

class PcsFile(NamedTuple):
    """
    A Baidu PCS file

    path: str # remote absolute path
    is_dir: Optional[bool] = None
    is_file: Optional[bool] = None
    fs_id: Optional[int] = None # file id
    size: Optional[int] = None
    md5: Optional[str] = None
    block_list: Optional[List[str]] = None # block md5 list
    category: Optional[int] = None
    user_id: Optional[int] = None
    ctime: Optional[int] = None # server created time
    mtime: Optional[int] = None # server modified time
    local_ctime: Optional[int] = None # local created time
    local_mtime: Optional[int] = None # local modified time
    server_ctime: Optional[int] = None # server created time
    server_mtime: Optional[int] = None # server modified time
    shared: Optional[bool] = None # this file is shared if True
    """

    path: str # remote absolute path
    is_dir: Optional[bool] = None
    is_file: Optional[bool] = None
    fs_id: Optional[int] = None # file id
    size: Optional[int] = None
    md5: Optional[str] = None
    block_list: Optional[List[str]] = None # block md5 list
    category: Optional[int] = None
    user_id: Optional[int] = None
    ctime: Optional[int] = None # server created time
    mtime: Optional[int] = None # server modified time
    local_ctime: Optional[int] = None # local created time
    local_mtime: Optional[int] = None # local modified time
    server_ctime: Optional[int] = None # server created time
    server_mtime: Optional[int] = None # server modified time
    shared: Optional[bool] = None # this file is shared if True

    rapid_upload_info: Optional[PcsRapidUploadInfo] = None
    dl_link: Optional[str] = None

    @staticmethod
    def from_(info) -> "PcsFile":
        return PcsFile(
            path=info.get("path"),
            is_dir=info.get("isdir") == 1,
            is_file=info.get("isdir") == 0,
            fs_id=info.get("fs_id"),
            size=info.get("size"),
            md5=info.get("md5"),
            block_list=info.get("block_list"),
            category=info.get("category"),
            user_id=info.get("user_id"),
            ctime=info.get("ctime"),
            mtime=info.get("mtime"),
            local_ctime=info.get("local_ctime"),
            local_mtime=info.get("local_mtime"),
            server_ctime=info.get("server_ctime"),
            server_mtime=info.get("server_mtime"),
            shared=info.get("shared"),
        )

Source code

See the CPython source on GitHub (Lib/collections/__init__.py).

The key part is here:

    # Build-up the class namespace dictionary
    # and use type() to build the result class
    # Collect the class's methods, fields, etc.
    class_namespace = {
        '__doc__': f'{typename}({arg_list})',
        '__slots__': (),
        '_fields': field_names,
        '_field_defaults': field_defaults,
        '__new__': __new__,
        '_make': _make,
        '__replace__': _replace,
        '_replace': _replace,
        '__repr__': __repr__,
        '_asdict': _asdict,
        '__getnewargs__': __getnewargs__,
        '__match_args__': field_names,
    }
    for index, name in enumerate(field_names):
        doc = _sys.intern(f'Alias for field number {index}')
        class_namespace[name] = _tuplegetter(index, doc)

    # Create the new class
    result = type(typename, (tuple,), class_namespace)

When type() is called with one argument, it returns the class of the object; when it is called with three arguments (name, bases, namespace), it creates a class dynamically, which is the dynamic equivalent of writing a class statement.

class Foo:
    def hello(self):
        print('Hello')
# Equivalent to
def hello(self):
    print('Hello')
Foo = type('Foo', (object,), {'hello': hello})
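
Putting the two ideas together, the sketch below builds a stripped-down named tuple with three-argument type(). It is not the standard library's implementation: tiny_namedtuple is a hypothetical helper, it skips name validation, defaults and keyword arguments, and it uses property(itemgetter(index)) where the real code uses _tuplegetter:

```python
from operator import itemgetter

def tiny_namedtuple(typename, field_names):
    """Hypothetical, heavily simplified stand-in for namedtuple()."""
    field_names = tuple(field_names)

    def __new__(cls, *args):
        if len(args) != len(field_names):
            raise TypeError(
                f'Expected {len(field_names)} arguments, got {len(args)}')
        return tuple.__new__(cls, args)

    def __repr__(self):
        body = ', '.join(f'{n}={v!r}' for n, v in zip(field_names, self))
        return f'{typename}({body})'

    # Collect everything the class needs in a namespace dictionary.
    namespace = {
        '__slots__': (),
        '_fields': field_names,
        '__new__': __new__,
        '__repr__': __repr__,
    }
    # One read-only property per field, each returning self[index].
    for index, name in enumerate(field_names):
        namespace[name] = property(itemgetter(index))

    # Create the class dynamically with tuple as its base.
    return type(typename, (tuple,), namespace)

Point = tiny_namedtuple('Point', ['x', 'y'])
p = Point(22, 33)
print(p)          # Point(x=22, y=33)
print(p.x, p[1])  # 22 33
```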

References

  1. https://docs.python.org/zh-cn/3/library/collections.html#collections.namedtuple
  2. https://realpython.com/python-namedtuple/