Juha-Matti Santala
Community Builder. Dreamer. Adventurer.

Improve your code with namedtuples

Batteries included is a blog series about the Python Standard Library. Each day, I share insights, ideas and examples for different parts of the library. Blaugust is an annual blogging festival in August where the goal is to write a blog post every day of the month.

One of the small things in the Python standard library that I really like is collections.namedtuple. In his brilliant PyCon 2015 talk Beyond PEP 8 -- Best practices for beautiful intelligible code, Raymond Hettinger shows an example of a difficult-to-understand tuple.

p = (170, 0.1, 0.6)
if p[1] >= 0.5:
  print 'Whew, that is bright!'  # (Editor's note: it's 2015 so he's using Python 2)
if p[2] >= 0.5:
  print 'Wow, that is light'

He then walks the audience through improving it.

Once you’ve read this article, I recommend watching the entire talk: it’s a great 50-minute session with Raymond about how to write better Python code.

namedtuple to the rescue

namedtuple, as the name hints, is a tuple with extra naming. It lives in the collections module alongside some other helpful friends.

To continue with Raymond’s example, let’s explore what benefits namedtuple offers us over a regular tuple.

from collections import namedtuple

Color = namedtuple('Color', ['hue', 'saturation', 'luminosity'])

p = Color(hue=170, saturation=0.1, luminosity=0.6)

if p.saturation >= 0.5:
  print('Whew, that is bright!')
if p.luminosity >= 0.5:
  print('Wow, that is light')

To create a namedtuple, you import it from collections and give it two arguments: a name for that tuple and a list of attribute names.

Benefits of namedtuple

Benefit 1: Named values

When you create an instance of your named tuple, you can pass in all the values as keyword arguments. Reading the line that creates p (line 5 of the example) in isolation, long after it was written, a developer can immediately understand what is being constructed. They don’t need to go look at what Color is and in which order the attributes are given.

Benefit 2: Attribute access

Tuples are accessed by indices, starting from 0. A namedtuple’s values can additionally be accessed with dot notation and attribute names, which makes the code easier to read.
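To make the comparison concrete, here is a small sketch that reuses the Color tuple from above and reads the same value both ways. The variable names by_index and by_name are only there for illustration.

```python
from collections import namedtuple

Color = namedtuple('Color', ['hue', 'saturation', 'luminosity'])
p = Color(hue=170, saturation=0.1, luminosity=0.6)

# Index access still works, just like with a regular tuple
by_index = p[2]
# Attribute access reads the same value by name
by_name = p.luminosity

print(by_index == by_name)
# True
print(Color._fields)
# ('hue', 'saturation', 'luminosity')
```

The _fields attribute lists the names in positional order, which is handy if you ever need to map an index back to its name.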

Benefit 3: Better repr

regular = (170, 0.1, 0.6)
named = Color(hue=170, saturation=0.1, luminosity=0.6)

print(regular)
# (170, 0.1, 0.6)
print(named)
# Color(hue=170, saturation=0.1, luminosity=0.6)

When you print a namedtuple, you get much more information than with a regular tuple: the name of the tuple type, plus the name and value of each attribute.

Benefit 4: Backwards compatible

Namedtuples are backwards compatible with regular tuples.

If your application or script currently uses tuples and you want to start moving towards a better solution, you can do it piece by piece. Turn your tuple creation into a namedtuple creation and everything down the line will still work. Then you can work on turning your index accesses into attribute accesses.

Another benefit of this is that you can pass them to any other piece of code, yours or third party, that works with tuples.
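As a rough illustration of that compatibility, the sketch below passes a Color to a function written with plain tuples in mind. The describe function is a made-up stand-in for legacy or third-party code, not something from a real library.

```python
from collections import namedtuple

Color = namedtuple('Color', ['hue', 'saturation', 'luminosity'])
p = Color(hue=170, saturation=0.1, luminosity=0.6)

# Hypothetical legacy function that only knows about plain tuples
def describe(color):
  hue, _, luminosity = color  # regular tuple unpacking
  return f'hue={hue}, light={luminosity >= 0.5}'

is_tuple = isinstance(p, tuple)
equal_to_plain = p == (170, 0.1, 0.6)
description = describe(p)

print(is_tuple, equal_to_plain)
# True True
print(description)
# hue=170, light=True
```

Because a namedtuple is a subclass of tuple, unpacking, indexing and equality with plain tuples all keep working while you migrate.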

_make method

namedtuple has a few handy methods, but the one I want to highlight is the _make classmethod, as it has a great use case that combines well with yesterday’s csv post.

For this demo, I made a simplified csv:

Year,Main libraries,Books
1999,436,36940000
2000,436,37013000
2001,432,37205000

import csv
from collections import namedtuple

LibraryData = namedtuple('LibraryData', ['Year', 'Libraries', 'Books'])

with open('libraries.csv', 'r', newline='') as csv_infile:
  reader = csv.reader(csv_infile)
  next(reader)  # Skip headers
  for library_data in map(LibraryData._make, reader):
    print(library_data)

## Prints
# LibraryData(Year='1999', Libraries='436', Books='36940000')
# LibraryData(Year='2000', Libraries='436', Books='37013000')
# LibraryData(Year='2001', Libraries='432', Books='37205000')

With map and _make, you can turn each CSV row into a namedtuple as you read the file.
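Since csv.reader yields every value as a string, you may want numbers instead. Here is a minimal sketch of one way to do that, assuming every column holds an integer as in the demo data. It uses io.StringIO as an in-memory stand-in for libraries.csv so it runs without a file; the csv_text and rows names are just for this example.

```python
import csv
import io
from collections import namedtuple

LibraryData = namedtuple('LibraryData', ['Year', 'Libraries', 'Books'])

# In-memory stand-in for libraries.csv (first two data rows only)
csv_text = 'Year,Main libraries,Books\n1999,436,36940000\n2000,436,37013000\n'

reader = csv.reader(io.StringIO(csv_text))
next(reader)  # Skip headers

# Convert each string to int before handing the row to _make
rows = [LibraryData._make(map(int, row)) for row in reader]

print(rows[0])
# LibraryData(Year=1999, Libraries=436, Books=36940000)
print(rows[1].Books - rows[0].Books)
# 73000
```

_make accepts any iterable of the right length, so the map object can be passed in directly.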

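_asdict method

Namedtuples are a great companion for data that gets serialized into JSON to be sent over the wire. The _asdict method turns an instance into a dictionary keyed by field name, which json.dumps can serialize as a JSON object.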

from collections import namedtuple
import json

Pokemon = namedtuple('Pokemon', ['pokedex', 'name', 'region'])

pokemons = [
  Pokemon(1, 'Bulbasaur', 'Kanto'),
  Pokemon(133, 'Eevee', 'Kanto'),
  Pokemon(249, 'Lugia', 'Johto')
]

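# json.dumps serializes a namedtuple as a plain JSON array,
# so convert each one to a dict with _asdict() first
print(json.dumps([pokemon._asdict() for pokemon in pokemons]))
# prints
# [{"pokedex": 1, "name": "Bulbasaur", "region": "Kanto"}, ...]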
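The trip back from JSON can look similar. This is a small sketch, assuming the incoming JSON object has exactly the same keys as the namedtuple's fields; the payload and restored names are only for illustration.

```python
import json
from collections import namedtuple

Pokemon = namedtuple('Pokemon', ['pokedex', 'name', 'region'])

eevee = Pokemon(133, 'Eevee', 'Kanto')

# _asdict() gives a dict that json.dumps serializes as a JSON object
payload = json.dumps(eevee._asdict())
print(payload)
# {"pokedex": 133, "name": "Eevee", "region": "Kanto"}

# Unpack the decoded dict as keyword arguments to rebuild the tuple
restored = Pokemon(**json.loads(payload))
print(restored == eevee)
# True
```

If the JSON has extra or missing keys, the Pokemon(**...) call raises a TypeError, so real-world input probably needs some validation first.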

Pattern matching with namedtuples

With Python’s structural pattern matching (available since Python 3.10), namedtuples make the code much easier to read and understand compared to regular tuples because we don’t need to keep track of what each position means.

from collections import namedtuple

Pokemon = namedtuple('Pokemon', ['pokedex', 'name', 'region'])

pokemons = [
  Pokemon(1, 'Bulbasaur', 'Kanto'),
  Pokemon(133, 'Eevee', 'Kanto'),
  Pokemon(249, 'Lugia', 'Johto')
]

for pokemon in pokemons:
  match pokemon:
    case Pokemon(region="Kanto"):
      print('From Kanto!')
    case Pokemon(region="Johto"):
      print('From Johto!')
    case _:
      print('Unknown region')

I find this so much more readable than:

from collections import namedtuple

Pokemon = namedtuple('Pokemon', ['pokedex', 'name', 'region'])

pokemons = [
  Pokemon(1, 'Bulbasaur', 'Kanto'),
  Pokemon(133, 'Eevee', 'Kanto'),
  Pokemon(249, 'Lugia', 'Johto')
]

for pokemon in pokemons:
  match pokemon:
    case (_, _, 'Kanto'):
      print('From Kanto!')
    case (_, _, 'Johto'):
      print('From Johto!')
    case _:
      print('Unknown region')

If you need to capture other values to use inside your case blocks, you can do that too:

from collections import namedtuple

Pokemon = namedtuple('Pokemon', ['pokedex', 'name', 'region'])

pokemons = [
  Pokemon(1, 'Bulbasaur', 'Kanto'),
  Pokemon(133, 'Eevee', 'Kanto'),
  Pokemon(249, 'Lugia', 'Johto')
]

for pokemon in pokemons:
  match pokemon:
    case Pokemon(name=name, region="Kanto"):
      print(f'{name} is from Kanto!')
    case Pokemon(name=name, region="Johto"):
      print(f'{name} is from Johto!')
    case _:
      print('Unknown region')

I think the object matching pattern is an elegant and beautiful piece of Python. Since it works on partial matches of attributes, you can skip all the bits that don’t matter and only write the ones that you want to match against or capture.