Python prep for Advent of Code 2022

Nov 30th, 2022

by Juha-Matti Santala

This was published in categories: advent-of-code python

We are 1 day away from this year's Advent of Code, the annual software development puzzle solving Christmas calendar. Last year, I solved nearly all puzzles and did so with Python and Jupyter Notebooks in my toolbox. I've also spent some time since that working on previous years' puzzles with the same setup.

This year, you can follow my progress and writings in GitHub: https://github.com/Hamatti/adventofcode-2022

In this blog post, I want to share some of the approaches and especially standard library modules and functions that I've shared in the past as part of my solutions. Getting familiar with these not only helps you solve Advent of Code solutions but also expands your real-life Python toolkit.

Building your own utility library

My first tip is to build a utility library to help with the things you end up doing every day. Mine usually starts with one function and then I might add some throughout December if I need some more special cases:

import os


def read_input(day, transformer=str, example=False):
    """
    Given a day number (1-25), reads the corresponding input file into
    a list with each line as an item.

    Runs transformer function on each item.
    """
    try:
        if example:
            filename = f'day_{day}_example.txt'
        else:
            filename = f'day_{day}.txt'
        with open(os.path.join('..', 'inputs', filename)) as input_file:
            return [transformer(line.strip()) for line in input_file]
    except FileNotFoundError as e:
        print(e)

My first function is read_input and it's a helper that reads data from inputs/ folder. It takes three arguments: day is an integer to tell which day's input we want to read, transformer is a function (defaulting to str) that is run on each line of the input. example enables me to store the provided examples as input files and run the same code on both the example and the actual input.

Transformer function is an interesting and very helpful one. I want to keep my puzzle solution code as focused on the puzzle as possible and not parsing the input. So I can create a function that gets passed into the input reader. If the values are numbers, I can pass int. If the input is list of comma separated items, I can pass a function that splits on comma. And so on.

Since these actions are things you do every single day with every puzzle, writing a function for it is very helpful.

My utility library is these days only about reading inputs but there's a lot of opportunities to build your own in a way that helps you the best. A good example for inspiration is Vilho's aoc-lib for Java that manages boilerplate creation, input file downloading, caching, benchmarking and other features.

This fall I started experimenting with a Firefox extension that when clicked, would copy that day's puzzle input to my clipboard to save me from a few clicks whenever I'd need that data. I haven't published it yet for Firefox though as I'm

zip & zip_longest

For Advent of Code puzzles, zip is such a lifesaver. It combines (zips) multiple iterators into a single one in a way where the individual items from each iterable are paired/grouped together:

dogs = ['Frank', 'Lassie', 'Milo', 'Dug']
movies = ['Men in Black', 'Lassie', 'The Mask', 'Up']

for dog, movie in zip(dogs, movies):
  print(f'{dog} is from {movie}')

# Frank is from Men in Black
# Lassie is from Lassie
# Milo is from The Mask
# Dug is from Up

If the iterables you give to zip are not the same length, zip will just ignore the extras:

numbers = [1, 2, 3, 4]
letters = ['a', 'b', 'c']

for number, letter in zip(numbers, letters):
  print(number, letter)

# 1 a
# 2 b
# 3 c

In cases where you are expecting the original iterables to be of same length, you can add a safeguard with strict=True parameter to zip as that will raise a ValueError if the arguments are of different length. The output will be the same as above for equal length arguments. This requires Python 3.10.

If you need all of the items to be included, you can use itertools.zip_longest:

from itertools import zip_longest

numbers = [1, 2, 3, 4]
letters = ['a', 'b', 'c']

for number, letter in zip_longest(numbers, letters):
  print(number, letter)

# 1 a
# 2 b
# 3 c
# 4 None

You can define a fillvalue if you want the missing pairs to be filled a different default than None:

from itertools import zip_longest

items = ['apple', 'banana', 'pear', 'mango']
counts = [5, 12, 3]

for item, count in zip_longest(items, counts, fillvalue=0):
  print(item, count)

# apple 5
# banana 12
# pear 3
# mango 0

It's important to note that both zip and zip_longest return an iterator and if you try to print it as-is, you see something like <zip object at 0x100fbb440>. You can iterate over it (for example in for loops) or pass it to any function that accepts iterables but if you need to turn it into a list specifically, you need to wrap it inside list(your_zip) call.

collections.namedtuple

Collections module contains some really nice utilities for puzzles.

I can't emphasize enough how much I like namedtuple. Giving meaningful names to your data is so crucial and namedtuple offers a few nice additions on top of just giving your data types aliases.

Let's take a look an example

from collections import namedtuple

Instruction = namedtuple('Instruction', ['direction', 'distance'])

instructions = []
raw_data = ['R,15', 'L,20', 'U,17', 'L,42', 'D,3']
for datum in raw_data:
  dir, dist = datum.split(',')
  instructions.append(Instruction(dir, int(dist)))

print(instructions)
# [Instruction(direction='R', distance=15), Instruction(direction='L', distance=20), Instruction(direction='U', distance=17), Instruction(direction='L', distance=42), Instruction(direction='D', distance=3)]

Here, we have raw data in a form that we could expect for a Advent of Code puzzle: a series of instructions with a direction and distance. We could store this data in regular lists or tuples but with them, 1) we'd have to refer to individual items with numeric indices, 2) it's easy to make mistakes and 3) the output for debugging printing is less useful.

With namedtuples, we get a nice formatted output (Instruction(direction='R', distance=15) compared to regular tuple's (('R', 15)) and we can refer to individual attributes with instruction.direction instead of instruction[0].

I tend to use them a lot in my code for added clarity.

namedtuples are also "backwards" compatible with regular tuples: you can use them in-place anywhere that accepts tuples. It supports indexing with numbers and unpacks like regular tuples.

collections.Counter

Another handy tool from collections is Counter. It is a subclass of dict that takes an iterable as an argument and counts the individual values in that iterable and stores that count data as a dictionary.

from collections import Counter

good_or_bad = [
    'good', 'good', 'good',
    'bad', 'bad', 'good',
    'bad', 'good', 'bad'
]

counts = Counter(good_or_bad)
print(counts)
# Counter({'good': 5, 'bad': 4})

It has a few added methods that regular dictionaries don't have. One good example is most_common. Continuing from the previous example:

print(counts.most_common(1))
# [('good', 5)]

The method takes an argument n and returns a list of tuples for n most common items. It's useful when you have a lot of data and you just want to inspect for example the top 3.

collections.deque

Last year's Advent of Code was probably the first time I ever used deque. It's a last-in, first-out (LIFO) data structure "by default" (see below the first code example) with added features. You can append data to the deque and when you pop from it, it removes the latest added item from the deque and returns it.

In Day 10, 2021 I used it to check if nested, matching pairs of parentheses, brackets and braces were in correct order:

from collections import deque


def check_errors(navigation):
    pairs = {
        '(': ')',
        '[': ']',
        '{': '}',
        '<': '>'
    }

    openers = deque()
    errors = []
    for line in navigation:
        for char in line:
            if char in pairs.keys():
                openers.append(char)
            else:
                opener = openers.pop()
                if char != pairs[opener]:
                    errors.append(char)
                    break
    return errors

Python's deque (short for double-ended queue) can also append or pop to the left side of the queue, making it possible to use it as a LIFO, FIFO (first-in, first-out), FILO (first-in, last-out)

itertools.pairwise

A relatively recent addition, entering the Python standard library in 3.10 last year, pairwise is another great addition to puzzle solving.

It takes an iterable and creates pairs of consecutive items:

from itertools import pairwise

letters = ['A', 'B', 'C', 'D', 'F', 'G']

for first, second in pairwise(letters):
    print(first, second)

# A B
# B C
# C D
# D F
# F G

An example of this in the context of Advent of Code, last year I used it day 14, 2021. If you are not using Python 3.10 or newer, you can build your own pairwise functionality, for example by using the method I used in day 1, 2021 to create sliding windows.

itertools.product

Let's say you need a set of coordinates for x & y, for both being between 0 and 10. One way you could do that is with a cartesian product using itertools.product:

from itertools import product

xs = range(10)
ys = range(10)

coordinates = product(xs, ys)

print(list(coordinates))

## Prints

[
  (0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9),
  (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9),
  (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9),
  (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9),
  (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), (4, 7), (4, 8), (4, 9),
  (5, 0), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (5, 7), (5, 8), (5, 9),
  (6, 0), (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6), (6, 7), (6, 8), (6, 9),
  (7, 0), (7, 1), (7, 2), (7, 3), (7, 4), (7, 5), (7, 6), (7, 7), (7, 8), (7, 9),
  (8, 0), (8, 1), (8, 2), (8, 3), (8, 4), (8, 5), (8, 6), (8, 7), (8, 8), (8, 9),
  (9, 0), (9, 1), (9, 2), (9, 3), (9, 4), (9, 5), (9, 6), (9, 7), (9, 8), (9, 9)
]

Many Advent of Code puzzles are happening on some sort of coordinate system. My experience and techniques for solving those has changed and evolved a lot over time as I've realized easier and more performant ways to solve them.

For example, these days I often store data in a dictionary for only those coordinates that have a truthy value and dealing with the full grid of coordinates via looping only when needed (often in debugging prints) and direct math on coordinates on other cases (like determining effect on neighboring cells).