Finally, it’s December ❤️. I could barely sleep last night as I was so excited to get started.

This year it seems that the Chief Historian who’s always present when Santa leaves to see all the kids is missing and we need to help the elves to find him.

For this year’s first puzzle, we’re presented with two lists of location IDs that look like this:

3   4
4   3
2   5
1   3
3   9
3   3

but the lists aren’t the same.

Reading the data

Today’s puzzle is one where my selected style of reading the input results in inefficiencies in performance but with the puzzle’s sample size it’s not an issue. The problem is that my approach operates line-by-line always returning a single item for a single line.

So my mapping function (see Day 00 if you wanna see how my setup works) today looks like this:

def map_fn(line):
    left, right = line.split()
    return int(left), int(right)

I split the line by whitespace with str.split and convert them into numbers with int.

A more sensible approach with this particular input would be to start with two empty lists, process the input line by line and append numbers into their left/right lists. Something along the lines of:

lefts = []
rights = []
 
with open('../inputs/day_1.txt', 'r') as inputs:
  for line in inputs:
    left, right = line.split()
    lefts.append(int(left))
    rights.append(int(right))

But I want to stay consistent with my tooling as long as it doesn’t introduce performance issues and almost always Advent of Code puzzle inputs are small enough to allow that.

Part 1: Distances

For the first part of the puzzle, we need to match the smallest number on the left with the smallest on the right and calculate their absolute difference. Do that with every number on the list and the result is the sum of all the differences.

First we need to split the list of input pairs into two lists: lefts and rights since we used the first approach in reading the data in.

There are a couple of ways to do it. My first approach was to iterate over the list twice and take the relevant items out:

lefts = sorted([pair[0] for pair in inputs])
rights = sorted([pair[1] for pair in inputs])

There is a handy, albeit not the most readable trick that one can do in Python to achieve the same using zip and the splat (*) operator:

lefts, rights = zip(*inputs)

The zip + splat combo will likely make a reappearance in later puzzles as well since it is a handy way to transpose a matrix by switching columns and rows like we are doing here for our list of lists.

def calculate_distance(inputs):
    """Calculate distance between two lists by calculating
    the absolute difference between the smallest items, 2nd
    smallest items etc and summing them up."""
 
    lefts, rights = zip(*inputs)
    lefts = sorted(lefts)
    rights = sorted(rights)
 
    total_distance = 0
    for left, right in zip(lefts, rights):
        distance = abs(left - right)
        total_distance += distance
 
    return total_distance

Once we have our two lists sorted, we return them into one list with zip and now we can iterate over them one pair at a time. Since they are both sorted, we’re now operating on them as the puzzle required: first smallest from each, then the second smallest from each and so on.

We calculate the absolute difference with the built-in abs function and sum them all up.

⭐️ First star achieved!

Part 2: Similarities

In the second part, we need to calculate a similarity score for the two lists.

To calculate that, we need to check each number on the left list and see how many times it appears in the second one. We then multiply the number itself by the count and sum them all up.

Structurally this is very similar to the first part’s solution:

def calculate_similarity_scores(inputs):
    """Calculate similarity scores by multiplying each
    left item in a tuple with how many times it appears
    as the right item in the list of tuples
    """
 
    lefts, rights = zip(*inputs)
 
    total_similarity_score = 0
    for number in lefts:
        appears_in_right = rights.count(number)
        similarity_score = number * appears_in_right
        total_similarity_score += similarity_score
 
    return total_similarity_score

Here we don’t need to sort the lists and we only loop the left list. For each number on the left list, we use list.count which gives us the number of times the given item appears in the list.

We multiply the number by its appearances and sum them all up.

⭐️ Second star achieved!

The first day was a nice warm up exercise!

Alternative solution with Counter

After finishing and publishing my solution, many people mentioned how the second part is a prime use case for collections.Counter.

def calculate_similarity_scores(inputs):
    lefts, rights = zip(*inputs)
    counts = Counter(rights)
 
    total_similarity_score = 0
    for number in lefts:
        similarity_score = number * counts[number]
        total_similarity_score += similarity_score
 
    return total_similarity_score

Here, we create a Counter objects from the numbers on the right column. Counter is a dictionary where the keys are the values in an iterable and the values are how many times that value appears in the iterable.

The main difference to my first solution is that now we only iterate over the right column once when we create the Counter object. Earlier, we iterated over it every time when rights.count was called.

With this small input it doesn’t matter much but it’s good to know Counter exists. We’ll surely use it a couple of times this month.