Juha-Matti Santala
Community Builder. Dreamer. Adventurer.

Advent of Code - 2021

This is a solution to Day 10 of Advent of Code 2021.

Day 10 - Syntax Scoring

You ask the submarine to determine the best route out of the deep-sea cave, but it only replies:

Syntax error in navigation subsystem on line: all of them

All of them?! The damage is worse than you thought. You bring up a copy of the navigation subsystem (your puzzle input).

The navigation subsystem syntax is made of several lines containing chunks. There are one or more chunks on each line, and chunks contain zero or more other chunks. Adjacent chunks are not separated by any delimiter; if one chunk stops, the next chunk (if any) can immediately start. Every chunk must open and close with one of four legal pairs of matching characters:

  • If a chunk opens with (, it must close with ).
  • If a chunk opens with [, it must close with ].
  • If a chunk opens with {, it must close with }.
  • If a chunk opens with <, it must close with >.

So, () is a legal chunk that contains no other chunks, as is []. More complex but valid chunks include ([]), {()()()}, <([{}])>, [<>({}){}}[([])<>]], and even (((((((((()))))))))).

Some lines are incomplete, but others are corrupted. Find and discard the corrupted lines first.

A corrupted line is one where a chunk closes with the wrong character - that is, where the characters it opens and closes with do not form one of the four legal pairs listed above.

Examples of corrupted chunks include (], {()()()>, (((()))}, and <([]){()}[{}]). Such a chunk can appear anywhere within a line, and its presence causes the whole line to be considered corrupted.

For example, consider the following navigation subsystem:


[({(<(())[]>[[{[]{<()<>>
[(()[<>])]({[<{<<[]>>(
{([(<{}[<>[]}>{[]{[(<()>
(((({<>}<{<{<>}{[]{[]{}
[[<[([]))<([[{}[[()]]]
[{[{({}]{}}([{[{{{}}([]
{<[[]]>}<{[{[{[]{()[[[]
[<(<(<(<{}))><([]([]()
<{([([[(<>()){}]>(<<{{
<{([{{}}[<[[[<>{}]]]>[]]

Some of the lines aren't corrupted, just incomplete; you can ignore these lines for now. The remaining five lines are corrupted:

  • {([(<{}[<>[]}>{[]{[(<()> - Expected ], but found } instead.
  • [[<[([]))<([[{}[[()]]] - Expected ], but found ) instead.
  • [{[{({}]{}}([{[{{{}}([] - Expected ), but found ] instead.
  • [<(<(<(<{}))><([]( - Expected >, but found ) instead.
  • <{([([[(<>()){}]>(<<{{ - Expected ], but found > instead.

Stop at the first incorrect closing character on each corrupted line.

Did you know that syntax checkers actually have contests to see who can get the high score for syntax errors in a file? It's true! To calculate the syntax error score for a line, take the first illegal character on the line and look it up in the following table:

  • ): 3 points.
  • ]: 57 points.
  • }: 1197 points.
  • >: 25137 points.

In the above example, an illegal ) was found twice (2*3 = 6 points), an illegal ] was found once (57 points), an illegal } was found once (1197 points), and an illegal > was found once (25137 points). So, the total syntax error score for this file is 6+57+1197+25137 = 26397 points!

Read input

from utils import read_input

navigation = read_input(10)

Part 1

In this puzzle, we get to use deque! It's a data structure that's effective for last-in, first-out (LIFO) where we'll add and remove items to/from the end.

To find an unmatching pair, we keep adding all the opening brackets into a deque, then every time we run into a closing bracket, we check if the last added opener is a match and if it is, we remove it and continue. Once we find a closer that is not matched with a corresponding opener, we stop, note it down as an error and move on to the next line.

from collections import deque


def check_errors(navigation):
    pairs = {
        '(': ')',
        '[': ']',
        '{': '}',
        '<': '>'
    }

    openers = deque()
    errors = []
    for line in navigation:
        for char in line:
            if char in pairs.keys():
                openers.append(char)
            else:
                opener = openers.pop()
                if char != pairs[opener]:
                    errors.append(char)
                    break
    return errors

Scoring is quite straight-forward: we have a scoreboard for each mistaken closer and we sum over those points.

def score_errors(errors):
    scoreboard = {
        ')': 3,
        ']': 57,
        '}': 1197,
        '>': 25137
    }

    score = 0
    for error in errors:
        score += scoreboard[error]
    return score

Find the first illegal character in each corrupted line of the navigation subsystem. What is the total syntax error score for those errors?

found_errors = check_errors(navigation)
error_score = score_errors(found_errors)
print(f'Solution: {error_score}')
assert error_score == 345441

Solution: 345441

Part 2

Now, discard the corrupted lines. The remaining lines are incomplete.

Incomplete lines don't have any incorrect characters - instead, they're missing some closing characters at the end of the line. To repair the navigation subsystem, you just need to figure out the sequence of closing characters that complete all open chunks in the line.

You can only use closing characters (), ], }, or >), and you must add them in the correct order so that only legal pairs are formed and all chunks end up closed.

In the example above, there are five incomplete lines:

  • [({(<(())[]>[[{[]{<()<>> - Complete by adding }}]])})].
  • [(()[<>])]({[<{<<[]>>( - Complete by adding )}>]}).
  • (((({<>}<{<{<>}{[]{[]{} - Complete by adding }}>}>)))).
  • {<[[]]>}<{[{[{[]{()[[[] - Complete by adding ]]}}]}]}>.
  • <{([{{}}[<[[[<>{}]]]>[]] - Complete by adding ])}>.

Did you know that autocomplete tools also have contests? It's true! The score is determined by considering the completion string character-by-character. Start with a total score of 0. Then, for each character, multiply the total score by 5 and then increase the total score by the point value given for the character in the following table:

  • ): 1 point.
  • ]: 2 points.
  • }: 3 points.
  • >: 4 points.

So, the last completion string above - ])}> - would be scored as follows:

  • Start with a total score of 0.
  • Multiply the total score by 5 to get 0, then add the value of ] (2) to get a new total score of 2.
  • Multiply the total score by 5 to get 10, then add the value of ) (1) to get a new total score of 11.
  • Multiply the total score by 5 to get 55, then add the value of } (3) to get a new total score of 58.
  • Multiply the total score by 5 to get 290, then add the value of > (4) to get a new total score of 294.

The five lines' completion strings have total scores as follows:

  • }}]])})] - 288957 total points.
  • )}>]}) - 5566 total points.
  • }}>}>)))) - 1480781 total points.
  • ]]}}]}]}> - 995444 total points.
  • ])}> - 294 total points.

Autocomplete tools are an odd bunch: the winner is found by sorting all of the scores and then taking the middle score. (There will always be an odd number of scores to consider.) In this example, the middle score is 288957 because there are the same number of scores smaller and larger than it.

After reading this second part, I was tempted to go and refactor the original function but decided for the sake of preserving my thought process to keep it and rewrite the function to only return the other side.

If this was production code, I would have done both error finding and incomplete lines filtering in the same function so avoid double-looping. This is the challenge with Advent of Code puzzles sometimes: they are single-run and sometimes the first part's solution can become a bit confusing if mixed with the second part's code. I do it occasionally but today decided not to.

(See Appendix A below for this solution)

This first function is essentially the same as in part 1 but instead of capturing the corrupted lines with mismatching brackets, we capture and collect the ones that did not error out.

This was probably the first time I've ever used Python's for..else structure which is a bit of an odd ball. for/else works in a way that it is a regular for loop to start with but the else clause is executed only if the for loop "ended naturally", meaning there was no break used.

I use it here to find if we did encounter a corrupt line (in which case we break out).

from collections import deque


def analyze_navigation(navigation):
    pairs = {
        '(': ')',
        '[': ']',
        '{': '}',
        '<': '>'
    }

    openers = deque()
    incomplete_lines = []

    for line in navigation:
        for char in line:
            if char in pairs.keys():
                openers.append(char)
            else:
                opener = openers.pop()
                if char != pairs[opener]:
                    break
        else:
            incomplete_lines.append(line)
    return incomplete_lines

To find a solution to an incomplete line, I keep track of unmatched openers.

If we encounter a new opener, we add it to the list.
If we encounter a potential closer, we remove the last matching opener from the list.

At the end, we have a list of unmatched openers so we find corresponding closers and then reverse that to get the final sequence.

To remove the last matching opener, I reverse the list twice which is quite an expensive operation and there's probably a smarter way to do it.

def find_solution(line):
    open_to_closed = {
        '(': ')',
        '[': ']',
        '{': '}',
        '<': '>'
    }
    closed_to_open = {
        ')': '(',
        ']': '[',
        '}': '{',
        '>': '<'
    }

    unmatched = []
    for char in line:
        if char in closed_to_open.values():
            unmatched.append(char)
        else:
            unmatched.reverse()
            unmatched.remove(closed_to_open[char])
            unmatched.reverse()

    return ''.join([open_to_closed[opener] for opener in unmatched][::-1])


def find_solutions(incomplete_lines):
    return [
        find_solution(line) for line in incomplete_lines
    ]

Nothing special happening in scoring either. For each character, we multiple the score by 5 and add the corresponding score value for the closing bracket.

We then sort the scores and get the middle value.

def score_solutions(solutions):
    scoreboard = {
        ')': 1,
        ']': 2,
        '}': 3,
        '>': 4
    }

    scores = []
    for solution in solutions:
        score = 0
        for char in solution:
            score *= 5
            score += scoreboard[char]
        scores.append(score)

    scores = sorted(scores)
    return scores[len(scores) // 2]

Find the completion string for each incomplete line, score the completion strings, and sort the scores. What is the middle score?

incomplete_lines = analyze_navigation(navigation)
solutions = find_solutions(incomplete_lines)
score = score_solutions(solutions)

print(f'Solution: {score}')
assert score == 3235371166 # For refactoring

Solution: 3235371166

Appendix A - Combined Solution

I mentioned in part 2 that I kinda wanted to go back to part 1 and put both of them in the same function. So I did it here.

from collections import deque


def full_analysis(navigation):
    pairs = {
        '(': ')',
        '[': ']',
        '{': '}',
        '<': '>'
    }

    openers = deque()
    incomplete_lines = []
    errors = []

    for line in navigation:
        for char in line:
            if char in pairs.keys():
                openers.append(char)
            else:
                opener = openers.pop()
                if char != pairs[opener]:
                    errors.append(char)
                    break
        else:
            incomplete_lines.append(line)
    return errors, incomplete_lines


p1, p2 = full_analysis(navigation)
p1_score = score_errors(p1)
print(f'Solution for part 1: {p1_score}')

p2_score = score_solutions(find_solutions(p2))
print(f'Solution for part 2: {score}')

Solution for part 1: 345441

Solution for part 2: 3235371166