Meet Rajesh Gor

Advent of Code Day 3 in Golang: Do Or Don’t Regex

Inspired by Shradha Agarwal’s post on Byte Size Go :here: I decided to write about my approach to this, it’s different, and would like to share it. That post was well written and the solution was compact and simple, I recommend reading that first as well.

That is a blogvent series, I would also love to take part in blogvent but can’t be sure I’ll be completing this.

Introduction

Well, it is day 3 of the advent of code 2024, and I have been doing it on live streams. I am behind two days but working through them one by one. So far, I have learned a lot of things in Go. Let’s dive in for the day 3.

Part 1

Part one to any AOC problem seems straightforward, but as soon as part two is revealed, the real implementation starts to break the sweat (if you weren’t optimistic or thoughtful)

For part 1 for this day, was to parse a string that contained mul(X,Y) an expression, where X could be any 3-digit number. So, there could be multiple such expressions within the string and the purpose was to multiply the X and Y in the individual expression and add them up.

AOC Day 3 Part 1


xmul(2,4)&mul[3,7]!^don't()_mul(5,5)+mul(32,64](mul(11,8)undo()?mul(8,5))

In this example above there are 4 such expressions, and if we add the multiplications of those, we get the answer as 161.

Approach

It looks like a Regex pattern, finding an expression-like pattern in a string. So, the approach would be to find such expressions with a regex pattern and parse the numbers to integers, and multiply them, simply.

You could go ahead and write the parser for iterating over each character in the string and parsing the tokens, then evaluating the expression. That is a valid approach, but I choose to do this because I don’t know how to write a parser honestly, I want to try that solution at the end as well.

But for the first part, a quick regular expression seems a good idea.

Constructing the Regular Expression

The first thing is to write the regular expression for the mul(X,Y) part which is the only challenging section in part one. The rest is just simple math.

So, we need to find mul, then a ( then any number which is 1 to 3 digits long, then , and again a number that is 1 to 3 digits long, and finally ends with a )

That translates to:

mul\((\d{1,3}),(\d{1,3})\) 

Let’s breakdown:

  • mul for capturing the literal word mul

  • \( this is for the first parenthesis in the expression mul() , we need to escape the bracket in a regular expression if we want to match it, so we use \ before it.

  • Then we have a match group (\d{1,3}) , this will be the X in mul(X,Y):

    • A match group is like a group of a match in a regex, basically, if you want to capture specific parts in the entire match, then we use () to group them individually, this is not necessary but helps in getting the right things without overhead

    • In this case, we are using a match group for capturing a number which can have 1 to 3 digits.

    • The other way to write this is ([0-9]{1,3}) , which also would do the same thing, (NOTE: there are some differences in [0-9] and \d, but that is very subtle and won’t affect this puzzle, if you are still curious, I prefer reading this StackOverflow question)

  • We then use , for the separator of X and Y operands in the mul(X,Y) expression

  • We then similarly do the match for Y in mul(X,Y) with the (\d{1,3}) match group

  • Finally, we end the regular expression with the ) to end the expression

Code it

This is quite straightforward, we grab the line as a string and use regexp.MustCompile function which gives us a Regexp object, that in turn has a few methods associated with it to find, match, replace, and other things that can be done with a regular expression on a string.

Once we have the mulRegex , we can use the FindAllStringSubmatch method associated with the Regexp struct in the regexp package. The function takes in a string to perform the regex on, and the maximum number of matches to return. We want all the results, so we pass them in -1 to get all the matches.

Now, this method returns a slice of a slice of strings, each slice is a match, and within each slice, there is a slice of string, with the matched string and the subsequent match groups in the string if any.

func FindMulExpression(line string) [][]string {
  mulRegex := regexp.MustCompile(`mul\((\d{1,3}),(\d{1,3})\)`)
  return mulRegex.FindAllStringSubmatch(line, len(line))
}

So, the above function will return something like this

[
    [mul(2,4) 2 4]
    [mul(5,5) 5 5]
    [mul(11,8) 11 8]
    [mul(8,5) 8 5]
]

This is a list of list of strings, these look like numbers but those are string types in Golang.

Now we have this, we can create the actual logic part of obtaining the result, which is to parse these expressions, multiply X and Y and add the results for each of the expressions.

func Multiply(matches [][]string) int {
	score := 0
	for _, match := range matches {
		x, err := strconv.Atoi(string(match[1]))
		if err != nil {
			panic(err)
		}
		y, err := strconv.Atoi(string(match[2]))
		if err != nil {
			panic(err)
		}
		score += x * y
	}
	return score
}

This is pretty straightforward, we iterate over each of the matches, that is one mul(X,Y) expression and parse the X and Y each into integers and multiply them, the result obtained is then added to the score. We do this for each mul(X,Y) expression that is found in the string(line) and return the final score.

Input

Now, the example gave us a single string, but I realized there were six lines in the input (individual input), so we needed to parse each line and add up the score.

Remember this as it will be critical in part 2, it took me some time and questioning my existence to realize we need to combine all the lines to get the result (not necessary in part 1 but for sure in part 2).

Part 2

This is where things go wrong usually. At least for me, it did.

I started with a very naive approach, with a forever loop and finding the index of do or don’t and stripping off those sections, and then looping until we have no do and don’ts left. That was working for the test case but failed on the actual input.

The approach I finally came up and was working by tweaking the same approach slightly.

Approach

What I came up with is to find the first match location of don’() and do() strings in the entire string, we take that and remove the parts after don’t() or before do() . That way we can trim down the string to only valid/enabled mul()expressions.

AOC Day 3 Part 2

So, the approach more clearly defined will be:

  • Find the location (index) of the don’t() and do() expressions

  • We need to keep track of whether the previous string was enabled or disabled so will keep a flag to append the enabled expressions (part of the string) to the final result

  • If none of them are found, return the string(line) as it is

  • If we found either of them then:

    • If we find don’t first (don’t() appears before do())

      • If the enabled flag is true → append the string before the don’t() expression

      • Then toggle the enabled as false and trim off the don’t() part
        (This way we have completed checking everything before the don’t expression, so we remove that part from the line string)

    • If we find do first (do() appears before don’t())

      • If the enabled flag is true → append the string before the do() expression

      • Then toggle the enabled as true and trim off the do() part
        (This way we have completed checking everything before the do expression, so we remove the part from the line string)

  • We do this until there is no line string left to be checked

Code

I used simple Strings.Index to get the first match index for the substring, In this case, I want the first matching index for don’t() and do() . Once we have the indices of both the matches, we can iterate over till we are not left with any do or don’ts in the string.

If we have either do or don’t we append to the string the part before don’t if enabled or part before do if enabled and toggle on and off the enabled flag accordingly. By the end of the loop, the result string will have only the enabled parts of the line/string.

func StripDoDont(line string) string {
    result := ""
    enabled := true
    dontOffset := len("don't()")
    doOffset := len("do()")

    for len(line) > 0 {
        dontIndex := strings.Index(line, "don't()")
        doIndex := strings.Index(line, "do()")

        if dontIndex == -1 && doIndex == -1 {
            if enabled {
                result += line
            }
            break
        }
        
        if dontIndex != -1 && (doIndex == -1 || dontIndex < doIndex) {
            if enabled {
                result += line[:dontIndex]
            }
            enabled = false
            line = line[dontIndex+dontOffset:]
        } else {
            if enabled {
                result += line[:doIndex]
            }
            enabled = true
            line = line[doIndex+doOffset:]
        }
    }

    return result
}

I take this function and pass it to the multiply function where I get the matching patterns for the mul expression and do the math.

The strings.Index method takes in a string and a substring to find within that string and returns the index of the first occurring instance of that substring. With that we can identify if the line string contains the do() or don’t() expressions, if they don’t we simply return the line and if there are instances of them, we loop and trim the string before and after the expressions depending on whether the flag is enabled or disabled.

Let’s take an example and walk through the logic:

abcxmul(1,3)don't()mul(9, 7)do()mul(1,2)don't()mul(8,7)

enabled = True
result = ""
line = "abcxmul(1,3)don't()mul(9, 7)do()mul(1,2)don't()mul(8,7)"
---
After Iteration 1:
    result -> abcxmul(1,3)
    line -> mul(9, 7)do()mul(1,2)don't()mul(8,7)
    enabled = False
---
After Iteration 2:
    result -> abcxmul(1,3)
    line -> mul(1,2)don't()mul(8,7)
    enabled = True
---
After Iteration 3:
    result -> abcxmul(1,3)mul(1,2)
    line -> mul(8,7)
    enabled -> False
---
After Iteration 4:
    No do and don't found
    result -> abcxmul(1,3)mul(1,2)
    break out of loop
---

Result -> abcxmul(1,3)mul(1,2)

We process the result with the same Multiply function that we used in the first part after passing it through the FindMulExpression function that will return all the mul expressions in the result line string.

Heads up with the input

The actual input of the puzzle is I think multiple lines, so we need to preserve this state of the line in all the remaining lines. OR, we could be smarter and create a single large string and process that. Both are valid and would give the same results. I just didn’t like to add an overhead of keeping track of all the states and line, so I just concatenated all the lines and processed that single string.

Conclusion

This was a simple problem in essence but if you are not aware of regex you could go down a rabbit hole of writing your own parser or wired string manipulation (just like I did).

That’s it form day 3, I will be doing more live stream solving over the weekend and may be the next week as well. Find the code for my AoC solutions here on GitHub.

Till then,

Happy Coding :)