Day 2 - Repeating digits in really big numbers
Created on Tuesday, December 2, 2025.
2025 Day 2 - Repeating digits in really big numbers
And after what feels like a terrible day 1 for me, we're into one of my least favourite types of problem, where we are matching digits.
We're going to be given a range, so 11-22, which can be expanded into the numbers 11,12,13,14,15,16,17,18,19,20,21,22, and then we look through that range and determine how many items in that range have repeating digits. But it's not just any repeating digits, the total ID must be made only of repeating digits, so the range 998-1012 which expands to 998,999,1000,1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1012, only 1010 is invalid because it's 10``10, something like 1001 is not a repeating digit.
We're going to have to convert forwards and backwards between numbers and strings a lot in this one.
So, unlike normally, I'm going to try to break this problem down into simple functions and their tests and probably mostly work backwards instead of forwards.
We'll start with has_repeated_sequence_twice, which is going to take an id in string form, and we're going to very simply split the string in half and compare whether the first half and the second half are the same.
First a simple test with a set of test cases
#[test]
fn test_has_repeated_sequence_twice() {
for (expected, input) in [
(true, "11"),
(false, "13"),
(false, "15"),
(true, "22"),
(false, "98"),
(true, "99"),
(false, "100"),
(true, "1188511885"),
] {
assert_eq!(has_repeated_sequence_twice(input), expected);
}
}
And then a simple implementation
pub fn has_repeated_sequence_twice(input: &str) -> bool {
let (a, b) = input.split_at(input.len() / 2);
return a == b;
}
Now we have that, lets test the ability to create a sequence from a range. We want to take a start and end value, which are inclusive and integers, and generate each string digit between them.
Again, a simple test first
#[test]
fn test_generate_id_sequence() {
for (expected, start, end) in [
(vec!["11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22"], 11,22),
(vec!["95", "96", "97", "98", "99", "100", "101", "102", "103", "104", "105"], 95,105),
] {
assert_eq!(generate_id_sequence(start, end), expected);
}
}
And now a simple implementation I hope
pub fn generate_id_sequence(start: u32, end: u32) -> Vec<String> {
let mut result = Vec::new();
for i in start..end+1 {
result.push(i.to_string())
}
result
}
Couple of thoughts up front, I've never quite got the hang of when to use str and when to use String in rust, and this is another case where I initially went for a Vec<&str>, but to_string produces String objects. The compiler warned me, and doing as_str on the String got me further warnings, so it's a Vector of Strings rather than a Vector of strings.
Secondly, I'm not overly happy with the mutable vector here. This feels like it should be a map and be a single one liner. I think we'll do that, and see if it's neater. The nice thing about having the test now is that I can fiddle to make the code nicer.
Final note, we have to do end+1 because for start..end in rust is non-inclusive, it starts at start and goes all the way up to end-1, but doesn't include end itself.
Map versions
pub fn generate_id_sequence(start: u32, end: u32) -> Vec<String> {
(start..end+1).map(|digit|digit.to_string()).collect()
}
That's much nicer for this function.
So now we can work on parsing our inputs, we want to take a form of 123-321, so we'll parse each line, split on the -, and then return a tuple of integers.
#[test]
fn test_parse_id_range() {
for (expected, id_range) in [
((11,22), "11-22"),
((95,105), "95-105"),
((38593856,38593862), "38593856-38593862"),
] {
assert_eq!(parse_id_range(id_range), expected);
}
}
pub fn parse_id_range(id_range: &str) -> (u32, u32) {
let mut ids = id_range.split("-").map(|id|id.parse::<u32>().expect("Expected a whole number"));
(ids.next().unwrap(),ids.next().unwrap())
}
Not entirely happy with the next().unwrap() calls, I feel like a match statement would be better here, but I can feel like we've got all the parts we need.
So now we're going to implement our parse_input, which will take a comma separated list of id ranges, and turn them into id_ranges.
Then our part 1 will connnect everything together, we'll generate a list of id-ranges, for each id range we'll generate the list of ids, and then we'll filter that list of ids based on whether they are a repeated sequence twice. That should all filter down to a list of the ID's that are repeated. We'll turn them all back into numbers (again) and then we'll sum them all up!
That sounds like a lot, so we're going to break it into two steps, which we can test individually. Firstly, given the sequence of id_ranges, can we just create a single flat list of invalid ids?
Secondly, can we sum up that list?
So lets find invalid ids
#[test]
fn test_find_invalid_ids() {
for (expected, input) in [
(vec![11,22,99,1010, 1188511885], vec![(11,22),(95,115),(998,1012),(1188511880,1188511890)]),
] {
assert_eq!(find_invalid_ids(input), expected);
}
}
pub fn find_invalid_ids(ids: Vec<(u32, u32)>) -> Vec<u32> {
ids.iter()
.map(|(start, end)| generate_id_sequence(*start, *end))
.map(|id_sequence| {
id_sequence
.iter()
.filter(|id| has_repeated_sequence_twice(&id))
.map(|id| id.parse::<u32>().expect("Expected integer ids"))
.collect::<Vec<u32>>()
})
.flatten()
.collect()
}
This is a bit complicated, and rust needs some type hints, but broadly speaking, we take each of the start,end pairs, we map that to create a sequence. Then for each sequence we go over the ids in teh sequence, filtering id's that match our repeated_sequence function, turn those back into integers, and then we collect it back into a new vector of integers. That gives us a sequence of sequences, some are empty, some have 1 item, some have multiple items, so we flatten that down into a single long sequence and turn that back into a vector!
That makes our calculation of part 1 really quite simple, we parse the string into sequences of pairs, call the above function, and then sum up the total.
pub fn calculate_part1(input: &str) -> usize {
let ids = parse_input(input);
find_invalid_ids(ids).into_iter().sum::<u32>() as usize
}
That all looks good, so we run it on the input and... bang, error! We got a ParseIntError { kind: PosOverflow }. I did vaguely wonder this when typing my tests, but I took shortcuts. What's going on is that some of the numbers are bigger than can fit in a 32bit integer. I've been using u32 throughout as a shorthand, but some of these ranges did seem a little big.
In the tests, I got bored of typing out some of the long numbers, so I did most of my tests with just the first couple of test values. Lets add in some of the higher numbers and see if I can make the tests break.
Didn't find one in the test range, the largest was 1188511880-1188511890 which fits in 32 bits, but in my actual input data, I found a very large range, 777695646-7777817695, which does not fit in u32. Luckily, rust supports much higher sized numbers, so we can switch to u64 throughout and see if that helps.
That works, so we're done, and onto part 2
Part 2 - Over and over again
We're back after a day of work for me, and time to tackle part 2.
We now have to detect not just a single repetition, but any number of repetitions, so 1212121212 has 12 five times, 1111111 has 1 7 times and so on.
Having seen some things online, I've realised that approaching this as a regex in part 1 would make this significantly easier, but I think we've already committed to my string search ability, and devising and checking subsets of a set isn't that hard to implement, or at least if you don't care too much about performance at least.
What we're going to do is break each string down into subsets, and then check for repeated matches of the subset. So taking 123123123, we'll start by taking the first half, 12312 and comparing that against the remainder 3123, which doesn't match. We now tray taking a smaller set, 1231 and comparing against first 4 of the remainder 2312, which isn't a match. We then take the smaller set 123 and comparing against the first 3 of the remainders 123 whcih matches, so we consume and try again, 123, and it's a match.
We can probably add some optimisations if we want, so if the length of the remainder isn't exactly a multiple of the search string, it can't be a perfect match. Equally, we can stop searching when we find a match, so we'll use a lazy iterator to generate the subsets. I suspect we could do something interesting with working out lowest common multiples of the length, so for a string length 9, only (3,3) and (1,9) are multiples. But calculating those might be more than it's worth.
First we'll want some interesting test cases, 123123123 is a good one, length 3, repeated 3 times. 1212123 should match repeatedly for size 2 leaving 3 left at the end. 1111111 is another good match, it matches only for 1 because 7 is prime, so there are no other multiples.
We're going to work inside out for the implementation, so my first thought is to write a function that can check whether a substring is a repeated match for a string.
pub fn is_only_repeats(substr: &str, id: &str) -> bool {
let mut remainder = id.to_string();
if substr == id { return false } // Shortcircuit for matching itself
while remainder.starts_with(substr) {
remainder = remainder.split_at(substr.len()).1.to_string()
}
return remainder == ""
}
I'm not a huge fan of this, I think that I'd much rather use itertools and something like chunks, so produce an iterator that is exactly substr.len() characters long, and then confirm that they all match a predicate that they can compare with substr, but actually dealing with the chunks iterator is a pain and I'm in bad signal so can't check the manual easily. So this'll do for now, but I'm sure there should be a method to produce an iterable of substrings which would solve my problem.
Next we want to generate each of the potential substrings of a given string, which we can then feed into is_only_repeats.
pub fn generate_substrings(id: &str) -> Vec<&str> {
let mut result = Vec::<&str>::new();
for x in 1..id.len()/2+1 {
result.push(&id[0..x])
}
result
}
Now we can combine these together to find any string with repeated substrings
generate_substrings(id).iter().any(|substr|is_only_repeats(substr, id))
Now our part two needs to reimplement the find_invalid_ids to use our new function, and then we can test it and run it.
And that works. A few bits I'd love to tidy up, and I think there's a lot of room for performance optimisation but runs in under 2 seconds on my laptop, so that's enough for a day.
Next
Day 3 - Finding the biggest numbers
Previous