Leetcode: Optimizing the Performance Through Types

KOH MINGYANG | 2026-01-29

Introduction

Some time two weeks ago, I wrote an article about optimizing Leetcode's performance, mainly through the removal of an extra unordered_set, reserving space in the vector, and using a pragma directive.

Today, I will be looking at how to optimize the performance through a different method for today's Leetcode problem, Minimum Cost to Convert String I.

Methodology

But before we begin, let's discuss the methodology which will be used. First and foremost, I would like to acknowledge that the time you get when you run your code in Leetcode is not fully accurate. I do not intend to judge by the exact milliseconds.

Rather, in the previous article, I judged improvements by looking at significant improvements after submitting identical code a few times.

If I can consistently get an improvment of ~100ms for a ~800ms runtime, I think it will be fair to say that there is an improvement that is coming from the code change.

For the sake of a fair comparison, I will be running the code locally on my machine (M1 Macbook Air), with -o3 flag enabled. The test cases will be manually generated and reused.

Intuition

To solve today's problem, I will need to create an adjacency matrix, and initialize it with the costs of transforming the letters, something like this:

for (int i = 0; i < original.size(); i++) {
            graph[original[i] - 'a'][changed[i] - 'a'] = min(graph[original[i] - 'a'][changed[i] - 'a'], (long long)cost[i]);
        }

Subsequently, I will need use the Floyd Warshall algorithm to fill up the adjacency matrix. In this algorithm, I choose an intermediate node k, and calculate the distance from node i to node k for all nodes.

graph[i][j] = min(graph[i][j], graph[i][k] + graph[k][j]);

Initial code

My initial submission (after fixing bugs caused by integer overflow) looked like this:

class Solution {
public:
    long long minimumCost(string source, string target, vector<char>& original,
                          vector<char>& changed, vector<int>& cost) {
        // Build adj graph
        constexpr int NUM_ALPHABETS = 26;
        array<array<long long, NUM_ALPHABETS>, NUM_ALPHABETS> graph;
        for (int i = 0; i < NUM_ALPHABETS; i++) {
            for (int j = 0; j < NUM_ALPHABETS; j++) {
                if (i != j)
                    graph[i][j] = INT_MAX;
                else {
                    graph[i][j] = 0;
                }
            }
        }

        for (int i = 0; i < original.size(); i++) {
            graph[original[i] - 'a'][changed[i] - 'a'] = min(graph[original[i] - 'a'][changed[i] - 'a'], (long long)cost[i]);
        }

        // k: intermediate node
        for (int k = 0; k < NUM_ALPHABETS; k++) {
            for (int i = 0; i < NUM_ALPHABETS; i++) {
                for (int j = 0; j < NUM_ALPHABETS; j++) {
                    if (graph[i][k] != INT_MAX && graph[k][j] != INT_MAX) {
                        graph[i][j] = min(graph[i][j], graph[i][k] + graph[k][j]);
                    }
                }
            }
        }


        long long ans = 0;
        for (int i = 0; i < source.size(); i++) {
            if (graph[source[i] - 'a'][target[i] - 'a'] == INT_MAX)
                return -1;
            ans += graph[source[i] - 'a'][target[i] - 'a'];
        }
        return ans;
    }
};

Running it, I was surprised that I had immediately beat 98% of users in the first attempt. Wow!

screenshot-1

I knew that a majority of the performance came from using std::array instead of std::vector. In this problem, using an array is the most optimal as the number of alphabets will always be fixed (26). Lets try running the code locally.

Running it locally, I got the following results:

# Run no.
1. 0.412041 ms
2. 0.5715 ms
3. 0.511209 ms
Average: 0.49825 ms

Editorial solution

The editorial's solution (using Floyd Warshall) implemented it using vector instead, giving us the following runtimes:

# Run no.
1. 0.6305 ms
2. 0.507291 ms
3. 0.639041 ms
Average: 0.59227733 ms

Difference: about 18% slower. (On Leetcode, the editorial's implementation had a runtime of 60ms and beat 31%)

Investigating the cause

Honestly, I was not fully sure of the answer, but I knew it had to do with the differences in how memory is allocated. So with the help of Gemini, I am happy to share what I have learnt (in my own words)!

`std::array` is contigious

Compared to std::vector, std::array is very primitive. It is a single block of memory that lives in the stack, that is all. There are no methods to add, remove, or resize the array, unlike vectors. This makes it super efficient to access elements in the array, using the simple formula base + offset. For example, to access graph[i][j], we know that the element is at the base address of the array + i * 26 + j.

On the other hand, vectors are not contigious. The second row of the vector may not be directly after the first row. This means that to access the second row of our vector, there is a chance that the CPU will need to fetch it from a different part of the RAM (cache miss). On the other hand, in std:array, it is likely that the second row of the array is already in the L1 cache of the CPU.

Double indirection in `std::vector`

As established earlier, accessing array elements is extremely fast, as we can calculate the address the element is stored instantly. However, in the vector, we will need to follow the pointers in the heap to access the element (double indirection).

Allocation overhead

Since std::array is allocated on the stack, there is no allocation overhead - we simply move the stack pointer.

On the other hand, std::vector is allocated on the heap. During the initialization phase, we will need to call malloc to allocate space for each row of the vector.

So why std::array?

When I first started learning C++, one of the most confusing things was why everyone is using vectors instead of arrays like most other languages. Thanks to learncpp.com (which I had been religiously studying the past 3 weeks), I am able to confidently tell the difference between arrays and vectors and know when to use each (although I am abit hazy on the lower level details, hence I had to sort Gemini's help for explaining the previous section in greater clarity).

Putting it simply, vectors are used when the size is not known at compile time, while arrays are used when the sized is known at compile time. In technical terms, we say that we use arrays when the size is a constant expression.

And since there are 26 alphabets in the English language (unless they add more), I intuitively used std::array in my code.

Thank you!

If you have read until here, thank you for reading! I hope to be sharing more articles soon! (although they are unlikely to be Leetcode related).

< Back to blog