Stop Memorizing DP Formulas: The Real Way to Learn Dynamic Programming

Let’s be brutally honest. Dynamic Programming has a terrifying reputation. For many developers, the phrase “Dynamic Programming” triggers PTSD flashbacks of staring at a blank whiteboard during an interview while the interviewer aggressively clears their throat.

Many people think DP involves staring at a 2D Excel grid until a mathematical equation magically reveals itself in a vision. Or worse, they try to straight-up memorize the formulas!

Sorry to be the bearer of bad news, but that is the worst way to learn!

How do I know? Because that is exactly how it was taught at my university, and I failed my algorithms exam so spectacularly. I spent weeks trying to memorize different state transition formulas, convinced my brain was just fundamentally incompatible with computer science.

Dynamic Programming is nothing more than smart recursion. It is literally just spicy recursion with a notebook. If you can write a brute-force recursive solution (which you absolutely can), you can write a DP solution. It is a mechanical, step-by-step translation, not a magic trick.

In this post, we are going to do LeetCode 188: Best Time to Buy and Sell Stock IV. We are not going to start by drawing a grid. We will start with a simple decision tree, make our code realize it has amnesia, and evolve our solution into a blazing-fast, space-optimized masterpiece :D!

The Problem Statement

You are given an integer array prices where prices[i] is the price of a given stock on the i-th day, and an integer k.

Find the maximum profit you can achieve. You may complete at most k transactions. Note: You may not engage in multiple transactions simultaneously (i.e., you must sell the stock before you buy again).

Example:

1
2
3
4
5
6
7
8
k = 2
prices = [3, 2, 6, 5, 0, 3]

Output: 7
Explanation: 
Buy on day 2 (price = 2) and sell on day 3 (price = 6), profit = 6 - 2 = 4.
Then buy on day 5 (price = 0) and sell on day 6 (price = 3), profit = 3 - 0 = 3.
Total profit = 4 + 3 = 7.

The Brute Force

Forget about performance for a second. Imagine you are a time traveler standing on day 0. What choices do you actually have?

Like Dr. Strange looking at 14 million possible futures, your choices depend entirely on your current state in the timeline. What defines your exact situation at any given moment?

Current Day (day): Which day is it?
Transactions Remaining (txLeft): How many more times can I buy?
Holding Status (holding): Do I currently own the stock? (0 = No, 1 = Yes).

Based on this state, decision tree is simple:

If we have no stock (Holding = 0):
1. Skip: Do nothing. Move to the next day.
2. Buy: Spend money (-prices[day]), move to the next day, mark that we are now holding stock. This burns 1 transaction token.
If we hold stock (Holding = 1):
1. Skip: Hold the stock and move to the next day.
2. Sell: Gain money (+prices[day]), move to the next day, and mark that we are no longer holding.

At every day, we make a binary choice. Notice how the state toggles between Holding and Not Holding.

Let’s turn this logic into pure, recursive C++

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
#include <vector>
#include <algorithm>

using namespace std;

class Solution {
public:
    int maxProfit(int k, vector<int>& prices) {
        // Start at day 0, with k transactions left, holding 0 stocks
        return solve(0, k, 0, prices);
    }

private:
    int solve(int day, int txLeft, int holding, vector<int>& prices) {
        // Base Case 1 - we ran out of time
        if (day == prices.size()) return 0;
        
        // Base Case 2 - we ran out of tokens, and we aren't holding anything to sell
        if (txLeft == 0 && holding == 0) return 0;

        // Choice 1 - Do nothing today. Just time-travel to tomorrow
        int skip = solve(day + 1, txLeft, holding, prices);

        // Choice 2 -Transact (Buy or Sell)
        int transact = 0;
        if (holding == 1) {
            // We have a stock! Let's sell it to get rich (+prices[day])
            transact = prices[day] + solve(day + 1, txLeft, 0, prices);
        }
        else {
            // We have no stock. Let's buy one and lose money (-prices[day])
            // Buying costs us 1 transaction token
            transact = -prices[day] + solve(day + 1, txLeft - 1, 1, prices);
        }

        // Return whatever timeline makes us the most money
        return max(skip, transact);
    }
};

This solution is 100% logically correct. It is also 100% going to give you a Time Limit Exceeded error.

It has a time complexity of roughly O(2^n). For an array of 50 days, this tree will branch into quadrillions of recursive calls. The heat death of the universe will happen before code finishes running.

Brute force recursion calculates every single possibility, resulting in a massive, exponentially growing call stack.

Memoization (The Top-Down Approach)

Why is the recursive solution so slow? Because your code has severe amnesia. It calculates the exact same futures over and over again.

Imagine reaching Day 5 with 1 transaction left and holding stock via two completely different timelines:

Buy Day 1 -> Sell Day 2 -> Buy Day 5.
Buy Day 3 -> Sell Day 4 -> Buy Day 5.

The recursion doesn’t know it has been in this exact state before. It recalculates the entire future from day 5 down to the end of the array twice.

Different histories can result in the exact same current state. We should not predict the future twice.

To fix this, we introduce the concept of a Cache (Memoization). When we calculate the answer for a specific state, we write it down in our notebook. The next time we arrive at that exact state, we just read the answer from the notebook instead of calculating it again.

The Golden Rule: Store the Future, Ignore the Past

This is the part that trips up almost everyone (including university-level me). What exactly are we storing in this notebook?

We are storing the future, not the past.

Think of it like being dropped into a Las Vegas casino. The function solve(day, txLeft, holding) calculates the maximum profit you can make from today until the market closes.

The stock market does not care how you got to day 5. It doesn’t care if you made a fortune on day 2, or if you lost your life savings on day 4. If you stand on day 5, with 1 transaction token, holding 0 stocks, your future profit potential is exactly the same regardless of your past.

Because the past doesn’t matter, we can safely cache the result. If a timeline brings us to (day 5, 1 token, 0 stock), we look in our notebook: “Ah, I calculated this earlier. From this exact situation, the most money I can make by the end of the game is $10.” Boom. Sub-tree pruned.

Why a 3D Cache?

To make our notebook work, every unique state needs its own specific slot to store its answer.

Look at our recursive function signature: solve(day, txLeft, holding). The array prices never changes, but the other three variables do. Therefore, to uniquely identify our exact situation, we need a 3-dimensional grid.

Think of it like finding a specific seat in a movie theater:

Block (day): Which day is it? (Values from 0 to N)
Row (txLeft): How many tokens do we have? (Values from 0 to K)
Seat (holding): Are we holding a stock? (Values 0 or 1)

Our memoization table is essentially a 3D grid where every block holds the maximum future profit for that specific state.

Here is what the C++ code looks like when we give it a memory:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
class Solution {
    // Our notebook (3D cache)
    vector<vector<vector<int>>> memo;

public:
    int maxProfit(int k, vector<int>& prices) {
        int n = prices.size();
        // Resize notebook: [days][transactions][holding]. Fill with -1 (uncalculated)
        memo.assign(n, vector<vector<int>>(k + 1, vector<int>(2, -1)));
        
        return solve(0, k, 0, prices);
    }

    int solve(int day, int txLeft, int holding, vector<int>& prices) {
        if (day == prices.size() || (txLeft == 0 && holding == 0)) return 0;

        // Hey, have we been here before? Check the notebook!
        if (memo[day][txLeft][holding] != -1) {
            // Return cached answer!
            return memo[day][txLeft][holding];
        }

        int skip = solve(day + 1, txLeft, holding, prices);

        int transact = 0;
        if (holding == 1) {
            transact = prices[day] + solve(day + 1, txLeft, 0, prices);
        }
        else {
            transact = -prices[day] + solve(day + 1, txLeft - 1, 1, prices);
        }

        // Before returning, write the answer in the notebook for next time
        return memo[day][txLeft][holding] = max(skip, transact);
    }
};

Complexity:

Time: O(N * K). We only calculate each state exactly once!
Space: O(N * K * 2) for the 3D cache, plus recursion stack memory.

This will pass LeetCode, but we can do better.

Tabulation (The Bottom-Up Approach)

Recursion is cool until your input is massive, call stack explodes and it takes app down.

Tabulation simply means filling that memo array using for loops instead of recursion.

In recursion, we started at day 0 and looked forward. In Tabulation, we time-travel to the end of the array and work our way backwards to day 0.

Why backwards? Because to know the best decision to make on day 10, we already need to know the optimal futures for day 11.

We iterate backwards from the last day down to day 0 to ensure future dependencies are already calculated.

Let’s translate the recursive logic directly into a loop:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
class Solution {
    using table = vector<vector<vector<int>>>;
public:
    int maxProfit(int k, vector<int>& prices) {
        int n = prices.size();
        if (n == 0) return 0;

        // Create the 3D DP table: dp[day][transactions][holding]
        // Initialize everything to 0 (handles base cases automatically)
        table dp(n + 1, vector<vector<int>>(k + 1, vector<int>(2, 0)));

        // Loop backwards from the last day to day 0
        for (int day = n - 1; day >= 0; day--) {
            for (int txLeft = 1; txLeft <= k; txLeft++) {
                for (int holding = 0; holding <= 1; holding++) {
                    
                    // Choice 1 - Skip (take the value from tomorrow, same state)
                    int skip = dp[day + 1][txLeft][holding];
                    
                    // Choice 2 - Transact
                    int transact = 0;
                    if (holding == 1) {
                        // Sell: gain price, holding becomes 0, txLeft stays the same
                        transact = prices[day] + dp[day + 1][txLeft][0];
                    }
                    else {
                        // Buy: lose price, holding becomes 1, consume 1 transaction
                        transact = -prices[day] + dp[day + 1][txLeft - 1][1];
                    }

                    // Store the best choice
                    dp[day][txLeft][holding] = max(skip, transact);
                }
            }
        }

        // The answer is the state we started: day 0, k transactions, holding 0
        return dp[0][k][0]; 
    }
};

Space Optimization

Look closely at the loop in the Tabulation code above. Notice anything interesting? To calculate values for day, we only look at values from day + 1. We don’t care about day + 2, or day + 5, or day + 100.

Why are we keeping the entire 3D calendar in memory when we only ever care about today and tomorrow?

We can completely remove the Day dimension from our array! All we need is a 2D array representing nextDay, and a 2D array representing currDay. As we step backward through time, we calculate currDay based on nextDay, and then overwrite nextDay with currDay.

Because we only ever look one step ahead, we don’t need the entire 3D cube. We can collapse the time dimension into two 2D grids swapping back and forth.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
class Solution {
public:
    int maxProfit(int k, vector<int>& prices) {
        int n = prices.size();
        if (n == 0 || k == 0) return 0;

        // We only need two 2D arrays: [holding (0 or 1)][transactions (0 to K)]
        vector<vector<int>> nextDay(2, vector<int>(k + 1, 0));
        vector<vector<int>> currDay(2, vector<int>(k + 1, 0));

        for (int day = n - 1; day >= 0; day--) {
            for (int holding = 0; holding <= 1; holding++) {
                for (int txLeft = 1; txLeft <= k; txLeft++) {
                    
                    int skip = nextDay[holding][txLeft];
                    int transact = 0;

                    if (holding == 1) {
                        // Sell!
                        transact = prices[day] + nextDay[0][txLeft];
                    }
                    else {
                        // Buy!
                        transact = -prices[day] + nextDay[1][txLeft - 1];
                    }

                    currDay[holding][txLeft] = max(skip, transact);
                }
            }
            // Move backwards in time for the next loop iteration
            nextDay = currDay;
        }

        return currDay[0][k];
    }
};

The Takeaway

See how we completely avoided staring at a blank whiteboard trying to pull math formulas out of thin air?

We started with basic human logic: What are my choices right now?

We wrote a slow recursive function, gave it a notebook to remember things (Memoization), converted it into a loop (Tabulation), and threw away the parts of the notebook we didn’t need anymore (Space Optimization).

That is how you master Dynamic Programming. No magic required.

The Problem Statement#

The Brute Force#

Memoization (The Top-Down Approach)#

The Golden Rule: Store the Future, Ignore the Past#

Why a 3D Cache?#

Tabulation (The Bottom-Up Approach)#

Space Optimization#

The Takeaway#