How the AI Thinks: Minimax Explained Simply

The algorithm behind perfect-play tic-tac-toe, explained without jargon. · 12 min read

If you've played the "Impossible" mode of our classic tic-tac-toe, you've noticed something annoying: the AI never loses. You can draw, sometimes. You can never win. We tested it ourselves — across 500 simulated games against random play, the AI lost exactly zero of them.

How does it do this? Not with magic, and not with any tic-tac-toe-specific tricks. It uses a general-purpose algorithm called minimax. The same algorithm, with refinements, is what runs inside chess engines, Go-playing programs (alongside other techniques), checkers solvers, and many other game-playing AIs. It's one of the most important algorithms in computer science.

And it's surprisingly simple. By the end of this article, you'll understand how it works well enough to implement it yourself.

The core idea: think ahead, assume the worst

Imagine you're playing tic-tac-toe and it's your move. You're considering placing your X in some particular square. To know whether this is a good move, you ask yourself:

"If I play here, what's the worst outcome possible?"

The worst outcome is the one that assumes your opponent plays as well as they possibly can in response. If even when they play their best, you still win — great move. If their best response leads to your loss — bad move.

That's it. That's minimax. The algorithm just makes this thought process rigorous and systematic.

The name "minimax" comes from this asymmetry: you want to maximize your score, and your opponent wants to minimize it (because their interests are the opposite of yours). So you alternate: max-min-max-min, down through the tree of possible future moves, until you reach an outcome you can score.

Step 1: Score the leaf positions

The algorithm needs to know, at the bottom of the search, who's winning. For tic-tac-toe, there are only three possible final outcomes:

The AI wins: score = +10
The opponent wins: score = -10
Draw: score = 0

The numbers themselves don't matter — what matters is the ordering: AI win > draw > AI loss. You could just as easily use 1, 0, and -1. We pick +10 / -10 because it gives us room to subtract the depth of the search later, which lets the AI prefer faster wins and slower losses. We'll get to that.

Step 2: Build a tree of possible futures

Suppose it's the AI's turn and the board looks like this:

 X | O | X
-----------
   | O |
-----------
   |   |

The AI is O, and it needs to move. There are five empty squares — five possible moves. For each of them, the AI imagines playing there, then imagines what the opponent (X) would do in response, then what the AI would do in response to that, and so on, until each game-tree branch reaches a final outcome.

This is called a game tree. The root is the current position. Each branch is a move. Each leaf is a finished game with a score.

Here's a tiny piece of such a tree:

Current position (O to move) ├─ Move A (O plays middle-left) │ ├─ X responds bottom-right → O wins (+10) │ └─ X responds bottom-left → draw (0) ├─ Move B (O plays middle-right) │ ├─ X responds bottom-left → X wins (-10) │ └─ X responds bottom-right → X wins (-10) └─ Move C (O plays bottom-center) ├─ X responds middle-left → draw (0) └─ X responds bottom-left → O wins (+10)

The leaves are scored. Now we need to figure out what to do at each level above.

Step 3: Propagate scores up the tree

Working backwards from the leaves, we ask at each level: "whose turn is it, and what would they pick?"

The opponent (X) will always pick the move that's worst for the AI — that is, the minimum score. So at any X node, the value is the minimum of its children.

The AI (O) will always pick the move that's best for itself — that is, the maximum score. So at any O node, the value is the maximum of its children.

Applying this to the tree above:

Current position (O picks MAX = +10) ← best move is Move A or C ├─ Move A (X picks MIN = 0) ← X plays bottom-left for the draw │ ├─ X responds bottom-right → +10 │ └─ X responds bottom-left → 0 ← X picks this ├─ Move B (X picks MIN = -10) ← either move loses for AI │ ├─ X responds bottom-left → -10 │ └─ X responds bottom-right → -10 └─ Move C (X picks MIN = 0) ← X plays middle-left for the draw ├─ X responds middle-left → 0 ← X picks this └─ X responds bottom-left → +10 Wait — re-checking: the root value should be max(0, -10, 0) = 0. The AI guarantees at least a draw with Move A or Move C.

So the AI plays either Move A or Move C. Both guarantee a draw against optimal opposition, which in this position is the best achievable outcome.

Notice what happened: the AI didn't "hope" the opponent would make a mistake. It assumed the opponent would play perfectly, and chose the move that gave the best result even against perfect play. That's the essence of minimax.

Step 4: Recursion does the work

You can describe minimax in plain English in two sentences:

If the game is over, return the score (+10 for AI win, -10 for AI loss, 0 for draw).
Otherwise, look at every legal move. Apply minimax recursively to each resulting position. Then return either the maximum or the minimum of those values, depending on whose turn it was.

That's the whole algorithm. Here it is in JavaScript, roughly the way our site implements it:

function minimax(board, currentPlayer, aiPlayer, depth) {
  const opponent = aiPlayer === 'X' ? 'O' : 'X';

  // Base cases: game is over
  if (checkWin(board, aiPlayer))   return { score: 10 - depth };
  if (checkWin(board, opponent))   return { score: depth - 10 };
  if (isDraw(board))               return { score: 0 };

  const moves = availableSquares(board);
  let best;

  if (currentPlayer === aiPlayer) {
    // AI's turn — maximize
    best = { score: -Infinity };
    for (const move of moves) {
      board[move] = currentPlayer;
      const result = minimax(board, opponent, aiPlayer, depth + 1);
      board[move] = null;            // undo the move
      if (result.score > best.score) {
        best = { score: result.score, move };
      }
    }
  } else {
    // Opponent's turn — minimize
    best = { score: Infinity };
    for (const move of moves) {
      board[move] = currentPlayer;
      const result = minimax(board, aiPlayer, aiPlayer, depth + 1);
      board[move] = null;
      if (result.score < best.score) {
        best = { score: result.score, move };
      }
    }
  }
  return best;
}

That's it. About thirty lines of code, and you have an undefeated tic-tac-toe player. The function calls itself recursively — once for each possible move at each level — until it bottoms out at terminal positions, then propagates the scores back up.

Why subtract the depth?

You might have noticed: in the code above, when the AI wins, we return 10 - depth instead of just 10. When the AI loses, we return depth - 10 instead of just -10. Why?

Without this adjustment, the AI treats any winning sequence as equally good — whether it wins in 1 move or 5 moves. That can lead to weird behavior. Imagine the AI has a forced win, but it picks a slow path that gives the opponent extra chances to mess up. Or, when losing, it picks the fastest possible loss instead of dragging out the game in case the opponent makes a mistake.

Subtracting depth nudges the AI toward faster wins and slower losses. A win at depth 2 is now worth +8, while a win at depth 4 is only +6. So among multiple winning lines, the AI picks the quickest one. Among multiple losing lines, it picks the one that takes longest (which is the line where the opponent has the most opportunities to slip up).

It's a small tweak with a big effect on how the AI feels to play against.

Alpha-beta pruning: making it fast

Pure minimax works fine for tic-tac-toe, but for bigger games it gets slow. Even at the start of a 3×3 game, the algorithm needs to explore about 250,000 nodes (well, 549,946 if you don't account for finished games — and far fewer with symmetry, but still a lot). For chess, the number is astronomical. So there's a trick called alpha-beta pruning that cuts down the work without changing the result.

The idea: while exploring moves, keep track of the best score the AI is guaranteed to achieve (call it alpha) and the worst score the opponent is forced to accept (call it beta). As you search, you sometimes encounter branches where even in the best case they're worse than something the AI has already found. There's no need to explore those branches — the AI won't pick them anyway. So you skip them. That's the prune.

For tic-tac-toe, alpha-beta speeds up the AI by roughly 5–10×. For chess, it can speed things up by factors of thousands. It's mathematically guaranteed not to change the result — the AI plays exactly the same moves, just faster.

If you look at the source code of our games — for example, the classic-game.js file — you'll see the actual implementation includes alpha-beta. The core logic is the same as above, with two extra parameters to track the bounds.

Why this doesn't work for chess (or Go, or anything big)

Minimax has one fundamental limitation: it requires you to explore the entire game tree, all the way to the end. For tic-tac-toe with 9 squares, that's manageable. For chess, where each player has roughly 35 legal moves per turn and games can last 80+ moves, the tree is unimaginably huge — far beyond what any computer can fully search.

So for larger games, the algorithm gets modified in two important ways:

Depth limit: instead of searching to the end, search only a fixed number of moves ahead — say 10 or 15. Then use a heuristic evaluation function to estimate how good the position is at that depth. (For chess, this is the engine's evaluation: "+2.5" or "-1.0" reflects a rough material/positional assessment.)
Smarter move ordering: if you explore the most promising moves first, alpha-beta can prune more aggressively. Engines use various techniques (transposition tables, killer-move heuristics) to order moves well.

You can see this idea in action on our site: the 4×4 and 5×5 games use depth-limited minimax with a heuristic (counting threats). The Ultimate Tic-Tac-Toe AI does the same with a more elaborate scoring function that values both the small boards and the meta-board. Neither is mathematically perfect, but both play very strongly.

The deeper insight

Step back from the code and think about what minimax really is. It's a formalization of a basic human cognitive strategy: think ahead, anticipate the opponent's best response, and act on that anticipation. Every time you've played a strategic game well, you were doing some informal version of this. Minimax just makes the process explicit and exhaustive.

The reason it works on tic-tac-toe — and the reason the algorithm has remained important for 70+ years — is that this kind of reasoning is fundamental to competitive decision-making. It comes up in economics (game theory), in military strategy, in negotiation, in evolution (where "opponents" are other organisms), and of course in every two-player zero-sum game you can think of.

Tic-tac-toe is the smallest example we have of a competitive game with perfect information and clean rules. That's exactly what makes it the perfect place to learn minimax. The same algorithm that plays our "Impossible" mode is the conceptual ancestor of Deep Blue and AlphaGo. They've added a lot of refinements — neural networks for position evaluation, Monte Carlo tree search, massive parallel computation — but the heart is still there: think ahead, assume your opponent is rational, pick the move that wins against their best response.

It all starts with a nine-square grid.

Want to see this in action? Try playing the "Impossible" difficulty on Classic — that's pure minimax with alpha-beta. The Ultimate and Big Board games use the heuristic/depth-limited variant described above.

Play vs the AI →