Building a Chess Move Generator: What You Need to Get Started

Board representation, piece + move encoding, how to debug your move generator

Nov 30, 2022

This is everything you need to know to write a move generator. Strictly for freshman, not for veterans.

Board Representation

The intuitive way to represent the board is a 2d array. You trade a lot of performance, but it’s easy to code. This is the McDonald’s option, not the best but it’s quick and gets the job done.

For something more performant you might think to go with a 1d array. 64 squares, 64 indexes. But for the keen observer, you will notice how moves can wrap around the board (ie. to go from g1 to h1 add 1, but to go from h1 to a2 you also add 1), and the checks you would have to make to avoid this are actually quite cumbersome. The popular board representations fix this problem for us.

Mailbox (or 10x12)

The mailbox board representation refers to a 10x12 array (it’s supposed to look like a mailbox). You still have your 64 square board, but it’s surrounded by sentinel squares.

int[] mailbox = {
     -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
     -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
     -1,  0,  1,  2,  3,  4,  5,  6,  7, -1,
     -1,  8,  9, 10, 11, 12, 13, 14, 15, -1,
     -1, 16, 17, 18, 19, 20, 21, 22, 23, -1,
     -1, 24, 25, 26, 27, 28, 29, 30, 31, -1,
     -1, 32, 33, 34, 35, 36, 37, 38, 39, -1,
     -1, 40, 41, 42, 43, 44, 45, 46, 47, -1,
     -1, 48, 49, 50, 51, 52, 53, 54, 55, -1,
     -1, 56, 57, 58, 59, 60, 61, 62, 63, -1,
     -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
     -1, -1, -1, -1, -1, -1, -1, -1, -1, -1
};

Now, as you generate your moves, you can check the value of the square to see if the move wraps around.

2 extra ranks on the top and bottom are necessary to account for knight jumps. We can get away with one on the left and right because they stack.

Here are the pros and cons according to chessprogrammingwiki

Pros:
Easy, straightforward to understand and implement
Suitable with the same efficiency for any size of boards, from one can be fitted on 64-bit integers to much larger. Thus it is also easier to support multi-chess variants which boards’ sizes are quite different.
Suitable for almost all chess tasks and software where computing speed is not high requirements such as chess tools, GUI since it is much easier to develop and maintain.
Cons:
Programming must use many loop and branch commands such as for, while, if (compare with Bitboards)
Hard to store patterns and match them
May have some inefficient high-frequency-use functions such as finding King locations, in-check
Not efficient to generate moves in stages since generating all moves, captures only, non-captures may take quite similar periods

0x88 (or 16x8)

The 0x88 board is represented by a 16x8 array.

Here you have a “ghost” board next to your real, actual board.

Instead of relying on sentinel values like the mailbox method, the 0x88 relies on a trick. Each of the on-board indexes, when joined with the hexadecimal 0x88 with the bitwise AND operator, will result in 0.

boolean valid = destination & 0x88 == 0

Any indices from the ghost board and any negative indices will give you a 1.

For a thorough breakdown of how this works, check out this post on the Mediocre Chess blog.

Attack lookup

The advantage the 16x8 board has over the mailbox method is you can easily find the relationship between two squares by finding the difference between the indices.

A difference of 3 will always mean the squares are on the same rank. A difference of 17 will always mean the squares lay on the same diagonal. This is not true for the mailbox method.

This stability makes it possible to calculate attacks without manually generating moves which makes it much faster to compute checks and find capture/non-capture moves.

You can use this information to come up with an attack array which outlines the pieces that are capable of attacking the square for any given difference. Let’s say you want to see if your king on E1 can be attacked by a bishop on C3.

int difference = E1 - C1 + 119 (add 119 because that is the index of the last square on the board and you want to avoid negative numbers)

Then if ATTACK_TABLE[difference] includes a bishop, then the king is in check.

I found this attack lookup from the Mediocre Chess blog, and I’ve done the work of validating it. (Mine is flipped, he calculates his from the pov of the piece attacking, I calculate mine from the pov of the square being attacked. Also I cut mine down to a size of 240.)

int[] ATTACK_TABLE = {
        5, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 5, 0, 0, 5, 0, 0,
        0, 0, 0, 2, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 5, 0, 0, 0, 0, 2,
        0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 2, 0, 0, 0, 5,
        0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 2, 0, 0, 5, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 5, 6, 2, 6, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 6, 3, 1, 3, 6, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 1, 0,
        1, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 6, 4, 1, 4, 6, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 6, 2, 6, 5, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 5, 0, 0, 2, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 5,
        0, 0, 0, 2, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 2,
        0, 0, 0, 0, 5, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0,
        0, 5, 0, 0, 5, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 5
}

possible attackers
0 = no attacks
1 = king queen rook
2 = queen rook
3 = king queen bishop white pawn
4 = king queen bishop black pawn
5 = queen bishop
6 = knight

For sliding pieces like bishops, queens, and rooks, you would need to find if there are any obstacles in the way . For this it is helpful to have an offset array. (credit to Mediocre Chess blog for putting this together. The same changes I made to the ATTACK_TABLE I made to the OFFSET_TABLE)

int[] OFFSET_TABLE = {
        17, 0, 0, 0, 0, 0, 0, 16, 0, 0, 0, 0, 0, 0, 15, 0, 0, 17, 0, 0,
        0, 0, 0, 16, 0, 0, 0, 0, 0, 15, 0, 0, 0, 0, 17, 0, 0, 0, 0, 16,
        0, 0, 0, 0, 15, 0, 0, 0, 0, 0, 0, 17, 0, 0, 0, 16, 0, 0, 0, 15,
        0, 0, 0, 0, 0, 0, 0, 0, 17, 0, 0, 16, 0, 0, 15, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 17, -33, 16, -31, 15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, -18, 17, 16, 15, -14, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0,
        -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 14, -15, -16, -17, 18, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, -15, 31, -16, 33, -17, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, -15, 0, 0, -16, 0, 0, -17, 0, 0, 0, 0, 0, 0, 0, 0, -15,
        0, 0, 0, -16, 0, 0, 0, -17, 0, 0, 0, 0, 0, 0, -15, 0, 0, 0, 0, -16,
        0, 0, 0, 0, -17, 0, 0, 0, 0, -15, 0, 0, 0, 0, 0, -16, 0, 0, 0, 0,
        0, -17, 0, 0, -15, 0, 0, 0, 0, 0, 0, -16, 0, 0, 0, 0, 0, 0, -17
};

Bitboards

This is out of my depth as I don’t have any experience with this so check out the chessprogrammingwiki for more info.

This is the hardest to implement as well as the most performant - in both space and compute time.

Bitboards only use 64-bits to represent the board, and instead of having one centralized board state, you have a bitboard for each piece type of each color.

To generate moves, bitwise operators are used. This is much more efficient than adding/subtracting offsets.

I will try to tackle this in the future, but for now, this is all I’ve got.

Hybrid approaches

Hybrid approaches are generally used to compensate for the weaknesses of your primary board representation.

One hybrid approach you can try is the 0x88 + mailbox.

0x88 + mailbox

This approach uses a 15x12 array to represent the board. Here the ghost board is filled with sentinel values and just like the 10x12 approach, the top and bottom are both padded with two sentinel ranks.

This approach is more efficient than the standard 0x88 method because since you have to check the value of the board to see if the square is occupied anyways, the 0x88 check is a bit redundant.

Unlike the mailbox approach, however, the addition of the extra columns makes using an attack table possible. (the difference between h1 and a2 would be 8, but the maximum difference between any square on the same rank would be 7)

Piece Lists

If you are using bitboards you can skip the section.

One downfall of the centralized board approach is it can be difficult to find all your pieces. You can either iterate over your board to find your pieces or you can keep the positions of your pieces in a list.

You can either use index ranges, disjoint lists or arrays, or even a combination of both to differentiate piece type and color. For example this is how I store my piece lists:

public static Map<Color, Square[]> pieceList = new HashMap<>();

static {
        pieceList.put(Color.W,
                new Square[51]);
        pieceList.put(Color.B,
                new Square[51]);
}

Basically it’s a map (or object if you use js) with white or black as its keys. And the values are arrays of length 51 that holds the squares of the pieces.

Now, obviously there aren’t 51 pieces on the board let alone for one side. But the reason you want have an array that large is to accommodate promotions. There are 8 pawns on the board, so hypothetically it’s possible to have 10 rooks or 10 knights. So for each piece, they have an index range of 10 ie. 0 - 9 are for pawns, 10 - 19 are for knights, and so on. The only piece this doesn’t apply to is the king, which is the last member of the array.

Some engines use one long array, some engines use completely disjointed arrays for each piece of each color. Some even use linked lists. (Linked lists make it easier to update the piece list as you make/unmake your moves)

Just make sure that, if you’re using a fixed size data structure, you give the pieces that pawns can promote into 10 slots.

Piece Encoding

The way I did it was by assigning a number to each piece type like so:

NULL = 0;

PAWN = 1;

KNIGHT = 2;

BISHOP = 3;

ROOK = 4;

QUEEN = 5;

KING = 6;

and assigning 8 and 16 to the colors:

WHITE = 8;

BLACK = 16;

Then, you join a color with a piece type with a bitwise OR operation to get a piece.

WHITE | PAWN

This way the first 3 bits can be used to identify the piece type and the last 2 bits can be used to identify color.

01 001 ← white pawn

10 110 ← black king

Credit to Sebastian Lague for this method.

Move Encoding

The easiest, most readable way of encoding a move is to use maps or objects. Things you will have to account for are:

from square
to square
castling
piece type of pawn promotion

To save space, you can use a 21-bit integer to store this information instead, where the:

first 7 bits refer to the "to" square index
next 7 bits refer to the "from" square index
next 4 bits refer to castling
next 3 bits refer to promotion piece type

So if you were castling kingside, the move might look like this:

((W_CASTLE_KINGSIDE.value << 14) | E1.idx << 7) | G1.idx

where W_CASTLE_KINGSIDE.value would either be 8, 4, 2, or 1.

Encoding castling rights

The reason for those numbers is because this allows you to have a bit for each castling right.

W_CASTLE_KINGSIDE.value = 8; (binary: 1000)

W_CASTLE_QUEENSIDE.value = 4; (binary: 0100)

B_CASTLE_KINGSIDE.value = 2; (binary: 0010)

B_CASTLE_QUEENSIDE.value = 1; (binary: 0001)

Then all you need to do to see if white can castle kingside is do a bitwise AND operation like so

boolean valid = (castleRights & W_CASTLE_KINGSIDE.value) != 0

Debugging Your Move Generator

Fen strings

Fen strings are a string representation of a board state. They will come in handy when you need to debug your move generator so your engine should have some way of decoding a fen string.

Here is my code. It’s written in Java. Feel free to use it.

public static void loadFen(String fen) {
    board = new int[128];

    Arrays.fill(pieceList.get(Color.W),
            Square.NULL);
    Arrays.fill(pieceList.get(Color.B),
            Square.NULL);

    Map<Character, Piece> pieceMap = new HashMap<>();
    pieceMap.put('k',
            Piece.KING);
    pieceMap.put('q',
            Piece.QUEEN);
    pieceMap.put('r',
            Piece.ROOK);
    pieceMap.put('n',
            Piece.KNIGHT);
    pieceMap.put('b',
            Piece.BISHOP);
    pieceMap.put('p',
            Piece.PAWN);

    Map<Character, Integer> castleMap = new HashMap<>();
    castleMap.put('K',
            8);
    castleMap.put('Q',
            4);
    castleMap.put('k',
            2);
    castleMap.put('q',
            1);

    String[] fenState = fen.split(" ");

    String fenBoard = fenState[0];
    int rank = 7, file = 0;

    for (int i = 0; i < fenBoard.length(); i++) {
        char c = fenBoard.charAt(i);

        if (c == '/') {
            file = 0;
            rank--;
            continue;
        }

        if (Character.isDigit(c)) {
            file += Character.getNumericValue(c);
            continue;
        }

        // c represents a piece
        char type = Character.toLowerCase(c);
        Color color = type == c ?
                Color.B :
                Color.W;
        int idx = rank * 16 + file;
        Piece piece = pieceMap.get(type);
        board[idx] = color.id | piece.id;
        pushToPieceList(color,
                piece,
                Square.lookup.get(idx));
        file++;
    }

    String fenCastle = fenState[2];
    castleRights = 0;
    if (!fenCastle.equals("-")) {
        for (int i = 0; i < fenCastle.length(); i++) {
            castleRights |= castleMap.get(fenCastle.charAt(i));
        }
    }

    activeColor = Objects.equals(fenState[1],
            "w") ?
            Color.W :
            Color.B;

    enPassant = !Objects.equals(fenState[3],
            "-") ?
            Square.valueOf(fenState[3].toUpperCase()) :
            Square.NULL;

    halfmoves = Integer.parseInt(fenState[4]);

    zobristHash = ZobristKey.hash();
}

Don’t worry about the zobrist hash for now. I’ll go over that in a later post.

Quick breakdown of the code

The board state portion of the fen string is represented as such:

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR

(fen string for the initial position)

Each rank is separated by a “/” and it reads left-to-right, from the 8th rank to the 1st. Lowercase letters represent black pieces, uppercase white pieces. Numbers represent empty squares.

String fenBoard = fenState[0];
    int rank = 7, file = 0;

    for (int i = 0; i < fenBoard.length(); i++) {
        char c = fenBoard.charAt(i);

        if (c == '/') {
            file = 0;
            rank--;
            continue;
        }

        if (Character.isDigit(c)) {
            file += Character.getNumericValue(c);
            continue;
        }

        // c represents a piece
        char type = Character.toLowerCase(c);
        Color color = type == c ?
                Color.B :
                Color.W;
        int idx = rank * 16 + file;
        Piece piece = pieceMap.get(type);
        board[idx] = color.id | piece.id;
        pushToPieceList(color,
                piece,
                Square.lookup.get(idx));
        file++;
    }

So you start with two variables: rank and file. Rank is initiated to 7 (because you start on the last rank), file is initiated to 0 (because you start on the 1st file).

As you loop over the string, if you encounter a “/” (which means a new rank) you reset the file to 0 (because you start on the first file always), decrement the rank, and escape the current iteration of the loop.

If you encounter a number, you add that number to the your files variable because those are all empty squares.

If it’s a piece, then you get the color and piece type, place the piece on the right index, which if you’re using a 0x88 board like I am, is

rank * 16 + file

then increment the file.

Perft

Once you have your move generator mostly working, you want to test it to see if there are any edge cases you are missing. In chess, this is done by counting the number of moves your move generator creates and comparing it to numbers confirmed by other engines.

In the initial position at at depth of 1 there are 20 possible moves. At a depth of 2, that number grows to 400 possible combinations. At depth 3, 8,902. And you want to make sure your move generator is hitting these numbers.

Here is a link to the perft numbers.

Here is the code I’ve written to count nodes:

public static int countNumOfPositions(int depth, boolean print) {

    if (depth == 0) return 1;
    int count = 0;
    UnmakeDetails moveDetails = new UnmakeDetails();

    List<Integer> moves = getValidMoves(activeColor);
    for (Integer move : moves) {
        int moveCount = 0;
        makeMove(move, moveDetails);
        moveCount += countNumOfPositions(depth - 1);
        count += moveCount;
        if (print) {
            System.out.printf("%s%s: %d; depth: %d %n", Square.lookup.get((move >> 7) & 127),
                    Square.lookup.get(move & 127),
                    moveCount, depth);
        }
        unmakeMove(moveDetails);
        moveDetails.reset();
    }
    return count;
}

public static int countNumOfPositions(int depth) {
    if (depth == 0) return 1;
    int count = 0;
    UnmakeDetails moveDetails = new UnmakeDetails();

    List<Integer> moves = getValidMoves(activeColor);
    for (Integer move : moves) {
        makeMove(move, moveDetails);
        count += countNumOfPositions(depth - 1);
        unmakeMove(moveDetails);
        moveDetails.reset();
    }
    return count;
}

I included the option to print moves generated and their count because being able to read the results for each branch is very helpful when trying to debug. You can compare your results with Stockfish’s results to see which branch is acting up.

Installing stockfish

This is for Linux users only as that is the only OS I’ve installed Stockfish on.

Download Stockfish here.

Unzip the contents, then open up your terminal and cd into the /src directory of your stockfish folder. Then you need to build the executable like so:

$ make build ARCH=x86-64

To run stockfish enter this command:

$ ./stockfish

Using stockfish

Here’s a link to all the stockfish cli commands, but here are the relevant bits.

$ position fen $FEN_STRING

This command loads a fen string.

$ position startpos

This command resets the board state to the intial position.

$ position (fen $FEN_STRING || startPos) moves $MOVE1 ...$MOVEN

You can load up a position and then make moves to get to your desired board state. Moves are writting in long algebraic notation. If you’re not a chess nerd like me and that makes no sense to you, here are some examples:

e2e4, e7e5, e1g1 (white short castling), e7e8q (for promotion)

Once you loaded up your position, you can run this command to get the perft results:

$ go perft $DEPTH

You should get something like this:

To debug your move generator, you want to check for any inconsistencies then step through the branch(es) until you can figure out what the issue is.

There is a tool called perftree that can help you do this semi-automatically. I wasn’t able to get it working, but you might be better at it than me. Here’s a link.

Tricky rules/Common bugs

enPassant
pins
interaction between enPassant + pins
can’t castle through/into/out of checks
toggling castling
unmake function

Last Thoughts

As someone who doesn’t have a strong technical background, a lot of this stuff was intimidating to me before I started - especially dealing with binary. But with some help from the resources available on the internet, I’m now fairly comfortable with this stuff. Big shoutout to these resources:

Alex’s Newsletter

Discussion about this post