Created initial CEGO protocol driver code and search code. Updated Burn to 0.11.1 to utilize new features.
miestrode · Dec 13, 2023 · 8dcbb2c
1 parent 26785e0
commit 8dcbb2c
Showing 29 changed files with 402 additions and 231 deletions.
diff --git a/README.md b/README.md
@@ -1,36 +1,17 @@
 # The Hash Chess Engine
-
 [![Status](https://github.com/miestrode/hash/workflows/Rust/badge.svg)](https://github.com/miestrode/hash/actions)
 
-Hash is an experimental Chess engine written in Rust, with the goal of putting to use recent advancements in statistics,
-computer science and computer Chess.
-Unlike most traditional Chess engines, Hash doesn't use the alpha-beta framework, and instead opts to perform directed
-tree search in the form of AlphaZero-style MCTS. However, unlike Chess engines such as Leela Chess Zero, Hash
-incorporates new ideas in its search, utilizing root-tree parallelization and move-picking via Murphy Sampling, which
-should greatly improve its play.
-
-A secondary goal of Hash is to use as much Rust as possible in its design, to test the boundaries of what is possible
-to do well currently, using Rust. Some areas may suffer, or just won't use Rust as a result, such as network training.
-
-## To do
-
-### Move generation (`hash-core`)
+Hash is an experimental Chess engine written in Rust, with the goal of putting to use recent advancements in statistics, computer science and computer Chess. Unlike most traditional Chess engines, Hash doesn't use the alpha-beta framework, and instead opts to perform directed tree search in the form of AlphaZero-style MCTS, while in the future, incorporating and facilitating new ideas in network architecture and MCTS algorithmics.
 
-- [ ] Make the FEN parser fail when the board it is parsing is illegal, as `Board` should never, whilst only using safe
-  functions result in an invalid position.
-- [ ] Try to reimplement the `Pins` data structure and other ideas from the old move generation code. It is possible
-  that reimplementing the generation of slide constraints could make it a viable, fast option again.
-- [ ] Refactor the build script, and it's magic bitboards setup (consider using `phf`, and unrelatedly switching to
-  black
-  magic bitboards)
+A secondary goal of Hash is to use as much Rust as possible in its design, to test the boundaries of what is possible to do with modern Rust. Therefore, Hash currently uses the Burn deep learning framework for running its neural networks, instead of more established options, such as Tensorflow or PyTorch. The hope is that, in the future, there will be less of a feature gap between the frameworks, and that optimization tools will grow to support Burn and similar Rust-based projects.
 
-### MCTS
+Hash is currently in the process of being written, and has not officially released in any form. It is unlikely the code in the repository here currently works as a full Chess engine.
 
-- [ ] Create an MCTS searcher using the networks (incorporating parallelism, Murphy Sampling and the like)
-- [ ] Consider not tying a board to the tree, saving memory
-- [ ] Consider to the contrary tying the relevant move to each child, or at least a move integer.
+## CGCF (or, why Hash doesn't support UCI)
+Hash doesn't support UCI, and instead uses its bespoke protocol, CGCF (Chess engine Game Control Format). The reasons for this are explained in [here](docs/CGCF.md). It suffices to say, we felt UCI assumed too many things about the engines implementing it, and that running Hash on a regular GUI was not a sought after goal at this time.
 
-### Network training
+### Documentation
+As we feel Hash is a sufficiently large project, documentation explaining things such as its current network structure and things of the like can be seen in [here](docs/). Note that documentation is currently largely incomplete.
 
-- [ ] Create a network trainer in Rust
-- [ ] Create an evaluation framework, similar to FishTest or OpenBench
+## Contributing
+The project currently will not accept contributions which significantly alter the source code, and so does not have guidelines for doing so. This is because things are currently far too underdeveloped. In the future, a `CONTRIBUTING.md` file will be made.
diff --git a/assets/docs/CEGO/REVISION-1/En passant.png b/assets/docs/CEGO/REVISION-1/En passant.png
diff --git a/docs/CEGO/REVISION-1.md b/docs/CEGO/REVISION-1.md
@@ -0,0 +1,93 @@
+# The CEGO (Chess Engine Game Operation) protocol, revision 1
+The CEGO protocol exists to answer a very specific need in Chess programming: the simple and fast operation of a Chess game between two Chess engines. All of this, while assuming little about how these engines operate. There exist other protocols to achieve this, such as the CECP, or UCI protocols. These answer a different, sometimes more general issue however, that of controlling a Chess engine in a GUI. In addition, they assume certain things on how engines implementing them operate. How can one generate a principal variation if all they are using is a neural network to pick the best move? This is not to say these protocols are bad, although they have some questionable and dated design decisions, but rather to say a better solution for the use-case described above may exist. It is the goal of this protocol to present such a solution.
+
+This document details revision 1 of CEGO. Whenever any change will be made to the specifications of the protocol, a new revision with an incremented number will be made. It is up to mediators and engines (see "Communication") to disclose which revision they currently use/support.
+
+## Communication
+As a protocol, CEGO facilitates communication between two entities: an engine, and a mediator. A mediator here is a general term for any program which is operating a Chess game. This communication is done in plain text, encoded in ASCII, by passing messages using the standard streams. Specifically, the mediator sends text to the standard input of the engine, and the engine sends text to its standard output. Every message must contain exactly one newline, at its end. On Windows, this means `\r\n`, and on other platforms, `\n`.
+
+Due to many reasons, communication can sometimes fail, resulting in malformed updates. Therefore, it is important to understand CEGO implements no mechanisms for fault tolerance. In other words, it fails *hard* and *fast*. In addition, text in the protocol is case-sensitive, and arbitrary whitespace separators, like those seen in UCI cannot be used.
+
+The specification here is written while only referring to the communications between the mediator and a single engine, although in actual use, the mediator would have to communicate with two Chess engines. Luckily, the protocol is entirely symmetric, making this trivial. Whenever one engine "sends a move", the mediator sends a message to the other Chess engine, using the protocol, so it can make a move. Once it does, that move is sent by the mediator to the original Chess engine, and this process repeats.
+
+## Message styling and protocol invariants
+When displaying raw text in this specification, it should be interpreted as is, unless part of it is of the form:
+```
+<NAME>
+```
+where `NAME` is some kind of "name". When this is the case, `<NAME>` is a "parameter", and may be any string, based on the restrictions provided. Throughout the specification, parameters relating to time will always be in nanoseconds, as an integer. This means that 30 seconds would be represented as `30000000000`. Additionally, any Chess moves will be represented in long algebraic notation, as seen in UCI.
+
+### Long Algebraic Notation
+Every single move in long Algebraic Notation is of the form:
+```
+<ORIGIN-SQUARE><TARGET-SQUARE><PROMOTION>
+```
+
+such that these parameters take on different values for different kinds of moves, as described below. In all cases however, the square parameters will contain values such as `a4`, `e7`, and others, and the promotion parameter, which represents a piece, may be a `q`, `r`, `b`, `n`, or simply nothing, representing a queen, rook, bishop, knight, and no promotion taking place respectively.
+
+#### Normal Chess moves
+For a normal Chess move, parameter values are as expected.
+#### Promotions
+Promotions, which of course can only happen with pawns, will have the promotion parameter set to the piece the pawn is promoted to.
+
+#### Castling
+When castling, the origin and target squares used are those of the king. This means that for a king-side castling by white, the move will be `e1g1`.
+
+#### En passants
+For en passants, the origin and target squares used are those of the pawn performing the capture. Therefore, for the following board:
+
+![A white pawn on E5, about to en passant a pawn on D5](../../assets/docs/CEGO/REVISION-1/En passant.png)
+
+the highlighted move will be `e5d6`.
+
+## Stages
+
+### Initialization
+The first stage of CEGO is initialization. It ensures the engines are ready to play, as some engines have long initialization times, caused by, for example, loading a neural network or downloading a tablebase. Therefore, once communication begins, each engine should send the message:
+```
+ready\n
+```
+as soon as it is ready to play. Note that like all messages, this one should be terminated by a newline, as seen above.
+
+### First move
+After the mediator receives the confirmation, it should send the engine a first move message, of the form:
+```
+<YOUR-TIME> <YOUR-INCREMENT> <OPPONENT-TIME> <OPPONENT-INCREMENT> <CURRENT-BOARD-FEN>\n
+```
+where `<CURRENT-BOARD-FEN>` is the current position of the board, at the time of the engine's first move, in FEN notation. Note the increments which are a part of this message. `<YOUR-INCREMENT>` should be added to the time left for the engine once it makes its move. With this in mind, one can see that this protocol doesn't support timing methods, such as Bronstein.
+
+Once the engine makes its move, it should send a message of the form:
+```
+<MOVE-TO-PLAY>\n
+```
+where `<MOVE-TO-PLAY>` is the move it has chosen to make. Even after the engine sends its move, it is free to ponder or do anything else. This is true in general: the engine may do anything at any point in time, provided it precisely follows the protocol.
+
+### Subsequent moves
+Once the mediator receives the opponent's chosen move, it should send a message of the form:
+```
+<YOUR-TIME> <OPPONENT-TIME> <PLAYED-MOVE>\n
+```
+where `<PLAYED-MOVE>` is the move the opponent made. Notice the increments are not present in this message, as they are constant throughout the game. Like in the "First move" section, the engine should reply with the move it has chosen to make. Once this is done, it will eventually receive a message of the same form as above, at which point the cycle repeats.
+
+## Termination of the game
+### By the mediator
+The mediator may stop the game at any time, at which point, they should terminate the two engine processes. No notice needs to be given to the engines. Despite this, the mediator *must* stop the game if it has reached a "definite conclusion". This means either stalemate, checkmate, or a draw by FIDE rules. When a game is terminated due to a FIDE draw, the mediator must make sure the playing engine could not deliver mate in their turn.
+
+Additionally, the mediator must stop the game if at least one of the engine processes somehow stops.
+
+### By an engine
+When the game is terminated by an engine, unlike the mediator, it must be for a specific reason, and therefore, with a specific message to the mediator.
+
+#### Encountering a protocol error
+When an engine encounters an error with the last message sent by the mediator, it will send the message:
+```
+error\n
+```
+Once this message is sent, the engine may quit, and the game should be terminated by the mediator.
+
+#### Forfeiting the game
+When the engine sees fit, instead of sending the expected message, it may send the message:
+```
+forfeit\n
+```
+to notify the mediator it has forfeited the game. Once this happens, the game must be terminated, and the engine may quit. Note that there's no need to use this mechanism for ending a game due to an internal engine error, as simply stopping the engine process will terminate the game. Therefore, a forfeit should be considered a win for the other engine, at all times.
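
The message formats specified in `docs/CEGO/REVISION-1.md` above are strict enough that an engine could parse them with a few small functions. The following is a minimal sketch, not the driver code from this commit. It assumes the three wire formats described in the specification (long algebraic notation moves, the first move message, and the subsequent move message). The names `LanMove`, `FirstMove`, `SubsequentMove`, and the `parse_*` functions are hypothetical and exist only for illustration. The FEN field is passed through unvalidated.

```rust
use std::time::Duration;

/// A move in long algebraic notation, e.g. `e1g1` or `e7e8q`.
/// Squares are stored as (file, rank), each in 0..8, with `a1` = (0, 0).
#[derive(Debug, PartialEq, Eq)]
struct LanMove {
    origin: (u8, u8),
    target: (u8, u8),
    promotion: Option<char>,
}

#[derive(Debug, PartialEq, Eq)]
struct FirstMove<'a> {
    your_time: Duration,
    your_increment: Duration,
    opponent_time: Duration,
    opponent_increment: Duration,
    fen: &'a str, // not validated in this sketch
}

#[derive(Debug, PartialEq, Eq)]
struct SubsequentMove {
    your_time: Duration,
    opponent_time: Duration,
    played_move: LanMove,
}

fn parse_square(s: &str) -> Option<(u8, u8)> {
    match s.as_bytes() {
        // The protocol is case-sensitive, so only lowercase files are accepted.
        [f @ b'a'..=b'h', r @ b'1'..=b'8'] => Some((f - b'a', r - b'1')),
        _ => None,
    }
}

fn parse_move(s: &str) -> Option<LanMove> {
    // The ASCII check makes the byte-offset slicing below safe.
    if !s.is_ascii() || !(4..=5).contains(&s.len()) {
        return None;
    }
    let promotion = match s.as_bytes().get(4).copied() {
        None => None,
        Some(p @ (b'q' | b'r' | b'b' | b'n')) => Some(p as char),
        Some(_) => return None,
    };
    Some(LanMove {
        origin: parse_square(&s[0..2])?,
        target: parse_square(&s[2..4])?,
        promotion,
    })
}

/// Times are non-negative integers in nanoseconds; signs are rejected.
fn parse_nanos(s: &str) -> Option<Duration> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    s.parse::<u64>().ok().map(Duration::from_nanos)
}

/// Every message must end in exactly one newline (`\r\n` or `\n`).
fn strip_newline(line: &str) -> Option<&str> {
    line.strip_suffix("\r\n").or_else(|| line.strip_suffix('\n'))
}

fn parse_first_move(line: &str) -> Option<FirstMove<'_>> {
    // FEN itself contains spaces, so only the first four separators split.
    let mut fields = strip_newline(line)?.splitn(5, ' ');
    Some(FirstMove {
        your_time: parse_nanos(fields.next()?)?,
        your_increment: parse_nanos(fields.next()?)?,
        opponent_time: parse_nanos(fields.next()?)?,
        opponent_increment: parse_nanos(fields.next()?)?,
        fen: fields.next()?,
    })
}

fn parse_subsequent_move(line: &str) -> Option<SubsequentMove> {
    // Splitting on a single space rejects runs of whitespace: a doubled
    // space produces an empty field, which fails to parse.
    let mut fields = strip_newline(line)?.split(' ');
    let message = SubsequentMove {
        your_time: parse_nanos(fields.next()?)?,
        opponent_time: parse_nanos(fields.next()?)?,
        played_move: parse_move(fields.next()?)?,
    };
    // Trailing fields are a protocol error.
    if fields.next().is_some() {
        return None;
    }
    Some(message)
}
```

Under this sketch, any `None` result would correspond to the point where the specification has the engine send `error\n`, since CEGO defines no recovery from malformed messages.
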
diff --git a/docs/TOPOLOGY.md b/docs/TOPOLOGY.md
@@ -0,0 +1,20 @@
+# Topology of the project
+Below is an explanation of the different files and folders in this project.
+
+## `hash-bootstrap`
+Is a crate containing the basic constructs needed for the build script in `hash-core` to function. The build script is required for generating certain lookup tables for move generation.
+
+## `hash-core`
+Is a crate containing code for performing move generation and representing a Chess board.
+
+## `hash-network`
+Is a crate containing the model definition and supporting code for the Hash neural network (currently H0). It uses the Burn deep learning framework for this.
+
+## `hash-train`
+Is a binary crate that uses `hash-network` in order to train a network using its model, and then save it so it can be used by the engine.
+
+## `hash-search`
+Is a crate that implements the primary searching logic for the engine, by providing an advanced searching algorithm based on AlphaZero-style MCTS.
+
+## `hash-engine`
+Is a binary crate functioning as the front-end for the Hash Chess engine. It contains logic for managing search using operations provided by `hash-search` and the networks produced by `hash-train`, and implements the CGCF protocol. It is intended to be used as a command-line program and has a CLI.
diff --git a/docs/networks/H0.md b/docs/networks/H0.md
@@ -0,0 +1 @@
+# The H0 network architecture
diff --git a/hash-core/src/board.rs b/hash-core/src/board.rs
@@ -1,11 +1,10 @@
 use std::{fmt, fmt::Display, mem, str::FromStr};
 
 use crate::{
-    cache::CacheHash,
     index,
     index::zobrist,
     mg,
-    repr::{ColoredPieceTable, Move, Piece, PieceKind, PieceTable, Player},
+    repr::{ChessMove, ColoredPieceTable, Piece, PieceKind, PieceTable, Player},
 };
 use hash_bootstrap::{BitBoard, Color, Square};
 
@@ -23,12 +22,6 @@ pub struct Board {
     pub hash: u64,
 }
 
-impl CacheHash for Board {
-    fn hash(&self) -> u64 {
-        self.hash
-    }
-}
-
 impl Board {
     pub fn starting_position() -> Self {
         // Taken from https://en.wikipedia.org/wiki/Forsyth%E2%80%93Edwards_Notation
@@ -124,7 +117,7 @@ impl Board {
     }
 
     // INVARIANT: The passed move must be legal in relation to the current board.
-    pub unsafe fn make_move_unchecked(&mut self, chess_move: &Move) {
+    pub unsafe fn make_move_unchecked(&mut self, chess_move: &ChessMove) {
         self.en_passant_capture_square = None;
         self.checkers = BitBoard::EMPTY;
         self.pinned = BitBoard::EMPTY;
@@ -287,7 +280,7 @@ impl Board {
         }
     }
 
-    pub fn gen_child_boards(&self) -> impl Iterator<Item = (Move, Board)> + '_ {
+    pub fn gen_child_boards(&self) -> impl Iterator<Item = (ChessMove, Board)> + '_ {
         mg::gen_moves(self).into_iter().map(|chess_move| {
             let mut new_board = *self;
             unsafe {

diff --git a/hash-core/src/cache.rs b/hash-core/src/cache.rs
diff --git a/hash-core/src/game.rs b/hash-core/src/game.rs
@@ -2,48 +2,26 @@ use std::str::FromStr;
 
 use hash_bootstrap::Color;
 
-use crate::{board::Board, cache::Cache, mg, repr::Move};
-
-const REPETITIONS: usize = 1000;
+use crate::{board::Board, mg, repr::ChessMove};
 
 pub enum Outcome {
     Win(Color),
     Draw,
 }
 
-#[derive(PartialEq)]
-enum Repetition {
-    Once,
-    Never,
-}
-
 pub struct Game {
     board: Board,
-    repetitions: Cache<Board, Repetition, REPETITIONS>,
 }
 
 impl Game {
     pub fn starting_position() -> Self {
         Self {
             board: Board::starting_position(),
-            repetitions: Cache::new(),
         }
     }
 
-    fn was_current_board_repeated_thrice(&self) -> bool {
-        if let Some(repetition) = self.repetitions.get(&self.board) {
-            *repetition == Repetition::Once
-        } else {
-            false
-        }
-    }
-
-    fn can_either_player_claim_draw(&self) -> bool {
-        self.board.min_ply_clock >= 100 || self.was_current_board_repeated_thrice()
-    }
-
     pub fn outcome(&self) -> Option<Outcome> {
-        if mg::gen_moves(&self.board).is_empty() || self.can_either_player_claim_draw() {
+        if mg::gen_moves(&self.board).is_empty() {
             Some(if self.board.in_check() {
                 Outcome::Win(!self.board.playing_color)
             } else {
@@ -54,16 +32,7 @@ impl Game {
         }
     }
 
-    pub unsafe fn make_move_unchecked(&mut self, chess_move: Move) {
-        self.repetitions.insert(
-            &self.board,
-            if self.repetitions.get(&self.board).is_none() {
-                Repetition::Never
-            } else {
-                Repetition::Once
-            },
-        );
-
+    pub unsafe fn make_move_unchecked(&mut self, chess_move: ChessMove) {
         // SAFETY: Move is assumed to be legal in this position
         unsafe {
             self.board.make_move_unchecked(&chess_move);
@@ -81,7 +50,6 @@ impl FromStr for Game {
     fn from_str(s: &str) -> Result<Self, Self::Err> {
         Ok(Self {
             board: Board::from_str(s)?,
-            repetitions: Cache::new(),
         })
     }
 }
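
With the repetition cache removed from `Game`, `outcome` in this diff only reports checkmate and stalemate. The CEGO specification above requires the mediator to stop a game on a FIDE draw, so repetition counting may need to live elsewhere. The following is a minimal sketch of one way to do that, not code from this commit. `RepetitionTracker` is a hypothetical type, and it assumes the `u64` passed in is a position hash such as `Board::hash`, and that this hash covers everything relevant to repetition (side to move, castling rights, en passant square), which the diff does not confirm.

```rust
use std::collections::HashMap;

/// Counts how many times each position hash has occurred in a game.
/// Hypothetical helper: equal hashes are treated as equal positions,
/// so a hash collision would be miscounted as a repetition.
#[derive(Default)]
struct RepetitionTracker {
    seen: HashMap<u64, u32>,
}

impl RepetitionTracker {
    /// Records one occurrence of a position and returns its total count.
    fn record(&mut self, position_hash: u64) -> u32 {
        let count = self.seen.entry(position_hash).or_insert(0);
        *count += 1;
        *count
    }

    /// Returns true if the position has occurred at least three times.
    fn is_threefold(&self, position_hash: u64) -> bool {
        self.seen.get(&position_hash).copied().unwrap_or(0) >= 3
    }
}
```

A caller would invoke `record` once for the starting position and once after every move, then check `is_threefold` for the resulting position. Unlike a fixed-size cache, a `HashMap` never evicts entries, at the cost of memory that grows with game length.
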
diff --git a/hash-core/src/lib.rs b/hash-core/src/lib.rs
@@ -1,7 +1,6 @@
 #![feature(test)]
 
 pub mod board;
-mod cache;
 pub mod game;
 mod index;
 pub mod mg;