Commit
v1.0.0: CLI, datasets, & README (#5)
## Overview
- add command line interface based on the `cxxopts` [library](https://github.com/jarro2783/cxxopts), and `src` build flow
- add three new word dictionary datasets
- add proper `README.md`
- add `GNU AGPL v3` license
- general code cleanup for release

## Details
- the CLI, after parsing options, instantiates a proper `cw_csp` and attempts to solve it
- add `cw_gen` class to handle CLI actions and own the `cw_csp` to be solved
- add example preset crosswords generation parameters to CLI
- add progress bar option
- datasets added were the top 3000 english words, top 10000 english words, and [Crossfire's](https://beekeeperlabs.com/crossfire/) default dictionary
- gifs in `README.md` created using [terminalizer](https://github.com/faressoft/terminalizer)

## Notes
- this is the last PR to be submitted before the project is put on pause until the summer 2024 SWE internship recruiting season is over or an offer is accepted
- official `v1.0.0` release to soon follow
- the `Future Work` section in `README.md` serves as the feature map for future versions
- `word_finder` (finally) updated to follow RAII as referenced in #4
1 parent 178a9d7, commit 028db42
Showing 30 changed files with 203,892 additions and 45 deletions.
@@ -31,6 +31,10 @@ | |
*.out | ||
*.app | ||
|
||
#macOS | ||
.DS_Store | ||
|
||
# build products | ||
*.o | ||
test/build | ||
test/build | ||
src/build |
@@ -1,3 +1,94 @@ | ||
# crossword-gen | ||
|
||
This project is in progress: to run current testing suite, navigate to `crossword-gen/test` and run `sh test.sh` | ||
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0) ![Version: 1.0](https://img.shields.io/badge/Version-1.0-blue) | ||
|
||
`cw_gen` lets you generate crossword puzzles by specifying a word list and a grid to fill in. | ||
|
||
![Animated terminal gif to illustrate generation of example crosswords](assets/example.gif) | ||
|
||
## 🚀 Installation | ||
|
||
![Animated terminal gif to illustrate building of project](assets/build.gif) | ||
|
||
The Unix binary is already included as `cw_gen`. To rebuild it, run the build script:
|
||
```bash | ||
sh build.sh | ||
``` | ||
|
||
## ⚙️ Usage | ||
|
||
![Animated terminal gif to illustrate generation of crosswords with user parameters](assets/demo.gif) | ||
|
||
```bash | ||
./cw_gen [OPTIONS] | ||
``` | ||
|
||
### 🧩 `size` | ||
|
||
| Option | Description | Type | Default | | ||
|----------------|--------------------------------------------------------------------------------------------------------------|---------|---------| | ||
| `-s` | Dimensions of the crossword to be generated, in the format `columns,rows`. Both dimensions must be positive, and a 1x1 grid is rejected. Large sizes may take several minutes or more to solve, especially with a large dictionary or with difficult or unspecified grid contents. | int,int | `4,4` |
|
||
### 📝 `grid contents` | ||
|
||
| Option | Description | Type | Default | | ||
|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------|---------| | ||
| `-c` | Crossword grid to fill in, from top left to bottom right by row, and **contained within quotes**. Each char represents one tile and must be a set letter (`a-z`), wildcard (`?`), or black tile (single space). Total length must equal `rows * columns`. If left unspecified, all tiles are set to wildcards. | string | none | | ||
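The row-by-row layout described in the table can be pictured with a small indexing sketch. This is only an illustration, not code from the project: `tile_at` is a hypothetical helper, and it assumes the contents string has already been checked to hold exactly `rows * columns` characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of cw_gen): returns the tile at (row, col)
// of a grid whose tiles are stored row by row, top left to bottom right.
char tile_at(const std::string& contents, std::size_t columns,
             std::size_t row, std::size_t col) {
    assert(columns > 0 && col < columns);            // column must be inside the row
    const std::size_t index = row * columns + col;   // skip `row` full rows, then `col` tiles
    assert(index < contents.size());                 // row must be inside the grid
    return contents[index];
}
```

For a 3x3 grid passed as `"cat? ?dog"`, the first three characters form the top row, the middle character (a single space) is a black tile, and the last three characters form the bottom row.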
|
||
### 📚 `word dictionary` | ||
|
||
| Option | Description | Type | Default | | ||
|---------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------|------------| | ||
| `-d` | Word list to use. Possible options from smallest to largest are `top1000`, `top3000`, `top10000`, `crossfire`, and `all`. Larger dictionaries are more effective for finding solutions for larger puzzles at the cost of likely increased runtime and possibly more unfamiliar words chosen. | string | `top10000` | | ||
|
||
### 📋 `examples` | ||
|
||
| Option | Description | Type | Default | | ||
|-------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------|---------| | ||
| `-e` | Preset example crossword generation params, **overriding size, contents, and dictionary if specified**. Possible options are `empty`, `cross`, `bridge`, `stairs`, `donut`, `crosshair1`, and `crosshair2`. | string | none | | ||
|
||
### 🗣️ `verbosity` | ||
|
||
| Option | Description | Type | Default | | ||
|---------------------|------------------------------------------------------------------------------------------------------------------------------------|--------|---------| | ||
| `-v` | Minimum verbosity for print statements. Possible options in decreasing level are `fatal`, `error`, `warning`, `info`, and `debug`. | string | `fatal` | | ||
|
||
### 🔄 `progress bar` | ||
|
||
| Option | Description | Type | Default | | ||
|----------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|------|---------| | ||
| `-p` | Enables the progress bar, which is useful when generating larger crossword puzzles with longer runtimes. Disabled by default. | none | none |
|
||
### ℹ️ `help` | ||
|
||
| Option | Description | Type | Default | | ||
|------------|--------------------------------------------------------------------|------|---------| | ||
| `-h` | Prints a usage summary of all options and their descriptions. | none | none |
|
||
Because generation is purely heuristic with no randomness, regenerating a crossword with the same parameters is likely to return the same output, especially when few valid solutions exist for those parameters. If grid contents are given, the string length must equal `rows * columns` for the specified size (16 for the default 4x4 puzzle).
|
||
## ✅ Tests | ||
|
||
![Animated terminal gif to illustrate testing the project](assets/test.gif) | ||
|
||
This project uses the [Catch2 v2.x](https://github.com/catchorg/Catch2/tree/v2.x) testing framework to test each module. To run the test suite, navigate to the `test/` directory and run the test script: | ||
|
||
```bash | ||
cd test | ||
sh test.sh | ||
``` | ||
|
||
## ⚙️ How it works | ||
|
||
After parsing the crossword parameters, `cw_gen` constructs a [constraint satisfaction problem (CSP)](https://en.wikipedia.org/wiki/Constraint_satisfaction_problem) that represents the crossword. CSP variables symbolize words, and their domains are initially set to all the words from the user-specified dictionary that fit in that slot. CSP constraints occur when any two variables intersect in the crossword; those two letters in each of the word variables must be equal. | ||
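As a rough illustration of the paragraph above, the sketch below builds a slot's initial domain from a word list and checks the equality constraint at one intersection. It is not the project's `cw_csp` code: the names `fits`, `initial_domain`, and `satisfies` are hypothetical, and the only detail taken from this README is that `?` marks a wildcard tile.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `word` fits `pattern`, where '?' is a wildcard
// and any other character is a letter that is already fixed in the grid.
bool fits(const std::string& word, const std::string& pattern) {
    if (word.size() != pattern.size()) return false;
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (pattern[i] != '?' && pattern[i] != word[i]) return false;
    }
    return true;
}

// Initial domain of a slot: every dictionary word that fits the slot's pattern.
std::vector<std::string> initial_domain(const std::vector<std::string>& dict,
                                        const std::string& pattern) {
    std::vector<std::string> domain;
    for (const std::string& word : dict) {
        if (fits(word, pattern)) domain.push_back(word);
    }
    return domain;
}

// Constraint between two crossing slots: letter i of `a` must equal letter j of `b`.
bool satisfies(const std::string& a, std::size_t i,
               const std::string& b, std::size_t j) {
    assert(i < a.size() && j < b.size());  // both indices must lie inside their words
    return a[i] == b[j];
}
```

With the word list `cat`, `cot`, `dog`, `cart` and the pattern `c?t`, the initial domain would hold `cat` and `cot`. An across word `cat` and a down word `arc` crossing at `cat[1]` and `arc[0]` satisfy the constraint because both letters are `a`.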
|
||
Using a combination of the [AC-3 algorithm](https://en.wikipedia.org/wiki/AC-3_algorithm) and [backtracking](https://en.wikipedia.org/wiki/Backtracking), the CSP iteratively assigns values to variables until a valid solution is found or the search space is exhausted. | ||
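The combination of AC-3 and backtracking can be sketched in a few dozen lines. The code below is a simplified textbook version under stated assumptions, not the solver shipped in `cw_csp`: `Arc`, `Domains`, `revise`, `ac3`, and `solve` are hypothetical names, slots are assigned in index order with no ordering heuristic, every arc is expected to be listed in both directions, and duplicate words across slots are not rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical model: letter `ia` of slot `a` must equal letter `ib` of slot `b`.
struct Arc { std::size_t a, ia, b, ib; };
using Domains = std::vector<std::vector<std::string>>;

// AC-3 "revise" step: drop every word in slot a that no word in slot b supports.
bool revise(Domains& dom, const Arc& arc) {
    bool changed = false;
    std::vector<std::string>& da = dom[arc.a];
    for (auto it = da.begin(); it != da.end();) {
        assert(arc.ia < it->size());
        bool supported = false;
        for (const std::string& w : dom[arc.b]) {
            assert(arc.ib < w.size());
            if ((*it)[arc.ia] == w[arc.ib]) { supported = true; break; }
        }
        if (supported) { ++it; } else { it = da.erase(it); changed = true; }
    }
    return changed;
}

// AC-3: revise arcs until nothing changes; returns false if a domain empties.
bool ac3(Domains& dom, const std::vector<Arc>& arcs) {
    std::deque<Arc> queue(arcs.begin(), arcs.end());
    while (!queue.empty()) {
        const Arc arc = queue.front();
        queue.pop_front();
        if (!revise(dom, arc)) continue;
        if (dom[arc.a].empty()) return false;
        // Slot a shrank, so every arc that relies on slot a must be rechecked.
        for (const Arc& other : arcs) {
            if (other.b == arc.a) queue.push_back(other);
        }
    }
    return true;
}

// Backtracking: fix one slot at a time, pruning with AC-3 after each choice.
bool solve(Domains dom, const std::vector<Arc>& arcs, std::size_t slot,
           std::vector<std::string>& out) {
    if (!ac3(dom, arcs)) return false;
    if (slot == dom.size()) {  // every slot now holds exactly one word
        for (const auto& d : dom) out.push_back(d.front());
        return true;
    }
    for (const std::string& word : dom[slot]) {
        Domains next = dom;      // copy so a failed branch leaves `dom` intact
        next[slot] = {word};
        if (solve(next, arcs, slot + 1, out)) return true;
    }
    return false;
}
```

Calling `solve(domains, arcs, 0, out)` either fills `out` with one word per slot or returns `false` once the search space is exhausted. Copying the domains at every level keeps the sketch short; a real solver would more likely record and undo prunings instead.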
|
||
## 💡 Future work | ||
|
||
- Incorporate word frequency or other scoring heuristics into word selection to minimize unfamiliar words used and speed up generation
- Explore alternatives to arc consistency such as path consistency to improve CSP simplification performance | ||
- Investigate different CSP solving options in addition to backtracking like the VLNS method, local search, and/or linear programming | ||
- Upgrade CSPs to be [dynamic CSPs](https://en.wikipedia.org/wiki/Constraint_satisfaction_problem#Dynamic_CSPs) to allow insertion of black tiles to complete crosswords. This will minimize user effort required to avoid impossible crossword grid patterns | ||
- Clue generation/lookup |
@@ -0,0 +1,24 @@ | ||
# ================================================================== | ||
# Author: Ashley Zhang ([email protected]) | ||
# Date: 9/6/2023 | ||
# Description: build script to build tool w/ Makefile | ||
# ================================================================== | ||
|
||
SRCDIR="src" | ||
BUILDDIR="build" | ||
MAINFILE="cw_gen" | ||
LOGFILE="buildlog" | ||
|
||
cd "${SRCDIR}" || exit 1  # stop here so the clean step below never runs outside src/
|
||
# soft clean | ||
rm -rf ${BUILDDIR} | ||
mkdir ${BUILDDIR} | ||
|
||
# build | ||
make > ${BUILDDIR}/${LOGFILE}.txt | ||
|
||
# clean up | ||
mv *.o ${BUILDDIR} | ||
cd .. | ||
mv ${SRCDIR}/${MAINFILE} ${MAINFILE} |
@@ -0,0 +1,78 @@ | ||
# ================================================================== | ||
# Author: Ashley Zhang ([email protected]) | ||
# Date: 7/4/2023 | ||
# Description: Makefile to build tool for running | ||
# ================================================================== | ||
|
||
# main directories | ||
SRC_DIR=./ | ||
|
||
# src dirs | ||
UTIL_DIR=utils | ||
COMMON_DIR=common | ||
WORDFINDER_DIR=word_finder | ||
CROSSWORD_DIR=crossword | ||
CWCSP_DIR=cw_csp | ||
CWGEN_DIR=cli | ||
|
||
# g++ settings | ||
D = -D | ||
DEFINES = $(D) CW_TOP | ||
CPPFLAGS += -pipe | ||
CXXFLAGS += -g -W -Werror -Wall -Wconversion -Wextra -std=c++17 $(DEFINES) | ||
INC=-I$(SRC_DIR) | ||
|
||
BIN = cw_gen | ||
|
||
all : $(BIN) | ||
|
||
clean : | ||
rm -f $(BIN) *.o | ||
|
||
# path of where object files will be created | ||
OFILES = \ | ||
common_data_types.o \ | ||
cw_utils.o \ | ||
common_parent.o \ | ||
\ | ||
word_finder.o \ | ||
word_finder_data_types.o \ | ||
\ | ||
crossword.o \ | ||
crossword_data_types.o \ | ||
\ | ||
cw_csp.o \ | ||
cw_csp_data_types.o \ | ||
\ | ||
cw_gen.o \ | ||
|
||
# full path of all header files to build | ||
HFILES = \ | ||
$(SRC_DIR)/$(UTIL_DIR)/cw_utils.h \ | ||
$(SRC_DIR)/$(COMMON_DIR)/common_data_types.h \ | ||
$(SRC_DIR)/$(COMMON_DIR)/common_parent.h \ | ||
\ | ||
$(SRC_DIR)/$(WORDFINDER_DIR)/word_finder.h \ | ||
$(SRC_DIR)/$(WORDFINDER_DIR)/word_finder_data_types.h \ | ||
\ | ||
$(SRC_DIR)/$(CROSSWORD_DIR)/crossword.h \ | ||
$(SRC_DIR)/$(CROSSWORD_DIR)/crossword_data_types.h \ | ||
\ | ||
$(SRC_DIR)/$(CWCSP_DIR)/cw_csp.h \ | ||
$(SRC_DIR)/$(CWCSP_DIR)/cw_csp_data_types.h \ | ||
\ | ||
$(SRC_DIR)/$(CWGEN_DIR)/cw_gen.h \ | ||
$(SRC_DIR)/$(CWGEN_DIR)/cw_gen_data_types.h \ | ||
$(SRC_DIR)/$(CWGEN_DIR)/cxxopts.hpp \ | ||
|
||
$(BIN) : $(OFILES) | ||
$(CXX) $(INC) *.o -o $(BIN) | ||
|
||
cw_gen.o : $(SRC_DIR)/$(CWGEN_DIR)/cw_gen.cpp | ||
$(CXX) $(CPPFLAGS) $(CXXFLAGS) $(INC) -c $< | ||
|
||
#################### SRC #################### | ||
|
||
# rule to compile all cpp files | ||
%.o : $(SRC_DIR)/*/%.cpp $(HFILES) | ||
$(CXX) $(CPPFLAGS) $(CXXFLAGS) $(INC) -c $< |
@@ -0,0 +1,173 @@ | ||
// ================================================================== | ||
// Author: Ashley Zhang ([email protected]) | ||
// Date: 9/6/2023 | ||
// Description: class implementation & main() for main crossword generation tool | ||
// ================================================================== | ||
|
||
#include "cw_gen.h" | ||
|
||
using namespace cw_gen_ns; | ||
|
||
/** | ||
* @brief constructor for a crossword generation tool with all default values | ||
*/ | ||
cw_gen::cw_gen(string name) : common_parent(name) { | ||
// do nothing | ||
} | ||
|
||
/** | ||
* @brief shorthand function to squash param options into a readable str | ||
* | ||
* @param options vector of options to squash | ||
* @returns str representation of options | ||
*/ | ||
string cw_gen::squash_options(vector<string> options) { | ||
stringstream opts_ss; | ||
|
||
for(uint i = 0; i < options.size(); i++) { | ||
opts_ss << "\"" << options[i] << "\""; | ||
if(i != options.size() - 1) opts_ss << ", "; | ||
if(i == options.size() - 2) opts_ss << "or "; | ||
} | ||
|
||
return opts_ss.str(); | ||
} | ||
|
||
/** | ||
* @brief build cw_csp after all params set | ||
*/ | ||
void cw_gen::build() { | ||
if(has_grid_contents) { | ||
csp = make_unique<cw_csp>("puzzle", length, height, contents, dict_path[dict]); | ||
} else { | ||
csp = make_unique<cw_csp>("puzzle", length, height, dict_path[dict]); | ||
} | ||
} | ||
|
||
/** | ||
* @brief main function using command line interface w/ cxxopts to generate crosswords | ||
*/ | ||
int main(int argc, char** argv) { | ||
|
||
unique_ptr<cw_gen> cwgen = make_unique<cw_gen>("cwgen"); | ||
cxxopts::Options options("cw_gen", "Crossword puzzle generation tool given word list and blank grid to fill in"); | ||
|
||
// key: param name, value: set of allowed param values | ||
unordered_map<string, vector<string> > param_vals = { | ||
{"dict", {"top1000", "top3000", "top10000", "crossfire", "all"}}, | ||
{"example", {"empty", "cross", "bridge", "stairs", "donut", "crosshair1", "crosshair2"}}, | ||
{"verbosity", {"fatal", "error", "warning", "info", "debug"}} | ||
}; | ||
|
||
// because these are very long | ||
stringstream contents_desc, examples_desc; | ||
contents_desc << "Puzzle grid contents from top left to bottom right by row, all lowercase, in quotes. wildcard is: \"" | ||
<< WILDCARD << "\", black is: \"" << BLACK << "\""; | ||
examples_desc << "Example crossword puzzles (overrides size/contents/dict): " << cw_gen::squash_options(param_vals["example"]); | ||
|
||
options.add_options() | ||
("s,size", "Puzzle size, in format length,height", cxxopts::value<vector<int>>()->default_value("4,4")) | ||
("c,contents", contents_desc.str(), cxxopts::value<string>()) | ||
("d,dict", "Word dictionary: " + cw_gen::squash_options(param_vals["dict"]), cxxopts::value<string>()->default_value("top10000")) | ||
("e,example", examples_desc.str(), cxxopts::value<string>()) | ||
("v,verbosity", "Debug verbosity: " + cw_gen::squash_options(param_vals["verbosity"]), cxxopts::value<string>()->default_value("fatal")) | ||
("p,progress", "Enable progress bar (default: false)", cxxopts::value<bool>()) | ||
("h,help", "Print usage") | ||
; | ||
|
||
auto result = options.parse(argc, argv); | ||
|
||
// ############### help ############### | ||
|
||
// print help msg if specified | ||
if(result.count("help")) { | ||
cout << options.help() << endl; | ||
exit(0); | ||
} | ||
|
||
// ############### example ############### | ||
|
||
if(result.count("example")) { | ||
string example = result["example"].as<string>(); | ||
if(std::find(param_vals["example"].begin(), param_vals["example"].end(), example) == param_vals["example"].end()) { | ||
cout << "Error: got invalid example option " << example << ", allowed: " << cw_gen::squash_options(param_vals["example"]) << endl; | ||
exit(1); | ||
} | ||
|
||
cwgen->set_dimensions(examples_map[example].length, examples_map[example].height); | ||
cwgen->set_dict(examples_map[example].dict); | ||
cwgen->set_contents(examples_map[example].contents); | ||
} else { | ||
// non-example, use user-specified size/dict/contents | ||
|
||
// ############### size ############### | ||
|
||
vector<int> dimensions = result["size"].as<vector<int>>(); | ||
if(dimensions.size() != 2) { | ||
cout << "Error: got unknown number of size dimensions (" << dimensions.size() << "), should be 2" << endl; | ||
exit(1); | ||
} | ||
if(dimensions[0] <= 0 || dimensions[1] <= 0) { | ||
cout << "Error: crossword dimensions must be positive and non-zero" << endl; | ||
exit(1); | ||
} else if(dimensions[0] == 1 && dimensions[1] == 1) { | ||
cout << "Error: 1x1 crossword contains no valid words, word min len = " << MIN_WORD_LEN << endl; | ||
exit(1); | ||
} | ||
cwgen->set_dimensions((uint)dimensions[0], (uint)dimensions[1]); | ||
|
||
// ############### dictionary ############### | ||
|
||
string dict = result["dict"].as<string>(); | ||
if(std::find(param_vals["dict"].begin(), param_vals["dict"].end(), dict) == param_vals["dict"].end()) { | ||
cout << "Error: got invalid dictionary option " << dict << ", allowed: " << cw_gen::squash_options(param_vals["dict"]) << endl; | ||
exit(1); | ||
} | ||
cwgen->set_dict(dict); | ||
|
||
// ############### contents ############### | ||
|
||
if(result.count("contents")) { | ||
string contents = result["contents"].as<string>(); | ||
if(contents.size() != cwgen->num_tiles()) { | ||
cout << "Error: got illegal contents length (" << contents.size() << ") given dimensions, should be: " << cwgen->num_tiles() << endl; | ||
exit(1); | ||
} | ||
for(char c : contents) { | ||
if(c != BLACK && c != WILDCARD && (c < 'a' || c > 'z')) { | ||
cout << "Error: got illegal char in contents: " << c << ", all chars should be a lowercase letter, a wildcard (\"" << WILDCARD << "\"), or a black tile (\"" << BLACK << "\")" << endl; | ||
exit(1); | ||
} | ||
} | ||
cwgen->set_contents(contents); | ||
} | ||
} | ||
|
||
// ############### verbosity ############### | ||
|
||
string verbosity = result["verbosity"].as<string>(); | ||
if(std::find(param_vals["verbosity"].begin(), param_vals["verbosity"].end(), verbosity) == param_vals["verbosity"].end()) { | ||
cout << "Error: got invalid verbosity option " << verbosity << ", allowed: " << cw_gen::squash_options(param_vals["verbosity"]) << endl;
exit(1); | ||
} | ||
VERBOSITY = verbosity_map[verbosity]; | ||
|
||
// ############### progress bar ############### | ||
|
||
if(result.count("progress")) { | ||
cwgen->enable_progress_bar(); | ||
} | ||
|
||
// ############### cw_csp solving ############### | ||
|
||
cwgen->build(); | ||
if(cwgen->solve()) { | ||
cout << "found crossword: " << cwgen->result() << endl; | ||
} else { | ||
cout << "Error: no valid crossword generated for the given parameters. " | ||
<< "Try an example, or use different dimensions, grid contents, or dictionary" << endl; | ||
exit(1); | ||
} | ||
|
||
return 0; | ||
} |