---
title: "Sudoku"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Sudoku}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = FALSE,
comment = " "
)
```
```{r setup, echo=FALSE}
library(satire)
library(ggplot2)
```
```{r echo=FALSE}
board <- function(n = 2) {
b0 <- expand.grid(x = seq(n^2), y = seq(n^2))
b1 <- expand.grid(x = seq(n) * n - (n-1)/2, y = seq(n) * n - (n-1)/2)
ggplot() +
geom_tile(data = b0, aes(x, y), width = n^0, height = n^0, fill = NA, color = 'black', linewidth = 0.25) +
geom_tile(data = b1, aes(x, y), width = n^1, height = n^1, fill = NA, color = 'black', linewidth = 1) +
coord_equal() +
theme_void() +
theme(plot.title = element_text(hjust = 0.5, size = 15)) +
scale_y_reverse()
}
```
[Sudoku](https://en.wikipedia.org/wiki/Sudoku) are a logic-based, combinatorial
number placement puzzle. The objective is to fill a grid with digits such that each column,
row and sub-grid contains all the required digits.
This is usually a 9x9 grid, with 9 3x3 sub-grids.
## Solving a 4x4 sudoku with SAT
In this vignette, this 4x4 sudoku will be solved using SAT techniques.
```{r echo=FALSE}
prob <- data.frame(
row = c(1, 2, 2, 4),
col = c(2, 2, 3, 4),
val = c(3, 2, 3, 4)
)
board() +
geom_text(data = prob, aes(col, row, label = val), size = 12) +
labs(title = "Problem to solve") +
theme(plot.title = element_text(hjust = 0.5, size = 15))
```
## Nomenclature
To start, assign a name to each cell.
E.g. `c23` is the cell in the **second** row, **third** column.
```{r echo=FALSE}
cells <- expand.grid(row = 1:4, col = 1:4)
cells$nm <- with(cells, paste0("c", row, col))
board() +
geom_text(data = cells, aes(col, row, label = nm)) +
labs(title = "Cell names") +
theme(title = element_text(hjust = 0.5))
```
Each cell is represented by **4** variables - one for each possible digit that can
exist at the location.
E.g.
* `c231` indicates that the number `1` is present in the cell in the second row, third column.
* `c413` indicates that the number `3` is present in the cell in the fourth row, first column.
```{r echo=FALSE, fig.height = 4, fig.width = 4}
vars <- expand.grid(row = 1:4, col = 1:4)
vars$nm <- vapply(seq(nrow(vars)), \(i) {
cell <- paste0('c', vars$row[i], vars$col[i])
cell <- paste0(cell, 1:4, collapse = ", ")
strwrap(cell, width = 15) |> paste(collapse = "\n")
}, character(1))
board() +
geom_text(data = vars, aes(col, row, label = nm), size = 3) +
labs(title = "Variable names") +
theme(title = element_text(hjust = 0.5))
```
## Constraints
* Each cell must contain one (and only one) of the numbers 1-4
* E.g. `c111` and `c112` cannot both be true
* Each of the numbers (1-4) must appear in each row, each column, and within
each of the 2x2 sub-squares
* In the definition of this particular problem, there is a `3` in the second column
of the first row. This means that `c123` is true (and `c121`, `c122` and `c124` are false)
* Similarly: `c222`, `c233` and `c444` are true as they are the starting
state of the problem
## Defining a SAT problem - building blocks
A SAT problem is defined by combinations of expressions about the variables.
Each of these expressions is constructed as a string.
### Building block: at least one of a set of variables must be true.
If `a`, `b` and `c` are Boolean variables, to state that at least one of them
must be true is:
`a | b | c`
To add this constraint to a `sat` object, use the cardinality helper function
i.e. `sat_card_atleast_one()`
```{r}
sat <- sat_new()
sat <- sat_card_atleast_one(sat, c('a', 'b', 'c'))
sat$exprs
```
### Building block: at most one of a set of variables must be true
To express that only one variable is true out of a set of variables, one
method is to state that "for all possible pairings, one of them must be false".
If `a`, `b` and `c` are Boolean variables, to state that at most one of them
must be true is:
`(!a | !b) & (!a | !c) & (!b | !c)`
To add this constraint to a `sat` object, use the cardinality helper function
i.e. `sat_card_atmost_one()`
```{r}
sat <- sat_new()
sat <- sat_card_atmost_one(sat, c('a', 'b', 'c'))
sat$exprs
```
### Building block: exactly one of a set of variables must be true
A common way to express that exactly one out of a set of variables is true
is to state that "at least one is true" **and** "at most one is true" at the
same time i.e.
`(a | b | c) & (!a | !b) & (!a | !c) & (!b | !c)`
To add this constraint to a `sat` object, use the cardinality helper function
i.e. `sat_card_exactly_one()`
```{r}
sat <- sat_new()
sat <- sat_card_exactly_one(sat, c('a', 'b', 'c'))
sat$exprs
```
## Defining the SAT problem with `{satire}`
```{r}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Create a new SAT problem
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat <- sat_new()
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# For each cell in the puzzle, it must contain exactly one of 1,2,3 or 4.
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_exactly_one(sat, c('c111', 'c112', 'c113', 'c114'))
sat_card_exactly_one(sat, c('c121', 'c122', 'c123', 'c124'))
sat_card_exactly_one(sat, c('c131', 'c132', 'c133', 'c134'))
sat_card_exactly_one(sat, c('c141', 'c142', 'c143', 'c144'))
sat_card_exactly_one(sat, c('c211', 'c212', 'c213', 'c214'))
sat_card_exactly_one(sat, c('c221', 'c222', 'c223', 'c224'))
sat_card_exactly_one(sat, c('c231', 'c232', 'c233', 'c234'))
sat_card_exactly_one(sat, c('c241', 'c242', 'c243', 'c244'))
sat_card_exactly_one(sat, c('c311', 'c312', 'c313', 'c314'))
sat_card_exactly_one(sat, c('c321', 'c322', 'c323', 'c324'))
sat_card_exactly_one(sat, c('c331', 'c332', 'c333', 'c334'))
sat_card_exactly_one(sat, c('c341', 'c342', 'c343', 'c344'))
sat_card_exactly_one(sat, c('c411', 'c412', 'c413', 'c414'))
sat_card_exactly_one(sat, c('c421', 'c422', 'c423', 'c424'))
sat_card_exactly_one(sat, c('c431', 'c432', 'c433', 'c434'))
sat_card_exactly_one(sat, c('c441', 'c442', 'c443', 'c444'))
```
```{r}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Column 1 must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c111', 'c211', 'c311', 'c411'))
sat_card_atleast_one(sat, c('c112', 'c212', 'c312', 'c412'))
sat_card_atleast_one(sat, c('c113', 'c213', 'c313', 'c413'))
sat_card_atleast_one(sat, c('c114', 'c214', 'c314', 'c414'))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Column 2 must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c121', 'c221', 'c321', 'c421'))
sat_card_atleast_one(sat, c('c122', 'c222', 'c322', 'c422'))
sat_card_atleast_one(sat, c('c123', 'c223', 'c323', 'c423'))
sat_card_atleast_one(sat, c('c124', 'c224', 'c324', 'c424'))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Column 3 must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c131', 'c231', 'c331', 'c431'))
sat_card_atleast_one(sat, c('c132', 'c232', 'c332', 'c432'))
sat_card_atleast_one(sat, c('c133', 'c233', 'c333', 'c433'))
sat_card_atleast_one(sat, c('c134', 'c234', 'c334', 'c434'))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Column 4 must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c141', 'c241', 'c341', 'c441'))
sat_card_atleast_one(sat, c('c142', 'c242', 'c342', 'c442'))
sat_card_atleast_one(sat, c('c143', 'c243', 'c343', 'c443'))
sat_card_atleast_one(sat, c('c144', 'c244', 'c344', 'c444'))
```
```{r}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Row 1 must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c111', 'c121', 'c131', 'c141'))
sat_card_atleast_one(sat, c('c112', 'c122', 'c132', 'c142'))
sat_card_atleast_one(sat, c('c113', 'c123', 'c133', 'c143'))
sat_card_atleast_one(sat, c('c114', 'c124', 'c134', 'c144'))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Row 2 must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c211', 'c221', 'c231', 'c241'))
sat_card_atleast_one(sat, c('c212', 'c222', 'c232', 'c242'))
sat_card_atleast_one(sat, c('c213', 'c223', 'c233', 'c243'))
sat_card_atleast_one(sat, c('c214', 'c224', 'c234', 'c244'))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Row 3 must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c311', 'c321', 'c331', 'c341'))
sat_card_atleast_one(sat, c('c312', 'c322', 'c332', 'c342'))
sat_card_atleast_one(sat, c('c313', 'c323', 'c333', 'c343'))
sat_card_atleast_one(sat, c('c314', 'c324', 'c334', 'c344'))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Row 4 must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c411', 'c421', 'c431', 'c441'))
sat_card_atleast_one(sat, c('c412', 'c422', 'c432', 'c442'))
sat_card_atleast_one(sat, c('c413', 'c423', 'c433', 'c443'))
sat_card_atleast_one(sat, c('c414', 'c424', 'c434', 'c444'))
```
```{r}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Top-left sub-square must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c111', 'c211', 'c121', 'c221'))
sat_card_atleast_one(sat, c('c112', 'c212', 'c122', 'c222'))
sat_card_atleast_one(sat, c('c113', 'c213', 'c123', 'c223'))
sat_card_atleast_one(sat, c('c114', 'c214', 'c124', 'c224'))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Bottom-left sub-square must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c311', 'c411', 'c321', 'c421'))
sat_card_atleast_one(sat, c('c312', 'c412', 'c322', 'c422'))
sat_card_atleast_one(sat, c('c313', 'c413', 'c323', 'c423'))
sat_card_atleast_one(sat, c('c314', 'c414', 'c324', 'c424'))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Top-right sub-square must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c131', 'c231', 'c141', 'c241'))
sat_card_atleast_one(sat, c('c132', 'c232', 'c142', 'c242'))
sat_card_atleast_one(sat, c('c133', 'c233', 'c143', 'c243'))
sat_card_atleast_one(sat, c('c134', 'c234', 'c144', 'c244'))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Bottom-right sub-square must have at least one 1, one 2, one 3 and one 4
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_card_atleast_one(sat, c('c331', 'c431', 'c341', 'c441'))
sat_card_atleast_one(sat, c('c332', 'c432', 'c342', 'c442'))
sat_card_atleast_one(sat, c('c333', 'c433', 'c343', 'c443'))
sat_card_atleast_one(sat, c('c334', 'c434', 'c344', 'c444'))
```
```{r}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The initially defined numbers in the puzzle are set to TRUE
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat_add_exprs(sat, "c123 & c222 & c233 & c444")
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Ready to solve
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sat
```
## Solving
There are 64 variables in this SAT problem, which means a brute-force approach
would have to try ~1.8E19 combinations which is not feasible.
Instead, use the [DPLL](https://en.wikipedia.org/wiki/DPLL_algorithm)-based solver
which uses unit propagation, pure-literal
elimination, pruning and backtracking to find a solution.
The solution returned is the TRUE/FALSE state of every variable.
It is up to the user to interpret what this means in the context of their problem
```{r}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# 'sol' is a data.frame. Column names are the variables, and each row
# is a solution to the problem.
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sol <- sat_solve_dpll(sat)
sol
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The actual solution needed for this sudoku is where the actual numbers
# exist, I.e. where the variables are true
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sol <- names(sol)[unlist(sol)]
sol
```
Plotting these true variables gives the solution to the sudoku
```{r echo=FALSE}
vars <- expand.grid(row = 1:4, col = 1:4, n = 1:4)
vars$nm <- with(vars, paste0("c", row, col, n))
vars <- vars[vars$nm %in% sol, ]
board() +
geom_text(data = vars, aes(col, row, label = nm)) +
labs(title = "'true' variables in the solution")
board() +
geom_text(data = vars, aes(col, row, label = n), size = 12) +
labs(title = "Solution")
```