Name: Sudoku Main class: org.drools.examples.sudoku.Main Type: Java application Rules file: sudokuSolver.drl, sudokuValidator.drl Objective: Demonstrates the solving of logic problems, and complex pattern matching.
This example demonstrates how Drools can be used to find a solution in a large potential solution space based on a number of constraints. We use the popular puzzle of Sudoku. This example also shows how Drools can be integrated into a graphical interface and how callbacks can be used to interact with a running Drools rules engine in order to update the graphical interface based on changes in the Working Memory at runtime.
Sudoku is a logic-based number placement puzzle. The objective is to fill a 9x9 grid so that each column, each row, and each of the nine 3x3 zones contains the digits from 1 to 9, once, and only once.
The puzzle setter provides a partially completed grid and the puzzle solver's task is to complete the grid with these constraints.
The general strategy to solve the problem is to ensure that when you insert a new number it should be unique in its particular 3x3 zone, row and column.
See
URL: http://en.wikipedia.org/wiki/Sudoku
for a more detailed description.
Download and install drools-examples as described above and then
execute java org.drools.examples.sudoku.Main
. This
example requires Java 5.
A window will be displayed with a relatively simple partially filled grid.
Click on the "Solve" button and the Drools-based engine will fill out the remaining values. The Console window will display detailed information about the rules which are executing to solve the puzzle in a human readable form.
Rule #3 determined the value at (4,1) could not be 4 as this value already exists in the same column at (8,1) Rule #3 determined the value at (5,5) could not be 2 as this value already exists in the same row at (5,6) Rule #7 determined (3,5) is 2 as this is the only possible cell in the column that can have this value Rule #1 cleared the other PossibleCellValues for (3,5) as a ResolvedCellValue of 2 exists for this cell. Rule #1 cleared the other PossibleCellValues for (3,5) as a ResolvedCellValue of 2 exists for this cell. ... Rule #3 determined the value at (1,1) could not be 1 as this value already exists in the same zone at (2,1) Rule #6 determined (1,7) is 1 as this is the only possible cell in the row that can have this value Rule #1 cleared the other PossibleCellValues for (1,7) as a ResolvedCellValue of 1 exists for this cell. Rule #6 determined (1,1) is 8 as this is the only possible cell in the row that can have this value
Once all of the activated rules for the solving logic have
executed, the engine executes a second rule base to check that the
solution is complete and valid. In this case it is, and the "Solve"
button is disabled and displays a text like
"Solved (1052ms)"
.
The example comes with a number of grids which can be loaded and solved. Click on "File", then "Samples" and "Medium" to load a more challenging grid. Note that the solve button is enabled when the new grid is loaded.
Click on the "Solve" button again to solve this new grid.
Now, let us load a Sudoku grid that is deliberately invalid. Click on "File", "Samples" and "!DELIBERATELY BROKEN!". Note that this grid starts with some issues, for example the value 5 appears twice in the first row.
Nevertheless, click on the "Solve" button to apply the solving rules to this invalid grid. Note that the "Solve" button is relabelled to indicate that the resulting solution is invalid.
In addition, the validation rule set outputs all of the issues which are discovered to the console.
There are two cells on the same column with the same value at (6,0) and (4,0) There are two cells on the same column with the same value at (4,0) and (6,0) There are two cells on the same row with the same value at (2,4) and (2,2) There are two cells on the same row with the same value at (2,2) and (2,4) There are two cells on the same row with the same value at (6,3) and (6,8) There are two cells on the same row with the same value at (6,8) and (6,3) There are two cells on the same column with the same value at (7,4) and (0,4) There are two cells on the same column with the same value at (0,4) and (7,4) There are two cells on the same row with the same value at (0,8) and (0,0) There are two cells on the same row with the same value at (0,0) and (0,8) There are two cells on the same column with the same value at (1,2) and (3,2) There are two cells on the same column with the same value at (3,2) and (1,2) There are two cells in the same zone with the same value at (6,3) and (7,3) There are two cells in the same zone with the same value at (7,3) and (6,3) There are two cells on the same column with the same value at (7,3) and (6,3) There are two cells on the same column with the same value at (6,3) and (7,3)
We will look at the solving rule set later in this section, but for the moment we should note that some theoretically solvable solutions can not be solved by the engine as it stands. Click on "File", "Samples" and then "Hard 3" to load a sparsely populated grid.
Now click on the "Solve" button and note that the current rules are unable to complete the grid, even though (if you are a Sudoku aficionado) you may be able to see a way forward with the solution.
At the present time, the solving functionality has been achieved by the use of ten rules. This rule set could be extended to enable the engine to tackle more complex logic for filling grids such as this.
The Java source code can be found in the /src/main/java/org/drools/examples/sudoku directory, with the two DRL files defining the rules located in the /src/main/rules/org/drools/examples/sudoku directory.
The package org.drools.examples.sudoku.swing
contains a set of classes which implement a framework for Sudoku
puzzles. Note that this package does not have any dependencies on
the Drools libraries. SudokuGridModel
defines an interface which can be implemented to store a Sudoku puzzle
as a 9x9 grid of Integer
values, some of which may be null,
indicating that the value for the cell has not yet been resolved.
SudokuGridView
is
a Swing component which can visualize any implementation of
SudokuGridModel
. SudokuGridEvent
and
SudokuGridListener
are used to
communicate state changes between the model and the view: events are
fired when a cell's value is resolved or changed. If you are familiar
with the model-view-controller patterns in other Swing components such
as JTable
then this pattern should be familiar.
SudokuGridSamples
provides a number of partially filled
Sudoku puzzles for demonstration purposes.
Package org.drools.examples.sudoku.rules
contains an
implementation of SudokuGridModel
which is based on Drools.
Two Java objects are used, both of which extend
AbstractCellValue
and represent a value for a specific cell
in the grid, including the row and column location of the cell, an index
of the 3x3 zone the cell is contained in, and the value of the cell.
PossibleCellValue
indicates that we do not currently know
for sure what the value in a cell is. There can be from 2 to 9 possible
cell values for a given cell. ResolvedCellValue
indicates
that we have determined what the value
for a cell must be. There can only be one resolved cell value for a
given cell. DroolsSudokuGridModel
implements
SudokuGridModel
and is responsible for converting an
initial two dimensional array of partially specified cells into a set
of CellValue
Java object, creating a Working Memory
based on solverSudoku.drl
and inserting the
CellValue
objects into the Working Memory. When the
solve()
method is called it calls in turn
fireAllRules()
on this Working Memory to try to solve
the puzzle. DroolsSudokuGridModel
attaches a
WorkingMemoryListener
to the Working Memory, which
allows it to be called back on insert and retract events as the
puzzle is solved. When a new ResolvedCellValue
is inserted
into the Working Memory, this callback allows the implementation to
fire a SudokuGridEvent
to its
SudokuGridListener
clientele, which can then update
themselves in realtime. Once all the rules fired by the solver Working
Memory have executed, DroolsSudokuGridModel
runs a
second set of rules, based on validatorSudoku.drl
which works with the same set of Java objects to determine whether the
resulting grid is a valid and a full solution.
The class org.drools.examples.sudoku.Main
implements
a Java application combining the components desribed.
The packae org.drools.examples.sudoku
contains two
DRL files. solverSudoku.drl
defines the rules
which attempt to solve a Sudoku puzzle, and
validator.drl
defines the rules which determin
whether the current state of the Working Memory represents a valid
solution. Both use PossibleCellValue
and
ResolvedCellValue
objects as their facts and
both output information to the Console window as their rules fire. In a
real-world situation we would insert logging information and use the
WorkingMemoryListener
to display this information to a
user, rather than use the console in this fashion.
We start with the validator rules as this rule set is shorter and simpler than the solver rule set.
The first rule simply checks that no PossibleCellValue
objects remain in the Working Memory. Once the puzzle is solved, only
ResolvedCellValue
objects should be present, one for
each cell.
The other three rules each match all of the
ResolvedCellValue
objects and bind them to the
variable $resolved1
. They then look for
ResolvedCellValues
that
contain the same value and are located, respectively, in the same row,
column, or 3x3 zone. If these rules are fired they add a message to a
global list of strings describing the reason the solution is invalid.
DroolsSudokoGridModel
injects this list before it runs
the rule set and checks whether it is empty or not after having called
fireAllRules()
. If it is not empty then it prints all the
strings in the list and sets a flag to indicate that the grid is not
solved.
Now let us look at the more complex rule set used to solve Sudoku puzzles.
Rule #1 is basically a book-keeping rule. Several of the other
rules insert ResolvedCellValues
into the working memory
at specific rows and columns after they have determined that a given
cell must have a
certain value. At this point, it is important to clear the Working
Memory of any inserted PossibleCellValues
at the same
row and column
with invalid values. This rule is therefore given a higher salience than
the remaining rules to ensure that as soon as the LHS is true,
activations for the rule move to the top of the Agenda and are fired. In
turn, this prevents the spurious firing of other rules due to the
combination of a ResolvedCellValue
and one or more
PossibleCellValues
being present in the same cell.
This rule also calls update()
on the
ResolvedCellValue
, even though its value has not in
fact been modified to ensure that Drools fires an event to any
WorkingMemoryListeners
attached to the Working Memory
so that they can update themselves - in
this case so that the GUI can display the new state of the grid.
Rule #2 identifies cells in the grid which have only one possible
value. The first line of the <kw>when</kw> clause matches all of the
PossibleCellValue
objects in the Working Memory. The
second line demonstrates a use of the <kw>not</kw> keyword. This rule
will only fire if no other PossibleCellValue
objects
exist in the Working Memory at the same
row and column but with a different value. When the rule fires, the
single PossibleCellValue
at the row and column is
retracted from the Working Memory and is replaced by a new
ResolvedCellValue
at the same
row and column with the same value.
Rule #3 removes PossibleCellValues
with a given
value from a row when they have the same value as a
ResolvedCellValue
. In other words, when a cell is filled
with a resolved value, we need to remove the
possibility of any other cell on the same row having this value. The
first line of the when clause matches all ResolvedCellValue
objects in the Working Memory. The second line matches
PossibleCellValues
which have both the same row and the
same value as these ResolvedCellValue
objects. If any are found, the rule activates and, when fired retracts
the PossibleCellValue
which can no longer be a solution
for that cell.
Rules #4 and #5 act in the same way as Rule #3 but check for
redundant PossibleCellValues
in a given column and a
given zone of the grid as a ResolvedCellValue
respectively.
Rule #6 checks for the scenario where a possible cell value only
appears once in a given row. The first line of the LHS matches against
all PossibleCellValue
facts in the Working Memory, storing
the result in a number of local variables. The second line checks that
no other PossibleCellValue
objects with the same value
exist on this row. The third to fifth lines check that there is not
a ResolvedCellValue
with the same value in the same zone,
row or column so that this rule does not fire prematurely.
It is interesting to note that we could remove lines 3 to 5 and give
rules #3, #4 and #5 a higher salience to make sure they always fire
before rules #6,#7 and #8. When the rule fires, we know that
$possible
must represent the value for the cell;
so, as in Rule #2, we retract $possible
and replace it
with the equivalent, new ResolvedCellValue
.
Rules #7 and #8 act in the same way as Rule #2 but check for
single PossibleCellValues
in a given column and a given
zone of the grid, respectively.
Rule #9 represents the most complex currently implemented rule. This rule implements the logic that, if we know that a pair of given values can only occur in two cells on a specific row, (for example we have determined the values of 4 and 6 can only appear in the first row in cells [0,3] and [0,5]) and this pair of cells can not hold other values, then, although we do not know which of the pair contains a four and which contains a six, we do know that these two values must be in these two cells, and hence we can remove the possibility of them occuring anywhere else in the same row.
Rules #10 and #11 act in the same way as rule #9 but check for the existance of only two possible values in a given column and zone, respectively.
To solve harder grids, the rule set would need to be extended further with more complex rules that encapsulate more complex reasoning.
There are a number of ways in which this example could be developed. The reader is encouraged to consider these as excercises.
Agenda groups are a great declarative tool for phased execution. In this example, it is easy to see we have two phases: "resolution" and "validation". Right now, they are executed by creating two separate rule bases, each for one "job". Presumably it would be better to define agenda groups for all the rules, spliting them in "resolution" rules and "validation" rules, all loaded in a single rule base. The engine executes resolution and right after that, executes validation.
Auto-focus is a great way of handling exceptions to the regular rules execution. In our case, if we detect an inconsistency, either in the input data or in the resolution rules, why should we spend time continuing the execution if it will be invalid anyway? It is better to simply (and immediately) report the inconsistency as soon as it is found. To do that, since we now have a single rulebase with all rules, we simply need to define the auto-focus attribute for all rules validating puzzle consistency.
Logical insert: an inconsistency only exists while wrong data is in the working memory. As so, we could state that the validation rules logically insert inconsistencies and as soon as the offending data is retracted, the inconsistency no longer exists.
session.iterateObjects()
: although a valid
use case having a global list to add the found problems, I think
it would be more
interesting to ask the Stateful Session by the desired list of
problems, using session.iterateObjects( new ClassObjectFilter(
Inconsistency.class ) ).
Having the inconsistency class
can also allow us to paint in red the offending cells in the
GUI.
kcontext.getKnowledgeRuntime().halt()
: even
reporting the error as soon as it is found, we need a way to tell
the engine to stop evaluating rules. We
can do that creating a rule that, in the presence of inconsistencies,
calls halt()
to stop evaluation.
Queries: looking at the method
getPossibleCellValues(int row, int col)
in
DroolsSudokuGridModel
, we see it iterating over all
CellValue
objects, looking for the few it wants. That
is a great opportunity to demonstrate Drools queries. We just
define a query to
return the objects we want and iterate over it, cleanly and nicely.
Other queries may be defined as needed.
Globals as services: the main objective of this change is to
attend the next proposed change , but it is nice by its own.
In order to teach the use of globals as services, it
would be nice to set up a callback, so that each rule that finds the
ResolvedCellValue
for a given cell can call, to
notify and update the corresponding cell in the GUI, providing
immediate feedback for
the user. Also, the last found cell could have its number painted in
a different color to facilitate the identification of the rules'
conclusions.
Step by step execution: now that we have immediate user
feedback, we can make use of the restricted run feature in Drools,
i.e., we could add a button in the GUI, that, when activated,
causes the execution of a single rule, by calling
fireAllRules( 1 )
. This way, the user would see,
step by step, what the engine is doing.