Sudoku Example

Name: Sudoku
Main class: org.drools.examples.sudoku.Main
Type: Java application
Rules file: sudokuSolver.drl, sudokuValidator.drl
Objective: Demonstrates the solving of logic problems, and complex pattern matching.

This example demonstrates how Drools can be used to find a solution in a large potential solution space based on a number of constraints. We use the popular puzzle of Sudoku. This example also shows how Drools can be integrated into a graphical interface and how callbacks can be used to interact with a running Drools rules engine in order to update the graphical interface based on changes in the Working Memory at runtime.

Sudoku Overview

Sudoku is a logic-based number placement puzzle. The objective is to fill a 9x9 grid so that each column, each row, and each of the nine 3x3 zones contains the digits from 1 to 9, once, and only once.

The puzzle setter provides a partially completed grid and the puzzle solver's task is to complete the grid with these constraints.

The general strategy to solve the problem is to ensure that when you insert a new number it should be unique in its particular 3x3 zone, row and column.

See

URL: http://en.wikipedia.org/wiki/Sudoku

for a more detailed description.

Running the Example

Download and install drools-examples as described above and then execute java org.drools.examples.sudoku.Main. This example requires Java 5.

A window will be displayed with a relatively simple partially filled grid.

Click on the "Solve" button and the Drools-based engine will fill out the remaining values. The Console window will display detailed information about the rules which are executing to solve the puzzle in a human readable form.

Rule #3 determined the value at (4,1) could not be 4 as this value already exists in the same column at (8,1)
Rule #3 determined the value at (5,5) could not be 2 as this value already exists in the same row at (5,6)
Rule #7 determined (3,5) is 2 as this is the only possible cell in the column that can have this value
Rule #1 cleared the other PossibleCellValues for (3,5) as a ResolvedCellValue of 2 exists for this cell.
Rule #1 cleared the other PossibleCellValues for (3,5) as a ResolvedCellValue of 2 exists for this cell.
... 
Rule #3 determined the value at (1,1) could not be 1 as this value already  exists in the same zone at (2,1)
Rule #6 determined (1,7) is 1 as this is the only possible cell in the row that can have this value
Rule #1 cleared the other PossibleCellValues for (1,7) as a ResolvedCellValue of 1 exists for this cell.
Rule #6 determined (1,1) is 8 as this is the only possible cell in the row that can have this value

Once all of the activated rules for the solving logic have executed, the engine executes a second rule base to check that the solution is complete and valid. In this case it is, and the "Solve" button is disabled and displays a text like "Solved (1052ms)".

The example comes with a number of grids which can be loaded and solved. Click on "File", then "Samples" and "Medium" to load a more challenging grid. Note that the solve button is enabled when the new grid is loaded.

Click on the "Solve" button again to solve this new grid.

Now, let us load a Sudoku grid that is deliberately invalid. Click on "File", "Samples" and "!DELIBERATELY BROKEN!". Note that this grid starts with some issues, for example the value 5 appears twice in the first row.

Nevertheless, click on the "Solve" button to apply the solving rules to this invalid grid. Note that the "Solve" button is relabelled to indicate that the resulting solution is invalid.

In addition, the validation rule set outputs all of the issues which are discovered to the console.

There are two cells on the same column with the same value at (6,0) and (4,0)
There are two cells on the same column with the same value at (4,0) and (6,0)
There are two cells on the same row with the same value at (2,4) and (2,2)
There are two cells on the same row with the same value at (2,2) and (2,4)
There are two cells on the same row with the same value at (6,3) and (6,8)
There are two cells on the same row with the same value at (6,8) and (6,3)
There are two cells on the same column with the same value at (7,4) and (0,4)
There are two cells on the same column with the same value at (0,4) and (7,4)
There are two cells on the same row with the same value at (0,8) and (0,0)
There are two cells on the same row with the same value at (0,0) and (0,8)
There are two cells on the same column with the same value at (1,2) and (3,2)
There are two cells on the same column with the same value at (3,2) and (1,2)
There are two cells in the same zone with the same value at (6,3) and (7,3)
There are two cells in the same zone with the same value at (7,3) and (6,3)
There are two cells on the same column with the same value at (7,3) and (6,3)
There are two cells on the same column with the same value at (6,3) and (7,3)

We will look at the solving rule set later in this section, but for the moment we should note that some theoretically solvable solutions can not be solved by the engine as it stands. Click on "File", "Samples" and then "Hard 3" to load a sparsely populated grid.

Now click on the "Solve" button and note that the current rules are unable to complete the grid, even though (if you are a Sudoku aficionado) you may be able to see a way forward with the solution.

At the present time, the solving functionality has been achieved by the use of ten rules. This rule set could be extended to enable the engine to tackle more complex logic for filling grids such as this.

Java Source and Rules Overview

The Java source code can be found in the /src/main/java/org/drools/examples/sudoku directory, with the two DRL files defining the rules located in the /src/main/rules/org/drools/examples/sudoku directory.

The package org.drools.examples.sudoku.swing contains a set of classes which implement a framework for Sudoku puzzles. Note that this package does not have any dependencies on the Drools libraries. SudokuGridModel defines an interface which can be implemented to store a Sudoku puzzle as a 9x9 grid of Integer values, some of which may be null, indicating that the value for the cell has not yet been resolved. SudokuGridView is a Swing component which can visualize any implementation of SudokuGridModel. SudokuGridEvent and SudokuGridListener are used to communicate state changes between the model and the view: events are fired when a cell's value is resolved or changed. If you are familiar with the model-view-controller patterns in other Swing components such as JTable then this pattern should be familiar. SudokuGridSamples provides a number of partially filled Sudoku puzzles for demonstration purposes.

Package org.drools.examples.sudoku.rules contains an implementation of SudokuGridModel which is based on Drools. Two Java objects are used, both of which extend AbstractCellValue and represent a value for a specific cell in the grid, including the row and column location of the cell, an index of the 3x3 zone the cell is contained in, and the value of the cell. PossibleCellValue indicates that we do not currently know for sure what the value in a cell is. There can be from 2 to 9 possible cell values for a given cell. ResolvedCellValue indicates that we have determined what the value for a cell must be. There can only be one resolved cell value for a given cell. DroolsSudokuGridModel implements SudokuGridModel and is responsible for converting an initial two dimensional array of partially specified cells into a set of CellValue Java object, creating a Working Memory based on solverSudoku.drl and inserting the CellValue objects into the Working Memory. When the solve() method is called it calls in turn fireAllRules() on this Working Memory to try to solve the puzzle. DroolsSudokuGridModel attaches a WorkingMemoryListener to the Working Memory, which allows it to be called back on insert and retract events as the puzzle is solved. When a new ResolvedCellValue is inserted into the Working Memory, this callback allows the implementation to fire a SudokuGridEvent to its SudokuGridListener clientele, which can then update themselves in realtime. Once all the rules fired by the solver Working Memory have executed, DroolsSudokuGridModel runs a second set of rules, based on validatorSudoku.drl which works with the same set of Java objects to determine whether the resulting grid is a valid and a full solution.

The class org.drools.examples.sudoku.Main implements a Java application combining the components desribed.

The packae org.drools.examples.sudoku contains two DRL files. solverSudoku.drl defines the rules which attempt to solve a Sudoku puzzle, and validator.drl defines the rules which determin whether the current state of the Working Memory represents a valid solution. Both use PossibleCellValue and ResolvedCellValue objects as their facts and both output information to the Console window as their rules fire. In a real-world situation we would insert logging information and use the WorkingMemoryListener to display this information to a user, rather than use the console in this fashion.

Sudoku Validator Rules (validatorSudoku.drl)

We start with the validator rules as this rule set is shorter and simpler than the solver rule set.

The first rule simply checks that no PossibleCellValue objects remain in the Working Memory. Once the puzzle is solved, only ResolvedCellValue objects should be present, one for each cell.

The other three rules each match all of the ResolvedCellValue objects and bind them to the variable $resolved1. They then look for ResolvedCellValues that contain the same value and are located, respectively, in the same row, column, or 3x3 zone. If these rules are fired they add a message to a global list of strings describing the reason the solution is invalid. DroolsSudokoGridModel injects this list before it runs the rule set and checks whether it is empty or not after having called fireAllRules(). If it is not empty then it prints all the strings in the list and sets a flag to indicate that the grid is not solved.

Sudoku Solving Rules (solverSudoku.drl)

Now let us look at the more complex rule set used to solve Sudoku puzzles.

Rule #1 is basically a book-keeping rule. Several of the other rules insert ResolvedCellValues into the working memory at specific rows and columns after they have determined that a given cell must have a certain value. At this point, it is important to clear the Working Memory of any inserted PossibleCellValues at the same row and column with invalid values. This rule is therefore given a higher salience than the remaining rules to ensure that as soon as the LHS is true, activations for the rule move to the top of the Agenda and are fired. In turn, this prevents the spurious firing of other rules due to the combination of a ResolvedCellValue and one or more PossibleCellValues being present in the same cell. This rule also calls update() on the ResolvedCellValue, even though its value has not in fact been modified to ensure that Drools fires an event to any WorkingMemoryListeners attached to the Working Memory so that they can update themselves - in this case so that the GUI can display the new state of the grid.

Rule #2 identifies cells in the grid which have only one possible value. The first line of the <kw>when</kw> clause matches all of the PossibleCellValue objects in the Working Memory. The second line demonstrates a use of the <kw>not</kw> keyword. This rule will only fire if no other PossibleCellValue objects exist in the Working Memory at the same row and column but with a different value. When the rule fires, the single PossibleCellValue at the row and column is retracted from the Working Memory and is replaced by a new ResolvedCellValue at the same row and column with the same value.

Rule #3 removes PossibleCellValues with a given value from a row when they have the same value as a ResolvedCellValue. In other words, when a cell is filled with a resolved value, we need to remove the possibility of any other cell on the same row having this value. The first line of the when clause matches all ResolvedCellValue objects in the Working Memory. The second line matches PossibleCellValues which have both the same row and the same value as these ResolvedCellValue objects. If any are found, the rule activates and, when fired retracts the PossibleCellValue which can no longer be a solution for that cell.

Rules #4 and #5 act in the same way as Rule #3 but check for redundant PossibleCellValues in a given column and a given zone of the grid as a ResolvedCellValue respectively.

Rule #6 checks for the scenario where a possible cell value only appears once in a given row. The first line of the LHS matches against all PossibleCellValue facts in the Working Memory, storing the result in a number of local variables. The second line checks that no other PossibleCellValue objects with the same value exist on this row. The third to fifth lines check that there is not a ResolvedCellValue with the same value in the same zone, row or column so that this rule does not fire prematurely. It is interesting to note that we could remove lines 3 to 5 and give rules #3, #4 and #5 a higher salience to make sure they always fire before rules #6,#7 and #8. When the rule fires, we know that $possible must represent the value for the cell; so, as in Rule #2, we retract $possible and replace it with the equivalent, new ResolvedCellValue.

Rules #7 and #8 act in the same way as Rule #2 but check for single PossibleCellValues in a given column and a given zone of the grid, respectively.

Rule #9 represents the most complex currently implemented rule. This rule implements the logic that, if we know that a pair of given values can only occur in two cells on a specific row, (for example we have determined the values of 4 and 6 can only appear in the first row in cells [0,3] and [0,5]) and this pair of cells can not hold other values, then, although we do not know which of the pair contains a four and which contains a six, we do know that these two values must be in these two cells, and hence we can remove the possibility of them occuring anywhere else in the same row.

Rules #10 and #11 act in the same way as rule #9 but check for the existance of only two possible values in a given column and zone, respectively.

To solve harder grids, the rule set would need to be extended further with more complex rules that encapsulate more complex reasoning.

Suggestions for Future Developments

There are a number of ways in which this example could be developed. The reader is encouraged to consider these as excercises.

  • Agenda groups are a great declarative tool for phased execution. In this example, it is easy to see we have two phases: "resolution" and "validation". Right now, they are executed by creating two separate rule bases, each for one "job". Presumably it would be better to define agenda groups for all the rules, spliting them in "resolution" rules and "validation" rules, all loaded in a single rule base. The engine executes resolution and right after that, executes validation.

  • Auto-focus is a great way of handling exceptions to the regular rules execution. In our case, if we detect an inconsistency, either in the input data or in the resolution rules, why should we spend time continuing the execution if it will be invalid anyway? It is better to simply (and immediately) report the inconsistency as soon as it is found. To do that, since we now have a single rulebase with all rules, we simply need to define the auto-focus attribute for all rules validating puzzle consistency.

  • Logical insert: an inconsistency only exists while wrong data is in the working memory. As so, we could state that the validation rules logically insert inconsistencies and as soon as the offending data is retracted, the inconsistency no longer exists.

  • session.iterateObjects(): although a valid use case having a global list to add the found problems, I think it would be more interesting to ask the Stateful Session by the desired list of problems, using session.iterateObjects( new ClassObjectFilter( Inconsistency.class ) ). Having the inconsistency class can also allow us to paint in red the offending cells in the GUI.

  • kcontext.getKnowledgeRuntime().halt(): even reporting the error as soon as it is found, we need a way to tell the engine to stop evaluating rules. We can do that creating a rule that, in the presence of inconsistencies, calls halt() to stop evaluation.

  • Queries: looking at the method getPossibleCellValues(int row, int col) in DroolsSudokuGridModel, we see it iterating over all CellValue objects, looking for the few it wants. That is a great opportunity to demonstrate Drools queries. We just define a query to return the objects we want and iterate over it, cleanly and nicely. Other queries may be defined as needed.

  • Globals as services: the main objective of this change is to attend the next proposed change , but it is nice by its own. In order to teach the use of globals as services, it would be nice to set up a callback, so that each rule that finds the ResolvedCellValue for a given cell can call, to notify and update the corresponding cell in the GUI, providing immediate feedback for the user. Also, the last found cell could have its number painted in a different color to facilitate the identification of the rules' conclusions.

  • Step by step execution: now that we have immediate user feedback, we can make use of the restricted run feature in Drools, i.e., we could add a button in the GUI, that, when activated, causes the execution of a single rule, by calling fireAllRules( 1 ). This way, the user would see, step by step, what the engine is doing.