JBoss.orgCommunity Documentation

Chapter 1. User Guide

1.1. Solver introduction
1.1.1. What is a Solver?
1.1.2. Status of drools-solver
1.1.3. Building drools-solver and running an example
1.2. Solver examples
1.2.1. Introduction
1.2.2. The n queens example
1.2.3. The lesson schedule example
1.2.4. The traveling tournament example
1.2.5. The ITC2007 examination example
1.3. Solver configuration
1.3.1. Types of solvers
1.3.2. The Solver interface
1.3.3. Building a solver
1.3.4. The Solution interface
1.3.5. The starting solution
1.3.6. A simple filler algorithm
1.3.7. Solving a problem
1.4. Score calculation with a rule engine
1.4.1. Rule based score calculation
1.4.2. The ScoreDefinition interface
1.4.3. Tips and tricks
1.5. Local search solver
1.5.1. Overview
1.5.2. A move
1.5.3. Move generation
1.5.4. A step
1.5.5. Getting stuck in local optima
1.5.6. Deciding the next step
1.5.7. Best solution
1.5.8. Finish

Drools-solver combines a search algorithm with the power of the drools rule engine to solve planning problems. Good examples of such planning problems include:

A planning problem consists out of a number of constraints. Generally, there are 3 types of constraints:

  • A (negative) hard constraint must not be broken. For example: 1 teacher can not teach 2 different lessons at the same time.

  • A (negative) soft constraint should not be broken if it can be avoided. For example: Teacher A does not like to teach on Friday afternoon.

  • A positive constraint (or reward) should be fulfilled if possible. For example: Teacher B likes to teach on Monday morning.

These constraints define the score function of a planning problem. This is where the drools rule engine comes into play: adding constraints with score rules is easy and scalable.

A planning problem has a number of solutions. Each solution has a score. There are 3 categories of solutions:

  • A possible solution is a solution that does or does not break any number of constraints. Planning problems tend to have a incredibly large number of possible solutions. Most of those solutions are worthless.

  • A feasible solution is a solution that does not break any (negative) hard constraints. The number of feasible solutions tends to be relative to the number of possible solutions. Sometimes there are no feasible solutions. Every feasible solution is a possible solution.

  • An optimal solution is a solution with the highest score. Planning problems tend to have 1 or a few optimal solutions. There is always at least 1 optimal solution, even in the remote case that it's not a feasible solution because there are no feasible solutions.

Drools-solver supports several search algorithms to efficiently wade through the incredbly large number of possible solutions. It makes it easy to switch the search algorithm, by simply changing the solver configuration.

Different solvers solve problems in different ways. Each type has advantages and disadvantages. We 'll roughly discuss a few of the solver types here. You can safely skip this section.

You can build a Solver instance with the XmlSolverConfigurer. Configure it with a solver configuration xml file:

    XmlSolverConfigurer configurer = new XmlSolverConfigurer();
    configurer.configure("/org/drools/solver/examples/nqueens/solver/nqueensSolverConfig.xml");
    Solver solver = configurer.buildSolver();

A basic solver configuration file looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<localSearchSolver>
    <scoreDrl>/org/drools/solver/examples/nqueens/solver/nQueensScoreRules.drl</scoreDrl>
    <scoreDefinition>
        <scoreDefinitionType>SIMPLE</scoreDefinitionType>
    </scoreDefinition>
    <finish>
        <feasableScore>0</feasableScore>
    </finish>
    <selector>
        <moveFactoryClass>org.drools.solver.examples.nqueens.solver.NQueensMoveFactory</moveFactoryClass>
    </selector>
    <accepter>
        <completeSolutionTabuSize>1000</completeSolutionTabuSize>
    </accepter>
    <forager>
        <foragerType>MAX_SCORE_OF_ALL</foragerType>
    </forager>
</localSearchSolver>

This is a tabu search configuration for n queens. We 'll explain the various parts of a configuration later in this manual.

Drools-solver makes it relatively easy to switch a solver type just by changing the configuration. There's even a benchmark utility which allows you to play out different configurations against each other and report the most appropriate configuration for your problem. You could for example play out tabu search versus simulated annealing, on 4 queens and 64 queens.

A solver has a single Random instance. Some solver configurations use that instance a lot more than others. For example simulated annealing depends highly on random numbers, while tabu search only depends on it to deal with score ties. In any case, during your testing it's advisable to set that Random instance, so your tests are reproducible.

A Solver can only solve 1 problem at a time.

You need to present the problem as a starting Solution instance to the solver.

You need to implement the Solution interface:

public interface Solution {

    Collection<? extends Object> getFacts();

    Solution cloneSolution();

}

For example, an NQueens instance just holds a list of all it's queens:

public class NQueens implements Solution {

    private List<Queen> queenList;

    // ...

}

For 4 queens we use a simple filler algorithm that creates a starting solution with all queens on a different x and on the same y (with y = 0).


Here's how we generate it:

    private NQueens createNQueens(int n) {
        NQueens nQueens = new NQueens();
        nQueens.setId(0L);
        List<Queen> queenList = new ArrayList<Queen>(n);
        for (int i = 0; i < n; i++) {
            Queen queen = new Queen();
            queen.setId((long) i);
            queen.setX(i); // Different column
            queen.setY(0); // Same row
            queenList.add(queen);
        }
        nQueens.setQueenList(queenList);
        return nQueens;
    }

The starting solution will probably be far from optimal (or even feasible). Here, it's actually the worst possible solution. However, we 'll let the solver find a much better solution for us anyway.

For large problems, a simple filler algorithm like createNQueens(int) doesn't suffice. A (local search) solver starting from a bad starting solution wastes a lot of time to reach a solution which an initializer algorithm can generate in a fraction of that time.

An initializer algorithm ussually works something like this:

Such an algorithm is very deterministic: it's really fast, but you can't give it more time to generate an even better solution. In some cases the solution it generates will be feasible, but in most cases it won't. You 'll need a real solver to get to a feasible or more optimal solution. Nevertheless you 'll want to such an initializer to give the real solver a serious head start. You can do this by implementing the StartingSolutionInitializer interface:

public interface StartingSolutionInitializer extends SolverAware {

    boolean isSolutionInitialized(Solution solution);

    void initializeSolution(Solution solution);

}

You'll need to set a (uninitialized) solution on the solver. Once the solver starts, it will first call the StartingSolutionInitializer to initialize the solution. If the StartingSolutionInitializer adds, edits or removes facts it needs to notify the workingMemory about this. It can use score calculation during its intialization process.

Here's an example on how you add the StartingSolutionInitializer to the configuration:

<localSearchSolver>
    ...
    <startingSolutionInitializerClass>org.drools.solver.examples.itc2007.examination.solver.solution.initializer.ExaminationStartingSolutionInitializer</startingSolutionInitializerClass>
    ...
</localSearchSolver>

The score calculation of a planning problem is based on constraints (such as hard constraints, soft constraints, rewards, ...). A rules engine, such as drools, makes it easy to implement those constraints as score rules.

Here's an example of a constraint implemented as a score rule in drools:

rule "multipleQueensHorizontal"
    when
        $q1 : Queen($id : id, $y : y);
        $q2 : Queen(id > $id, y == $y);
    then
        insertLogical(new UnweightedConstraintOccurrence("multipleQueensHorizontal", $q1, $q2));
end

This score rule will fire once for every 2 queens with the same y. The (id > $id) condition is needed to assure that for 2 queens A and B, it can only fire for (A, B) and not for (B, A), (A, A) or (B, B). Let's take a closer look at this score rule on the starting solution of 4 queens:


In this starting solution the multipleQueensHorizontal score rule will fire for 6 queen couples: (A, B), (A, C), (A, D), (B, C), (B, D) and (C, D). Because none of the queens are on the same vertical or diagonal line, this starting solution will have a score of -6. An optimal solution of 4 queens has a score of 0.

You need to add your score rules drl files in the solver configuration, for example:

    <scoreDrl>/org/drools/solver/examples/nqueens/solver/nQueensScoreRules.drl</scoreDrl>

You can add multiple <scoreDrl> entries if needed.

It's recommended to use drools in forward-chaining mode (which is the default behaviour), as for solver implementations this will create the effect of a delta based score calculation instead of a full score calculation on each solution evaluation. For example, if a single queen moves from y 0 to 3, it won't bother to recalculate the "multiple queens on the same horizontal line" constraint for queens with y 1 or 2. This is a huge performance gain. Drools-solver gives you this huge performance gain without forcing you to write a very complicated delta based score calculation algorithm. Just let the drools rule engine do the hard work.

Adding more constraints is easy and scalable (once you understand the drools rule syntax). This allows you to add it a bunch of soft constraint score rules on top of the hard constraints score rules with little effort and at a reasonable performance cost. For example, for a freight routing problem you could add a soft constraint to avoid the certain flagged highways at rush hour.

The ScoreDefinition interface defines the score representation. The score must a Score instance and the instance type (for example DefaultHardAndSoftScore) must be stable throughout the solver runtime.

The solver aims to find the solution with the highest score. The best solution is the solution with the highest score that it has encountered during its solving.

Most planning problems tend to use negative scores (the amount of negative constraints being broken) with an impossible perfect score of 0. This explains why the score of a solution of 4 queens is the negative of the number of queen couples which can attack each other.

A ScoreDefinition instance is configured in the solver configuration:

    <scoreDefinition>
        <scoreDefinitionType>SIMPLE</scoreDefinitionType>
    </scoreDefinition>

There are a couple of build-in ScoreDefinition implementations:

You can implement your own ScoreDefinition, although the build-in score definitions should suffice for most needs.

A ScoreCalculator instance is asserted into the working memory as a global called scoreCalculator. Your score rules need to (indirectly) update that instance. Usually you 'll make a single rule as an aggregation of the other rules to update the score:

global SimpleScoreCalculator scoreCalculator;

rule "multipleQueensHorizontal"
    when
        $q1 : Queen($id : id, $y : y);
        $q2 : Queen(id > $id, y == $y);
    then
        insertLogical(new UnweightedConstraintOccurrence("multipleQueensHorizontal", $q1, $q2));
end

// multipleQueensVertical is obsolete because it is always 0

rule "multipleQueensAscendingDiagonal"
    when
        $q1 : Queen($id : id, $ascendingD : ascendingD);
        $q2 : Queen(id > $id, ascendingD == $ascendingD);
    then
        insertLogical(new UnweightedConstraintOccurrence("multipleQueensAscendingDiagonal", $q1, $q2));
end

rule "multipleQueensDescendingDiagonal"
    when
        $q1 : Queen($id : id, $descendingD : descendingD);
        $q2 : Queen(id > $id, descendingD == $descendingD);
    then
        insertLogical(new UnweightedConstraintOccurrence("multipleQueensDescendingDiagonal", $q1, $q2));
end

rule "hardConstraintsBroken"
    when
        $occurrenceCount : Number() from accumulate(
            $unweightedConstraintOccurrence : UnweightedConstraintOccurrence(),
            count($unweightedConstraintOccurrence)
        );
    then
        scoreCalculator.setScore(- $occurrenceCount.intValue());
end

Optionally, you can also weigh your constraints differently, by multiplying the count of each score rule with its weight. For example in freight routing, you can make 5 broken "avoid crossroads" soft constraints count as much as 1 broken "avoid highways at rush hour" soft constraint. This allows your business analysts to easily tweak the score function as they see fit.

Here's an example of all the NQueens constraints written as a single rule, using multi pattern accumulates and making multipleQueensHorizontal constraint outweigh the other constraints 5 times:

// Warning: This currently triggers backwards chaining instead of forward chaining and seriously hurts performance and scalability.
rule "constraintsBroken"
    when
        $multipleQueensHorizontal : Long()
        from accumulate(
            $q1 : Queen($id : id, $y : y)
            and Queen(id > $id, y == $y),
           count($q1)
        );
        $multipleQueensAscendingDiagonal : Long()
        from accumulate(
            $q2 : Queen($id : id, $ascendingD : ascendingD)
            and Queen(id > $id, ascendingD == $ascendingD),
           count($q2)
        );
        $multipleQueensDescendingDiagonal : Long()
        from accumulate(
            $q3 : Queen($id : id, $descendingD : descendingD)
            and Queen(id > $id, descendingD == $descendingD),
           count($q3)
        );
    then
        scoreCalculator.setScore(- (5 * $multipleQueensHorizontal) - $multipleQueensAscendingDiagonal - $multipleQueensDescendingDiagonal);
end

In case you haven't figured it out yet: performance (and scalability) is very important for solving planning problems. What good is a real-time freight routing solver that takes a day to find a feasible solution? Even small and innocent looking problems can hide an enormous problem size. For example, they probably still don't know the optimal solution of the traveling tournament problem for as little as 10 traveling teams.

In number of possible solutions for a planning problem can be mind blowing. For example:

An algorithm that checks every possible solution (even with pruning) can easily run for a couple of years on a single real-life planning problem. Most of the time, we are happy with a feasible solution found in a limited amount of time. Local search tends to find a feasible solution relatively fast. Because it acts very much like a human, it is also pretty natural to program.

Local search solves a problem making a move on the current solution to change it into a better solution. It does that number of times till it is satisfied with the solution. It starts with the starting solution.

A local search algorithm and the drools rule engine turn out to be a really nice combination, because:

Drools-solver's local search implementation combines both. On top of that, it also offers additional support for benchmarking etc.

A move is the change from a solution A to a solution B. For example, below you can see a single move on the starting solution of 4 queens that moves a single queen to another row:


A move can have a small or large impact. In the above example, the move of queen C0 to C2 is a small move. Some moves are the same move type. These are some possibilities for move types in n queens:

  • Move a single queen to another row. This is a small move. For example, move queen C0 to C2.

  • Move all queens a number of rows down or up. This a big move.

  • Move a single queen to another column. This is a small move. For example, move queen C2 to A0 (placing it on top of queen A0).

  • Add a queen to the board at a certain row and column.

  • Remove a queen from the board.

Because we have decided that all queens will be on the board at all times and each queen has an appointed column (for performance reasons), only the first 2 move types are usable in our example. Furthermore, we 're only using the first move type in the example because we think it gives the best performance, but you are welcome to prove us wrong.

Each of your move types will be an implementation of the Move interface:

public interface Move {

    boolean isMoveDoable(EvaluationHandler evaluationHandler);

    Move createUndoMove(EvaluationHandler evaluationHandler);

    void doMove(EvaluationHandler evaluationHandler);

}

Let's take a look at the Move implementation for 4 queens which moves a queen to a different row:

public class YChangeMove implements Move {

    private Queen queen;
    private int toY;

    public YChangeMove(Queen queen, int toY) {
        this.queen = queen;
        this.toY = toY;
    }

    // ... see below

}

An instance of YChangeMove moves a queen from it's current y to a different y.

Drool-solver calls the doMove(WorkingMemory) method to do a move. The Move implementation must notify the working memory of any changes it does on the solution facts:

    public void doMove(WorkingMemory workingMemory) {
        FactHandle queenHandle = workingMemory.getFactHandle(queen);
        workingMemory.modifyRetract(queenHandle); // before changes are made
        queen.setY(toY);
        workingMemory.modifyInsert(queenHandle, queen); // after changes are made
    }

Drools-solver disables shadow facts for increased performance, so you cannot use the workingMemory.update(FactHandle, Object) method, instead you need to call the workingMemory.modifyRetract(FactHandle) method before modifying the fact and the workingMemory.modifyInsert(FactHandle, Object) method after modifying the fact. Note that you can alter multiple facts in a single move and effectively create a big move (also known as a coarse-grained move).

Drools-solver automatically filters out non doable moves by calling the isDoable(WorkingMemory) method on a move. A non doable move is:

  • A move that changes nothing on the current solution. For example, moving queen B0 to row 0 is not doable.

  • A move that is impossible to do on the current solution. For example, moving queen B0 to row 10 is not doable because it would move it outside the board limits.

In the n queens example, a move which moves the queen from it's current row to the same row isn't doable:

    public boolean isMoveDoable(WorkingMemory workingMemory) {
        int fromY = queen.getY();
        return fromY != toY;
    }

Because we won't generate a move which can move a queen outside the board limits, we don't need to check it. A move that is currently not doable can become doable on a later solution.

Each move has an undo move: a move (usually of the same type) which does the exact opposite. In the above example the undo move of C0 to C2 would be the move C2 to C0. An undo move can be created from a move, but only before the move has been done on the current solution.

    public Move createUndoMove(WorkingMemory workingMemory) {
        return new YChangeMove(queen, queen.getY());
    }

Notice that if C0 would have already been moved to C2, the undo move would create the move C2 to C2, instead of the move C2 to C0.

The local search solver can do and undo a move more than once, even on different (successive) solutions.

A move must implement the equals() and hashcode() methods. 2 moves which make the same change on a solution, must be equal.

    public boolean equals(Object o) {
        if (this == o) {
            return true;
        } else if (o instanceof YChangeMove) {
            YChangeMove other = (YChangeMove) o;
            return new EqualsBuilder()
                    .append(queen, other.queen)
                    .append(toY, other.toY)
                    .isEquals();
        } else {
            return false;
        }
    }

    public int hashCode() {
        return new HashCodeBuilder()
                .append(queen)
                .append(toY)
                .toHashCode();
    }

In the above example, the Queen class uses the default Object equal() and hashcode() implementations. Notice that it checks if the other move is an instance of the same move type. This is important because a move will be compared to a move with another move type if you're using more then 1 move type.

It's also recommended to implement the toString() method as it allows you to read drools-solver's logging more easily:

    public String toString() {
        return queen + " => " + toY;
    }

Now that we can make a single move, let's take a look at generating moves.

At each solution, local search will try all possible moves and pick the best move to change to the next solution. It's up to you to generate those moves. Let's take a look at all the possible moves on the starting solution of 4 queens:


As you can see, not all the moves are doable. At the starting solution we have 12 doable moves (n * (n - 1)), one of which will be move which changes the starting solution into the next solution. Notice that the number of possible solutions is 256 (n ^ n), much more that the amount of doable moves. Don't create a move to every possible solution. Instead use moves which can be sequentially combined to reach every possible solution.

It's highly recommended that you verify all solutions are connected by your move set. This means that by combining a finite number of moves you can reach any solution from any solution. Otherwise you're already excluding solutions at the start. Especially if you're using only big moves, you should check it. Just because big moves outperform small moves in a short test run, it doesn't mean that they will outperform them in a long test run.

You can mix different move types. Usually you're better off preferring small (fine-grained) moves over big (course-grained) moves because the score delta calculation will pay off more. However, as the traveling tournament example proves, if you can remove a hard constraint by using a certain set of big moves, you can win performance and scalability. Try it yourself: run both the simple (small moves) and the smart (big moves) version of the traveling tournament example. The smart version evaluates a lot less unfeasible solutions, which enables it to outperform and outscale the simple version.

Move generation currently happens with a MoveFactory:

public class NQueensMoveFactory extends CachedMoveListMoveFactory {

    public List<Move> createMoveList(Solution solution) {
        NQueens nQueens = (NQueens) solution;
        List<Move> moveList = new ArrayList<Move>();
        for (Queen queen : nQueens.getQueenList()) {
            for (int n : nQueens.createNList()) {
                moveList.add(new YChangeMove(queen, n));
            }
        }
        return moveList;
    }

}

But we'll be making move generation part of the drl's soon.

A step is the winning move. The local search solver tries every move on the current solution and picks the best accepted move as the step:


Because the move B0 to B3 has the highest score (-3), it is picked as the next step. Notice that C0 to C3 (not shown) could also have been picked because it also has the score -3. If multiple moves have the same highest score, one is picked randomly, in this case B0 to B3.

The step is made and from that new solution, the local search solver tries all the possible moves again, to decide the next step after that. It continually does this in a loop, and we get something like this:


Notice that the local search solver doesn't use a search tree, but a search path. The search path is highlighted by the green arrows. At each step it tries all possible moves, but unless it's the step, it doesn't investigate that solution further. This is one of the reasons why local search is very scalable.

As you can see, the local search solver solves the 4 queens problem by starting with the starting solution and make the following steps sequentially:

  1. B0 to B3

  2. D0 to B2

  3. A0 to B1

If we turn on INFO logging, this is reflected into the logging:

INFO  Solving with random seed (0).
INFO  Initial score (-6) is starting best score. Updating best solution and best score.
INFO  Step (0), time spend (0) doing next step ([Queen-1] 1 @ 0 => 3).
INFO  New score (-3) is better then last best score (-6). Updating best solution and best score.
INFO  Step (1), time spend (0) doing next step ([Queen-3] 3 @ 0 => 2).
INFO  New score (-1) is better then last best score (-3). Updating best solution and best score.
INFO  Step (2), time spend (15) doing next step ([Queen-0] 0 @ 0 => 1).
INFO  New score (0) is better then last best score (-1). Updating best solution and best score.
INFO  Solved in 3 steps and 15 time millis spend.

Notice that the logging used the toString() method from our Move implementation: [Queen-1] 1 @ 0 => 3.

The local search solver solves the 4 queens problem in 3 steps, by evaluating only 37 possible solutions (3 steps with 12 moves each + 1 starting solution), which is only fraction of all 256 possible solutions. It solves 16 queens in 31 steps, by evaluating only 7441 out of 18446744073709551616 possible solutions.

The local search solver decides the next step with the aid of 3 configurable components:


In the above example the selector generated the moves shown with the blue lines, the accepter accepted all of them and the forager picked the move B0 to B3.

If we turn on DEBUG logging, we can see the decision making in the log:

INFO  Solving with random seed (0).
INFO  Initial score (-6) is starting best score. Updating best solution and best score.
DEBUG     Move ([Queen-0] 0 @ 0 => 0) ignored because not doable.
DEBUG     Move ([Queen-0] 0 @ 1 => 1) with score (-4) and acceptChance (1.0).
DEBUG     Move ([Queen-0] 0 @ 2 => 2) with score (-4) and acceptChance (1.0).
...
DEBUG     Move ([Queen-1] 1 @ 3 => 3) with score (-3) and acceptChance (1.0).
...
DEBUG     Move ([Queen-3] 3 @ 3 => 3) with score (-4) and acceptChance (1.0).
INFO  Step (0), time spend (0) doing next step ([Queen-1] 1 @ 0 => 3).
INFO  New score (-3) is better then last best score (-6). Updating best solution and best score.
...

An accepter is used (together with a forager) to active tabu search, simulated annealing, great deluge, ... For each move it generates an accept chance. If a move is rejected it is given an accept chance of 0.0.

You can implement your own Accepter, although the build-in accepters should suffice for most needs. You can also combine multiple accepters.

When tabu search takes steps it creates tabu's. It does not accept a move as the next step if that move breaks tabu. Drools-solver implements several tabu types:

You can even combine tabu types:

    <accepter>
        <completeSolutionTabuSize>1000</completeSolutionTabuSize>
        <completeUndoMoveTabuSize>10</completeUndoMoveTabuSize>
    </accepter>

If you pick a too small tabu size, your solver can still get stuck in a local optimum. On the other hand, with the exception of solution tabu, if you pick a too large tabu size, your solver can get stuck by bouncing of the walls. Use the benchmarker to fine tweak your configuration.

A tabu search accepter should be combined with a maximum score of all or first best score improving forager.

Sooner or later the local search solver will have to stop solving. This can be because of a number of reasons: the time is up, the perfect score has been reached, ... The only thing you can't depend on is on finding the optimal solution (unless you know the optimal score), because a local search solver doesn't know that when it finds the optimal solution. For real-life problems this doesn't turn out to be much of a problem, because finding the optimal solution would take years, so you 'll want to finish early anyway.

You can configure when a local search solver needs to stop by configuring a Finish. You can implement your own Finish, although the build-in finishes should suffice for most needs.