Score calculation with a rule engine

Rule based score calculation

The score calculation of a planning problem is based on constraints (such as hard constraints, soft constraints, rewards, ...). A rules engine, such as drools, makes it easy to implement those constraints as score rules.

Here's an example of a constraint implemented as a score rule in drools:

rule "multipleQueensHorizontal"
    when
        $q1 : Queen($id : id, $y : y);
        $q2 : Queen(id > $id, y == $y);
    then
        insertLogical(new UnweightedConstraintOccurrence("multipleQueensHorizontal", $q1, $q2));
end

This score rule will fire once for every 2 queens with the same y. The (id > $id) condition is needed to assure that for 2 queens A and B, it can only fire for (A, B) and not for (B, A), (A, A) or (B, B). Let's take a closer look at this score rule on the starting solution of 4 queens:

Figure 1.11. Starting solution for the 4 queens puzzle

Starting solution for the 4 queens puzzle

In this starting solution the multipleQueensHorizontal score rule will fire for 6 queen couples: (A, B), (A, C), (A, D), (B, C), (B, D) and (C, D). Because none of the queens are on the same vertical or diagonal line, this starting solution will have a score of -6. An optimal solution of 4 queens has a score of 0.

You need to add your score rules drl files in the solver configuration, for example:

    <scoreDrl>/org/drools/solver/examples/nqueens/solver/nQueensScoreRules.drl</scoreDrl>

You can add multiple <scoreDrl> entries if needed.

It's recommended to use drools in forward-chaining mode (which is the default behaviour), as for solver implementations this will create the effect of a delta based score calculation instead of a full score calculation on each solution evaluation. For example, if a single queen moves from y 0 to 3, it won't bother to recalculate the "multiple queens on the same horizontal line" constraint for queens with y 1 or 2. This is a huge performance gain. Drools-solver gives you this huge performance gain without forcing you to write a very complicated delta based score calculation algorithm. Just let the drools rule engine do the hard work.

Adding more constraints is easy and scalable (once you understand the drools rule syntax). This allows you to add it a bunch of soft constraint score rules on top of the hard constraints score rules with little effort and at a reasonable performance cost. For example, for a freight routing problem you could add a soft constraint to avoid the certain flagged highways at rush hour.

The ScoreDefinition interface

The ScoreDefinition interface defines the score representation. The score must a Score instance and the instance type (for example DefaultHardAndSoftScore) must be stable throughout the solver runtime.

The solver aims to find the solution with the highest score. The best solution is the solution with the highest score that it has encountered during its solving.

Most planning problems tend to use negative scores (the amount of negative constraints being broken) with an impossible perfect score of 0. This explains why the score of a solution of 4 queens is the negative of the number of queen couples which can attack each other.

A ScoreDefinition instance is configured in the solver configuration:

    <scoreDefinition>
        <scoreDefinitionType>SIMPLE</scoreDefinitionType>
    </scoreDefinition>

There are a couple of build-in ScoreDefinition implementations:

  • SIMPLE: The SimpleScoreDefinition defines the Score as a SimpleScore which has a single int value, for example -123.

  • HARD_AND_SOFT: The HardAndSoftScoreDefinition defines the Score as a HardAndSoftScore which has a hard int value and a soft int value, for example -123hard/-456soft.

You can implement your own ScoreDefinition, although the build-in score definitions should suffice for most needs.

A ScoreCalculator instance is asserted into the working memory as a global called scoreCalculator. Your score rules need to (indirectly) update that instance. Usually you 'll make a single rule as an aggregation of the other rules to update the score:

global SimpleScoreCalculator scoreCalculator;

rule "multipleQueensHorizontal"
    when
        $q1 : Queen($id : id, $y : y);
        $q2 : Queen(id > $id, y == $y);
    then
        insertLogical(new UnweightedConstraintOccurrence("multipleQueensHorizontal", $q1, $q2));
end

// multipleQueensVertical is obsolete because it is always 0

rule "multipleQueensAscendingDiagonal"
    when
        $q1 : Queen($id : id, $ascendingD : ascendingD);
        $q2 : Queen(id > $id, ascendingD == $ascendingD);
    then
        insertLogical(new UnweightedConstraintOccurrence("multipleQueensAscendingDiagonal", $q1, $q2));
end

rule "multipleQueensDescendingDiagonal"
    when
        $q1 : Queen($id : id, $descendingD : descendingD);
        $q2 : Queen(id > $id, descendingD == $descendingD);
    then
        insertLogical(new UnweightedConstraintOccurrence("multipleQueensDescendingDiagonal", $q1, $q2));
end

rule "hardConstraintsBroken"
    when
        $occurrenceCount : Number() from accumulate(
            $unweightedConstraintOccurrence : UnweightedConstraintOccurrence(),
            count($unweightedConstraintOccurrence)
        );
    then
        scoreCalculator.setScore(- $occurrenceCount.intValue());
end

Optionally, you can also weigh your constraints differently, by multiplying the count of each score rule with its weight. For example in freight routing, you can make 5 broken "avoid crossroads" soft constraints count as much as 1 broken "avoid highways at rush hour" soft constraint. This allows your business analysts to easily tweak the score function as they see fit.

Here's an example of all the NQueens constraints written as a single rule, using multi pattern accumulates and making multipleQueensHorizontal constraint outweigh the other constraints 5 times:

// Warning: This currently triggers backwards chaining instead of forward chaining and seriously hurts performance and scalability.
rule "constraintsBroken"
    when
        $multipleQueensHorizontal : Long()
        from accumulate(
            $q1 : Queen($id : id, $y : y)
            and Queen(id > $id, y == $y),
           count($q1)
        );
        $multipleQueensAscendingDiagonal : Long()
        from accumulate(
            $q2 : Queen($id : id, $ascendingD : ascendingD)
            and Queen(id > $id, ascendingD == $ascendingD),
           count($q2)
        );
        $multipleQueensDescendingDiagonal : Long()
        from accumulate(
            $q3 : Queen($id : id, $descendingD : descendingD)
            and Queen(id > $id, descendingD == $descendingD),
           count($q3)
        );
    then
        scoreCalculator.setScore(- (5 * $multipleQueensHorizontal) - $multipleQueensAscendingDiagonal - $multipleQueensDescendingDiagonal);
end

Tips and tricks

  • If you know a certain constraint can never be broken, don't bother writing a score rule for it. For example, the n queens example doesn't have a "multipleQueensVertical" rule because a queen's x never changes and the starting solution puts each queen on a different x. This tends to give a huge performance gain, not just because the score function is faster, but mainly because most solver implementations will spend less time evaluating unfeasible solutions.

  • Be watchfull for score traps. A score trap is a state in which several moves need to be done to resolve or lower the weight of a single constraint occurrence. Some examples of score traps:

    • If you need 2 doctors at each table, but you're only moving 1 doctor at a time, then the solver has no insentive to move a doctor to a table with no doctors. Punish a table with no doctors more then a table with only 1 doctor in your score function.

    • If you only add the table as a cause of the ConstraintOccurrence and forget the jobType (which is doctor or politician), then the solver has no insentive to move a docter to table which is short of a doctor and a politician.

  • If you use tabu search, combine it with a relativeSelection selector. Take some time to tweak the value of relativeSelection.

  • Verify that your score calculation happens in the correct Number type. If you're making the sum of integer values, don't let drools use Double's or your performance will hurt. Solver implementations will usually spend most of their execution time running the score function.

  • Always remember that premature optimization is the root of all evil. Make sure your design is flexible enough to allow configuration based tweaking.

  • Currently, don't allow drools to backward chain instead of forward chain, so avoid query's. It kills scalibilty.

  • Currently, don't allow drools to switch to MVEL mode, for performance. You can avoid this by using eval in the score rules, for example: eval(day.getIndex() == $day1.getIndex() + 3).

  • For optimal performance, use at least java 1.6 and always use server mode (java -server). We have seen performance increases of 30% by switching from java 1.5 to 1.6 and 50% by turning on server mode.

  • If you're doing performance tests, always remember that the JVM needs to warm up. First load your solver and do a short run, before you start benchmarking it.

In case you haven't figured it out yet: performance (and scalability) is very important for solving planning problems. What good is a real-time freight routing solver that takes a day to find a feasible solution? Even small and innocent looking problems can hide an enormous problem size. For example, they probably still don't know the optimal solution of the traveling tournament problem for as little as 10 traveling teams.