SATPlan

Starting from a PDDL representation of the agent’s knowledge base KB (specifying the initial state and the actions) and the goal g, the plan is build as follows:

SATPlan tries to build a plan of length L, starting with L = 0 and then incrementing L by 1 at each attempt (a strategy analogous to the iterative deepening search strategy)
for every value of L = 0, 1, 2, ... a partially different propositional representation of KB ∧ g is generated, and the attempt to build a model of such representation is carried out (for example using the DPLL algorithm)
if for some value of L a model of KB ∧ g is found, a plan of length L is extracted from the model
if the available time and/or space resources are exhausted before a model is found, the planning effort fails

The basic idea of SATPlan is that if the representation for a given L is satisfiable (i.e., if we can find a model), this means that the goal can be achieved from the initial state, and thus the planning problem is solved

Example

We shall see how SATPlan works on a very simple example: picking a ring from an initially closed box, which can be solved with L = 2

Initial state: $Closed(b)$, $In(R,B)$, $Free$

Goal: $Holds(R)$

Action $open(x)$:

Precond: $Closed(x) \land Free$
Effect: $\lnot Closed(x) \land Open(x)$

Action $pickUp(x,y)$:

Precond: $In(x,y) \land Open(y) \land Free$
Effect: $\lnot In(x) \land \lnot Free \land Holds(x)$

From PDDL to PL

Starting from the PDDL representation, we introduce a minimal set of propositional symbols that is sufficient for representing both facts and actions
Every symbol P will take an apex t = 0, 1, ... denoting the time instant (i.e., action step) at which the truth of the symbol is considered; e.g., ¬P^4 will mean that P is false at instant t = 4
Note that PL representations are interpreted according to classical logical semantics, which means that we have no CWA

In our example:

$Open(B) \to OpenB^t $
$Closed(B) \to \lnot OpenB^t $
$In(R,B) \to InRB^t $
$Holds(R) \to HoldsR^t $
$Free(B) \to \lnot HoldsR^t $
$open(B) \to openB^t $ open B at time t (with effects at t+1)
$pickUp(B) \to pickUpB^t $ pick up R from B at time t (with effects at t+1)

Execution

L = 0

We now build a propositional representation for L = 0.

In so doing, we assume that the goal is achieved at instant t = 0, i.e., by executing a plan of length zero, or an empty plan, which actually means doing nothing

The representation includes:

A representation of the initial state, in our example: $\lnot OpenB^0$, $InRB^0$, $\lnot HoldsR^0$
A representation of the goal, on the assumption that it is already achieved at t = 0. In our example: $HoldsR^0$
nothing else, because at t = 0 no action has been performed yet

Considering our example to the human eye, it is obvious that the set of propositional sentences is unsatisfiable, because we immediately see the logical contradiction. An artificial agent may reach the same conclusion by applying DPLL. Given that we found no plan for L = 0, SATPlan will proceed and build the representation for L = 1.

L = 1

The representations for L > 0 are more complex than the one for L = 0, because we must consider the possibility of executing actions at every t < L

Let us start with the preconditions of an action that is executed at a generic instant t: In SATPlan the preconditions of an action are considered as necessary conditions for the execution of the action, i.e., as conditions that must hold at t if the action is executed at t (with effects at t+1); e.g:

if the action of opening x is executed at t then x is closed and the gripper is free at t

We therefore represent the action preconditions with the following propositional sentences: $$openB^t \implies \lnot OpenB^t \land \lnot HoldsR^t$$ $$pickUpRB^t \implies InRB^t \land OpenB^t \land \lnot HoldsR^t$$

In view of the application of DPLL, the precondition axioms must be converted in CNF: $$openB^t \implies \lnot OpenB^t \land \lnot HoldsR^t \ \to \lnot openB^t \lor (\lnot OpenB^t \land \lnot HoldsR^t) \to ¬openB^t ∨ ¬OpenB^t ¬openB^t ∨ ¬HoldsR^t$$