step toward a framework for parameter tuning applied generally
to AI Planning and proposes a preliminary set of features.
The Learn-and-Optimize (LaO) framework consists
of the combination of optimizing (i.e., parameter tuning)
and learning, i.e., finding the mapping between features and
best parameters. Furthermore, the results of learning can already
be used to further the optimization phases, with the
learned model acting as in standard surrogate-model-based techniques
(see e.g., [1] for a Gaussian-process-based approach).
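The two phases of the framework can be illustrated with a minimal sketch. This is purely illustrative and not the authors' implementation: the `tune` callback stands in for any per-instance parameter tuner (e.g., an EA-based one), and the learned mapping is approximated here by a simple nearest-neighbour lookup rather than a Gaussian-process surrogate.

```python
# Illustrative sketch of the LaO loop: optimize parameters per training
# instance, then learn a feature -> parameters mapping from the results.
from typing import Callable, Dict, List, Tuple

Features = Tuple[float, ...]        # instance features (hypothetical encoding)
Params = Dict[str, float]           # tuned parameter values

def learn_and_optimize(
    training: List[Features],
    tune: Callable[[Features], Params],   # the optimization phase (assumed tuner)
) -> Callable[[Features], Params]:
    # Optimization phase: tune each training instance independently.
    table = [(f, tune(f)) for f in training]

    # Learning phase: here, a 1-nearest-neighbour mapping over the results.
    def predict(features: Features) -> Params:
        def sq_dist(f: Features) -> float:
            return sum((a - b) ** 2 for a, b in zip(f, features))
        return min(table, key=lambda fp: sq_dist(fp[0]))[1]

    return predict
```

A real instantiation would replace the lookup with a regression model, which could then also serve as a surrogate during further tuning.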
LaO can of course be applied to any target optimization
methodology that requires parameter tuning. In this paper,
the target optimization technique is an Evolutionary
Algorithm (EA), more precisely the evolutionary AI planner
called Divide-and-Evolve (DaE). However, DaE will here
be considered as a black-box algorithm, without any modification
for the purpose of this work compared to its original
version described in [17].
The paper is organized as follows: AI Planning problems are
briefly introduced in Section 2. Section 3 describes the
classical YAHSP solver and the evolutionary Divide-and-Evolve
algorithm. Section 4 introduces the original, top-level
parameter tuning method, Learn-and-Optimize. The case
study presented in Section 5 applies LaO to DaE, following
the rules of the International Planning Competition 2011 –
Learning Track. Finally, conclusions are drawn and further
directions of research are proposed in Section 6.
2. AI PLANNING
An Artificial Intelligence (AI) planning problem is defined
by a triplet: an initial state, a goal state, and a set
of possible actions. An action modifies the current state
and can only be applied if certain conditions are met. A
solution plan to a planning problem is an ordered list of
actions whose execution from the initial state achieves the
goal state. The quality criterion of a plan depends on the
type of available actions: in the simplest case (e.g., the STRIPS
domain), it is the number of actions; it may also be the
total cost of the plan for actions with costs; and it is the total
duration of the plan, aka the makespan, for temporal problems
with so-called durative actions.
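The three quality criteria above can be sketched in a few lines. The action representation is a hypothetical one chosen for illustration, not a representation used by the paper:

```python
# Illustrative sketch of the three plan-quality criteria: plan length
# (STRIPS), total cost, and makespan (temporal planning).
from typing import List, NamedTuple

class GroundedAction(NamedTuple):   # hypothetical representation
    name: str
    cost: float
    start: float        # scheduled start time (temporal case)
    duration: float

def plan_quality(plan: List[GroundedAction], criterion: str) -> float:
    if criterion == "length":       # simplest case: number of actions
        return float(len(plan))
    if criterion == "cost":         # actions with cost: total cost
        return sum(a.cost for a in plan)
    if criterion == "makespan":     # temporal: total duration of the schedule
        return max(a.start + a.duration for a in plan)
    raise ValueError(f"unknown criterion: {criterion}")
```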
Domain-independent planners rely on the Planning Domain
Definition Language PDDL2.1 [8]. The history of PDDL is
closely related to the different editions of the International
Planning Competitions (IPCs, http://ipc.icaps-conference.
org/), and the problems submitted to the participants, written
in PDDL, are still the main benchmarks in AI Planning.
The description of a planning problem consists of two separate
parts, usually placed in two different files: the generic
domain on the one hand and a specific instance scenario
on the other hand. The domain file specifies object types
and predicates, which define possible states, and actions,
which define possible state changes. The instance scenario
declares the actual objects of interest, gives the initial state
and provides a description of the goal. A state is described
by a set of atomic formulae, or atoms. An atom is defined
by a predicate followed by a list of object identifiers:
(PREDICATE NAME OBJ1 ... OBJN).
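This representation of atoms and states can be sketched as follows. The encoding (tuples of strings, sets of tuples) and the toy logistics-style atoms are assumptions made for illustration only:

```python
# A state is a set of atoms; an atom is a predicate name applied to
# a list of object identifiers.
from typing import FrozenSet, Tuple

Atom = Tuple[str, ...]          # ("predicate-name", "obj1", ..., "objN")
State = FrozenSet[Atom]

# Toy instance data (illustrative only):
initial_state: State = frozenset({
    ("at", "truck1", "depot"),
    ("at", "crate1", "depot"),
})
goal: State = frozenset({("at", "crate1", "market")})   # a partial state

def satisfies(state: State, partial: State) -> bool:
    """A state satisfies a partial state if all of its atoms are true."""
    return partial <= state
```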
The initial state is complete, whereas the goal might be a
partial state. An action is composed of a set of preconditions
and a set of effects, and applies to a list of variables given
as arguments, and possibly a duration or a cost. Preconditions
are logical constraints which apply domain predicates
to the arguments and trigger the effects when they are satisfied.
Effects enable state transitions by adding or removing
atoms.
A solution plan to a planning problem is a consistent schedule
of grounded actions whose execution in the initial state
leads to a state that contains the goal state, i.e., where all
atoms of the problem goal are true. A planning problem
defined on domain D with initial state I and goal G will be
denoted in the following as P_D(I, G).
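The action semantics and the solution criterion above can be sketched in a few lines. This is a toy illustration; the triple encoding of grounded actions is an assumption, not the paper's representation:

```python
# An action is (preconditions, add effects, delete effects); effects
# realize state transitions by adding or removing atoms.
from typing import FrozenSet, List, Tuple

Atom = Tuple[str, ...]
State = FrozenSet[Atom]
Action = Tuple[State, State, State]   # (preconditions, adds, deletes)

def apply_action(state: State, action: Action) -> State:
    pre, add, delete = action
    if not pre <= state:              # preconditions must hold in the state
        raise ValueError("action not applicable: preconditions not satisfied")
    return (state - delete) | add     # remove deleted atoms, add new ones

def solves(initial: State, goal: State, plan: List[Action]) -> bool:
    """A plan solves P_D(I, G) if executing it from I reaches a state
    containing all atoms of the (possibly partial) goal."""
    state = initial
    for action in plan:
        state = apply_action(state, action)
    return goal <= state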
3. DIVIDE-AND-EVOLVE
Early approaches to AI Planning using Evolutionary Algorithms
directly handled possible solutions, i.e., possible
plans: an individual is an ordered sequence of actions (see
[25, 20, 27, 28, 6]). However, as is often the case in Evolutionary
Combinatorial Optimization, those direct encoding
approaches have limited performance in comparison to
the traditional AI Planning approaches. Furthermore, hybridization
with classical methods has been the way to success
in many combinatorial domains, as witnessed by the
fruitful emerging domain of memetic algorithms [11]. Along
those lines, though relying on an original "memetization"
principle, a novel hybridization of Evolutionary Algorithms
(EAs) with AI Planning, termed Divide-and-Evolve (DaE),
has been proposed [23, 24]. For a complete formal description,
see [16].
The basic idea of DaE in order to solve a planning task
P_D(I, G) is to find a sequence of states S_1, ..., S_n, and to use
some embedded planner to solve the series of planning problems
P_D(S_k, S_{k+1}), for k ∈ [0, n] (with the convention that
S_0 = I and S_{n+1} = G). The generation and optimization of
the sequence of states (S_i), i ∈ [1, n], is driven by an evolutionary
algorithm. The fitness (makespan or total cost) of a list
of partial states S_1, ..., S_n is computed by repeatedly calling
the external 'embedded' planner to solve the sequence of
problems P_D(S_k, S_{k+1}), for k = 0, ..., n. The concatenation
of the corresponding plans (possibly with some compression
step) is a solution of the initial problem. Any existing planner
can be used as the embedded planner, but since a guarantee
of optimality at all calls is not mandatory in order for DaE
to obtain good-quality results [16], a sub-optimal but fast
planner is used: YAHSP [26], a lookahead strategy planning
system for sub-optimal planning which uses the actions
in the relaxed plan to compute reachable states in order to
speed up the search process.
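The decompose-solve-concatenate scheme above can be sketched as follows. This is an illustrative sketch, not the authors' code: the planner interface and the plan representation are assumptions, the compression step is omitted, and fitness is simplified to plan length:

```python
# Sketch of DaE's fitness evaluation: call the embedded planner on each
# consecutive pair of states and concatenate the resulting sub-plans.
from typing import Callable, FrozenSet, List, Optional, Tuple

State = FrozenSet[str]      # a set of atoms (simplified encoding)
Plan = List[str]            # grounded action names (illustrative)

def evaluate(
    initial: State,
    goal: State,
    intermediate: List[State],      # the evolved individual S_1, ..., S_n
    embedded_planner: Callable[[State, State], Optional[Plan]],  # e.g. YAHSP
) -> Optional[Tuple[Plan, int]]:
    states = [initial] + intermediate + [goal]   # S_0 = I, S_{n+1} = G
    full_plan: Plan = []
    for k in range(len(states) - 1):
        sub_plan = embedded_planner(states[k], states[k + 1])
        if sub_plan is None:        # a subproblem is unsolvable: individual fails
            return None
        full_plan.extend(sub_plan)  # concatenation of the sub-plans
    return full_plan, len(full_plan)   # fitness: plan length here (STRIPS case)
```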
A state is a list of atoms built over the set of predicates and
the set of object instances. However, searching the space of
complete states would result in a rapid explosion of the size
of the search space. Moreover, the goals of a planning problem
need only be defined as partial states. It thus seems
more practical to search only sequences of partial states,
and to limit the choice of possible atoms used within such
partial states. However, this raises the issue of the choice
of the atoms to be used to represent individuals, among all
possible atoms. The results of previous experiments on
different domains of temporal planning tasks from the IPC
benchmark series [3] demonstrate the need for a very careful
choice of the atoms that are used to build the partial states.
The method used to build the partial states is based on
an estimation of the earliest time from which an atom can
become true. Such an estimation can be obtained by any admissible
heuristic function (e.g., h^1, h^2, ... [12]). The possible
start times are then used in order to restrict the candidate
atoms for each partial state. A partial state is built at a