Instance-Based Parameter Tuning
for Evolutionary AI Planning
Mátyás Brendel
Projet TAO, INRIA Saclay & LRI
Université Paris Sud
Orsay, France
matthias.brendel@lri.fr
Marc Schoenauer
Projet TAO, INRIA Saclay & LRI
Université Paris Sud
Orsay, France
marc.schoenauer@inria.fr
ABSTRACT
Learn-and-Optimize (LaO) is a generic surrogate-based method for parameter tuning that combines learning and optimization. In this paper, LaO is used to tune Divide-and-Evolve (DaE), an Evolutionary Algorithm for AI Planning. The LaO framework makes it possible to learn the relation between features describing a given instance and the optimal parameters for that instance, and thus to extrapolate this knowledge to unknown instances of the same domain. Moreover, the learned relation is used as a surrogate model to accelerate the search for the optimal parameters. It hence becomes possible to address intra-domain and extra-domain generalization within a single framework. The proposed implementation of LaO uses an Artificial Neural Network for learning the mapping between features and optimal parameters, and the Covariance Matrix Adaptation Evolution Strategy for optimization. Results demonstrate that LaO is capable of improving the quality of the DaE results even with only a few iterations. The main limitation of this DaE case study is the limited number of meaningful features that are available to describe the instances. Nevertheless, the learned model reaches almost the same performance on the test instances, which means that it is capable of generalization.
Categories and Subject Descriptors
I.2.6 [Computing Methodologies]: Artificial Intelligence, Learning, Parameter learning
General Terms
Theory
Keywords
parameter tuning, AI Planning, evolutionary algorithms
1. INTRODUCTION
Parameter tuning is basically a general optimization problem applied off-line to find the best parameters of complex algorithms, for example Evolutionary Algorithms (EAs). Whereas the efficiency of EAs has been demonstrated in several application domains [29, 18], they usually need computationally expensive parameter tuning. Consequently, one is tempted to use either the default parameters of the framework at hand, or parameter values given in the literature for similar problems.
Being a general optimization problem, parameter tuning can be tackled by as many algorithms as there are optimization techniques [7, 19]. However, several specialized methods have been proposed, the most prominent ones today being Racing [5], REVAC [21], SPO [2], and ParamILS [14]. All these approaches face the same crucial generalization issue: can a parameter set that has been optimized for a given problem be successfully used for another one? The answer of course depends on the similarity of both problems. However, even in an optimization domain as precisely defined as AI Planning, there are very few results describing meaningful similarity measures between problem instances. Moreover, no sufficiently precise and accurate features have been specified so far that would allow the user to describe a problem accurately enough for the optimal parameter set to be learned from this feature set and carried over to other problems with a similar description. To the best of our knowledge, no design of a general learning framework has been proposed, and no general experiments have been carried out yet on representative domains of AI planning.
In the SAT domain, however, one work must be cited as an example of what can be done along those lines. In [13], many relevant features were gathered, building on half a century of SAT research and hundreds of papers. Extensive parameter tuning on several thousands of instances allowed the authors to learn, using function regression, a meaningful mapping between the features and the running time of a given SAT solver with given parameters. Optimizing this model makes it possible to choose the optimal parameters for a given (unknown) instance. The present paper aims at generalizing this work to AI planning, with one major difference: the target here is to optimize the fitness value for a given runtime, and not the runtime to solution, as the optimal solution is generally not known for AI planning problems.
Unfortunately, no one has yet proposed a set of features for AI Planning problems in general that would be sufficient to describe the characteristics of a problem, as was done in the SAT domain [13]. This paper makes a
step toward a framework for parameter tuning that applies to AI Planning in general, and proposes a preliminary set of features. The Learn-and-Optimize (LaO) framework consists of the combination of optimizing (i.e., parameter tuning) and learning, i.e., finding the mapping between features and best parameters. Furthermore, the results of learning are already useful during the optimization phases, the learned model being used as in standard surrogate-model-based techniques (see e.g., [1] for a Gaussian-process-based approach).
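To make the loop concrete, the following is a minimal sketch of one such learn-and-optimize iteration. It is an illustration under stated assumptions, not the implementation evaluated in this paper: scikit-learn's MLPRegressor stands in for the Artificial Neural Network, the cma package for CMA-ES, and run_dae(instance, params) is a hypothetical wrapper that runs the tuned algorithm on an instance and returns a fitness value to be maximized.

import numpy as np
import cma
from sklearn.neural_network import MLPRegressor

def learn_and_optimize(instances, features, run_dae, n_params, n_rounds=5):
    # best[i] holds the best (parameter vector, fitness) found for instance i.
    best = {i: (np.random.rand(n_params), -np.inf) for i in instances}
    model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000)
    for _ in range(n_rounds):
        # Learning step: fit the mapping  instance features -> best-known parameters.
        X = np.array([features[i] for i in instances])
        Y = np.array([best[i][0] for i in instances])
        model.fit(X, Y)
        for i in instances:
            # Optimization step: the model's prediction seeds CMA-ES, so the
            # learned mapping acts as a surrogate that accelerates the search.
            x0 = model.predict(np.array([features[i]]))[0]
            es = cma.CMAEvolutionStrategy(x0, 0.3)
            es.optimize(lambda p: -run_dae(i, p), iterations=10)  # CMA-ES minimizes
            if -es.result.fbest > best[i][1]:
                best[i] = (es.result.xbest, -es.result.fbest)
    # The trained model can then predict parameters for unseen instances.
    return model, best

Bounds handling, restarts, and the exact budget per instance are omitted here for brevity; any such detail below is a simplification rather than a description of the actual experimental setup.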
LaO can of course be applied to any target optimization method that requires parameter tuning. In this paper, the target optimization technique is an Evolutionary Algorithm (EA), more precisely the evolutionary AI planner called Divide-and-Evolve (DaE). However, DaE is considered here as a black-box algorithm, used without any modification compared to its original version described in [17].
The paper is organized as follows: AI Planning problems are briefly introduced in Section 2. Section 3 describes the classical YAHSP solver and the evolutionary Divide-and-Evolve algorithm. Section 4 introduces Learn-and-Optimize, the original top-level parameter tuning method proposed here. The case study presented in Section 5 applies LaO to DaE, following the rules of the learning track of the International Planning Competition 2011. Finally, conclusions are drawn and further directions of research are proposed in Section 6.
2. AI PLANNING
An Artificial Intelligence (AI) planning problem is defined by a triplet: an initial state, a goal state, and a set of possible actions. An action modifies the current state and can only be applied if certain conditions are met. A solution plan to a planning problem is an ordered list of actions whose execution from the initial state achieves the goal state. The quality criterion of a plan depends on the type of available actions: in the simplest case (e.g., the STRIPS domain), it is the number of actions; it may also be the total cost of the plan for actions with costs; and it is the total duration of the plan, aka the makespan, for temporal problems with so-called durative actions.
Domain-independent planners rely on the Planning Domain Definition Language PDDL2.1 [8]. The history of PDDL is closely related to the different editions of the International Planning Competitions (IPCs, http://ipc.icaps-conference.org/), and the problems submitted to the participants, written in PDDL, are still the main benchmarks in AI Planning. The description of a planning problem consists of two separate parts, usually placed in two different files: the generic domain on the one hand, and a specific instance scenario on the other hand. The domain file specifies object types and predicates, which define possible states, and actions, which define possible state changes. The instance scenario declares the actual objects of interest, gives the initial state, and provides a description of the goal. A state is described by a set of atomic formulae, or atoms. An atom is defined by a predicate followed by a list of object identifiers: (PREDICATE_NAME OBJ1 ... OBJN).
The initial state is complete, whereas the goal might be a partial state. An action is composed of a set of preconditions and a set of effects, and applies to a list of variables given as arguments, and possibly a duration or a cost. Preconditions are logical constraints which apply domain predicates to the arguments and trigger the effects when they are satisfied. Effects enable state transitions by adding or removing atoms.
A solution plan to a planning problem is a consistent schedule of grounded actions whose execution in the initial state leads to a state that contains the goal state, i.e., where all atoms of the problem goal are true. A planning problem defined on domain D with initial state I and goal G will be denoted in the following as P_D(I, G).
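As an illustration of these semantics, here is a minimal Python sketch (an assumption of this rewrite, not code from any of the planners discussed here) that represents a grounded state as a set of atoms and an action by its preconditions, add effects, and delete effects; the truck-at-depot atoms in the usage example are likewise made up.

from typing import NamedTuple, FrozenSet, Tuple

# An atom is a predicate followed by object identifiers, here a plain tuple.
Atom = Tuple[str, ...]

class Action(NamedTuple):
    name: str
    preconditions: FrozenSet[Atom]  # must all hold in the current state
    add_effects: FrozenSet[Atom]    # atoms added by the action
    del_effects: FrozenSet[Atom]    # atoms removed by the action

def applicable(state: set, action: Action) -> bool:
    return action.preconditions <= state

def apply_action(state: set, action: Action) -> set:
    # Effects enable the state transition by adding and removing atoms.
    return (state - action.del_effects) | action.add_effects

def is_solution(plan, initial: set, goal: set) -> bool:
    # A plan solves P_D(I, G) if executing it from I reaches a state
    # containing all atoms of the (possibly partial) goal G.
    state = set(initial)
    for a in plan:
        if not applicable(state, a):
            return False
        state = apply_action(state, a)
    return goal <= state

# Usage: a single state transition in a hypothetical transport domain.
move = Action(name="move-truck1-depot-market",
              preconditions=frozenset({("at", "truck1", "depot")}),
              add_effects=frozenset({("at", "truck1", "market")}),
              del_effects=frozenset({("at", "truck1", "depot")}))
assert apply_action({("at", "truck1", "depot")}, move) == {("at", "truck1", "market")}

Under this encoding, len(plan) gives the STRIPS quality criterion mentioned above; costs or durations would be extra fields on Action.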
3. DIVIDE-AND-EVOLVE
Early approaches to AI Planning using Evolutionary Algorithms directly handled possible solutions, i.e., possible plans: an individual is an ordered sequence of actions; see [25, 20, 27, 28, 6]. However, as is often the case in evolutionary combinatorial optimization, those direct encoding approaches have limited performance in comparison to traditional AI planning approaches. Furthermore, hybridization with classical methods has been the way to success in many combinatorial domains, as witnessed by the fruitful emerging domain of memetic algorithms [11]. Along those lines, though relying on an original "memetization" principle, a novel hybridization of Evolutionary Algorithms (EAs) with AI Planning, termed Divide-and-Evolve (DaE), has been proposed [23, 24]. For a complete formal description, see [16].
The basic idea of DaE in order to solve a planning task P_D(I, G) is to find a sequence of states S_1, ..., S_n, and to use some embedded planner to solve the series of planning problems P_D(S_k, S_{k+1}), for k ∈ [0, n] (with the convention that S_0 = I and S_{n+1} = G).
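The following Python sketch illustrates this decomposition principle. It is a simplified illustration under stated assumptions, not the actual DaE code: embedded_planner(init, goal) is a hypothetical wrapper around a planner such as YAHSP that returns a subplan solving P_D(init, goal), or None on failure.

def evaluate_state_sequence(initial, goal, states, embedded_planner):
    """Solve P_D(S_k, S_{k+1}) for k in [0, n], with S_0 = I and S_{n+1} = G,
    and concatenate the subplans into one candidate global plan."""
    waypoints = [initial] + list(states) + [goal]
    plan = []
    for start, end in zip(waypoints, waypoints[1:]):
        subplan = embedded_planner(start, end)
        if subplan is None:
            return None  # this individual (state sequence) is infeasible
        plan.extend(subplan)
    return plan  # its quality (e.g., length or makespan) yields the DaE fitness

The evolutionary part of DaE then searches the space of intermediate state sequences, each individual being evaluated by such a concatenation of subplans.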