Posted on 18/05/2018 by Webmaster ROADEF:
We are offering a PhD topic entitled "Learning and Validation of a Multi-Criteria Decision Model Representing Interacting Criteria" (see the attached subject description).
This CIFRE PhD will take place at Thales Research & Technology in Palaiseau, and will be co-supervised by Michèle Sebag (LRI - Université Paris-Sud Orsay) and Eyke Hüllermeier (University of Paderborn, Germany). Several trips to Paderborn will be organized during the PhD.
Thank you in advance for circulating it among your students.
Best regards,
Christophe LABREUCHE
THALES RESEARCH & TECHNOLOGY
Join Thales, a global leader in safety and security technologies for the Aerospace,
Transportation, Defense and Security markets. With 62,000 employees in 56 countries, the
Group benefits from an international presence that allows it to act as close as possible to its
customers, anywhere in the world.
Located on the École Polytechnique campus, at the heart of Paris-Saclay's world-class
scientific and technological center, the Palaiseau site is one of the Group's research centers.
You will join the Information Sciences and Techniques Research Group, one of whose main
missions is to solve complex problems in the many areas of expertise of the Thales group
(Space, Transport, Defense, ...).
DESCRIPTION OF THE SUBJECT
Summary
The aim of the PhD thesis is to validate a model constructed by a Preference Learning
algorithm for safety-critical applications. A Machine Learning approach will be developed to
construct a Multi-Criteria Decision Aiding model representing interacting criteria. The
validation will be based on worst-case situations rather than on standard cross-validation
techniques.
Context
The context of the PhD thesis is Multi-Criteria Decision Aid (MCDA) [KR76] in safety-critical
application domains (e.g. the supervision of a metro line). The MCDA model is learned at
design time from operator expertise, and then executed at runtime (e.g. during the
operation of the metro line). During the operation phase, the model can be adapted
based on operator feedback. Preference Learning (PL), an emerging subfield of
Machine Learning (ML), can be used to learn an MCDA model and make recommendations
to the user from observed preference data [FH10]. The models used in PL are often of
a generic nature (for example, linear or kernel functions). In safety-critical applications, the
users are experts, and the model must capture and reproduce their subtle decision strategies.
This calls for rich and versatile models such as the Choquet integral [C53,GL10],
which can represent, in particular, interaction among the decision attributes. PL has been extended to
such models [TCDH12,TLH14]. The main benefit of these approaches compared to traditional
ML techniques is twofold. Firstly, these models have been specifically designed for decision
purposes and are easier to interpret and trust for a decision-maker. Interpretability has
regained importance, because ML algorithms are nowadays used to make decisions in
many domains, and people would like to (and sometimes have a right to) understand the
outcomes of the algorithms. Secondly, MCDA models naturally capture important properties
such as monotonicity (which means that the larger the values of the criteria, the better).
Such properties are important (but not sufficient) to ensure the safety of the algorithm.
Safety is difficult to obtain from ML algorithms, and this subject has recently emerged as an
important topic in ML [VA17].
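As an illustration of how a Choquet integral represents interaction among criteria, here is a
minimal Python sketch. It is not taken from the cited works: the function names, the
two-criteria capacity and its weights are hypothetical and chosen only for illustration. The
sketch assumes non-negative scores and a capacity given as a dictionary over all subsets of
criteria.

```python
from itertools import combinations


def choquet_integral(x, capacity):
    """Choquet integral of non-negative scores x w.r.t. a capacity.

    `capacity` maps each frozenset of criterion indices to a weight,
    with capacity[frozenset()] == 0 and capacity[all criteria] == 1.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])  # ascending scores
    total, previous = 0.0, 0.0
    for rank, i in enumerate(order):
        # Criteria whose score is at least x[i].
        coalition = frozenset(order[rank:])
        total += (x[i] - previous) * capacity[coalition]
        previous = x[i]
    return total


def is_monotone(capacity, n):
    """Check that A included in B implies capacity[A] <= capacity[B]."""
    subsets = [frozenset(c) for k in range(n + 1)
               for c in combinations(range(n), k)]
    return all(capacity[a] <= capacity[b]
               for a in subsets for b in subsets if a <= b)


# Hypothetical capacity: each criterion alone weighs little, but the two
# together weigh 1, i.e. the criteria are complementary.
capacity = {
    frozenset(): 0.0,
    frozenset({0}): 0.2,
    frozenset({1}): 0.3,
    frozenset({0, 1}): 1.0,
}

balanced = choquet_integral([0.75, 0.75], capacity)
unbalanced = choquet_integral([0.5, 1.0], capacity)
```

With this capacity, the balanced alternative (0.75, 0.75) scores higher than (0.5, 1.0),
although both have the same arithmetic mean, which a plain weighted sum cannot express.
The monotonicity check mirrors the property mentioned above: adding criteria to a coalition
never decreases its weight.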
The validation of the learned MCDA model is essential, as no MCDA
algorithm will be integrated into a safety-critical system without validation. It also increases
the operator's trust and confidence. One needs to ensure, for instance, that the system will
not produce terrible decisions in extreme situations. It is also important to have an assessment
of the quality of the solution: in particular, in which situations the decision produced by the
algorithm will be good, and in which situations it will not be satisfactory.
The problem of validating an MCDA model has multiple facets. The data provided by the
operator is limited in quantity. It might also be (partly) erroneous or too sparse to validate the
model; in the latter case, more training data is needed. Finally, there is a variety of possible
model classes (e.g. considering interaction among no attributes, or among 2, 3, ... attributes).
In the literature, there are interesting works partly (but not fully) addressing the previous
facets.
One primary focus in ML is the generalization accuracy, which depends on the empirical error
(error on the training set) and the complexity of the model class. It is assessed in practice by
the empirical accuracy obtained through cross-validation. This is not satisfactory in
safety-critical applications, as one is interested in worst-case scenarios rather than the average
error.
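The gap between average and worst-case error can be sketched in a few lines of Python. The
model, the expert scores and the function name `error_summary` are hypothetical and serve
only as an illustration, not as the validation method to be developed in the thesis.

```python
def error_summary(model, examples):
    """Return the mean and the worst-case absolute error of `model`.

    `examples` is a list of (attribute_vector, expert_score) pairs.
    """
    errors = [abs(model(x) - y) for x, y in examples]
    return {"mean": sum(errors) / len(errors), "worst": max(errors)}


# Hypothetical learned model: a plain weighted sum of two criteria.
def weighted_sum(x):
    return 0.5 * x[0] + 0.5 * x[1]


# Hypothetical expert scores. The third example is an extreme situation
# (one criterion excellent, the other very poor) that the expert rates low.
examples = [
    ((1.0, 1.0), 1.0),
    ((0.0, 0.0), 0.0),
    ((1.0, 0.0), 0.1),
    ((0.5, 0.5), 0.5),
]

summary = error_summary(weighted_sum, examples)
```

Here the mean error is small while one extreme situation is badly mispredicted, which is the
kind of case that an average-based assessment tends to hide.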
Most learning algorithms are not robust to adversarial examples [GSS15]. Particularly in the
domain of computer vision, malicious modifications of modest amplitude can be designed
such that, although imperceptible to a human expert, they result in
a wrong classification with high confidence. The process for generating such adversarial
examples essentially relies on the local linearity of the classifiers in high-dimensional spaces.
These techniques cannot be directly applied in MCDA as the number of attributes is small. In
adversarial learning, the test set is chosen so as to obtain the worst possible error rate of the
classifier [LC10].
The lack of data can be characterized through the distinction between epistemic and aleatory
uncertainty [SBD13]. Epistemic uncertainty comes from the lack of data (relative to the
complexity of the model class), whereas aleatory uncertainty originates from the noise
in the dataset (together with the relevance of the choice of the model class).
In [D17], the problem of model class selection is handled in the framework of belief functions,
where the user has to assign a probability to each training instance. The credibility that a
model class represents the dataset is the belief mass of the empty set. This can be used to
choose an appropriate model class.
Workplan
The PhD thesis will develop a validation approach for classes of MCDA models representing
interaction among criteria, constructed by PL approaches such as in [TCDH12,TLH14]. In
particular, the following points will be addressed:
> To what extent can we be sure that the learned model is representative of the
preferences of the expert?
> A related point concerns the validity domain: the model may not be valid in the whole
decision space, but can we identify in which areas of the decision space we are sure
it is valid?
The implemented algorithms will be tested on datasets from the machine learning
community, and applied to a real Thales use case.
CONTACTS AND LOCATION
The PhD will be carried out at the Thales site in Palaiseau and will be co-supervised by University of
Saclay and University of Paderborn (Germany). Several trips to Paderborn will be organized.
Contacts:
> Christophe Labreuche, Thales Research & Technology, Palaiseau, France
(christophe.labreuche@thalesgroup.com)
> Eyke Hüllermeier, Univ. Paderborn, Germany (eyke@upb.de)
> Michèle Sebag, University of Saclay, France (Michele.Sebag@lri.fr)
HOW TO APPLY FOR THE PHD THESIS?
Send a CV and a cover letter to Christophe Labreuche
(christophe.labreuche@thalesgroup.com).