Open TURNS, an Open source initiative to Treat Uncertainties, Risks’N Statistics in a structured industrial approach

G. Andrianov, S. Burriel, S. Cambier, A. Dutfoy, I. Dutka-Malen, E. de Rocquigny, B. Sudret
EDF Research & Development, Clamart, France

P. Benjamin, R. Lebrun, F. Mangeant EADS Innovation works, Suresnes, France

M. Pendola PhiMECA, Clermont-Ferrand, France

ABSTRACT: The need to assess robust performance for complex systems has led to a considerable rise of industrial interest in a simulation challenge: treating and propagating uncertainty through complex physical and numerical simulation frameworks. The industrial stakes require that both the methodology and the numerical methods for uncertainty treatment be openly validated and enriched by the academic world and the certification authorities alike. A general methodology has emerged from the joint effort of industrial companies and academic institutions, and a list of well-established numerical methods has been selected to support it. As part of this joint effort, EDF R&D, EADS Innovation Works and PhiMECA started a collaboration at the beginning of 2005 to develop an Open Source software platform dedicated to uncertainty treatment by probabilistic methods, named Open TURNS for Open source Treatment of Uncertainty, Risk ‘N Statistics. In this paper, we present the Open TURNS initiative and the platform in its initial version, after recalling the context that motivated the three partners to develop this new open source software. We then describe the wider objectives of EDF R&D, EADS Innovation Works and PhiMECA regarding Open TURNS and the open source community.

1 INDUSTRIAL CONTEXT AND RATIONALE FOR AN OPEN SOURCE INITIATIVE

Uncertainty, along with sensitivity analysis, is a long-standing concern in the academic community and the subject of substantial ongoing research (for instance [Knight, 1921], [Granger et al., 1996], [Helton, 2004]). Although large-scale industrial uncertainty studies were already carried out in the 1990s (notably in the American nuclear industry, for instance [Helton, 1993]), interest has risen considerably in many industries over the last decade, as evidenced by the number of international conferences. Industry generally needs to respond to tighter regulatory processes (security, safety, environmental control, health impacts …) that gradually require explicit uncertainty treatment, and to better optimize the design, operation and maintenance of industrial processes or products while accounting for uncertainty.

A considerable literature has discussed the nature of uncertainty and its appropriate treatments, and has later developed many specific numerical algorithms (mostly, although not exclusively, probabilistic). Yet many attempts to treat uncertainty in large industrial applications have involved domain-specific approaches or standards, for instance:
− the control of pure metrological uncertainty, generally Gaussian and involving simple analytical models [G.U.M];
− the concepts of structural reliability (often mechanical), involving rare threshold exceedance probabilities and the related computing challenges [Ditlevsen O. 1996];
− differential-based approaches for large physical models [Cacuci et al., 1980], more elaborate on the numerical side (e.g. adjoints) than on the statistical side, and generally devoted to variance-based local sensitivity;
− global sensitivity approaches involving more elaborate sampling techniques (including response surfaces and the most recent stochastic developments), generally focused on variance decomposition [Saltelli et al., 2004].

Faced with questions from their institutional control or certification authorities in a growing number of domains and businesses, the large industrial companies to which the authors belong have felt that domain-specific approaches are no longer appropriate: in spite of the diversity of terminologies, most of these methods in fact share many common algorithms, notably a mixture of estimation of probabilistic quantities and computation of often CPU-intensive deterministic physical-numerical models. Beyond the leverage that can be achieved by merging those approaches, the particular industrial challenges attached to the recent uncertainty concerns are:

− transparency: a key requirement for reaching an open consensus on the basis of an accountable and hypothesis-explicit treatment of uncertainty that can be understood and challenged by outside authorities and experts;
− genericity: the need to address, in a consistent and generic way, an increasingly multi-physical / multi-domain issue that is no longer limited to a particular field (metrology, pure mechanics, economics …) and that involves elaborate management processes and various actors or companies along the supply chain;
− industrial computing capabilities: the need to involve high performance computing (distributed, multi-environment etc.) in an efficient, quality-assured way, to secure the challenging number of simulations generated by uncertainty treatment and the reproducibility of the methodologies in successive industrial studies.

Although a wealth of commercial software packages and dedicated industrial applications is available for uncertainty studies, the authors felt that none fully answered the challenges mentioned above: each lacked either full transparency (closed commercial software), genericity (domain-specific or industry-specific tools) or industrial computing capabilities (architecture or environment limitations).
All these concerns motivated the launch of the Open TURNS initiative, with the following key features:
− an open source initiative, rather than commercial software, to secure the transparency of the approach and its openness to ongoing R&D and expert challenge;
− generic with respect to physical or industrial domains, to allow a consistent treatment of multi-physical problems, or of the various regulations to be met by a single industrial project;
− including the largest variety of qualified algorithms: many methodologies and numerical algorithms have been developed for uncertainty treatment in recent years, and R&D is still very active. Industrial practice shows that no single methodology is always relevant, and that it is preferable to rely on a portfolio of algorithms kept up to date with ongoing R&D: this is the rationale for the Open Source format;
− structured in a « practitioner-guidance » methodological approach: the first question in an industrial study is not « how » to compute the uncertainty treatment by applying the most recently published mathematical algorithm to a fixed physical model; it is rather « what » specific criterion or goal the uncertainty study must answer, according to the specificities of the regulation or decision-making process, and which chain of models, given the data practically available on uncertainty, should be selected before applying the uncertainty treatment. A practitioner facing academic discussions on the theoretical advantages of numerical algorithms may remain confused, for instance when it is not stated explicitly whether the uncertainty study has to ascertain compliance with a variance-derived criterion or rather with a criterion on quantiles (distribution tails), or when the practical characteristics of industrial problems are ignored, such as the true availability of data or the physical regularity of the deterministic model;
− with advanced industrial computing capabilities, enabling the use of massive distribution and high performance computing, various engineering environments (such as generic or domain-specific interfaces), large data models etc.

To start up the Open Source initiative, a partnership was set up by the R&D centers of two large industrial companies together with an engineering consultancy: Electricité de France (EDF), a leading European energy operator; the European Aeronautic Defence and Space Company (EADS), a leading European aerospace company; and PhiMECA, a French R&D and consultancy company specialized in uncertainty assessment. The following sections detail the key characteristics of Open TURNS in its initial version (§2) as well as the wider objectives envisioned by the initiative (§3).

2 OPEN TURNS V1.0 KEY CHARACTERISTICS

The partnership has been developing the open source software Open TURNS, for Treatment of Uncertainties, Risk’N Statistics, since the beginning of 2004.

2.1 Open TURNS: a multiple-access software

Open TURNS is a Unix/Linux software that comes in three forms:

1 a scientific C++ library providing an internal data model and algorithms dedicated to the treatment of uncertainties. The main expected use of this library is to give specific applications all the functionality needed to treat uncertainties in studies. The targeted users are all engineers who want to introduce the probabilistic dimension into their so far deterministic studies.
2 an independent application with a graphical user interface (GUI). The main expected use of this application is to become the working environment of the specialist in uncertainty treatment. The targeted users are once again industrial practitioners: those who identify the treatment of uncertainties as a task in its own right, generic to more than one engineering domain.
3 a Python module providing high-level operators in the probabilistic and statistical field. It offers all the scientific power of the C++ library with the convenience of an interpreted language such as Matlab's. The main expected use of this Python module is to facilitate the development of prototypes for trying new algorithms or methods, and to serve as an easy-to-use support for tutorial classes, … Its vocation is to become a natural environment for integrating new developments in the field of uncertainty and sensitivity analysis. The targeted public here is research centers and the academic world.

2.2 Open TURNS v1.0: an open source software

Open TURNS is open source software, released under the LGPL licence. It can be downloaded from the following address: http://www.openturns.org. It runs in the Unix / Linux environment, but all technology choices have been made so as not to hinder a port to the Windows environment. It relies on standard formats and technologies such as:
− Qt technology for the graphical user interface,
− XML technology for the description of external models,
− DRMAA as the standard for distributed computing,
− STL for the C++ language and any structure belonging to standard C++.

Open TURNS uses reputed and widespread open source technologies, such as:
− Xerces to read the XML format,
− Qt for the graphical user interface,
− Python for the textual command language,
− the R language for the statistical methods,
− LAPACK and BLAS for all the linear algebra algorithms.

2.3 Scientific methods of Open TURNS v1.0

The scientific algorithms of Open TURNS have been selected among the most widely accepted and used ones in the field of uncertainty treatment. The choice was made following several discussions between industrial practitioners, including within the European Safety Reliability and Data Association (ESReDA).
Beyond the standard methodologies, some of the algorithms are scientifically more recent and reflect an innovative way of thinking (see § 2.5 Innovative aspects of Open TURNS). In its presentation, Open TURNS aims at being consistent with the methodological recommendations under discussion within the ESReDA Project Group “Uncertainty”. This methodology notably decomposes an uncertainty study into three steps, A to C, complemented by a sensitivity analysis step C’, all detailed hereafter.
1 Step A: Specify the decision criterion and the output variable of interest.
− The criterion may be deterministic or probabilistic. In the deterministic case, Open TURNS proposes a min / max study: the objective is to determine the extrema of the variable of interest given the uncertainties on the probabilistic input variables. In the probabilistic case, Open TURNS proposes to study the central dispersion of the variable of interest given the uncertainties on the probabilistic input variables, its distribution, or its probability of exceeding a given threshold.
− The output variable of interest is the result of a model that can be an analytical formula or something more complex (e.g. a finite element model or even coupled models). Open TURNS can take into account the gradients of the output variable with respect to the uncertain model inputs when the model evaluates them; otherwise, the finite difference technique is used. The model may run in the Unix or the Windows environment.
2 Step B: Quantify/model the sources of uncertainty (model the probability density function of the random vector composed of the probabilistic input variables).
− When numerical samples are available for the probabilistic input variables, Open TURNS offers the following statistical functionalities:
• probability density function fitting methods: based on the maximum likelihood principle for the parametric method, and on the kernel smoothing technique (with a Gaussian kernel) for the non-parametric method.
• quantitative and qualitative validation tests: the χ2, Kolmogorov-Smirnov and Anderson-Darling tests, and a ranking of these validation results with the BIC criterion; the Henry line for the test of normality, the pp-plot, and the superposition on a single graph of the empirical and fitted cumulative distribution functions.
• methods to estimate the dependence between two scalar variables: through their correlation coefficient, with the Pearson, Spearman and χ2 independence tests, or through a linear regression analysis with some indicators of the quality of the regression (estimation of the R2 coefficient, superposition on a single graph of the 2D sample and the regression line, Fisher test on the coefficients).
− When no sample is available, Open TURNS offers three ways to model the probability density function:
• by assigning an nD probability density function among the usual ones: the first version of Open TURNS proposes the nD Gaussian density function if n>1, and gives the choice within a varied panel if n=1;
• by assigning a usual probability density function to each of the 1D probabilistic input variables, and by defining the dependence structure of the random vector: the first version of Open TURNS gives the choice between the independent copula and the Gaussian one, whose correlation matrix is evaluated from the Spearman rank correlation matrix of the random vector;
• by defining the density function as a linear combination of usual probability density functions.
3 Step C: Propagate uncertainties.
− When the criterion is deterministic, Open TURNS proposes to sample the variable of interest by sampling the probabilistic input variables according to deterministic designs of experiments (factorial, composite or star-like) or probabilistic designs of experiments drawn according to the probability density function of the probabilistic input vector.
− When the criterion is probabilistic, Open TURNS proposes:
• to evaluate the mean and standard deviation of the variable of interest by the quadratic cumul method or by simulation, in order to analyze the central dispersion;
• to evaluate the probability of exceeding a given threshold by the FORM and SORM approximations, whose result may be validated by the Strong Maximum Test, or by simulation with standard and accelerated algorithms (Monte Carlo, Latin Hypercube Sampling, Directional Simulation, Importance Sampling) with the estimation of a confidence interval.
4 Step C’: Analyze sensitivity and rank the importance of the sources of uncertainty.
− When the central dispersion of the variable of interest is studied, Open TURNS proposes the importance factors and sensitivity indices evaluated from the Taylor quadratic variance approximation or from the correlation coefficients (Pearson, Spearman, SRC, SRRC, PCC, PRCC). More advanced variance decomposition techniques (such as FAST or Sobol) should appear in later versions.
− When the probability of exceeding a given threshold is studied, Open TURNS proposes the importance factors evaluated from the FORM and SORM methods.

Moreover, aside from these four steps, Open TURNS makes it possible to build a response surface that is used in the uncertainty analysis instead of the model, generally to reduce CPU cost. Open TURNS proposes to build either a global response surface from a second-degree polynomial expression, or a local response surface from a first- or second-degree Taylor approximation of the model.

Some of the graphs produced by Open TURNS, illustrating some of the previous algorithms, are summarized below.



[Figures: sample graphs produced by Open TURNS]
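Step C mentions estimating the probability of exceeding a threshold by Monte Carlo simulation, together with a confidence interval. The sketch below illustrates that idea in plain Python; it does not use the Open TURNS API. The function name, the toy model Y = X1 + X2 with independent standard Gaussian inputs, and the choice of a 95% normal-approximation interval are all assumptions made for illustration.

```python
import math
import random


def monte_carlo_exceedance(model, sample_inputs, threshold, n, seed=0):
    """Estimate P(model(X) > threshold) from n independent draws.

    Returns the estimate and the bounds of a 95% confidence interval
    based on the normal approximation of the binomial proportion.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = sample_inputs(rng)            # one draw of the uncertain inputs
        if model(*x) > threshold:         # threshold exceedance event
            hits += 1
    p = hits / n
    half_width = 1.96 * math.sqrt(p * (1.0 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical toy study: Y = X1 + X2, with X1 and X2 independent N(0, 1).
def toy_model(x1, x2):
    return x1 + x2


def toy_inputs(rng):
    return rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)


estimate, lower, upper = monte_carlo_exceedance(toy_model, toy_inputs, 2.0, 20000, seed=1)
# For this toy case Y ~ N(0, 2), so the exact value is 0.5 * erfc(1).
exact = 0.5 * math.erfc(1.0)
print(f"estimate={estimate:.4f}  95% CI=[{lower:.4f}, {upper:.4f}]  exact={exact:.4f}")
```

The interval width shrinks like 1/sqrt(n), which is why rare events call for the accelerated algorithms listed in Step C (importance sampling, directional simulation) or for the FORM and SORM approximations.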


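Steps C and C’ refer to the quadratic cumul method and to importance factors derived from a Taylor variance approximation. A minimal sketch is given here, assuming the usual first-order form Var(Y) ≈ ∇h(μ)ᵀ Σ ∇h(μ) and a central finite-difference gradient (the text notes that finite differences are used when the model does not supply gradients). Function names and the linear toy model are hypothetical, and this is not Open TURNS code.

```python
def gradient_fd(h, x, step=1e-6):
    """Central finite-difference gradient of the scalar function h at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += step
        down[i] -= step
        grad.append((h(up) - h(down)) / (2.0 * step))
    return grad


def quadratic_cumul(h, means, covariance):
    """First-order Taylor approximation of the mean and variance of h(X).

    Also returns the importance factors g_i^2 * Var(X_i) / Var(Y); they sum
    to 1 only when the inputs are uncorrelated (diagonal covariance).
    """
    g = gradient_fd(h, means)
    n = len(means)
    variance = sum(g[i] * covariance[i][j] * g[j]
                   for i in range(n) for j in range(n))
    factors = [g[i] ** 2 * covariance[i][i] / variance for i in range(n)]
    return h(means), variance, factors


# Hypothetical toy model: Y = 2*X1 + 3*X2 with uncorrelated inputs.
def toy_model(x):
    return 2.0 * x[0] + 3.0 * x[1]


mean_y, var_y, importance = quadratic_cumul(
    toy_model, [1.0, 2.0], [[0.25, 0.0], [0.0, 1.0]])
print(mean_y, var_y, importance)
```

For a linear model the approximation is exact; for a nonlinear model it is only reliable when the input dispersion is small relative to the curvature of h around the mean point.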


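Step B states that the correlation matrix of the Gaussian copula is evaluated from the Spearman rank correlation matrix, without giving the formula. One standard relation for a Gaussian copula is ρ = 2 sin(π ρ_S / 6), applied entry by entry. The sketch below assumes that relation; whether Open TURNS v1.0 used exactly this conversion is not stated in the paper, and the function name is hypothetical.

```python
import math


def spearman_to_gaussian_correlation(spearman):
    """Convert a Spearman rank correlation matrix into the linear correlation
    matrix of a Gaussian copula, entry-wise: rho = 2 * sin(pi * rho_S / 6)."""
    n = len(spearman)
    return [
        [1.0 if i == j else 2.0 * math.sin(math.pi * spearman[i][j] / 6.0)
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 2D example with a Spearman coefficient of 0.5.
rank_matrix = [[1.0, 0.5],
               [0.5, 1.0]]
copula_correlation = spearman_to_gaussian_correlation(rank_matrix)
print(copula_correlation)
```

The two coefficients stay close to each other over the whole range, and coincide at -1, 0 and 1. In dimension greater than two, the entry-wise conversion is not guaranteed to yield a positive semi-definite matrix, so the result should be checked before it is used.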

2.4 Documentation of Open TURNS v1.0

Open TURNS provides rich documentation to facilitate its first use by both communities of the uncertainty field, users and contributors. The documentation is composed of:
− a Reference Guide that provides one data sheet for each method proposed by Open TURNS. The Reference Guide is designed to be consistent with the methodological recommendations of the ESReDA Project Group “Uncertainty”. Each data sheet is arranged under the following headings: Mathematical description, Link with Open TURNS Methodology, References and Theoretical Basics, Examples. These headings aim to give the reader as much information as possible about the mathematical principles of the algorithm described, its limitations of use, its position in the global methodology, and some theoretical references for the reader who wants to know more on the subject. The Examples heading lets the reader grasp the algorithm through a simple example that can be modeled with Open TURNS,
− an Example Guide that applies most of the algorithms proposed by Open TURNS to a single example. This example is more complex than those presented in the Reference Guide and concerns the evaluation of the height of a dike protecting against a flood,
− a TUI Guide that presents all the Python commands that Open TURNS proposes, with their particular syntax,
− a Use Cases Guide that gathers numerous examples of studies modeled with the Python TUI: this guide facilitates learning the TUI language,
− a Design Documentation that presents the architecture of Open TURNS, in order to facilitate the integration of new developments by the open source community. This documentation is generated by Doxygen,
− numerous examples of external model encapsulations, in order to facilitate the link with Open TURNS. If the external model is an analytical expression, the link is made almost automatically thanks to a tool supplied with Open TURNS.

2.5 Innovative aspects of Open TURNS v1.0

Open TURNS presents some innovative aspects:
− in its probabilistic model:

• Open TURNS works directly in the nD context by placing the probability density function of the nD random vector composed of the 1D probabilistic input variables at the center of all its algorithms. All the methods implemented in Open TURNS are developed in the multidimensional context.
• Open TURNS uses an elaborate classification of probabilistic data: distributions, for example, are classified according to their characteristics (continuous or discrete, elliptical, …) in order to make the best use of their properties in algorithms.
• Open TURNS offers some recent statistical methods, such as kernel smoothing, and the possibility to define a probability density function as a linear combination of probability density functions.
• Open TURNS uses copula theory to model the dependence between variables.
• Open TURNS offers some innovative algorithms, such as the Strong Maximum Test for FORM, proposed by EDF R&D and used in some of its uncertainty studies.
− in its functional model:

• Open TURNS uses function algebra up to the second order,
• all the functions defined in Open TURNS are defined in the nD context: functions go from Rp to Rn, with n and p ≥ 1,
• if the external model evaluates its gradient and Hessian matrix with respect to the probabilistic input variables, Open TURNS takes them into account,
• Open TURNS is able to compose functions as well as their gradients and Hessian matrices.
− with its distribution service, integrated into the platform, which makes it possible to communicate with high performance computing infrastructures (multi-processor platforms, clusters, …).

In any case, let us note that the architecture of Open TURNS makes it a software platform for easy integration of innovations from the open source community in the field of uncertainty treatment.

2.6 Logo of Open TURNS: why the Galton board?

The Open TURNS logo refers to the well-known Galton board experiment, created in the 19th century. It is representative of the field of Open TURNS: it consists of a rigorous mathematical model fed with random inputs, and it graphically illustrates the magic of statistics, as a very regular collective pattern emerges from random individual evolutions. It is also representative of the spirit of Open TURNS: all the tools needed for its creation are free technologies (Open Dynamics Engine, POV-Ray, ImageMagick).

3 NATIVE PARTNERS OBJECTIVES: INDUSTRIAL USES AND CHALLENGES, SCIENTIFIC OPPORTUNITIES

3.1 Industrial uses envisioned by the native partners

The internal goal of EDF R&D is to use Open TURNS in studies where uncertainty treatment has already been undertaken for some time, but in a rather domain-specific way using commercial software or research applications, particularly in the thermo-hydraulic, mechanical or environmental fields, … Furthermore, EDF R&D aims to extend the uncertainty treatment methodology to other fields that do not perform probabilistic studies yet, using Open TURNS as a user-friendly support for a consistent uncertainty treatment methodology that integrates the multi-disciplinary experience of other fields.

The goal of EADS Innovation Works is first to use Open TURNS for demonstrative studies in which uncertainty treatment is planned: structural mechanics, electromagnetic interference, tolerancing… The main goal is to spread an uncertainty treatment methodology that makes it possible to change the design rules applicable within the aeronautical business. The Open TURNS environment will help promote such a general approach at different levels and for different domains. It also represents a very efficient way to begin collaborative work with the software industry to introduce these functions into commercial numerical tools.

The goal of PhiMECA is to promote and capitalize on some of the innovative uncertainty methods developed by its R&D team, in order to use them and offer them efficiently to its customers. The offer may include personalized services such as uncertainty consultancy, effective deployment of the tool, theoretical and applied training courses, or software developments using Open TURNS. Applications concern various fields in advanced mechanics for the automotive, aeronautics, nuclear, civil engineering and defense industries.

3.2 External objectives

The three native partners welcome initiatives from the open source community to use Open TURNS for uncertainty treatment studies or research developments. In particular, they wish to enlarge the open community of users, both from industry and from universities, in order to facilitate the use and sharing of innovations in the uncertainty field. They are also motivated to enrich Open TURNS v1.0 into a new version, including not only more complex algorithms dedicated to specific needs (such as other dependence structures, e.g. Archimedean copulas, or recent sensitivity algorithms), but also new notions (such as Bayesian aspects). Finally, they want to enlarge the open source community of contributors, to enrich the software with developments from different audiences. To make these objectives reachable, a Consortium has been created around Open TURNS, gathering the three native partners but open to others.

4 CONCLUSION

This paper aimed at presenting the Open TURNS initiative and the platform in its initial version.
It explained why the three partners EDF R&D, EADS Innovation Works and PhiMECA started such a collaboration in 2005: in order to master industrial and environmental risks and uncertainties with open practices, they launched the development of open source software dedicated to the treatment of uncertainty, which would be generic (multi-physical / multi-domain) and consistent with a global methodology for the treatment of uncertainty. The wider objectives envisioned by the initiative are to make Open TURNS a reference software in the field of uncertainty, shared by both the industrial and the academic worlds. May Open TURNS help innovative investigations in the field of uncertainty!

REFERENCES

Breitung K., 1984, Asymptotic Approximation for multinormal Integrals, Journal of Engineering Mechanics, ASCE, 110(3), 357-366.
Breitung K., 1989, Asymptotic Approximation for Probability Integrals, Probabilistic Engineering Mechanics, Vol 4, No 4, 187-190.
Cacuci D.G., et al., 1980, Sensitivity Theory for General Systems of Nonlinear Equations, Nucl. Sc. & Eng. 75.
Cambanis S., Huang S., Simons G., 1981, On the theory of elliptically contoured distributions, University of North Carolina, Journal of Multivariate Analysis, vol 11, pp 368-385.
Ditlevsen O., Madsen H.O., 1996, Structural Reliability Methods, John Wiley & Sons.
Granger Morgan M., Henrion M., 1990, A Guide to Dealing with Uncertainty in Quantitative Risk and Policy Analysis, Cambridge University Press.
GUM, Guide for the Expression of Uncertainty of Measurement, ISO standard.
Helton J., 1993, Uncertainty and sensitivity analysis techniques for use in performance assessment for radioactive waste disposal, Rel. Eng. & Syst. Saf., 42, 327-367.
Helton J.C., Oberkampf W.L., 2004, Alternative Representations of Epistemic Uncertainty, Special Issue of Rel. Eng. & Syst. Saf., vol. 85 n°1-3.
Knight F.H., 1921, Risk, Uncertainty and Profit, Hart, Schaffner & Marx.
Madsen H.O., Krenk S., Lind N.C., 1986, Methods of Structural Safety, Prentice Hall.
Robert C.P., Casella G., 2004, Monte-Carlo Statistical Methods, 2nd ed., Springer, ISBN 0-387-21239-6.
Saltelli A., Tarantola S., Campolongo F., Ratto M., 2004, Sensitivity analysis in practice: a guide to assessing scientific models, Wiley.