Task 5 – CliMAF: a framework for climate model evaluation and analysis

Leaders: J. Servonnat (LSCE-CEA) and S. Sénési (CNRS-GAME)

Contributors: IPSL-LSCE, CNRS-GAME, CERFACS, IPSL-LOCEAN, IPSL-LMD, IPSL-LATMOS

Objectives:

CliMAF (for Climate Model Assessment Framework) is the French climate community's operational solution, to be developed in this task, for the efficient evaluation and monitoring of climate model outputs. It is a framework providing simplified (from the user's point of view) and efficient access to both model simulations and reference datasets, data pre-processing (data subsetting, period selection, regridding), application of a set of diagnostics and metrics for model evaluation and, finally, a web-oriented visualization solution to explore the evaluation results through an “atlas”, i.e. an organized set of figures. CliMAF is primarily a framework for climate model developers. Its flexibility also makes it suitable for researchers who want to develop more specialized diagnostics while benefiting from its data access and visualization tools. Our goal is that CliMAF can be used on a workstation as well as on the computing centers and servers used at GAME, IPSL and CERFACS. An option will allow parallel computation of the diagnostics to cope with the huge amounts of data generated by ensembles and high-resolution models. We plan close interactions with EMBRACE WP4, IS-ENES and ExArch, so that these parallel projects can benefit from one another and to favour a community approach to evaluation and analysis tools for climate simulations.

  • Task 5.1: General driver and upstream user interface

The general driver of CliMAF is the core of the framework. It launches the scripts implementing the diagnostics and metrics (see e.g. Gettelman et al., 2012) that generate the various atlases. An upstream user interface (UI) will allow the user to describe the detailed content of each atlas in a flexible way. A set of “standard” evaluation and monitoring atlases will be available to the user. We consider two levels of sophistication for the UI: a text-file interface and a point-and-click interface. The design of the general driver will be supported by a thorough analysis of the framework to ensure genericity and evolvability. The driver will accommodate additional scripts written in NCL, R and Python, notably those developed within Task 5.4. The UI will allow batch submission of CliMAF jobs. It will thus be possible to submit CliMAF jobs automatically during an ongoing simulation to update monitoring atlases, or to submit other jobs, manually or automatically, once a simulation is completed, for a more comprehensive evaluation atlas. For both the general driver and the UI, we will interact closely with EMBRACE WP4, both to use their framework as a basis and, in return, to provide them with the improvements developed in this task. The general driver will also describe the monitoring and evaluation data produced with CliMAF using structured metadata, to be published in catalogs for later searches. To reduce CliMAF execution time, the driver will prepare a parallel implementation of a subset of the core diagnostics from Task 5.4, to be run on a cluster or an HPC system. We plan to take advantage of the Swift library (or a similar tool) as a data-intensive, task-oriented workflow engine.
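As a purely illustrative sketch (the configuration syntax and the driver interface remain to be designed in this task, so every name below is hypothetical), a text-file atlas description could be parsed and turned into diagnostic-script invocations along the following lines:

    # Hypothetical sketch: parse a text-file atlas description and list the
    # diagnostic-script invocations the driver would submit (dry run only).
    import configparser
    import shlex

    ATLAS_DESCRIPTION = """
    [atlas]
    name        = surface_monitoring
    simulation  = MYSIMU-hist-r1    ; placeholder simulation identifier
    period      = 1980-2005
    diagnostics = seasonal_bias_maps, global_mean_timeseries
    """

    def build_commands(text):
        """Translate an atlas description into the script calls the driver would launch."""
        cfg = configparser.ConfigParser(inline_comment_prefixes=(";",))
        cfg.read_string(text)
        atlas = cfg["atlas"]
        commands = []
        for diag in (d.strip() for d in atlas["diagnostics"].split(",")):
            # One external script (NCL, R or Python) per diagnostic, as in Task 5.4.
            commands.append(
                f"run_diag.sh {shlex.quote(diag)} "
                f"--simulation {atlas['simulation']} --period {atlas['period']}"
            )
        return commands

    if __name__ == "__main__":
        for cmd in build_commands(ATLAS_DESCRIPTION):
            print("[dry run] would submit:", cmd)

In this sketch, the same description file could be handed to a batch scheduler for automatic submission during an ongoing simulation, which is the use case targeted above.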

  • Task 5.2: Services layer

Easing the development of diagnostic scripts is a basic requirement of CliMAF. This is achieved by providing ‘services’ to these scripts, in the form of a library of high-level functions which include and combine the following functionalities: locating the data (including on ESGF datanodes), fetching the data (if requested), selecting the data in the space-time domain, caching the data on local disk, computing derived variables (using compute-node facilities when available), re-gridding the data, and forwarding all metadata available in the original data. Typically, upon a function call for getting data, the Services layer will check step by step whether the required derived quantity or data file already exists in a local cache. If not, it will manage the required operations, from data access to the calculation of derived quantities or regridding, so as to provide the calling script with data ready for analysis. This upstream checking avoids recomputing the same derived variable several times, which is particularly valuable when using CliMAF for monitoring.
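The cache-first behaviour described above could, for instance, be sketched as follows; the cache location, key scheme and stand-in computation are placeholders, not the actual Services layer design:

    # Hypothetical sketch of the cache-first logic of the Services layer.
    import hashlib
    import pathlib
    import pickle

    CACHE_DIR = pathlib.Path("./climaf_cache")   # placeholder cache location

    def get_derived(variable, simulation, period, compute):
        """Return a derived variable, recomputing it only if it is not cached yet."""
        # Build a deterministic cache key from the request.
        key = hashlib.sha1(f"{variable}|{simulation}|{period}".encode()).hexdigest()
        cache_file = CACHE_DIR / f"{key}.pkl"
        if cache_file.exists():
            # Cache hit: no data access, no recomputation.
            return pickle.loads(cache_file.read_bytes())
        # Cache miss: locate/fetch/select the data and compute the derived quantity.
        result = compute(variable, simulation, period)
        CACHE_DIR.mkdir(parents=True, exist_ok=True)
        cache_file.write_bytes(pickle.dumps(result))
        return result

    # Toy usage with a stand-in computation.
    if __name__ == "__main__":
        fake_compute = lambda v, s, p: {"variable": v, "mean": 287.4}  # placeholder
        print(get_derived("tas_global_mean", "MYSIMU-hist-r1", "1980-2005", fake_compute))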

  • Task 5.3: Visualization tools

CliMAF will produce extensive data through the evaluation and monitoring diagnostics, both qualitative (maps, figures) and quantitative (scores, metrics), for every model run. Thorough analysis of these results using standard hands-on approaches is tedious for individual scientists and eventually becomes impossible as the number of simulations increases. Task 5.3 will provide a semi-automated solution to facilitate the comparison of diagnostic and validation data across dozens of simulations. It will develop the means to search, organize and represent these numerous post-processing data. Among other fundamental tools, Task 5.3 will develop a web application to steer the search and organize the results so as to facilitate subsequent, more synthetic analysis, notably for the standard versions of the CliMAF evaluation and monitoring atlases (see Task 5.4). Furthermore, diagnostic and validation data will be brought together and used in high-level joint graphical and statistical representations, exploiting the most recent web graphics technologies such as the Data-Driven Documents library (D3.js).
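Although the web front end itself would rely on JavaScript libraries such as D3.js, a minimal Python-side sketch of the kind of structured catalogue such a front end could consume might look as follows; the file name and record fields are purely hypothetical:

    # Hypothetical sketch: index figures and metrics produced by the diagnostics
    # into a JSON catalogue that a D3.js-based front end could query and display.
    import json

    results = [   # placeholder records; in practice one entry per figure or metric
        {"simulation": "MYSIMU-hist-r1", "diagnostic": "seasonal_bias_maps",
         "variable": "tas", "season": "DJF", "figure": "figs/tas_bias_DJF.png"},
        {"simulation": "MYSIMU-hist-r1", "diagnostic": "global_mean_timeseries",
         "variable": "tas", "metric": "rmse", "value": 1.23},
    ]

    with open("atlas_catalogue.json", "w") as f:
        json.dump({"atlas": "surface_monitoring", "entries": results}, f, indent=2)
    print("wrote atlas_catalogue.json with", len(results), "entries")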

  • Task 5.4: Evaluation and monitoring diagnostics

This sub-task will implement in CliMAF a collection of evaluation and monitoring diagnostics used at GAME, IPSL and CERFACS, using the basic features of the general driver and the functions of the Services layer. We will also collaborate with the DRAKKAR community to implement their ocean model monitoring and evaluation tools. Evaluation packages from the IS-ENES project may also be integrated. A core set of diagnostics and metrics will define the initial “standard” versions of the CliMAF evaluation and monitoring atlases, based on the classical atlases already in use in the partners' centers for model development. In addition to this core set, the contributors will enrich the standard collection with diagnostics that their expertise identifies as relevant for model evaluation or monitoring.
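As an illustration of the kind of core diagnostic targeted here (the actual scripts will come from the atlases already in use by the partners), the sketch below computes an area-weighted bias and RMSE of a model field against a reference field on a regular latitude-longitude grid, using synthetic data:

    # Hypothetical sketch of a core evaluation metric: area-weighted bias and RMSE
    # of a model field against a reference field on a regular lat-lon grid.
    import numpy as np

    def weighted_bias_rmse(model, reference, lat):
        """Return (bias, rmse) weighted by cos(latitude) to account for grid-cell area."""
        weights = np.cos(np.deg2rad(lat))[:, np.newaxis] * np.ones_like(model)
        diff = model - reference
        bias = np.average(diff, weights=weights)
        rmse = np.sqrt(np.average(diff ** 2, weights=weights))
        return bias, rmse

    if __name__ == "__main__":
        lat = np.linspace(-89.5, 89.5, 180)          # synthetic 1-degree grid
        reference = 288.0 + 20.0 * np.cos(np.deg2rad(lat))[:, np.newaxis] * np.ones((180, 360))
        model = reference + np.random.default_rng(0).normal(0.5, 1.0, reference.shape)
        bias, rmse = weighted_bias_rmse(model, reference, lat)
        print(f"bias = {bias:.2f} K, rmse = {rmse:.2f} K")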

Success criteria:

A first success criterion is that the climate model developers of the French community use CliMAF routinely during model development phases. The initial objectives will be exceeded if climate scientists also use CliMAF for climate research.

Risks and envisaged solutions:

  • Dealing with the variety of languages that potential script developers may wish to use. A fall-back solution should be designed, which could rely on using files to deliver data to scripts written in unsupported languages.
  • It may be difficult to put few constraints on the language of diagnostic scripts while providing advanced utilities to the developers; a trade-off has to be reached. In addition, shortcomings in the overall design would hinder extensibility. This risk will be addressed by requiring validation of the design by a large number of engineers and scientists in the partner institutes.
  • Searches require contextual descriptions of the diagnostic and validation data and their indexing in a catalogue. ESGF publishing or alternative solutions will be tested for that purpose.
  • There is no particular risk linked with the development of the evaluation and monitoring diagnostics themselves. The main risk of Task 5.4 is whether the diagnostics, notably the DRAKKAR tools and IS-ENES packages, can actually be implemented in the driver in a sustainable way. From this point of view, the success of Task 5.4 is closely linked to the success of Tasks 5.1 and 5.2.