Task 3 – Runtime Environment

Leader: A. Caubel (LSCE), M.-A. Foujols (IPSL)

Contributors: IPSL, CERFACS, IDRIS, CNRS-GAME

Objectives:

The climate modelling community is targeting larger ensembles of simulations performed with higher-resolution models and more advanced Earth system components, both in terms of physical processes and parallel programming complexity. A robust and reliable runtime environment is needed to run such simulations while making efficient use of computing resources. The libIGCM runtime environment handles the production of climate simulations – i.e. the orchestration of several hundred batch-scheduled jobs per simulation (both model execution tasks and output data post-processing tasks) – on different computing centres. The aim of this task is to develop the libIGCM workflow so that the runtime environment handles the complexity and load balancing of the parallel programming of the coupled model (MPMD mode, hybrid MPI-OpenMP parallelization, I/O tasks), ensuring robustness and reliability for all Earth System Model users as well as portability across different High Performance Computing centres.

  • Task 3.1: Process assignment (IPSL, IDRIS, CNRS-GAME)

The components of a climate model are codes that use either a pure MPI context or a hybrid MPI-OpenMP context to manage parallel computations and data communications. In most European ESMs, among which IPSL-CM and CNRM-CM, these components are coupled together with OASIS into a so-called MPMD (Multiple Program Multiple Data) application. MPMD applications with different levels of parallelism require specific support from the system job manager, which is not standard across computing centres. The aim of this sub-task is to work with computing centres to find the best way to assign computing processes and tasks to the cores and nodes of the supercomputer. The appropriate method will be implemented in the libIGCM runtime environment.
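As an illustration of the assignment problem, the sketch below (with purely hypothetical executable names, task counts and node size) lays out the MPI ranks of an MPMD coupled application on homogeneous nodes, taking into account the OpenMP threads attached to each rank; the actual placement directives depend on the batch system of each computing centre, and this sketch only computes the layout such directives would have to express.

```python
# Minimal sketch (hypothetical component names and sizes): distribute the MPI
# ranks of an MPMD coupled application onto homogeneous nodes, accounting for
# the number of OpenMP threads attached to each rank.
CORES_PER_NODE = 32          # assumption: homogeneous 32-core nodes

# (executable, MPI ranks, OpenMP threads per rank) -- illustrative figures only
components = [
    ("atm.x",  96, 4),            # hybrid MPI-OpenMP atmosphere
    ("oce.x", 128, 1),            # pure MPI ocean
    ("xios_server.exe", 8, 1),    # I/O servers (see Task 4)
]

rank = 0
node, core_in_node = 0, 0
layout = []                       # (global rank, executable, node, first core, threads)
for exe, nranks, threads in components:
    for _ in range(nranks):
        if core_in_node + threads > CORES_PER_NODE:   # rank does not fit: next node
            node += 1
            core_in_node = 0
        layout.append((rank, exe, node, core_in_node, threads))
        core_in_node += threads
        rank += 1

print(f"{rank} MPI ranks on {node + 1} nodes")
for r, exe, n, c, t in layout[:5]:
    print(f"rank {r:4d}  {exe:18s}  node {n:3d}  cores {c}-{c + t - 1}")
```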

  • Task 3.2: Optimization, Load balancing (IPSL, CERFACS, CNRS-GAME)

As with any coupled system, the complexity of a coupled climate application requires us to pay attention to the placement of tasks on the cores and nodes of the supercomputers, and to analyse in depth the performance of every model task and every independent component, in order to use computing resources efficiently.

Every independent component has to be analysed in terms of computing tasks and I/O tasks. Using XIOS in every component (see Task 4) will reduce the elapsed time of the simulation thanks to asynchronous parallel writing. We propose to develop an analysis tool to find the optimal number of XIOS servers needed so that each component model runs as well balanced as possible between computing tasks and I/O tasks.
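As a first idea of what such a tool would compute, the sketch below estimates, from a crude cost model with hypothetical figures, the smallest number of XIOS servers whose aggregate bandwidth hides the writing of one output period behind the computation of the next; the real tool would rely on measurements from instrumented runs rather than assumed constants.

```python
# Illustrative sizing sketch (not the proposed analysis tool): estimate how many
# XIOS servers are needed so that the asynchronous writing of one output period
# stays hidden behind the computation of the same period.  All figures are
# hypothetical assumptions.
import math

compute_time_per_period = 120.0   # seconds of computation per output period (assumed)
data_per_period_gib     = 40.0    # GiB written per output period (assumed)
server_bandwidth_gib_s  = 0.5     # effective GiB/s absorbed by one XIOS server (assumed)

# Smallest number of servers whose aggregate bandwidth drains the output of one
# period in less time than the next period takes to compute.
n_servers = math.ceil(data_per_period_gib /
                      (server_bandwidth_gib_s * compute_time_per_period))
print(f"{n_servers} XIOS server(s) keep I/O hidden behind "
      f"{compute_time_per_period:.0f}s of computation per period")
```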

Load balancing between components running in parallel also strongly influences the performance of a coupled model. An analysis tool, easily extendable to any coupler, has been developed by CERFACS (Maisonnave, 2012) to measure the performance of every component within the whole coupled model, to evaluate their scalability, and to find the optimal number of processes (1) to balance the duration of each independent component and (2) to speed up the whole system. We propose to extend its use to any IPSL-CM and CNRM-CM configuration and to provide additional information such as the OASIS coupling cost (interpolation, communications).
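The toy example below (not the CERFACS tool itself, and with invented timings) illustrates the underlying optimisation problem: given measured scaling curves for two components running concurrently, choose the core split that minimises the slower of the two, i.e. the step time of the coupled system.

```python
# Toy illustration of the load-balancing problem: pick the core split that
# minimises the coupled step time under a fixed core budget.  Timings are
# hypothetical and stand in for measurements produced by the analysis tool.

# time per coupled step (s) measured at a few core counts, per component
scaling = {
    "atmosphere": {64: 30.0, 128: 16.5, 256: 9.8, 512: 7.1},
    "ocean":      {64: 14.0, 128: 7.6,  256: 4.3, 512: 3.0},
}

total_cores = 576
best = None
for atm_cores, atm_t in scaling["atmosphere"].items():
    for oce_cores, oce_t in scaling["ocean"].items():
        if atm_cores + oce_cores > total_cores:
            continue
        step = max(atm_t, oce_t)          # components run concurrently
        if best is None or step < best[0]:
            best = (step, atm_cores, oce_cores)

step, atm_cores, oce_cores = best
print(f"best split under {total_cores} cores: "
      f"atmosphere={atm_cores}, ocean={oce_cores}, coupled step={step:.1f}s")
```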

In addition, a study will be carried out on the effect of the placement of computing (and I/O) processes on the different cores and nodes of the supercomputer, in order to understand the best way to map the different tasks of the whole Earth System model. This study will be performed on machines at different HPC centres (IDRIS and TGCC). The use of these analysis tools and the conclusions of the study on the optimization of process assignment, which will be useful for any coupled model on high-end supercomputers, will be put into practice in the libIGCM runtime environment.

  • Task 3.3: Climate Simulations Supervision (IPSL, IDRIS, CNRS-GAME)

A typical climate simulation keeps running for weeks within HPC centres; in particular, it must be able to span HPC maintenance periods. Over a running time of three weeks, one single typical simulation will produce and manage around 100,000 files representing 25 TB of data. The orchestration of several hundred batch-scheduled jobs (model and post-processing tasks) is necessary to complete one simulation. Climate simulation workflows, such as the IPSL simulation control environment (libIGCM), have an execution model that is strongly data-flow oriented and dynamically generated from a set of user-defined rules.
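The sketch below gives a minimal, purely illustrative picture of such a data-flow execution model (it is not libIGCM's actual implementation): a small set of user-defined rules is expanded into a graph of batch jobs, each listing the jobs it depends on.

```python
# Minimal illustration of a data-flow workflow generated from user-defined
# rules: simulation length, and which post-processing follows each period.
# Job names and rules are hypothetical, not libIGCM's actual conventions.

def build_job_graph(n_periods, post_processing=("pack_output", "monitoring")):
    """Return {job_name: [dependencies]} for a simulation of n_periods periods."""
    graph = {}
    for p in range(1, n_periods + 1):
        run = f"run_period_{p}"
        # each compute period restarts from the previous one
        graph[run] = [f"run_period_{p - 1}"] if p > 1 else []
        # post-processing of a period only needs that period's output
        for task in post_processing:
            graph[f"{task}_{p}"] = [run]
    return graph

graph = build_job_graph(n_periods=3)
for job, deps in graph.items():
    print(f"{job:16s} <- {', '.join(deps) if deps else '(start)'}")
```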

While statically configured workflows are sufficient for many applications, the notion of dynamic, event-driven workflows enables a fundamentally different class of applications. Climate simulations belong to this class: events that occur after the workflow has started can alter the workflow and what it is intended to accomplish. We may wish to change a workflow for a variety of reasons, and the events that trigger these changes may come from any possible source. Events may occur in external physical systems, in the application processes themselves, or in the computational infrastructure. Clearly, the ability to dynamically manage workflows is an important capability that will enable dynamic application systems and also improve reliability.

In addition to dynamic behaviour, an autonomic aspect is strongly needed in the climate simulation control environment. At the most general level, autonomic computing is intended to be “self-defining and self-healing”. To realize these behaviours at a practical level, we will develop a “supervisor agent” that accomplishes precisely the behaviours required for dynamic workflow management: (1) detect and understand failure events, (2) understand the ultimate goals of the workflow, and (3) be able to re-plan, reschedule and remap the workflow. Fault detection is essential; here we will rely on the service call tree to propagate cancellation events. This service call tree will be “short-circuited” and rescheduled in specific cases. This must be done even though the agent may have imperfect knowledge of the environment and limited control over it. Under these conditions, we must accept that dynamic workflows may only be capable of approximately accomplishing their goals. That is why tight integration and cooperation with HPC centres are essential to ensure portability. The supervisor agent will provide the following features and will be included in the IPSL simulation control environment (libIGCM):

  • All events logged in a comprehensive call tree (job submission, work to be done, each copy, etc.)
  • Reliable lightweight communication channel between client agents and server agents (very likely to be based on the RabbitMQ implementation of the Advanced Message Queuing Protocol, AMQP; a minimal sketch is given after this list)
  • Call tree traversal capabilities so as to determine checkpoint restart
  • Autonomous rescheduling of necessary jobs
  • Monitoring capabilities, for instance a coloured graph of all jobs and their status
  • Regression tests handling capabilities
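As an illustration of the communication channel mentioned in the list above, the sketch below shows how a client agent could publish a job-failure event to the supervisor agent through RabbitMQ, using the Python pika client; the queue name and event fields are hypothetical and do not describe libIGCM's actual protocol.

```python
# Minimal sketch of the lightweight communication channel, assuming RabbitMQ
# and the `pika` AMQP client; queue name and event fields are illustrative only.
import json
import pika

def publish_event(channel, event):
    """Send one workflow event to the supervisor's durable queue."""
    channel.basic_publish(
        exchange="",
        routing_key="libigcm.events",                       # hypothetical queue name
        body=json.dumps(event),
        properties=pika.BasicProperties(delivery_mode=2),   # persist the message
    )

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="libigcm.events", durable=True)

# a client agent reports a failed post-processing job; the supervisor agent,
# consuming the queue, can then walk the call tree and reschedule dependants
publish_event(channel, {
    "job": "pack_output_42",
    "state": "FAILED",
    "parent": "run_period_42",
})
connection.close()
```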

Success criteria:

Number of developments from Tasks 2.1 to 2.4 included in the reference versions of the IPSL and CNRM-CM coupled models. OASIS3-MCT implemented and used in both coupled models. Throughput of 2 simulated years per day for the IPSL coupled model at 1/3° resolution, and 4 simulated years per day for the high-resolution version of CNRM-CM (~50 km for the atmosphere, 1/4° for the ocean).

Identified risks:

Difficulty in hiring trained computer scientists with HPC skills, which may increase the time needed for some developments. Consequences: the target throughput may not be reached; parallel weight generation and interpolation may not be operational (low risk, as previous methods will still be usable). For the use of OASIS3-MCT, the risk is low as it is already used in a few coupled models and has already shown ease of use and good performance.