Time -

Location: LISN Site Belvédère

Responsible science

Some experiences in Machine Learning and Data Science for High-Performance Computing resource management

Speaker: Danilo Carastan Dos Santos (UGA)

High-Performance Computing (HPC) platforms are growing in size and complexity. As an adverse side effect, their power demand is growing rapidly as well, and current top supercomputers require power at the scale of an entire power plant. To use this power more responsibly, researchers are devoting significant effort to devising algorithms and techniques that improve different aspects of performance, such as scheduling and resource management.
In this seminar, I will present some experiences using Machine Learning and Data Science techniques for efficient HPC resource management. I will first show a method that exploits HPC simulation and regression to create novel HPC scheduling policies. I will then show our current work on combining multiple data sources to classify and predict the energy behavior of HPC applications. I will conclude with some foreseeable ways to extend these experiences to other computing platforms and other dimensions of the problem.

Event location