This workshop introduces students to state-of-the-art software methodologies and technologies for handling, managing, and analyzing the large amounts of data coming from extensive simulations using innovative algorithms. It covers three main areas of interest: data visualization in scientific contexts, statistical methods for data analysis, and machine learning and deep learning, with a special focus on neural networks. Lectures present the mathematical principles and the relevant methodologies in detail, and hands-on training is provided on the topics covered. Part of the workshop is devoted to projects: each student is assigned a project in one of the three areas covered, including methods for model comparison.

 

Workshop Material

 

Recorded Lectures

Introduction

A (gentle) introduction to Machine Learning (Part 1)

A (gentle) introduction to Machine Learning (Part 2)

Fundamentals of Deep Learning

Data Storage and Management for HPC Scientific Applications

Scientific Data Format: The search of the holy GRAIL Scientific Data Format

Foundation of Parallel I/O

Scientific Data Format: The HPC FTW codes OR Climate Models

Scientific Production Best Practices on a Real Case Application

Scientific Data Format: The Incredible Life of Scientific Data

Quantum Technologies for Future Scientific Research

Scalable analysis of ultra-terabyte brain images: from low-level data management to deep learning

ECMWF's Extreme Data Challenges: Path to Exascale Numerical Weather Prediction

The Challenge of Big Data