Lecture 2

Machine learning to fill spatiotemporal data gaps in hydrological monitoring networks

Description: This lecture focuses on the use of Machine Learning (ML) methods to fill spatiotemporal data gaps in hydrological monitoring networks. The session explains how ML can improve the accuracy and completeness of groundwater and surface water observations by modeling spatial and temporal dependencies that simpler gap-filling methods often overlook. Participants will learn how Random Forest (RF) and its spatially adapted variants, Random Forest for Spatial Data (RFsp) and Random Forest for Spatial Interpolation (RFSI), address uneven sampling, missing observations, and data clustering in hydrological datasets. The lecture will also explore how spatial features, neighborhood information, and autocorrelation can be integrated into ML frameworks to improve predictive reliability. In addition, it will introduce spatial cross-validation for reducing bias in model evaluation and reinforcement learning for optimizing sampling strategies. By the end of the session, participants will understand how ML-driven approaches can strengthen hydrological monitoring, support better resource management, and enable scalable, data-driven modeling of groundwater systems.

Objective: To develop a clear understanding of how Machine Learning techniques can be applied to address spatial and temporal data gaps in hydrological monitoring networks, improving the accuracy and reliability of groundwater assessments.

Duration: 2.25 hrs

Lecture Flow (content):

Machine learning to fill spatiotemporal data gaps in hydrological monitoring networks (20 min)
Machine learning as a generic framework for spatial prediction (40 min)
Challenges in implementing Random Forest and integrating spatial data (20 min)
Discussion questions (40 min)
Summary and final thoughts (20 min)

Outcomes: Participants will be able to identify suitable ML models for spatial and spatiotemporal prediction, integrate spatial features into model design, and apply spatial cross-validation techniques to enhance model performance.
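
Illustrative example: The two skills named in the outcomes can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the lecture's own implementation. The station coordinates and groundwater levels are invented, and the function names nearest_neighbor_features and spatial_block_folds are hypothetical. The first function builds neighborhood covariates in the spirit of RFSI (values of, and distances to, the nearest observations); the second assigns stations to cross-validation folds by grid block, one simple form of spatial cross-validation that keeps nearby stations out of each other's training and test sets.

```python
import math
from collections import defaultdict


def nearest_neighbor_features(stations, target_xy, n_neighbors=2, exclude=None):
    """Return [value_1, dist_1, value_2, dist_2, ...] for the nearest stations.

    `exclude` is the index of a station to skip, so that a station is not
    used as its own neighbor when building training rows.
    """
    tx, ty = target_xy
    ranked = sorted(
        (math.hypot(x - tx, y - ty), value)
        for i, (x, y, value) in enumerate(stations)
        if i != exclude
    )
    features = []
    for dist, value in ranked[:n_neighbors]:
        features.extend([value, dist])
    return features


def spatial_block_folds(stations, block_size, n_folds):
    """Put all stations that fall in the same grid block into the same fold."""
    blocks = defaultdict(list)
    for i, (x, y, _) in enumerate(stations):
        key = (math.floor(x / block_size), math.floor(y / block_size))
        blocks[key].append(i)
    folds = [[] for _ in range(n_folds)]
    # Deal blocks out to folds in a fixed order so the result is reproducible.
    for k, key in enumerate(sorted(blocks)):
        folds[k % n_folds].extend(blocks[key])
    return folds


# Hypothetical wells: (x_km, y_km, groundwater_level_m). Invented values.
stations = [
    (0.0, 0.0, 10.0),
    (1.0, 0.0, 12.0),
    (0.0, 1.0, 11.0),
    (10.0, 10.0, 30.0),
    (11.0, 10.0, 31.0),
]

# Covariates for the first well, built from its two nearest neighbors.
features = nearest_neighbor_features(stations, (0.0, 0.0), n_neighbors=2, exclude=0)

# Two folds from 5 km blocks: the clustered wells stay together.
folds = spatial_block_folds(stations, block_size=5.0, n_folds=2)
```

In practice the feature rows would be passed to a Random Forest implementation, and each fold would be held out in turn for testing. Grouping by block is only one possible design; the block size should be chosen relative to the range of spatial autocorrelation in the data.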