PhD: Unlocking understanding of floods and droughts through data assimilation and exascale computing

Lancaster University

Lancaster, UK 🇬🇧

Overview and Background

Floods and droughts are increasingly impacting society, ecosystems, and the environment, yet predicting when they will occur and what their effects will be remains a major scientific challenge. Soil–water interactions are at the heart of this challenge: they are pivotal in storing and releasing water in landscapes and in governing plant growth and nutrient cycling. However, these interactions are highly complex, and we currently rely on computationally intensive process-based models to understand them and to predict their influence on ecosystem services. With recent advances in satellite imagery and sensing, a wealth of soil moisture data and other relevant data products is now available that could transform these models and our understanding of the risks and impacts of floods and droughts. This studentship focuses on taking advantage of new exascale computing approaches to facilitate data assimilation, exploring how the fusion of big data with hydrological and biogeochemical soil-water process models can help unlock new insights and understanding.

Methodology and Objectives 

Teaser Project 1: The role of soil water storage in drought risk 

Objective: Estimate the contribution of soil water storage to mitigating or increasing drought risk in a case study catchment by combining remotely sensed soil moisture data and meteorological, hydrological and hydrogeological data with hydrological models.

​Soil water holding capacity can play a significant role in buffering droughts by storing moisture and supporting groundwater recharge. However, the interactions among precipitation, soil processes, surface flow, and groundwater are complex. Using data-driven methods to explore these relationships could improve our understanding of drought propagation. 

​Soil moisture is an important component in semi-distributed or distributed hydrological models. However, it is often poorly represented, and it is not routinely updated dynamically throughout the process of a simulation. If we can build data-driven models which relate precipitation to groundwater through soil water interactions in droughts, we could improve our hydrological models by including hybrid processes.

In this teaser, the student will begin to explore different approaches to assimilating remote sensing and in-situ monitoring data into hydrological models, using the Tweed as the case study, to enhance the models' representation of soil-water-groundwater interactions during droughts.

Teaser Project 2: The effects of droughts on long-term soil carbon cycling   

Objective: Improve process-based model representation of the long-term effects of droughts on plant growth and soil carbon through remote sensing data assimilation. 

​A lack of water can have large effects on plants, especially on annual crops where water conditions can severely affect the plant’s growth and survival. With changing water patterns and increasing frequency of prolonged dry periods, the effects on plant productivity are expected to be large, and there will be knock-on effects for soil carbon storage in the longer-term. 

​Remote sensing offers many data products that can provide us with data-based insights into plant productivity and soil moisture conditions. However, remote sensing of soil carbon is much more difficult, and understanding of the long-term response to changes in plant productivity still requires process-based models. 

In this teaser, the process-based model N14CP, which simulates plant-soil carbon cycling, will be adapted to assimilate remotely sensed Gross or Net Primary Productivity (GPP or NPP) and soil moisture data products during a known period of drought in the UK. Freely available datasets, for example from MODIS and SMAP, that match the spatial resolution of the model will provide a starting point. The adapted model will then be used to explore the long-term effects of droughts on soil carbon.

Shared methods and the pathway to PhD 

  • Both teasers share a focus on droughts and on data assimilation into process-based models. The student will explore a range of approaches, working up from direct insertion, through traditional data assimilation (e.g. Kalman or particle filtering), to ML-supported approaches (e.g. combining ensemble Kalman filtering with machine learning to reduce compute times). The student will also explore ML-based surrogate modelling to speed up process-based model simulation, for example with Recurrent Neural Network (RNN) methods such as Long Short-Term Memory (LSTM) networks, which have shown promise for emulating hydrological systems.
  • ​The two teasers can be developed into two full chapters focused on the use of data assimilation in determining drought risk and knock-on impacts for carbon cycling. 
  • The PhD can be further developed in a number of other directions, depending on the student’s interests, by: i) expanding the focus to floods; ii) developing scaling approaches to move up to catchment and national scales; iii) exploring two-way learning between data and models; iv) trialling real-time assimilation approaches that help move towards a digital twin.
  • Exascale/GPU computing will be fundamental in supporting ML data assimilation approaches and hybrid model simulations. For instance, the development of a generalisable ML surrogate model capable of simulating hydrological fluxes and storage processes at the land surface requires training on large spatially and temporally explicit datasets comprising satellite imagery, model outputs, and other relevant data sources. During training, the model must be exposed to as much information as possible to accurately learn the system’s responses to various inputs. A limited dataset reduces the likelihood that the model will capture the full spectrum of system behaviours. This limitation is particularly significant in non-linear systems, such as hydrological systems, where extrapolation beyond the training range becomes unreliable. GPUs will be vital to handling the data volumes needed to achieve this, enabling parallelisation of matrix operations on large training datasets. Greater computational capacity permits the use of larger datasets during training, thereby improving the robustness and generalisability of the surrogate model.
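
As a rough illustration of the first two rungs of the assimilation ladder mentioned in the list above, the sketch below contrasts direct insertion with a stochastic ensemble Kalman filter update for a single soil moisture state. It is a minimal sketch, not the project's method: the function names, the ensemble size, and all numerical values (forecast mean and spread, the "satellite" observation, and its error variance) are hypothetical, and the state is assumed to be observed directly.

```python
import numpy as np


def direct_insertion(forecast, observation):
    """Direct insertion: overwrite every modelled value with the observation."""
    return np.full_like(forecast, observation)


def enkf_update(ensemble, observation, obs_error_var, rng):
    """Stochastic ensemble Kalman filter analysis step for a scalar state.

    Assumes the state (soil moisture) is observed directly, so the
    observation operator is the identity. `ensemble` is a 1-D array with
    one forecast value per ensemble member.
    """
    forecast_var = np.var(ensemble, ddof=1)
    # Kalman gain: weight given to the observation relative to the forecast.
    gain = forecast_var / (forecast_var + obs_error_var)
    # Perturb the observation for each member so the analysis spread
    # reflects the observation uncertainty.
    perturbed_obs = observation + rng.normal(
        0.0, np.sqrt(obs_error_var), size=ensemble.shape
    )
    return ensemble + gain * (perturbed_obs - ensemble)


rng = np.random.default_rng(0)
# Hypothetical forecast ensemble of volumetric soil moisture (m3/m3).
forecast = rng.normal(0.30, 0.05, size=500)
# Hypothetical satellite retrieval and its error variance.
observation = 0.20
analysis = enkf_update(forecast, observation, obs_error_var=0.03**2, rng=rng)
inserted = direct_insertion(forecast, observation)

print("forecast mean:", forecast.mean())
print("EnKF analysis mean:", analysis.mean())
print("direct insertion mean:", inserted.mean())
```

Direct insertion trusts the observation completely and discards the model spread, whereas the ensemble Kalman update pulls the ensemble towards the observation in proportion to the relative uncertainties, which is why the analysis mean falls between the forecast mean and the observation.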

The student’s research will be connected to the Floods and Droughts Research Infrastructure at the UK Centre for Ecology & Hydrology, giving the student access to relevant research and data resources: https://www.ceh.ac.uk/our-science/projects/floods-and-droughts-research-infrastructure-fdri

Apply by 9 January, 2026
