# Research Opportunities

## Postdoctoral Researchers

I do not have vacancies for postdoctoral researchers at present, but any that arise will be listed here.

## Postdoctoral Fellowships

I very much encourage young researchers to apply for postdoctoral fellowships. I am happy to support and assist strong candidates who would like to apply for fellowships with MSSL as the host institution.

If you are interested in discussing this further then please email me, including '[Fellowship enquiry]' in the subject of your email. I receive many enquiries and so will only reply if your expertise is well matched to my research interests and there is a high chance of submitting a successful application.

More information on various fellowships can be found here:

- Royal Society (RS) University Research Fellowship (URF)
- Royal Society (RS) Newton International Fellowships (for researchers coming from abroad)
- Royal Society (RS) Dorothy Hodgkin Fellowships
- STFC Ernest Rutherford Fellowships (ERF)
- Royal Astronomical Society (RAS) Fellowships
- Leverhulme Trust Early Career Fellowships (ECF)
- Royal Commission for the Exhibition of 1851 Fellowships
- Marie Curie Fellowships (for researchers coming from Europe)
- Daphne Jackson Fellowships (for researchers returning from a career break)

## PhD Students

The PhD projects that I offer are typically multi-disciplinary and include a combination of cosmology, statistics, and informatics (e.g. machine learning, signal processing, harmonic analysis, etc.). A relatively strong mathematical background is usually required for these types of projects. Strong programming skills are also an advantage.

If you are interested in discussing PhD projects further then please email me, including '[PhD enquiry]' in the subject of your email, and attach a CV. I receive many enquiries and so will only reply if your expertise is well matched to my research interests and there is a high chance of submitting a successful application.

Further information on how to submit an official application to MSSL can be found here.

Further information on how to submit an official application via the UCL CDT in Data Intensive Science can be found here.

Brief overviews of current projects on offer are given below.

### PhD project: Generative and statistical AI for simulation-based inference (SBI) for Euclid dark energy

The current evolution of our Universe is dominated by the influence of dark energy and dark matter, which constitute 95% of its content. However, an understanding of the fundamental physics underlying the dark Universe remains critically lacking. Forthcoming experiments have the potential to revolutionise our understanding of the dark Universe. In particular, ESA’s Euclid satellite launched successfully in July 2023 and is currently collecting data, with the first data release (DR1) scheduled for 2026. Sensitive and robust AI techniques are required to extract cosmological information from the weak signatures of dark energy and dark matter in the observations that will be made by Euclid, in order to better understand the nature of our Universe. In particular, Euclid data should allow us to answer the fundamental question of whether dark energy can be described by Einstein’s cosmological constant or whether a varying equation of state is required.

In this project we will develop a next-generation analysis pipeline for weak lensing (cosmic shear) based on field-level simulation-based inference (SBI) for both parameter estimation and model selection. Field-level SBI has recently been demonstrated to provide the tightest cosmological constraints to date, although existing approaches are not suitable for upcoming Euclid data. Our pipeline will: (i) extract field-level information that contains a wealth of cosmological information beyond two-point statistics; (ii) not make any assumptions regarding the likelihood; (iii) capture all uncertainties; and (iv) accurately model systematic effects.

To achieve this we will develop statistical AI models based on neural density estimators (e.g. normalizing flows, diffusion models), AI models to compress field-level information down to a low-dimensional latent space, and generative AI models to accelerate the creation of simulations used for training. Due to the wide field of Euclid observations, we will integrate geometric AI components into these models to accurately model the spherical geometry of the celestial sphere on which Euclid observations are made.
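The overall shape of likelihood-free inference (simulate, compress, then infer without ever evaluating a likelihood) can be illustrated with a deliberately simple toy. The sketch below is not the pipeline described above: it swaps the neural density estimator for plain rejection sampling, the learned compression for a sample mean, and the cosmological simulator for Gaussian noise. The function names, the flat prior, and the tolerance `epsilon` are all assumptions made for illustration.

```python
import random
import statistics


def simulate(theta, n_obs, rng):
    """Toy simulator: n_obs noisy draws centred on the parameter theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n_obs)]


def compress(data):
    """Compress a simulated data set down to a one-dimensional summary."""
    return statistics.fmean(data)


def rejection_sbi(observed, n_sims, epsilon, rng):
    """Keep prior draws whose simulated summary lands within epsilon of the observed one."""
    target = compress(observed)
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # flat prior on theta (an assumption)
        summary = compress(simulate(theta, len(observed), rng))
        if abs(summary - target) < epsilon:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(1.5, 20, rng)  # stand-in for real data, generated with theta = 1.5
posterior_samples = rejection_sbi(observed, n_sims=20000, epsilon=0.1, rng=rng)
posterior_mean = statistics.fmean(posterior_samples)
posterior_std = statistics.stdev(posterior_samples)
```

The accepted draws approximate the posterior over `theta` given the compressed observation. Rejection sampling scales poorly with the dimension of the summary, which is one reason neural density estimators are preferred in realistic settings.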

The project is extremely timely and will exploit cutting-edge AI models to maximise the scientific return from imminent observations from the Euclid satellite. Some further details can be found in the slides here and references therein.

The student should have a strong mathematical background and be proficient in coding, particularly in Python. The student will gain extensive expertise during the project in AI, going far beyond the straightforward application of existing AI techniques, instead focusing on novel foundational AI approaches—including statistical, generative and geometric AI—and their application to novel problems in cosmology and beyond. The expertise gained in foundational AI will prepare the student well for a future career either in academia or industry. In particular, generative, statistical and geometric AI are specialities highly sought after in industry by the big tech companies and many others.

### PhD project: Generative, geometric AI for global weather prediction

Numerical weather prediction has historically focussed on the simulation of atmospheric physics across the Earth. Classical numerical weather forecasting methods are physically motivated and highly interpretable, but they are prohibitively expensive computationally and can induce parameterisation biases. These biases can often be severe, particularly in forecasting of extreme precipitation events, which can lead to flash flooding. Recently, AI techniques have emerged as an alternative approach that is far more efficient computationally, avoids parameterisation biases, and can model non-linear dynamics in a data-driven manner. Importantly, AI approaches also facilitate the generation of prediction ensembles, from which one may consider probabilistic forecasting and the construction of digital twins. However, existing AI approaches to global weather prediction are based on standard planar AI techniques and do not account for the spherical geometry of the Earth.
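One simple way to see why planar methods mishandle global data: on a regular latitude-longitude grid, rows near the poles cover far less area than rows near the equator, so treating all pixels equally over-represents the polar regions. The sketch below is a toy illustration only, with a made-up zonally symmetric field and hypothetical function names. It compares an unweighted mean over grid rows with an area-weighted mean. For this field the true mean over the sphere is 1/3, while the unweighted grid mean is 1/2.

```python
import math


def latitude_centres(n_lat):
    """Midpoints of n_lat equal-width latitude bands, in radians."""
    step = math.pi / n_lat
    return [-math.pi / 2 + (i + 0.5) * step for i in range(n_lat)]


def field(lat):
    """Toy zonally symmetric field: squared height above the equatorial plane."""
    return math.sin(lat) ** 2


def planar_mean(lats):
    """Treat every grid row equally, as a planar method would."""
    return sum(field(lat) for lat in lats) / len(lats)


def spherical_mean(lats):
    """Weight each row by cos(latitude), which is proportional to its area on the sphere."""
    weights = [math.cos(lat) for lat in lats]
    return sum(w * field(lat) for w, lat in zip(weights, lats)) / sum(weights)


lats = latitude_centres(180)
naive = planar_mean(lats)        # close to 1/2: biased towards the poles
weighted = spherical_mean(lats)  # close to 1/3: the mean over the sphere
```

Area weighting fixes this particular statistic, but it does not make planar convolutions rotationally consistent on the sphere, which is the motivation for models built natively on the sphere.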

AI has been remarkably successful in the interpretation of standard (Euclidean) data, such as 1D time series data, 2D image data, and 3D video or volumetric data, now exceeding human accuracy in many cases. However, standard AI techniques fail catastrophically when applied to data defined on other domains, such as data defined over networks, 3D objects, or other manifolds such as the sphere. This has given rise to the field of geometric AI (Bronstein et al. 2017; Bronstein et al. 2021).

Geometric AI techniques constructed natively on the sphere are essential for next-generation global weather prediction models. McEwen and collaborators have recently developed efficient generalised spherical convolutional neural networks (Cobb et al. 2021) and spherical scattering networks (McEwen et al. 2022) that have shown exceptional performance. Recently, they have developed the DISCO framework, which is for the first time scalable to high-resolution data (Ocampo et al. 2022), opening up dense prediction tasks like weather prediction. Their DISCO framework provides dramatic computational savings, while also achieving state-of-the-art accuracy in all benchmark problems considered to date. More recently, they developed generative models based on spherical scattering covariances (Mousset et al. 2024).

In this project we will develop AI models that can forecast weather systems natively over the spherical globe, without the need for projections, leveraging the very recent developments discussed above for the construction of scalable geometric AI approaches on the sphere, coupled with other recent developments in Euclidean generative AI (e.g. generative conditional diffusion models). Such networks will be geographically unbiased, scalable to sub-kilometre resolution, and robust, with the potential to dramatically improve weather predictions. Due to the changing climate, extreme weather events are becoming increasingly common. While we must address the root causes of climate change, it is also essential to better predict extreme weather events in order to reduce their harmful impact on the planet and humanity. Often it is societies in the developing world, which are least responsible for climate change and least able to deal with its effects, that are most impacted. Given the importance of weather prediction, these next-generation geometric AI techniques will have significant societal and scientific impact in years to come.

The student should have a strong mathematical background and be proficient in coding, particularly in Python. The student will gain extensive expertise during the project in AI, going far beyond the straightforward application of existing AI techniques, instead focusing on novel foundational AI approaches—including statistical, generative and geometric AI—and their application to climate science. The expertise gained in foundational AI will prepare the student well for a future career either in academia or industry. In particular, generative, statistical and geometric AI are specialities highly sought after in industry by the big tech companies and many others.

### PhD project: Differentiable probabilistic deep learning with generative denoising diffusion models

This project is offered through UCL’s new CDT in Collaborative Computational Modelling at the Interface.

#### Background

Generative AI models for images, such as denoising diffusion models (e.g. Stable Diffusion), have recently demonstrated remarkable performance (Rombach et al. 2022). Such generative models can be adapted to solve scientific inverse problems, such as recovering maps of the dark matter of the Universe. However, current approaches typically recover a single prediction, e.g. recover a single image. For robust scientific studies, however, single estimates are not sufficient and a principled statistical assessment is critical in order to quantify uncertainties. Embedding denoising diffusion models in a principled statistical framework for solving inverse problems remains a topical open problem in the field. A number of approximate solutions have been proposed (e.g. Chung et al. 2023). McEwen and collaborators have recently developed the proximal nested sampling framework (Cai et al. 2022) for principled statistical inference for high-dimensional inverse imaging problems with convex likelihoods (initial code available at proxnest). Not only is the correct underlying posterior distribution targeted but the framework also supports computation of the marginal likelihood for principled Bayesian model comparison. Recently, the framework has been extended to support deep learned data-driven priors based on simple denoisers (McEwen et al. 2023), although not denoising diffusion models.

#### Main objectives

In this project we will develop a principled statistical framework to sample the posterior distribution of scientific inverse imaging problems that integrates the generative power of denoising diffusion models. This will be achieved by integrating denoising diffusion models into the proximal nested sampling framework. The resulting framework is expected to deliver superior reconstruction performance due to the power of generative diffusion models, to target the correct underlying posterior distribution, and to allow for Bayesian model comparison to assess different data-driven priors. The framework will be extended beyond convex likelihoods to handle general non-linear models by leveraging automatic differentiation and gradient-based likelihood constraints. Automatic differentiation will also be exploited to accelerate inference. While the focus will be mostly on methodological and code developments, the methods developed will be demonstrated on a number of inverse imaging problems in a range of fields.

#### Details of Software/Data deliverables

The main deliverable will be an open-source code implementing the framework developed. Development will involve differentiable programming, generative denoising diffusion models, and Markov chain Monte Carlo (MCMC) techniques. A number of articles will be prepared as the research progresses, targeting the main deep learning venues (e.g. ICLR, ICML, NeurIPS).

## Masters Students

### MSc project: Generative models of anisotropies induced in the cosmic microwave background (CMB) by cosmic strings

Cosmic strings are linear topological defects that may have been produced during symmetry-breaking phase transitions in the very early Universe. In an expanding Universe the existence of causally separate regions prevents such symmetries from being broken uniformly, with a network of cosmic strings inevitably forming as a result. Cosmic string networks would induce secondary anisotropies in the cosmic microwave background (CMB). Faithfully generating observables of such processes requires highly computationally expensive numerical simulations, which prohibits many types of analyses.

Generative models that instead rapidly emulate cosmic string observables based on scattering representations were developed in Price et al. (2023), although these are restricted to the planar setting and thus to small fields-of-view only. Recently, in Mousset et al. (2024) generative models on the celestial sphere based on scattering representations were developed and implemented in the s2scat code. In this project we will extend the generative models of cosmic strings of Price et al. (2023) to emulate fields over the full celestial sphere using the s2scat code. The student should have a strong background in mathematics, AI, and Python.

### MSc project: Robustness of statistical AI methods for Bayesian model comparison

Bayesian model comparison is a critical component of modern cosmological analyses to compare different underlying physical models. However, Bayesian model comparison requires computation of the model evidence, which involves computing a multi-dimensional integral that is highly computationally challenging. In McEwen et al. (2021) we developed the learned harmonic mean estimator to compute the Bayesian evidence, where AI techniques are integrated to solve the exploding variance of the original harmonic mean. Recently, in Polanska et al. (2024) we extended this framework to use normalizing flows as the internal AI model. More recently, in Piras et al. (2024) we demonstrate how the flexibility of such an approach allows a highly accelerated cosmological Bayesian inference pipeline.

In this project the primary aim will be to explore the sensitivity of the learned harmonic mean to training and test splits. In typical AI settings it is necessary to split data into training and test sets. However, there is no fundamental reason that such splits are necessary for statistical inference. We will perform an empirical study of the impact of training-test splits on final inference performance. The student should have a strong background in mathematics, statistics, AI, and Python.
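To make the question concrete, the re-targeted harmonic mean estimates the reciprocal evidence 1/z as the posterior average of φ(θ) / (L(θ) π(θ)), where L is the likelihood, π the prior, and φ a normalised target density that should sit inside the posterior. The sketch below is a minimal one-dimensional toy, not the published estimator: it assumes a conjugate Gaussian model so the evidence is known analytically, and it uses a shrunken Gaussian fitted to the samples as a stand-in for the learned model (the published work uses normalizing flows). All names and the `shrink` factor are hypothetical.

```python
import math
import random
import statistics


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


# Toy conjugate model (an assumption): prior N(0, 2^2), one datum y ~ N(theta, 1).
Y, SIGMA, PRIOR_STD = 1.0, 1.0, 2.0


def likelihood_times_prior(theta):
    """Unnormalised posterior density L(theta) * pi(theta)."""
    return gaussian_pdf(Y, theta, SIGMA) * gaussian_pdf(theta, 0.0, PRIOR_STD)


def learned_harmonic_mean(train, test, shrink=0.7):
    """Fit a narrow Gaussian target on `train`, then estimate the evidence on `test`."""
    mean = statistics.fmean(train)
    std = shrink * statistics.stdev(train)  # narrower than the posterior, to bound the variance
    ratios = [gaussian_pdf(t, mean, std) / likelihood_times_prior(t) for t in test]
    return 1.0 / statistics.fmean(ratios)  # the average estimates 1/z


# Exact posterior samples, available here only because the toy model is conjugate.
post_var = 1.0 / (1.0 / PRIOR_STD**2 + 1.0 / SIGMA**2)
post_mean = post_var * Y / SIGMA**2
rng = random.Random(1)
samples = [rng.gauss(post_mean, math.sqrt(post_var)) for _ in range(20000)]

z_true = gaussian_pdf(Y, 0.0, math.sqrt(PRIOR_STD**2 + SIGMA**2))  # analytic evidence
z_split = learned_harmonic_mean(samples[:10000], samples[10000:])  # separate train and test sets
z_no_split = learned_harmonic_mean(samples, samples)               # same samples for both
```

Comparing `z_split` and `z_no_split` against `z_true` over many seeds and sample sizes is the kind of empirical study the project has in mind, although with learned models far more flexible than a two-parameter Gaussian.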

## Internship Students

I am not offering any specific internship projects at present. However, if you are interested in discussing internship possibilities further then please email me, including '[Internship enquiry]' in the subject of your email, and attach a CV. I receive many enquiries and so will only reply if your expertise is well matched to my research interests and there is a high chance of a placement.