Why ADD-TREES? Why Now?


Hi, I’m Danny Williamson, the Principal Investigator for ADD-TREES. I thought I’d kick off this blog with an overview of why we applied for this project and what we really aim to do (and why it needs to be done). Hopefully, as this blog evolves we can include technical ideas, demos of mathematics and code as well as walkthroughs of some of our latest decision support tools.

ADD-TREES was funded by a UKRI “AI for Net Zero” call that aimed to get AI and Net Zero researchers together to meet the challenge posed by Net Zero. I was already working with a fantastic team of Net Zero researchers on the Net Zero Plus project (you’ll spot Net Zero Plus in our web address, as we are part of that setup). That project (NZP) has a lot of funding, mostly for data collection and for modelling of trees, soils and ecosystem services. I lead the AI part of it, but that part was only really me, a PDRA and a year of Co-I time for Deyu Ming (Work Package 1 leader for ADD-TREES). The idea was to “bring the modelling together” and help users with decision support. It soon became clear that this was a mammoth task more suited to the label “a life’s work” than “work package 7” and a few years of funding.

I read the AI for Net Zero call as “what would you do if you could get a whole team of AI researchers working on these problems?”. Let’s just say it took longer to restrict the ideas to one manageable project than it did to come up with the initial list.

Before I come to what we said we’d do, it’s worth putting some context around Net Zero and what it means to me. We all identify as different things and a big part of my identity is “statistician” (or “mathematician” or “Bayesian”). The academic in me knows that, despite working on climate modelling and impacts problems my whole career, I am not an expert in “Net Zero”, nor why it’s important, nor how we should get there or the impacts of not making it. And then there is that other part of my identity: “daddy”. That part knows, better than most, but not as much as my colleagues, just what we’re risking by not getting this done, and is, frankly, terrified. My opinion, the roots of which lie somewhere between father and academic, is that, if we are to pull off Net Zero, getting Carbon Capture and Storage (CCS) right and funded with a solid profitable business model is essential in the long term. But the timescales, the interim damage to climate and the risks of society, government and business not coming together in time to ensure the right environment for successful global-scale deployment of it mean that we need solutions that abate/remove greenhouse gas emissions right now.

There are many excellent ideas and technologies for Greenhouse Gas Removal and many of them are supported by the other AI for Net Zero projects. Peatland restoration, biogas and enhanced rock weathering are all part of the solution. We’re looking at trees (right now), and these are essential and, in principle, ready for mass roll-out. The UK needs to accommodate further tree planting on 2.5% of its land mass (500,000 hectares), but this alone doesn’t get us to Net Zero. Only yesterday I put it to a colleague, who also happens to be an eminent environmental economist, that trees are like a bridging loan for Net Zero whilst we hope CCS comes online. I do see it like that, and he didn’t laugh my idea away, so that’s something.

If planting trees is like a bridging loan to Net Zero, then we need it pretty quickly, as is the nature of bridging loans. However, the where, what, when and how is fiercely complicated, and all of these questions can’t be addressed without the “what if?”. Just this week, our JULES-emulator technology has shown that planting conifers in parts of the south east, under all climate scenarios, leads to forests that appear to die off (in the simulations) without realising 20% of their carbon storage potential, seemingly because it becomes too hot and dry there in the next 20 years for that species (note caveats on not being a modeller and that the processes underlying these simulations are being studied by our modelling team). We know that planting in peaty soil can actually emit carbon. Policies to incentivise planting might accidentally export CO2, by forgoing important food production that requires more carbon intensive imports to be sought to replace it.

There’s more to it than just CO2 as well, with biodiversity, flood risk, water quality, income generation and more affected by any decision to plant trees. The expertise and the modelling for all these things is out there but it’s isolated, requires expertise to run and is expensive, in some cases requiring years of supercomputer time. That’s really how I became involved in Net Zero problems in the first place. Surrogate modelling or emulation, particularly with uncertainty quantification using Gaussian processes, lies somewhere in almost all of the research I do. The linking of models with surrogates (with uncertainty propagation) is something very recent in our field, something we’ve been developing theory for over the past few years, and it seemed ideally suited to these problems. All seemed well set up for a project (Net Zero Plus) to create and link surrogates for the trees/ecosystem services mentioned, and then to develop efficient decision support using the reported uncertainty. And we have worked on a lot of that, but only by working on a problem do you realise all of the things you really want to do but didn’t ask for the money to do.

Field parcel scale

Most decisions about planting and land use happen on the scale of the individual field parcel. That is true for the farmer deciding how best to manage a farm, for the landowner planning planting (and other land-use decisions) across their estate, and for policy makers, who might offer a subsidy in per-hectare form and whose subsidy, whilst impacting the whole country, still translates to decision makers of the first two types changing land use at parcel scale. But our models are too expensive to run at field parcel scale and the forcings required to run them range anything from 12km^2 to 1km^2 (and in the case of the latter, the national products are not actual model simulations nor statistical representations of them).

ADD-TREES, at its core, is all about trying to deliver decision support at field parcel scale. There are several AI innovations required here. The first and most obvious perhaps is a form of downscaling for climate projections and a means to put uncertainty on them. Downscaling is an active area in AI/UQ, but it’s been an active area in climate science and ecology for even longer. We’ve brought together experts in mechanistic downscaling, where known physical laws such as atmospheric lapse rate will perform much better than AI that needs to be trained, with experts in data fusion and spatio-temporal modelling, to enable flexible, local downscaling under uncertainty, so we can simulate trees, crops and soil at ultra-high resolution.
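To give a flavour of what mechanistic downscaling means, here is a minimal sketch of a lapse-rate correction that shifts a coarse grid-cell temperature to the elevation of an individual parcel. Everything in it is an assumption for illustration: the constant lapse rate of 6.5 K per km (a commonly quoted average; real lapse rates vary with season, humidity and location), the function name and the parcel elevations are all made up, and it says nothing about uncertainty. It is not the downscaling method used in the project.

```python
# Assumed constant environmental lapse rate, in kelvin per metre.
LAPSE_RATE_K_PER_M = 0.0065


def downscale_temperature(coarse_temp_c, coarse_elev_m, parcel_elev_m,
                          lapse_rate=LAPSE_RATE_K_PER_M):
    """Shift a coarse grid-cell temperature to a parcel's elevation.

    Parcels above the grid cell's mean elevation come out cooler,
    parcels below it come out warmer.
    """
    return coarse_temp_c - lapse_rate * (parcel_elev_m - coarse_elev_m)


# Hypothetical parcels sitting inside one coarse climate-model cell.
parcel_elevations_m = {"valley_field": 50.0, "hill_field": 450.0}
coarse_temp_c, coarse_elev_m = 12.0, 250.0

parcel_temps_c = {
    name: downscale_temperature(coarse_temp_c, coarse_elev_m, elev)
    for name, elev in parcel_elevations_m.items()
}
```

The point of the sketch is that nothing here needs training data: the physics does the work, which is why this kind of correction can outperform a learned model where the physical law is well known.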

Self-learning linked surrogates

Imagining the whole UK broken into more than 2.5M land parcels, and with expensive models needed to understand the response of trees, soil, crops and ecosystem services on those parcels, quickly leads to two thoughts:

  1. Surrogates/emulators are essential (and why would I be involved if they weren’t),
  2. Perhaps individual use cases would need bespoke surrogates.

Point 2 was particularly striking to me. If we are working with a network of farms in Scotland, we need downscaling for that area in Scotland and we need to understand ecosystem service response to planting (or policy) in that area. Rather than try to train surrogates everywhere (which presumably would need a UK-wide parcel-scale downscaling), it makes sense to downscale and fit emulators only to the region of interest. From there, it’s not far until you realise that you want to be able to fit surrogates anywhere a user might be interested, and from there that you don’t want to have to do that by hand. Our second big set of innovations is to develop automatic emulation and active learning for ecosystem service models. I will probably write a separate post about this, but I would say that I don’t think it’s generally possible to emulate any expensive model without supervision (I’ve been doing this 15 years and all models are different and throw up different challenges). However, when it’s the same model with the same structures, but with different forcings and decisions, then this seems not only possible but probably a good use of time to develop.
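For readers who haven’t met active learning for emulators, a toy version looks something like the sketch below. It assumes a zero-mean Gaussian process with a fixed squared-exponential kernel, a one-dimensional input and a made-up stand-in for the expensive model, and it uses the simplest possible rule: run the simulator next wherever the emulator is currently most uncertain. All names and settings are hypothetical, and real automatic emulation has to deal with far more than this (hyperparameter estimation, high-dimensional forcings, diagnostics).

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_new, jitter=1e-8):
    """Posterior mean and variance of a zero-mean GP at the points x_new."""
    k_tt = rbf_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    k_nt = rbf_kernel(x_new, x_train)
    mean = k_nt @ np.linalg.solve(k_tt, y_train)
    # Prior variance is 1.0 (the kernel's variance parameter).
    var = 1.0 - np.sum(k_nt * np.linalg.solve(k_tt, k_nt.T).T, axis=1)
    return mean, np.clip(var, 0.0, None)


def expensive_model(x):
    """Hypothetical stand-in for a costly simulator run."""
    return np.sin(6.0 * x) + 0.5 * x


def active_learning(n_initial=3, n_added=7):
    """Grow the design one run at a time, targeting maximum variance."""
    candidates = np.linspace(0.0, 1.0, 101)
    x_train = np.linspace(0.0, 1.0, n_initial)
    y_train = expensive_model(x_train)
    max_var_history = []
    for _ in range(n_added):
        _, var = gp_posterior(x_train, y_train, candidates)
        max_var_history.append(var.max())
        x_next = candidates[np.argmax(var)]  # most uncertain candidate
        x_train = np.append(x_train, x_next)
        y_train = np.append(y_train, expensive_model(x_next))
    mean, var = gp_posterior(x_train, y_train, candidates)
    return x_train, mean, var, max_var_history


x_train, mean, var, max_var_history = active_learning()
```

With the kernel held fixed, each new run can only shrink the posterior variance, so the largest remaining uncertainty falls as the loop proceeds. The hard part in practice is everything this sketch holds fixed.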

Calibration to evolving data streams: digital twinning

Some things that are important for trees, crops and other ecosystem services are not available to us (or anyone else) in existing data sets. Soil moisture, soil carbon and existing nutrients in the soil as a result of former land use are all mostly unavailable for individual fields, and yet yields and any carbon calculations are extremely sensitive to these. Surrogates lend themselves to calibration and data assimilation, which, coupled with unique access to field parcel level yield data from Defra and with the abundance of satellite observations, represents an opportunity to turn our decision support systems into digital twins that can adapt to data as it comes in, constraining the uncertainty due to these initial conditions in individual fields and hence offering unique tailored decision support. Our goal here is to use our innovations in training bespoke surrogates automatically to be able to learn key initial conditions, particularly for parcels currently used for farming, from data as it becomes available.
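As a cartoon of what “learning an initial condition from data as it arrives” means, the sketch below updates a belief about one unknown quantity (an initial soil moisture on a 0 to 1 scale) each time a new yield observation comes in. The linear surrogate, the observation noise, the flat prior and the yield values are all invented for illustration; the update itself is just Bayes’ rule evaluated on a grid, not the calibration machinery we will actually use.

```python
import math


def surrogate_yield(soil_moisture):
    """Hypothetical fast surrogate: yield as a function of soil moisture."""
    return 2.0 + 8.0 * soil_moisture


def update_posterior(grid, prior, observed_yield, obs_sd):
    """One Bayesian update of the grid weights given a yield observation."""
    weights = []
    for theta, p in zip(grid, prior):
        resid = observed_yield - surrogate_yield(theta)
        # Gaussian likelihood of the observation given this soil moisture.
        weights.append(p * math.exp(-0.5 * (resid / obs_sd) ** 2))
    total = sum(weights)
    return [w / total for w in weights]


# Candidate soil-moisture values and a flat prior over them.
grid = [i / 100 for i in range(101)]
posterior = [1.0 / len(grid)] * len(grid)

# Made-up yield observations arriving over successive seasons.
observed_yields = [5.1, 4.8, 5.3, 5.0]
posterior_sds = []
for y in observed_yields:
    posterior = update_posterior(grid, posterior, y, obs_sd=0.8)
    posterior_mean = sum(t * p for t, p in zip(grid, posterior))
    posterior_sds.append(math.sqrt(
        sum((t - posterior_mean) ** 2 * p for t, p in zip(grid, posterior))
    ))
```

Each observation narrows the posterior, which is the sense in which the twin “constrains the uncertainty” for an individual field. The update is only cheap enough to repeat like this because the surrogate is fast.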

Rethinking decision support

Armed with a fast digital twin that keeps updating itself to data and reports uncertainty in response to decisions you might make on the land (here tree planting), how does one engage the user in exploring their options within an app? A strong and motivating idea of this project, and of my involvement in decision support for Net Zero, is that using surrogates to optimise and so find the “best” policy or “best” planting strategy is both natural and obviously bad (after 5 mins of thinking). Separate post coming about this too, but optimisation can only be right if you have the true objective function, be it utility or an accurate consideration of all possible consequences of a decision that mean anything to you. Our decision support systems, no matter how sophisticated, nor how many things we include, will not and could never have all of that. So, once you feel optimisation is not a great idea, how about finding all decisions that would meet certain targets (like Net Zero), then showing those to the decision maker and taking it from there? This is what we want to do, but then how do we help users navigate that decision space once we can find it? The two main tasks here are the novel ideas for navigation and the finding of the space in real time in the first place.
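To make the contrast with optimisation concrete, here is a small sketch of the “find everything that meets the target” idea. It assumes a surrogate that returns a mean and standard deviation for carbon stored under each planting decision, treats that prediction as Normal, and keeps every decision whose probability of meeting the target is high enough. The surrogate, its coefficients, the units and the thresholds are all hypothetical, and brute-force enumeration like this would not scale to a real decision space.

```python
import math
from itertools import product


def prob_exceeds(mean, sd, target):
    """P(X >= target) for X ~ Normal(mean, sd)."""
    return 0.5 * (1.0 - math.erf((target - mean) / (sd * math.sqrt(2.0))))


def surrogate_carbon(broadleaf_ha, conifer_ha):
    """Hypothetical surrogate: mean and sd of carbon stored (arbitrary units)."""
    mean = 6.0 * broadleaf_ha + 9.0 * conifer_ha
    sd = 0.5 * broadleaf_ha + 1.5 * conifer_ha + 1.0
    return mean, sd


def feasible_decisions(target, min_prob, max_area_ha=10):
    """Return every planting mix likely enough to meet the target."""
    feasible = []
    for b, c in product(range(max_area_ha + 1), repeat=2):
        if b + c > max_area_ha:
            continue  # not enough land for this mix
        mean, sd = surrogate_carbon(b, c)
        p = prob_exceeds(mean, sd, target)
        if p >= min_prob:
            feasible.append({"broadleaf_ha": b, "conifer_ha": c, "prob": p})
    return feasible


options = feasible_decisions(target=50.0, min_prob=0.9)
```

Note that the output is a set of options rather than a single answer. The decision maker can then weigh up whatever the model leaves out (cost, landscape, biodiversity) among options that all meet the target.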

Concluding remarks

ADD-TREES is a short project containing goals that might represent a life’s work. I don’t mind that. Reaching Net Zero itself, if we can, needs to be a short project. I already see us taking big strides in this space and I look forward to sharing our ideas, successes and some of our problems in this blog as we move forward.