
Not long ago, had I been asked about using Artificial Intelligence as a tool for generating weather forecasts, I would have been deeply sceptical. Pattern matching, which is, to put it over-simplistically, the starting point for AI, has been tried in the past without success. It seems that I was wrong; AI is far more complex than that. I had taken my eye off the ball.

This page was initially published in the Cruising Association Magazine.


AI studies

The Met Office, at the forefront of numerical weather prediction (NWP) since the 1960s, is now working with the Alan Turing Institute to develop an AI model. It says this has the potential to revolutionise forecasting. ECMWF, part-owned by the Met Office, is further down the track. It has been working with the Google GraphCast system and is now routinely publishing experimental forecasts, known as AIFS, for comparison with its operational IFS. The ECMWF Charts page has links to IFS and the experimental AIFS products.

These are early days, but results suggest that AI could outperform the operational NWP models. Those are physical models, which work by simulating, mathematically, the processes that drive the atmosphere.

Developing, or training, an AI system requires large amounts of past data. For this purpose, ECMWF are using the ERA5 dataset comprising their own re-analyses of historical data for the period 1979 to 2018 with a further 2 years for fine tuning. ERA5 goes back to 1940 but the early years had little data above the earth’s surface and no satellite data until the mid-1970s.

ECMWF AI trials

For the first trial period, each AI forecast starts from the latest operational IFS analysis. These analyses use data from many sources with quite different characteristics. This standard test of model performance shows that AI might outperform the current physical modelling. Whether it will do so when starting with raw data rather than the ECMWF pre-processed analyses has yet to be seen.

Root-Mean-Square-Errors of the height of the 500 hPa level. ©ECMWF
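For readers unfamiliar with the score shown in the figure, the following is a minimal sketch of how a root-mean-square error is calculated. The height values are invented for illustration, and the function name is my own. Operational verification normally also weights each grid point by the area it represents, which is left out here for simplicity.

```python
import math

def rmse(forecast, verifying):
    """Root-mean-square error between forecast and verifying values."""
    if len(forecast) != len(verifying) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(f - v) ** 2 for f, v in zip(forecast, verifying)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical 500 hPa heights in metres at four grid points
# (illustrative numbers only, not real ECMWF data).
analysis_heights = [5520.0, 5580.0, 5640.0, 5700.0]
forecast_heights = [5530.0, 5570.0, 5660.0, 5700.0]

score = rmse(forecast_heights, analysis_heights)
print(f"RMSE = {score:.1f} m")
```

The lower the score, the closer the forecast is, on average, to the verifying analysis.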

The data

There are relatively small amounts of accurate in situ data from fixed and moving Earth-based sources. The main platforms are land stations, ships on passage, oil rigs, tethered and drifting buoys, radio-sondes and aircraft. Before the computer and satellite age, these were all that we had from which to produce forecasts. Since the mid-1970s, thanks to satellites, there are vast amounts of data with highly variable characteristics in terms of what they actually measure and their spatial and temporal resolutions. There are, broadly speaking, two main groups of satellite data.

First, there are measurements of wind, globally at high levels and at sea level over the sea only. Tracking high level cloud and areas of high humidity detected by infrared sensors provides wind data at near jet-stream levels. Cloud temperature, effectively an indicator of height, is measured using satellite infrared sensing. Surface level winds are measured in broad swathes over the sea. These measurements use the scattering, by the wind-roughened sea surface, of radar beams fired at the sea from satellites in low Earth orbit (LEO).

Secondly, there are three ways of getting information related to temperature and humidity. Unfortunately, none measures air temperature and humidity directly. Since the mid-1970s, satellite sensors have measured the effects of temperature and humidity on radiances in the infrared and microwave parts of the electromagnetic spectrum. However, many different temperature or humidity profiles can have the same effect on radiation. A third, newer technique, used since 2007, is GPS radio occultation (GPSRO). This measures the occultation of radio transmissions from GPS or, more generally, from Global Navigation Satellite Systems. Satellites in low Earth orbit measure the bending angles, usually less than one degree, of GPS signals passing through our atmosphere. The bending depends on the refractive index of the air, which in turn depends on the density of the air. In the stratosphere, which is very dry, the air density is a good measure of temperature. Unfortunately, in the troposphere, the relative effects on air density of water vapour and temperature cannot be separated. Despite claims by some providers of forecasts, GPSRO cannot simulate radio-sonde profiles of temperature and humidity.
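The ambiguity between temperature and water vapour can be illustrated with a small sketch. It uses a widely quoted empirical relation for radio refractivity (the Smith-Weintraub formula), which is my addition and is not taken from the article or from any particular GPSRO processing system. The function names and the sample values are hypothetical.

```python
def refractivity(pressure_hpa, temp_k, vapour_hpa):
    """Radio refractivity N = (n - 1) * 1e6, Smith-Weintraub form."""
    dry_term = 77.6 * pressure_hpa / temp_k
    wet_term = 3.73e5 * vapour_hpa / temp_k ** 2
    return dry_term + wet_term

def dry_temperature(pressure_hpa, n_units):
    """Temperature that would explain N if water vapour were ignored."""
    return 77.6 * pressure_hpa / n_units

def vapour_needed(pressure_hpa, temp_k, n_units):
    """Vapour pressure that gives the same N at a different temperature."""
    return (n_units - 77.6 * pressure_hpa / temp_k) * temp_k ** 2 / 3.73e5

# Dry stratosphere (illustrative values): N gives the temperature back.
n_strat = refractivity(50.0, 215.0, 0.0)
print(dry_temperature(50.0, n_strat))

# Moist lower troposphere (illustrative values): ignoring the vapour
# gives a temperature that is far too cold.
n_tropo = refractivity(850.0, 283.0, 10.0)
print(dry_temperature(850.0, n_tropo))

# A warmer, slightly moister parcel produces exactly the same N.
e_alt = vapour_needed(850.0, 288.0, n_tropo)
print(e_alt, refractivity(850.0, 288.0, e_alt))
```

With one measured number and two unknowns, some other source of information is needed to decide between the warm, moist answer and the cool, drier one.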

The ECMWF Geographical Coverage pages give a good indication of the horizontal distribution of data being used. They are, however, grossly misleading about data volumes. A wind observation, from whatever source, provides two numbers: the west/east and south/north components. One microwave or infrared sounding can contain many tens or hundreds of numbers.

Sample page of ECMWF’s geographical data coverage ©ECMWF
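The two numbers carried by a wind observation, mentioned above, can be obtained from the familiar speed and direction as in this small sketch. The function name is my own, and the direction follows the usual meteorological convention of the direction the wind blows from.

```python
import math

def wind_components(speed, direction_from_deg):
    """Return (u, v): the west/east and south/north wind components.

    direction_from_deg is the direction the wind blows FROM, in degrees
    clockwise from north. Positive u is towards the east, positive v is
    towards the north. Units are whatever the speed is given in.
    """
    theta = math.radians(direction_from_deg)
    u = -speed * math.sin(theta)
    v = -speed * math.cos(theta)
    return u, v

# A westerly of 20 knots blows towards the east.
u_west, v_west = wind_components(20.0, 270.0)
print(round(u_west, 6), round(v_west, 6))

# A southerly of 15 knots blows towards the north.
u_south, v_south = wind_components(15.0, 180.0)
print(round(u_south, 6), round(v_south, 6))
```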

The data analysis needed to initialise any NWP model involves combining these markedly different data types with forecast data from the previous computer forecast, run 12 hours earlier for ECMWF and 6 hours earlier for most other global models. One of the problems is how to weight the different data types to prevent the large amounts of space-based data from swamping the accurate, time- and location-specific in situ data. Weightings are kept under continual review in order to optimise model performance.
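The idea of weighting can be shown with the simplest textbook case: one observation combined with one value from the previous forecast, each weighted according to its assumed error variance. This is a sketch of the principle only. It is not ECMWF's scheme, which is vastly more elaborate, and the function name and numbers are hypothetical.

```python
def analyse(background, obs, background_var, obs_var):
    """Combine a forecast (background) value with one observation.

    Each is weighted by the inverse of its assumed error variance,
    so the more trusted value pulls the result towards itself.
    """
    gain = background_var / (background_var + obs_var)
    analysis = background + gain * (obs - background)
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var

# Previous forecast says 10.0 degC; an observation says 12.0 degC.
# Equal trust in both: the analysis falls midway.
equal_value, equal_var = analyse(10.0, 12.0, 1.0, 1.0)
print(equal_value, equal_var)

# A less accurate observation (larger error variance) has less influence.
weak_value, weak_var = analyse(10.0, 12.0, 1.0, 4.0)
print(weak_value, weak_var)
```

Getting the assumed error variances right for each data type is, in essence, the weighting problem described above.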

To put data impacts into perspective, roughly 50% of the value of NWP derives from infrared and microwave soundings in nearly equal proportions. The high level wind data account for a little over 10%, GPSRO a little under 10% and in situ data a little over 20%.

What next?

So far, the ECMWF trials have only produced a subset of the operational IFS output. The next stages will be along three lines. The first will be to use the raw data as input rather than the IFS analyses. The second will be to produce all the data required by ECMWF. The third will be to use ensembles in order to generate probabilities.
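As a reminder of how an ensemble yields a probability, the simplest approach is to count the fraction of ensemble members that exceed a threshold. The sketch below assumes ten invented wind speeds and a hypothetical function name. Real systems use many more members and often calibrate the result.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members with a value above the threshold."""
    if not members:
        raise ValueError("need at least one ensemble member")
    return sum(1 for m in members if m > threshold) / len(members)

# Hypothetical 10-member ensemble of wind speed in knots at one point.
wind_members = [18.0, 22.0, 25.0, 30.0, 21.0, 27.0, 19.0, 24.0, 23.0, 26.0]

# Chance of the wind exceeding 22 knots, the top of Beaufort force 5.
prob = exceedance_probability(wind_members, 22.0)
print(f"Probability of more than 22 knots: {prob:.0%}")
```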

There must be uncertainty about how well machine learning will cope with upgrades to satellite instrumentation and the consequent changes in observational characteristics. There is also the ever-present problem, GPSRO excepted, of calibration drift.

As far as users are concerned, the main impact is likely to be that forecasts could be issued nearer the base data times, instead of after the current delay of 5 hours for most global models and 7 hours for ECMWF. Presumably, there could be AI limited area modelling using fine scale radar and satellite data as well as detailed topographic input.

Forecast texts

Some years ago, the Met Office attempted to generate Shipping Forecast texts automatically. However, the “rules” for that and, indeed, any marine forecast were a major stumbling block. AI could have a role here and it would be surprising if the idea were not being explored. The Inshore Waters texts might be an easier nut to crack than the much-maligned Shipping Forecast. Of course, it is fair to question the need for worded texts given the mass of computer output.

Finally

Implementation of an AI system as trialled by ECMWF would significantly reduce the computer resources used in the NWP analysis process. That would be a partial AI service. The bigger prize of a total AI forecast will be more difficult to achieve.

With my background as a scientist, and my experience as a meteorologist, I do not like the idea of depending on what would be, in effect, a black box system. The realist, practical forecaster says that if it works, then so be it! Watch this space!