The images below represent one of a number of different ways that the forecasts
produced by Deep Thunder can be evaluated and verified.
To evaluate the model forecasts, one must consider appropriate metrics. Traditionally,
these are defined from a meteorological perspective. That is, how well do the
forecasts correspond to reality and how do they compare to other forecasts?
Therefore, the first type of evaluation will be along those lines.
Later, additional metrics will be defined and presented here, which are based upon how
the forecasts are used and the weather sensitivity of the particular business problems for
which the model is being applied.
The first type of evaluation compares both the Deep Thunder results and
the NCEP North American Model (NAM) continental-scale forecasts to near-surface observations at weather stations
operated by the National Weather Service,
known as METARs
(METeorological Aerodrome Reports). The observations are available at essentially
random locations roughly every hour, but with variation in the time of measurement by up to 20
minutes. The data are made available courtesy of the
National Weather Service via their
NOAAport
data transmission system. The NOAAport system used for this
project was developed by Planetary
Data, Inc.
This process turns out to be less
straightforward than one might imagine due to a variety of inconsistent samplings in space, time and
observables (e.g., differences in precision and error). From the Deep Thunder forecasts, there are
standard weather variables at the surface
at specific grid points at 16, 4 and 1 km resolution available every 10 minutes of forecast time.
From the NAM forecasts, although computed frequently at 12 km resolution, surface data are
only available at 40 km resolution every three hours.
The first step, then, is to bilinearly interpolate the results of both models to the locations of
the observing stations. Since the measurements are actually taken above the surface (2m for
temperature, humidity, etc. and 10m for wind), and the model topography is only an
approximation of the actual station elevation, a simple correction is applied.
Adjustments to pressure and temperature are based upon the lapse rate difference between
actual and model elevations. Later, corrections to temperature and winds
will be made by invoking similarity theory. Then the
Deep Thunder results are averaged for every hour and interpolated to the time of the measurement. The
NAM results are processed in a similar fashion but for every three hours. Then a variety of statistics
are computed, for each model forecast in total, by time and by location. Only a handful of those statistics are
shown herein. There are also occasional problems with the quality and availability of the observations, as well as noise
due to the measurement process, which impact the results. Although simple quality control is used to eliminate
measurements that are clearly out of range, it is often insufficient.
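As a rough sketch of the processing just described, the following hypothetical Python fragment illustrates bilinear interpolation of a model surface field to a station location, a lapse-rate elevation correction, and simple range-based quality control. The function names, the standard-atmosphere lapse rate, and the quality-control bounds are assumptions for illustration, not the actual Deep Thunder implementation.

```python
import numpy as np

# Assumed standard-atmosphere lapse rate; the actual value used by
# Deep Thunder is not specified in the text.
STD_LAPSE_RATE = 0.0065  # K per metre


def bilinear(field, x, y):
    """Bilinearly interpolate a 2-D model field at fractional grid indices (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * field[y0, x0]
            + dx * (1 - dy) * field[y0, x0 + 1]
            + (1 - dx) * dy * field[y0 + 1, x0]
            + dx * dy * field[y0 + 1, x0 + 1])


def correct_temperature(t_model, z_model, z_station, lapse=STD_LAPSE_RATE):
    """Adjust a model temperature from the model terrain height to the
    actual station elevation using a constant lapse rate."""
    return t_model + lapse * (z_model - z_station)


def qc_in_range(obs, lo, hi):
    """Simple quality control: keep only observations within plausible bounds."""
    return obs[(obs >= lo) & (obs <= hi)]
```

A station falling between four grid points would first be located in fractional grid coordinates, interpolated with `bilinear`, and then adjusted with `correct_temperature` before any statistics are computed.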
In addition, this approach to verification is better suited to synoptic-scale or even global-scale models and to
traditional forecast analysis. For example, small errors in the phase, timing or
location of weather "events" in the model forecast (i.e., high-amplitude "features") can often be
manifested as significant error in the model results when using these techniques, even when the model
provides good skill at forecasting events realistically.
This situation is exacerbated by the limited sampling of data from metar sites in space (55 within the 4km nest and
only 9 within the 1km nest) and time (roughly once per hour).
To address some of these limitations, a small "mesonet" (network of weather stations) is being installed at a number
of locations within the 4km and 1km nests to enable nearly continuous observations of local weather.
Some of these issues are discussed further in papers that outline on-going efforts to evaluate and verify the
forecasts. The first paper focuses on specific
events and long-term performance. Another paper describes the performance of snowstorm forecasts during the 2002-2003 winter.
Verification of the Most Recent Temperature and Dew Point Forecast
The first example shows temperature and dew point results only for those model results and
observations that are within the 4km and 1km nests of the Deep Thunder forecasts.
Each curve represents one of the
variables from either Deep Thunder or NAM, each in a different color.
Each curve is plotted as a function of forecast
time along the x-axis with two statistics being shown simultaneously.
The 24-hour model forecast is compared against the most recently available observations.
Typically, the current contents will reflect an evaluation of a model completed about a day ago.
The y-axis is bias
while the z-axis is root mean square error (both in kelvin or degrees C.). Hence, a negative
(or positive) bias for temperature implies that the model is too cool (or too warm). A negative
(or positive) bias for dew point implies that the model is too dry (or too moist). Root mean
square error is a common metric for forecast accuracy. Thus, better results are toward the bottom of the
z-axis (i.e., 0), implying closer correspondence to the observations. The combination of these metrics enables
one to see correlation between bias and accuracy as a function of forecast time.
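For concreteness, the two statistics plotted here can be sketched as follows. This is a minimal, hypothetical illustration of the sign convention described above, not the actual verification code.

```python
import numpy as np


def bias(forecast, observed):
    """Mean forecast-minus-observation error.
    Negative for temperature means the model is too cool."""
    return np.mean(forecast - observed)


def rmse(forecast, observed):
    """Root mean square error; 0 means perfect correspondence."""
    return np.sqrt(np.mean((forecast - observed) ** 2))


# Illustrative data: a forecast that runs consistently 1 K too cold.
t_fcst = np.array([270.0, 272.0, 275.0])
t_obs = np.array([271.0, 273.0, 276.0])
# bias(t_fcst, t_obs) -> -1.0 (model too cool)
# rmse(t_fcst, t_obs) ->  1.0
```

In the visualizations, these quantities are computed separately for each forecast hour, which is what allows bias and accuracy to be examined as a function of forecast time.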
Verification of the Most Recent Wind Speed Forecast
The second example shows wind speed results only for those model results and
observations that are within the 4km and 1km nests of the Deep Thunder forecasts.
One curve represents the Deep Thunder results while the other is for NAM, each in a different color.
Each curve is plotted as a function of forecast
time along the x-axis with two statistics being shown simultaneously.
The 24-hour model forecast is compared against the most recently available observations.
Typically, the current contents will reflect an evaluation of a model completed about a day ago.
The y-axis is bias
while the z-axis is root mean square error (in knots). Hence, a negative
(or positive) bias implies that the model is too slow (or too fast). Root mean
square error is a common metric for forecast accuracy. Thus, better results are toward the bottom of the
z-axis (i.e., 0), implying closer correspondence to the observations. The combination of these metrics enables
one to see correlation between bias and accuracy as a function of forecast time.
Instructions
For each of the three-dimensional plots above (and below), you can interact
with them by clicking and dragging your mouse inside the image
in a limited fashion.
If you are having problems viewing or interacting with this
animation, make sure your browser has Javascript enabled.
Typical Deep Thunder operations produce forecasts only twice
per day (e.g., 0Z and 12Z), while NAM results are received and
processed four times per day (0Z, 6Z, 12Z and 18Z). However, the former may be
generated more often or at other times. Hence, there may
be times when only NAM results are presented in these visualizations.
Verification of the Past Week of Temperature and Dew Point Forecasts
The first example shows temperature and dew point results only for those model results and
observations that are within the 4km and 1km nests of the Deep Thunder forecasts.
Each curve represents one of the
variables from either Deep Thunder or NAM, each in a different color.
Each curve is plotted as a function of forecast
time along the x-axis with two statistics being shown simultaneously.
All of the model results generated in the last week are compared against the appropriate observations.
The y-axis is bias
while the z-axis is root mean square error (both in kelvin or degrees C.). Hence, a negative
(or positive) bias for temperature implies that the model is too cool (or too warm). A negative
(or positive) bias for dew point implies that the model is too dry (or too moist). Root mean
square error is a common metric for forecast accuracy. Thus, better results are toward the bottom of the
z-axis (i.e., 0), implying closer correspondence to the observations. The combination of these metrics enables
one to see correlation between bias and accuracy as a function of forecast time.
Verification of the Past Week of Wind Speed Forecasts
The second example shows wind speed results only for those model results and
observations that are within the 4km and 1km nests of the Deep Thunder forecasts.
One curve represents the Deep Thunder results while the other is for NAM, each in a different color.
Each curve is plotted as a function of forecast
time along the x-axis with two statistics being shown simultaneously.
All of the model results generated in the last week are compared against the appropriate observations.
The y-axis is bias
while the z-axis is root mean square error (in knots). Hence, a negative
(or positive) bias implies that the model is too slow (or too fast). Root mean
square error is a common metric for forecast accuracy. Thus, better results are toward the bottom of the
z-axis (i.e., 0), implying closer correspondence to the observations. The combination of these metrics enables
one to see correlation between bias and accuracy as a function of forecast time.
Additional Instructions
If the forecast information presented on this page does not seem
to be current and you have visited this site recently, the results of the
previous visit may have been saved in your web browser's cache. If so,
you should change your cache settings (e.g., File->Preferences->Advanced->Cache
in Netscape and set the document comparison to "Every time"). When you
restart your browser, the problem should be solved. For your current session,
you should manually clear the cache and reload the page.
Currently, only visualizations showing statistics accumulated over time are shown.
They will be augmented with similar results accumulated geographically.
Later, verification results for precipitation will be shown.