A Methodology for Improved, Data-Oriented, Air Quality Forecasting

  • Ioannis Kyriakidis

    Student thesis: Doctoral Thesis


    Air quality has emerged as an acute environmental problem, especially in densely populated areas, with negative effects on health among its consequences. An air quality forecasting system can help to improve the situation by alerting the public to potentially high air pollution levels (enabling them to minimise their personal exposure) and by alerting the authorities (supporting the decision-making process and allowing them to take emergency measures).

    Creating an Air Quality (AQ) forecasting system with the aid of data-driven models presupposes the availability of historical data, which has to be appropriately pre-processed before it can be used. This pre-processing is required because environmental datasets often include measurement errors, noise, outliers and missing data. Moreover, it is important that the efficacy of any forecasting model be determined. This can be achieved by using appropriate indices to compare the model’s forecasts with the values subsequently observed.
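
    As an illustration of what such indices look like, the sketch below computes three widely used measures: mean absolute error, root mean square error, and Willmott's index of agreement. These are standard measures, not the new indices developed in the thesis, which this abstract does not define. The function names and the sample concentrations are hypothetical.

```python
import math


def mae(observed, forecast):
    """Mean absolute error between observed and forecast values."""
    return sum(abs(f - o) for o, f in zip(observed, forecast)) / len(observed)


def rmse(observed, forecast):
    """Root mean square error; penalises large errors more than MAE does."""
    squared = sum((f - o) ** 2 for o, f in zip(observed, forecast))
    return math.sqrt(squared / len(observed))


def index_of_agreement(observed, forecast):
    """Willmott's index of agreement: 1 means a perfect match.

    Undefined (division by zero) when every observed and forecast
    value equals the observed mean.
    """
    mean_obs = sum(observed) / len(observed)
    numerator = sum((f - o) ** 2 for o, f in zip(observed, forecast))
    denominator = sum(
        (abs(f - mean_obs) + abs(o - mean_obs)) ** 2
        for o, f in zip(observed, forecast)
    )
    return 1.0 - numerator / denominator


# Hypothetical pollutant concentrations, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 37.0]

print(mae(observed, forecast))
print(rmse(observed, forecast))
print(index_of_agreement(observed, forecast))
```

    Each measure emphasises a different aspect of forecast error, which is one reason a single index may not be sufficient to judge a model.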

    This thesis documents the research undertaken to:

    1) Investigate the process for developing computational intelligence and statistical methods to perform forecasting of environmental parameters;

    2) Develop a methodology that identifies the optimum data pre-processing methods and model characteristics that lead to the highest forecasting accuracy (i.e. the best combination of methods);

    3) Identify an optimum methodology to evaluate the forecasting performance of the data-driven models on an operational basis.

    The models developed to achieve the first aim are able to predict the values of the environmental parameters with a superior forecasting accuracy in comparison to previously published results. Moreover, a semi-automatic procedure was created to perform forecasting via data-driven models, which can be generalized and applied to other locations and is thus expected to be useful in developing and implementing operational air quality management and forecasting systems for environmental parameters.

    To address the second aim, the Daphne Optimization Methodology was introduced to optimise the selection of data pre-processing methods and data modelling algorithms in less time than the traditional use of optimization algorithms requires. The Daphne Optimization Methodology can be applied at each stage of the process, from selecting the input and target datasets (e.g. air quality and meteorological parameters) to forecasting the target parameter, taking into account specific performance optimization criteria. Such a holistic optimisation procedure appears in the literature for the first time as a result of this thesis.

    To achieve the third aim, two new forecasting performance indices were developed, which combine the characteristics of existing indices. The new indices are suitable for use in an automated operational forecasting system. In addition, a methodology was introduced that increases confidence in the forecasting performance estimated by different indices through confidence intervals that use relative weights, referred to as "penalties". When the new forecasting performance indices are combined with the use of penalties, confidence in the estimated forecasting performance of a model is higher than with any single measure studied.

    The proposed new forecasting performance indices and the Daphne Optimization Methodology provide the necessary framework to support the creation of an automated online air quality forecasting system.
    Date of Award: Jun 2018
    Original language: English
    Supervisor: Andrew Ware
