An AI-Enabled ensemble method for rainfall forecasting using Long-Short term memory
Peer reviewed, Journal article
Published version
Date
2023
Original version
Mathematical Biosciences and Engineering. 2023, 20 (5), 8975-9002. 10.3934/mbe.2023394
Abstract
Rainfall prediction covers both forecasting whether rain will occur and projecting how much will fall over the modeled area. Rainfall depends on many interacting natural factors, such as temperature, humidity, atmospheric pressure, and wind direction, and this makes it difficult to predict with certainty. In this work, different machine learning and deep learning models are used to (a) predict the occurrence of rainfall, (b) project the amount of rainfall, and (c) compare the results of the different models for classification and regression purposes. The dataset used for rainfall prediction covers 49 Australian cities over a 10-year period and contains 23 features, including location, temperature, evaporation, sunshine, wind direction, and many more. The dataset contained numerous uncertainties and anomalies that caused the prediction models to produce erroneous projections. We therefore applied several data preprocessing techniques to remove these uncertainties and clean the data for more accurate predictions: outlier removal, class balancing for the classification task using the Synthetic Minority Oversampling Technique (SMOTE), and data normalization for the regression task using a standard scaler. XGBoost, Random Forest, Kernel SVM, and Long Short-Term Memory (LSTM) classifiers are used for the classification task, while Multiple Linear Regression, XGBoost, Polynomial Regression, Random Forest Regressor, and LSTM models are used for the regression task. The experimental results show that the proposed approach outperforms several state-of-the-art approaches, with an accuracy of 92.2% for the classification task, and a mean absolute error of 11.7% and an R² score of 76% for the regression task.
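Two of the preprocessing steps named in the abstract, SMOTE class balancing and standard scaling, can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe and which may well rely on library routines. The function names `standardize` and `smote`, the parameter defaults, and the sample values are all hypothetical and chosen only for illustration. The sketch uses only the Python standard library.

```python
import math
import random
import statistics


def standardize(column):
    """Z-score a list of numbers: subtract the mean, divide by the population std."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority samples to `base`, excluding `base` itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Illustrative values only (not taken from the paper's dataset):
# three "rain" days as [temperature, humidity], to be balanced against 8 "dry" days.
rain_days = [[22.1, 0.91], [19.4, 0.88], [24.0, 0.95]]
extra_rain_days = smote(rain_days, n_new=8 - len(rain_days), k=2)
scaled_temperature = standardize([row[0] for row in rain_days + extra_rain_days])
```

Each synthetic sample lies on the line segment between a minority sample and one of its nearest minority neighbours, so the minority class grows without exact duplicates. Any scaling statistics would normally be fitted on the training split only.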