Hybrid Forecasting Model Based Data Mining and Cuckoo Search: A Case Study of Wind Speed Time Series
Xiangdong Xu1, Xi Song1, *, Qian Wang1, Zhiyuan Liu1, Jing Wang1, Zhiru Li2
Identifiers and Pagination:Year: 2016
First Page: 65
Last Page: 76
Publisher Id: TOEFJ-9-65
Article History:Received Date: 30/11/2015
Revision Received Date: 03/07/2016
Acceptance Date: 18/07/2016
Electronic publication date: 20/10/2016
Collection year: 2016
open-access license: This is an open access article licensed under the terms of the Creative Commons Attribution-Non-Commercial 4.0 International Public License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/legalcode), which permits unrestricted, non-commercial use, distribution and reproduction in any medium, provided the work is properly cited.
Wind energy is one of the fastest growing renewable energy sources; it is clean and pollution-free and has attracted increasing global attention. Wind speed forecasting plays a vital role in the wind energy field, but it has proven to be a challenging task owing to the effect of various meteorological factors. This paper proposes a hybrid forecasting model that preprocesses the original data effectively and improves forecasting accuracy. The developed model applies the cuckoo search (CS) algorithm to optimize the parameters of the wavelet neural network (WNN) model. The proposed hybrid method is examined on wind farms in eastern China, and the forecasting performance shows that the developed model is better than some traditional models.
Along with the continuous increase in world energy consumption and the vigorous development of traditional energy resources, fossil fuel reserves are decreasing, and the worldwide energy crisis is gradually becoming significant. Wind energy is one of the fastest growing renewable energy sources; it is clean and pollution-free and has been successfully adopted in many countries. Wind energy represents about 10% of energy consumption in Europe, and over 15% in America and Germany. In China, the installed wind power capacity was 75324 MW in 2012, a year-on-year growth rate of 24.1%, and 91413 MW in 2013, a year-on-year growth rate of 21.4%, ranking first in the world.
In recent years, wind speed forecasting has become increasingly important for wind energy applications and has attracted many researchers. According to the forecasting horizon, wind speed forecasting can be divided into short-term forecasting (from 30 min to six hours ahead), medium-term forecasting (from six hours to one day ahead) and long-term forecasting (from one day to one week or more ahead). Short-term wind speed forecasting has a great influence on grid reliability, while medium-term and long-term forecasting are mainly used for system maintenance and the planning of windmills, and provide important references for site location. According to the computational mechanism, wind speed forecasting methods can be divided into four categories: (a) artificial intelligence methods; (b) statistical methods; (c) spatial correlation methods; (d) physical methods. Of these, artificial intelligence methods are the most popular in the wind speed forecasting field.
Artificial intelligence methods adopt artificial intelligence theories or evolutionary algorithms to forecast wind speed, and with the rapid development of artificial intelligence techniques, some of these forecasting methods have grown fast. Artificial neural networks have been widely applied to wind speed time series owing to their ability to deal with nonlinearities; the methods in use mainly include the back propagation neural network, fuzzy neural network, support vector machine [6-12], etc. However, the main trouble with neural networks is that they need many neurons to solve mixed problems. To remedy this problem, wavelet technology is introduced into neural networks.
Recently, the wavelet neural network (WNN), which combines wavelet decomposition with feed-forward neural networks, has attracted much attention and has become a popular tool for function learning. The main problem of a WNN with fixed wavelet bases is the selection of the wavelet transforms, because the translation and dilation parameters of the wavelet basis are fixed and only the weights are adjustable. Suitable wavelet transforms improve accuracy, and several different methods, together with their algorithms, have been proposed to solve this problem [14-16].
Cuckoo search (CS) is a meta-heuristic approach proposed by Yang and Deb in 2009. Recent studies show that CS is potentially far more efficient than particle swarm optimization (PSO) and the genetic algorithm (GA) [18, 19]. Moreover, CS has fewer parameters to tune than GA and PSO, and thus it is potentially more generic and adaptable to a wider class of optimization problems. In light of these advantages, CS was used to solve five benchmark problems of reliability-redundancy allocation, and the results of the new approach were observed to be superior to the existing results in the literature.
Removing noise from data is an important and meaningful method of data preprocessing. Methods such as empirical mode decomposition [20-26] and wavelet decomposition [27-29] can be used to remove noise. However, wavelet decomposition is sensitive to the choice of the threshold, and empirical mode decomposition generally suffers from the phenomenon of mode mixing. Fast ensemble empirical mode decomposition can overcome this deficiency of empirical mode decomposition. Its basic idea is to use the statistical characteristics of noise to effectively avoid mode mixing.
In this paper, the hybrid fast ensemble empirical mode decomposition-cuckoo search-wavelet neural network (FEEMD-CS-WNN) model is developed, which combines the fast ensemble empirical mode decomposition method, the CS algorithm and the WNN model. The fast ensemble empirical mode decomposition method is used to decompose the original wind speed time series into one residual series and a group of intrinsic mode functions. A WNN model is then established, and the CS algorithm is used to optimize its parameters. Finally, the CS-WNN model is applied to forecast the data from which the highest-frequency component has been removed. The results show that the proposed FEEMD-CS-WNN model can effectively improve forecasting accuracy.
This paper is structured as follows. Section 2 summarizes our contribution. Section 3 gives a detailed discussion of the required mathematical tools. The proposed model is presented in Section 4. Experimental results for three real case studies and a comparison with previous methods are provided in Section 5. Section 6 contains the conclusions.
2. OUR CONTRIBUTIONS
An effective hybrid approach, named the FEEMD-CS-WNN model, is proposed to forecast wind speed. Based on the intrinsic characteristics of WNN, a series of suitable techniques, including data de-noising, data selection and optimization algorithms, were used to improve forecasting accuracy. A case study shows that the hybrid FEEMD-CS-WNN model performs better than single models and other hybrid models. Finally, we analyzed the forecasting errors based on statistical theory, which showed that the hybrid model is a suitable forecasting model of wind speed for Penglai city in Shan Dong province, eastern China.
Before presenting the hybrid FEEMD-CS-WNN forecasting model, it is necessary to describe the required mathematical tools.
3.1. Brief Reviews of WNN
WNN, based on wavelet transform theory, is an effective alternative to feed-forward neural networks for approximating arbitrary nonlinear functions. The WNN is designed as a three-layered structure with an input layer, a hidden layer and an output layer.
The network output of hidden layer is expressed as:
where N is the number of hidden nodes, x_i is the input to the network, w_ji and w_kj are the weights, and the basis function is chosen to be the Morlet wavelet activation function; a_j and b_j are the dilation and translation factors of the wavelet activation function. The Morlet wavelet is widely used as the activation function and is expressed by Equation 3.
Therefore, the network output is described as:
Thus, substituting (3) into (4), we may formulate input-output mapping realized by WNN as follows:
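The input-output mapping of a three-layer WNN can be illustrated with a short Python sketch. It is not the authors' implementation: it assumes the Morlet wavelet in the form cos(1.75 z) exp(-z^2 / 2), which is a common choice in the WNN literature, and the names `morlet`, `wnn_forward`, `w_in` and `w_out` are hypothetical.

```python
import math


def morlet(z):
    """Morlet mother wavelet (assumed form: cos(1.75 z) * exp(-z^2 / 2))."""
    return math.cos(1.75 * z) * math.exp(-0.5 * z * z)


def wnn_forward(x, w_in, a, b, w_out):
    """Forward pass of a three-layer wavelet neural network.

    x     : list of network inputs x_i
    w_in  : w_in[j][i] is the weight w_ji from input i to hidden node j
    a, b  : dilation a_j and translation b_j of each hidden node
    w_out : w_out[k][j] is the weight w_kj from hidden node j to output k
    """
    hidden = []
    for j in range(len(w_in)):
        # Weighted sum of the inputs, then shift by b_j and scale by a_j.
        net = sum(w_ji * x_i for w_ji, x_i in zip(w_in[j], x))
        hidden.append(morlet((net - b[j]) / a[j]))
    # Each output is a weighted sum of the hidden wavelet activations.
    return [sum(w_kj * h_j for w_kj, h_j in zip(row, hidden)) for row in w_out]
```

In this sketch the dilation and translation factors enter only through the argument of the wavelet, so an optimizer that tunes the WNN would search over `w_in`, `a`, `b` and `w_out` together.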
3.2. Fundamental Concepts of Ensemble Empirical Mode Decomposition
Ensemble empirical mode decomposition (EEMD), proposed by Wu and Huang, can overcome the deficiency of empirical mode decomposition. The basic principle of EEMD is to use the statistical characteristics of noise to effectively avoid the phenomenon of mode mixing. The observed data are a mixture of noise and the true time series; therefore, white noise is added to the raw data to help extract the real signals.
An evaluation experiment carried out by Y. Wang in 2014 showed that the time complexity of EEMD is almost equivalent to that of the Fourier transform, and a group of parameters was compared to validate the fast ensemble empirical mode decomposition, which is adopted here to process the raw wind speed time series.
3.3. Cuckoo Search Algorithms
Cuckoo search (CS) is one of the latest optimization algorithms. It was inspired by the obligate brood parasitism of some cuckoo species, which lay their eggs in the nests of host birds of other species.
CS algorithm is based on three idealized rules:
(1) Each cuckoo lays one egg at a time, and dumps its egg in a randomly chosen nest.
(2) The best nests with high quality of eggs will carry over to the next generations.
(3) The number of available host nests is fixed and the egg laid by a cuckoo is discovered by the host bird with a probability in the range 0–1. In this case, the host bird can either throw the egg away or abandon the nest, and build a completely new nest.
Based on the above-mentioned rules, the basic steps of cuckoo search can be summarized in pseudo code as follows:
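The three idealized rules can be sketched in Python as below. This is a minimal illustration under stated assumptions rather than the pseudo code of the original paper: it minimizes an objective inside a box, draws Lévy steps with Mantegna's algorithm (a common choice for CS), and uses hypothetical names and default values (`cuckoo_search`, `levy_step`, `n_nests`, `pa`, `alpha`). The toy `sphere` objective stands in for the WNN training error that a CS-WNN model would minimize.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """One Levy-distributed step length via Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, dim, bounds, n_nests=15, pa=0.25,
                  alpha=0.01, n_iter=200, seed=0):
    """Minimise `objective` over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(v):
        return min(max(v, lo), hi)

    def best_index():
        return min(range(n_nests), key=fitness.__getitem__)

    # The number of host nests is fixed (rule 3).
    nests = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_nests)]
    fitness = [objective(n) for n in nests]

    for _ in range(n_iter):
        # Rule 1: each cuckoo lays one egg (a Levy-flight move from its
        # current nest) and dumps it in a randomly chosen nest.
        for i in range(n_nests):
            egg = [clip(x + alpha * (hi - lo) * levy_step(rng)) for x in nests[i]]
            f_egg = objective(egg)
            j = rng.randrange(n_nests)
            if f_egg < fitness[j]:
                nests[j], fitness[j] = egg, f_egg
        # Rule 2: the best nest is carried over unchanged.
        keep = best_index()
        # Rule 3: every other nest is discovered with probability pa
        # and rebuilt at a completely new random location.
        for i in range(n_nests):
            if i != keep and rng.random() < pa:
                nests[i] = [rng.uniform(lo, hi) for _ in range(dim)]
                fitness[i] = objective(nests[i])

    k = best_index()
    return nests[k], fitness[k]


def sphere(p):
    """Toy objective used only for illustration."""
    return sum(v * v for v in p)
```

A call such as `cuckoo_search(sphere, dim=2, bounds=(-5.0, 5.0))` returns the best nest found and its objective value. Because the best nest is never abandoned, the best value cannot get worse from one generation to the next.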
4. HYBRID FEEMD-CS-WNN MODEL
The proposed FEEMD-CS-WNN approach for predicting short-term wind speed is based on the fast ensemble empirical mode decomposition method, the CS algorithm and the WNN model, and proceeds as follows. Firstly, fast ensemble empirical mode decomposition decomposes the raw data into a number of independent intrinsic mode functions; the first intrinsic mode function is removed and the remaining components are aggregated into a new data series. Secondly, a WNN model is established and the CS algorithm is used to optimize its parameters. Finally, the CS-WNN model is applied to forecast the data from which the highest-frequency component has been removed.
The process of the algorithm is described below, and the flow diagram of the FEEMD-CS-WNN model is shown in Fig. (1).
Step 1: De-noising. Use fast ensemble empirical mode decomposition to decompose the raw wind speed time series into a number of intrinsic mode functions and one residual series. Then remove the highest-frequency component and aggregate the remaining components into the new data.
Step 2: Building model. Establish the WNN model and use the CS algorithm to optimize the parameters of the WNN.
Step 3: Forecasting. Apply the CS-WNN model to forecast the new data from which the highest-frequency component has been removed.
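The aggregation in Step 1 can be sketched as follows. The sketch assumes that a decomposition is already available from some FEEMD implementation (not shown here) and that the intrinsic mode functions are ordered from highest to lowest frequency, as is conventional for empirical mode decomposition; the function name `remove_highest_frequency` is hypothetical.

```python
def remove_highest_frequency(imfs, residual):
    """Rebuild the series from every component except the first IMF.

    imfs     : list of intrinsic mode functions, highest frequency first
    residual : residual series of the same length as each IMF
    """
    kept = imfs[1:]  # drop the first (highest-frequency) IMF
    return [sum(c[t] for c in kept) + residual[t] for t in range(len(residual))]


# Tiny illustrative decomposition, not real wind speed data.
imfs = [[1, -1, 1], [2, 0, 2]]
residual = [10, 10, 10]
denoised = remove_highest_frequency(imfs, residual)
```

The de-noised series is what the CS-WNN model would then be trained on and asked to forecast in Steps 2 and 3.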
5. EXPERIMENTATION DESIGN AND RESULTS
In this section, the wind speed at three sites is predicted by the proposed model. The source of the data is introduced first, followed by the evaluation criteria of forecasting performance.
5.1. Data Sets
The hybrid FEEMD-CS-WNN algorithm is examined using three case studies of forecasting wind speed time series. The wind speed data, collected from Penglai city in Shan Dong province, China, were recorded at ten-minute intervals. The locations of the three selected offshore stations (Site 1: N37.48 E120.45, Site 2: N38.46 E118.5, Site 3: N36.75 E119.34) are shown in Fig. (2).
Fig. (1). Flow diagram of the FEEMD-CS-WNN model.
For each of the three sites, we select 2930 observation values; the initial 2900 values are used as the training sample, and the remaining 30 values are used as the test sample. The data sample is illustrated in Fig. (2B), which shows its fluctuation, irregularity and instability.
Fig. (2). The geographic location of the study area and real values of three sites.
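The split described above keeps the time order of the observations, so the test sample is simply the tail of the series. A minimal sketch, with the hypothetical helper name `split_series` and a placeholder series standing in for real wind speed data:

```python
def split_series(series, n_test=30):
    """Split a time series into a training head and a test tail."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]


# Placeholder series with the same length as one site's data set.
series = list(range(2930))
train, test = split_series(series, n_test=30)
```

With 2930 observations and `n_test=30`, `train` holds the initial 2900 values and `test` holds the remaining 30.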
5.2. Evaluation Metrics
The mean absolute error (MAE), mean absolute percentage error (MAPE) and mean square error (MSE) are used to assess the forecasting performance of the proposed model. The three indices are defined as follows:
where a_t is the actual value at time t, which is compared with the corresponding forecast value, and T is the total number of data points. Smaller values of these measures indicate more accurate forecasting results.
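The three indices can be computed as in the sketch below, which uses their standard definitions: the mean of the absolute errors, the mean of the absolute errors relative to the actual values (expressed in percent), and the mean of the squared errors. The function names are hypothetical, and `mape` assumes that no actual value is zero.

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean square error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Tiny illustrative example, not data from the study.
actual = [2.0, 4.0]
forecast = [1.0, 5.0]
print(mae(actual, forecast), mape(actual, forecast), mse(actual, forecast))
# prints: 1.0 37.5 1.0
```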
5.3. Study Cases
In this paper, there are three cases, and each study case is divided into two parts, experiment 1 and experiment 2. Experiment 1 uses the original wind speed data and experiment 2 uses data decomposed by the fast ensemble empirical mode decomposition technique. Both groups of data are predicted by the single WNN, GA-WNN, PSO-WNN and CS-WNN models. Experiment 1 aims to compare CS-WNN with the other three models (single WNN, GA-WNN and PSO-WNN), while experiment 2 aims to illustrate the effectiveness of the fast ensemble empirical mode decomposition method.
We performed our experiments on a standard workstation, denoted ‘C1’, based on an Intel Xeon W3680 CPU with 6 cores, 24 GB of memory and 12 MB of cache. We ran a 64-bit Linux OS and Matlab R2014b (184.108.40.206421) on C1. To avoid OS noise and caching effects, all tests were performed 20 times and the results were taken as the arithmetic mean.
5.3.1. Study of Case 1
In this case, each step is presented with tables and figures to show the performance of all models. Firstly, the wind speed forecasting results for the three sites are shown in Table 1.
The forecasting effectiveness for site 1 is shown in Fig. (3A). The left of the figure shows the performance of experiment 1, predicted by models that do not use the ensemble empirical mode decomposition method; the WNN, GA-WNN, PSO-WNN and especially CS-WNN models follow the real values. The right of the figure shows the performance of experiment 2, predicted by models that use the ensemble empirical mode decomposition method; the FEEMD-WNN, FEEMD-GA-WNN, FEEMD-PSO-WNN and especially FEEMD-CS-WNN models succeed in following the real values. Comparing experiment 1 and experiment 2, Table 1 and Fig. (3) show that the latter clearly improves on the former.
Fig. (3B) displays the comparison of the values predicted by the models with the real values, and the corresponding errors, for site 1. As seen from Fig. (3B), the errors of the FEEMD-WNN, FEEMD-GA-WNN, FEEMD-PSO-WNN and FEEMD-CS-WNN models are smaller than the errors of the WNN, GA-WNN, PSO-WNN and CS-WNN models. In particular, the FEEMD-CS-WNN model follows the actual wind speed closely and performs much better than the other models; its corresponding error is close to zero.
Fig. (3). Forecasting performance for site 1.
5.3.2. Results of Analysis
In this section, the comparison and forecasting effectiveness for the other two sites are shown in Figs. (4A and 5A), while the comparison of the values predicted by all models with the actual values, and the corresponding errors, are illustrated in Figs. (4B and 5B).
Similarly, as seen from Figs. (4A and 5A), the WNN, GA-WNN, PSO-WNN and especially CS-WNN models follow the real values, while the FEEMD-WNN, FEEMD-GA-WNN, FEEMD-PSO-WNN and especially FEEMD-CS-WNN models succeed in following the real values. We can conclude that experiment 2 outperforms experiment 1. As seen from Figs. (4B and 5B), the errors of the FEEMD-WNN, FEEMD-GA-WNN, FEEMD-PSO-WNN and FEEMD-CS-WNN models are smaller than the errors of the WNN, GA-WNN, PSO-WNN and CS-WNN models. In particular, the FEEMD-CS-WNN model agrees with the actual wind speed exceptionally well and performs much better than the other models; its corresponding error is close to zero.
Fig. (4). Forecasting performance for site 2.
Fig. (5). Forecasting performance for site 3.
Overall, FEEMD-CS-WNN is the most effective among the single WNN, GA-WNN, PSO-WNN, CS-WNN, FEEMD-WNN, FEEMD-GA-WNN and FEEMD-PSO-WNN models. We can conclude that the proposed FEEMD-CS-WNN model forecasts the wind speed time series more accurately; these forecasting models are discussed further in the next part.
5.3.3. Comparison and Discussion
Many different models have been proposed to handle wind speed forecasting. To verify the proposed FEEMD-CS-WNN method and demonstrate its superiority, we built additional models and compared them with our model; the results of the comparison are given in Table 2 and Fig. (6).
First, the forecasting values of the conventional single models are given in Table 2. As seen from Table 2, the MAE, MAPE and MSE of the WNN model are 0.2925, 6.02% and 0.1389 for site 1; 0.2859, 5.48% and 0.1185 for site 2; and 0.3593, 4.77% and 0.2203 for site 3, and the performance of the single WNN model is superior to that of the other single models. As seen from Figs. (3-5), the single WNN model can approximately describe the characteristics of the selected wind speed time series, and these single-model forecasting results show that selecting the single WNN model is a reasonable choice.
Then, the results of the GA-WNN, PSO-WNN and CS-WNN models are given in Table 1, and the performance of all hybrid models for the three sites is shown in Fig. (6). As seen from Table 1 and Fig. (6), the predictive performance of these models is better than that of the single WNN. Moreover, the three criteria of the CS-WNN model are the lowest, with MAE, MAPE and MSE of 0.214, 4.14% and 0.0648 for site 1; 0.2249, 4.91% and 0.0831 for site 2; and 0.2639, 3.6% and 0.1129 for site 3.
Fig. (6). Performance of three sites.
Finally, the comparison between Experiment 1 and Experiment 2 in Table 1 and Figs. (3-5) indicates that the fast ensemble empirical mode decomposition de-noising process is necessary for forecasting the wind speed. The MAE, MAPE and MSE of the hybrid FEEMD-CS-WNN model are 0.1128, 2.22% and 0.0167 for site 1; 0.1479, 3.15% and 0.0402 for site 2; and 0.1143, 2.24% and 0.0176 for site 3, which are the smallest in the experiments. Moreover, the FEEMD-CS-WNN model, which includes de-noising, constitutes a noteworthy improvement in wind speed forecasting, and the proposed hybrid model makes good use of the advantages of the ensemble empirical mode decomposition method, the CS algorithm and the WNN model.
Wind energy is one of the fastest growing clean and renewable energy sources, and wind speed forecasting plays a vital role in the wind energy field. However, wind speed forecasting has proven to be a challenging task owing to the effect of various meteorological factors. In this paper, a hybrid approach that combines the fast ensemble empirical mode decomposition method, the CS algorithm and the WNN model is proposed to address this problem.
Based on the evaluation criteria of forecasting performance (MAE, MAPE and MSE) and a statistical analysis supported by a group of tables and figures, the advantages of the proposed model are summarized as follows. Firstly, WNN, which combines wavelet decomposition and feed-forward neural networks, has become a very popular tool; as seen from the conventional single forecasting models in Table 2, WNN achieves good results and needs only a very short time. Secondly, the CS algorithm is used to optimize the parameters of WNN. As seen from Table 1 and Fig. (6), the results of the two experiments show that the CS algorithm is far more efficient than PSO and GA and that the FEEMD algorithm improves the forecasting performance of WNN considerably. This indicates that fast ensemble empirical mode decomposition, as an efficient decomposition method, can reduce the non-stationarity of the original wind speed data so that the WNN obtains high-precision forecasting results. Finally, the proposed FEEMD-CS-WNN model achieves the best results: its MAE, MAPE and MSE are 0.1128, 2.22% and 0.0167 for site 1; 0.1479, 3.15% and 0.0402 for site 2; and 0.1143, 2.24% and 0.0176 for site 3.
CONFLICT OF INTEREST
The authors confirm that this article content has no conflict of interest.
[1] Salcedo-Sanz S, Pérez-Bellido AM, Ortiz-García EG, Portilla-Figueras A, Prieto L, Paredes D. Accurate short-term wind speed forecasting by exploiting diversity in input data using banks of artificial neural networks. Neurocomputing 2009; 72: 1336-41.
[2] China and Power Installed Capacity in 2013. Available at: http://news.bjx.com.cn/html/20140328/500349.shtml
[3] Kavasseri RG, Seetharaman K. Day-ahead wind speed forecasting using f-ARIMA models. Renew Energy 2009; 34: 1388-93.
[4] Mohandes M, Halawani T, Rehman S, Hussain AA. Support vector machines for wind speed prediction. Renew Energy 2004; 29: 939-47.
[5] Cadenas E, Rivera W. Wind speed forecasting in three different regions of Mexico, using a hybrid ARIMA-ANN model. Renew Energy 2010; 35: 2732-8.
[6] Yun B, Yong L, Xiaoxue W, Jingjing X, Chuan L. Air pollutants concentrations forecasting using back propagation neural network based on wavelet decomposition with meteorological conditions. Atmos Pollut Res 2016; 7: 557-66.
[7] Jing Z, Zhen-Hai G, Zhong-Yue Su, Zhi-Yuan Z, Xia X, Feng L. An improved multi-step forecasting model based on WRF ensembles and creative fuzzy systems for wind speed. Appl Energy 2016; 162: 808-26.
[8] Flores P, Tapia A, Tapia G. Application of a control algorithm for wind speed prediction and active power generation. Renew Energy 2005; 30: 523-36.
[9] Santamaria-Bonfil G, Reyes-Ballesteros A, Gershenson C. Wind speed forecasting for wind farms: A method based on support vector regression. Renew Energy 2016; 85: 790-809.
[10] Mabel MC, Fernández E. Analysis of wind power generation and prediction using ANN: A case study. Renew Energy 2008; 33(5): 986-92.
[11] Monfared M, Rastegar H, Kojabadi HM. A new strategy for wind speed forecasting using artificial intelligent methods. Renew Energy 2009; 34: 845-8.
[12] Wang JZ, Wang Y, Jiang P. The study and application of a novel hybrid forecasting model-A case study of wind speed forecasting in China. Appl Energy 2015; 143: 472-88.
[13] Ho DW, Zhang PA, Xu J. Fuzzy wavelet networks for function learning. IEEE Trans Fuzzy Syst 2001; 9: 200-11.
[14] Zhang Q, Benveniste A. Wavelet networks. IEEE Trans Neural Netw 1992; 3(6): 889-98.
[15] Sanner MR, Slotine JJ. Structurally dynamic wavelet networks for adaptive control of robotic systems. Int J Control 1998; 70: 405-21.
[16] Doucoure B, Agbossou K, Cardenas A. Time series prediction using artificial wavelet neural network and multi-resolution analysis: Application to wind speed data. Renew Energy 2016; 92: 202-11.
[17] Rajabioun R. Cuckoo optimization algorithm. Appl Soft Comput 2011; 11: 5508-18.
[18] Wang J, Jiang H, Wu Y, Dong Y. Forecasting solar radiation using an optimized hybrid model by Cuckoo Search. Energy 2015; 81: 627-44.
[19] Yang XS, Deb S. Cuckoo search via levy flights. In: Proc of World Congress on Nature & Biologically Inspired Computing (NaBIC 2009). USA: IEEE Publications 2009; pp. 210-4.
[20] Guo ZH, Zhao WG, Lu HY, Wang JZ. Multi-step forecasting for wind speed using a modified EMPIRICAL MODE DECOMPOSITION-based artificial neural network model. Renew Energy 2012; 37: 241-9.
[21] Huang NE, Shen Z, Long SR. A new view of nonlinear water waves: The Hilbert spectrum 1. Annu Rev Fluid Mech 1999; 31: 417-57.
[22] Zhang C, Wei H, Zhao J, Liu T, Zhu T, Zhang K. Short-term wind speed forecasting using empirical mode decomposition and feature selection. Renew Energy 2016; 96: 727-37.
[23] Huang NE, Wu ML, Qu W, Long SR, Shen SS. Applications of Hilbert–Huang transform to non-stationary financial time series analysis. Appl Stochastic Models Data Anal 2003; 19: 245-68.
[24] Liang-Ying W. A hybrid ANFIS model based on empirical mode decomposition for stock time series forecasting. Appl Soft Comput 2016; 42: 368-76.
[25] Lai RJ, Huang N. Investigation of vertical and horizontal momentum transfer in the gulf of Mexico using empirical mode decomposition method. J Phys Oceanogr 2005; 35: 1383-402.
[26] Yeh JR, Sun WZ, Shieh JS, Huang NE. Intrinsic mode analysis of human heartbeat time series. Ann Biomed Eng 2010; 38(4): 1337-44.
[27] Haven E, Liu XQ, Shen L. De-noising option prices with the wavelet method. Eur J Oper Res 2012; 222: 104-12.
[28] Dennis C, Kiplangat K, Asokan K, Kumar S. Improved week-ahead prediction of wind speed using simple linear models with wavelet decomposition. Renew Energy 2016; 93: 38-44.
[29] Liu H, Hong-qi T, Di-fu P, Yan-fei L. Forecasting models for wind speed using wavelet, wavelet packet, time series and Artificial Neural Networks. Appl Energy 2013; 107: 191-208.
[30] Wu Z, Huang NE. EEMD: A noise-assisted data analysis method. Adv Adapt Data Anal 2009; 1: 1-41.
[31] Wang Y-H, Yeh C-H, Young H-W, Hu K, Lo M-T. On the computational complexity of the empirical mode decomposition algorithm. Physica A 2014; 400: 159-67.