Which Methodology is Better for Combining Linear and Nonlinear Models for Time Series Forecasting?

Document Type: Research Paper


Department of Industrial and Systems Engineering, Isfahan University of Technology, Isfahan, Iran


Both theoretical and empirical findings suggest that combining different models can be an effective way to improve on the predictive performance of each individual model, especially when the models in the ensemble are quite different. Hybrid techniques that decompose a time series into its linear and nonlinear components are among the most important classes of hybrid models for time series forecasting. Several studies in the literature have shown that these models can outperform single models. In this paper, the predictive capabilities of three different models, in which the autoregressive integrated moving average (ARIMA) model as the linear component is combined with the multilayer perceptron (MLP) as the nonlinear component, are compared for time series forecasting. These models are Zhang’s hybrid ANNs/ARIMA model, the artificial neural network (p,d,q) model, and the generalized hybrid ANNs/ARIMA model. Empirical results on three well-known real data sets indicate that all of these methodologies can be effective ways to improve on the forecasting accuracy achieved by either component used separately. However, the generalized hybrid ANNs/ARIMA model is more accurate and performs significantly better than the other two models.
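
As a hedged illustration of the linear/nonlinear decomposition mentioned above, the additive form commonly associated with Zhang's hybrid model can be sketched as follows. The symbols ($y_t$, $L_t$, $N_t$, $e_t$, $f$, $n$, $\varepsilon_t$) are notation assumed here for illustration and do not appear in the abstract; the other two models compared in the paper combine the ARIMA and MLP components differently and are not covered by this sketch.

```latex
\begin{align}
  y_t &= L_t + N_t
      && \text{series = linear part + nonlinear part} \\
  e_t &= y_t - \hat{L}_t
      && \text{residual after the ARIMA fit } \hat{L}_t \\
  e_t &= f(e_{t-1}, e_{t-2}, \ldots, e_{t-n}) + \varepsilon_t
      && \text{MLP } f \text{ fitted to lagged residuals} \\
  \hat{y}_t &= \hat{L}_t + \hat{N}_t
      && \text{combined forecast, with } \hat{N}_t \text{ the MLP output}
\end{align}
```

Under this reading, the ARIMA stage is expected to capture the linear structure, and the MLP is trained only on what the linear model leaves unexplained.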

