A review of methods for estimating coefficients of objective functions and constraints in mathematical programming models

Document Type: Review article

Authors

Department of Industrial Management, Faculty of Management, Allameh Tabataba'i University, Tehran, Iran

Abstract

Given the high importance of optimization problems, this study evaluates mathematical programming models with respect to the methods used to estimate their coefficients. Correct and accurate data must be entered into a model to obtain accurate and robust results from it, and most of the input data in such models are technical coefficients and objective-function coefficients. It is therefore necessary to determine the information related to these coefficients with the utmost precision and, as far as possible, to develop a suitable scientific method for estimating their values [5]. Finding the best method for estimating the coefficients of a mathematical programming model can significantly improve the final values of the variables extracted from the model. For this reason, it is essential to study the methods used so far in this field and to examine their advantages and disadvantages. After reviewing 117 articles published between 1955 and 2022, this review investigated various methods for estimating the technical coefficients of mathematical programming models under conditions of uncertainty in decision-making. These methods include fuzzy methods, statistical methods, and data-driven analysis methods. The statistical methods examined include regression (linear, non-linear, and non-parametric), time series analysis, and exponential smoothing, together with machine learning and data mining methods. The data-driven methods discussed include decision trees, random forests, and the Lasso. After evaluating and comparing these methods, suggestions for choosing the best method are provided.
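
To make the coefficient-estimation idea concrete, the following minimal sketch (illustrative only, not taken from the reviewed articles) contrasts an ordinary least-squares estimate of a constraint's technical coefficients with a Lasso estimate [53] on hypothetical synthetic data. It assumes Python with NumPy and scikit-learn; all variable names and data values below are hypothetical.

    # Illustrative sketch: estimate the unknown technical coefficients of a
    # linear resource constraint a . x <= b from noisy historical observations,
    # comparing ordinary least squares with the Lasso [53].
    import numpy as np
    from sklearn.linear_model import Lasso, LinearRegression

    rng = np.random.default_rng(0)

    # Hypothetical data: 200 observed activity vectors x and the amount of a
    # resource each consumed, generated by a sparse "true" coefficient row.
    n_obs, n_vars = 200, 10
    true_coeffs = np.array([3.0, 0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0])
    X = rng.uniform(0.0, 10.0, size=(n_obs, n_vars))
    resource_used = X @ true_coeffs + rng.normal(0.0, 1.0, size=n_obs)

    # Ordinary least-squares estimate of the coefficient row.
    ols = LinearRegression(fit_intercept=False).fit(X, resource_used)

    # Lasso estimate: the L1 penalty shrinks negligible coefficients toward
    # exactly zero, which helps when only a few activities consume the resource.
    lasso = Lasso(alpha=0.1, fit_intercept=False).fit(X, resource_used)

    print("OLS estimate:  ", np.round(ols.coef_, 2))
    print("Lasso estimate:", np.round(lasso.coef_, 2))

With data of this kind, the Lasso typically sets the coefficients of the non-consuming activities to exactly zero, whereas least squares leaves small non-zero noise in every position; this sparsity-versus-bias trade-off underlies the variable-selection discussions in [25, 36, 53].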

Keywords

References

[1] R. Alikhani, A. Azar, and A. Rashidi Kamijan, Stochastic planning model for allocation of gas resources in Iran with energy security cost approach, Future Stud. Manag. 23 (2012), no. 96, 25–36.
[2] R. Alikhani and M. Sadegh Amal Nik, The fuzzy-stochastic multi-objective programming model for supplier selection problem, Standard Qual. Manag. 4 (2013), no. 12, 96–101.
[3] M. Amiri and S.A. Ayazi, Decision Making in Conditions of Uncertainty, Allameh Tabatabayi University, 2017.
[4] M. Amiri, A. Darestani Farahani, and M. Mehboob Ghodsi, Multi-Criteria Decision Making, Kian University Press, 2015.
[5] A. Azar, R. Farhi Bailoyi, and A. Rajabzadeh, A comparison of deterministic and fuzzy mathematical models in production planning (case: Shiraz Oil Refining Company), Modares Human Sci. J. 12 (2017), no. 1.
[6] A. Azar and M. Momeni, Statistics and its Application in Management, In Persian, Samt publication, 2006.
[7] D. Bertsimas and A. Thiele, Robust and data-driven optimization: modern decision making under uncertainty, Models, methods, and applications for innovative decision making, INFORMS, 2006, pp. 95–122.
[8] M. Biggs, R. Hariss, and G. Perakis, Optimizing objective functions determined from random forests, Available at SSRN: https://ssrn.com/abstract=2986630, (2018).
[9] G.E. Box and D.A. Pierce, Distribution of residual autocorrelations in autoregressive-integrated moving average time series models, J. Amer. Stat. Assoc. 65 (1970), no. 332, 1509–1526.
[10] S.P. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[11] L. Breiman, Random forests, Machine Learn. 45 (2001), no. 1, 5–32.
[12] F. Campanella, L. Serino, A. Crisci, and A. D’Ambra, The role of corporate governance in environmental policy disclosure and sustainable development. Generalized estimating equations in longitudinal count data analysis, Corporate Soc. Responsibility Environ. Manag. 28 (2021), no. 1, 474–484.
[13] S. Corsaro, V. De Simone and Z. Marino, Fused lasso approach in portfolio selection, Ann. Oper. Res. 299 (2021), no. 1–2, 47–59.
[14] S.F. Crone, S. Lessmann and R. Stahlbock, The impact of preprocessing on data mining: An evaluation of classifier sensitivity in direct marketing, Eur. J. Operat. Res. 173 (2006), no. 3, 781–800.
[15] G.B. Dantzig, Linear programming under uncertainty, Manag. Sci. 1 (1955), no. 3–4, 197–206.
[16] J.G. De Gooijer and R.J. Hyndman, 25 years of time series forecasting, Int. J. Forecast. 22 (2006), no. 3, 443–473.
[17] N. Deng, Y. Tian and C. Zhang, Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions, CRC Press, 2012.
[18] P. Diggle, K.Y. Liang and S.L. Zeger, Longitudinal Data Analysis, New York: Oxford University Press, 1994.
[19] C.F. Gauss, Theoria Motus Corporum Coelestium, Werke, 1809.
[20] J.W. Hardin and J.M. Hilbe, Generalized Estimating Equations, CRC Press, 2012.
[21] W. Hardle and E. Mammen, Comparing nonparametric versus parametric regression fits, Ann. Statist. 21 (1993), no. 4, 1926–1947.
[22] T.K. Ho, The random subspace method for constructing decision forests, IEEE Trans. Pattern Anal. Machine Intell. 20 (1998), no. 8, 832–844.
[23] M. Hollander, D.A. Wolfe, and E. Chicken, Nonparametric Statistical Methods, John Wiley & Sons, 2013.
[24] P. Howitt and D. Mayer-Foulkes, R&D, implementation and stagnation: A Schumpeterian theory of convergence clubs, Brown University, 2002.
[25] Y. Jiang, Variable selection with prior information for generalized linear models via the prior lasso method, J. Amer. Stat. Assoc. 111 (2016), no. 513, 355–376.
[26] Y. Jin, H. Wang, T. Chugh, D. Guo and K. Miettinen, Data-driven evolutionary optimization: An overview and case studies, IEEE Trans. Evol. Comput. 23 (2018), no. 3, 442–458.
[27] E.M. Kleinberg, On the algorithmic implementation of stochastic discrimination, IEEE Trans. Pattern Anal. Machine Intell. 22 (2000), no. 5, 473–490.
[28] J. Lee and J.Y. Choi, Texas hospitals with higher health information technology expenditures have higher revenue: Longitudinal data analysis using a generalized estimating equation model, BMC Health Serv. Res. 16 (2016), no. 1, 1–8.
[29] A.M. Legendre, Mémoire sur les opérations trigonométriques, dont les résultats dépendent de la figure de la terre, F. Didot, 1805.
[30] J. Lever, M. Krzywinski, and N. Altman, Points of significance: model selection and overfitting, Nature Meth. 13 (2016), no. 9, 703–705.
[31] X. Li, C. Liang and F. Ma, Forecasting stock market volatility with many predictors: New evidence from the MS-MIDAS-LASSO model, Ann. Oper. Res. (2022). https://doi.org/10.1007/s10479-022-04716-1
[32] K.Y. Liang and S.L. Zeger, Longitudinal data analysis using generalized linear models, Biometrika 73 (1986), no. 1, 13–22.
[33] K.Y. Liang, S.L. Zeger and B. Qaqish, Multivariate regression analyses for categorical data, J. Royal Stat. Soc. Ser. B (Method.) 54 (1992), no. 1, 3–24.
[34] A. Liaw and M. Wiener, Documentation for R package randomForest, https://www.rdocumentation.org/packages/randomForest/versions/4.6-12, 2013.
[35] P. Louis and B. Baesens, Do for-profit microfinance institutions achieve better financial efficiency and social impact? A generalized estimating equations panel data approach, J. Dev. Effect. 5 (2013), no. 3, 359–380.
[36] R. Mazumder, P. Radchenko and A. Dedieu, Subset selection with shrinkage: Sparse linear modeling when the SNR is low, Oper. Res. 71 (2022), no. 1, 129–147.
[37] S.H. Naseri and S. Bavandi, A proposed approach for solving multi-objective fuzzy stochastic linear programming problems with fuzzy probability, Fuzzy Syst. Appl. 1 (2018), no. 2, 119–133.
[38] A. Ozmen, Sparse regression modeling for short- and long-term natural gas demand prediction, Ann. Oper. Res. 322 (2021), no. 2, 1–26.
[39] A. Painsky and S. Rosset, Cross-validated variable selection in tree-based methods improves predictive performance, IEEE Trans. Pattern Anal. Machine Intell. 39 (2017), no. 11, 2142–2153.
[40] H. Rasouli, M. Imanipour and A. Khatami Firouzabadi, A comprehensive guide to linear programming modeling, Marandiz, Today's Managers, 2017.
[41] J.W. Rocks and P. Mehta, Memorizing without overfitting: Bias, variance, and interpolation in overparameterized models, Phys. Rev. Res. 4 (2022), no. 1, 013201.
[42] F. Santosa and W.W. Symes, Linear inversion of band-limited reflection seismograms, SIAM J. Sci. Stat. Comput. 7 (1986), no. 4, 1307–1330.
[43] D.J. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, Chapman and Hall/CRC, 2003.
[44] F. Smarra, A. Jain, T. De Rubeis, D. Ambrosini, A. D’Innocenzo and R. Mangharam, Data-driven model predictive control using random forests for building energy optimization and climate control, Appl. Energy 226 (2018), 1252–1272.
[45] S.A. Smith, N. Agrawal and S.H. McIntyre, A discrete optimization model for seasonal merchandise planning, J. Retail. 74 (1998), no. 2, 193–221.
[46] K.A. Smith and J.N. Gupta, Neural networks in business: techniques and applications for the operations researcher, Comput. Oper. Res. 27 (2000), no. 11–12, 1023–1044.
[47] C. Strobl, A. Boulesteix and T. Augustin, Unbiased split selection for classification trees based on the Gini index, Comput. Stat. Data Anal. 52 (2007), 483–501.
[48] Y. Sun, H. Wang, B. Xue, Y. Jin, G.G. Yen and M. Zhang, Surrogate-assisted evolutionary deep learning using an end-to-end random forest-based performance predictor, IEEE Trans. Evolut. Comput. 24 (2019), no. 2, 350–364.
[49] C.W. Tan, C. Bergmeir, F. Petitjean and G.I. Webb, Time series extrinsic regression: Predicting numeric values from time series data, Data Min. Knowledge Discov. 35 (2021), 1032–1060.
[50] P.N. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining, Pearson Education India, 2016.
[51] H. Tanaka, T. Okuda and K. Asai, Fuzzy mathematical programming, Trans. Soc. Instrument Control Engin. 9 (1973), no. 5, 607–613.
[52] J.W. Taylor, Short-term electricity demand forecasting using double seasonal exponential smoothing, J. Oper. Res. Soc. 54 (2003), no. 8, 799–805.
[53] R. Tibshirani, Regression shrinkage and selection via the lasso, J. Royal Stat. Soc. Ser. B (Method.) 58 (1996), no. 1, 267–288.
[54] M.D. Troutt, W.-K. Pang and S.-H. Hou, Behavioral estimation of mathematical programming objective function coefficients, Manag. Sci. 52 (2006), no. 3, 422–434.
[55] V. Vapnik and O. Chapelle, Bounds on error expectation for support vector machines, Neural Comput. 12 (2000), no. 9, 2013–2036.
[56] V.N. Vapnik and A.Y. Chervonenkis, Recovery of dependencies by empirical data, Moscow: Nauka, 1979.
[57] H. Wang and Y. Jin, A random forest-assisted evolutionary algorithm for data-driven constrained multiobjective combinatorial optimization of trauma systems, IEEE Trans. Cybernet. 50 (2018), no. 2, 536–549.
[58] M. Wang, L. Kong, Z. Li, and L. Zhang, Covariance estimators for generalized estimating equations (GEE) in longitudinal analysis with small samples, Stat. Med. 35 (2016), no. 10, 1706–1721.
[59] W. Wang, X. Liu, and W.K.V. Chan, Imbalanced classification problem using data-driven and random forest method, Proc. 3rd Int. Conf. Data Sci. Inf. Technol., 2020, pp. 26–30.
[60] W.L. Winston, Operations Research: Applications and Algorithms, Cengage Learning, 1997.
[61] L.A. Zadeh, Fuzzy sets, Inf. Control 8 (1965), no. 3, 338–353.
[62] S.L. Zeger and K.Y. Liang, Feedback models for discrete and continuous time series, Stat. Sin. 1 (1991), 51–64.
[63] Z. Zhang, Too many covariates in a multivariable model may cause the problem of overfitting, J. Thoracic Disease 6 (2014), no. 9.
[64] Y. Zheng, X. Fu and Y. Xuan, Data-driven optimization based on random forest surrogate, 6th Int. Conf. Syst. Inf., IEEE, 2019, pp. 487–491.
[65] H.-J. Zimmermann, Fuzzy programming and linear programming with several objective functions, Fuzzy Sets Syst. 1 (1978), no. 1, 45–55.
[66] H. Zou, T. Hastie, and R. Tibshirani, On the “degrees of freedom” of the lasso, Ann. Statist. 35 (2007), no. 5, 2173–2192.
Volume 15, Issue 7
July 2024
Pages 325-336
  • Receive Date: 09 April 2023
  • Revise Date: 27 May 2023
  • Accept Date: 01 June 2023