Method for Choosing a Balanced Set of Fault Tolerance Techniques for Distributed Computer Systems
https://doi.org/10.18255/1818-1015-2016-2-119-136
Abstract
This paper considers a method for solving the reliability allocation problem (RAP) for distributed computer systems (DCS) under cost constraints: the reliability of the DCS is maximized subject to a constraint on system cost. RAP is widely discussed in the literature. The paper describes the fault tolerance mechanisms considered, gives a mathematical formulation of RAP, and describes the method in detail. The method is an evolutionary algorithm with an adaptive logic control procedure. This procedure analyzes the results of the evolutionary algorithm in each generation and, based on this information, adjusts the algorithm's parameters. The key feature of the proposed method is the use of an adaptive hybrid genetic algorithm. Results of experiments with the implemented method are presented. The method was implemented as a pilot system that works in cooperation with the DYANA simulation environment. Finally, plans for further development of the method and tools are briefly described.
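The kind of search the abstract describes can be illustrated with a minimal sketch. It is not the paper's adaptive hybrid genetic algorithm: it assumes a series system, uses made-up reliability and cost values (`OPTIONS`, `BUDGET`), and replaces the adaptive logic control procedure with a simple hypothetical rule that lowers the mutation rate after a generation that improves the best solution and raises it otherwise.

```python
import random
from math import prod

# Hypothetical data: for each of 4 components, the candidate fault tolerance
# options as (reliability, cost) pairs. Values are illustrative only.
OPTIONS = [
    [(0.90, 1), (0.95, 2), (0.99, 4)],
    [(0.85, 1), (0.93, 3), (0.98, 5)],
    [(0.92, 2), (0.97, 3), (0.995, 6)],
    [(0.88, 1), (0.96, 2), (0.99, 5)],
]
BUDGET = 12  # hypothetical cost limit


def reliability(choice):
    # Series-system assumption: the system works only if every component works.
    return prod(OPTIONS[i][g][0] for i, g in enumerate(choice))


def cost(choice):
    return sum(OPTIONS[i][g][1] for i, g in enumerate(choice))


def fitness(choice):
    # Solutions that exceed the budget get zero fitness.
    return reliability(choice) if cost(choice) <= BUDGET else 0.0


def evolve(generations=60, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(len(opts)) for opts in OPTIONS] for _ in range(pop_size)]
    pop[0] = [0] * len(OPTIONS)  # cheapest configuration, feasible for this data
    mutation_rate = 0.1
    best = max(pop, key=fitness)
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - 1:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(OPTIONS))  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [
                rng.randrange(len(OPTIONS[i])) if rng.random() < mutation_rate else g
                for i, g in enumerate(child)
            ]
            children.append(child)
        pop = [best[:]] + children  # elitism: keep the best solution found so far
        gen_best = max(pop, key=fitness)
        # Simplified adaptive control (an assumption, not the paper's procedure):
        # exploit after an improvement, explore more when the search stalls.
        if fitness(gen_best) > fitness(best):
            best = gen_best
            mutation_rate = max(0.02, mutation_rate * 0.8)
        else:
            mutation_rate = min(0.5, mutation_rate * 1.2)
    return best, fitness(best)
```

Calling `evolve()` returns a list of option indices, one per component, together with the estimated system reliability. In a setting like the one the abstract describes, the closed-form `reliability` function would presumably be replaced by an estimate obtained from simulation.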
About the Author
D. Yu. Volkanov
Russian Federation
assistant lecturer
References
1. Avižienis A., Laprie J.-C., Randell B., “Dependability and its threats: a taxonomy”, Building the Information Society, 156 (2004), 91–120.
2. Лупанов О.Б., “Об одном методе синтеза схем”, Изв. ВУЗов, Радиофизика, 1:1 (1958), 120–140; [Lupanov O. B., “Ob odnom metode sinteza schem”, Izv. VUZov, Radiophizika, 1:1 (1958), 120–140, (in Russian).]
3. Kuo W., Wan R., “Recent Advances in Optimal Reliability Allocation”, Handbook of Military Industrial Engineering by Badiru A., Thomas M., 2009, 1–24.
4. Chern M.S., “On the computational complexity of reliability redundancy allocation in a series system”, Operations Research Letters, 11:5 (1992), 309–315.
5. Tillman F.A., Hwang C.L., Kuo W., “Optimization techniques for systems reliability with redundancy – A review”, IEEE Transactions on Reliability, R-26:3 (1977), 148–155.
6. Misra K.B., “On optimal reliability design: A review”, System Science, 12 (1986), 5–30.
7. Kuo W., Prasad V.R., “An annotated overview of system-reliability optimization”, IEEE Transactions on Reliability, 49:2 (2000), 176–187.
8. Gen M., Yun Y.S., “Soft computing approach for reliability optimization: State-of-the-art survey”, Reliability Engineering & System Safety, 91:9 (2006), 1008–1026.
9. Aleti A. et al., “Software architecture optimization methods: A systematic literature review”, IEEE Transactions on Software Engineering, 39:5 (2013), 658–683.
10. Soltani R., “Reliability optimization of binary state non-repairable systems: A state of the art survey”, International Journal of Industrial Engineering Computations, 5:3 (2014), 339–364.
11. Смелянский Р.Л., “Модель функционирования распределенных вычислительных систем”, Вестник Московского Университета, 3 (1990), 3–21; [Smeliansky R. L., “Model funkcionirovaniya raspredelennyh vychislitelnyh sistem”, Vestnik Moskovskogo Universiteta, 3 (1990), 3–21, (in Russian).]
12. Wattanapongsakorn N., Levitan S.P., “Reliability optimization models for embedded systems with multiple applications”, IEEE Transactions on Reliability, 53:3 (2004), 406–416.
13. Wattanapongsakorn N., Coit D.W., “Fault-tolerant embedded system design and optimization considering reliability estimation uncertainty”, Reliability Engineering & System Safety, 92:4 (2007), 395–407.
14. Bakhmurov A.G. et al., “Method For Choosing An Effective Set Of Fault Tolerance Mechanisms For Real-Time Embedded Systems, Based On Simulation Modeling”, Problems of dependability and modelling, 2011, 13–26.
15. Laprie J.C., Coste A., “Dependability: A Unifying Concept for Reliable Computing”, Proceedings of the 12th Fault Tolerant Computing Symposium, 1982, 18–21.
16. Xie Z., Sun H., Saluja K., “Survey of Software Fault Tolerance Techniques”, University of Wisconsin-Madison, Department of Electrical and Computer Engineering, 2006.
17. Laprie J.C. et al., “Definition and analysis of hardware and software-fault-tolerant architectures”, IEEE Computer, 23:7 (1990), 39–51.
18. Lyu M.R. (Editor-in-Chief), Handbook of software reliability engineering, McGraw-Hill: IEEE Computer Society Press, 1996.
19. Bakhmurov A.G., Kapitonova A.P., Smeliansky R.L., “DYANA: An Environment for Embedded System Design and Analysis”, Proceedings of 5-th International Conference TACAS’99, Amsterdam, The Netherlands, 1579, 1999, 390–404.
20. Бахмуров А.Г. и др., “Интегрированная среда для анализа и разработки встроенных вычислительных систем реального времени”, Программирование, 5 (2013), 35–52; [Bakhmurov A. G. et al., “Integrated environment for the analysis and design of distributed real-time embedded computing systems”, Programming and Computer Software, 39:5 (2013), 242–254, (in Russian).]
21. Гладков Л. А., Курейчик В. В., Курейчик В. М., Генетические алгоритмы, ФИЗМАТЛИТ, 2006; [Gladkov L. A., Kureychik V. V., Kureychik V. M., Geneticheskie algoritmy, PHIZMATLIT, 2006, (in Russian).]
22. Чистяков В. П., Курс теории вероятностей, Наука, 1987; [Chistyakov V. P., Kurs teorii veroyatnostey, Nauka, 1987, (in Russian).]
Review
For citations:
Volkanov D.Yu. Method for Choosing a Balanced Set of Fault Tolerance Techniques for Distributed Computer Systems. Modeling and Analysis of Information Systems. 2016;23(2):119-136. (In Russ.) https://doi.org/10.18255/1818-1015-2016-2-119-136