Applying Data Analytic Techniques for Fault Detection


Authors: Ha Manh Tran, Sinh Van Nguyen, Son Thanh Le, and Quy Tran Vu

Volume 31 (2017)

Abstract


Monitoring events in communication and computing systems is becoming increasingly challenging as these systems grow in complexity and diversity. Several supporting tools have been created to help system administrators monitor the enormous number of events generated daily. The main function of these tools is to filter out as many events as possible and present highly suspected events to the administrators for fault analysis, detection, and reporting. Because these suspected events appear regularly on large and complex systems, such as cloud computing systems, analyzing them consumes considerable time and effort. In this study, we propose an approach for evaluating the severity level of events using a classification decision tree. The approach exploits existing fault datasets and their features, such as bug reports and log events, to construct a decision tree that can be used to classify the severity level of other events. Administrators can then refer to the classification result to determine proper actions for suspected events with a high severity level. We have implemented the approach and evaluated it on various bug report and log event datasets. The experimental results reveal that the accuracy of classifying severity levels with the decision trees is above 80%, and some detailed analyses are also provided.
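
To make the idea concrete, the following is a minimal sketch of a classification decision tree that assigns a severity level to an event. It is not the authors' implementation: the abstract does not name the features, the split criterion, or the datasets, so the feature names (`component`, `keyword`), the Gini impurity criterion, the `max_depth` parameter, and the six toy events are all assumptions made purely for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows, labels):
    """Find the equality test (feature, value) that most reduces impurity.

    Returns None when no test strictly improves on the unsplit node.
    """
    best, best_score = None, gini(labels)
    for feature in rows[0]:
        # Sort the candidate values so that tie-breaking is deterministic.
        for value in sorted({row[feature] for row in rows}):
            yes = [lab for row, lab in zip(rows, labels) if row[feature] == value]
            no = [lab for row, lab in zip(rows, labels) if row[feature] != value]
            if not yes or not no:
                continue
            score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
            if score < best_score:
                best, best_score = (feature, value), score
    return best


def build_tree(rows, labels, max_depth=3):
    """Recursively build a tree; a leaf is just the majority severity label."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    feature, value = split
    yes = [(r, l) for r, l in zip(rows, labels) if r[feature] == value]
    no = [(r, l) for r, l in zip(rows, labels) if r[feature] != value]
    return {
        "test": split,
        "yes": build_tree([r for r, _ in yes], [l for _, l in yes], max_depth - 1),
        "no": build_tree([r for r, _ in no], [l for _, l in no], max_depth - 1),
    }


def classify(tree, row):
    """Walk the tree from the root until a leaf (a severity label) is reached."""
    while isinstance(tree, dict):
        feature, value = tree["test"]
        tree = tree["yes"] if row.get(feature) == value else tree["no"]
    return tree


# Hypothetical, hand-made training events; not taken from the paper's datasets.
events = [
    {"component": "network", "keyword": "crash"},
    {"component": "network", "keyword": "timeout"},
    {"component": "storage", "keyword": "crash"},
    {"component": "storage", "keyword": "warning"},
    {"component": "ui", "keyword": "typo"},
    {"component": "ui", "keyword": "warning"},
]
severities = ["high", "high", "high", "low", "low", "low"]

tree = build_tree(events, severities)
new_event = {"component": "ui", "keyword": "crash"}
predicted = classify(tree, new_event)
```

On this toy data the tree first tests whether the component is `network` and then whether the keyword is `crash`, so the unseen `ui` crash event is classified as `high`. In practice the features would be extracted from real bug reports and log events, and accuracy would be measured on held-out data rather than on the training events.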