The role of graph entropy in fault localization and network evolution

Tee, Philip (2017) The role of graph entropy in fault localization and network evolution. Doctoral thesis (PhD), University of Sussex.

[img] PDF - Published Version
Download (3MB)

Abstract

The design of a communication network has a critical impact on its effectiveness at delivering service to the users of a large scale compute infrastructure. In particular, the reliability of such networks is increasingly vital in the modern world, as more and more of our commercial and social activity is conducted using digital platforms. Systems to assure service availability have been available since the emergence of Mainframes, with the System 360 in 1964, and although commercially widespread, the scientific understanding is not as deep as the problem warrants. The basic operating principle of most service assurance systems combines the gathering of status messages, which we term as events, with algorithms to deduce from the events where potential failures may be occurring. The algorithms to identify which events are causal, known as root cause analysis or fault localization, usually rely upon a detailed understanding of the network structure in order to determine those events that are most helpful in diagnosing and remediating a service threatening problem. The complex nature of root cause algorithms introduces scalability limits in terms of the number of events that can be processed per second. Unfortunately as networks grow, the volume of events produced continues to increase, often dramatically.
The dependence of root cause analysis algorithms on network structure presents a significant challenge as networks continue to grow in scale and complexity. As a consequence of this, and the growing reliance upon networks as part of the key fabric of the modern economy, the commercial importance and the scale of the engineering challenges are increasing significantly.
In this thesis I outline a novel approach to improving the scalability of event processing using a mathematical property of networks, graph entropy. In the first two papers described in this thesis, I apply an efficiently computable approximation of graph entropy to the problem of identifying important nodes in a network. In this context, importance is a measure of whether the failure of a node is more likely to result in a significant impact on the overall connectivity of the network, and therefore likely to lead to an interruption of service. I show that by ignoring events from unimportant network nodes it is possible to significantly reduce the event rate that a root cause algorithm needs to process. Further, I demonstrate that unimportant nodes produce very many events, but very few root causes. The consequence is that although some events relating to root causes are missed, this is compensated for by the reduction in overall event rate. This leads to a significant reduction of the event processing load on management systems, and therefore increases the effectiveness of current approaches to root cause analysis on large networks.
Analysis of the topology data used in the first two papers revealed interesting anomalies in the degree distribution of the network nodes. This motivated the later focus of my research to investigate how graph entropy and network design considerations could be applied to the dynamical evolution of networks structures, most commonly described using the Preferential Attachment model of Barabási and Albert. A common feature of a communication network is the presence of a constraint on the number of logical or physical connections a device can support. In the last of the three papers in the thesis I develop and present a constrained model of network evolution, which demonstrates better quantitative agreement with real world networks than the preferential attachment model. This model, developed using the continuum approach, still does not address a fundamental question of random networks as a model of network evolution. Why should a node’s degree influence the likelihood of it acquiring connections? In the same paper I attempt to answer that question by outlining a model that links vertex entropy to a node’s attachment probability. The model successfully reproduces some of the characteristics of preferential attachment, and illustrates the potential for entropic arguments in network science.
Put together, the two main bodies of work constitute a practical advance on the state of the art of fault localization, and a theoretical insight into the inner workings of dynamic networks. They open up a number of interesting avenues for further investigation.

Item Type: Thesis (Doctoral)
Schools and Departments: School of Engineering and Informatics > Informatics
Subjects: T Technology > TK Electrical engineering. Electronics Nuclear engineering > TK5101 Telecommunication Including telegraphy, telephone, radio, radar, television > TK5105.5 Computer networks. General works
Depositing User: Library Cataloguing
Date Deposited: 08 Nov 2017 14:43
Last Modified: 08 Nov 2017 14:43
URI: http://sro.sussex.ac.uk/id/eprint/71052

View download statistics for this item

📧 Request an update