Reference Model for Effective Performance and Availability Monitoring in Large Scale Software Systems
Raghu Ramakrishnan, Arvinder Kaur
Section: Research Paper, Product Type: Journal Paper
Volume-7, Issue-10, Page no. 90-97, Oct-2019
CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i10.9097
Online published on Oct 31, 2019
Copyright © Raghu Ramakrishnan, Arvinder Kaur. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: Raghu Ramakrishnan, Arvinder Kaur, “Reference Model for Effective Performance and Availability Monitoring in Large Scale Software Systems,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.10, pp.90-97, 2019.
MLA Style Citation: Raghu Ramakrishnan, Arvinder Kaur. "Reference Model for Effective Performance and Availability Monitoring in Large Scale Software Systems." International Journal of Computer Sciences and Engineering 7.10 (2019): 90-97.
APA Style Citation: Raghu Ramakrishnan, Arvinder Kaur. (2019). Reference Model for Effective Performance and Availability Monitoring in Large Scale Software Systems. International Journal of Computer Sciences and Engineering, 7(10), 90-97.
BibTeX Style Citation:
@article{Ramakrishnan_2019,
author = {Raghu Ramakrishnan and Arvinder Kaur},
title = {Reference Model for Effective Performance and Availability Monitoring in Large Scale Software Systems},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {October 2019},
volume = {7},
number = {10},
month = oct,
year = {2019},
issn = {2347-2693},
pages = {90-97},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=4900},
doi = {10.26438/ijcse/v7i10.9097},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY - JOUR
DO - 10.26438/ijcse/v7i10.9097
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=4900
TI - Reference Model for Effective Performance and Availability Monitoring in Large Scale Software Systems
T2 - International Journal of Computer Sciences and Engineering
AU - Raghu Ramakrishnan
AU - Arvinder Kaur
PY - 2019
DA - 2019/10/31
PB - IJCSE, Indore, INDIA
SP - 90
EP - 97
IS - 10
VL - 7
SN - 2347-2693
ER -
Abstract
Monitoring the different parts of the software stack is essential for ensuring acceptable performance and availability of large-scale heterogeneous software systems. However, because the various parts of the software stack generate a large amount of data, identifying the minimum set of data elements to bring under the initial monitoring umbrella is challenging. Although these elements are similar across most projects, we have observed that teams often spend considerable time and effort identifying them. In this paper, we present a layered monitoring reference model in which different layers target different parts of the software stack using appropriate data elements. The model provides guidance on the minimum set of data elements, drawing on lessons from more than 20 real-life projects. It also explains the data elements and the motivation for including them in the model.
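As an illustration only, the sketch below encodes a layered monitoring model as a simple data structure. The abstract does not enumerate the paper's layers, so the layer names and data elements here are assumptions made for this sketch, not the model published in the paper.

    # Hypothetical sketch of a layered monitoring reference model.
    # The layer names and data elements are illustrative assumptions;
    # the paper defines its own layers and minimum element set.

    from dataclasses import dataclass, field

    @dataclass
    class MonitoringLayer:
        """One part of the software stack and the minimum data elements collected for it."""
        name: str
        data_elements: list[str] = field(default_factory=list)

    # Example layers targeting different parts of the software stack (assumed names).
    REFERENCE_MODEL = [
        MonitoringLayer("infrastructure", ["cpu_utilization", "memory_usage", "disk_io"]),
        MonitoringLayer("middleware", ["thread_pool_usage", "queue_depth", "connection_count"]),
        MonitoringLayer("application", ["response_time", "error_rate", "throughput"]),
    ]

    def minimum_monitoring_set(model: list[MonitoringLayer]) -> dict[str, list[str]]:
        """Return, per layer, the minimum data elements for the initial monitoring umbrella."""
        return {layer.name: layer.data_elements for layer in model}

    if __name__ == "__main__":
        for layer, elements in minimum_monitoring_set(REFERENCE_MODEL).items():
            print(f"{layer}: {', '.join(elements)}")

Keeping the minimum element set as data rather than hard-coding it into collection agents makes the initial monitoring scope easy to review and extend per project.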
Keywords / Index Terms
System monitoring, Production systems, Performance monitoring, Availability monitoring