Application performance monitoring can be a resource-intensive activity, especially if you’re relying on traditional tools and manual processes. Research shows that downtime costs range from $1-60 million annually – not to mention the reputational damage when a critical service is interrupted – so it’s no wonder IT organizations spend 25% of their time on monitoring.
At AIMS, we’ve worked with customers around the world to implement a more modern approach to application performance monitoring. Applying AI and machine learning, AIMS helps companies avoid costly downtime and reduce the heavy lifting needed to keep systems live, performant and meeting business requirements.
We’ve learned a thing or two...
In this article, we’ll share 13 best practices we’ve learned from working with our clients, to help you get up and running successfully with modern performance monitoring that keeps critical systems live and ensures internal and external stakeholders are informed and engaged.
Be sure to read to the end for the two bonus best practices (no peeking)!
1) Identify the processes and/or applications where you want to start
It’s always best to start with the processes and/or applications that are most critical for your business. How do you identify them? Start by assessing the risk associated with each process or app (the consequences of downtime), then consider the likelihood of downtime occurring. When thinking about risk, consider things like expected troubleshooting time, staff competence to troubleshoot errors, and regulatory or compliance issues related to the process. And don’t assume that the top-of-mind processes with big business impact are always the best place to start. Sometimes you need to dig to find the processes where knowledge is lacking and regulatory consequences are severe.
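One way to make this ranking concrete is a simple risk score: impact multiplied by likelihood. The sketch below is illustrative only. The process names, the 1 to 5 scales, and the `risk_score` and `prioritize` helpers are assumptions made for this example and are not part of AIMS.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    impact: int      # consequence of downtime, 1 (minor) to 5 (severe); assumed scale
    likelihood: int  # chance of downtime, 1 (rare) to 5 (frequent); assumed scale


def risk_score(p: Process) -> int:
    """Simple risk-matrix score: impact multiplied by likelihood."""
    return p.impact * p.likelihood


def prioritize(processes: list[Process]) -> list[Process]:
    """Return processes ordered from highest to lowest risk score."""
    return sorted(processes, key=risk_score, reverse=True)


# Hypothetical example data
candidates = [
    Process("order-intake", impact=5, likelihood=2),
    Process("regulatory-reporting", impact=5, likelihood=3),
    Process("internal-wiki", impact=1, likelihood=4),
]

for p in prioritize(candidates):
    print(f"{p.name}: {risk_score(p)}")
```

In this made-up data, the regulatory process outranks the more visible order-intake process, which is the kind of result that digging beyond top-of-mind systems can surface.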
2) Identify, involve and engage your stakeholders
Success with complex, business-critical IT systems doesn’t come easy. Making sure that your stakeholders (internal and external) share the same understanding through insight and reports is key to building engaged, cross-functional teams – and to securing trust and support from the business. Consider the following stakeholder groups:
- Developers / admins
- IT Operations
- Business analysts
- Enterprise architects / IT Directors
- Executive management
3) Identify the underlying applications and infrastructure
What infrastructure supports the business processes and applications you identified in Step 1? These could include middleware, databases, server infrastructure and more. Go ahead and dig deep.
4) Install the necessary AIMS agents
Once you know what you need to monitor, and who needs to be involved with the project, you can install your application performance management solution – of course, we suggest AIMS. We recommend you install AIMS in both your Production and QA environments to be sure you catch any issues before production deployment. (Don’t worry: Installation takes only minutes, and AIMS runs with almost no performance impact.)
5) Set up the relevant stakeholders with access to AIMS
So, you’ve identified your stakeholders in Step 2. High five! Now it’s time to give them access to your monitoring solution so they get the right information on performance. To avoid information overload and ensure better engagement, we recommend editing each user profile so that stakeholders see only the systems and alerts most relevant to them, either through a login to AIMS or via email alerts. It makes sense to create a stakeholder matrix that maps each stakeholder group to its relevant interests.
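A stakeholder matrix can be as simple as a lookup table. The following is a minimal sketch, assuming made-up system names and two delivery channels ("login" and "email"). It is kept outside of AIMS and does not represent how AIMS stores user profiles.

```python
# Hypothetical stakeholder matrix: group -> systems of interest and delivery channel.
STAKEHOLDER_MATRIX = {
    "Developers / admins": {"systems": ["order-intake", "billing"], "channel": "login"},
    "IT Operations": {"systems": ["order-intake", "billing", "database"], "channel": "login"},
    "Business analysts": {"systems": ["order-intake"], "channel": "email"},
    "Executive management": {"systems": ["billing"], "channel": "email"},
}


def recipients_for(system: str, matrix: dict = STAKEHOLDER_MATRIX) -> dict[str, str]:
    """Return {stakeholder group: channel} for every group mapped to a system."""
    return {
        group: entry["channel"]
        for group, entry in matrix.items()
        if system in entry["systems"]
    }


print(recipients_for("billing"))
```

Keeping the matrix in one place makes it easier to reuse later when you build dashboards, schedule reports, and plan review meetings.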
6) Customize any auto-generated message patterns and naming
To make sure the message gets through to stakeholders, customize any auto-generated message patterns, business process names and other naming to reflect the internal vocabulary at your company. You can also create reports covering individual BizTalk applications and name the reports according to the process supported.
7) Build dashboards and reports to communicate to your stakeholders
This is where it really gets fun. Use AIMS to create customized dashboards and reports that map to the stakeholder matrix you created in Step 5, giving every stakeholder exactly the information they need. To make the reports as intuitive as possible, we recommend you use the comments/text modules to explain to stakeholders what they’re looking at. It’s also smart to educate stakeholders up front on key terms like anomalies and Apdex (Application Performance Index), to avoid confusion later.
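When explaining Apdex to stakeholders, a small worked example can help. The sketch below follows the standard Apdex definition: samples at or below a threshold T count as satisfied, samples between T and 4T count as tolerating (half weight), and anything slower counts as frustrated. The sample response times and the 0.5 second threshold are assumptions for illustration. How AIMS itself configures or computes the score is not covered here.

```python
def apdex(response_times: list[float], threshold: float) -> float:
    """Standard Apdex score: (satisfied + tolerating / 2) / total samples."""
    if not response_times:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical response times in seconds
samples = [0.2, 0.4, 0.5, 1.0, 1.9, 2.5]

# 3 satisfied, 2 tolerating, 1 frustrated -> (3 + 2/2) / 6
print(round(apdex(samples, threshold=0.5), 2))  # prints 0.67
```

A score of 1.0 means every sample was satisfied and 0.0 means every sample was frustrated, which gives business stakeholders a single number to track.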
8) Train relevant stakeholders
Sounds obvious, but definitely take the time to train stakeholders! In general, we recommend hands-on training for technical stakeholders and scheduled reports for business stakeholders. But every business is different, so do what feels best for your organization.
9) Regularly file AIMS reports
Remember, always save your work! We recommend filing AIMS reports regularly for later documentation.
10) Develop or review the internal escalation procedure
How do you escalate issues internally? We recommend you develop an escalation procedure based on warnings and alerts from AIMS. Based on our experience with clients, it should cover:
- Who receives alerts
- What the escalation principles are
- How to actively use AIMS to troubleshoot and identify root causes
- Whether to integrate AIMS with your ticketing solution
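An escalation procedure like this can be written down as a small policy table, which makes it easy to review. The sketch below is an assumption-laden example: the severity names, role names, and time limits are hypothetical, and the code does not call AIMS or any ticketing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    after_minutes: int  # minutes an alert may stay unacknowledged before this step applies
    notify: str         # hypothetical role or team name


# Hypothetical policy: severity -> ordered escalation steps
POLICY = {
    "warning": [EscalationStep(0, "on-call developer")],
    "alert": [
        EscalationStep(0, "on-call developer"),
        EscalationStep(15, "IT Operations lead"),
        EscalationStep(60, "IT Director"),
    ],
}


def who_to_notify(severity: str, minutes_unacknowledged: int) -> list[str]:
    """List everyone who should have been notified by now for an open alert."""
    steps = POLICY.get(severity, [])
    return [s.notify for s in steps if minutes_unacknowledged >= s.after_minutes]


print(who_to_notify("alert", 20))
```

Writing the policy as data rather than burying it in procedures also gives you something concrete to revisit in review meetings.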
11) Schedule regular review meetings
Keep everyone informed and engaged. Use your stakeholder matrix and schedule regular review meetings with both overall and business-process stakeholders.
12) Consider using automated scripts
After using AIMS for a short period, you’ll begin to identify recurring issues (such as artifacts that need to be started or stopped). We recommend you create automated scripts to resolve these issues, freeing up time to deal with more important things.
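A simple way to organize such scripts is a lookup from known issue types to remediation actions, with a manual fallback for anything unrecognized. This is a sketch only: the issue names, the `restart_artifact` and `clear_queue` functions, and the target name are hypothetical placeholders, and a real version would call your own tooling instead of returning strings.

```python
from typing import Callable


# Hypothetical remediation actions. In practice each would invoke your own
# tooling (for example, a script that restarts a stopped artifact).
def restart_artifact(target: str) -> str:
    return f"restarted {target}"


def clear_queue(target: str) -> str:
    return f"cleared queue on {target}"


# Map each known, recurring issue type to its automated fix.
REMEDIATIONS: dict[str, Callable[[str], str]] = {
    "artifact_stopped": restart_artifact,
    "queue_backlog": clear_queue,
}


def remediate(issue_type: str, target: str) -> str:
    """Run the automated fix for a known issue, or flag it for manual handling."""
    action = REMEDIATIONS.get(issue_type)
    if action is None:
        return f"no automated fix for {issue_type}; escalate manually"
    return action(target)


print(remediate("artifact_stopped", "ReceivePort1"))
print(remediate("certificate_expired", "gateway"))
```

Only issues you have seen repeatedly and understand well belong in the table. Everything else should still reach a person.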
13) Keep focus on the most important events
You’ll want to keep your eye on those events with the most potential for impact:
- Anomalies: Review for impact, and possibly troubleshoot
- Infrastructure scaling: Use reports to ensure your infrastructure is scaled appropriately for the load
- New errors / events: New error events should be reviewed
- Most frequent errors: Consider fixing most frequent errors
- Activities and changes: Keep an eye on any changes to your environment
- Top components: Monitor selected parameters for highest impact / resource use
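Two items from this list, new errors and most frequent errors, are easy to illustrate if you have error events available as plain data. The sketch below assumes a list of error codes collected from some export and a set of codes already seen in earlier periods. Both are hypothetical inputs and do not reflect an AIMS data format.

```python
from collections import Counter


def summarize_errors(
    current: list[str], known: set[str], top_n: int = 3
) -> tuple[list[str], list[tuple[str, int]]]:
    """Return (new error codes, most frequent error codes with counts)."""
    counts = Counter(current)
    new_errors = sorted(set(current) - known)   # never seen before: review these
    most_frequent = counts.most_common(top_n)   # candidates for a permanent fix
    return new_errors, most_frequent


# Hypothetical error codes from the current period
current_errors = ["timeout", "timeout", "auth_failed", "timeout", "disk_full", "auth_failed"]
previously_seen = {"timeout", "auth_failed"}

new, frequent = summarize_errors(current_errors, previously_seen, top_n=2)
print("New errors:", new)
print("Most frequent:", frequent)
```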
If you follow these 13 best practices, you’ll be well on your way to running modern application performance monitoring and communicating valuable business insight to your internal and external stakeholders.
We bet you’re wondering about those two bonus best practices. Here they are:
14) Make regulatory, compliance or audit processes easier
AIMS analyzes massive amounts of data in real-time, providing a wealth of business and technical insight – not only into your IT environment, but also into things like usage patterns, service level, business KPIs and more. We recommend that you use this insight to make processes like regulatory compliance and audits easier.
15) Wash, rinse and repeat
And, last but not least, we recommend that you repeat these processes based on the business knowledge you acquire through them, and any changes that happen on a system and business level. Performance monitoring is an ongoing process that should grow and evolve with your business, so be sure to treat it that way.