One of the common challenges in troubleshooting performance issues is that the different dimensions of the problem belong to different teams. Coordinating the troubleshooting across those teams brings its own challenges. I really like the PerfStack feature where the dashboard URL contains all of the information required to recreate the dashboard. The net result is that I can paste that one URL into my help desk ticket and include the evidence when I hand off an issue to another team. Equally, when another team sends me a ticket, it can already have a dashboard to jump-start my troubleshooting.
I’ve seen help desk tickets bounce around from team to team within large organizations. As any network engineer will tell you, the network is always blamed first. To prove the issue isn’t the network, you put together some graphs showing that all the latency is in a VM. Then you paste a screenshot of the graphs into the help desk system and reassign the ticket to the virtualization team. Shortly afterward, the virtualization team says they are unable to see the issue and asks you to provide more details. This poor handoff between departments slows the whole process and makes it harder to resolve the problem for the application's end users. It also makes every team feel like the other teams are idiots because they cannot see the obvious problems.
With PerfStack, you are able to hand the virtualization team a live graph showing that the performance issue is a VM problem. The virtualization team can take that URL and make changes to the dashboard. They might add VM-specific counters and also information from inside the operating system. The VM team may identify that the issue is happening within SQL Server. They hand it off to the DBAs, with the URL for an updated dashboard. The DBAs rebuild the indices (or something) and all the performance problems go away. The important thing is that the handoff between teams carries far more actionable information. Each team can take the information from the previous team and adapt it to their own view of the world. The context of each team’s information is preserved through the URLs in the ticket. This encapsulation into a URL was one of my favorite little features of the PerfStack demonstration.
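To make the idea of a dashboard that lives entirely in its URL a bit more concrete, here is a minimal Python sketch. It is not PerfStack's actual URL scheme, which I haven't examined; the host name, the parameter names (`metrics`, `start`, `end`), and the metric identifiers are all hypothetical. It only illustrates the general pattern: one team builds a link, and the next team reads it, adds its own counters, and passes on a new link.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical base address; NOT PerfStack's real URL format.
BASE_URL = "https://monitoring.example.com/dashboard"


def build_dashboard_url(metrics, start, end):
    """Encode the whole dashboard state (metrics and time window) in the query string."""
    query = urlencode({"metrics": ",".join(metrics), "start": start, "end": end})
    return f"{BASE_URL}?{query}"


def read_dashboard_url(url):
    """Recover the dashboard state from a URL produced by build_dashboard_url."""
    params = parse_qs(urlparse(url).query)
    return {
        "metrics": params["metrics"][0].split(","),
        "start": params["start"][0],
        "end": params["end"][0],
    }


def add_metrics(url, extra_metrics):
    """Return a new URL with extra metrics appended, keeping the same time window."""
    state = read_dashboard_url(url)
    merged = state["metrics"] + [m for m in extra_metrics if m not in state["metrics"]]
    return build_dashboard_url(merged, state["start"], state["end"])
```

In this sketch the network team would call `build_dashboard_url` with its latency metrics and paste the result into the ticket; the virtualization team would call `add_metrics` on that link to layer on its own counters before handing it to the DBAs. Because all the state is in the link, nothing has to be saved on a server for the next team to see the same view.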
One thing to keep in mind is that collaborative troubleshooting is more productive than playing help desk ticket ping-pong. It definitely helps the process to have experts across the disciplines working together in real time. It helps both with resolving the problem at hand and with future problems. Often each team can learn a little of the other team’s specialization and so better understand the overall environment. Another under-appreciated aspect is that it helps people understand that the other teams are not complete idiots and that each specialization has its own issues and complexity.