I think I can speak for many people in IT in saying that when crises happen (COVID-19, large-scale weather emergencies, power outages, etc.) there's always an added level of stress. Your carefully crafted business continuity plans are put to the test and you hope that your infrastructure can handle the burden. Keeping people informed is crucial when operating in crisis mode.
Recently, Microsoft posted a great article on how to build a crisis management site. I love this idea, but it's not really an IT resource, is it? It's great for policies and updates, but it doesn't give me the information that I need to continue to do my job effectively in stressful times.
If you are anything like me, you're busy working the issues and don't have time for a million questions. We always say that a picture is worth a thousand words. So with that in mind what's a good dashboard worth? 6.023 x 10^23 words?
Thinking back to my previous role, these are the resources I think would have been best on a dashboard for my company:
Now, this is just what I came up with on the fly and it can't/won't apply to everyone. So my question to you is this: What makes a good dashboard for your company when you are in crisis mode? This is a discussion, not a think-piece, so do not hold back on the comments, and if you can share pictures, I (and the community) would love to see them.
Below are a few PowerBI dashboard views I've created recently using SolarWinds data so that we can keep an eye on the number of remote users we have and the bandwidth they use. These were requested as an executive view alongside some other dashboards I'd created previously using SW data.
Note: The SQL used to populate the graphs below is exactly the same as that used by the Performance Analyzer function of SolarWinds. The SQL was a nightmare to figure out, but it is now repeatable for most metrics available in Performance Analyzer.
Monitoring Business Critical Applications in Orion with Netpath and Orion Maps.
YES! Can someone please explain how to get the Active Users for VPN chart? I have a case open with Support, but we've been working on it since Thursday with no light in sight. PLEASE SEND HELP!!! 😵
@sspring I believe you need to have the "Cisco Nexus or ASA device:" option enabled on the node (edit node).
After that is working, you should see an option to add the "Active Users/Sessions" data within the PerfStack chart, once you add the ASA node to the chart.
Does that answer your question, or did you mean something else?
We are still not fully in production yet, but we had an issue with some of our US - London links and I was asked to try and put something together fairly quickly. It still turned out not too bad, and I even managed to get the IPSLA operation charted in PerfStack, which looks pretty cool!
A work in progress I admit, but filled the initial brief 😉
Does anyone have any pointers on how to create the dashboard? Would this be the same as the NOC view? I tried going down that route and I can't seem to add things from my SAM. I am pretty good with SW, but I have not created any dashboards from scratch.
Update! I did some PowerShell things and added some SAM applications for Cisco Umbrella and Clever, which are two services our organization relies on heavily. Unfortunately this means I had to move to a Summary View, so I lost my custom PerfStack color palette. I'll have to track down the CSS for the widget 😉
This has been a very difficult time. I work for 911, so each and every one of our dashboards is so very critical! There are only 14 people responsible for the day-to-day operations; there are 8 of us in IT.
I monitor the storage. Would you believe that I have both the old VMAN Appliance running along with the new VMAN upgrade integrated with Orion? I loved the VMAN Appliance dashboard so much that I will continue to run it side by side with Orion as long as I can!
I watch the NetApp storage repository with both SolarWinds and OnCommand Unified Manager.
I monitor the Public Safety Answering Points (PSAP) - 911 phone systems located throughout the County of El Paso.
We have such a unique community where all the First Responders are all using the same phone, application and radio system, which allows us to service the community much more efficiently than those communities that have multiple distributed systems.
I monitor Time Skew on all the clients and servers running the application; the application, phone and radio must always have the same time due to legal ramifications.
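A time-skew check like the one described above could be sketched roughly as follows. This is only an illustration, not the poster's actual setup: the host names, the one-second tolerance, and the idea that every clock reading was captured at the same instant as the reference are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def find_skewed_hosts(reference, reported, tolerance=timedelta(seconds=1)):
    """Return {host: offset} for every host whose clock differs from the
    reference by more than the tolerance (tolerance value is an assumption)."""
    skewed = {}
    for host, clock in reported.items():
        offset = clock - reference  # positive means the host runs ahead
        if abs(offset) > tolerance:
            skewed[host] = offset
    return skewed


# Hypothetical readings, all assumed to be taken at the same moment.
reference = datetime(2020, 3, 20, 12, 0, 0, tzinfo=timezone.utc)
reported = {
    "dispatch-01": reference + timedelta(milliseconds=200),
    "phone-gw": reference + timedelta(seconds=3),
    "radio-gw": reference - timedelta(seconds=5),
}
skewed = find_skewed_hosts(reference, reported)
```

Only the hosts outside the tolerance come back, so the result can feed an alert directly instead of a table someone has to scan.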
Alerting goes to all 8 of us in IT; it took quite some time to tune the alerts so that we were not inundated with useless alerts. I am quite proud of the deployment as we are now proactive rather than reactive! I run SAM, NPM, NetFlow, DPA, Config Manager, VMAN, IPAM, Web Help Desk, and DameWare! THANK YOU SOLARWINDS FOR MAKING MY JOB FUN AND SO MUCH EASIER!
Every day is critical, and now we are jumping through hoops to accommodate the Emergency Operations Center as they hunker down to address the pandemic crossing the country.
Here are a couple of my favorite views:
911 Phone Network: (It's not a Pentagram!)
I've been working on a multi tab dashboard (Summary, Leadership, Virtualization, WAN, & our Line of Business Stacks) with my own view set to NOC rotation most of the time. I've found new ways of manipulating Orion's tools and a few headscratching moments but am thrilled to have a spot that's more than the default view.
I'm still working out how to do custom components for perf monitoring as well as glean active/disconnected session perf counters but so far I've got these up and running (sorry for all the empty fields at the moment..but infosec right?)
We're currently focusing our Pandemic response monitoring on Active VPN/Remote Users and our WAN Circuits utilization..
1. Active VPN/remote user monitoring: We're a Palo Alto shop using the GlobalProtect VPN client, so we're monitoring our GP gateway with OID 1.3.6.1.4.1.25461.2.1.2.5.1.3 (panGPGWUtilizationActiveTunnels). We use it to create different graphs based on the overall total count, and we also break the graphs down by operational region using a custom property.
2. WAN circuits are selected using another custom property applied to the interface that the provider is connected to. Then we filter the gadget with SWQL: i.CustomProperties.WAN_Monitor_Port = '1'
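The overall and per-region tunnel counts from item 1 boil down to a simple grouping. Here is a minimal sketch of that roll-up, assuming each polled gateway yields an active-tunnel count plus a region value from a custom property; the gateway names, region labels, and numbers are made up for illustration and are not from our environment.

```python
from collections import defaultdict


def tunnels_by_region(samples):
    """Sum active tunnels per region and overall.

    samples: iterable of (gateway, region, active_tunnels) tuples, where
    region stands in for the custom property value on the node.
    """
    totals = defaultdict(int)
    for _gateway, region, active in samples:
        totals[region] += active
    return dict(totals), sum(totals.values())


# Hypothetical poll results for three GlobalProtect gateways.
samples = [
    ("gp-gw-east-1", "Americas", 420),
    ("gp-gw-east-2", "Americas", 380),
    ("gp-gw-lon-1", "EMEA", 250),
]
by_region, overall = tunnels_by_region(samples)
```

Keeping the region on the node as a custom property means a new gateway lands in the right graph as soon as the property is set, with no change to the grouping logic.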
We're evolving the views/dashboards as we run into items that we need to keep a closer eye on, since we've never been through a global issue like this. We're also hitting new high-water marks daily and going into uncharted territory. Nothing is normal anymore.
Here's our home page view. BTW, our SolarWinds environment only monitors the actual network, no servers other than SolarWinds itself.
Let me know if you have any questions..
Stay Healthy.. Ken
We are also focused on monitoring our VPN nodes, WAN routers and circuits, and active user connections.
I wanted to create an additional chart with current user connections instead of a table, but couldn't find the correct OID for Check Point VPN (it's not 184.108.40.206.4.1.26220.127.116.11.3). So I just went for 1.3.6.1.4.1.2620.500.9000 and counted the entries.
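Counting the entries of a walked SNMP table, as described above, can be sketched like this. It is a rough illustration only: it assumes the table OID reads 1.3.6.1.4.1.2620.500.9000, that rows follow the usual table.entry.column.index layout, and that the walk results are already available as (oid, value) pairs; the sample values are invented.

```python
def count_table_rows(varbinds, table_oid):
    """Count rows in an SNMP table walk given as (oid, value) pairs.

    Assumes the conventional layout <table>.<entry>.<column>.<index>, and
    counts distinct row indexes in the most populated column.
    """
    prefix = table_oid.strip(".") + "."
    rows_per_column = {}
    for oid, _value in varbinds:
        oid = oid.strip(".")
        if not oid.startswith(prefix):
            continue  # not part of this table
        suffix = oid[len(prefix):].split(".")
        if len(suffix) < 3:
            continue  # too short to hold entry, column and index
        column = suffix[1]
        index = ".".join(suffix[2:])
        rows_per_column.setdefault(column, set()).add(index)
    if not rows_per_column:
        return 0
    return max(len(rows) for rows in rows_per_column.values())


# Hypothetical walk output: two connected users, two columns each,
# plus one unrelated OID that must be ignored.
TABLE = "1.3.6.1.4.1.2620.500.9000"
walk = [
    (TABLE + ".1.1.0", "10.0.0.5"),
    (TABLE + ".1.1.1", "10.0.0.6"),
    (TABLE + ".1.2.0", "alice"),
    (TABLE + ".1.2.1", "bob"),
    ("1.3.6.1.2.1.1.3.0", "123456"),
]
active_users = count_table_rows(walk, TABLE)
```

Counting distinct indexes rather than raw varbinds keeps the number right no matter how many columns the walk returns per user.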
Our network is a big hub and spoke, and any users from the spokes who want to work remotely must VPN in via the hub. This means that files pulled from spokes over the VPN consume bandwidth in the datacenter. Our top metrics are:
- Total VPN users
- Avg/Peak bandwidth on datacenter circuits
- CPU/mem on core networking hardware
Nothing fancy at all. Simple but effective. Luckily for our organization (K-8 education) a vast majority of our resources are on the internet, so there isn't a whole lot to monitor. We're not hosting our own collab services and our buildings are empty...
Love this dashboard! Is this custom built or something OOB? I am particularly interested in the AnyConnect chart you have here. Are you reporting on Active Sessions from your ASAs? If you have more than one ASA, did you have to create a custom poller to report a total? I know it's a busy time for all of us, so drop a line whenever you free up. No rush.
We're using split tunnel, thank goodness. If we weren't then the datacenter would be overrun before 8am. If we weren't it would have expedited our order for more bandwidth into the building. It would also likely mean we had made some conscious decision to monitor our users way more than we currently do, which would probably mean I'd be interested in some kind of flow data for context on what the traffic is.