I can't show you my Dashboard for security reasons, but our team uses the Dashboard for two reasons. I use it for availability and to quickly spot any denial-of-service type of events. My role is SA/SE, so I want to quickly see if the logs are coming in; if they are, I'm pretty sure my servers are up. I also want to see if the events are trending above average or spiking. I've been able to find two application errors because I saw the spike, drilled down, and found countless log events that pointed us to the problems.
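For anyone who wants to approximate the "trending above average or spiking" check outside the dashboard, here is a rough sketch in Python. It is not how LEM does it; the function name, the 10-minute baseline, and the 3-standard-deviation threshold are all assumptions picked for illustration, and the sample counts are made up.

```python
from statistics import mean, stdev

def find_spikes(counts, window=10, threshold=3.0):
    """Return indexes of minutes whose event count is well above the recent average.

    counts    -- list of events-per-minute values, oldest first
    window    -- how many preceding minutes form the baseline (assumed value)
    threshold -- how many standard deviations above the mean counts as a spike
    """
    spikes = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        avg = mean(baseline)
        spread = stdev(baseline)
        # Guard against a perfectly flat baseline, where stdev is 0.
        limit = avg + threshold * max(spread, 1.0)
        if counts[i] > limit:
            spikes.append(i)
    return spikes

# Made-up sample: steady traffic around 100 events/minute, one burst at index 12.
sample = [100, 98, 103, 101, 99, 102, 100, 97, 104, 101, 100, 99, 450, 102]
print(find_spikes(sample))  # [12]
```

A rolling baseline like this only flags a sudden jump relative to the last few minutes; a slow upward trend would need a longer window or a comparison against the same hour on previous days.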
My ISSE and ISSO are more interested in logins and access. They use the Events graph as well, but they also look at some rules I put in place and some of the out-of-the-box rules to see whether they have fired and how many times. They have found invalid access, as well as access that was valid but that folks had forgotten about.
The bottom line: I'm training folks to build their own dashboards, and we talk about and share the views. Teach a person how to fish, then they leave you alone to learn more funky stuff :)
Ooh, interesting! I have some follow-up questions, if you don’t mind.
Would you say the OpsCenter dashboard is more reactive or proactive when it comes to understanding issues that come up?
And what kind of customizations do you/your team make to your dashboards? Any custom resources?
It's a toss-up between proactive and reactive on issues. We saw some spikes in events/minute, found the issue, then put in a change request to the application team to fix it. So, we pivoted to proactive. We've also noticed some usage on our file servers, so the SAN lead is looking to add more space for the team accessing certain files.
We haven't made too many custom changes to the dashboard other than picking the widgets from the list provided in LEM. That is why I went with LEM, and more SolarWinds products are like this: they pretty much have what folks need right out of the box. Now, over time that will change as the users become more educated, but for this team here they like what they've got and see. The ISSO moved some of the widgets around for his dashboard and can see which rules have fired, then drills down. He found some NEs in there yesterday scanning production, so he basically told them to stop and to schedule the scans for the weekend and off hours. He also wasn't sure why they were scanning in general other than that they could; that is probably another story for him.
I did add rules for Linux, and BTW we need better connectors for Red Hat Linux. When root is used, which shows up in the /var/log/secure log file, a rule fires. The issue is that the log file for Red Hat doesn't match up with the connectors. We tried to use the Oracle Linux connector, but that doesn't work. The answer from tech support is to install the rsyslog package, which I can't do per the customer.
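To give an idea of what "seeing root being used" in /var/log/secure looks like, here is a minimal Python sketch. It is not the LEM connector or rule logic; the patterns are assumptions based on typical sshd, su, and sudo messages, the names are hypothetical, and the sample lines are illustrative rather than taken from a real system.

```python
import re

# Patterns for a few common ways root shows up in /var/log/secure.
# These are assumptions based on typical sshd/su/sudo messages; adjust to your own logs.
ROOT_PATTERNS = [
    ("ssh login as root", re.compile(r"sshd\[\d+\]: Accepted \S+ for root from (?P<src>\S+)")),
    ("session opened for root", re.compile(r"session opened for user root\b")),
    ("sudo to root", re.compile(r"sudo:\s+(?P<user>\S+) : .*USER=root\b")),
]

def find_root_usage(lines):
    """Yield (label, line) for every line that matches one of the root patterns."""
    for line in lines:
        for label, pattern in ROOT_PATTERNS:
            if pattern.search(line):
                yield label, line.rstrip("\n")
                break  # one hit per line is enough

# Illustrative sample lines, not taken from a real system.
SAMPLE = [
    "Apr 12 10:15:01 web01 sshd[2211]: Accepted password for root from 10.0.0.5 port 51234 ssh2",
    "Apr 12 10:16:40 web01 su: pam_unix(su:session): session opened for user root by alice(uid=1000)",
    "Apr 12 10:17:02 web01 sshd[2290]: Accepted publickey for alice from 10.0.0.7 port 51300 ssh2",
    "Apr 12 10:18:12 web01 sudo:    alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/bin/ls",
]

for label, line in find_root_usage(SAMPLE):
    print(f"{label}: {line}")
```

On a real server you would pass an open file handle for /var/log/secure instead of SAMPLE, which normally requires elevated read access to that file.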
I also added a rule for remote logins to MS servers; it emails our team and the ISSO. He likes seeing that rule: he caught an old DBA accessing the file servers and shut him down. The person was not supposed to be using an elevated account while working on engineering projects.
Hope that helps.
I understand this question is from April, but I hope you are still able to take suggestions. I believe the OPS Center Dashboard could be really useful; however, as it is right now, I don't use it because the widgets are so limited and hard to create/configure.
I'd like to use the Dashboard as an accurate and quick status location for my Director. I'd like it to become her 'one-stop-shop' that accurately displays 30-Day stats, 1-Week stats, and 24-Hour stats using pre-configured Widgets based on Industry Best Practices or Regulatory Compliance Standards (PCI, HIPAA, etc.). Additionally, I'd like the ability to create Widgets based on Rules, both pre-configured rules and ones that are individually configured (using an action response within a rule).
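To make the 30-Day / 1-Week / 24-Hour idea concrete, here is a small Python sketch of the kind of roll-up such a widget would need. This is not a SEM feature or API; the function name, the window labels, and the timestamps are hypothetical and only show the counting logic.

```python
from datetime import datetime, timedelta

# Look-back windows the widget would report on.
WINDOWS = {
    "24 hours": timedelta(hours=24),
    "1 week": timedelta(days=7),
    "30 days": timedelta(days=30),
}

def count_by_window(event_times, now):
    """Count how many events fall inside each look-back window ending at `now`."""
    return {
        name: sum(1 for t in event_times if now - span <= t <= now)
        for name, span in WINDOWS.items()
    }

# Made-up failed-logon timestamps for illustration.
now = datetime(2020, 6, 30, 12, 0)
failed_logons = [
    now - timedelta(hours=2),
    now - timedelta(hours=30),
    now - timedelta(days=3),
    now - timedelta(days=10),
    now - timedelta(days=45),
]
print(count_by_window(failed_logons, now))
# {'24 hours': 1, '1 week': 3, '30 days': 4}
```

The same counting would apply to any event type a rule can tag, such as failed logons or blocked emails, as long as each event carries a timestamp.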
Some of the metrics I'd like to see include:
1) Failed logon attempts (separate Server, SQL, Service & Application widgets)
2) Blocked emails (via Barracuda Gateway) due to SPF and/or DMARC Failure (and other email filtering options)
3) Recently added and disconnected Non-Agent Nodes
4) Recently added and disconnected Agent Nodes
5) SQL Injection attacks, recon, and other attack attempts based on SEM's Threat Feed