Orion Platform alerts can be sent to LEM (traps? emails? syslog?) for further analysis and correlation
"To my mind, LEM could be your most used "pane of glass",...."
For what it's worth: I agree, with some caveats.
I ran a monitoring system with an event manager that collected events from several different monitoring systems. I used it as a single pane of glass: it was literally the first tool I checked when things went wrong or when researching issues, and I recommended others do the same. Most other people did not use it this way, though. They were the individual monitoring tool owners (and the event UI was slow to load), and since they weren't sending all alerts/events to the central system (just those that were "bad enough to send notifications on"), they considered their own tools to have more timely and relevant data than the central tool. It was also extremely difficult to get people to send any more events/alerts than they were absolutely required to. I believe this was because people don't like everyone knowing all the dirty details of what goes on inside their apps:
1. Anyone could view the "single pane of glass", and they didn't want a bunch of small "non-issues" that they would constantly have to answer for or spend an inordinate amount of time trying to fix (like when some device always has some metric that's a little higher than "best practices" and it's too small an issue to bother with).
2. It was a lot more work to deal with sending all the data and making sure it's "perfect", so people hedge by sending only the perfect alerts. The attitude was: "I'll send you this single "down" alert, but only after it's really, really, really down. And I'm OK with everyone knowing it's officially down because it's really down and I already notified myself via email 5 minutes before I sent it in, so I don't get zinged by management for not knowing the status of my system the moment they get notified from the single pane of glass."
In addition, since the event manager had more technical data in it (due to the difficulty of hand cleaning and tagging data; most of the time we did what was needed to get the job done and no more), it was difficult to make a "management friendly" view inside it (i.e. charts and graphs, correlation that was easy to follow, everything always perfectly green with the red always showing the correlated issue, etc.). So we spent time extracting and parsing the information into reports (and other things) that could be run and imported into "the management tools" so management could have their single pane of glass. In the end I'm not sure anyone actually looked at these reports once they were made, as they always seemed to ask us directly for the info.
Oh yeah: A few teams *did* send everything in, but it was only to keep it simple on *their* side so they didn't have to make any decisions/work to clean up the data in their tools. They then asked us to finagle, modify, clean, block, correlate, and scrape all the information out of their events/alerts for them even though their tools had superior capabilities in them for some of these things.
I would like to go beyond requiring manual configuration to share alerts. My organization uses several SolarWinds products right now: NPM, NCM, IPAM, NTA, SAM and IVIM. We are looking for a new SIEM solution. We started by checking with Gartner, but they do not appear to rate LEM very highly due to limited functionality versus competing offerings.
It seems to me that the SolarWinds suite has all the functionality required if it were integrated more deeply with LEM. Ideally, LEM should have real-time access to the databases of those various components, if only to import the data into its own native database. That way it would have access to all the data that the other SolarWinds components are already collecting, including SNMP polling and traps, NetFlow/IPFIX flow data, traffic/application baselines, and server/application and network device configurations.
It would be equally good if, having analysed and correlated the data pulled from those sources and its other native sources (file integrity checks, OS and application logs, IDS profiles, vulnerability scans, etc.) and identified the root cause of an issue, it could then issue commands to those same applications to trigger alerts with recommended actions, or even autonomously initiate remediation procedures, e.g. trigger NCM to push a config change to a switch to disable or change the VLAN of a port connected to an infected host.
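To make the idea concrete, here is a minimal sketch of that kind of correlation in Python. It is not how LEM or any SolarWinds product works internally: the `Event` fields, the event kinds, the rule table and the `quarantine_port` action name are all made up for illustration. It only shows the shape of "events from several sources, about the same host, close together in time, map to one recommended action".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str       # illustrative label, e.g. "IDS" or "NTA"
    host: str         # host the event is about
    kind: str         # hypothetical event kind, e.g. "malware_signature"
    timestamp: float  # seconds since some common epoch


# Hypothetical rule table: if all of these kinds are seen for one host
# inside the time window, recommend the named action.
RULES = [
    ({"malware_signature", "traffic_spike"}, "quarantine_port"),
]


def recommend_actions(events, window_seconds=300.0):
    """Return {host: action} for hosts whose events match a rule in the window."""
    by_host = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        by_host.setdefault(ev.host, []).append(ev)

    actions = {}
    for host, evs in by_host.items():
        for i, first in enumerate(evs):
            # Kinds seen from this event onward, within the window.
            kinds = {
                e.kind for e in evs[i:]
                if e.timestamp - first.timestamp <= window_seconds
            }
            for required, action in RULES:
                if required <= kinds and host not in actions:
                    actions[host] = action
    return actions
```

In the scenario above, the returned action would be the point where something like NCM gets asked to change the port's VLAN. That hand-off is the part that would need real product integration.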
To my mind, LEM could be your most used "pane of glass", showing incidents and problems at a high level, as well as their root causes and mitigation recommendations. Drilling down on an identified incident or problem would then take you into the other SolarWinds components, where the specifics of the symptoms could be shown for further analysis.
If this functionality already exists or I am over-thinking it, please disregard my suggestion.
To send alerts from Orion to LEM, you should be able to fire them as SNMP traps to LEM using Orion's alert manager, then on LEM set up the Orion connector in Manage > Appliance > Connectors (or using connector discovery if you have had Orion alerts fire already) to have them generate LEM events.
To send events as traps from LEM to Orion, you can use the send SNMP trap action in rules, and you also need to enable the SNMP Active Response connector in Manage > Appliance > Connectors. On the Orion side they appear in the SNMP Traps view.
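If traps don't seem to arrive in either direction, one way to narrow it down is to temporarily point the sender at a test machine and check whether anything reaches it at all. The sketch below is a throwaway UDP listener using only the Python standard library. It is not part of Orion or LEM and does not decode SNMP. It only reports the sender address and whether the payload starts with the byte `0x30` (the BER SEQUENCE tag that SNMP messages begin with). The function names are my own, and the default port of 162 assumes the standard trap port, which usually needs elevated privileges to bind.

```python
import socket


def open_listener(host="0.0.0.0", port=162):
    """Bind a UDP socket; 162 is the standard SNMP trap port (needs privileges)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one_datagram(sock, timeout=5.0):
    """Wait for one datagram; return (sender_ip, payload) or None on timeout."""
    sock.settimeout(timeout)
    try:
        data, (addr, _port) = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return addr, data


def looks_like_snmp(payload):
    """Rough check only: SNMP messages are BER SEQUENCEs, so they start with 0x30."""
    return payload[:1] == b"\x30"


if __name__ == "__main__":
    listener = open_listener()
    result = receive_one_datagram(listener, timeout=60.0)
    if result is None:
        print("nothing received")
    else:
        sender, payload = result
        print(sender, len(payload), "bytes, SNMP-like:", looks_like_snmp(payload))
    listener.close()
```

If nothing shows up here, the problem is on the sending side or in the network (firewall, wrong destination), not in the connector configuration.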
If there's stuff we can improve on here, let us know.