
What is/was your biggest challenge as a new Orion Platform admin?

  • >> Other (comment below)

    Absolutely our biggest headache was understanding certain facts:

    • SolarWinds isn't as easy to start using as it is made out to be.
      • certainly true if you expect it to do everything you've been used to from another tool
    • You absolutely need someone (or several people) who understands and can use:
      • PowerShell
      • SQL
    • Understanding that Tech Support doesn't mean 'Tech Support' but rather it means break/fix
    • Understanding that SolarWinds pushes you to THWACK for support without full-time SolarWinds resources being there to help provide it, relying instead on the clients who pay their bills. Very odd way to do things, IMO.

    There's other stuff but the above are key.

  • As with any new monitoring tool, learning the layout of the landscape and understanding how it works in order to better use it takes some time.  Another important thing is learning its limitations with regard to what you are trying to do.  Things like what aspects of the database are exposed for the polling engine and any scripting that is involved.  Does the shell return a success if the command is issued, is it that the shell ends successfully, or is it the return code of the script that was executed?  These are just a few things that are important to know.  Then you get into fun times with maintenance windows: how are they handled by the tool, and can you set up a template that you apply to various monitors and/or alerting rules?  How does it handle varying maintenance windows over a week's time?  These are some of the challenges everyone faces with new tools.
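    To make the "which success is it?" question concrete, here is a minimal Python sketch that separates "the command could be launched" from "the script exited with code 0". It is not how Orion evaluates script monitors; the function name `run_check` and the three demo commands are hypothetical, made up purely for illustration.

```python
import subprocess
import sys


def run_check(args):
    """Run a command and report which kind of 'success' was observed."""
    try:
        proc = subprocess.run(args, capture_output=True, text=True, timeout=30)
    except OSError:
        # The command could not even be launched (e.g. it does not exist).
        return {"launched": False, "exit_code": None, "ok": False}
    return {
        "launched": True,                # the command was issued
        "exit_code": proc.returncode,    # what the script itself returned
        "ok": proc.returncode == 0,      # treat only exit code 0 as success
    }


# Hypothetical probe commands, assumed for this demo only.
failing = [sys.executable, "-c", "import sys; print('disk full'); sys.exit(3)"]
passing = [sys.executable, "-c", "print('all good')"]
missing = ["no-such-command-for-this-demo"]

for name, cmd in [("failing", failing), ("passing", passing), ("missing", missing)]:
    print(name, run_check(cmd))
```

    The failing probe launches fine and still prints output, so a tool that only checks "was the command issued" would call it a success; only the exit code shows the problem.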

  • I am converting the Atlas component monitors to the new dashboards, and that is definitely a challenge. What I also struggled with was moving the polling engine and additional web server to new systems. We renamed the systems but kept the old IP addresses. There is documentation for "new name new IP" and "existing name existing IP", but not really anything for "new name existing IP".

    We use the Atlas dashboards to display the status for all the components for one of our major applications. It changes between three screens of information. I need to move that to modern dashboards, and I am hoping that Lab 89 (that was a hint in a question earlier in this month's mission) has the answers for me.

    Just a quick aside - I went around and around with support on terminology. He said an issue we had was with our AWS. I said we don't have AWS, and he insisted that we did. That took several minutes back and forth. Then I finally figured out that he was not talking about Amazon Web Services but Additional Web Server. D'oh!

  • Trying to understand the logic (or lack thereof) used by previous staff in setting up alerts. In one case, each team set up their own alerts, so some alerts were node-specific (as in "alert me when node xyz = Down"), others were very general. Some used variables; most did not. Repeat x 400-500 alerts.

  • Yeah, I have learned the same thing. We used to offer 4 tiers of alerting schedules, but we eventually reduced it to 2:  24/7 (P1) and business hours only (P2). Usually everyone says they want P1 monitoring, and then we say, "OK, so that's going to wake you up at 2 AM if it goes down... that's what you want, right?"  About 1/3 of the time, they say, "Wait, maybe P2 is fine".
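    The two-tier schedule described above can be sketched in a few lines of Python. This is only an illustration: the 08:00-17:00, Monday-to-Friday definition of "business hours" and the function name `should_page` are assumptions, not anything taken from the poster's setup or from Orion itself.

```python
from datetime import datetime, time

# Assumed definition of business hours; adjust to your own organisation.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(17, 0)


def should_page(tier, when):
    """Decide whether an alert of the given tier should notify at `when`."""
    tier = tier.lower()
    if tier == "p1":
        # P1 is 24/7: always notify, including 2 AM.
        return True
    if tier == "p2":
        # P2 notifies only Monday-Friday inside the assumed business hours.
        is_weekday = when.weekday() < 5
        return is_weekday and BUSINESS_START <= when.time() < BUSINESS_END
    raise ValueError(f"unknown alerting tier: {tier!r}")


two_am = datetime(2024, 1, 3, 2, 0)  # a Wednesday, 2 AM
print("P1 at 2 AM:", should_page("P1", two_am))
print("P2 at 2 AM:", should_page("P2", two_am))
```

    Showing a team the P1 result for 2 AM is roughly the same conversation as "that's going to wake you up... that's what you want, right?".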

  • I had limited SQL experience/knowledge and had to get help from other SQL admins at work dealing with SQL issues.

  • Creating meaningful, usable alerts was one of the initial challenges.  And prior to that, understanding SNMP and SNMPv3, creating standards-based documentation and implementing those standards for every node (including naming conventions, SNMP strings, etc.).

    Later, upgrades were hard.  And they took a lot of time.  Lately they've been easy, if more time-consuming than I'd like (but they're NOTHING like they used to be; so much quicker now than back in 2005!).

    Obtaining funding for monitoring everything that ought to be monitored was a challenge.  Then obtaining funding for the additional modules, since NPM couldn't do it all, became the challenge.

    Getting training for it all--impossible.  It was all DIY and OTJ, working hard to get it right the first time and not worrying about trying to CMA.

    Finally, justifying the ongoing annual support/licensing expenses when my manager was a fan of home-built or best of breed products (especially those created using Open Source resources at no $ cost to us).

  • Justifying resource is always a headache.

    The SW docs say to add loads of memory & loads of cores, especially for large installations, but when you inspect each server's utilisation, with Orion, with DPA or directly on the servers, it's really difficult to find any evidence that will support the request for additional resource. Either the published resource requirements are wrong or the explanation of the effect of under-resourcing is wrong. I suspect the latter, which should be easier to fix.

  • Additionally, learning SWQL for coding advanced alerts and conditions that cannot be achieved by the standard GUI options.

  • My biggest challenge to date has been, and will most probably continue to be, improving the inherited Custom Properties for the environment being monitored.

    Making sure that all of the Nodes actually do have a Custom Property value assigned, even when it may not be applicable.

    I personally try to handle this by creating drop-down lists, all with positive list items that are not duplicated from other Custom Properties.

    In addition, I also try to avoid using just the Yes or No / True or False options.

    Where the Custom Property header is not applicable to a Node, I believe it is better to say ‘Not Applicable’ from a Drop Down List as opposed to leaving it Blank.

    That way, when the Custom Property value is included within an Alert or Event Notification, or in Reports, it means something instead of appearing as a Blank.
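    As a rough illustration of the "no Blanks" approach, the Python sketch below audits a list of nodes for missing or empty Custom Property values and fills them with 'Not Applicable'. It works on plain dictionaries rather than on Orion itself; the node data, the property names `Site` and `Owner`, and both function names are hypothetical.

```python
NOT_APPLICABLE = "Not Applicable"


def is_blank(value):
    """Treat None, empty strings and whitespace-only strings as blank."""
    return not str(value or "").strip()


def find_blank_properties(nodes, property_names):
    """Return {node caption: [property names that are missing or blank]}."""
    gaps = {}
    for node in nodes:
        props = node.get("custom_properties", {})
        blank = [name for name in property_names if is_blank(props.get(name))]
        if blank:
            gaps[node["caption"]] = blank
    return gaps


def fill_blanks(nodes, property_names, placeholder=NOT_APPLICABLE):
    """Return copies of the nodes with blank values replaced by the placeholder."""
    filled = []
    for node in nodes:
        props = dict(node.get("custom_properties", {}))
        for name in property_names:
            if is_blank(props.get(name)):
                props[name] = placeholder
        filled.append({**node, "custom_properties": props})
    return filled


# Hypothetical inventory export, assumed for this demo only.
nodes = [
    {"caption": "core-sw-01", "custom_properties": {"Site": "London", "Owner": ""}},
    {"caption": "lab-vm-07", "custom_properties": {"Site": None}},
]

print(find_blank_properties(nodes, ["Site", "Owner"]))
print(find_blank_properties(fill_blanks(nodes, ["Site", "Owner"]), ["Site", "Owner"]))
```

    Running the audit before and after the fill makes it easy to confirm that no node is left with a Blank that would later show up empty in an alert or report.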

    Please see my new Feature Request for: Custom Properties Custom Order List.

    Thank you in advance for Voting this Feature Request up.

    Kind Regards,

    Daniel

    DC4Networks