    Troubleshooting with ORION NPM

      I finally have ORION NPM 10.1 up and running. I have an atlas configured with all the sites displayed as green dots. Now I am trying to get used to the product and actually use it to troubleshoot an issue. I have a site in New Jersey that has been complaining for weeks that almost every morning their network speed comes to a near standstill for 30 minutes or so, then seems to recover. I see NJ on my atlas and it is green. I click on the node and the dashboard appears with the dials displaying 53ms average response time and 0% packet loss. This all appears fine, but then I look over at the histogram on the right and see a lot of short green bars hanging around 40ms, and then around 8:30am there is one bar that is almost 280ms before it returns to the normal 40ms. This one spike is happening about when the users are complaining about network speed. Now my question: now that I have ORION NPM up and running and definitely know there is an issue in one office at the same time each morning, what tools should I use? What is the next step? Is there a trap I can run or some kind of alert I can set? I have very limited exposure to ORION, so all input is appreciated. Thanks.

          I guess the question is what are you looking for at this point?  Do you want to be alerted in some way every time this happens?

          To me it would seem that you have at least partially confirmed the issue that your users are experiencing.  It sounds like a network issue, so my next step would be to get more data from the border device at that site. Start with SNMP monitoring, if you are not already doing so, to see whether the link is getting saturated or the device is getting overloaded in some way.

          If all you want is alerts every time the response time on the device goes above a specified threshold, this would be configured as an Advanced Alert in the Advanced Alert Manager console tool.  You will need to get familiar with the Advanced Alert system as it's the bread & butter of Orion.  There are lots of documents on using Advanced Alerts such as the Orion Administrators Guide and the Advanced Alert White Paper.
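
          To make the threshold idea concrete, here is a minimal Python sketch of the check such an alert performs. It is not Orion's alert engine or API; the Sample class, the find_threshold_breaches function, the 150 ms threshold, and the sample data (modeled loosely on the ~40ms baseline and ~280ms spike described above) are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    """One polled response-time reading (hypothetical structure)."""
    timestamp: datetime
    response_ms: float


def find_threshold_breaches(samples, threshold_ms):
    """Return the samples whose response time exceeds threshold_ms."""
    return [s for s in samples if s.response_ms > threshold_ms]


# Hypothetical readings taken every 10 minutes starting at 08:00.
# The date is arbitrary; only the time of day matters here.
start = datetime(2000, 1, 3, 8, 0)
values = [41, 39, 40, 280, 42, 40]
samples = [
    Sample(start + timedelta(minutes=10 * i), v)
    for i, v in enumerate(values)
]

# 150 ms is an assumed threshold, well above the ~40 ms baseline.
breaches = find_threshold_breaches(samples, threshold_ms=150)
for b in breaches:
    print(f"{b.timestamp:%H:%M} response time {b.response_ms:.0f} ms exceeds threshold")
```

          In the real product you would express the same condition in the Advanced Alert Manager rather than in code; the sketch only shows what "response time above a specified threshold" means against the kind of data in the histogram.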

          Hope this helps and provides you with some options to consider.  Please let us know if you have additional questions.

              NetFlow may be the right answer, but I think byrona is on the right track.

              With the router (just because that is the worst choke point), are you monitoring it with SNMP?
              Can you see CPU statistics? (If not, and it is a Cisco, you can use "show proc cpu history")
              How is the processor at that time?
              (I had a 100 Mb circuit hooked up to a Cisco 3640 and it couldn't handle more than 60, and I knew it because the CPU was pegged but the interface wasn't.)

              If it does have SNMP, did you select the interface?
              (Both inside and outside, there could be some silly routing going on that is bouncing all the inside data from a specific node to the router's inside, then to another node on the inside.)

              If the interface utilization is high, this is where you need NetFlow.  If you want a SolarWinds product, there is the Orion add-on NTA, the Engineer's Toolset, or the free one (http://www.solarwinds.com/products/freetools/netflow_analyzer.aspx)
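
              For reference, interface utilization is normally derived from two successive SNMP octet-counter readings (for example IF-MIB ifInOctets) and the interface speed (ifSpeed). Here is a rough Python sketch of that arithmetic, assuming 32-bit counters that wrap at most once between polls. The function names, the T1 link speed, and the counter values are made up for illustration; this is not how Orion itself is implemented.

```python
COUNTER32_MODULUS = 2 ** 32  # 32-bit SNMP counters wrap back to 0 at this value


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Octets counted between two polls, assuming at most one counter wrap."""
    return (current - previous) % modulus


def utilization_percent(prev_octets, curr_octets, interval_seconds, if_speed_bps):
    """Average utilization over the polling interval, as a percentage."""
    if interval_seconds <= 0 or if_speed_bps <= 0:
        raise ValueError("interval and interface speed must be positive")
    bits = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits / (interval_seconds * if_speed_bps)


# Hypothetical example: a T1 (1,544,000 bps) polled every 300 seconds.
T1_BPS = 1_544_000
util = utilization_percent(
    prev_octets=1_000_000,
    curr_octets=29_950_000,
    interval_seconds=300,
    if_speed_bps=T1_BPS,
)
print(f"Inbound utilization: {util:.1f}%")

# A reading taken just after the counter wrapped still gives a sane delta.
wrapped = counter_delta(COUNTER32_MODULUS - 1000, 500)
print(f"Octets across a wrap: {wrapped}")
```

              On fast links 32-bit counters can wrap more than once per poll, which is why the 64-bit ifHCInOctets counters are preferred where the device supports them. Also note that a 5-minute average can hide a short burst, so a link can look fine on the graph and still be saturated for a minute or two.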

              Let us know if it was CPU or interface, or something else, and I can try to come up with other ideas.

                  Thanks for replying, everyone! You have all given me a lot to think about this Thanksgiving weekend...and I thought I was going to be just relaxing and enjoying days off! I am leaning towards NetFlow also. I am spending this weekend getting more familiar with it. I will say I certainly know a lot more about the SolarWinds products than I did 2 weeks ago. At this rate, I will be SCP certified by the end of the year! LOL

                My suggestion would be to now turn to NetFlow to see "what" is causing the issue on your WAN link. You can download a copy of the free SolarWinds analyzer here: http://www.solarwinds.com/products/freetools/netflow_analyzer.aspx

                Beyond that, now that you have the timeframe, you just need to monitor exactly what is happening during that window.
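
                If you export whatever you are polling (utilization, CPU, response time) with timestamps, a small script can compare the problem window against the rest of the day. This is only a sketch with invented readings; the 08:15 to 09:00 window and the helper names are assumptions, not anything built into Orion.

```python
from datetime import datetime, time


def in_daily_window(ts, start, end):
    """True if the timestamp's time of day falls in [start, end)."""
    return start <= ts.time() < end


def split_by_window(readings, start, end):
    """Split (timestamp, value) pairs into values inside and outside the window."""
    inside = [v for ts, v in readings if in_daily_window(ts, start, end)]
    outside = [v for ts, v in readings if not in_daily_window(ts, start, end)]
    return inside, outside


def mean(values):
    """Arithmetic mean, or None for an empty list."""
    return sum(values) / len(values) if values else None


# Invented readings (for example, percent utilization) over two mornings.
readings = [
    (datetime(2000, 1, 3, 8, 0), 20),
    (datetime(2000, 1, 3, 8, 30), 95),
    (datetime(2000, 1, 3, 9, 30), 25),
    (datetime(2000, 1, 4, 8, 0), 22),
    (datetime(2000, 1, 4, 8, 30), 91),
    (datetime(2000, 1, 4, 9, 30), 23),
]

inside, outside = split_by_window(readings, time(8, 15), time(9, 0))
print(f"Mean inside window:  {mean(inside)}")
print(f"Mean outside window: {mean(outside)}")
```

                If the same metric stands out in that window morning after morning, you know where to point NetFlow.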