25 Replies Latest reply on Aug 7, 2009 4:46 PM by Network_Guru

    Need accurate reporting of availability.

      Management is looking for an availability report that does not include the time that the node was on a dial backup link.  Orion is currently set up to watch for when a node picks a new route to a specific device; when this occurs, it writes to the log and sends out an email.

      Is there currently a report, or one that can be created with SQL, that would give me the availability of each node without it being on dial backup?  Can someone point me in a direction to explore?

      Thanks

        • Re: Need accurate reporting of availability.

          Would an interface availability report work?  Do the interfaces go down when in dial backup mode?

            • Re: Need accurate reporting of availability.

              Unfortunately they don't.  I was just looking at that.  When a router is up on dial backup, all the interfaces show up/up.  I wonder if there is a way to filter out times when the router shows up but the UnDP poller is active on that device.  The UnDP poller is what we use to tell if they have chosen a new default route.

                • Re: Need accurate reporting of availability.

                  If you have APM, you could run a query on the Orion DB which checks the status of the UnDP and the device.  Any time the query returns both as up, you could mark the device as being on dial backup.  You should be able to do the same with Excel, but I'm not sure if you can report accurately on the history.

                  Does your UnDP poller cause an event in the Orion Event log?

                    • Re: Need accurate reporting of availability.

                      We do have APM.  I am not so familiar with it but I will definitely look into this.  The UnDP does cause an event, I believe.  If not, I can make it so.

                      Whether on the primary circuit or the backup circuit, all the interfaces show up.  I'm not completely sure why this is so with the backup interface, since it's a standard async phone line, but the reason the primary stays up is that it is looking at, in many cases, a DSL router and in other cases an MPLS end point.  As far as the interface is concerned, everything is normal.

                      Thanks for all the suggestions.. keep them coming.  =)

                        • Re: Need accurate reporting of availability.

                          The query below calculates downtime in minutes based on the Event table.  You can update the where statements to look at the particular events that you need.  This will give you results in minutes.  You can calculate this into a percentage once you have the data.  I did percentages first but was then asked for actual minutes of downtime.

                          Once you start getting data, you can massage it to get what you need. Remember that this reports downtime of a node, so you are going to have to change the event types in the WHERE clauses to the ones your UnDP generates.

                          SELECT
                              StartTime.EventTime,
                              Nodes.Caption,
                              StartTime.Message,
                              DATEDIFF(Mi, StartTime.EventTime,
                                  (SELECT TOP 1 EventTime
                                   FROM Events AS EndTime
                                   WHERE EndTime.EventTime > StartTime.EventTime
                                     AND EndTime.EventType = 5
                                     AND EndTime.NetObjectType = 'N'
                                     AND EndTime.NetworkNode = StartTime.NetworkNode
                                   ORDER BY EndTime.EventTime))
                          FROM Events StartTime
                          INNER JOIN Nodes ON StartTime.NetworkNode = Nodes.NodeID
                          WHERE (StartTime.EventType = 1) AND (StartTime.NetObjectType = 'N')
                          ORDER BY Nodes.Caption
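
For the percentage step mentioned above, a minimal sketch that rolls the per-outage minutes up into a per-node total and an availability figure. It reuses only the Events and Nodes columns from the query above; the 30-day window, the @Days variable, and the output column names are assumptions for illustration. An outage with no matching "up" event yet is counted up to the current time, and outages that began before the window are not included.

```sql
-- Sketch: total downtime minutes and availability % per node, last @Days days.
DECLARE @Days INT;
SET @Days = 30;   -- assumed reporting window

SELECT
    Outage.Caption,
    SUM(Outage.DownMinutes) AS TotalDownMinutes,
    -- 1440 minutes per day; the 100.0 / 1440.0 literals force decimal math
    100.0 * (1 - SUM(Outage.DownMinutes) / (@Days * 1440.0)) AS AvailabilityPct
FROM (
    SELECT
        Nodes.Caption,
        DATEDIFF(mi, StartTime.EventTime,
            ISNULL((SELECT TOP 1 EndTime.EventTime
                    FROM Events AS EndTime
                    WHERE EndTime.EventTime > StartTime.EventTime
                      AND EndTime.EventType = 5
                      AND EndTime.NetObjectType = 'N'
                      AND EndTime.NetworkNode = StartTime.NetworkNode
                    ORDER BY EndTime.EventTime),
                   GETDATE())) AS DownMinutes   -- still down: count up to now
    FROM Events AS StartTime
    INNER JOIN Nodes ON StartTime.NetworkNode = Nodes.NodeID
    WHERE StartTime.EventType = 1
      AND StartTime.NetObjectType = 'N'
      AND StartTime.EventTime >= DATEADD(dd, -@Days, GETDATE())
) AS Outage
GROUP BY Outage.Caption
ORDER BY Outage.Caption;
```

Nodes with no down events in the window do not appear in the result, which corresponds to 100% for that period.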

                            • Re: Need accurate reporting of availability.

                              Thanks Lnavin, I think that might work very well.  However, I need to pull out my newbie card and admit that I don't know where to use that statement.  I tried to open Report Writer and type it into the SQL box, but it does not seem to allow that.

                              I opened APM (since we spoke of it earlier) but I could not find any mention of reporting.

                              Could you point me to the particular utility I should be looking for?  Your help has been most appreciated.

                              Jim

                              • Re: Need accurate reporting of availability.

                                After messing with this script for a bit, I have come to realize that it isn't going to get the data that I am looking for.  The problem is that our particular server is too unstable, so it doesn't always send an alert or log a problem.  The data that came out was inconsistent.

                                I was wondering if one of you SQL gurus could help me take a different approach.  Would it be possible to write a report that shows the amount of time in the last 30 days during which bandwidth was being used on a particular interface?

                                If I could do that, then my report could include data from the primary interface and the async interface on each router.  That should give me the amount of time in minutes that it was on primary and the time it was on backup.

                                Taking it a step further, it would be nice if I could identify which interface on each router is the primary so that the report could base itself on that.  The reason is that the primary interface changes depending on the provider and type of circuit that we are using at that location.

                                  • Re: Need accurate reporting of availability.
                                    Network_Guru

                                    Hi Geoo,

                                    Why not monitor availability of a device behind the router?
                                    We have several hundred remote sites connected via VPN tunnels to a pair of head end routers. Each site has a pair of routers connected to different service providers.
                                    Rather than report on the availability of each router at the remote site, I monitor the availability of the switch that both routers are connected to.

                                    I schedule the custom SQL report mentioned above, to run every Monday morning showing the switch outage duration in minutes over the last 7 days.
                                    You could also use one of the pre-canned % availability reports based on the switch uptime.

                                    Even if you don't have a switch or other device behind the router to monitor, you should be monitoring a Loopback IP on your router. The loopback should be pingable regardless of the connection type (VPN, WAN or dial-up). If the loopback is not pingable then both connections must be down.

                                      • Re: Need accurate reporting of availability.

                                        Hey NG

                                        Thanks for the response.  Actually, the problem is that it does ping regardless of interface type.  For example, I am monitoring a node for up/down and most of the month it is up.  I want a report to show me how often it is down, but the problem is that when the node goes to dial backup, Orion still sees the node as up.  So a node might show a 99.9% uptime, but in reality it is off the primary circuit 40% of that time.  How do I show my boss which of the various circuit types we have is better?

                                        Also, monitoring a device behind the router doesn't help, for the same reason.

                                        Thanks for the suggestions though.  If I misunderstood or if you come up with something else, I am all ears (eyes).  This problem is driving me crazy and, unfortunately, I don't know SQL well enough to get what we need.

                                          • Re: Need accurate reporting of availability.
                                            Network_Guru

                                            I understand now, you want availability of the primary circuit, not the node.
                                            I also created a weekly report several years ago which lists the minutes of dial-up time per site.

                                            I'll have to check the code, but I believe it uses the syslogs from the head-end RAS router to log dial-up events into the Orion DB.
                                            It then uses code similar to what was posted earlier, to calculate the minutes of dial-up per week.

                                              • Re: Need accurate reporting of availability.

                                                Exactly!  I would be very interested in your weekly report too.  I don't have a problem pointing our RAS to the Orion syslog server to get the results, and I assume I would have to change some of the parameters in your report.

                                                At the end of the day, if I could identify (previous 30 days if possible) that a node was on dial backup then I could, even manually, subtract that from the total uptime and give the people that pay for this what they are looking for. 

                                                Thanks for sticking it out with me on this one. 

                                                  • Re: Need accurate reporting of availability.
                                                    afsprau

                                                    Hello,

                                                    I am trying to do something much like this, getting a report that tells me how often the backup circuit has saved the day.  In my case the backup circuits are DSLs and T1 with Tunnels built on them, sometimes on a 2nd router sometimes on the same as primary.  

                                                    Right now I monitor the HSRP address as a node, and if it goes down I know the site is down.  (If a site has no HSRP, I mark it in the comments as NO HSRP, and my report knows to treat that node as the HSRP.)

                                                    If I can pull information such as flow from the backup circuit, then whenever it goes above a set rate, the site is on backup.

                                                    Then if both reports can be combined we would then know  A) when the site was down hard  and B) when it was limping along on the backup circuit.

                                                  • Re: Need accurate reporting of availability.

                                                    If I could bother the community just a little further.  :)

                                                    I almost have something that will work as well.  First I used custom properties to identify which interfaces are primary and which are backups.  Then I created a report to show how much traffic passed through each interface.  The total of those two should be the total usage for the entire month.

                                                    How can I modify it so that my report shows the total, the broken down usage, and the percentage of the total for both?
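
One way the total, the breakdown, and the percentages could be produced in a single query is conditional aggregation. The sketch below assumes an interface custom property stored as a column named CircuitRole with the values 'Primary' and 'Backup' (a hypothetical name), and assumes the traffic history lives in an InterfaceTraffic table with InterfaceID, DateTime, In_TotalBytes and Out_TotalBytes columns. Check all of these names against your own Orion database before using it.

```sql
-- Sketch: per-node traffic split between primary and backup, last 30 days.
-- CircuitRole is a hypothetical custom property; table/column names are assumptions.
SELECT
    Nodes.Caption,
    SUM(ISNULL(T.In_TotalBytes, 0) + ISNULL(T.Out_TotalBytes, 0)) AS TotalBytes,
    SUM(CASE WHEN I.CircuitRole = 'Primary'
             THEN ISNULL(T.In_TotalBytes, 0) + ISNULL(T.Out_TotalBytes, 0)
             ELSE 0 END) AS PrimaryBytes,
    SUM(CASE WHEN I.CircuitRole = 'Backup'
             THEN ISNULL(T.In_TotalBytes, 0) + ISNULL(T.Out_TotalBytes, 0)
             ELSE 0 END) AS BackupBytes,
    -- NULLIF avoids a divide-by-zero when a node passed no traffic at all
    100.0 * SUM(CASE WHEN I.CircuitRole = 'Primary'
                     THEN ISNULL(T.In_TotalBytes, 0) + ISNULL(T.Out_TotalBytes, 0)
                     ELSE 0 END)
          / NULLIF(SUM(ISNULL(T.In_TotalBytes, 0) + ISNULL(T.Out_TotalBytes, 0)), 0) AS PrimaryPct,
    100.0 * SUM(CASE WHEN I.CircuitRole = 'Backup'
                     THEN ISNULL(T.In_TotalBytes, 0) + ISNULL(T.Out_TotalBytes, 0)
                     ELSE 0 END)
          / NULLIF(SUM(ISNULL(T.In_TotalBytes, 0) + ISNULL(T.Out_TotalBytes, 0)), 0) AS BackupPct
FROM InterfaceTraffic AS T
INNER JOIN Interfaces AS I ON T.InterfaceID = I.InterfaceID
INNER JOIN Nodes ON I.NodeID = Nodes.NodeID
WHERE I.CircuitRole IN ('Primary', 'Backup')
  AND T.DateTime >= DATEADD(dd, -30, GETDATE())
GROUP BY Nodes.Caption
ORDER BY Nodes.Caption;
```

Note that this gives each circuit's share of bytes, not of time. A slow dial line carries far fewer bytes per minute than the primary, so the backup percentage will understate how long the site was actually on backup.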

                                                    Jim

                                                      • Re: Need accurate reporting of availability.
                                                        r0berth1

                                                        you can set the filter group to any instead of all, and use:

                                                        name of custom property = Primary

                                                        name of custom property = Secondary

                                                        Then you can group the report by node name, which will show both interfaces under that node. But I am not sure how you would get the total.

                                                          • Re: Need accurate reporting of availability.

                                                            Yep, and I am at that point.  I'm just not sure how to get the calculations I need.  I know how to calculate it, but I don't know how to format my SQL statements to calculate it, or whether there is already something built in that will get those numbers for me.

                                                              • Re: Need accurate reporting of availability.
                                                                Network_Guru

                                                                I had a look at some reports and recalled the issues I ran into trying to use the RAS router syslogs for this.
                                                                What I finally ended up doing was creating an alert for all remote routers based on the transmit utilization of the async interface.
                                                                When the interface utilization exceeds 100bps, an alert is set (and logged in the Orion Event log). When the transmit falls below 10bps, a reset event is logged.

                                                                Since I'm only polling the interface every 5 minutes, it may not pick up a very brief dial-up, and times are offset by a delay of about 5 minutes. The report I created counts the number of dial-ups per site in a 7-day period, to highlight sites with chronic WAN issues.

                                                                I was able to use the pre-canned "events" report to create this report.
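
For anyone who prefers raw SQL over the pre-canned report, a minimal sketch of the same count. It assumes the alert writes an event whose message contains a fixed phrase (the 'Dial backup active' text below is a hypothetical placeholder for whatever your alert logs) and that the event row carries the node ID in Events.NetworkNode, as in the query posted earlier in the thread.

```sql
-- Sketch: number of dial-up alert events per site over the last 7 days.
SELECT
    Nodes.Caption,
    COUNT(*) AS DialUpCount
FROM Events
INNER JOIN Nodes ON Events.NetworkNode = Nodes.NodeID
WHERE Events.Message LIKE '%Dial backup active%'        -- hypothetical alert text
  AND Events.EventTime >= DATEADD(dd, -7, GETDATE())
GROUP BY Nodes.Caption
ORDER BY DialUpCount DESC;                               -- worst sites first
```

Only the alert (set) events are counted here, not the resets, so each dial-up is counted once.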

                                                                Another way to do this is based on syslogs from the remote routers.
                                                                I created a filter looking for Async up and Async down to calculate the number of dialups per site. A potential problem with this is the syslog may not make it to the Orion server if the router is off-line and in the process of dialing home.

                                                                I spent a lot of time trying to use the RAS logs, but was unable to find a way to correlate the site dialing in with the site in Orion.
                                                                Now that we have dual routers at each site, dial backup is no longer required. Instead, we use the dial-up line for accessing the site if both routers/links are off-line.

                                                  • Re: Need accurate reporting of availability.
                                                    r0berth1

                                                    I also have one broken down by location for last month between the hours of 7 A.M. and 7 P.M. on weekdays, and another broken down by location for last month between the hours of 7 A.M. and 2 P.M. on Sat.  They both require some custom fields and display the % uptime instead of minutes, but they work great.

                                                    • Re: Need accurate reporting of availability.



                                                      Back to this type of solution, I think.  What would I need to change in the SQL statement posted earlier in this thread so that it looks at the event message (e.g. 'DialBackup Interface Down') instead of the event type?  The reason is that if I use event types 10 and 11 to show when interfaces came up or down, I would get every interface that went up or down, and I am only interested in the dial backup interface.

                                                       

                                                       

                                                      Thanks   

                                                        • Re: Need accurate reporting of availability.

                                                          You can update the query with an additional WHERE clause to limit it to the dial-up events by matching on the exact message generated by the event.  You will also need to add the WHERE clause to the nested SELECT.

                                                          Sample Where clause

                                                          Where events.message LIKE 'Enter message here'
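
Putting that together, a sketch of the earlier downtime query with the message filter applied in both the outer query and the nested SELECT. It assumes the backup interface logs one message when it comes up and another when it goes down; 'DialBackup Interface Down' is the example text from the question, while 'DialBackup Interface Up' and the MinutesOnBackup column name are assumed for illustration. It also assumes these events store the node ID in NetworkNode.

```sql
-- Sketch: minutes on dial backup per occurrence, matched by event message.
SELECT
    StartTime.EventTime,
    Nodes.Caption,
    StartTime.Message,
    DATEDIFF(mi, StartTime.EventTime,
        (SELECT TOP 1 EndTime.EventTime
         FROM Events AS EndTime
         WHERE EndTime.EventTime > StartTime.EventTime
           AND EndTime.NetworkNode = StartTime.NetworkNode
           AND EndTime.Message LIKE '%DialBackup Interface Down%'  -- backup ends
         ORDER BY EndTime.EventTime)) AS MinutesOnBackup
FROM Events AS StartTime
INNER JOIN Nodes ON StartTime.NetworkNode = Nodes.NodeID
WHERE StartTime.Message LIKE '%DialBackup Interface Up%'           -- backup starts (assumed text)
ORDER BY Nodes.Caption, StartTime.EventTime;
```

LIKE without the % wildcards behaves as an exact match, so keep them if the logged message includes the node or interface name around the fixed phrase. MinutesOnBackup is NULL for a site that is still on backup, because no closing event exists yet.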

                                                        • Re: Need accurate reporting of availability.

                                                          If I am logging up/down dial backup messages to the Orion syslog server, can I change this to look at that instead of the event log? 

                                                    • Re: Need accurate reporting of availability.
                                                      kweise

                                                      When the router is not on backup, does the dial interface show down?  If so, you might be able to create a report and filter out times when the dial interface shows up.