2 Replies Latest reply on Jun 30, 2009 2:35 PM by lhorstma

    Interface chart lines/ sample interval after upgrade

    lhorstma

      It seems that some of my charts aren't displaying correctly since migrating and upgrading from 8.5 to 9.1.

      Below are two screenshots, the first using a 30 minute sample interval and the second using a 10 minute sample interval.

      [IMG]http://img81.imageshack.us/img81/769/orioncharts.jpg[/IMG]

      If I extend the interval to 2 hours I get the pretty chart I expect:

      [IMG]http://img31.imageshack.us/img31/2894/orion2.jpg[/IMG]

      When I look at the raw data, I see data points only about every 40 minutes. I'm set to collect statistics every 9 minutes, so I would expect every interval to be filled when using the 10 minute interval. Does anyone have any ideas about what's going on?
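For anyone checking the same symptom, here is a rough sketch of how sample timestamps could be scanned for spacing wider than the polling interval. The timestamps below are made up for illustration (they are not from the Orion database), and `find_gaps` and its `tolerance` parameter are hypothetical names, not part of any Orion tool.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (previous, current, delta) for every pair of consecutive
    samples spaced further apart than expected_interval * tolerance."""
    ordered = sorted(timestamps)
    limit = expected_interval * tolerance
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        delta = cur - prev
        if delta > limit:
            gaps.append((prev, cur, delta))
    return gaps


# Hypothetical sample times: a 9 minute poll that sometimes skips to 40 minutes.
start = datetime(2009, 6, 30, 8, 0)
samples = [start + timedelta(minutes=m) for m in (0, 9, 18, 58, 67, 107)]

gaps = find_gaps(samples, timedelta(minutes=9))
for prev, cur, delta in gaps:
    print(prev.strftime("%H:%M"), "->", cur.strftime("%H:%M"), "gap of", delta)
```

With a 10 minute chart interval, every gap reported this way would show up as a break in the line, while a 2 hour interval would smooth over it.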

          • Re: Interface chart lines/ sample interval after upgrade
            lhorstma

            Looks like we've gotten it resolved. A week ago I scripted a change to SNMP security settings enterprise-wide to add support for a third polling engine. Due to a misconfiguration on a large number of servers (maybe 100), the change actually removed support for our two existing polling servers. Those 100 or so servers started sending a flood of authentication-failed traps to Orion, which then filed them away in the database. The database grew from about 2 GB to 16 GB in two days (I was off those days) before I found the issue, corrected the security settings on those servers, and put a filter on the traps to discard authentication-failed messages. I talked to my DBA about the growth of the database and he didn't think it would cause any issues, so it was left as is for several days.

            I went through the doc you provided and found the database disk subsystem was going pretty wild. The avg disk queue length was pretty high (between 16 and 20), and avg disk sec/write and avg disk sec/read were both hovering between 40 and 50 ms. I called support and we went through several things before truncating the traps (3.5 GB) and trapvarbind (10 GB) tables in the Orion DB. After that, all the disk counters dropped to low levels and the data gaps disappeared. Now, if only I could put back all the hair I was pulling out...
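As a quick sanity check on the numbers in this reply, the arithmetic can be sketched as below. All figures are the approximate ones quoted above; the variable names are only illustrative, and the table sizes were measured several days after the flood, so the result is a rough estimate.

```python
# Approximate figures quoted in the post, in GB.
db_before = 2.0
db_after = 16.0
days_of_growth = 2
traps_table = 3.5
trapvarbind_table = 10.0

growth = db_after - db_before                 # how much the database grew
growth_per_day = growth / days_of_growth      # average growth rate
truncated = traps_table + trapvarbind_table   # space held by the two trap tables
share_of_growth = truncated / growth          # fraction of growth they explain

print(f"growth: {growth} GB, about {growth_per_day} GB per day")
print(f"trap tables: {truncated} GB, {share_of_growth:.0%} of the growth")
```

On these numbers the two trap tables account for nearly all of the growth, which fits with the disk counters settling down once they were truncated.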