NasaGeek

We had a scheduled power outage in our data center this weekend, meaning our boxes would get a free power cycle. We brought them up last night and within 2 hours the IO went nuts. I saw this before leaving and recycled the SQL service on the SQL box, and it is still running stable today. Avg. Disk Queue Length is holding at a…

in 9.5 and SQL Performance Comment by NasaGeek January 2010

18 basic and 4 advanced. IOS change, we look for packet loss at certain areas, high utilization on others, CPU etc. Nothing that has caused us problems before. I can get you a more detailed list if you need. When I went to check in System Manager, the left pane blinks like it sometimes does, but this time it has not…

in 9.5 and SQL Performance Comment by NasaGeek January 2010

I'm not sure about Orion doing that, but you can easily do it with cron, Perl and Net-SNMP, and if you wanted to graph it you could throw the results to MRTG. I do several things like this that you just can't do with Orion. Perhaps Orion will become more powerful and allow us to give it results that it can then graph. You…

in Universal Device Poller - per second calculations. Comment by NasaGeek December 2008
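
A minimal sketch of the per-second calculation that such a cron job would need, assuming two successive readings of a 32-bit SNMP counter taken a known number of seconds apart. It is written in Python for illustration rather than the Perl mentioned above; the function name and the sample values are hypothetical, and fetching the readings (for example with Net-SNMP) is left out.

```python
def per_second_rate(prev_value, prev_time, curr_value, curr_time, counter_bits=32):
    """Return the average per-second rate between two counter samples.

    Assumes the counter wrapped at most once between the two samples.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    modulus = 2 ** counter_bits
    # Modular subtraction gives the right delta even if the counter wrapped once.
    delta = (curr_value - prev_value) % modulus
    return delta / elapsed


if __name__ == "__main__":
    # Two hypothetical counter readings taken 300 seconds apart.
    print(per_second_rate(1_000_000, 0, 4_000_000, 300))
```

The resulting rate could then be written out in whatever format the graphing tool expects.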

I was not on SP5 for 9.5. So we did this, and then restarted all the services. The problem cleared after this. We also noticed that this error was associated with our Custom Pollers not polling. I suspect that the issue was a simple bug and the restarting of the services corrected it.

in Error adding host Comment by NasaGeek March 2010

Not DNS either. This just happened recently and we have not made any changes to the system. I am opening a ticket. I will post the result from that when this is resolved.

in Error adding host Comment by NasaGeek February 2010

We had the IO staying at 30. We tried all the usual stuff to bring it back, including starting and stopping the SQL service. The only thing that worked today was stopping and restarting the NPM service on the second polling engine.

in 9.5 and SQL Performance Comment by NasaGeek January 2010

If I understand correctly, this would also solve my issue with forcing monitoring of interfaces not in an up/up status. Having to come to a seemingly quasi-supported forum to ask for feature requests seems a bit unprofessional to me. I am about to start rocking the boat, not here obviously. See my post in this forum "Forced…

in Error state when non critical interface are down Comment by NasaGeek December 2007

This problem is not just isolated to 2k8. I have a 2k3 server that stopped sending email as soon as we upgraded to 9.5 SP5.

in report scheduler - 9.5 sp5 - not emailing Comment by NasaGeek March 2010

Thank you. It is a good feature that gives the users much more control. I do like the behavior to automatically not monitor "down" interfaces. The automation this provides is invaluable, but to add this control would make Orion that much more powerful, and would allow us to monitor interfaces like the Cisco SPAN ports.

in Forced Polling on Non Up/Up Interfaces Comment by NasaGeek January 2008

No syslog and no traps.

in 9.5 and SQL Performance Comment by NasaGeek January 2010

Turning on Adaptive Read Ahead has brought the IO back to nominal levels. However, this raises the question: why is Orion performing so much more reading with 9.1 than with 8.5?

in Disk IO and Version 9 Comment by NasaGeek December 2008

Yes we have SP1. Before this when the maintenance would hang, the ADQL was in the 40's.

in Disk IO and Version 9 Comment by NasaGeek November 2008

Put a sniffer on that interface to see what it is. However, I have never done this and could not tell you how to do it. Look here. wiki.wireshark.org/.../Loopback

in MS TCP Loopback interface Comment by NasaGeek March 2008

The one I was thinking of is the following, but this raises the question of which one is used and whether one takes precedence over the other. The file is SWNetPerfMon.db, also in the root folder. And the line to modify is: ! Database Command timeout in seconds CommandTimeout=300 EDIT: Web.config is in the Inetpub Solarwinds root…

in Increasing time so web reports don't time out Comment by NasaGeek February 2010

Controllers as in hard drive? Only one with two back planes. It is the PERC 5/i. No netflow, no other modules.

in 9.5 and SQL Performance Comment by NasaGeek January 2010

The firewall is not running.
netstat -na | find /i "17777"
  TCP 0.0.0.0:17777 0.0.0.0:0 LISTENING
  TCP 127.0.0.1:1789 127.0.0.1:17777 ESTABLISHED
  TCP 127.0.0.1:17777 127.0.0.1:1789 ESTABLISHED
  TCP our-solarwinds-box:1788 our-secondary-poller:17777 ESTABLISHED
  TCP our-solarwinds-box:2547 our-secondary-poller:17777 ESTABLISHED…

in Error adding host Comment by NasaGeek February 2010
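
A rough sketch of how output like the netstat listing above could be filtered by port in a script, assuming the usual four-column layout (protocol, local address, remote address, state). The function name is hypothetical, and the sample text is just the lines quoted in the comment.

```python
def connections_for_port(netstat_text, port):
    """Return (local, remote, state) tuples for TCP rows that involve the port."""
    suffix = ":" + str(port)
    rows = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip blank lines, headers, and anything that is not a TCP row.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote, state = fields[1:4]
        if local.endswith(suffix) or remote.endswith(suffix):
            rows.append((local, remote, state))
    return rows


SAMPLE = """\
TCP 0.0.0.0:17777 0.0.0.0:0 LISTENING
TCP 127.0.0.1:1789 127.0.0.1:17777 ESTABLISHED
TCP 127.0.0.1:17777 127.0.0.1:1789 ESTABLISHED
TCP our-solarwinds-box:1788 our-secondary-poller:17777 ESTABLISHED
"""

if __name__ == "__main__":
    for local, remote, state in connections_for_port(SAMPLE, 17777):
        print(local, remote, state)
```

A LISTENING row for the port plus ESTABLISHED rows to the secondary poller suggests the port itself is reachable, which fits the comment that the firewall is not involved.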

After looking at the issue again, I noticed that we were not losing any data points. I looked further into the disk IO and found that the high average count is related to reads, not writes as I first assumed. The write ADQL on this disk is 0.09, and this is why we are not losing data points. I do not have 'read ahead'…

in Disk IO and Version 9 Comment by NasaGeek November 2008

We have had many issues with the polling engine and the database. One of them was very similar with our DB maintenance running all night (12 hours), and during this time losing essentially all data polled. We were able to get it down to .5-1 hours, and we lose minimal data. First let me complain about SW support. Most of…

in Database shrink Comment by NasaGeek January 2007

Bounce. I am calling support today to ask for this feature and comment on the lack of response from the employees on this forum.

in Forced Polling on Non Up/Up Interfaces Comment by NasaGeek January 2008

I found out that Juniper is going to fix this down/testing issue, and that it indeed should not have been "expected behavior". However, we still have the Cisco SPAN ports that report as Up/Down. I have seen other posts where people would like this feature. Any response from a Solarwinds employee would be appreciated.

in Forced Polling on Non Up/Up Interfaces Comment by NasaGeek December 2007

Thanks for your help. Yes, I have the 'all' selected. I have an almost identical alert for a custom poller rate that works fine with Numeric Status is Greater than.... but I was trying to get an alert anytime this counter moves. If the 'Has Changed' does not work with custom pollers, is there another trick to solve this? I'm…

in Help with an Alert Comment by NasaGeek July 2008

I had the same issue of not being able to log in. We are using NT domain logins for NPM. When you add NT domain users in NPM you have to include a fake password so NPM will accept the domain/user ID. You use your domain ID and password for NPM logins, but we found you must use the fake password when logging into Orion…

in Invalid Username / Password in Network Atlas Comment by NasaGeek March 2010

Support was unable to duplicate our issue. I suspect it has to do with differences in our server load. All federal government agencies are required to implement security benchmarks on their systems. If I am correct in my suspicions, this workaround may help others with similar benchmark requirements.…

in Scheduled Reports with Windows Pass Through Comment by NasaGeek February 2008

1.3.6.1.4.1.26866.1 - Gigamon GigavueMP http://www.gigamon.com/

in Tell us your "Unknown" devices! Comment by NasaGeek January 2009

Edit: Nevermind

in Tell us your "Unknown" devices! Comment by NasaGeek July 2008

Solarwinds still can't monitor Cisco SPAN ports. It would be nice to be able to force NPM to record these ports' statistics.

in If you're curious as to what we're working on... Comment by NasaGeek March 2010

See if you are losing data during that time. We have to run our own Rebuild Index jobs to keep our database clean. Every time we run them we get the failover message. We don't get failover for the Solarwinds nightly maintenance. Since we don't lose any data, we just ignore it. I wish Solarwinds had chosen a different…

in Failover to Hot Standby Comment by NasaGeek February 2010

Cool, I did not know about this tool. I ran it and there was nothing in the AlertTestLog.csv. Good, if we see this again, I will run this to see if anything shows up. Thanks for the quick reply.

in Advanced Alert Stopped Working Comment by NasaGeek March 2008

We saw this on hardware that was more than enough to handle the job. Support did not figure this out, but we traced the problem to some BSD devices we were polling. Once we removed these, the application became stable again. I have noticed that the performance is worse in 8.5.1 than it was with 8.1. It got better and now…

in Gaps in Polling/Chart Data during DB Maintenance Comment by NasaGeek December 2007
